MON-237 Fix Cosmos DB RU monitor & auto-update

Laurent Piroelle 2018-08-27 17:59:01 +02:00
parent e2ce23c0f5
commit 49a8a22958
10 changed files with 53 additions and 25 deletions

@ -29,41 +29,46 @@ Creates DataDog monitors with the following checks:
| cosmos_db_4xx_request_extra_tags | Extra tags for Cosmos DB 4xx requests monitor | list | `<list>` | no |
| cosmos_db_4xx_request_rate_threshold_critical | Critical threshold for Cosmos DB 4xx requests monitor | string | `80` | no |
| cosmos_db_4xx_request_rate_threshold_warning | Warning threshold for Cosmos DB 4xx requests monitor | string | `50` | no |
| cosmos_db_4xx_request_time_aggregator | Monitor aggregator for Cosmos DB status [available values: min, max or avg] | string | `sum` | no |
| cosmos_db_4xx_request_timeframe | Monitor timeframe for Cosmos DB status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| cosmos_db_4xx_request_time_aggregator | Monitor aggregator for Cosmos DB 4xx requests [available values: min, max or avg] | string | `sum` | no |
| cosmos_db_4xx_request_timeframe | Monitor timeframe for Cosmos DB 4xx requests [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| cosmos_db_4xx_requests_enabled | Flag to enable Cosmos DB 4xx requests monitor | string | `true` | no |
| cosmos_db_4xx_requests_message | Custom message for Cosmos DB 4xx requests monitor | string | `` | no |
| cosmos_db_4xx_requests_silenced | Groups to mute for Cosmos DB 4xx requests monitor | map | `<map>` | no |
| cosmos_db_5xx_request_rate_extra_tags | Extra tags for Cosmos DB 5xx requests monitor | list | `<list>` | no |
| cosmos_db_5xx_request_rate_threshold_critical | Critical threshold for Cosmos DB 5xx requests monitor | string | `80` | no |
| cosmos_db_5xx_request_rate_threshold_warning | Warning threshold for Cosmos DB 5xx requests monitor | string | `50` | no |
| cosmos_db_5xx_request_time_aggregator | Monitor aggregator for Cosmos DB status [available values: min, max or avg] | string | `sum` | no |
| cosmos_db_5xx_request_timeframe | Monitor timeframe for Cosmos DB status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| cosmos_db_5xx_request_time_aggregator | Monitor aggregator for Cosmos DB 5xx requests [available values: min, max or avg] | string | `sum` | no |
| cosmos_db_5xx_request_timeframe | Monitor timeframe for Cosmos DB 5xx requests [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| cosmos_db_5xx_requests_enabled | Flag to enable Cosmos DB 5xx requests monitor | string | `true` | no |
| cosmos_db_5xx_requests_message | Custom message for Cosmos DB 5xx requests monitor | string | `` | no |
| cosmos_db_5xx_requests_silenced | Groups to mute for Cosmos DB 5xx requests monitor | map | `<map>` | no |
| cosmos_db_no_request_enabled | Flag to enable Cosmos DB no request monitor | string | `true` | no |
| cosmos_db_no_request_extra_tags | Extra tags for Cosmos DB no request monitor | list | `<list>` | no |
| cosmos_db_no_request_message | Custom message for Cosmos DB no request monitor | string | `` | no |
| cosmos_db_no_request_silenced | Groups to mute for Cosmos DB no request monitor | map | `<map>` | no |
| cosmos_db_no_request_time_aggregator | Monitor aggregator for Cosmos DB status [available values: min, max or avg] | string | `max` | no |
| cosmos_db_no_request_timeframe | Monitor timeframe for Cosmos DB status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| cosmos_db_ru_utilization_collections | Group to associate Cosmos DB collection to RU max | map | - | yes |
| cosmos_db_no_request_time_aggregator | Monitor aggregator for Cosmos DB no request [available values: min, max or avg] | string | `max` | no |
| cosmos_db_no_request_timeframe | Monitor timeframe for Cosmos DB no request [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| cosmos_db_ru_utilization_collections | Group to associate Cosmos DB collection to RU max. RU value has to be correlated with the monitor timeframe | map | - | yes |
| cosmos_db_ru_utilization_enabled | Flag to enable Cosmos DB collection RU utilization monitor | string | `true` | no |
| cosmos_db_ru_utilization_extra_tags | Extra tags for Cosmos DB collection RU utilization monitor | list | `<list>` | no |
| cosmos_db_ru_utilization_message | Custom message for Cosmos DB collection RU utilization monitor | string | `` | no |
| cosmos_db_ru_utilization_rate_threshold_critical | Critical threshold for Cosmos DB collection RU utilization monitor | string | `90` | no |
| cosmos_db_ru_utilization_rate_threshold_warning | Warning threshold for Cosmos DB collection RU utilization monitor | string | `80` | no |
| cosmos_db_ru_utilization_silenced | Groups to mute for Cosmos DB collection RU utilization monitor | map | `<map>` | no |
| cosmos_db_ru_utilization_time_aggregator | Monitor aggregator for Cosmos DB status [available values: min, max or avg] | string | `max` | no |
| cosmos_db_ru_utilization_timeframe | Monitor timeframe for Cosmos DB status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| cosmos_db_ru_utilization_time_aggregator | Monitor aggregator for Cosmos DB RU utilization [available values: min, max or avg] | string | `sum` | no |
| cosmos_db_ru_utilization_timeframe | Monitor timeframe for Cosmos DB RU utilization [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| environment | Architecture environment | string | - | yes |
| evaluation_delay | Delay in seconds for the metric evaluation | string | `900` | no |
| filter_tags_custom | Tags used for custom filtering when filter_tags_use_defaults is false | string | `*` | no |
| filter_tags_use_defaults | Use default filter tags convention | string | `true` | no |
| message | Message sent when a monitor is triggered | string | - | yes |
| new_host_delay | Delay in seconds before monitor new resource | string | `300` | no |
| status_enabled | Flag to enable Cosmos DB status monitor | string | `true` | no |
| status_extra_tags | Extra tags for Cosmos DB status monitor | list | `<list>` | no |
| status_message | Custom message for Cosmos DB status monitor | string | `` | no |
| status_silenced | Groups to mute for Cosmos DB status monitor | map | `<map>` | no |
| status_time_aggregator | Monitor aggregator for Cosmos DB status [available values: min, max or avg] | string | `max` | no |
| status_timeframe | Monitor timeframe for Cosmos DB status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| status_timeframe | Monitor timeframe for Cosmos DB status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
## Outputs

@ -229,7 +229,7 @@ variable "cosmos_db_ru_utilization_extra_tags" {
variable "cosmos_db_ru_utilization_time_aggregator" {
description = "Monitor aggregator for Cosmos DB RU utilization [available values: min, max or avg]"
type = "string"
default = "min"
default = "sum"
}
variable "cosmos_db_ru_utilization_timeframe" {
@ -239,6 +239,6 @@ variable "cosmos_db_ru_utilization_timeframe" {
}
variable "cosmos_db_ru_utilization_collections" {
description = "Group to associate Cosmos DB collection to RU max"
description = "Group to associate Cosmos DB collection to RU max. RU value has to be correlated with the monitor timeframe"
type = "map"
}
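
A minimal sketch of how the `cosmos_db_ru_utilization_collections` map might be filled in with the new `sum` / `last_5m` defaults. The module label, `source` path, notification handle, collection names and RU figures are all hypothetical. The sizing rule in the comments (provisioned RU/s multiplied by the seconds in the timeframe) is an assumption about how `total_request_units` is reported, so check it against the values actually seen in DataDog before relying on it.

```hcl
# Hypothetical usage sketch (Terraform 0.11 syntax, matching the module).
module "cosmosdb_monitors" {
  # Hypothetical path: point this at wherever the Cosmos DB monitors module lives.
  source = "./cloud/azure/cosmosdb"

  environment = "production"
  message     = "@ops-team" # hypothetical notification handle

  # Keys are collection names (the monitor query lowercases them before
  # filtering); values are the RU budget for the whole monitor timeframe,
  # not the provisioned per-second throughput.
  # Assumption: with time_aggregator = "sum" and timeframe = "last_5m",
  # budget = provisioned RU/s * 300 seconds.
  cosmos_db_ru_utilization_collections = {
    "orders"   = 120000 # 400 RU/s * 300 s
    "sessions" = 300000 # 1000 RU/s * 300 s
  }

  # Shown explicitly for clarity; these are the defaults after this commit.
  cosmos_db_ru_utilization_time_aggregator = "sum"
  cosmos_db_ru_utilization_timeframe       = "last_5m"
}
```

If the timeframe is overridden (for example to `last_15m`), the map values would need to be rescaled accordingly, which is what the updated variable description points at.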

@ -173,8 +173,8 @@ resource "datadog_monitor" "cosmos_db_ru_utilization" {
query = <<EOF
${var.cosmos_db_ru_utilization_time_aggregator}(${var.cosmos_db_ru_utilization_timeframe}): (
(
avg:azure.cosmosdb.total_request_units${format(module.filter-tags-collection.query_alert,lower(element(keys(var.cosmos_db_ru_utilization_collections),count.index)))} by {resource_group,region,name,collectionname} +
avg:azure.documentdb_databaseaccounts.total_request_units${format(module.filter-tags-collection.query_alert,lower(element(keys(var.cosmos_db_ru_utilization_collections),count.index)))} by {resource_group,region,name,collectionname}
sum:azure.cosmosdb.total_request_units${format(module.filter-tags-collection.query_alert,lower(element(keys(var.cosmos_db_ru_utilization_collections),count.index)))} by {resource_group,region,name,collectionname} +
sum:azure.documentdb_databaseaccounts.total_request_units${format(module.filter-tags-collection.query_alert,lower(element(keys(var.cosmos_db_ru_utilization_collections),count.index)))} by {resource_group,region,name,collectionname}
) /
${element(values(var.cosmos_db_ru_utilization_collections),count.index)}
) * 100 > ${var.cosmos_db_ru_utilization_rate_threshold_critical}

@ -1,4 +0,0 @@
output "cosmos_db_ru_utilization_id" {
description = "id for monitor cosmos_db_ru_utilization"
value = "${datadog_monitor.cosmos_db_ru_utilization.*.id}"
}

@ -17,3 +17,8 @@ output "cosmos_db_success_no_data_id" {
description = "id for monitor cosmos_db_success_no_data"
value = "${datadog_monitor.cosmos_db_success_no_data.id}"
}
output "cosmos_db_ru_utilization_id" {
description = "id for monitor cosmos_db_ru_utilization"
value = "${datadog_monitor.cosmos_db_ru_utilization.*.id}"
}
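
Because this output uses the splat expression `.*.id`, it yields a list of monitor ids rather than a single string (apparently one per entry in the collections map, since the resource indexes that map with `count.index`). A hedged sketch of consuming it from a calling configuration follows; the module label `cosmosdb_monitors` and both output names are hypothetical.

```hcl
# Hypothetical: "cosmosdb_monitors" is whatever label the calling
# configuration gave the Cosmos DB monitors module.
output "ru_utilization_monitor_ids" {
  description = "All RU utilization monitor ids, as a list"
  value       = "${module.cosmosdb_monitors.cosmos_db_ru_utilization_id}"
}

# join() flattens the list into one string and tolerates an empty list,
# which is what the splat returns when the monitor is disabled.
output "ru_utilization_monitor_ids_csv" {
  description = "The same ids as a comma-separated string"
  value       = "${join(",", module.cosmosdb_monitors.cosmos_db_ru_utilization_id)}"
}
```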

@ -28,11 +28,12 @@ Creates DataDog monitors with the following checks:
| filter_tags_use_defaults | Use default filter tags convention | string | `true` | no |
| message | Message sent when a monitor is triggered | string | - | yes |
| new_host_delay | Delay in seconds before monitor new resource | string | `300` | no |
| status_enabled | Flag to enable Datalake Store status monitor | string | `true` | no |
| status_extra_tags | Extra tags for Datalake Store status monitor | list | `<list>` | no |
| status_message | Custom message for Datalake Store status monitor | string | `` | no |
| status_silenced | Groups to mute for Datalake Store status monitor | map | `<map>` | no |
| status_time_aggregator | Monitor aggregator for Datalake Store status [available values: min, max or avg] | string | `max` | no |
| status_timeframe | Monitor timeframe for Datalake Store status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| status_timeframe | Monitor timeframe for Datalake Store status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
## Outputs

@ -1824,7 +1824,7 @@ variable "cosmos_db_ru_utilization_extra_tags" {
variable "cosmos_db_ru_utilization_time_aggregator" {
description = "Monitor aggregator for Cosmos DB RU utilization [available values: min, max or avg]"
type = "string"
default = "avg"
default = "sum"
}
variable "cosmos_db_ru_utilization_timeframe" {
@ -1834,7 +1834,7 @@ variable "cosmos_db_ru_utilization_timeframe" {
}
variable "cosmos_db_ru_utilization_collections" {
description = "Group to associate Cosmos DB collection to RU max"
description = "Group to associate Cosmos DB collection to RU max. RU value has to be correlated with the monitor timeframe"
type = "map"
}

@ -16,6 +16,7 @@ module "datadog-monitors-cloud-azure-keyvault" {
Creates DataDog monitors with the following checks:
- Key Vault API latency is high
- Key Vault API result rate is low
- Key Vault is down
@ -23,29 +24,40 @@ Creates DataDog monitors with the following checks:
| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| api_latency_enabled | Flag to enable Key Vault API latency monitor | string | `true` | no |
| api_latency_extra_tags | Extra tags for Key Vault API latency monitor | list | `<list>` | no |
| api_latency_message | Custom message for Key Vault API latency monitor | string | `` | no |
| api_latency_silenced | Groups to mute for Key Vault API latency monitor | map | `<map>` | no |
| api_latency_threshold_critical | Critical threshold for Key Vault API latency rate | string | `100` | no |
| api_latency_threshold_warning | Warning threshold for Key Vault API latency rate | string | `80` | no |
| api_latency_time_aggregator | Monitor aggregator for Key Vault API latency [available values: min, max or avg] | string | `min` | no |
| api_latency_timeframe | Monitor timeframe for Key Vault API latency [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| api_result_enabled | Flag to enable Key Vault API result monitor | string | `true` | no |
| api_result_extra_tags | Extra tags for Key Vault API result monitor | list | `<list>` | no |
| api_result_message | Custom message for Key Vault API result monitor | string | `` | no |
| api_result_silenced | Groups to mute for Key Vault API result monitor | map | `<map>` | no |
| api_result_threshold_critical | Critical threshold for Key Vault API result rate | string | `10` | no |
| api_result_threshold_warning | Warning threshold for Key Vault API result rate | string | `30` | no |
| api_result_time_aggregator | Monitor aggregator for Key Vault API result [available values: min, max or avg] | string | `sum` | no |
| api_result_timeframe | Monitor timeframe for Key Vault API result [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_30m` | no |
| api_result_timeframe | Monitor timeframe for Key Vault API result [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| environment | Architecture environment | string | - | yes |
| evaluation_delay | Delay in seconds for the metric evaluation | string | `900` | no |
| filter_tags_custom | Tags used for custom filtering when filter_tags_use_defaults is false | string | `*` | no |
| filter_tags_use_defaults | Use default filter tags convention | string | `true` | no |
| message | Message sent when a monitor is triggered | string | - | yes |
| new_host_delay | Delay in seconds before monitor new resource | string | `300` | no |
| status_enabled | Flag to enable Key Vault status monitor | string | `true` | no |
| status_extra_tags | Extra tags for Key Vault status monitor | list | `<list>` | no |
| status_message | Custom message for Key Vault status monitor | string | `` | no |
| status_silenced | Groups to mute for Key Vault status monitor | map | `<map>` | no |
| status_time_aggregator | Monitor aggregator for Key Vault status [available values: min, max or avg] | string | `max` | no |
| status_timeframe | Monitor timeframe for Key Vault status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| status_timeframe | Monitor timeframe for Key Vault status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
## Outputs
| Name | Description |
|------|-------------|
| keyvault_api_latency_id | id for monitor keyvault_api_latency |
| keyvault_api_result_id | id for monitor keyvault_api_result |
| keyvault_status_id | id for monitor keyvault_status |

@ -7,3 +7,8 @@ output "keyvault_api_result_id" {
description = "id for monitor keyvault_api_result"
value = "${datadog_monitor.keyvault_api_result.id}"
}
output "keyvault_api_latency_id" {
description = "id for monitor keyvault_api_latency"
value = "${datadog_monitor.keyvault_api_latency.id}"
}

@ -31,20 +31,24 @@ Creates DataDog monitors with the following checks:
| filter_tags_use_defaults | Use default filter tags convention | string | `true` | no |
| message | Message sent when an alert is triggered | string | - | yes |
| new_host_delay | Delay in seconds before monitor new resource | string | `300` | no |
| no_active_connections_enabled | Flag to enable Service Bus status monitor | string | `true` | no |
| no_active_connections_message | Custom message for Service Bus status monitor | string | `` | no |
| no_active_connections_silenced | Groups to mute for Service Bus status monitor | map | `<map>` | no |
| no_active_connections_time_aggregator | Monitor aggregator for Service Bus status [available values: min, max or avg] | string | `max` | no |
| no_active_connections_timeframe | Monitor timeframe for Service Bus status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| no_active_connections_timeframe | Monitor timeframe for Service Bus status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| server_errors_enabled | Flag to enable Service Bus server errors monitor | string | `true` | no |
| server_errors_message | Custom message for Service Bus server errors monitor | string | `` | no |
| server_errors_silenced | Groups to mute for Service Bus server errors monitor | map | `<map>` | no |
| server_errors_threshold_critical | Critical threshold for Service Bus server errors monitor | string | `90` | no |
| server_errors_threshold_warning | Warning threshold for Service Bus server errors monitor | string | `50` | no |
| server_errors_timeframe | Monitor timeframe for Service Bus server errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| status_enabled | Flag to enable Service Bus status monitor | string | `true` | no |
| status_extra_tags | Extra tags for Service Bus status monitor | list | `<list>` | no |
| status_message | Custom message for Service Bus status monitor | string | `` | no |
| status_silenced | Groups to mute for Service Bus status monitor | map | `<map>` | no |
| status_time_aggregator | Monitor aggregator for Service Bus status [available values: min, max or avg] | string | `max` | no |
| status_timeframe | Monitor timeframe for Service Bus status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_15m` | no |
| status_timeframe | Monitor timeframe for Service Bus status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| user_errors_enabled | Flag to enable Service Bus user errors monitor | string | `true` | no |
| user_errors_message | Custom message for Service Bus user errors monitor | string | `` | no |
| user_errors_silenced | Groups to mute for Service Bus user errors monitor | map | `<map>` | no |
| user_errors_threshold_critical | Critical threshold for Service Bus user errors monitor | string | `90` | no |