MON-160 - Azure Storage & Stream Analytics updated

Alexandre Gaillet 2018-04-27 09:58:57 +02:00 committed by Quentin Manfroi
parent a56f5ddb39
commit 6acd9c5671
6 changed files with 112 additions and 14 deletions

@@ -36,14 +36,17 @@ Inputs
 | authorization_error_requests_silenced | Groups to mute for Storage authorization errors monitor | map | `<map>` | no |
 | authorization_error_requests_threshold_critical | Maximum acceptable percent of authorization error requests for a storage | string | `90` | no |
 | authorization_error_requests_threshold_warning | Warning regarding acceptable percent of authorization error requests for a storage | string | `50` | no |
+| authorization_error_requests_timeframe | Monitor timeframe for Storage authorization errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | availability_message | Custom message for Storage availability monitor | string | `` | no |
 | availability_silenced | Groups to mute for Storage availability monitor | map | `<map>` | no |
 | availability_threshold_critical | Minimum acceptable percent of availability for a storage | string | `50` | no |
 | availability_threshold_warning | Warning regarding acceptable percent of availability for a storage | string | `90` | no |
+| availability_timeframe | Monitor timeframe for Storage availability [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | client_other_error_requests_message | Custom message for Storage other errors monitor | string | `` | no |
 | client_other_error_requests_silenced | Groups to mute for Storage other errors monitor | map | `<map>` | no |
 | client_other_error_requests_threshold_critical | Maximum acceptable percent of client other error requests for a storage | string | `90` | no |
 | client_other_error_requests_threshold_warning | Warning regarding acceptable percent of client other error requests for a storage | string | `50` | no |
+| client_other_error_requests_timeframe | Monitor timeframe for Storage other errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | delay | Delay in seconds for the metric evaluation | string | `900` | no |
 | environment | Architecture environment | string | - | yes |
 | filter_tags_custom | Tags used for custom filtering when filter_tags_use_defaults is false | string | `*` | no |
@@ -52,27 +55,33 @@ Inputs
 | latency_silenced | Groups to mute for Storage latency monitor | map | `<map>` | no |
 | latency_threshold_critical | Maximum acceptable end to end latency (ms) for a storage | string | `2000` | no |
 | latency_threshold_warning | Warning regarding acceptable end to end latency (ms) for a storage | string | `1000` | no |
+| latency_timeframe | Monitor timeframe for Storage latency [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | message | Message sent when a Redis monitor is triggered | string | - | yes |
 | network_error_requests_message | Custom message for Storage network errors monitor | string | `` | no |
 | network_error_requests_silenced | Groups to mute for Storage network errors monitor | map | `<map>` | no |
 | network_error_requests_threshold_critical | Maximum acceptable percent of network error requests for a storage | string | `90` | no |
 | network_error_requests_threshold_warning | Warning regarding acceptable percent of network error requests for a storage | string | `50` | no |
+| network_error_requests_timeframe | Monitor timeframe for Storage network errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | server_other_error_requests_message | Custom message for Storage server other errors monitor | string | `` | no |
 | server_other_error_requests_silenced | Groups to mute for Storage server other errors monitor | map | `<map>` | no |
 | server_other_error_requests_threshold_critical | Maximum acceptable percent of server other error requests for a storage | string | `90` | no |
 | server_other_error_requests_threshold_warning | Warning regarding acceptable percent of server other error requests for a storage | string | `50` | no |
+| server_other_error_requests_timeframe | Monitor timeframe for Storage server other errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | successful_requests_message | Custom message for Storage sucessful requests monitor | string | `` | no |
 | successful_requests_silenced | Groups to mute for Storage sucessful requests monitor | map | `<map>` | no |
 | successful_requests_threshold_critical | Minimum acceptable percent of successful requests for a storage | string | `10` | no |
 | successful_requests_threshold_warning | Warning regarding acceptable percent of successful requests for a storage | string | `30` | no |
+| successful_requests_timeframe | Monitor timeframe for Storage sucessful requests [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | throttling_error_requests_message | Custom message for Storage throttling error monitor | string | `` | no |
 | throttling_error_requests_silenced | Groups to mute for Storage throttling error monitor | map | `<map>` | no |
 | throttling_error_requests_threshold_critical | Maximum acceptable percent of throttling error requests for a storage | string | `90` | no |
 | throttling_error_requests_threshold_warning | Warning regarding acceptable percent of throttling error requests for a storage | string | `50` | no |
+| throttling_error_requests_timeframe | Monitor timeframe for Storage throttling errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | timeout_error_requests_message | Custom message for Storage timeout monitor | string | `` | no |
 | timeout_error_requests_silenced | Groups to mute for Storage timeout monitor | map | `<map>` | no |
 | timeout_error_requests_threshold_critical | Maximum acceptable percent of timeout error requests for a storage | string | `90` | no |
 | timeout_error_requests_threshold_warning | Warning regarding acceptable percent of timeout error requests for a storage | string | `50` | no |
+| timeout_error_requests_timeframe | Monitor timeframe for Storage timeout [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 
 Related documentation
 ---------------------
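
For illustration, a caller might override the new timeframe inputs as sketched below. This is a minimal sketch, not taken from the commit: the module name and `source` path are placeholders, the values are arbitrary picks from the allowed list, and the module may require other inputs that fall outside the hunks shown here.

```hcl
# Hypothetical module call (Terraform 0.11-style, as used in this commit).
# The module name and the source path are placeholders, not real locations.
module "datadog_azure_storage" {
  source = "./path/to/azure/storage" # placeholder path

  # Inputs marked as required in the Inputs table above
  environment = "prod"
  message     = "Storage monitor triggered"

  # Inputs added by this commit; each defaults to "last_5m" when omitted
  availability_timeframe           = "last_15m"
  latency_timeframe                = "last_1h"
  throttling_error_requests_timeframe = "last_30m"
}
```

Monitors whose `*_timeframe` input is left unset keep the previous `last_5m` evaluation window, so existing callers should see no change in behavior.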

@@ -37,6 +37,12 @@ variable "availability_message" {
   default = ""
 }
 
+variable "availability_timeframe" {
+  description = "Monitor timeframe for Storage availability [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "availability_threshold_critical" {
   description = "Minimum acceptable percent of availability for a storage"
   default = 50
@@ -59,6 +65,12 @@ variable "successful_requests_message" {
   default = ""
 }
 
+variable "successful_requests_timeframe" {
+  description = "Monitor timeframe for Storage sucessful requests [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "successful_requests_threshold_critical" {
   description = "Minimum acceptable percent of successful requests for a storage"
   default = 10
@@ -81,6 +93,12 @@ variable "latency_message" {
   default = ""
 }
 
+variable "latency_timeframe" {
+  description = "Monitor timeframe for Storage latency [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "latency_threshold_critical" {
   description = "Maximum acceptable end to end latency (ms) for a storage"
   default = 2000
@@ -103,6 +121,12 @@ variable "timeout_error_requests_message" {
   default = ""
 }
 
+variable "timeout_error_requests_timeframe" {
+  description = "Monitor timeframe for Storage timeout [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "timeout_error_requests_threshold_critical" {
   description = "Maximum acceptable percent of timeout error requests for a storage"
   default = 90
@@ -125,6 +149,12 @@ variable "network_error_requests_message" {
   default = ""
 }
 
+variable "network_error_requests_timeframe" {
+  description = "Monitor timeframe for Storage network errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "network_error_requests_threshold_critical" {
   description = "Maximum acceptable percent of network error requests for a storage"
   default = 90
@@ -147,6 +177,12 @@ variable "throttling_error_requests_message" {
   default = ""
 }
 
+variable "throttling_error_requests_timeframe" {
+  description = "Monitor timeframe for Storage throttling errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "throttling_error_requests_threshold_critical" {
   description = "Maximum acceptable percent of throttling error requests for a storage"
   default = 90
@@ -169,6 +205,12 @@ variable "server_other_error_requests_message" {
   default = ""
 }
 
+variable "server_other_error_requests_timeframe" {
+  description = "Monitor timeframe for Storage server other errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "server_other_error_requests_threshold_critical" {
   description = "Maximum acceptable percent of server other error requests for a storage"
   default = 90
@@ -191,6 +233,12 @@ variable "client_other_error_requests_message" {
   default = ""
 }
 
+variable "client_other_error_requests_timeframe" {
+  description = "Monitor timeframe for Storage other errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "client_other_error_requests_threshold_critical" {
   description = "Maximum acceptable percent of client other error requests for a storage"
   default = 90
@@ -213,6 +261,12 @@ variable "authorization_error_requests_message" {
   default = ""
 }
 
+variable "authorization_error_requests_timeframe" {
+  description = "Monitor timeframe for Storage authorization errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "authorization_error_requests_threshold_critical" {
   description = "Maximum acceptable percent of authorization error requests for a storage"
   default = 90

@@ -11,7 +11,7 @@ resource "datadog_monitor" "availability" {
   message = "${coalesce(var.availability_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.availability_timeframe}): (default(
       avg:azure.storage.availability{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     100)) < ${var.availability_threshold_critical}
   EOF
@@ -42,7 +42,7 @@ resource "datadog_monitor" "successful_requests" {
   message = "${coalesce(var.successful_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.successful_requests_timeframe}): (default(
       avg:azure.storage.percent_success{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     100)) < ${var.successful_requests_threshold_critical}
   EOF
@@ -73,7 +73,7 @@ resource "datadog_monitor" "latency" {
   message = "${coalesce(var.latency_message, var.message)}"
 
   query = <<EOF
-    min(last_5m): (default(
+    min(${var.latency_timeframe}): (default(
       avg:azure.storage.average_e2_e_latency{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.latency_threshold_critical}
   EOF
@@ -104,7 +104,7 @@ resource "datadog_monitor" "timeout_error_requests" {
   message = "${coalesce(var.timeout_error_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.timeout_error_requests_timeframe}): (default(
       avg:azure.storage.percent_timeout_error{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.timeout_error_requests_threshold_critical}
   EOF
@@ -135,7 +135,7 @@ resource "datadog_monitor" "network_error_requests" {
   message = "${coalesce(var.network_error_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.network_error_requests_timeframe}): (default(
       avg:azure.storage.percent_network_error{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.network_error_requests_threshold_critical}
   EOF
@@ -166,7 +166,7 @@ resource "datadog_monitor" "throttling_error_requests" {
   message = "${coalesce(var.throttling_error_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.throttling_error_requests_timeframe}): (default(
       avg:azure.storage.percent_throttling_error{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.throttling_error_requests_threshold_critical}
   EOF
@@ -197,7 +197,7 @@ resource "datadog_monitor" "server_other_error_requests" {
   message = "${coalesce(var.server_other_error_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.server_other_error_requests_timeframe}): (default(
       avg:azure.storage.percent_server_other_error{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.server_other_error_requests_threshold_critical}
   EOF
@@ -228,7 +228,7 @@ resource "datadog_monitor" "client_other_error_requests" {
   message = "${coalesce(var.client_other_error_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.client_other_error_requests_timeframe}): (default(
       avg:azure.storage.percent_client_other_error{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.client_other_error_requests_threshold_critical}
   EOF
@@ -259,7 +259,7 @@ resource "datadog_monitor" "authorization_error_requests" {
   message = "${coalesce(var.authorization_error_requests_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (default(
+    avg(${var.authorization_error_requests_timeframe}): (default(
       avg:azure.storage.percent_authorization_error{${data.template_file.filter.rendered},transaction_type:all} by {resource_group,storage_type,name},
     0)) > ${var.authorization_error_requests_threshold_critical}
   EOF
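
To make the effect of the interpolation concrete, the standalone sketch below reproduces the availability query outside the module. It is illustrative only: the `filter` variable is a hypothetical stand-in for `data.template_file.filter.rendered`, whose real content is not visible in this diff, and `last_15m` is an arbitrary override.

```hcl
# Standalone sketch using Terraform 0.11-style interpolation, as in the commit.
variable "availability_timeframe" {
  type    = "string"
  default = "last_15m" # overrides the module default of "last_5m"
}

variable "availability_threshold_critical" {
  default = 50 # same default as the module
}

# Hypothetical stand-in for data.template_file.filter.rendered
variable "filter" {
  default = "env:prod"
}

# Same query shape as the availability monitor, collapsed onto one line
output "availability_query" {
  value = "avg(${var.availability_timeframe}): (default(avg:azure.storage.availability{${var.filter},transaction_type:all} by {resource_group,storage_type,name}, 100)) < ${var.availability_threshold_critical}"
}
```

Under these assumptions the output should render with the window `avg(last_15m)` in place of the previously hardcoded `avg(last_5m)`, while the rest of the query is unchanged.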

@@ -22,12 +22,14 @@ Inputs
 | conversion_errors_silenced | Groups to mute for Stream Analytics conversion errors monitor | map | `<map>` | no |
 | conversion_errors_threshold_critical | Conversion errors limit (critical threshold) | string | `10` | no |
 | conversion_errors_threshold_warning | Conversion errors limit (warning threshold) | string | `0` | no |
+| conversion_errors_timeframe | Monitor timeframe for Stream Analytics conversion errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | delay | Delay in seconds for the metric evaluation | string | `900` | no |
 | environment | Architecture environment | string | - | yes |
 | failed_function_requests_message | Custom message for Stream Analytics failed requests monitor | string | `` | no |
 | failed_function_requests_silenced | Groups to mute for Stream Analytics failed requests monitor | map | `<map>` | no |
 | failed_function_requests_threshold_critical | Failed Function Request rate limit (critical threshold) | string | `10` | no |
 | failed_function_requests_threshold_warning | Failed Function Request rate limit (warning threshold) | string | `0` | no |
+| failed_function_requests_timeframe | Monitor timeframe for Stream Analytics failed requests [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | filter_tags_custom | Tags used for custom filtering when filter_tags_use_defaults is false | string | `*` | no |
 | filter_tags_use_defaults | Use default filter tags convention | string | `true` | no |
 | message | Message sent when a Redis monitor is triggered | string | - | yes |
@@ -35,12 +37,15 @@ Inputs
 | runtime_errors_silenced | Groups to mute for Stream Analytics runtime errors monitor | map | `<map>` | no |
 | runtime_errors_threshold_critical | Runtime errors limit (critical threshold) | string | `10` | no |
 | runtime_errors_threshold_warning | Runtime errors limit (warning threshold) | string | `0` | no |
+| runtime_errors_timeframe | Monitor timeframe for Stream Analytics runtime errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | status_message | Custom message for Stream Analytics status monitor | string | `` | no |
 | status_silenced | Groups to mute for Stream Analytics status monitor | map | `<map>` | no |
+| status_timeframe | Monitor timeframe for Stream Analytics status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 | su_utilization_message | Custom message for Stream Analytics utilization monitor | string | `` | no |
 | su_utilization_silenced | Groups to mute for Stream Analytics utilization monitor | map | `<map>` | no |
 | su_utilization_threshold_critical | Streaming Unit utilization rate limit (critical threshold) | string | `80` | no |
 | su_utilization_threshold_warning | Streaming Unit utilization rate limit (warning threshold) | string | `60` | no |
+| su_utilization_timeframe | Monitor timeframe for Stream Analytics utilization [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
 
 Related documentation
 ---------------------

@@ -37,6 +37,12 @@ variable "status_message" {
   default = ""
 }
 
+variable "status_timeframe" {
+  description = "Monitor timeframe for Stream Analytics status [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "su_utilization_silenced" {
   description = "Groups to mute for Stream Analytics utilization monitor"
   type = "map"
@@ -49,6 +55,12 @@ variable "su_utilization_message" {
   default = ""
 }
 
+variable "su_utilization_timeframe" {
+  description = "Monitor timeframe for Stream Analytics utilization [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "su_utilization_threshold_warning" {
   description = "Streaming Unit utilization rate limit (warning threshold)"
   default = 60
@@ -71,6 +83,12 @@ variable "failed_function_requests_message" {
   default = ""
 }
 
+variable "failed_function_requests_timeframe" {
+  description = "Monitor timeframe for Stream Analytics failed requests [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "failed_function_requests_threshold_warning" {
   description = "Failed Function Request rate limit (warning threshold)"
   default = 0
@@ -93,6 +111,12 @@ variable "conversion_errors_message" {
   default = ""
 }
 
+variable "conversion_errors_timeframe" {
+  description = "Monitor timeframe for Stream Analytics conversion errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "conversion_errors_threshold_warning" {
   description = "Conversion errors limit (warning threshold)"
   default = 0
@@ -115,6 +139,12 @@ variable "runtime_errors_message" {
   default = ""
 }
 
+variable "runtime_errors_timeframe" {
+  description = "Monitor timeframe for Stream Analytics runtime errors [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
+  type = "string"
+  default = "last_5m"
+}
+
 variable "runtime_errors_threshold_warning" {
   description = "Runtime errors limit (warning threshold)"
   default = 0

@@ -11,7 +11,7 @@ resource "datadog_monitor" "status" {
   message = "${coalesce(var.status_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m):avg:azure.streamanalytics_streamingjobs.status{${data.template_file.filter.rendered}} by {resource_group,region,name} < 1
+    avg(${var.status_timeframe}):avg:azure.streamanalytics_streamingjobs.status{${data.template_file.filter.rendered}} by {resource_group,region,name} < 1
   EOF
 
   type = "metric alert"
@@ -36,7 +36,7 @@ resource "datadog_monitor" "su_utilization" {
   message = "${coalesce(var.su_utilization_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (
+    avg(${var.su_utilization_timeframe}): (
       avg:azure.streamanalytics_streamingjobs.resource_utilization{${data.template_file.filter.rendered}} by {resource_group,region,name}
     ) > ${var.su_utilization_threshold_critical}
   EOF
@@ -68,7 +68,7 @@ resource "datadog_monitor" "failed_function_requests" {
   message = "${coalesce(var.failed_function_requests_message, var.message)}"
 
   query = <<EOF
-    sum(last_5m): (
+    sum(${var.failed_function_requests_timeframe}): (
       avg:azure.streamanalytics_streamingjobs.aml_callout_failed_requests{${data.template_file.filter.rendered}} by {resource_group,region,name}.as_count() /
       avg:azure.streamanalytics_streamingjobs.aml_callout_requests{${data.template_file.filter.rendered}} by {resource_group,region,name}.as_count()
     ) * 100 > ${var.failed_function_requests_threshold_critical}
@@ -101,7 +101,7 @@ resource "datadog_monitor" "conversion_errors" {
   message = "${coalesce(var.conversion_errors_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (
+    avg(${var.conversion_errors_timeframe}): (
       avg:azure.streamanalytics_streamingjobs.conversion_errors{${data.template_file.filter.rendered}} by {resource_group,region,name}
     ) > ${var.conversion_errors_threshold_critical}
   EOF
@@ -133,7 +133,7 @@ resource "datadog_monitor" "runtime_errors" {
   message = "${coalesce(var.runtime_errors_message, var.message)}"
 
   query = <<EOF
-    avg(last_5m): (
+    avg(${var.runtime_errors_timeframe}): (
       avg:azure.streamanalytics_streamingjobs.errors{${data.template_file.filter.rendered}} by {resource_group,region,name}
     ) > ${var.runtime_errors_threshold_critical}
   EOF