MON-160 - ALB monitors updated

Alexandre Gaillet 2018-04-26 17:11:58 +02:00 committed by Quentin Manfroi
parent d3bf631223
commit 3f17c0215c
3 changed files with 48 additions and 6 deletions


@ -32,6 +32,7 @@ Inputs
|------|-------------|:----:|:-----:|:-----:| |------|-------------|:----:|:-----:|:-----:|
| alb_no_healthy_instances_message | Custom message for ALB no healthy instances monitor | string | `` | no | | alb_no_healthy_instances_message | Custom message for ALB no healthy instances monitor | string | `` | no |
| alb_no_healthy_instances_silenced | Groups to mute for ALB no healthy instances monitor | map | `<map>` | no | | alb_no_healthy_instances_silenced | Groups to mute for ALB no healthy instances monitor | map | `<map>` | no |
| alb_no_healthy_instances_timeframe | Monitor timeframe for ALB no healthy instances [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_1m` | no |
| artificial_requests_count | Number of false requests used to mitigate false positives in case of low traffic | string | `5` | no | | artificial_requests_count | Number of false requests used to mitigate false positives in case of low traffic | string | `5` | no |
| delay | Delay in seconds for the metric evaluation | string | `900` | no | | delay | Delay in seconds for the metric evaluation | string | `900` | no |
| environment | Architecture environment | string | - | yes | | environment | Architecture environment | string | - | yes |
@ -41,22 +42,27 @@ Inputs
| httpcode_elb_4xx_silenced | Groups to mute for ALB httpcode 4xx monitor | map | `<map>` | no | | httpcode_elb_4xx_silenced | Groups to mute for ALB httpcode 4xx monitor | map | `<map>` | no |
| httpcode_elb_4xx_threshold_critical | loadbalancer 4xx critical threshold in percentage | string | `80` | no | | httpcode_elb_4xx_threshold_critical | loadbalancer 4xx critical threshold in percentage | string | `80` | no |
| httpcode_elb_4xx_threshold_warning | loadbalancer 4xx warning threshold in percentage | string | `60` | no | | httpcode_elb_4xx_threshold_warning | loadbalancer 4xx warning threshold in percentage | string | `60` | no |
| httpcode_elb_4xx_timeframe | Monitor timeframe for ALB httpcode 4xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| httpcode_elb_5xx_message | Custom message for ALB httpcode 5xx monitor | string | `` | no | | httpcode_elb_5xx_message | Custom message for ALB httpcode 5xx monitor | string | `` | no |
| httpcode_elb_5xx_silenced | Groups to mute for ALB httpcode 5xx monitor | map | `<map>` | no | | httpcode_elb_5xx_silenced | Groups to mute for ALB httpcode 5xx monitor | map | `<map>` | no |
| httpcode_elb_5xx_threshold_critical | loadbalancer 5xx critical threshold in percentage | string | `80` | no | | httpcode_elb_5xx_threshold_critical | loadbalancer 5xx critical threshold in percentage | string | `80` | no |
| httpcode_elb_5xx_threshold_warning | loadbalancer 5xx warning threshold in percentage | string | `60` | no | | httpcode_elb_5xx_threshold_warning | loadbalancer 5xx warning threshold in percentage | string | `60` | no |
| httpcode_elb_5xx_timeframe | Monitor timeframe for ALB httpcode 5xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| httpcode_target_4xx_message | Custom message for ALB target httpcode 4xx monitor | string | `` | no | | httpcode_target_4xx_message | Custom message for ALB target httpcode 4xx monitor | string | `` | no |
| httpcode_target_4xx_silenced | Groups to mute for ALB target httpcode 4xx monitor | map | `<map>` | no | | httpcode_target_4xx_silenced | Groups to mute for ALB target httpcode 4xx monitor | map | `<map>` | no |
| httpcode_target_4xx_threshold_critical | target 4xx critical threshold in percentage | string | `80` | no | | httpcode_target_4xx_threshold_critical | target 4xx critical threshold in percentage | string | `80` | no |
| httpcode_target_4xx_threshold_warning | target 4xx warning threshold in percentage | string | `60` | no | | httpcode_target_4xx_threshold_warning | target 4xx warning threshold in percentage | string | `60` | no |
| httpcode_target_4xx_timeframe | Monitor timeframe for ALB target httpcode 4xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| httpcode_target_5xx_message | Custom message for ALB target httpcode 5xx monitor | string | `` | no | | httpcode_target_5xx_message | Custom message for ALB target httpcode 5xx monitor | string | `` | no |
| httpcode_target_5xx_silenced | Groups to mute for ALB target httpcode 5xx monitor | map | `<map>` | no | | httpcode_target_5xx_silenced | Groups to mute for ALB target httpcode 5xx monitor | map | `<map>` | no |
| httpcode_target_5xx_threshold_critical | target 5xx critical threshold in percentage | string | `80` | no | | httpcode_target_5xx_threshold_critical | target 5xx critical threshold in percentage | string | `80` | no |
| httpcode_target_5xx_threshold_warning | target 5xx warning threshold in percentage | string | `60` | no | | httpcode_target_5xx_threshold_warning | target 5xx warning threshold in percentage | string | `60` | no |
| httpcode_target_5xx_timeframe | Monitor timeframe for ALB target httpcode 5xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| latency_message | Custom message for ALB latency monitor | string | `` | no | | latency_message | Custom message for ALB latency monitor | string | `` | no |
| latency_silenced | Groups to mute for ALB latency monitor | map | `<map>` | no | | latency_silenced | Groups to mute for ALB latency monitor | map | `<map>` | no |
| latency_threshold_critical | latency critical threshold in milliseconds | string | `1000` | no | | latency_threshold_critical | latency critical threshold in milliseconds | string | `1000` | no |
| latency_threshold_warning | latency warning threshold in milliseconds | string | `500` | no | | latency_threshold_warning | latency warning threshold in milliseconds | string | `500` | no |
| latency_timeframe | Monitor timeframe for ALB latency [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`] | string | `last_5m` | no |
| message | Message sent when a monitor is triggered | string | - | yes | | message | Message sent when a monitor is triggered | string | - | yes |
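
The new `*_timeframe` inputs listed above can be overridden per monitor when the module is called. The following is a minimal sketch, not taken from the commit: the module name, the `source` path, and all values are hypothetical, and the module may require other inputs that this excerpt does not show.

```hcl
module "alb_monitors" {
  # Hypothetical path: point this at wherever the module actually lives.
  source = "./modules/alb"

  # Required inputs shown in the table above (example values only).
  environment = "production"
  message     = "ALB monitor triggered"

  # Optional overrides; any input left out keeps its default
  # (last_1m for no healthy instances, last_5m for the others).
  alb_no_healthy_instances_timeframe = "last_5m"
  latency_timeframe                  = "last_15m"
  httpcode_elb_5xx_timeframe         = "last_10m"
}
```

Each value has to be one of the timeframes named in the input description: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`.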
Related documentation Related documentation


@ -38,6 +38,12 @@ variable "alb_no_healthy_instances_message" {
default = "" default = ""
} }
variable "alb_no_healthy_instances_timeframe" {
description = "Monitor timeframe for ALB no healthy instances [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
type = "string"
default = "last_1m"
}
variable "latency_silenced" { variable "latency_silenced" {
description = "Groups to mute for ALB latency monitor" description = "Groups to mute for ALB latency monitor"
type = "map" type = "map"
@ -50,6 +56,12 @@ variable "latency_message" {
default = "" default = ""
} }
variable "latency_timeframe" {
description = "Monitor timeframe for ALB latency [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
type = "string"
default = "last_5m"
}
variable "latency_threshold_critical" { variable "latency_threshold_critical" {
default = 1000 default = 1000
description = "latency critical threshold in milliseconds" description = "latency critical threshold in milliseconds"
@ -72,6 +84,12 @@ variable "httpcode_elb_4xx_message" {
default = "" default = ""
} }
variable "httpcode_elb_4xx_timeframe" {
description = "Monitor timeframe for ALB httpcode 4xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
type = "string"
default = "last_5m"
}
variable "httpcode_elb_4xx_threshold_critical" { variable "httpcode_elb_4xx_threshold_critical" {
default = 80 default = 80
description = "loadbalancer 4xx critical threshold in percentage" description = "loadbalancer 4xx critical threshold in percentage"
@ -94,6 +112,12 @@ variable "httpcode_target_4xx_message" {
default = "" default = ""
} }
variable "httpcode_target_4xx_timeframe" {
description = "Monitor timeframe for ALB target httpcode 4xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
type = "string"
default = "last_5m"
}
variable "httpcode_target_4xx_threshold_critical" { variable "httpcode_target_4xx_threshold_critical" {
default = 80 default = 80
description = "target 4xx critical threshold in percentage" description = "target 4xx critical threshold in percentage"
@ -116,6 +140,12 @@ variable "httpcode_elb_5xx_message" {
default = "" default = ""
} }
variable "httpcode_elb_5xx_timeframe" {
description = "Monitor timeframe for ALB httpcode 5xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
type = "string"
default = "last_5m"
}
variable "httpcode_elb_5xx_threshold_critical" { variable "httpcode_elb_5xx_threshold_critical" {
default = 80 default = 80
description = "loadbalancer 5xx critical threshold in percentage" description = "loadbalancer 5xx critical threshold in percentage"
@ -138,6 +168,12 @@ variable "httpcode_target_5xx_message" {
default = "" default = ""
} }
variable "httpcode_target_5xx_timeframe" {
description = "Monitor timeframe for ALB target httpcode 5xx [available values: `last_#m` (1, 5, 10, 15, or 30), `last_#h` (1, 2, or 4), or `last_1d`]"
type = "string"
default = "last_5m"
}
variable "httpcode_target_5xx_threshold_critical" { variable "httpcode_target_5xx_threshold_critical" {
default = 80 default = 80
description = "target 5xx critical threshold in percentage" description = "target 5xx critical threshold in percentage"
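
The variable descriptions above list the accepted timeframe values, but nothing in this commit enforces them. One possible way to reject other values at plan time is a `validation` block. This is only a sketch and assumes Terraform 0.13 or later; the commit itself uses the older quoted-type syntax, so this is not the author's code.

```hcl
variable "latency_timeframe" {
  description = "Monitor timeframe for ALB latency"
  type        = string
  default     = "last_5m"

  # Assumption: validation blocks are available (Terraform 0.13+).
  # The list mirrors the values named in the variable description.
  validation {
    condition = contains([
      "last_1m", "last_5m", "last_10m", "last_15m", "last_30m",
      "last_1h", "last_2h", "last_4h", "last_1d",
    ], var.latency_timeframe)
    error_message = "The latency_timeframe value must be one of the timeframes listed in the description."
  }
}
```

The same block could be repeated for the other `*_timeframe` variables, changing only the variable name.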


@ -14,7 +14,7 @@ resource "datadog_monitor" "ALB_no_healthy_instances" {
message = "${coalesce(var.alb_no_healthy_instances_message, var.message)}" message = "${coalesce(var.alb_no_healthy_instances_message, var.message)}"
query = <<EOF query = <<EOF
min(last_1m): ( min(${var.alb_no_healthy_instances_timeframe}): (
min:aws.applicationelb.healthy_host_count{${data.template_file.filter.rendered}} by {region,loadbalancer} min:aws.applicationelb.healthy_host_count{${data.template_file.filter.rendered}} by {region,loadbalancer}
) <= 0 ) <= 0
EOF EOF
@ -43,7 +43,7 @@ resource "datadog_monitor" "ALB_latency" {
message = "${coalesce(var.latency_message, var.message)}" message = "${coalesce(var.latency_message, var.message)}"
query = <<EOF query = <<EOF
min(last_5m): ( min(${var.latency_timeframe}): (
min:aws.applicationelb.target_response_time.average{${data.template_file.filter.rendered}} by {region,loadbalancer} min:aws.applicationelb.target_response_time.average{${data.template_file.filter.rendered}} by {region,loadbalancer}
) > ${var.latency_threshold_critical} ) > ${var.latency_threshold_critical}
EOF EOF
@ -73,7 +73,7 @@ resource "datadog_monitor" "ALB_httpcode_elb_5xx" {
message = "${coalesce(var.httpcode_elb_5xx_message, var.message)}" message = "${coalesce(var.httpcode_elb_5xx_message, var.message)}"
query = <<EOF query = <<EOF
min(last_5m): ( min(${var.httpcode_elb_5xx_timeframe}): (
default( default(
min:aws.applicationelb.httpcode_elb_5xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() / min:aws.applicationelb.httpcode_elb_5xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() /
(min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}), (min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}),
@ -106,7 +106,7 @@ resource "datadog_monitor" "ALB_httpcode_elb_4xx" {
message = "${coalesce(var.httpcode_elb_4xx_message, var.message)}" message = "${coalesce(var.httpcode_elb_4xx_message, var.message)}"
query = <<EOF query = <<EOF
min(last_5m): ( min(${var.httpcode_elb_4xx_timeframe}): (
default( default(
min:aws.applicationelb.httpcode_elb_4xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() / min:aws.applicationelb.httpcode_elb_4xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() /
(min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}), (min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}),
@ -139,7 +139,7 @@ resource "datadog_monitor" "ALB_httpcode_target_5xx" {
message = "${coalesce(var.httpcode_target_5xx_message, var.message)}" message = "${coalesce(var.httpcode_target_5xx_message, var.message)}"
query = <<EOF query = <<EOF
min(last_5m): ( min(${var.httpcode_target_5xx_timeframe}): (
default( default(
min:aws.applicationelb.httpcode_target_5xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() / min:aws.applicationelb.httpcode_target_5xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() /
(min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}), (min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}),
@ -172,7 +172,7 @@ resource "datadog_monitor" "ALB_httpcode_target_4xx" {
message = "${coalesce(var.httpcode_target_4xx_message, var.message)}" message = "${coalesce(var.httpcode_target_4xx_message, var.message)}"
query = <<EOF query = <<EOF
min(last_5m): ( min(${var.httpcode_target_4xx_timeframe}): (
default( default(
min:aws.applicationelb.httpcode_target_4xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() / min:aws.applicationelb.httpcode_target_4xx{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() /
(min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}), (min:aws.applicationelb.request_count{${data.template_file.filter.rendered}} by {region,loadbalancer}.as_count() + ${var.artificial_requests_count}),
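
To see what the interpolated timeframe does to a query, the substitution can be reproduced outside the module. The sketch below is an illustration under stated assumptions: the `locals` and `output` names are hypothetical, the values are examples, and `{*}` stands in for whatever `data.template_file.filter.rendered` produces in the real module.

```hcl
locals {
  # Example values; in the module these come from var.latency_timeframe
  # and var.latency_threshold_critical.
  latency_timeframe          = "last_15m"
  latency_threshold_critical = 1000

  # Same shape as the ALB_latency query in the diff above, on one line.
  latency_query = "min(${local.latency_timeframe}): ( min:aws.applicationelb.target_response_time.average{*} by {region,loadbalancer} ) > ${local.latency_threshold_critical}"
}

output "latency_query" {
  value = "${local.latency_query}"
}
```

With these example values the query starts with `min(last_15m):` instead of the previously hard-coded `min(last_5m):`, which is the only part of the query this commit makes configurable.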