Terraform Datadog App Dashboard Module
Overview
This Terraform module creates a comprehensive Kubernetes/Docker application monitoring dashboard in Datadog with CPU and memory utilization metrics, health monitoring, and synthetic API testing.
Features
- Kubernetes Resource Monitoring: Visualizes pod and node resource utilization
- CPU & Memory Tracking: Top 10 containers by CPU and memory usage
- Health Monitoring: Pod health metric alerts
- Synthetic Testing: HTTP API endpoint monitoring
- Read-Only Dashboard: Prevents accidental modifications in the UI
Resources Created
- datadog_dashboard: Application monitoring dashboard with multiple widgets
- datadog_monitor: Kubernetes Pod Health metric alert
- datadog_synthetics_test: API HTTP health check
Dashboard Widgets
- Kubernetes Pods Hostmap: CPU utilization by Docker image
- CPU Utilization Timeseries: Top 10 containers ranked by CPU usage
- Kubernetes Nodes Hostmap: CPU utilization by host
- Memory Utilization Timeseries: Top 10 containers ranked by memory usage
- Alert Graph: Visualization from the pod health monitor
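As a rough illustration of how one of the widgets above could be declared, the sketch below defines a read-only dashboard with a single "top 10 containers by CPU" timeseries. The resource name `app`, the title format, and the `docker.cpu.usage` metric are assumptions made for this example; the module's actual widget definitions may differ.

```hcl
# Hypothetical sketch: resource name, titles, and metric are assumptions.
resource "datadog_dashboard" "app" {
  title        = "${var.app_name} (${var.stage})"
  layout_type  = "ordered"
  is_read_only = true # matches the read-only behaviour described in Features

  widget {
    timeseries_definition {
      title = "CPU utilization: top 10 containers"

      request {
        # top(..., 10, 'mean', 'desc') keeps the ten series with the highest mean.
        q            = "top(avg:docker.cpu.usage{image_name:${var.image_name}} by {container_name}, 10, 'mean', 'desc')"
        display_type = "line"
      }
    }
  }
}
```

Here `var.app_name`, `var.stage`, and `var.image_name` refer to the module inputs documented under Inputs.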
Requirements
| Name | Version |
|---|---|
| terraform | >= 0.12 |
| datadog | >= 3.2.0 |
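A root configuration could pin these versions along the following lines. This is a sketch that assumes Terraform 0.13 or later, since the `source` attribute inside `required_providers` is not available in the same form on 0.12.

```hcl
terraform {
  required_version = ">= 0.12"

  required_providers {
    datadog = {
      source  = "DataDog/datadog" # registry address of the Datadog provider
      version = ">= 3.2.0"
    }
  }
}
```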
Usage
module "app_dashboard" {
  source = "./terraform-datadog-app-dashboard"

  app_namespace = "production"
  cfa_name      = "my-cfa"
  app_name      = "my-application"
  team_name     = "platform-team"
  image_name    = "my-app"
  region        = "eu-west-1"
  stage         = "prd"
  url           = "https://api.example.com/health"
}
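The module call above also needs a configured Datadog provider in the calling configuration. A minimal sketch follows; the variable names `datadog_api_key` and `datadog_app_key` are hypothetical and not part of this module.

```hcl
# Hypothetical variables holding the Datadog credentials.
variable "datadog_api_key" {
  type = string
}

variable "datadog_app_key" {
  type = string
}

provider "datadog" {
  api_key = var.datadog_api_key
  app_key = var.datadog_app_key
}
```

Alternatively, the provider can read its credentials from the `DD_API_KEY` and `DD_APP_KEY` environment variables, which keeps the keys out of the configuration.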
Inputs
| Name | Description | Type | Required |
|---|---|---|---|
| app_namespace | Namespace that the application runs in | string | yes |
| cfa_name | Name of the CFA | string | yes |
| app_name | Name of the application | string | yes |
| team_name | Name of the responsible team | string | yes |
| image_name | Name of the Docker Image | string | yes |
| region | AWS region where resources are located | string | yes |
| stage | Stage to monitor (dev, tst, stg, prd) | string | yes |
| url | URL for Datadog Synthetics to monitor | string | yes |
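For reference, an input from the table above might be declared as sketched below. The `validation` block is an assumption added for illustration (it requires Terraform 0.13 or later); the source does not say whether the module validates `stage`.

```hcl
variable "stage" {
  description = "Stage to monitor (dev, tst, stg, prd)"
  type        = string

  # Assumed addition: reject values outside the documented stages.
  validation {
    condition     = contains(["dev", "tst", "stg", "prd"], var.stage)
    error_message = "The stage must be one of: dev, tst, stg, prd."
  }
}
```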
Outputs
Currently, this module does not export any outputs.
Monitor Configuration
Pod Health Monitor
- Query: avg(last_5m):sum:docker.containers.running{image_name:{image_name}} by {docker_image}.rollup(avg, 60) <= 1
- Type: Metric alert
- Thresholds:
  - OK: 3 containers
  - Warning: 2 containers
  - Critical: 1 container
- No Data: Alert after 10 minutes
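The configuration above maps onto a `datadog_monitor` resource roughly as follows. This is a sketch, not the module's actual source: the resource name `pod_health` and the `message` text are assumptions, while the query, thresholds, and no-data window are taken from the list above.

```hcl
resource "datadog_monitor" "pod_health" {
  name    = "Kubernetes Pod Health"
  type    = "metric alert"
  message = "Running container count for ${var.image_name} is low." # assumed text

  # The value compared in the query (<= 1) must equal the critical threshold.
  query = "avg(last_5m):sum:docker.containers.running{image_name:${var.image_name}} by {docker_image}.rollup(avg, 60) <= 1"

  monitor_thresholds {
    ok       = 3
    warning  = 2
    critical = 1
  }

  # Alert when no data has been reported for 10 minutes.
  notify_no_data    = true
  no_data_timeframe = 10
}
```

Because the monitor alerts on a lower bound, the warning threshold sits above the critical one: the state degrades as the container count drops from 3 towards 1.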
Synthetic API Test
- Type: HTTP API test
- Method: GET
- Interval: Every 15 minutes (900 seconds)
- Assertion: HTTP status code is 200
- Locations: AWS EU and US regions
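A synthetic test with these settings could look like the sketch below. The block names (`request_definition`, `assertion`, `options_list`) follow recent 3.x provider releases and may differ in older ones. The resource name, the test name, the message, and the two specific locations are assumptions, since the source only states "AWS EU and US regions".

```hcl
resource "datadog_synthetics_test" "api_health" {
  name    = "${var.app_name} API HTTP health check" # assumed name
  type    = "api"
  subtype = "http"
  status  = "live"
  message = "Health check for ${var.url} failed." # assumed text

  # Assumed locations: one EU and one US managed location.
  locations = ["aws:eu-central-1", "aws:us-east-2"]

  request_definition {
    method = "GET"
    url    = var.url
  }

  assertion {
    type     = "statusCode"
    operator = "is"
    target   = "200"
  }

  options_list {
    tick_every = 900 # seconds, i.e. every 15 minutes
  }
}
```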
Tagging Strategy
All resources are tagged with:
- cfa:{cfa_name}
- team:{team_name}
- app:{app_name}
- env:{stage}
- type:kubernetes or type:synthetics
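One way to build these tags once and reuse them across resources is a `locals` block, sketched here. The local names are hypothetical, and the module may construct its tags differently.

```hcl
locals {
  # Tags shared by every resource in the module.
  common_tags = [
    "cfa:${var.cfa_name}",
    "team:${var.team_name}",
    "app:${var.app_name}",
    "env:${var.stage}",
  ]

  # Resource-specific type tag appended to the shared set.
  kubernetes_tags = concat(local.common_tags, ["type:kubernetes"])
  synthetics_tags = concat(local.common_tags, ["type:synthetics"])
}
```

A resource would then set `tags = local.kubernetes_tags` or `tags = local.synthetics_tags` as appropriate.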
Notes
- The dashboard is set to read-only mode to prevent accidental modifications
- Metrics are filtered by the image_name variable
- Synthetic tests run from multiple AWS regions for geographic coverage
- Pod health monitor uses a 1-minute rollup average
License
Internal use only - Sanoma/WeBuildYourCloud
Authors
Created and maintained by the Platform Engineering team.