# Terraform Datadog App Dashboard Module

## Overview

This Terraform module creates a comprehensive Kubernetes/Docker application monitoring dashboard in Datadog with CPU and memory utilization metrics, health monitoring, and synthetic API testing.

## Features

- **Kubernetes Resource Monitoring**: Visualizes pod and node resource utilization
- **CPU & Memory Tracking**: Top 10 containers by CPU and memory usage
- **Health Monitoring**: Pod health metric alerts
- **Synthetic Testing**: HTTP API endpoint monitoring
- **Read-Only Dashboard**: Prevents accidental modifications in the UI

## Resources Created

- `datadog_dashboard`: Application monitoring dashboard with multiple widgets
- `datadog_monitor`: Kubernetes Pod Health metric alert
- `datadog_synthetics_test`: API HTTP health check

## Dashboard Widgets

1. **Kubernetes Pods Hostmap**: CPU utilization by Docker image
2. **CPU Utilization Timeseries**: Top 10 containers ranked by CPU usage
3. **Kubernetes Nodes Hostmap**: CPU utilization by host
4. **Memory Utilization Timeseries**: Top 10 containers ranked by memory usage
5. **Alert Graph**: Visualization from the pod health monitor

## Requirements

| Name | Version |
|------|---------|
| terraform | >= 0.12 |
| datadog | >= 3.2.0 |

## Usage

```hcl
module "app_dashboard" {
  source = "./terraform-datadog-app-dashboard"

  app_namespace = "production"
  cfa_name      = "my-cfa"
  app_name      = "my-application"
  team_name     = "platform-team"
  image_name    = "my-app"
  region        = "eu-west-1"
  stage         = "prd"
  url           = "https://api.example.com/health"
}
```

## Inputs

| Name | Description | Type | Required |
|------|-------------|------|----------|
| `app_namespace` | Namespace that the application runs in | `string` | yes |
| `cfa_name` | Name of the CFA | `string` | yes |
| `app_name` | Name of the application | `string` | yes |
| `team_name` | Name of the responsible team | `string` | yes |
| `image_name` | Name of the Docker Image | `string` | yes |
| `region` | AWS region where resources are located | `string` | yes |
| `stage` | Stage to monitor (dev, tst, stg, prd) | `string` | yes |
| `url` | URL for Datadog Synthetics to monitor | `string` | yes |

## Outputs

Currently, this module does not export any outputs.

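
If callers ever need to reference the created resources, outputs could be added along the following lines. This is only a sketch: the resource labels (`app`, `pod_health`, `api_health`) and the output names are hypothetical, since the module's internal resource names are not documented here.

```hcl
# outputs.tf (hypothetical; the module currently defines no outputs)

output "dashboard_id" {
  description = "ID of the application monitoring dashboard"
  value       = datadog_dashboard.app.id # "app" is an assumed resource label
}

output "pod_health_monitor_id" {
  description = "ID of the Kubernetes Pod Health monitor"
  value       = datadog_monitor.pod_health.id # "pod_health" is an assumed resource label
}

output "synthetics_test_id" {
  description = "ID of the API HTTP health check"
  value       = datadog_synthetics_test.api_health.id # "api_health" is an assumed resource label
}
```
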
## Monitor Configuration

### Pod Health Monitor

- **Query**: `avg(last_5m):sum:docker.containers.running{image_name:{image_name}} by {docker_image}.rollup(avg, 60) <= 1`
- **Type**: Metric alert
- **Thresholds**:
  - OK: 3 containers
  - Warning: 2 containers
  - Critical: 1 container
- **No Data**: Alert after 10 minutes

### Synthetic API Test

- **Type**: HTTP API test
- **Method**: GET
- **Interval**: Every 15 minutes (900 seconds)
- **Assertion**: HTTP status code is 200
- **Locations**: AWS EU and US regions

## Tagging Strategy

All resources are tagged with:

- `cfa:{cfa_name}`
- `team:{team_name}`
- `app:{app_name}`
- `env:{stage}`
- `type:kubernetes` or `type:synthetics`

## Notes

- The dashboard is set to read-only mode to prevent accidental modifications
- Metrics are filtered by the `image_name` variable
- Synthetic tests run from multiple AWS regions for geographic coverage
- Pod health monitor uses a 1-minute rollup average

## License

Internal use only - Sanoma/WeBuildYourCloud

## Authors

Created and maintained by the Platform Engineering team.
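
## Appendix: Configuration Sketch

The settings listed under Monitor Configuration and Tagging Strategy could be expressed roughly as shown below. This is a minimal sketch, not the module's actual source: the resource labels, the `name` and `message` strings, the `common_tags` local, and the two location IDs are assumptions made for illustration. Block and attribute names follow the 3.x Datadog provider documentation and should be checked against the pinned provider version.

```hcl
# Sketch only: labels, names, messages, and locations are hypothetical.

locals {
  # Tags from the Tagging Strategy section, shared by both resources.
  common_tags = [
    "cfa:${var.cfa_name}",
    "team:${var.team_name}",
    "app:${var.app_name}",
    "env:${var.stage}",
  ]
}

resource "datadog_monitor" "pod_health" {
  name    = "${var.app_name} pod health (${var.stage})"               # assumed name
  type    = "metric alert"
  message = "Running containers for ${var.image_name} are too low."   # assumed message

  # 5-minute average of running containers, rolled up in 60-second buckets.
  query = "avg(last_5m):sum:docker.containers.running{image_name:${var.image_name}} by {docker_image}.rollup(avg, 60) <= 1"

  # The critical value must match the threshold in the query (<= 1).
  # The OK threshold of 3 listed in this README is left out of the sketch,
  # because how the module sets it is not shown.
  monitor_thresholds {
    warning  = 2
    critical = 1
  }

  # Alert when no data has arrived for 10 minutes.
  notify_no_data    = true
  no_data_timeframe = 10

  tags = concat(local.common_tags, ["type:kubernetes"])
}

resource "datadog_synthetics_test" "api_health" {
  name    = "${var.app_name} API health (${var.stage})"   # assumed name
  type    = "api"
  subtype = "http"
  status  = "live"
  message = "Health endpoint ${var.url} did not return 200." # assumed message

  request_definition {
    method = "GET"
    url    = var.url
  }

  # Pass only when the endpoint answers with HTTP 200.
  assertion {
    type     = "statusCode"
    operator = "is"
    target   = "200"
  }

  # Assumed location IDs; the README only states "AWS EU and US regions".
  locations = ["aws:eu-central-1", "aws:us-east-2"]

  options_list {
    tick_every = 900 # run every 15 minutes
  }

  tags = concat(local.common_tags, ["type:synthetics"])
}
```

Keeping the shared tags in a single local means the two resources differ only in their `type:` tag, which matches the tagging scheme described above.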