# Terraform Datadog Belgie (RDS Dashboard) Module
## Overview
This Terraform module creates an AWS RDS database monitoring dashboard in Datadog with comprehensive metrics for performance, storage, connections, and replication lag. Designed specifically for Belgian/EU deployments with pre-configured alert recipients and monitoring thresholds.
## Features
- **Comprehensive RDS Metrics**: CPU, memory, connections, storage, disk queue, latency
- **Dynamic Dashboard**: 25+ preconfigured widgets with automatic metric visualization
- **Flexible Alerting**: Configurable alert recipients for different severity levels
- **CloudWatch Integration**: Leverages AWS RDS CloudWatch metrics
- **Customizable Monitors**: Map-based monitor configuration for easy customization
## Resources Created
- `datadog_dashboard`: RDS Database Dashboard with 25+ widgets, including:
  - Query value widgets for read/write latency and IOPS
  - Timeseries for replication lag, connections, CPU, memory, disk metrics
  - Toplist widgets for metric ranking
  - Note widgets for dashboard organization
## Dashboard Widgets
The dashboard includes comprehensive monitoring for:
- Read/Write Latency (query value widgets)
- Replication Lag (timeseries)
- Database Connections (timeseries)
- CPU Utilization (timeseries + toplist)
- Read/Write Operations (timeseries)
- Freeable Memory (timeseries + toplist)
- Disk Queue Depth (timeseries)
- Free Storage Space (timeseries + toplist)
## Requirements
| Name | Version |
|------|---------|
| terraform | >= 0.12 |
| datadog | >= 3.1.2 |
| aws | >= 2.0 |
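The constraints above could be pinned in the calling configuration roughly as follows. This is a sketch, not taken from the module itself: the `source` addresses are the public registry names of the two providers, and the `source` attribute needs Terraform 0.13 or later, which is stricter than the `>= 0.12` stated in the table.

```hcl
terraform {
  # Assumption: 0.13+ is required for the `source` attribute below;
  # the module itself only states >= 0.12.
  required_version = ">= 0.13"

  required_providers {
    datadog = {
      source  = "DataDog/datadog"
      version = ">= 3.1.2"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 2.0"
    }
  }
}
```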
## Usage
```hcl
module "rds_dashboard" {
  source = "./terraform-datadog-belgie"

  # Provider and account settings
  region       = "eu-west-1"
  api_key      = var.datadog_api_key
  app_key      = var.datadog_app_key
  datadog_site = "https://api.datadoghq.eu/"
  aws_profile  = "production"

  # Ownership and naming
  cfa_slug    = "my-cfa"
  team        = "platform-team"
  application = "myapp"
  stage       = "prd"

  # Notification recipients
  alert_recipients   = ["team@example.com"]
  recipients         = ["team@example.com"]
  warning_recipients = ["team@example.com"]

  dd_rds_monitors = {
    cpu = {
      enabled  = true
      warning  = 75
      critical = 90
      name     = "RDS CPU High"
    }
    # ... additional monitors
  }
}
```
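The usage example reads the keys from `var.datadog_api_key` and `var.datadog_app_key`, which the calling configuration has to declare. A minimal sketch of those declarations, assuming Terraform 0.14 or later for the `sensitive` argument (drop it on older versions):

```hcl
# Root-module variables referenced by the usage example.
# Assumption: `sensitive` requires Terraform 0.14 or later.
variable "datadog_api_key" {
  description = "Datadog API key, passed to the module as api_key"
  type        = string
  sensitive   = true
}

variable "datadog_app_key" {
  description = "Datadog APP key, passed to the module as app_key"
  type        = string
  sensitive   = true
}
```

The values can then be supplied through the `TF_VAR_datadog_api_key` and `TF_VAR_datadog_app_key` environment variables instead of being committed to a `.tfvars` file.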
## Inputs
| Name | Description | Type | Required | Default |
|------|-------------|------|----------|---------|
| `region` | AWS region for resources | `string` | yes | - |
| `api_key` | Datadog API key | `string` | yes | - |
| `app_key` | Datadog APP key | `string` | yes | - |
| `datadog_site` | Datadog API URL (EU or US site) | `string` | no | `"https://api.datadoghq.eu/"` |
| `aws_profile` | AWS account this integration belongs to | `string` | yes | - |
| `cfa_slug` | CFA this integration belongs to | `string` | yes | - |
| `team` | Team this integration belongs to | `string` | yes | - |
| `application` | Application name | `string` | yes | - |
| `stage` | Stage (dev, tst, acc, prd) | `string` | yes | - |
| `alert_recipients` | Alert notification recipients | `list(string)` | no | `["patrick.de.ruiter@sanoma.com"]` |
| `recipients` | General notification recipients | `list(string)` | no | `["patrick.de.ruiter@sanoma.com"]` |
| `warning_recipients` | Warning notification recipients | `list(string)` | no | `["patrick.de.ruiter@sanoma.com"]` |
| `dd_rds_monitors` | RDS monitor configuration map | `map(any)` | no | See variables.tf |
## RDS Monitors Configuration
The `dd_rds_monitors` variable accepts a map with the following monitor types:
- `cpu`: CPU utilization monitoring
- `memory`: Freeable memory monitoring
- `connections`: Database connections monitoring
- `storage`: Free storage space monitoring
- `disk_queue`: Disk queue depth monitoring
- `read_latency`: Read latency monitoring
- `write_latency`: Write latency monitoring
- `replication_lag`: Replication lag monitoring
Each monitor can be configured with:
```hcl
{
  enabled  = bool   # Enable/disable the monitor
  warning  = number # Warning threshold
  critical = number # Critical threshold
  name     = string # Monitor name
}
```
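As an illustration, a map covering several of the monitor types listed above might look like the sketch below. The thresholds and names are made up for the example, and the local name `rds_monitors` is hypothetical. Note that for `memory` a lower value is worse, so the warning threshold sits above the critical one. This assumes the module compares freeable memory against a lower bound, which should be checked in the module source.

```hcl
locals {
  # Illustrative values only; tune thresholds to the instance class in use.
  rds_monitors = {
    cpu = {
      enabled  = true
      warning  = 75 # percent
      critical = 90
      name     = "RDS CPU High"
    }
    memory = {
      enabled  = true
      warning  = 1073741824 # 1 GiB freeable memory, in bytes
      critical = 536870912  # 512 MiB
      name     = "RDS Freeable Memory Low"
    }
    replication_lag = {
      enabled  = false # e.g. for instances without read replicas
      warning  = 30    # seconds
      critical = 60
      name     = "RDS Replication Lag High"
    }
  }
}
```

The map would then be passed as `dd_rds_monitors = local.rds_monitors`. Because Terraform replaces a variable's default wholesale rather than merging into it, any monitor type left out of the map is not configured unless the module merges defaults internally.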
## Outputs
Currently, this module does not export any outputs (outputs are commented out).
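If outputs were enabled, they might resemble the sketch below. The resource name `rds` is an assumption, since the actual name is defined in the module source; `id` and `url` are attributes exported by the `datadog_dashboard` resource.

```hcl
# Hypothetical outputs; "rds" is an assumed resource name.
output "dashboard_id" {
  description = "ID of the RDS dashboard"
  value       = datadog_dashboard.rds.id
}

output "dashboard_url" {
  description = "URL of the RDS dashboard in Datadog"
  value       = datadog_dashboard.rds.url
}
```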
## Local Values
The module uses several local values for dynamic configuration:
- `dbidentifier`: Formatted as `{application}-{stage}`
- `rdsgraphs`: Map of 8 RDS metric queries
- `full_message`: Constructed alert message with links
- `tags`: Standard tags (team, stage, application)
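A rough sketch of how these locals could be defined, based only on the descriptions above. The query strings and the `dbinstanceidentifier` tag filter are assumptions about how the module scopes metrics; `full_message` is omitted because its template is not described here.

```hcl
locals {
  # Formatted as {application}-{stage}, e.g. "myapp-prd"
  dbidentifier = "${var.application}-${var.stage}"

  # Standard tags applied to resources
  tags = [
    "team:${var.team}",
    "stage:${var.stage}",
    "application:${var.application}",
  ]

  # Assumed shape: one Datadog query per monitored metric.
  # Only three of the eight entries are shown; the rest follow the same pattern.
  rdsgraphs = {
    cpu         = "avg:aws.rds.cpuutilization{dbinstanceidentifier:${local.dbidentifier}}"
    memory      = "avg:aws.rds.freeable_memory{dbinstanceidentifier:${local.dbidentifier}}"
    connections = "avg:aws.rds.database_connections{dbinstanceidentifier:${local.dbidentifier}}"
  }
}
```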
## RDS Metrics Monitored
| Metric | Datadog Metric Name (from CloudWatch) | Description |
|--------|----------------------|-------------|
| CPU | `aws.rds.cpuutilization` | CPU utilization percentage |
| Memory | `aws.rds.freeable_memory` | Freeable memory in bytes |
| Connections | `aws.rds.database_connections` | Number of database connections |
| Storage | `aws.rds.free_storage_space` | Free storage space in bytes |
| Read Latency | `aws.rds.read_latency` | Average read latency per I/O operation, in seconds |
| Write Latency | `aws.rds.write_latency` | Average write latency per I/O operation, in seconds |
| Disk Queue | `aws.rds.disk_queue_depth` | Number of outstanding I/O requests waiting for disk access |
| Replication Lag | `aws.rds.replica_lag` | Replication lag in seconds |
## Dashboard Layout
- **Layout Type**: Free layout (allows custom positioning)
- **Read-Only**: No (allows modifications in UI)
- **Widget Organization**: Sections with note widgets as headers
- **Conditional Formatting**: Metric values color-coded based on thresholds
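The layout conventions above can be sketched with a minimal two-widget dashboard. This is an illustration rather than the module's actual definition: the resource name, title, query scope, positions, and thresholds are all assumptions, and the block names follow the Datadog provider 3.x schema (`widget_layout`, `conditional_formats`).

```hcl
resource "datadog_dashboard" "rds_example" {
  title       = "RDS example (sketch)"
  layout_type = "free" # widgets are positioned explicitly

  # Note widget acting as a section header
  widget {
    note_definition {
      content = "CPU"
    }
    widget_layout {
      x      = 0
      y      = 0
      width  = 24
      height = 5
    }
  }

  # Query value widget, color-coded by threshold
  widget {
    query_value_definition {
      title     = "CPU Utilization"
      autoscale = true
      precision = 1

      request {
        q          = "avg:aws.rds.cpuutilization{dbinstanceidentifier:myapp-prd}"
        aggregator = "avg"

        conditional_formats {
          comparator = ">"
          value      = 90 # assumed critical threshold
          palette    = "white_on_red"
        }
        conditional_formats {
          comparator = ">"
          value      = 75 # assumed warning threshold
          palette    = "white_on_yellow"
        }
        conditional_formats {
          comparator = "<="
          value      = 75
          palette    = "white_on_green"
        }
      }
    }
    widget_layout {
      x      = 0
      y      = 6
      width  = 24
      height = 12
    }
  }
}
```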
## Tagging Strategy
All resources are tagged with:
- `team:{team}`
- `stage:{stage}`
- `application:{application}`
## Notes
- The default Datadog site is the EU site (`https://api.datadoghq.eu/`), chosen for GDPR compliance
- Dashboard uses free layout for flexible widget positioning
- Metrics are sourced from AWS RDS CloudWatch integration
- Database identifier is auto-generated from application and stage variables
- Alert messages include links to related monitors
## License
Internal use only - Sanoma/WeBuildYourCloud
## Authors
Created and maintained by the Platform Engineering team.