# Terraform Datadog Belgie (RDS Dashboard) Module

## Overview

This Terraform module creates an AWS RDS monitoring dashboard in Datadog, covering performance, storage, connections, and replication lag. It is designed for Belgian/EU deployments and ships with preconfigured alert recipients and monitoring thresholds.

## Features

- **Comprehensive RDS Metrics**: CPU, memory, connections, storage, disk queue, latency
- **Dynamic Dashboard**: 25+ preconfigured widgets with automatic metric visualization
- **Flexible Alerting**: Configurable alert recipients for different severity levels
- **CloudWatch Integration**: Leverages AWS RDS CloudWatch metrics
- **Customizable Monitors**: Map-based monitor configuration for easy customization

## Resources Created

- `datadog_dashboard`: RDS Database Dashboard with 25+ widgets including:
  - Query value widgets for read/write latency and IOPS
  - Timeseries for replication lag, connections, CPU, memory, disk metrics
  - Toplist widgets for metric ranking
  - Note widgets for dashboard organization

## Dashboard Widgets

The dashboard includes widgets for:
- Read/Write Latency (query value widgets)
- Replication Lag (timeseries)
- Database Connections (timeseries)
- CPU Utilization (timeseries + toplist)
- Read/Write Operations (timeseries)
- Freeable Memory (timeseries + toplist)
- Disk Queue Depth (timeseries)
- Free Storage Space (timeseries + toplist)

## Requirements

| Name | Version |
|------|---------|
| terraform | >= 0.12 |
| datadog | >= 3.1.2 |
| aws | >= 2.0 |

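
The constraints above could be pinned in the calling configuration roughly as follows. This is a sketch, not the module's own `versions.tf`: it assumes Terraform 0.13 or later (the `source` attribute inside `required_providers` is not available in 0.12) and the public registry addresses `DataDog/datadog` and `hashicorp/aws`.

```hcl
terraform {
  # The table above states >= 0.12; the provider "source" syntax used
  # below needs 0.13 or later, so that is assumed here.
  required_version = ">= 0.13"

  required_providers {
    datadog = {
      source  = "DataDog/datadog" # assumed registry address
      version = ">= 3.1.2"
    }
    aws = {
      source  = "hashicorp/aws" # assumed registry address
      version = ">= 2.0"
    }
  }
}
```
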
## Usage

```hcl
module "rds_dashboard" {
  source = "./terraform-datadog-belgie"

  region       = "eu-west-1"
  api_key      = var.datadog_api_key
  app_key      = var.datadog_app_key
  datadog_site = "https://api.datadoghq.eu/"
  aws_profile  = "production"
  cfa_slug     = "my-cfa"
  team         = "platform-team"
  application  = "myapp"
  stage        = "prd"

  alert_recipients   = ["team@example.com"]
  recipients         = ["team@example.com"]
  warning_recipients = ["team@example.com"]

  dd_rds_monitors = {
    cpu = {
      enabled  = true
      warning  = 75
      critical = 90
      name     = "RDS CPU High"
    }
    # ... additional monitors
  }
}
```

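
The example above reads the Datadog keys from `var.datadog_api_key` and `var.datadog_app_key`, which the calling configuration has to declare. A minimal sketch of those declarations, using the variable names from the example (the descriptions are illustrative):

```hcl
# Root-module variables referenced by the usage example.
variable "datadog_api_key" {
  description = "Datadog API key, passed to the module's api_key input"
  type        = string
  # On Terraform 0.14 or later, `sensitive = true` can be added here
  # to keep the value out of plan output.
}

variable "datadog_app_key" {
  description = "Datadog application key, passed to the module's app_key input"
  type        = string
}
```

One way to keep the values out of version control is to supply them through the `TF_VAR_datadog_api_key` and `TF_VAR_datadog_app_key` environment variables.
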
## Inputs

| Name | Description | Type | Required | Default |
|------|-------------|------|----------|---------|
| `region` | AWS region for resources | `string` | yes | - |
| `api_key` | Datadog API key | `string` | yes | - |
| `app_key` | Datadog application key | `string` | yes | - |
| `datadog_site` | Datadog API URL (EU or US site) | `string` | no | `"https://api.datadoghq.eu/"` |
| `aws_profile` | AWS account this integration belongs to | `string` | yes | - |
| `cfa_slug` | CFA this integration belongs to | `string` | yes | - |
| `team` | Team this integration belongs to | `string` | yes | - |
| `application` | Application name | `string` | yes | - |
| `stage` | Stage (dev, tst, acc, prd) | `string` | yes | - |
| `alert_recipients` | Alert notification recipients | `list(string)` | no | `["patrick.de.ruiter@sanoma.com"]` |
| `recipients` | General notification recipients | `list(string)` | no | `["patrick.de.ruiter@sanoma.com"]` |
| `warning_recipients` | Warning notification recipients | `list(string)` | no | `["patrick.de.ruiter@sanoma.com"]` |
| `dd_rds_monitors` | RDS monitor configuration map | `map(any)` | no | See variables.tf |

## RDS Monitors Configuration

The `dd_rds_monitors` variable accepts a map with the following monitor types:
- `cpu`: CPU utilization monitoring
- `memory`: Freeable memory monitoring
- `connections`: Database connections monitoring
- `storage`: Free storage space monitoring
- `disk_queue`: Disk queue depth monitoring
- `read_latency`: Read latency monitoring
- `write_latency`: Write latency monitoring
- `replication_lag`: Replication lag monitoring

Each monitor can be configured with:
```hcl
{
  enabled  = bool   # Enable/disable the monitor
  warning  = number # Warning threshold
  critical = number # Critical threshold
  name     = string # Monitor name
}
```

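
As an illustration of that shape, a map covering several monitor types might look like the sketch below. Only the `cpu` values come from the usage example; every other threshold and name is hypothetical. The sketch also assumes that the memory and storage monitors alert when the value drops below the threshold, which is why their warning values are higher than their critical values; check `variables.tf` for the actual defaults and comparison direction.

```hcl
dd_rds_monitors = {
  cpu = {
    enabled  = true
    warning  = 75 # percent
    critical = 90
    name     = "RDS CPU High"
  }
  memory = {
    enabled  = true
    warning  = 2147483648 # 2 GiB freeable memory (hypothetical)
    critical = 1073741824 # 1 GiB (hypothetical)
    name     = "RDS Freeable Memory Low"
  }
  storage = {
    enabled  = true
    warning  = 10737418240 # 10 GiB free storage (hypothetical)
    critical = 5368709120  # 5 GiB (hypothetical)
    name     = "RDS Free Storage Low"
  }
  replication_lag = {
    enabled  = false # e.g. for an instance without read replicas
    warning  = 30    # seconds (hypothetical)
    critical = 60
    name     = "RDS Replication Lag High"
  }
}
```

Because the variable is typed `map(any)`, it is safest to give every entry the same four attributes. Whether monitor types omitted from the map fall back to the defaults depends on how the module reads the variable.
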
## Outputs

This module does not currently export any outputs; the output definitions in the module are commented out.

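
If an output is needed, for example to link to the dashboard from other tooling, something along these lines could be enabled inside the module. This is a hypothetical sketch: the resource name `rds` is an assumption and is not taken from the module source.

```hcl
# Hypothetical output; "rds" is an assumed resource name.
output "dashboard_url" {
  description = "URL of the RDS dashboard in Datadog"
  value       = datadog_dashboard.rds.url
}
```
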
## Local Values

The module uses several local values for dynamic configuration:
- `dbidentifier`: Formatted as `{application}-{stage}`
- `rdsgraphs`: Map of 8 RDS metric queries
- `full_message`: Constructed alert message with links
- `tags`: Standard tags (team, stage, application)

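
The list above could be realized roughly as in the following sketch. It is a reconstruction from the descriptions, not the module's actual `locals` block: the query shape, the `dbinstanceidentifier` tag filter, and the map keys in `rdsgraphs` are assumptions, and only two of the eight entries are shown.

```hcl
locals {
  # {application}-{stage}
  dbidentifier = "${var.application}-${var.stage}"

  # Standard tags applied to resources
  tags = [
    "team:${var.team}",
    "stage:${var.stage}",
    "application:${var.application}",
  ]

  # Assumed query format, scoped to the generated identifier
  rdsgraphs = {
    cpu         = "avg:aws.rds.cpuutilization{dbinstanceidentifier:${local.dbidentifier}}"
    connections = "avg:aws.rds.database_connections{dbinstanceidentifier:${local.dbidentifier}}"
  }
}
```

With `application = "myapp"` and `stage = "prd"`, `dbidentifier` evaluates to `myapp-prd`, so the RDS instance identifier has to follow that naming for the queries to match.
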
## RDS Metrics Monitored

| Metric | Datadog Metric Name | Description |
|--------|---------------------|-------------|
| CPU | `aws.rds.cpuutilization` | CPU utilization percentage |
| Memory | `aws.rds.freeable_memory` | Freeable memory in bytes |
| Connections | `aws.rds.database_connections` | Number of database connections |
| Storage | `aws.rds.free_storage_space` | Free storage space in bytes |
| Read Latency | `aws.rds.read_latency` | Read operation latency |
| Write Latency | `aws.rds.write_latency` | Write operation latency |
| Disk Queue | `aws.rds.disk_queue_depth` | Disk queue depth |
| Replication Lag | `aws.rds.replica_lag` | Replication lag in seconds |

## Dashboard Layout

- **Layout Type**: Free layout (allows custom positioning)
- **Read-Only**: No (allows modifications in UI)
- **Widget Organization**: Sections with note widgets as headers
- **Conditional Formatting**: Metric values color-coded based on thresholds

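
To make these settings concrete, here is a reduced sketch of a free-layout dashboard with one note widget used as a section header and one query value widget with conditional formatting. It is not the module's code: the resource name, titles, query, threshold, and widget positions are all hypothetical, and the syntax follows the 3.x Datadog provider (`widget_layout` blocks and the `q` request attribute).

```hcl
resource "datadog_dashboard" "rds_example" {
  title        = "RDS Database Dashboard (example)"
  description  = "Illustrative free-layout dashboard"
  layout_type  = "free" # widgets are positioned explicitly
  is_read_only = false  # dashboard stays editable in the UI

  # Note widget acting as a section header
  widget {
    note_definition {
      content = "Latency"
    }
    widget_layout {
      x      = 0
      y      = 0
      width  = 47
      height = 5
    }
  }

  # Query value widget, shown in red above a hypothetical threshold
  widget {
    query_value_definition {
      title     = "Read latency"
      autoscale = true
      precision = 3

      request {
        q          = "avg:aws.rds.read_latency{dbinstanceidentifier:myapp-prd}"
        aggregator = "avg"

        conditional_formats {
          comparator = ">"
          value      = 0.02 # hypothetical threshold
          palette    = "white_on_red"
        }
      }
    }
    widget_layout {
      x      = 0
      y      = 6
      width  = 23
      height = 15
    }
  }
}
```
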
## Tagging Strategy

All resources are tagged with:
- `team:{team}`
- `stage:{stage}`
- `application:{application}`

## Notes

- The default Datadog site is the EU site (for GDPR compliance)
- The dashboard uses a free layout for flexible widget positioning
- Metrics are sourced from the AWS RDS CloudWatch integration
- The database identifier is generated automatically from the `application` and `stage` variables
- Alert messages include links to related monitors

## License

Internal use only - Sanoma/WeBuildYourCloud

## Authors

Created and maintained by the Platform Engineering team.