
# Terraform AWS Datadog2 Integration & Monitoring Module
## Overview
The `terraform-aws-datadog2` module integrates AWS with Datadog for monitoring and alerting. It sets up the AWS-Datadog integration and creates pre-configured Datadog monitors that track critical infrastructure metrics.
## Features
- Automated AWS-Datadog integration setup
- Pre-configured infrastructure monitors for:
  - CPU utilization
  - Memory utilization
  - System load
  - Disk space
  - Disk inodes
  - Disk usage forecasting (7-day prediction)
- CloudPosse label/tagging context for consistent naming
- Support for both EU and US Datadog endpoints
## Resources Created
### AWS Resources (via CloudPosse Module)
- **IAM Role** - Allows Datadog to assume this role for monitoring AWS resources
- **External ID** - Security mechanism for cross-account role assumption
- Associated IAM policies for AWS monitoring permissions
### Datadog Monitors
1. **CPU Utilization Monitor**
   - Type: Metric alert
   - Warning: 50%
   - Critical: 60%
2. **Memory Utilization Monitor**
   - Type: Query alert
   - Evaluation: 5 minutes
   - Warning: 10% usable memory remaining
   - Critical: 5% usable memory remaining
3. **System Load Monitor**
   - Type: Query alert
   - Tracks: 5-minute normalized system load
   - Evaluation: 30 minutes
   - Warning: 2.0
   - Critical: 2.5
4. **Disk Space Monitor**
   - Type: Query alert
   - Evaluation: 5 minutes
   - Warning: 80% used
   - Critical: 90% used
5. **Disk Inodes Monitor**
   - Type: Query alert
   - Evaluation: 5 minutes
   - Warning: 90% used
   - Critical: 95% used
6. **Disk Usage Forecast Monitor**
   - Type: Query alert with forecasting
   - Prediction: Next 7 days
   - Forecast model: Linear
   - Warning: 72% predicted usage
   - Critical: 80% predicted usage
## Usage
```hcl
module "datadog_monitoring" {
  source = "path/to/terraform-aws-datadog2"

  # Required variables
  region      = "eu-west-1"
  api_key     = var.datadog_api_key # Store securely!
  app_key     = var.datadog_app_key # Store securely!
  aws_profile = "your-aws-profile"
  prefix_slug = "mycompany"
  team        = "platform"

  # Optional variables
  datadog_site = "https://api.datadoghq.eu/" # Default

  # CloudPosse label context (optional)
  namespace   = "myorg"
  environment = "prod"
  stage       = "production"
  name        = "monitoring"

  tags = {
    Project   = "Infrastructure"
    ManagedBy = "Terraform"
  }
}
```
## Variables
### Required Variables
| Variable | Type | Description |
|----------|------|-------------|
| `region` | string | AWS region where monitored resources reside |
| `api_key` | string | Datadog API key for sending logs, metrics, and traces |
| `app_key` | string | Datadog application key used for programmatic access to the Datadog API |
| `aws_profile` | string | AWS profile name for authentication |
| `prefix_slug` | string | Prefix slug for naming |
| `team` | string | Team identifier |
### Optional Variables
| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `datadog_site` | string | `https://api.datadoghq.eu/` | Datadog site endpoint |
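
Since the default points at the EU endpoint, accounts hosted on the US site need to override `datadog_site`. A minimal sketch, assuming the module is instantiated as in the Usage section and that the account lives on Datadog's US1 site (`https://api.datadoghq.com/`):

```hcl
module "datadog_monitoring" {
  source = "path/to/terraform-aws-datadog2"

  # ... required variables as shown in the Usage section ...

  # Override the EU default to target the US1 site (assumed here).
  datadog_site = "https://api.datadoghq.com/"
}
```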
### CloudPosse Label Context Variables
| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `enabled` | bool | null | Enable/disable resource creation |
| `namespace` | string | null | Organization name or abbreviation |
| `environment` | string | null | Environment identifier |
| `stage` | string | null | Stage identifier |
| `name` | string | null | Solution name |
| `delimiter` | string | null | Delimiter between name components |
| `attributes` | list(string) | [] | Additional attributes for naming |
| `tags` | map(string) | {} | Additional tags |
| `label_order` | list(string) | null | Custom ordering of name components |
## Outputs
| Output | Description |
|--------|-------------|
| `aws_account_id` | AWS Account ID of the IAM Role for Datadog |
| `aws_role_name` | Name of the AWS IAM Role for Datadog |
| `datadog_external_id` | External ID for secure role assumption |
**Note:** These outputs supply the values you enter in Datadog's AWS integration settings to complete the integration.
## Dependencies
### Terraform Requirements
- Terraform >= 0.13.0
### Provider Requirements
- `hashicorp/aws` - AWS infrastructure management
- `datadog/datadog` - Datadog monitoring resources
- `hashicorp/local` >= 1.3 - Local file operations
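
For a root configuration that calls this module, the requirements above could be pinned roughly as follows. This is a sketch, not the module's own `versions.tf`; only the constraints listed above are included, and the AWS and Datadog provider versions are left unpinned because this README does not state them.

```hcl
terraform {
  required_version = ">= 0.13.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    datadog = {
      source = "datadog/datadog"
    }
    local = {
      source  = "hashicorp/local"
      version = ">= 1.3"
    }
  }
}
```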
### External Modules
1. **cloudposse/datadog-integration/aws** (v0.11.0)
   - Creates AWS IAM role and permissions for Datadog
   - Handles cross-account role assumption
2. **cloudposse/label/null** (v0.24.1)
   - Provides consistent tagging and naming conventions
### Prerequisites
- Valid AWS account with IAM role creation permissions
- Active Datadog account with monitor creation access
- Network connectivity to AWS and Datadog APIs
- Proper AWS profile configured
## Post-Deployment Setup
After applying this module, complete the integration in Datadog:
1. Navigate to the AWS integration settings in the Datadog console
2. Add the AWS account using the `aws_account_id` output
3. Enter the `aws_role_name` output as the IAM role name
4. Enter the `datadog_external_id` output as the external ID
5. Complete the AWS integration in the Datadog console
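
To read these values after `terraform apply`, the calling configuration can re-export the module outputs. A minimal sketch, assuming the module block is named `datadog_monitoring` as in the Usage section (the root-level output names are illustrative):

```hcl
# Re-export the module outputs so `terraform output` prints them.
output "datadog_aws_account_id" {
  value = module.datadog_monitoring.aws_account_id
}

output "datadog_aws_role_name" {
  value = module.datadog_monitoring.aws_role_name
}

output "datadog_external_id" {
  value = module.datadog_monitoring.datadog_external_id
}
```

Running `terraform output` in the root configuration then lists the three values needed for the steps above.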
## Monitor Alert Notifications
To receive alerts, configure notification channels in Datadog and update the monitors to include your notification preferences.
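
Datadog routes monitor notifications through `@`-handles placed in the monitor's `message`. The sketch below shows one way to add them to an existing monitor. The Slack channel and PagerDuty service names are hypothetical, and the corresponding Datadog integrations are assumed to be configured already.

```hcl
resource "datadog_monitor" "cpumonitor" {
  # ... other settings ...

  # @-handles in the message determine who is notified.
  # Both handle names below are placeholders.
  message = <<-EOT
    CPU utilization is high on {{host.name}}.
    {{#is_warning}}@slack-platform-alerts{{/is_warning}}
    {{#is_alert}}@pagerduty-platform{{/is_alert}}
  EOT
}
```

The `{{#is_warning}}` and `{{#is_alert}}` conditionals send warnings to one channel and critical alerts to another. A plain `@handle` with no conditional notifies on every state change.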
## Customization
### Adjusting Monitor Thresholds
To adjust alert thresholds, modify the monitor resources in `monitors.tf`:
```hcl
# Example: Adjust CPU warning to 60% and critical to 80%
resource "datadog_monitor" "cpumonitor" {
  # ... other settings ...

  thresholds = {
    warning  = 60
    critical = 80
  }
}
```
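
The `thresholds = { ... }` map shown above is the form used by 2.x releases of the Datadog provider; from provider 3.0 onward the same setting is written as a `monitor_thresholds` block. A sketch of the equivalent change, assuming the module were run against a 3.x provider (the name, message, and query are illustrative, not the module's actual values):

```hcl
resource "datadog_monitor" "cpumonitor" {
  name    = "CPU utilization" # illustrative
  type    = "metric alert"
  message = "CPU usage is high on {{host.name}}."

  # The value compared in the query must equal the critical threshold.
  query = "avg(last_5m):avg:system.cpu.user{*} by {host} > 80"

  monitor_thresholds {
    warning  = 60
    critical = 80
  }
}
```

In either form, changing the critical threshold also means updating the comparison value in the monitor's `query`, since Datadog rejects monitors where the two disagree.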
### Adding Additional Monitors
Add new monitor resources to `monitors.tf` following the existing patterns.
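
As a sketch of what following the existing patterns could look like, the resource below adds a hypothetical I/O wait monitor. The resource name, monitor name, query, and thresholds are illustrative assumptions rather than part of the module. It uses the same `thresholds` map form as the example under Adjusting Monitor Thresholds.

```hcl
# Hypothetical extra monitor; not shipped with the module.
resource "datadog_monitor" "iowaitmonitor" {
  name    = "High CPU I/O wait"
  type    = "query alert"
  message = "I/O wait is high on {{host.name}}."

  # The value in the query must match the critical threshold below.
  query = "avg(last_5m):avg:system.cpu.iowait{*} by {host} > 50"

  thresholds = {
    warning  = 40
    critical = 50
  }
}
```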
## Security Considerations
- Store API keys and app keys securely (use Terraform Cloud, AWS Secrets Manager, or HashiCorp Vault)
- Never commit sensitive credentials to version control
- Use IAM role-based access instead of IAM user credentials where possible
- Review and adjust monitor thresholds based on your workload requirements
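
One way to keep the keys out of version control is to declare them as sensitive input variables in the calling configuration and supply them at run time. A minimal sketch, assuming the variable names `datadog_api_key` and `datadog_app_key` from the Usage section; note that the `sensitive` argument requires Terraform 0.14 or later, which is newer than the 0.13.0 minimum listed above.

```hcl
variable "datadog_api_key" {
  type        = string
  description = "Datadog API key"
  sensitive   = true # requires Terraform >= 0.14
}

variable "datadog_app_key" {
  type        = string
  description = "Datadog application key"
  sensitive   = true # requires Terraform >= 0.14
}
```

The values can then be provided through the `TF_VAR_datadog_api_key` and `TF_VAR_datadog_app_key` environment variables, populated from whichever secret store is in use, instead of a committed `.tfvars` file.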
## License
See project license file.
## Authors
Maintained by the WebBuildYourCloud team.