From 372fa8fabc0916b8e3310f9a4b6fddd8054b5d1b Mon Sep 17 00:00:00 2001
From: Patrick de Ruiter <pderuiter@bsdserver.nl>
Date: Sat, 1 Nov 2025 10:43:46 +0100
Subject: [PATCH] Initial commit with README and module files

---
 .gitignore          |   0
 .terraform.lock.hcl |   0
 README.md           | 140 ++++++++++++++++++++++++++++++++++++++++++++
 main.tf             |   0
 provider.tf         |   0
 variables.tf        |   0
 6 files changed, 140 insertions(+)
 mode change 100644 => 100755 .gitignore
 mode change 100644 => 100755 .terraform.lock.hcl
 create mode 100644 README.md
 mode change 100644 => 100755 main.tf
 mode change 100644 => 100755 provider.tf
 mode change 100644 => 100755 variables.tf

diff --git a/.gitignore b/.gitignore
old mode 100644
new mode 100755
diff --git a/.terraform.lock.hcl b/.terraform.lock.hcl
old mode 100644
new mode 100755
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..29069aa
--- /dev/null
+++ b/README.md
@@ -0,0 +1,140 @@
+# Terraform Datadog Monitors Module
+
+## Overview
+
+This Terraform module creates basic host metric monitors for CPU and disk usage, along with an accompanying Datadog timeboard for visualization.
+
+## Features
+
+- **CPU Monitoring**: Track EC2 instance CPU utilization
+- **Disk Monitoring**: Monitor disk usage across hosts
+- **Automated Alerting**: No-data notifications included
+- **Visualization**: Read-only timeboard with alert thresholds
+- **Configurable Thresholds**: Customizable warning and critical levels
+
+## Resources Created
+
+- `datadog_monitor` (disk_usage): Metric alert for disk usage
+- `datadog_monitor` (cpu_usage): Query alert for CPU usage
+- `datadog_timeboard` (host_metrics): Read-only visualization dashboard
+
+## Requirements
+
+| Name | Version |
+|------|---------|
+| terraform | >= 0.12 |
+| datadog | >= 3.2.0 |
+
+## Usage
+
+```hcl
+module "datadog_monitors" {
+  source = "./terraform-datadog-monitors"
+
+  datadog_api_key = var.datadog_api_key
+  datadog_app_key = var.datadog_app_key
+  api_url         = "https://api.datadoghq.eu"
+
+  disk_usage = {
+    query     = "max:system.disk.in_use"
+    threshold = "85"
+  }
+
+  cpu_usage = {
+    query     = "avg:aws.ec2.cpuutilization"
+    threshold = "85"
+  }
+}
+```
+
+## Inputs
+
+| Name | Description | Type | Required | Default |
+|------|-------------|------|----------|---------|
+| `datadog_api_key` | Datadog API key | `string` | yes | - |
+| `datadog_app_key` | Datadog application key | `string` | yes | - |
+| `api_url` | Datadog API endpoint | `string` | no | `"https://api.datadoghq.eu"` |
+| `http_client_retry_enabled` | Enable retries for failed requests (HTTP 429 and 5xx) | `bool` | no | `true` |
+| `http_client_retry_timeout` | HTTP retry timeout | `string` | no | `""` |
+| `validate` | Validate the API and application keys when the provider initializes | `bool` | no | `true` |
+| `disk_usage` | Query and threshold for the disk monitor | `map` | no | See `variables.tf` |
+| `cpu_usage` | Query and threshold for the CPU monitor | `map` | no | See `variables.tf` |
+| `datadog_alert_footer` | Footer appended to alert messages | `string` | no | PagerDuty + Slack template |
+| `trigger_by` | Tag grouping (`by` clause) for alerts | `string` | no | `"{host,env}"` |
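+
+The declarations behind the map inputs are not reproduced in this README. As a rough sketch, `disk_usage` and `trigger_by` could be declared as shown here; the `map(string)` type constraint and the default values are assumptions inferred from the usage example and the monitor queries in this document, not a copy of the module's `variables.tf`.
+
+```hcl
+# Sketch only: type constraint and defaults are assumed, not taken from variables.tf.
+variable "disk_usage" {
+  description = "Query and threshold for disk monitor"
+  type        = map(string)
+
+  default = {
+    query     = "max:system.disk.in_use" # metric query without scope or grouping
+    threshold = "85"                     # percentage, kept as a string in the map
+  }
+}
+
+variable "trigger_by" {
+  description = "Grouping for alerts"
+  type        = string
+  default     = "{host,env}"
+}
+```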
+
+## Monitor Configuration
+
+### Disk Usage Monitor
+
+- **Query**: `avg(last_5m):max:system.disk.in_use{*} by {host,env} * 100 > 85`
+- **Type**: Metric alert
+- **Threshold**: 85% (configurable)
+- **Evaluation**: Average over the last 5 minutes
+- **Grouping**: By `host` and `env`
+- **No Data**: Notifies after 10 minutes without data
+
+### CPU Usage Monitor
+
+- **Query**: `avg(last_5m):avg:aws.ec2.cpuutilization{*} by {host,env} > 85`
+- **Type**: Query alert
+- **Threshold**: 85% (configurable)
+- **Evaluation**: Average over the last 5 minutes
+- **Grouping**: By `host` and `env`
+- **No Data**: Notifies after 10 minutes without data
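+
+To illustrate how the CPU query above might be assembled from the module inputs, here is a minimal sketch of a `datadog_monitor` resource. The monitor name and message text are hypothetical, and the module's actual `main.tf` may be structured differently.
+
+```hcl
+# Sketch only: name and message are illustrative, not taken from main.tf.
+resource "datadog_monitor" "cpu_usage" {
+  name = "High CPU usage on {{host.name}}"
+  type = "query alert"
+
+  # Renders to: avg(last_5m):avg:aws.ec2.cpuutilization{*} by {host,env} > 85
+  query = "avg(last_5m):${var.cpu_usage["query"]}{*} by ${var.trigger_by} > ${var.cpu_usage["threshold"]}"
+
+  message = "CPU usage is above ${var.cpu_usage["threshold"]}%.\n${var.datadog_alert_footer}"
+
+  # The critical threshold must match the value used in the query.
+  monitor_thresholds {
+    critical = var.cpu_usage["threshold"]
+  }
+
+  # Notify when no data has arrived for 10 minutes.
+  notify_no_data    = true
+  no_data_timeframe = 10
+}
+```
+
+The disk usage monitor would follow the same pattern, with `type = "metric alert"` and `* 100` applied before the comparison.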
+
+## Timeboard
+
+The module creates a read-only timeboard with:
+- CPU usage graph with alert threshold marker
+- Disk usage graph with alert threshold marker
+- Alert overlay showing when thresholds are breached
+
+## Alert Message Template
+
+The default alert footer includes notification handles for:
+- PagerDuty: `@pagerduty-service_name`
+- Slack: `@slack-channel_name`
+
+Customize via the `datadog_alert_footer` variable.
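+
+As an example, a custom footer could be passed as a heredoc. This is a sketch: the handles are the placeholders listed above, and wrapping the PagerDuty handle in Datadog's `{{#is_alert}}` conditional is an illustrative choice rather than the module's default template.
+
+```hcl
+module "datadog_monitors" {
+  source = "./terraform-datadog-monitors"
+
+  datadog_api_key = var.datadog_api_key
+  datadog_app_key = var.datadog_app_key
+
+  # Hypothetical footer: page only on alerts, always post to Slack.
+  datadog_alert_footer = <<-EOT
+    {{#is_alert}}@pagerduty-service_name{{/is_alert}}
+    @slack-channel_name
+  EOT
+}
+```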
+
+## Outputs
+
+Currently, this module does not export any outputs.
+
+## Customization
+
+### Custom Thresholds
+
+```hcl
+disk_usage = {
+  query     = "max:system.disk.in_use"
+  threshold = "90"  # Raise to 90%
+}
+
+cpu_usage = {
+  query     = "avg:aws.ec2.cpuutilization"
+  threshold = "75"  # Lower to 75%
+}
+```
+
+### Custom Grouping
+
+```hcl
+trigger_by = "{host,env,service}"
+```
+
+## Notes
+
+- Monitors include no-data alerting by default
+- Timeboard is read-only to prevent accidental modifications
+- Uses 5-minute evaluation windows
+- Supports HTTP client retries for reliability
+- Can be reused across multiple environments via variable configuration
+
+## License
+
+Internal use only - Sanoma/WeBuildYourCloud
+
+## Authors
+
+Created and maintained by the Platform Engineering team.
diff --git a/main.tf b/main.tf
old mode 100644
new mode 100755
diff --git a/provider.tf b/provider.tf
old mode 100644
new mode 100755
diff --git a/variables.tf b/variables.tf
old mode 100644
new mode 100755