# Terraform AWS EKS Node Group Module

## Overview
This Terraform module provisions an Amazon EKS Node Group with managed EC2 worker nodes. It provides a production-ready configuration for running containerized workloads on Amazon EKS with support for auto-scaling, custom AMIs, SSH access, and IAM role management.
## Features
- Managed EKS Node Group with auto-scaling capabilities
- Configurable EC2 instance types and disk sizes
- Support for AL2_x86_64 and AL2_x86_64_GPU AMI types
- Optional SSH access with key pairs
- Kubernetes labels support
- IAM roles and policies for worker nodes
- Automatic attachment of required AWS managed policies
- Support for custom IAM policies
- CloudPosse naming conventions for consistent resource naming
- Conditional module enablement
## Resources Created

### Compute
- AWS EKS Node Group
- Auto Scaling Group (managed by EKS)
- EC2 Instances (managed by EKS)
### IAM
- IAM Role for worker nodes
- IAM Role Policy Attachments:
  - `AmazonEKSWorkerNodePolicy`
  - `AmazonEKS_CNI_Policy`
  - `AmazonEC2ContainerRegistryReadOnly`
  - Custom policies (optional)
## Usage

### Basic Example
```hcl
module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # Naming
  namespace  = "myorg"
  stage      = "prod"
  name       = "app"
  attributes = []

  # Cluster Configuration
  cluster_name = "my-eks-cluster"
  subnet_ids   = ["subnet-12345678", "subnet-87654321"]

  # Node Group Sizing
  desired_size = 3
  min_size     = 2
  max_size     = 5

  # Instance Configuration
  instance_types = ["t3.medium"]
  disk_size      = 20

  # Kubernetes Labels
  kubernetes_labels = {
    Environment = "production"
    Team        = "platform"
  }

  # Tags
  tags = {
    ManagedBy = "terraform"
  }
}
```
### Advanced Example with SSH Access
```hcl
module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  namespace = "myorg"
  stage     = "prod"
  name      = "app"

  cluster_name = module.eks_cluster.cluster_id
  subnet_ids   = module.vpc.private_subnet_ids

  # Node Group Configuration
  desired_size   = 5
  min_size       = 3
  max_size       = 10
  instance_types = ["t3.large", "t3.xlarge"]
  disk_size      = 50

  # AMI Configuration
  ami_type            = "AL2_x86_64"
  ami_release_version = "1.21.5-20220123"
  kubernetes_version  = "1.21"

  # SSH Access
  ec2_ssh_key               = "my-keypair"
  source_security_group_ids = ["sg-12345678"]

  # Custom IAM Policies
  existing_workers_role_policy_arns = [
    "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
  ]
  existing_workers_role_policy_arns_count = 1

  kubernetes_labels = {
    NodeType    = "general-purpose"
    Environment = "production"
  }

  tags = {
    CostCenter = "engineering"
  }
}
```
### GPU-Enabled Node Group
```hcl
module "eks_gpu_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  namespace = "myorg"
  stage     = "prod"
  name      = "gpu"

  cluster_name = module.eks_cluster.cluster_id
  subnet_ids   = module.vpc.private_subnet_ids

  desired_size   = 2
  min_size       = 1
  max_size       = 4
  instance_types = ["g4dn.xlarge"]
  ami_type       = "AL2_x86_64_GPU"
  disk_size      = 100

  kubernetes_labels = {
    NodeType         = "gpu"
    "nvidia.com/gpu" = "true"
  }
}
```
## Variables
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| namespace | Namespace (e.g., 'eg' or 'cp') | string | `""` | no |
| stage | Stage (e.g., 'prod', 'staging', 'dev') | string | `""` | no |
| name | Solution name (e.g., 'app' or 'cluster') | string | n/a | yes |
| delimiter | Delimiter between namespace, stage, name and attributes | string | `"-"` | no |
| attributes | Additional attributes | list(string) | `[]` | no |
| tags | Additional tags | map(string) | `{}` | no |
| enabled | Enable/disable module resources | bool | `true` | no |
| cluster_name | Name of the EKS cluster | string | n/a | yes |
| ec2_ssh_key | SSH key name for worker node access | string | `null` | no |
| desired_size | Desired number of worker nodes | number | n/a | yes |
| max_size | Maximum number of worker nodes | number | n/a | yes |
| min_size | Minimum number of worker nodes | number | n/a | yes |
| subnet_ids | List of subnet IDs to launch resources in | list(string) | n/a | yes |
| existing_workers_role_policy_arns | List of existing policy ARNs to attach | list(string) | `[]` | no |
| existing_workers_role_policy_arns_count | Count of existing policy ARNs | number | `0` | no |
| ami_type | Type of AMI (AL2_x86_64, AL2_x86_64_GPU) | string | `"AL2_x86_64"` | no |
| disk_size | Disk size in GiB for worker nodes | number | `20` | no |
| instance_types | List of instance types for the EKS Node Group | list(string) | n/a | yes |
| kubernetes_labels | Key-value mapping of Kubernetes labels | map(string) | `{}` | no |
| ami_release_version | AMI version of the EKS Node Group | string | `null` | no |
| kubernetes_version | Kubernetes version | string | `null` | no |
| source_security_group_ids | Security Group IDs to allow SSH access | list(string) | `[]` | no |
## Outputs
| Name | Description |
|---|---|
| eks_node_group_role_arn | ARN of the worker nodes IAM role |
| eks_node_group_role_name | Name of the worker nodes IAM role |
| eks_node_group_id | EKS Cluster name and EKS Node Group name separated by colon |
| eks_node_group_arn | Amazon Resource Name (ARN) of the EKS Node Group |
| eks_node_group_resources | List of objects containing information about underlying resources |
| eks_node_group_status | Status of the EKS Node Group |
## Requirements
| Name | Version |
|---|---|
| terraform | >= 0.13 |
| aws | ~> 3.27 |
| template | ~> 2.2 |
| local | ~> 2.0 |
## Dependencies
This module uses:
- cloudposse/terraform-null-label - Resource naming
## IAM Policies
The module automatically attaches the following AWS managed policies to the worker node IAM role:
- `AmazonEKSWorkerNodePolicy`: Allows worker nodes to connect to the EKS cluster
- `AmazonEKS_CNI_Policy`: Provides IP address management for pods
- `AmazonEC2ContainerRegistryReadOnly`: Allows pulling images from ECR
Additional custom policies can be attached via `existing_workers_role_policy_arns`.
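For instance, a custom policy created elsewhere in the same configuration can be passed straight through. This is a minimal sketch: the policy name, bucket ARN, and permissions are placeholders, not part of the module.

```hcl
# Hypothetical custom policy for application-specific permissions
resource "aws_iam_policy" "worker_custom" {
  name = "eks-worker-custom"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:aws:s3:::my-app-bucket/*"
    }]
  })
}

module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # ... other required arguments ...

  existing_workers_role_policy_arns       = [aws_iam_policy.worker_custom.arn]
  existing_workers_role_policy_arns_count = 1
}
```

Note that the count variable must match the number of ARNs, since the module cannot derive it from a list that is only known after apply.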
## Important Notes

- Cluster Name: The `cluster_name` must match an existing EKS cluster
- Subnets: Node groups should typically be deployed in private subnets
- Instance Types: You can specify multiple instance types for better availability
- SSH Access: If `ec2_ssh_key` is specified without `source_security_group_ids`, port 22 will be open to the internet (0.0.0.0/0)
- Auto Scaling: The node group will automatically scale between `min_size` and `max_size` based on pod scheduling needs
- Kubernetes Version: If not specified, the cluster's Kubernetes version will be used
- AMI Updates: When `ami_release_version` is not specified, the latest AMI for the Kubernetes version is used
- Tagging: The module automatically adds the `kubernetes.io/cluster/<cluster_name> = "owned"` tag
## Best Practices
- Use Multiple Instance Types: Specify multiple instance types for better EC2 capacity availability
- Private Subnets: Deploy node groups in private subnets for security
- Right-Sizing: Start with conservative instance sizes and scale based on actual usage
- Disk Size: Allocate sufficient disk space for container images and logs (minimum 20 GiB)
- Labels: Use Kubernetes labels for node selection in pod specifications
- Security Groups: Restrict SSH access to specific security groups
- IAM Policies: Only attach necessary custom IAM policies
- Version Management: Pin `ami_release_version` and `kubernetes_version` for consistency
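The version-pinning practice above can be sketched as follows; the versions shown are illustrative, and the remaining required arguments are elided.

```hcl
module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # ... other required arguments ...

  # Pin both versions so node AMIs do not drift between applies
  kubernetes_version  = "1.21"
  ami_release_version = "1.21.5-20220123"
}
```

With both values pinned, node upgrades become explicit, reviewable changes to the configuration rather than a side effect of AWS releasing a new AMI.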
## Kubernetes Integration
After node group creation, nodes automatically join the cluster. You can target specific node groups using node selectors:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeSelector:
    Environment: production
    Team: platform
  containers:
    - name: my-container
      image: nginx
```
## Scaling Behavior
The node group will automatically scale based on:
- Pod resource requests that cannot be scheduled
- Cluster Autoscaler policies (if installed)
- Manual scaling via AWS console or API
- Kubernetes Horizontal Pod Autoscaler demands
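If you run the Cluster Autoscaler, it finds node groups via its standard auto-discovery tags, which can be supplied through the module's `tags` variable. A minimal sketch, assuming a cluster named `my-eks-cluster` (the name is a placeholder):

```hcl
module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # ... other required arguments ...

  tags = {
    "k8s.io/cluster-autoscaler/enabled"        = "true"
    "k8s.io/cluster-autoscaler/my-eks-cluster" = "owned"
  }
}
```

EKS propagates managed node group tags to the underlying Auto Scaling Group, which is where the autoscaler looks for them.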
## Troubleshooting

### Nodes not joining cluster
- Verify the cluster name is correct
- Check that subnets have the proper tag (`kubernetes.io/cluster/<cluster_name> = "shared"`)
- Ensure the IAM role has the required policies attached
- Verify security groups allow communication with cluster
### Auto-scaling not working
- Install and configure Cluster Autoscaler
- Verify IAM permissions for auto-scaling
- Check that min/max size allows scaling
### SSH access not working
- Verify `ec2_ssh_key` exists in the region
- Check that security group rules allow SSH from your IP
- Ensure bastion host or VPN connectivity to private subnets
## License
This module is provided as-is for use within your organization.