Terraform AWS EKS Node Group Module

Overview

This Terraform module provisions an Amazon EKS Node Group with managed EC2 worker nodes. It provides a production-ready configuration for running containerized workloads on Amazon EKS with support for auto-scaling, custom AMIs, SSH access, and IAM role management.

Features

  • Managed EKS Node Group with auto-scaling capabilities
  • Configurable EC2 instance types and disk sizes
  • Support for AL2_x86_64 and AL2_x86_64_GPU AMI types
  • Optional SSH access with key pairs
  • Kubernetes labels support
  • IAM roles and policies for worker nodes
  • Automatic attachment of required AWS managed policies
  • Support for custom IAM policies
  • CloudPosse naming conventions for consistent resource naming
  • Conditional module enablement

Resources Created

Compute

  • AWS EKS Node Group
  • Auto Scaling Group (managed by EKS)
  • EC2 Instances (managed by EKS)

IAM

  • IAM Role for worker nodes
  • IAM Role Policy Attachments:
    • AmazonEKSWorkerNodePolicy
    • AmazonEKS_CNI_Policy
    • AmazonEC2ContainerRegistryReadOnly
    • Custom policies (optional)

Usage

Basic Example

module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # Naming
  namespace  = "myorg"
  stage      = "prod"
  name       = "app"
  attributes = []

  # Cluster Configuration
  cluster_name = "my-eks-cluster"
  subnet_ids   = ["subnet-12345678", "subnet-87654321"]

  # Node Group Sizing
  desired_size = 3
  min_size     = 2
  max_size     = 5

  # Instance Configuration
  instance_types = ["t3.medium"]
  disk_size      = 20

  # Kubernetes Labels
  kubernetes_labels = {
    Environment = "production"
    Team        = "platform"
  }

  # Tags
  tags = {
    ManagedBy = "terraform"
  }
}

Advanced Example with SSH Access

module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  namespace = "myorg"
  stage     = "prod"
  name      = "app"

  cluster_name = module.eks_cluster.cluster_id
  subnet_ids   = module.vpc.private_subnet_ids

  # Node Group Configuration
  desired_size   = 5
  min_size       = 3
  max_size       = 10
  instance_types = ["t3.large", "t3.xlarge"]
  disk_size      = 50

  # AMI Configuration
  ami_type            = "AL2_x86_64"
  ami_release_version = "1.21.5-20220123"
  kubernetes_version  = "1.21"

  # SSH Access
  ec2_ssh_key               = "my-keypair"
  source_security_group_ids = ["sg-12345678"]

  # Custom IAM Policies
  existing_workers_role_policy_arns       = [
    "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
  ]
  existing_workers_role_policy_arns_count = 1

  kubernetes_labels = {
    NodeType    = "general-purpose"
    Environment = "production"
  }

  tags = {
    CostCenter = "engineering"
  }
}

GPU-Enabled Node Group

module "eks_gpu_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  namespace = "myorg"
  stage     = "prod"
  name      = "gpu"

  cluster_name = module.eks_cluster.cluster_id
  subnet_ids   = module.vpc.private_subnet_ids

  desired_size   = 2
  min_size       = 1
  max_size       = 4
  instance_types = ["g4dn.xlarge"]
  ami_type       = "AL2_x86_64_GPU"
  disk_size      = 100

  kubernetes_labels = {
    NodeType         = "gpu"
    "nvidia.com/gpu" = "true"
  }
}

Variables

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| namespace | Namespace (e.g., 'eg' or 'cp') | string | "" | no |
| stage | Stage (e.g., 'prod', 'staging', 'dev') | string | "" | no |
| name | Solution name (e.g., 'app' or 'cluster') | string | n/a | yes |
| delimiter | Delimiter between namespace, stage, name and attributes | string | "-" | no |
| attributes | Additional attributes | list(string) | [] | no |
| tags | Additional tags | map(string) | {} | no |
| enabled | Enable/disable module resources | bool | true | no |
| cluster_name | Name of the EKS cluster | string | n/a | yes |
| ec2_ssh_key | SSH key name for worker node access | string | null | no |
| desired_size | Desired number of worker nodes | number | n/a | yes |
| max_size | Maximum number of worker nodes | number | n/a | yes |
| min_size | Minimum number of worker nodes | number | n/a | yes |
| subnet_ids | List of subnet IDs to launch resources in | list(string) | n/a | yes |
| existing_workers_role_policy_arns | List of existing policy ARNs to attach | list(string) | [] | no |
| existing_workers_role_policy_arns_count | Count of existing policy ARNs | number | 0 | no |
| ami_type | Type of AMI (AL2_x86_64, AL2_x86_64_GPU) | string | "AL2_x86_64" | no |
| disk_size | Disk size in GiB for worker nodes | number | 20 | no |
| instance_types | List of instance types for EKS Node Group | list(string) | n/a | yes |
| kubernetes_labels | Key-value mapping of Kubernetes labels | map(string) | {} | no |
| ami_release_version | AMI version of the EKS Node Group | string | null | no |
| kubernetes_version | Kubernetes version | string | null | no |
| source_security_group_ids | Security Group IDs to allow SSH access | list(string) | [] | no |

Outputs

| Name | Description |
|------|-------------|
| eks_node_group_role_arn | ARN of the worker nodes IAM role |
| eks_node_group_role_name | Name of the worker nodes IAM role |
| eks_node_group_id | EKS Cluster name and EKS Node Group name, separated by a colon |
| eks_node_group_arn | Amazon Resource Name (ARN) of the EKS Node Group |
| eks_node_group_resources | List of objects containing information about underlying resources |
| eks_node_group_status | Status of the EKS Node Group |

Requirements

| Name | Version |
|-----------|---------|
| terraform | >= 0.13 |
| aws | ~> 3.27 |
| template | ~> 2.2 |
| local | ~> 2.0 |

Dependencies

This module uses the aws, template, and local Terraform providers (see Requirements above).

IAM Policies

The module automatically attaches the following AWS managed policies to the worker node IAM role:

  1. AmazonEKSWorkerNodePolicy - Allows worker nodes to connect to EKS cluster
  2. AmazonEKS_CNI_Policy - Provides IP address management for pods
  3. AmazonEC2ContainerRegistryReadOnly - Allows pulling images from ECR

Additional custom policies can be attached via existing_workers_role_policy_arns; set existing_workers_role_policy_arns_count to match the number of ARNs in the list.
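To illustrate how these attachments fit together, the sketch below shows roughly what the module manages internally. Resource and variable names here are hypothetical, not the module's actual source:

```hcl
# Illustrative sketch only -- names are hypothetical, not the module's source.
# One attachment per required AWS managed policy:
resource "aws_iam_role_policy_attachment" "amazon_eks_worker_node_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.workers.name
}

# Plus one attachment per custom policy ARN passed in by the caller:
resource "aws_iam_role_policy_attachment" "existing_policies" {
  count      = var.existing_workers_role_policy_arns_count
  policy_arn = var.existing_workers_role_policy_arns[count.index]
  role       = aws_iam_role.workers.name
}
```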

Important Notes

  1. Cluster Name: The cluster_name must match an existing EKS cluster
  2. Subnets: Node groups should typically be deployed in private subnets
  3. Instance Types: You can specify multiple instance types for better availability
  4. SSH Access: If ec2_ssh_key is specified without source_security_group_ids, port 22 will be open to the internet (0.0.0.0/0)
  5. Auto Scaling: The node group will automatically scale between min_size and max_size based on pod scheduling needs
  6. Kubernetes Version: If not specified, the cluster's Kubernetes version will be used
  7. AMI Updates: When ami_release_version is not specified, the latest AMI for the Kubernetes version is used
  8. Tagging: The module automatically adds kubernetes.io/cluster/<cluster_name> = "owned" tag

Best Practices

  1. Use Multiple Instance Types: Specify multiple instance types for better EC2 capacity availability
  2. Private Subnets: Deploy node groups in private subnets for security
  3. Right-Sizing: Start with conservative instance sizes and scale based on actual usage
  4. Disk Size: Allocate sufficient disk space for container images and logs (minimum 20 GiB)
  5. Labels: Use Kubernetes labels for node selection in pod specifications
  6. Security Groups: Restrict SSH access to specific security groups
  7. IAM Policies: Only attach necessary custom IAM policies
  8. Version Management: Pin ami_release_version and kubernetes_version for consistency
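Applied to a module call, the version-pinning practice above can look like this (version values are illustrative; other required arguments are elided):

```hcl
module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # ... cluster, sizing, and naming arguments as in the examples above ...

  # Pin both versions so node AMIs do not drift between plan runs.
  kubernetes_version  = "1.21"
  ami_release_version = "1.21.5-20220123"
}
```

With both values pinned, upgrades become explicit diffs in code review rather than implicit changes when AWS publishes a new AMI.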

Kubernetes Integration

After node group creation, nodes automatically join the cluster. You can target specific node groups using node selectors:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeSelector:
    Environment: production
    Team: platform
  containers:
  - name: my-container
    image: nginx

Scaling Behavior

The node group will automatically scale based on:

  • Pod resource requests that cannot be scheduled
  • Cluster Autoscaler policies (if installed)
  • Manual scaling via AWS console or API
  • Kubernetes Horizontal Pod Autoscaler demands
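If you use the Cluster Autoscaler, it finds node groups via its standard auto-discovery tags. These tags are an autoscaler convention, not something this module adds automatically, so a common approach is to pass them through the tags variable (cluster name is illustrative):

```hcl
module "eks_node_group" {
  source = "git@github.com:webuildyourcloud/terraform-aws-eks_node_group.git?ref=tags/0.0.2"

  # ... other arguments as in the examples above ...

  tags = {
    # Standard Cluster Autoscaler auto-discovery tags.
    "k8s.io/cluster-autoscaler/enabled"        = "true"
    "k8s.io/cluster-autoscaler/my-eks-cluster" = "owned"
  }
}
```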

Troubleshooting

Nodes not joining cluster

  • Verify the cluster name is correct
  • Check that subnets have proper tags (kubernetes.io/cluster/<cluster_name> = "shared")
  • Ensure IAM role has required policies attached
  • Verify security groups allow communication with cluster
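The subnet tag mentioned above can itself be managed in Terraform, for example with an aws_ec2_tag resource when the subnet is created outside your configuration (subnet ID and cluster name below are illustrative):

```hcl
# Tag an existing subnet so the EKS cluster can place nodes in it.
resource "aws_ec2_tag" "cluster_shared" {
  resource_id = "subnet-12345678"
  key         = "kubernetes.io/cluster/my-eks-cluster"
  value       = "shared"
}
```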

Auto-scaling not working

  • Install and configure Cluster Autoscaler
  • Verify IAM permissions for auto-scaling
  • Check that min/max size allows scaling

SSH access not working

  • Verify ec2_ssh_key exists in the region
  • Check security group rules allow SSH from your IP
  • Ensure bastion host or VPN connectivity to private subnets

License

This module is provided as-is for use within your organization.
