Introduction
In Part 1, we finished building the EKS Control Plane. By the end of that post, we had set up:
- VPC, Subnets, and a NAT Gateway
- The Kubernetes API Server
- An IAM role with sufficient permissions for the Control Plane
- kubectl already connected to the cluster
But if you run:
kubectl get nodes
you’ll see… nothing.
That’s the real state of the cluster we created: it has a brain (Control Plane), but no hands and feet (Worker Nodes). Without compute capacity, the Kubernetes Scheduler can only stare at Pods stuck in the Pending state.
In this Part 2, we’ll “grow limbs” for your EKS cluster by deploying Worker Nodes using AWS Managed Nodes – the most practical middle ground between production readiness and operational effort.
This post won’t stop at “click and it works.” We’ll also explore what AWS quietly builds for you in a Managed Node Group, and contrast it with the Self-managed way so you understand the fundamentals and can debug with confidence.
📝 A Quick Primer on Worker Nodes
What is a Worker Node, really?
In the most accurate – and simplest – definition:
A Worker Node is an EC2 instance configured to run Kubernetes workloads (Pods).
At minimum, every worker node runs:
- kubelet – the agent that talks to the Kubernetes API Server
- a container runtime – typically containerd
- kube-proxy – manages Service/network rules
- AWS VPC CNI – the plugin that allocates Pod IPs from your VPC
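Once your nodes exist (after the hands-on steps below), you can spot most of these components from your workstation. A quick check, assuming kubectl is already pointed at the cluster:

```bash
# The CONTAINER-RUNTIME column shows the runtime on each node (e.g. containerd://...)
kubectl get nodes -o wide

# kube-proxy and the AWS VPC CNI (aws-node) run as DaemonSets in kube-system
kubectl get daemonsets -n kube-system
```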
Following production best practices, EKS worker nodes usually live inside private subnets without public IPs. That significantly reduces exposure: workloads are not directly reachable from the Internet, and outbound connectivity is handled through a NAT Gateway.
Worker Node Deployment Models
AWS offers three main ways to run compute for EKS, each with a different balance of control, operational cost, and “serverless-ness”:
| Criteria | Managed Nodes | Self-managed Nodes | AWS Fargate |
|---|---|---|---|
| Infrastructure | EC2 Instances (partly managed by AWS) | EC2 Instances (fully managed by you) | Serverless (no nodes to manage) |
| Operational cost (OpEx) | Low | High | Very low |
| Control | Medium | Highest | Lowest |
| Billing | Per EC2 instance | Per EC2 instance | Per Pod (CPU/Mem) |
Fargate hides almost everything—including the “node layer.” That’s convenient, but if your goal is to understand EKS fundamentals, the interesting engineering happens in Managed Nodes vs Self-managed Nodes.
| Criteria | Managed Nodes (Managed Node Group) | Self-managed Nodes |
|---|---|---|
| Node creation | EKS Console or CLI command creates a Node Group | EC2 Launch Template + ASG + User Data (bootstrap) |
| ASG management | AWS manages the Auto Scaling Group lifecycle | You manage the ASG entirely |
| Cluster Join | AWS automatically handles the wiring and credentials for join | You must provide user data calling /etc/eks/bootstrap.sh |
| Node auth mapping (IAM → RBAC) | AWS typically maps the Node Role automatically | You must manually update aws-auth to map the Node Role |
| Upgrades/Updates | Built-in, managed rolling update workflows | You design and manage drain / replace strategies |
| Debug level | Fewer common traps, higher abstraction | More control, but more responsibility for low-level configuration |
What AWS does for you in a Managed Node Group
When you click Create Node Group, AWS typically handles a long checklist that Self-managed nodes would require you to build manually:
- Creates an Auto Scaling Group
- Picks an EKS-Optimized AMI compatible with your cluster version
- Creates/manages a Launch Template (or uses one you provide)
- Attaches an IAM Instance Profile using your Node IAM Role
- Injects the bootstrap configuration so nodes can join the cluster
- Automates the node registration path
- Provides a rolling update workflow for node group upgrades
- Typically handles node role mapping into the cluster auth mechanism
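You can see most of this checklist reflected in the node group description once it exists. A minimal sketch, using the cluster and node group names from the hands-on section later in this post (demo-eks-cluster, eks-mng-general):

```bash
# Shows the node role, AMI type, launch template, and the ASG that AWS created for you
aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.{role:nodeRole,ami:amiType,launchTemplate:launchTemplate,resources:resources}"
```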
Recommendation: Use Managed Nodes for most production setups to reduce operational overhead. Choose Self-managed only when you have very specific requirements (custom OS hardening, special bootstrap, deep control of the node lifecycle).
Cluster IAM Role vs Node IAM Role
This is one of the most common points of confusion, so let’s make it crystal clear.
Cluster IAM Role
- Used by the EKS Control Plane
- Allows the Control Plane to manage ENIs and interact with your VPC resources
This role is not meant for workloads.
Node IAM Role
Worker nodes need a Node IAM Role to:
- Join the cluster
- Allow the VPC CNI to attach ENIs and allocate Pod IPs
- Pull images from ECR
- Access other required AWS APIs (later: secrets, parameters, logs, etc.)
Your worker nodes won’t become Ready without (at least) these managed policies:
| Policy | Purpose |
|---|---|
| AmazonEKSWorkerNodePolicy | Join the cluster, talk to the API Server |
| AmazonEKS_CNI_Policy | Attach ENIs, allocate Pod IPs |
| AmazonEC2ContainerRegistryReadOnly | Pull images from ECR |
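If the node role already exists, one call confirms these three policies are attached (role name as used throughout this series):

```bash
# Should list AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, AmazonEC2ContainerRegistryReadOnly
aws iam list-attached-role-policies --role-name EKSNodeRole
```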
AWS IAM & Kubernetes Authentication
Access to an EKS cluster is a combination of two layers: AWS IAM & Kubernetes RBAC.
IAM = Authentication
IAM answers: “Who are you?”
When a principal (an EC2 node or a human user) calls the Kubernetes API Server, EKS uses IAM authentication to verify:
- Which IAM principal (Role/User) the request comes from
- Whether the request has a valid SigV4 signature
✅ This is why worker nodes must have a proper Node IAM Role attached.
Kubernetes RBAC = Authorization
RBAC answers: “What are you allowed to do?”
Even if IAM authentication succeeds, the API call can still fail with Forbidden if RBAC doesn’t grant the required permissions.
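You can probe this authorization layer directly with kubectl's built-in check; the namespace and verb here are only illustrative:

```bash
# Returns "yes" or "no" depending on what RBAC grants your current identity
kubectl auth can-i list pods --namespace kube-system

# List everything your identity is allowed to do in a namespace
kubectl auth can-i --list --namespace default
```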
Bridging the two worlds: mapping IAM → Kubernetes identity
After IAM authentication, EKS maps IAM identity into Kubernetes users/groups so RBAC can evaluate permissions. Two common mechanisms exist:
- aws-auth ConfigMap (classic, still widely used), example:

```yaml
mapRoles: |
  - rolearn: arn:aws:iam::<account-id>:role/EKSNodeRole
    username: system:node:{{EC2PrivateDNSName}}
    groups:
      - system:bootstrappers
      - system:nodes
```

- EKS Access Entries / Access Policies (newer Console-based approach)

For this article:

- Nodes must be mapped into groups like system:bootstrappers and system:nodes
- Humans/admins are commonly mapped to system:masters or granted an equivalent Access Policy
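Both mechanisms are easy to inspect from the CLI. A quick look, assuming the cluster name from this series:

```bash
# Classic mechanism: the aws-auth ConfigMap in kube-system
kubectl get configmap aws-auth -n kube-system -o yaml

# Newer mechanism: EKS Access Entries attached to the cluster
aws eks list-access-entries --cluster-name demo-eks-cluster
```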
Hands-on Time
The AWS architecture after completing the steps below:
Step 0 — Preparation
Make sure all resources from Part 1 are created correctly and your cluster is ACTIVE.
Step 1 — Create the IAM Role for Worker Nodes
Open the AWS Console:
→ Go to IAM
→ Choose Roles → Create role
→ Configure Trusted entity: select AWS service and choose EC2, so EC2 instances can assume this role
→ Click Next
→ Attach policies:
- AmazonEKSWorkerNodePolicy
- AmazonEKS_CNI_Policy
- AmazonEC2ContainerRegistryReadOnly
→ Role name: EKSNodeRole
→ Click Create role
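If you prefer the CLI over the Console, the same role can be created with a handful of commands. A sketch using the same role name (the trust policy file is something you write locally):

```bash
# Trust policy: let EC2 instances assume this role
cat > node-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

aws iam create-role \
  --role-name EKSNodeRole \
  --assume-role-policy-document file://node-trust-policy.json

# Attach the three managed policies worker nodes need
for policy in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy AmazonEC2ContainerRegistryReadOnly; do
  aws iam attach-role-policy \
    --role-name EKSNodeRole \
    --policy-arn "arn:aws:iam::aws:policy/${policy}"
done
```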
Step 2 — Create a Managed Node Group
→ Open the EKS service
→ Select cluster demo-eks-cluster
→ Go to Compute → Add node group
Configure node group
- Node group name: eks-mng-general
- Node IAM role: EKSNodeRole (created in Step 1)
→ Click Next.
Configure compute & scaling
- AMI type: EKS optimized (Amazon Linux / Bottlerocket)
- Capacity type: On-Demand
- Instance type: t3.medium
- Disk size: 20 GiB
- Scaling:
- Desired: 2
- Min: 1
- Max: 3
→ Keep other settings as default.
→ Click Next.
Configure networking
Select only the two private subnets.
This is critical: subnet selection here determines where your worker nodes live. Private subnets are ideal for production worker nodes because they don’t expose instances to the Internet.
→ Click Next → Create.
Wait until the node group status becomes Active.
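The Console steps above map to a single API call. A CLI sketch with the same settings, where the subnet IDs and the account ID in the role ARN are placeholders you would fill in yourself:

```bash
aws eks create-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --node-role arn:aws:iam::<account-id>:role/EKSNodeRole \
  --subnets subnet-aaaa1111 subnet-bbbb2222 \
  --instance-types t3.medium \
  --disk-size 20 \
  --capacity-type ON_DEMAND \
  --scaling-config minSize=1,maxSize=3,desiredSize=2

# Block until the node group reaches ACTIVE
aws eks wait nodegroup-active \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general
```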
Step 3 – Join the Cluster – AWS handles it
You don’t need to manually configure bootstrap steps for a Managed Node Group – but understanding the join flow is what makes you effective at troubleshooting.
3.1 Node join flow
When you create a Managed Node Group, AWS launches EC2 worker nodes. Each instance gets an Instance Profile containing your Node IAM Role (EKSNodeRole). The join flow looks like this:
- The EC2 instance boots using an EKS-Optimized AMI
- Bootstrap config provides kubelet with:
  - cluster name
  - API endpoint
  - cluster CA certificate
- kubelet calls the Kubernetes API Server to register the node
- EKS performs IAM authentication and identifies the IAM Role from the instance profile
- EKS maps IAM identity → Kubernetes identity via aws-auth / Access Entries
- If mapping is valid (node is in system:nodes), the node becomes Ready
In short: nodes don’t “join by Kubernetes magic.” They join because:
IAM authentication proves identity + Kubernetes group mapping allows the node role to function.
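You can poke at the IAM half of this handshake from your own workstation: your kubectl access uses a SigV4-signed token, and the node's kubelet presents an IAM-backed token in broadly the same way. A quick look, using the cluster name from this series:

```bash
# Generate the short-lived, SigV4-signed token that EKS verifies on the API Server side
aws eks get-token --cluster-name demo-eks-cluster

# Confirm which IAM principal that token represents
aws sts get-caller-identity
```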
3.2 What Managed Node Group eliminates compared to Self-managed
With Self-managed nodes, you must build the join path yourself:
- Create Launch Template (AMI, instance profile, user data)
- Ensure bootstrap via /etc/eks/bootstrap.sh <cluster-name>
- Create the ASG and subnet placement
- Manually update aws-auth to map the node role into:
  - system:bootstrappers
  - system:nodes
Managed Node Groups remove most of this plumbing.
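For contrast, this is roughly what self-managed user data has to do on an Amazon Linux 2 EKS-optimized AMI (newer AMI families use a different bootstrap mechanism). A minimal sketch, with the endpoint and CA values as placeholders you would pull from `aws eks describe-cluster`:

```bash
#!/bin/bash
# Self-managed nodes only: Managed Node Groups inject equivalent configuration for you.
/etc/eks/bootstrap.sh demo-eks-cluster \
  --apiserver-endpoint "https://<cluster-endpoint>" \
  --b64-cluster-ca "<base64-encoded-cluster-ca>"
```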
Step 4 — Verify
This is the moment your EKS cluster finally starts to feel alive.
4.1 Verify with kubectl
→ Open a terminal and run:
kubectl get nodes -o wide
→ Then check system pods:
kubectl get pods -n kube-system
Expected results:
- You should see two nodes (based on the desired size) in Ready state
- coredns, metrics-server, and kube-proxy should transition to Running
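If you'd rather not refresh manually, kubectl can block until the nodes report Ready:

```bash
# Waits up to 5 minutes for every registered node to reach the Ready condition
kubectl wait --for=condition=Ready nodes --all --timeout=300s
```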
4.2 What AWS resources were created behind the scenes?
Now let’s satisfy curiosity and confirm what AWS created inside your account.
(1) Auto Scaling Group
→ Open EKS Service
→ Select EKS cluster demo-eks-cluster
→ Click Compute tab
→ Select node group eks-mng-general
→ In Details, click the Auto Scaling Group
Inside the ASG page, you’ll find:
- Desired / Min / Max configuration
- EC2 instances in InService state
- Launch Template reference
- Security Groups
(2) Launch Template
From the ASG page:
→ Click the Launch template link.
You’ll see:
- AMI ID
- Instance type
- Security groups attached
- User data/bootstrap wiring (partly hidden, but it’s there)
(3) Security Group for worker nodes
From the ASG details page
→ Click the Security group IDs.
Review inbound/outbound rules applied to worker nodes.
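The same tour works from the CLI. A sketch that chains the node group description into the ASG lookup, using the names from this post:

```bash
# The ASG name that EKS created for the node group
ASG_NAME=$(aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.resources.autoScalingGroups[0].name" \
  --output text)

# Desired/Min/Max, instances, launch template reference, and security groups
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names "$ASG_NAME"
```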
Common Pitfalls
1) Node group is Active, but kubectl get nodes shows nothing
Likely causes:
- The node group is using the wrong IAM role (not EKSNodeRole)
- The Node IAM role is missing required policies
- Wrong subnet selection or private subnet route tables are incorrect
2) Instances keep launching and terminating in the ASG
Likely causes:
- Instance type capacity shortage → try a more common type (t3.large, m5.large, etc.)
- Subnet/AZ constraints → expand to more AZs/subnets
- EC2 quota limits → request quota increase
3) Pods stuck in Pending
Likely causes:
- Insufficient node resources (CPU/memory) → choose a larger instance type
- Taints/labels preventing scheduling → remove taints or adjust selectors
4) ImagePullBackOff / ErrImagePull
Likely causes:
- Private subnets have no NAT gateway, or routes are wrong
- DNS resolution is broken → check VPC settings (DNS resolution and DNS hostnames)
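A few commands cover most of these diagnoses; health issues reported by EKS itself are often the fastest signal:

```bash
# EKS-reported health issues for the node group (failed launches, IAM problems, etc.)
aws eks describe-nodegroup \
  --cluster-name demo-eks-cluster \
  --nodegroup-name eks-mng-general \
  --query "nodegroup.health.issues"

# Recent cluster events, newest last; scheduling and image-pull errors show up here
kubectl get events -A --sort-by=.lastTimestamp

# Why a specific Pod is Pending or failing to pull its image
kubectl describe pod <pod-name> -n <namespace>
```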
Summary
In Part 2, we:
- Added production-style worker nodes (private subnets) so workloads finally have somewhere to run
- Clearly separated Cluster Role vs Node Role
- Covered the IAM → Kubernetes authentication story
- Explored what a Managed Node Group creates behind the scenes
Next, in Part 3, we’ll go deeper into EKS networking: VPC CNI, ENIs, Pod IP allocation, and traffic flow debugging.