Intro:
A deep dive into NAT Gateway with High Availability in AWS.
Focus:
- Concepts,
- Architecture,
- Deployment strategies,
- Cost considerations,
- Pitfalls,
- High Availability (HA) best practices.
Breakdown:
- The concept: NAT Gateway,
- The Core Architecture,
- How NAT Gateway Works (Under the Hood),
- High Availability (HA) Design,
- Routing Configuration,
- Cost Considerations,
- Common Pitfalls,
- Alternatives for HA or Cost Efficiency,
- Operational Best Practices,
- Example Terraform Snippet (Multi-AZ NAT Gateway),
- Summary – HA NAT Gateway Checklist.
1. The concept: NAT Gateway
- A NAT (Network Address Translation) Gateway allows instances in a private subnet to initiate outbound connections to the internet (for updates, downloads, etc.) without exposing them to connections initiated from the internet.
In other words:
- “Private instances can go out, but nothing can come in.”
2. The Core Architecture
Consider a standard 3-tier VPC.
Key Components:
- Internet Gateway (IGW) – connects the VPC’s public subnets to the internet.
- NAT Gateway – managed NAT service, placed in a public subnet.
- Elastic IP (EIP) – static public IP assigned to the NAT Gateway.
- Private Subnet – no direct route to the IGW; its default route goes through the NAT Gateway instead.
3. How NAT Gateway Works (Under the Hood)
- A private EC2 instance initiates a connection to an internet address.
- The packet is routed (via the route table) to the NAT Gateway.
- The NAT Gateway replaces the source private IP with its own public Elastic IP.
- The packet goes out via the Internet Gateway.
- When the response returns, the NAT Gateway maps it back to the private IP and forwards it internally.
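The translation steps above can be sketched as a toy source-NAT table. This is only an illustration of the idea, not how AWS implements NAT Gateway internally; the class name NatGatewaySim, the port allocation scheme, and all IP addresses (documentation ranges) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


@dataclass
class NatGatewaySim:
    """Toy source-NAT table (hypothetical model, for illustration only)."""
    public_ip: str
    next_port: int = 1024
    # NAT public port -> (private endpoint, remote endpoint) of the open flow
    flows: Dict[int, Tuple[Endpoint, Endpoint]] = field(default_factory=dict)

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        """Rewrite the private source to the NAT's public IP and a fresh port."""
        port = self.next_port
        self.next_port += 1
        self.flows[port] = (src, dst)
        return (self.public_ip, port), dst

    def inbound(self, src: Endpoint, dst: Endpoint) -> Optional[Tuple[Endpoint, Endpoint]]:
        """Map a response back to the private endpoint, or drop it (None)."""
        flow = self.flows.get(dst[1])
        if dst[0] != self.public_ip or flow is None or flow[1] != src:
            return None  # no matching outbound flow: unsolicited traffic is dropped
        return src, flow[0]


nat = NatGatewaySim(public_ip="203.0.113.10")
# A private instance opens a connection to an internet address.
sent = nat.outbound(("10.0.1.25", 44321), ("198.51.100.7", 443))
# The response to that connection is mapped back to the private IP.
reply = nat.inbound(("198.51.100.7", 443), sent[0])
# Traffic that nobody asked for has no table entry and is dropped.
unsolicited = nat.inbound(("198.51.100.99", 443), ("203.0.113.10", 5555))
print(sent, reply, unsolicited)
```

The last call is the “nothing can come in” part: without an entry created by an outbound connection, there is nothing to map the packet to.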
4. High Availability (HA) Design
This is where it gets critical.
NAT Gateway Is AZ-Scoped
- Each NAT Gateway lives within a single Availability Zone (AZ).
- It is inherently redundant within its AZ, but not across AZs.
So if an AZ fails:
- The NAT Gateway in that AZ becomes unavailable.
- Private subnets routing to that NAT Gateway lose internet connectivity.
Best Practice: One NAT Gateway per AZ
To achieve high availability:
- Deploy 1 NAT Gateway per AZ.
- Route private subnets in each AZ to the local NAT Gateway in the same AZ.
Sample HA Design (Multi-AZ)
| AZ | Public Subnet | NAT Gateway | Private Subnet | Route Target |
|---|---|---|---|---|
| us-east-2a | Public-1a | NATGW-1a | Private-1a | NATGW-1a |
| us-east-2b | Public-1b | NATGW-1b | Private-1b | NATGW-1b |
| us-east-2c | Public-1c | NATGW-1c | Private-1c | NATGW-1c |
This ensures:
- No cross-AZ data path.
- Localized routing.
- Fault isolation (if one AZ or NAT fails, others still work).
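To make the fault-isolation point concrete, here is a small sketch that compares the per-AZ design from the sample table with a single shared NAT Gateway when one AZ fails. The function name and the dictionaries are hypothetical; only the NAT path is modelled, so instances that sit inside the failed AZ are of course down in both designs.

```python
from typing import Dict, Set


def subnets_with_egress(routes: Dict[str, str], nat_az: Dict[str, str], failed_az: str) -> Set[str]:
    """Return the private subnets whose NAT Gateway is outside the failed AZ.

    routes maps private subnet -> NAT Gateway it routes to.
    nat_az maps NAT Gateway -> the AZ it lives in.
    """
    return {subnet for subnet, nat in routes.items() if nat_az[nat] != failed_az}


# Names taken from the sample HA design table.
nat_az = {"NATGW-1a": "us-east-2a", "NATGW-1b": "us-east-2b", "NATGW-1c": "us-east-2c"}
per_az = {"Private-1a": "NATGW-1a", "Private-1b": "NATGW-1b", "Private-1c": "NATGW-1c"}
# Cost-saving variant: every private subnet shares the NAT Gateway in us-east-2a.
single_nat = {"Private-1a": "NATGW-1a", "Private-1b": "NATGW-1a", "Private-1c": "NATGW-1a"}

print(sorted(subnets_with_egress(per_az, nat_az, "us-east-2a")))
print(sorted(subnets_with_egress(single_nat, nat_az, "us-east-2a")))
```

With one NAT Gateway per AZ, losing us-east-2a only affects Private-1a. With the shared NAT Gateway, the same failure removes internet egress for every private subnet.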
5. Routing Configuration
- For each private subnet route table:
Destination: 0.0.0.0/0 → Target: nat-xxxxxxxx (NAT Gateway in same AZ)
- For each public subnet route table:
Destination: 0.0.0.0/0 → Target: igw-xxxxxxxx (Internet Gateway)
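A route table picks the most specific matching route, which is why the 0.0.0.0/0 entries above only catch traffic that is not destined for the VPC itself. The sketch below mimics that longest-prefix lookup. The VPC CIDR 10.0.0.0/16, the "local" entry, and the function name route_target are assumptions for illustration; the nat- and igw- IDs are the placeholders used above.

```python
import ipaddress
from typing import Dict


def route_target(route_table: Dict[str, str], destination_ip: str) -> str:
    """Return the target of the most specific route that matches destination_ip."""
    ip = ipaddress.ip_address(destination_ip)
    best = None  # (prefix length, target) of the best match so far
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if ip in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, target)
    if best is None:
        raise LookupError(f"no route to {destination_ip}")
    return best[1]


# Assumed VPC CIDR of 10.0.0.0/16 with the usual local route.
private_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-xxxxxxxx"}
public_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-xxxxxxxx"}

print(route_target(private_rt, "10.0.2.15"))      # traffic inside the VPC
print(route_target(private_rt, "93.184.216.34"))  # internet-bound from a private subnet
print(route_target(public_rt, "93.184.216.34"))   # internet-bound from a public subnet
```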
6. Cost Considerations
NAT Gateway pricing (as of 2025):
- Per hour: ~$0.045/hour (~$32.40/month per NAT Gateway, based on 720 hours)
- Per GB processed: ~$0.045/GB (varies by region)
NB:
- If twtech has 3 AZs → 3 NAT Gateways → ~$100/month + data charges.
Optimization Tip:
- If the twtech workload in one AZ is light, twtech can temporarily route the private subnets of several AZs through one NAT Gateway to reduce cost, but this sacrifices HA and adds cross-AZ data transfer charges.
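The arithmetic behind these figures is easy to reproduce. The sketch below uses the approximate rates quoted above and assumes a 720-hour (30-day) month; the function name and the 1,000 GB traffic volume are made up for the example, and real rates vary by region.

```python
HOURLY_RATE_USD = 0.045   # approximate hourly rate quoted above
PER_GB_RATE_USD = 0.045   # approximate data processing rate quoted above
HOURS_PER_MONTH = 720     # assumption: 30-day month


def monthly_nat_cost(nat_count: int, gb_processed: float = 0.0) -> float:
    """Estimate the monthly NAT Gateway bill in USD, rounded to cents."""
    fixed = nat_count * HOURLY_RATE_USD * HOURS_PER_MONTH
    data = gb_processed * PER_GB_RATE_USD
    return round(fixed + data, 2)


print(monthly_nat_cost(1))         # one NAT Gateway, no traffic
print(monthly_nat_cost(3))         # one NAT Gateway per AZ across 3 AZs
print(monthly_nat_cost(3, 1000))   # same, plus a hypothetical 1,000 GB processed
```

Under these assumptions, one NAT Gateway comes to about $32.40 per month and three to about $97.20, which is the “~$100/month” above before data charges.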
7. Common Pitfalls
| Pitfall | Description | Impact |
|---|---|---|
| Single NAT Gateway | Used for all AZs | AZ dependency; cross-AZ traffic; higher cost and latency |
| Wrong Route Target | Routing to the wrong NAT Gateway (another AZ’s, or one that was deleted) | Cross-AZ traffic and charges; blackholed route and broken connections |
| No EIP association | NAT Gateway without Elastic IP (private connectivity type) | No internet access |
| No IGW in VPC | Missing Internet Gateway | NAT Gateway can’t reach the internet |
| Ephemeral scaling | Not monitoring throughput | NAT Gateway can be a bottleneck in high traffic scenarios |
8. Alternatives for HA or Cost Efficiency
| Option | Description | Pros | Cons |
|---|---|---|---|
| NAT Instance | EC2 instance with iptables-based NAT | Customizable; can be cheaper | Manual scaling, patching, less HA |
| PrivateLink / VPC Endpoints | Direct, private access to AWS services | No NAT data-processing cost for that traffic | Covers only AWS and other endpoint-enabled services, not general internet access |
| Centralized Egress VPC (Shared Services) | Shared NAT Gateways via Transit Gateway | Centralized management | Cross-AZ data charges; complexity |
9. Operational Best Practices
✅ Deploy one NAT Gateway per AZ
✅ Use separate route tables for each AZ’s private subnets
✅ Monitor with CloudWatch (metrics: ActiveConnectionCount, BytesOutToDestination)
✅ Use VPC Flow Logs for troubleshooting
✅ Use VPC Reachability Analyzer for connectivity checks
✅ Consider combining NAT Gateway with VPC Endpoints to minimize egress costs
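For the CloudWatch item, NAT Gateway metrics are published in the AWS/NATGateway namespace with a NatGatewayId dimension. The sketch below only builds the request parameters for one metric query so that it runs without AWS access; the helper name nat_metric_request and its defaults are hypothetical, and the commented-out boto3 call at the end assumes boto3 is installed and credentials are configured.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def nat_metric_request(
    nat_gateway_id: str,
    metric_name: str,
    end: datetime,
    hours: int = 1,
    period_seconds: int = 300,
    statistic: str = "Sum",
) -> Dict[str, Any]:
    """Build keyword arguments for a CloudWatch get_metric_statistics query."""
    return {
        "Namespace": "AWS/NATGateway",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [statistic],
    }


now = datetime.now(timezone.utc)
# nat-xxxxxxxx is a placeholder ID, as in the routing section.
connections = nat_metric_request("nat-xxxxxxxx", "ActiveConnectionCount", now, statistic="Maximum")
bytes_out = nat_metric_request("nat-xxxxxxxx", "BytesOutToDestination", now)
print(connections["MetricName"], bytes_out["MetricName"])

# With boto3 and AWS credentials available (an assumption), the query would be:
#   import boto3
#   response = boto3.client("cloudwatch").get_metric_statistics(**connections)
```

Running one such query per NAT Gateway keeps the per-AZ view intact, so a problem in one AZ’s gateway is not averaged away by the others.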
10. Example Terraform Snippet (Multi-AZ NAT Gateway)
# hcl
# One NAT Gateway per AZ. var.azs must be a set(string) or a map keyed by AZ.
resource "aws_nat_gateway" "natgw" {
  for_each      = var.azs
  allocation_id = aws_eip.nat_eip[each.key].id
  subnet_id     = aws_subnet.public[each.key].id
}

# Default route of each AZ's private route table targets the NAT Gateway in the same AZ.
resource "aws_route" "private_default" {
  for_each               = var.azs
  route_table_id         = aws_route_table.private[each.key].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.natgw[each.key].id
}
11. Summary – HA NAT Gateway Checklist
| Item | Status | Description |
|---|---|---|
| One NAT per AZ | ✅ | Ensures redundancy |
| Local routing | ✅ | Reduces cross-AZ latency/cost |
| Monitoring setup | ✅ | CloudWatch, Flow Logs |
| Cost optimization | ✅ | VPC Endpoints where possible |
| Tested failover | ✅ | Verified per-AZ independence |
Project: Hands-On
How twtech uses the NAT Gateway to give instances in the private subnet of the VPC (twtechvpc) outbound internet access
Search for AWS service: NAT gateways
Step-1:
Create a NAT gateway:
Allocate an Elastic IP:
Create NAT gateway: twtechNATG
Step-2:
While the NATGW is being provisioned, select and edit the private route table of the VPC (twtechPrivateRT) to add a default route (0.0.0.0/0) that targets the NATGW.
From: local route only
To: local route plus the added 0.0.0.0/0 → NATGW route
Save changes in the private route table of the VPC (twtechvpc).
NB:
It takes some time for the NAT gateway to be fully provisioned: wait until its state is Available.
Step-3:
From the EC2 console, select the Bastion Host instance and connect to it. The Bastion Host is the jump point for reaching the instance in the private subnet and verifying that internet access is available.
Connect to instance: EC2 Instance Connect is used here. However, any configured terminal can be used to SSH into the Bastion Host.
Step-4:
From the Bastion Host, twtech can SSH onward to the instance in the private subnet using its private IPv4 address and the key pair:
sudo ssh ec2-user@10.x.xx.65 -i twtechkey.pem
Step-5:
Verify whether the EC2 instance in the private subnet now has internet access through the NAT Gateway:
ping think-with-tech.blogspot.com
Next:
Use the curl command:
curl https://think-with-tech.blogspot.com
curl think-with-tech.blogspot.com
curl example.com
curl google.com
twtech can now run other commands on the private subnet instance, such as:
sudo yum install net-tools -y
twtech can now use the installed net-tools to list all listening ports on the private subnet instance:
sudo netstat -utln
NB:
- Updating and patching of the instance in the private subnet is done without exposing it to inbound traffic from the public internet.
- Outbound internet access for the private instances goes through the NAT Gateway created (twtechNATG).
- For High Availability (HA), twtech can create additional NAT Gateways in the other Availability Zones of the VPC, then point each AZ's private route table at the NAT Gateway in its own AZ. A route table cannot target a NAT Gateway in another Region; disaster recovery (DR) in another Region needs its own VPC and NAT Gateways there.