Sunday, November 16, 2025

AWS Transit Gateway Throughput With ECMP | Overview.

An overview of AWS Transit Gateway (TGW) throughput with Equal-Cost Multi-Path (ECMP) routing.

Scope:

  •        Architecture,
  •        Per-attachment limits,
  •        Scaling models,
  •        VPN/Direct Connect behavior,
  •        Flow hashing,
  •        Bottlenecks,
  •        Best-practice designs for high-throughput architectures.

Intro:

  •        AWS Transit Gateway (TGW) is a distributed, horizontally-scaled router that spans an AWS Region.
  •        TGW throughput characteristics vary by attachment type (VPC, VPN, Direct Connect, Peering).
  •        Equal-Cost Multi-Path (ECMP) radically affects how twtech designs for aggregate bandwidth.

Breakdown:

  •        TGW Architecture = Distributed Router,
  •        Baseline TGW Throughput Limits (Per Attachment),
  •        ECMP (Equal-Cost Multipath) on AWS TGW,
  •        How ECMP Traffic Distribution Works in TGW,
  •        Throughput Scaling With ECMP,
  •        TGW Is Flow-Limited: Understanding Per-Flow Limits,
  •        Common Bottlenecks in TGW ECMP Designs,
  •        Best-Practice Designs for High-Throughput TGW ECMP.

1. TGW Architecture = Distributed Router

TGW is not a single device. Internally:

  •         TGW uses a distributed data plane of “route processing units.”
  •         Each attachment (VPC, VPN, DX, Peering) terminates on a set of distributed nodes.
  •         TGW can scale traffic horizontally as long as flows are distributed across nodes.

This matters because TGW throughput is attachment-level, not gateway-level.

2. Baseline TGW Throughput Limits (Per Attachment)

VPC Attachment

  •         Up to 50 Gbps burst per VPC attachment
  •         Distributed across multiple ENIs
  •         Single TCP flow limited by ENI processing path (~5 Gbps typical)

Transit Gateway Peering Attachment

  •         Up to 50 Gbps per peering attachment
  •         No ECMP on TGW peering.

Direct Connect Transit Virtual Interface (Transit VIF)

  •         DXGW → TGW traffic:
    •    Single Transit VIF: up to 5–50 Gbps depending on physical DX
    •    TGW supports ECMP with DX (multiple Transit VIFs)

Site-to-Site VPN

  •         Each S2S VPN tunnel:
    •    1.25 Gbps encrypted throughput ceiling
    •    In practice ~1 Gbps max due to IPsec overhead
  •         TGW supports ECMP across multiple VPN tunnels.

3. ECMP (Equal-Cost Multipath) on AWS TGW

TGW supports ECMP for:

| Attachment Type | ECMP Supported? | Notes |
|---|---|---|
| VPN (IPSec) | ✅ Yes | Up to 8 tunnels active/ECMP |
| Direct Connect (via DXGW) | ✅ Yes | Multiple Transit VIFs |
| VPC Attachments | ❌ No | TGW load-balances flows internally, not ECMP paths |
| TGW Peering | ❌ No | Single flow per attachment |

For ECMP to be enabled:

  •         There must be multiple equal-cost routes in the TGW route table.
  •         Typically built using multiple Customer Gateway (CGW) IPs, multiple tunnels, or multiple Transit VIFs.
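
The "multiple equal-cost routes" condition can be illustrated with a small sketch. This is a simplified model, not TGW's actual route selection: it assumes cost is just the pair (AS-path length, MED), and the `Route` class, the `ecmp_groups` function, and the attachment labels are all hypothetical names made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str       # destination CIDR, e.g. "10.0.0.0/16"
    attachment: str   # hypothetical label for a VPN tunnel or Transit VIF
    as_path_len: int  # BGP AS-path length (assumed cost component)
    med: int          # BGP MED (assumed cost component)


def ecmp_groups(routes):
    """Return {prefix: [attachments]} holding only the lowest-cost routes.

    A prefix is an ECMP candidate when its list has more than one entry.
    """
    by_prefix = defaultdict(list)
    for route in routes:
        by_prefix[route.prefix].append(route)

    groups = {}
    for prefix, candidates in by_prefix.items():
        best = min((r.as_path_len, r.med) for r in candidates)
        groups[prefix] = sorted(
            r.attachment
            for r in candidates
            if (r.as_path_len, r.med) == best
        )
    return groups
```

In this model, two tunnels advertising the same prefix with identical attributes end up in the same group, while a tunnel with a longer AS path is left out and carries no traffic for that prefix.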

4. How ECMP Traffic Distribution Works in TGW

TGW uses 5-tuple flow hashing:

src-ip, dst-ip, src-port, dst-port, protocol

Implications:

  •         Large single flows (e.g., single TCP stream) do not get split across paths → stick to a single VPN tunnel / VIF.
  •         Multiple smaller flows scale horizontally across multiple ECMP paths.
  •         UDP traffic spreads more naturally (due to more randomized ports).
  •         Multipath throughput is aggregate, not per-flow.
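
A minimal sketch of why a single flow sticks to one path while many flows spread out. The hash function here (SHA-256 modulo the path count) is an assumption chosen for illustration; AWS does not publish the hash TGW uses, and `pick_path` and `spread` are hypothetical helper names.

```python
import hashlib


def pick_path(src_ip, dst_ip, src_port, dst_port, protocol, num_paths):
    """Map a flow's 5-tuple to one of num_paths ECMP paths (illustrative hash)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_paths


def spread(flows, num_paths):
    """Count how many flows land on each path."""
    counts = [0] * num_paths
    for flow in flows:
        counts[pick_path(*flow, num_paths)] += 1
    return counts


# One TCP stream: every packet carries the same 5-tuple, so it always
# hashes to the same tunnel / VIF.
single = ("10.0.1.10", "192.168.1.20", 40000, 443, "tcp")

# Many streams that differ only in source port spread across the paths.
many = [("10.0.1.10", "192.168.1.20", 40000 + i, 443, "tcp") for i in range(1000)]
print(spread(many, 4))
```

The per-path counts come out roughly even but not identical, which is why aggregate ECMP throughput usually lands a little below the sum of the path limits.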

5. Throughput Scaling With ECMP

Below are practical scaling behaviors:

5.1 ECMP With VPN Tunnels

Example: aggregate throughput by number of ECMP tunnels

| Tunnels | Total Max Aggregate Throughput |
|---|---|
| 2 tunnels | ~2 Gbps |
| 4 tunnels | ~4–5 Gbps |
| 8 tunnels | ~7–8 Gbps |

NB:

  •         Max IPSec per tunnel: ~1 Gbps
  •         Actual throughput depends heavily on CGW device limits (physical or virtual).
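
The arithmetic behind these figures can be sketched as an upper-bound estimate. It assumes flows hash evenly across tunnels, and the function name, its parameters, and the ~1 Gbps default are illustrative choices rather than AWS-defined values.

```python
def vpn_aggregate_gbps(tunnels, per_tunnel_gbps=1.0, cgw_limit_gbps=None):
    """Upper-bound aggregate VPN throughput across ECMP tunnels.

    per_tunnel_gbps: assumed practical per-tunnel IPSec ceiling (~1 Gbps).
    cgw_limit_gbps:  optional cap imposed by the customer gateway device.
    """
    if tunnels < 1:
        raise ValueError("tunnels must be at least 1")
    total = tunnels * per_tunnel_gbps
    if cgw_limit_gbps is not None:
        total = min(total, cgw_limit_gbps)
    return total


print(vpn_aggregate_gbps(4))                      # 4 tunnels, no CGW cap
print(vpn_aggregate_gbps(8, cgw_limit_gbps=5.0))  # CGW is the bottleneck
```

Real results sit at or below this bound, because flow hashing is never perfectly even and the CGW often saturates first.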

5.2 ECMP With Direct Connect Transit VIFs

If twtech creates:

  •         2 × 10-Gbps Transit VIFs
  •         Same BGP metrics (AS-PATH / MED)
  •         Same prefix advertisements

TGW will do ECMP across both VIFs, for ~20 Gbps aggregate.

Aside from traffic that stays on the AWS backbone (VPC-to-VPC or inter-Region), this is the highest-throughput TGW design.

6. TGW Is Flow-Limited: Understanding Per-Flow Limits

The throughput twtech gets depends on the number of flows and the per-flow bandwidth capability.

Per-flow throughput depends on:

  •         VPC ENI path (5 Gbps typical per-flow)
  •         VPN tunnel encryption limits (1 Gbps)
  •         DX NIC speed
  •         Remote CGW or on-prem firewall performance

If twtech pushes a single TCP flow, it rarely exceeds:

  •         VPN: ~1 Gbps
  •         DX: 5–10 Gbps depending on NIC offload
  •         VPC-to-VPC: 5 Gbps typical single-flow

To fully utilize TGW, twtech needs parallel flows.
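
As a rough sizing aid, the relationship between per-flow limits and aggregate capacity can be sketched as below. Both helpers are hypothetical and assume every flow runs at its per-flow ceiling and that flows are distributed evenly, which real traffic rarely achieves.

```python
import math


def flows_needed(target_gbps, per_flow_gbps):
    """Minimum number of parallel flows to reach target_gbps."""
    return math.ceil(target_gbps / per_flow_gbps)


def achievable_gbps(num_flows, per_flow_gbps, aggregate_cap_gbps):
    """Throughput is capped by both the flow count and the aggregate limit."""
    return min(num_flows * per_flow_gbps, aggregate_cap_gbps)


# Filling a 50 Gbps VPC attachment at ~5 Gbps per flow:
print(flows_needed(50, 5))
# A single flow against the same attachment:
print(achievable_gbps(1, 5, 50))
```

Treat the flow count as a lower bound: with uneven hashing, more flows than this are typically needed to approach the aggregate limit.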

7. Common Bottlenecks in TGW ECMP Designs

| Bottleneck | Symptoms |
|---|---|
| On-prem firewall maxes out | VPN tunnels flapping, CPU 90% |
| Using only 1 VPN tunnel | 1 Gbps ceiling |
| Only 1 Transit VIF | 5–10 Gbps ceiling |
| Application uses single TCP flow | twtech never reaches ECMP potential |
| TGW route table not configured for ECMP | Only 1 path used |
| CGW devices not supporting multiple BGP sessions | No ECMP |

8. Best-Practice Designs for High-Throughput TGW ECMP

A. High-throughput VPN → TGW (5–8 Gbps)

Use:

  •         4–8 IPSec tunnels
  •         Each tunnel on a unique CGW IP
  •         Ensure:
    •    Dead-peer detection (DPD) and fast rekeying enabled
    •    Equal BGP metrics
    •    ECMP enabled on twtech's on-prem router

B. High-throughput Direct Connect → TGW (20–100 Gbps)

Use multiple Transit VIFs:

Example high-bandwidth design:

  •         4 × 10 Gbps DX connections
  •         4 Transit VIFs
  •         ECMP enabled
  •         Aggregate = 40 Gbps+

Customers with 100-Gbps DX fiber can reach 100 Gbps+ using multiple 10/100G connections.

C. VPC to on-prem Throughput-Maximizing Pattern

Even though TGW VPC attachments don’t use ECMP, twtech can still scale by:

  •         Multiple AZs (each adds bandwidth)
  •         ENA Express (v1 & v2)
  •         Horizontal scaling (multiple app instances)
  •         Multi-flow parallelism at the application layer
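
One way to get multi-flow parallelism at the application layer is to split a large transfer into byte ranges and send each range over its own TCP connection, so each range has a distinct 5-tuple. The sketch below only plans the ranges; `split_ranges` is a hypothetical helper, and the transfer mechanism itself (for example, ranged HTTP requests) is left to the application.

```python
def split_ranges(total_bytes, num_flows):
    """Split [0, total_bytes) into num_flows contiguous (start, end) ranges.

    Ranges are half-open and differ in size by at most one byte.
    """
    if num_flows < 1:
        raise ValueError("num_flows must be at least 1")
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")

    base, extra = divmod(total_bytes, num_flows)
    ranges = []
    start = 0
    for i in range(num_flows):
        size = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, start + size))
        start += size
    return ranges


# Plan a 10 GiB transfer across 8 parallel connections.
for start, end in split_ranges(10 * 1024**3, 8):
    print(start, end)
```

Each range can then be handed to a separate worker or connection, which lets the transfer exceed the single-flow ceiling described earlier.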

9. Putting It All Together – Realistic Throughput Expectations

VPN

  •         Typical: 3–6 Gbps with 4–6 tunnels
  •         Max theoretical: 8–10 Gbps with 8 tunnels

Direct Connect

  •         Typical ECMP bundle: 20–40 Gbps
  •         Max theoretical: 100+ Gbps (multiple 100G links)

VPC Attachments

  •         50 Gbps burst per VPC attachment
  •         But single-flow still ~5 Gbps
