An Overview of VMware Cloud on AWS (VMC on AWS).
Scope:
- Architecture,
- Networking,
- Storage,
- Operations,
- Lifecycle,
- Integrations,
- Best-practice considerations.
Breakdown:
- Underlying Architecture,
- SDDC Architecture Overview,
- Networking,
- Storage,
- Operations,
- Consumption Model,
- Use Cases,
- Best Practices & Design Considerations,
- Architecture Diagram (Text Version)
Intro:
- VMware Cloud on AWS (VMC) is a jointly engineered service by VMware and AWS that runs vSphere, vSAN, NSX, and HCX natively on bare-metal EC2 hosts inside AWS data centers.
- VMware Cloud on AWS (VMC) provides a fully managed Software-Defined Data Center (SDDC) with tight integration into native AWS services.
1. Underlying Architecture
1.1 Bare Metal Hosts
VMC uses dedicated, single-tenant bare-metal EC2 i3en and i4i instances, each including:
- Dual Intel CPUs (e.g., i3en.metal: 96 vCPUs)
- Hundreds of GiB RAM (e.g., 768 GiB)
- NVMe SSD directly attached (for vSAN)
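- ENA network connectivity (25 Gbps on the older i3.metal; up to 100 Gbps on i3en.metal and 75 Gbps on i4i.metal per the AWS instance specifications)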
Each host runs:
- vSphere ESXi
- vSAN for distributed storage
- NSX-T for networking & security
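The host figures above translate directly into cluster-level capacity. The following is a minimal sizing sketch, assuming identical hosts and one host's worth of capacity held back for failover. The `HostSpec` type, the `cluster_capacity` helper, and the one-spare-host default are illustrative assumptions, not part of any VMware or AWS tooling; only the i3en.metal numbers come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Per-host resources (hypothetical helper type for this sketch)."""
    name: str
    vcpus: int    # logical processors exposed by the instance
    ram_gib: int


# Figures quoted in the list above for i3en.metal.
I3EN_METAL = HostSpec(name="i3en.metal", vcpus=96, ram_gib=768)


def cluster_capacity(spec: HostSpec, hosts: int, spare_hosts: int = 1) -> dict:
    """Return raw capacity and capacity with `spare_hosts` held in reserve.

    The spare-host reservation is an assumption made for illustration;
    real admission-control settings may differ.
    """
    if hosts < 1 or not 0 <= spare_hosts < hosts:
        raise ValueError("need at least one host and fewer spares than hosts")
    usable = hosts - spare_hosts
    return {
        "raw_vcpus": spec.vcpus * hosts,
        "raw_ram_gib": spec.ram_gib * hosts,
        "usable_vcpus": spec.vcpus * usable,
        "usable_ram_gib": spec.ram_gib * usable,
    }


if __name__ == "__main__":
    print(cluster_capacity(I3EN_METAL, hosts=4))
```

These are raw hardware numbers; the management VMs and hypervisor overhead consume part of them, so treat the result as an upper bound.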
2. SDDC Architecture Overview
2.1 Management Domain
Contains:
- vCenter Server
- NSX-T Manager / Controllers
- HCX Manager (optional)
- ESXi management components
This domain is fully managed by VMware; you cannot SSH into these management VMs.
2.2 Compute Domain (Workload Clusters)
Everything twtech deploys (VMs, clusters, etc.) lives here.
Capabilities:
- Scale clusters up to 16 hosts (standard)
- Aggregate multiple clusters per SDDC
- Use vSAN datastore across the cluster
VMware handles:
- ESXi patching/updates
- vCenter upgrades
- Hardware lifecycle
twtech manages:
- VM workloads
- vSphere configurations (resource pools, tags, DRS rules)
- NSX-T networking for workload segments
3. Networking
3.1 Connectivity Options
- AWS Direct Connect → High throughput, low latency
- IPsec VPN → Quick setup, moderate latency
- Transit Gateway (TGW) → Multi-VPC spoke architecture
- SDDC-to-SDDC NSX-T VPN
3.2 NSX-T Components
VMC uses NSX-T for:
- Logical routing
- Security groups / firewall policies
- Distributed firewall (microsegmentation)
- NAT & edge services
Router Tiering:
- Tier-0 Gateway → North–south traffic (AWS, on-prem, Internet)
- Tier-1 Gateway → Workload segments
3.3 vMotion to/from On-Prem
With HCX, you get:
- Bulk migration
- vMotion without downtime
- Replication-assisted vMotion
- L2 stretch across clouds
HCX tolerates WAN latency between sites, but vMotion works best when round-trip time (RTT) stays below 150 ms.
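The latency guidance above can be turned into a simple pre-migration check. This is a rough sketch, assuming RTT samples have already been collected by some other means (for example, ping output). The function name, the "worst sample" rule, and the fallback to bulk migration are hypothetical choices for illustration, not HCX behavior.

```python
VMOTION_MAX_RTT_MS = 150.0  # threshold quoted in the text above


def suggest_migration_type(rtt_samples_ms: list[float]) -> str:
    """Suggest 'vmotion' when every RTT sample is under the threshold,
    otherwise fall back to 'bulk'.

    Using the worst sample is a deliberately conservative assumption.
    """
    if not rtt_samples_ms:
        raise ValueError("at least one RTT sample is required")
    worst = max(rtt_samples_ms)
    return "vmotion" if worst < VMOTION_MAX_RTT_MS else "bulk"


if __name__ == "__main__":
    print(suggest_migration_type([22.0, 35.5, 41.2]))
    print(suggest_migration_type([120.0, 180.0]))
```

A real assessment would also look at bandwidth and packet loss; latency alone is only one input.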
4. Storage
4.1 vSAN Storage
Each bare-metal host contributes:
- NVMe SSD cache tier
- NVMe SSD capacity tier
Configured as:
- All-flash vSAN datastore
- RAID-1 / RAID-5 / RAID-6 policies
- Adaptive Resync & I/O balancing
Scaling:
- Add host → vSAN capacity automatically grows
- Remove host → storage is rebalanced
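To see how the RAID policies above affect usable space, here is a back-of-the-envelope estimate. It assumes the commonly cited rules of thumb for classic vSAN: RAID-1 with one failure tolerated stores two copies (2x), RAID-5 costs about 1.33x, and RAID-6 about 1.5x, with minimum host counts of 3, 4, and 6. The raw capacity per host and the 25% slack reservation are hypothetical inputs; check current VMware sizing guidance before relying on any of these numbers.

```python
# policy name -> (capacity multiplier, minimum hosts); rule-of-thumb values
POLICIES = {
    "RAID-1 (FTT=1)": (2.0, 3),
    "RAID-5 (FTT=1)": (4 / 3, 4),
    "RAID-6 (FTT=2)": (1.5, 6),
}


def usable_tib(raw_tib_per_host: float, hosts: int, policy: str,
               slack_fraction: float = 0.25) -> float:
    """Estimate usable capacity after slack space and policy overhead.

    `slack_fraction` is an assumed reservation for rebuilds and resyncs.
    """
    overhead, min_hosts = POLICIES[policy]
    if hosts < min_hosts:
        raise ValueError(f"{policy} needs at least {min_hosts} hosts")
    if not 0 <= slack_fraction < 1:
        raise ValueError("slack_fraction must be in [0, 1)")
    raw = raw_tib_per_host * hosts
    return raw * (1 - slack_fraction) / overhead


if __name__ == "__main__":
    # 40 TiB raw per host is a made-up figure for illustration.
    for name in ("RAID-1 (FTT=1)", "RAID-5 (FTT=1)"):
        print(name, round(usable_tib(40.0, hosts=4, policy=name), 1), "TiB")
```

The comparison makes the trade-off visible: erasure coding (RAID-5/6) yields more usable space than mirroring, but it needs more hosts in the cluster.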
4.2 Supplemental Storage Options
1. Amazon FSx for NetApp ONTAP
- NFS datastore for VMC
- High capacity, lower cost
- Snapshots, cloning, DR
2. Amazon EBS (limited use cases)
- Used behind the scenes in some features, not for VM datastores.
5. Operations
5.1 Lifecycle Management (LCM)
VMware performs:
- ESXi patching
- vCenter upgrades
- NSX upgrades
- Firmware, BIOS, hardware lifecycle
The Service Level Agreement (SLA) typically covers:
- 99.9% uptime for the SDDC
Upgrades are rolling, so workloads are not expected to incur downtime while VMware applies them.
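It helps to convert an uptime percentage into a concrete downtime budget. The short calculation below does that for the 99.9% figure above, assuming a 30-day month; the function name is illustrative. For 99.9% this works out to roughly 43 minutes per month.

```python
def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Minutes of downtime permitted in `days` days at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - availability_pct) / 100.0


if __name__ == "__main__":
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes per 30-day month")
```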
5.2 Security
- NSX-T distributed firewall
- Identity-FW (AD/ID integration)
- Microsegmentation
- End-to-end encryption
- AWS-managed security layers (VPC isolation, IAM integration)
5.3 Monitoring & Troubleshooting
Integrations:
- vRealize Operations (vROps)
- CloudWatch logs for SDDC events
- VMC Console (web UI)
- VRLI Cloud (Log Insight Cloud)
6. Consumption Model
6.1 Host Types
Common host SKUs:
- i4i.metal → Latest-gen; best performance/price
- i3en.metal → High storage density
6.2 Pricing
- On-Demand hourly
- 1-year or 3-year Reserved
- Flexible subscription
- Elastic DRS auto-scaling
Savings:
- 30–60% with reserved hosts
- EDRS → Auto-scale with load spikes
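The savings range above can be checked against a workload's expected runtime. The sketch below compares on-demand and reserved cost for a year and computes the break-even utilisation, assuming the reserved discount is expressed as a fraction of the on-demand price and that a reserved host is paid for all year. The hourly rate in the example is a made-up placeholder, not a published price.

```python
HOURS_PER_YEAR = 8760


def on_demand_cost(hourly_rate: float, hosts: int, hours: float) -> float:
    """Cost of running `hosts` hosts on demand for `hours` hours each."""
    return hourly_rate * hosts * hours


def reserved_cost(hourly_rate: float, hosts: int, discount: float) -> float:
    """Full-year cost at a discounted rate (assumed paid whether used or not)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return hourly_rate * hosts * HOURS_PER_YEAR * (1 - discount)


def break_even_utilisation(discount: float) -> float:
    """Fraction of the year a host must run before reserving is cheaper."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


if __name__ == "__main__":
    rate = 10.0  # hypothetical on-demand price per host-hour
    for discount in (0.30, 0.60):
        print(discount, reserved_cost(rate, 3, discount),
              break_even_utilisation(discount))
```

Under these assumptions, a 30% discount only pays off for hosts that would otherwise run more than about 70% of the year, while a 60% discount breaks even at about 40%.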
7. Use Cases
7.1 Data Center Extension
- Lift-and-shift within days
- Retain tools/processes
- Connect to AWS-native services (RDS, S3, Lambda)
7.2 Disaster Recovery
- VMware Cloud DR (VCDR)
- Low RPO (minutes)
- Rapid scale-out during failover
7.3 Application Modernization
Through AWS integrations:
- S3 directly via ENI
- RDS for databases
- Lambda for operations
- API Gateway, DynamoDB, etc.
VMs continue running on vSphere but interface with AWS services.
8. Best Practices & Design Considerations
Networking
- Use the high-bandwidth ENI connection to the connected VPC for AWS service access
- Set up TGW for multi-VPC connectivity
- Plan CIDRs to avoid overlap
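CIDR planning is easy to validate before anything is deployed. The sketch below uses Python's standard `ipaddress` module to flag overlapping ranges in a proposed plan. The network names and address ranges are invented for illustration, and the check assumes all entries are IPv4.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(named_cidrs: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of network names whose CIDR ranges overlap."""
    nets = {name: ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical address plan; replace with real ranges.
    plan = {
        "sddc-management": "10.2.0.0/16",
        "on-prem-dc": "10.2.128.0/20",
        "connected-vpc": "172.31.0.0/16",
        "workload-segment": "192.168.10.0/24",
    }
    print(find_overlaps(plan))
```

Running a check like this against on-prem, SDDC management, workload, and VPC ranges together catches conflicts that are painful to fix after the SDDC is deployed.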
Storage
- Match vSAN storage policies (RAID-1 / RAID-5 / RAID-6) to each workload's availability and capacity needs
- Consider FSxN for capacity-heavy workloads
Governance
- Tag resources (SDDC, clusters)
- Use AWS IAM with least-privilege principles
- Control EDRS limits to avoid surprise scaling costs
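The effect of host-count limits on auto-scaling can be illustrated with a toy model. This is not the actual Elastic DRS algorithm; the thresholds, the one-host-per-step behavior, and all names below are assumptions chosen only to show how minimum and maximum limits cap scaling and therefore cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalePolicy:
    """Hypothetical scaling limits and thresholds for this sketch."""
    min_hosts: int
    max_hosts: int
    scale_out_pct: float = 80.0  # assumed utilisation that triggers scale-out
    scale_in_pct: float = 40.0   # assumed utilisation that triggers scale-in


def next_host_count(current: int, utilisation_pct: float,
                    policy: ScalePolicy) -> int:
    """Return the host count after one scaling decision, clamped to limits."""
    if utilisation_pct >= policy.scale_out_pct:
        target = current + 1
    elif utilisation_pct <= policy.scale_in_pct:
        target = current - 1
    else:
        target = current
    return max(policy.min_hosts, min(policy.max_hosts, target))


if __name__ == "__main__":
    policy = ScalePolicy(min_hosts=3, max_hosts=6)
    print(next_host_count(4, 90.0, policy))  # scales out by one host
    print(next_host_count(6, 95.0, policy))  # already at the maximum
```

The maximum is the cost control: however high the load goes, the cluster never grows past `max_hosts`, so the worst-case bill is known in advance.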
Security
- Leverage Distributed Firewall for app segmentation
- Use NSX Intelligence for traffic flows
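App segmentation with a distributed firewall comes down to an ordered rule list evaluated per flow. The sketch below is a toy model of first-match evaluation with a default action. It does not use the NSX API, and the tags, ports, and rule names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical firewall rule keyed on workload tags."""
    name: str
    src_tag: str          # "*" matches any source tag
    dst_tag: str          # "*" matches any destination tag
    port: Optional[int]   # None matches any port
    action: str           # "allow" or "drop"


def evaluate(rules: list[Rule], src_tag: str, dst_tag: str, port: int,
             default: str = "drop") -> str:
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if (rule.src_tag in ("*", src_tag)
                and rule.dst_tag in ("*", dst_tag)
                and rule.port in (None, port)):
            return rule.action
    return default


# Three-tier example: web may reach app, app may reach db, nothing else may.
RULES = [
    Rule("web-to-app", "web", "app", 8443, "allow"),
    Rule("app-to-db", "app", "db", 5432, "allow"),
    Rule("protect-db", "*", "db", None, "drop"),
]

if __name__ == "__main__":
    print(evaluate(RULES, "app", "db", 5432))
    print(evaluate(RULES, "web", "db", 5432))
```

Because evaluation stops at the first match, rule order matters: the specific allow for app-to-db must sit above the catch-all drop that protects the database tier.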