A Deep Dive into AWS Storage.
Focus:
- Tailored for DevOps/DevSecOps/Cloud Engineering
- Hands-on concepts
- Real architectural patterns
- Tuning guidance
- Hybrid/HPC/multi-cloud considerations
Breakdown:
- Core AWS Storage Categories,
- Data Management & Performance Tuning,
- Workload-Specific Architecture Patterns,
- Security & Governance,
- Storage Architectures by Use Case,
- Disaster Recovery Strategy,
- Full AWS Storage Architecture Diagram.
1. Core AWS Storage Categories
AWS storage falls under six functional domains:
1.1 Object Storage
Amazon S3 (Simple Storage Service)
- Designed for massive scale, durability (11 9’s), multi-AZ availability.
- Not POSIX compliant.
- Best for: Data lakes, ML datasets, backups, logs, HPC staging, container artifacts, static hosting.
Key S3 Features for Engineers
- Storage Classes: Standard, Standard-IA, One Zone-IA, Intelligent-Tiering, Glacier Instant Retrieval / Flexible Retrieval / Deep Archive.
- Lifecycle Policies → automate tiering & expiration (see the sketch after this list).
- Versioning → object-level snapshot history.
- Object Lock (WORM) → security/compliance.
- S3 Access Points → multi-tenant access patterns.
- S3 Multi-Region Access Points → automatic global replication + routing.
- S3 Express One Zone → ultra-low-latency object access (single-digit milliseconds).
- Multipart Upload / Transfer Acceleration for large files.
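To make lifecycle policies concrete, here is a minimal boto3 sketch (the bucket name and prefix are hypothetical) that tiers a log prefix down through Standard-IA and Deep Archive and expires stale noncurrent versions:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/prefix: transition log objects to Standard-IA
# after 30 days, Glacier Deep Archive after 180, and expire old
# noncurrent versions after a year.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-logs-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-and-expire-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
            }
        ]
    },
)
```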
1.2 Instance Store
- Scales to millions of Input/Output Operations Per Second (IOPS), is tied to a single EC2 instance, and delivers very low latency.
- An instance store in AWS provides temporary, block-level storage physically attached to the host computer of an Amazon EC2 instance.
- This storage is ephemeral, meaning all data stored on it is lost if the instance stops, hibernates, terminates, or if the underlying drive fails.
Key Characteristics
Temporary Storage:
- The data does not persist beyond the life of the instance or if the instance is stopped (data does persist through a simple reboot).
High Performance:
- Because the storage is physically attached to the host, it offers very low latency and high I/O performance compared to network-attached Amazon EBS volumes.
Cost-Effective:
- The cost of the instance store is included in the price of the EC2 instance itself, so there are no separate storage charges.
Instance-Specific:
- Not all EC2 instance types offer instance store volumes, and the available size, type (SSD or HDD), and quantity vary by instance type.
No Snapshots:
- You cannot take a snapshot of an instance store volume, and its data is not included in an Amazon Machine Image (AMI) created from the instance.
Common Use Cases
Instance stores are ideal for applications requiring a high-speed, temporary scratchpad or data that can be easily recreated.
Buffering and Caching:
- Used for temporary data that requires fast access but does not need long-term retention, such as session caching or image processing pipelines.
Scratch Data:
- Ideal for computational workloads that generate temporary data for the duration of a task.
High-Performance Computing (HPC):
- Suitable for scenarios that need massive I/O operations per second (IOPS) and can tolerate potential data loss, such as big data processing or analytical workloads.
Replicated Workloads:
- Can be used for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers, where individual instance failure does not compromise data availability.
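Because instance store availability, size, and media vary by instance type, it helps to check programmatically before choosing one. A small boto3 sketch, with an example instance type assumed:

```python
import boto3

ec2 = boto3.client("ec2")

# "i4i.2xlarge" is just an example; storage-optimized families
# (i3, i4i, d3, ...) typically ship with NVMe instance store.
resp = ec2.describe_instance_types(InstanceTypes=["i4i.2xlarge"])
for itype in resp["InstanceTypes"]:
    info = itype.get("InstanceStorageInfo")
    if info:
        print(itype["InstanceType"], info["TotalSizeInGB"], "GB",
              info["Disks"])
    else:
        print(itype["InstanceType"], "has no instance store")
```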
1.3 Block Storage
Amazon EBS (Elastic Block Store)
- High-performance, persistent block volumes attached to EC2.
- Single-AZ unless using replication features.
Key Types:
- gp3 – General purpose SSD, baseline 3,000 IOPS.
- io2/io2 Block Express – Highest durability (99.999%) & IOPS (up to 256K).
- st1/sc1 – HDD-based workloads (cold & throughput workloads).
Use Cases:
- Databases, EC2 boot disks, Elasticsearch, Kafka brokers, containers needing persistent block storage.
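As a sketch of provisioning beyond the gp3 baseline, here is a hedged boto3 example (the AZ, size, and tags are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical example: a gp3 volume with IOPS and throughput raised
# above the 3,000 IOPS / 125 MiB/s baseline for a database workload.
vol = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=500,            # GiB
    VolumeType="gp3",
    Iops=9000,           # gp3 supports up to 16,000 IOPS
    Throughput=500,      # MiB/s, up to 1,000 for gp3
    Encrypted=True,
    TagSpecifications=[{
        "ResourceType": "volume",
        "Tags": [{"Key": "role", "Value": "kafka-broker"}],
    }],
)
print(vol["VolumeId"], vol["State"])
```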
1.4 File Storage
Amazon EFS (Elastic File System)
- NFSv4.1/4.2 compatible, multi-AZ, elastic.
- Highly parallel — 10,000s of concurrent connections.
Modes:
- Standard / One Zone
- Performance Modes: General Purpose, Max I/O
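A minimal boto3 sketch of creating an EFS file system in Max I/O mode with provisioned throughput (the creation token and throughput figure are illustrative):

```python
import boto3

efs = boto3.client("efs")

# Hypothetical example: an encrypted file system in Max I/O mode with
# provisioned throughput for a highly parallel analytics workload.
fs = efs.create_file_system(
    CreationToken="analytics-shared-fs",   # idempotency token
    PerformanceMode="maxIO",
    ThroughputMode="provisioned",
    ProvisionedThroughputInMibps=256,
    Encrypted=True,
)
print(fs["FileSystemId"], fs["LifeCycleState"])
```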
Amazon FSx
Fully managed file systems:
- FSx for Lustre → HPC, ML training, large data pipelines.
- FSx for NetApp ONTAP → multi-protocol (NFS/SMB/iSCSI), SnapMirror, enterprise use.
- FSx for OpenZFS → low latency, ZFS snapshots, compression.
- FSx for Windows File Server → SMB for Windows workloads.
1.5 Archival Storage
Glacier Family
- Glacier Instant Retrieval – millisecond access
- Glacier Flexible Retrieval – minutes to hours
- Glacier Deep Archive – 12 to 48 hours
NB: Used for compliance, cold backups, log retention, disaster recovery.
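Restores from the Glacier tiers are asynchronous: you request a temporary copy, then poll until it is ready. A hedged boto3 sketch with placeholder bucket and key names:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical object: restore from Glacier Flexible Retrieval for
# 7 days via the Standard tier (Expedited and Bulk are alternatives).
s3.restore_object(
    Bucket="example-archive-bucket",
    Key="backups/2023/db-dump.tar.gz",
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Standard"},
    },
)

# Poll head_object until the Restore field shows ongoing-request="false".
head = s3.head_object(Bucket="example-archive-bucket",
                      Key="backups/2023/db-dump.tar.gz")
print(head.get("Restore"))
```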
1.6 Edge & Hybrid Storage
AWS Storage Gateway
- File Gateway → NFS/SMB with S3 backend
- Tape Gateway → virtual tapes → S3 Glacier
- Volume Gateway → block replicas from on-prem to AWS
AWS DataSync
- High-speed transfer between:
- On-prem → AWS
- AWS ↔ other cloud
- AWS services (S3 ↔ EFS, FSx, etc.)
Snow Family
- Snowcone, Snowball Edge, and Snowmobile, for offline transfer + edge compute at scale.
2. Data Management & Performance Tuning
2.1 S3 Performance
- Request parallelization via prefixes: modern S3 scales automatically across prefixes (no manual management needed).
- Use multipart upload for objects larger than 100 MB (see the sketch below).
- S3 Select — pushdown filtering for lower transfer costs.
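boto3's managed transfers handle the multipart mechanics for you; a minimal sketch with illustrative thresholds and placeholder names:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Managed transfer: boto3 splits large files into parallel multipart
# uploads automatically. Thresholds here are illustrative.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # multipart above 100 MB
    multipart_chunksize=64 * 1024 * 1024,   # 64 MB parts
    max_concurrency=16,                     # parallel part uploads
)
s3.upload_file("dataset.parquet", "example-data-bucket",
               "staging/dataset.parquet", Config=config)
```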
2.2 EBS Performance
- Initialize (pre-warm) volumes restored from snapshots to reach full throughput; freshly created volumes need no initialization.
- RAID 0 striping across multiple volumes for IOPS-heavy workloads (e.g., MongoDB).
- EBS-optimized instances are mandatory for stable latency.
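gp3 performance can also be retuned in place via Elastic Volumes, with no detach or downtime. A hedged boto3 sketch (the volume ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

# Raise IOPS/throughput on a live gp3 volume; no detach required.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",
    Iops=12000,
    Throughput=750,  # MiB/s
)

# Track progress of the modification.
mods = ec2.describe_volumes_modifications(
    VolumeIds=["vol-0123456789abcdef0"])
print(mods["VolumesModifications"][0]["ModificationState"])
```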
2.3 EFS Performance
- Burst Credits matter for sustained workloads.
- Use Provisioned Throughput for analytics / ML workloads.
- Use One Zone for low latency HPC pipelines.
2.4 FSx Tuning
FSx for Lustre
- Hundreds of GB/s of aggregate throughput and sub-millisecond latencies.
- Integrates natively with S3 (import/export).
- Excellent for ML training, genomics, seismic processing.
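A minimal boto3 sketch of a scratch Lustre file system linked to an S3 bucket (the subnet and bucket names are placeholders; note that PERSISTENT_2 deployments use data repository associations instead of ImportPath/ExportPath):

```python
import boto3

fsx = boto3.client("fsx")

# Hypothetical example: a scratch Lustre file system that lazy-loads
# data from an S3 bucket and can export results back to it.
fs = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,   # GiB; minimum increment for SCRATCH_2
    SubnetIds=["subnet-0123456789abcdef0"],
    LustreConfiguration={
        "DeploymentType": "SCRATCH_2",
        "ImportPath": "s3://example-training-data",
        "ExportPath": "s3://example-training-data/results",
    },
)
print(fs["FileSystem"]["FileSystemId"])
```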
3. Workload-Specific Architecture Patterns
3.1 Data Lake / Lakehouse
- S3 as central data lake storage.
- Glue Catalog / Lake Formation for schema + governance.
- Athena for SQL analysis.
- Iceberg + S3 for lakehouse tables.
- EFS/FSx for ETL staging.
Architecture decisions:
- Use S3 Intelligent-Tiering for unpredictable access.
- Use S3 Object Lock if regulatory compliance needed.
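As a concrete taste of querying the lake in place, here is a hedged boto3 Athena sketch (the database, table, and results bucket are hypothetical):

```python
import boto3

athena = boto3.client("athena")

# Run SQL directly against data in the S3 data lake; results land in
# a dedicated results bucket.
q = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM web_logs GROUP BY status",
    QueryExecutionContext={"Database": "lake_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(q["QueryExecutionId"])
```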
3.2 HPC (High-Performance Computing) Storage
Typical stack:
- FSx for Lustre → primary compute scratch.
- S3 → durable dataset repository.
- ParallelCluster integrates Lustre natively.
- EFA networking for low-latency HPC nodes.
Pattern:
On-prem HPC → DataSync → S3 → FSx for Lustre → HPC Compute → Results → S3
3.3 Hybrid / Multi-Cloud Storage
Patterns:
Pattern A: On-Prem NAS → S3 via File Gateway
- Offload backups
- Cold archive to Glacier
Pattern B: Multi-Cloud Dataset Sharing
- DataSync between AWS ↔ Azure Blob / GCP Storage
- S3 Multi-Region Access Points for global apps
Pattern C: DR Across Clouds
- FSx/EC2 snapshot replication → S3 → export to another cloud
- VMware Cloud on AWS for lift-and-shift workloads
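Assuming the source and destination DataSync locations are already registered, tying them together as a task is a couple of calls (the ARNs below are placeholders):

```python
import boto3

ds = boto3.client("datasync")

# Move data from an on-prem NFS share (already registered as a
# DataSync location) into an S3 location.
task = ds.create_task(
    SourceLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-src",
    DestinationLocationArn="arn:aws:datasync:us-east-1:111122223333:location/loc-dst",
    Name="onprem-nas-to-s3",
)
ds.start_task_execution(TaskArn=task["TaskArn"])
```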
4. Security & Governance
4.1 S3
- Encryption: SSE-S3, SSE-KMS, CSE-KMS.
- Block Public Access (enable at account level).
- VPC endpoints → no public internet traffic.
- Bucket policies scoped to IAM conditions (see the sketch below) like:
- aws:PrincipalOrgID
- aws:SourceVpc
- aws:SourceArn
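To make those condition keys concrete, here is a hedged sketch (the bucket, account, and organization IDs are placeholders) that denies any request originating outside your AWS Organization via aws:PrincipalOrgID:

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny all S3 actions unless the caller belongs to the organization.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideOrg",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::example-internal-bucket",
            "arn:aws:s3:::example-internal-bucket/*",
        ],
        "Condition": {
            "StringNotEquals": {"aws:PrincipalOrgID": "o-exampleorgid"}
        },
    }],
}
s3.put_bucket_policy(Bucket="example-internal-bucket",
                     Policy=json.dumps(policy))
```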
4.2 EFS/FSx
- Enforce access via:
- Security Groups
- IAM & POSIX permissions
- KMS encryption
4.3 Backup
- AWS Backup automates protection for:
- S3
- EFS
- FSx
- RDS
- DynamoDB
- EBS
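A minimal boto3 sketch of such a backup plan (the vault name and schedule are illustrative; AWS Backup requires DeleteAfterDays to exceed the cold-storage transition by at least 90 days):

```python
import boto3

backup = boto3.client("backup")

# Hypothetical plan: daily backups at 05:00 UTC into the "default"
# vault, moved to cold storage after 30 days, deleted after a year.
plan = backup.create_backup_plan(BackupPlan={
    "BackupPlanName": "daily-protection",
    "Rules": [{
        "RuleName": "daily",
        "TargetBackupVaultName": "default",
        "ScheduleExpression": "cron(0 5 * * ? *)",
        "Lifecycle": {"MoveToColdStorageAfterDays": 30,
                      "DeleteAfterDays": 365},
    }],
})
print(plan["BackupPlanId"])
```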
5. Storage Architectures by Use Case
5.1 DevOps / CI/CD
- S3 for artifact and container image storage
- EFS for shared build caches
- FSx for ONTAP for enterprise CI/CD systems requiring snapshots & clones
5.2 Kubernetes / EKS
Recommended:
- EBS for per-pod persistent volumes (CSI).
- EFS for shared workloads across nodes.
- FSx for Lustre for high-performance ML workloads.
- FSx for ONTAP for multi-protocol storage.
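For the EBS case, here is a hedged sketch using the official kubernetes Python client to register a gp3 StorageClass for the EBS CSI driver (assumes the driver is installed on the cluster and a kubeconfig is set up):

```python
from kubernetes import client, config

config.load_kube_config()

# gp3 volumes bound only once a pod is scheduled, so each volume is
# provisioned in the same AZ as the node that mounts it.
sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="gp3-csi"),
    provisioner="ebs.csi.aws.com",
    parameters={"type": "gp3", "encrypted": "true"},
    volume_binding_mode="WaitForFirstConsumer",
    allow_volume_expansion=True,
)
client.StorageV1Api().create_storage_class(sc)
```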
5.3 Analytics / Big Data
- S3 → main data lake
- Redshift → compute warehouse
- Glue → ETL
- EMR/Hadoop + FSx for Lustre → HPC-scale ETL
6. Disaster Recovery Strategy
Levels:
- Warm Standby → S3 cross-region replication + EFS One Zone → Multi-AZ.
- Pilot Light → critical S3 + FSx replication only.
- Backup & Restore → S3 Glacier as cold storage.
- Multi-Region Active-Active → S3 Multi-Region Access Points + DynamoDB Global Tables.
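Cross-region replication underpins the Warm Standby level. A hedged boto3 sketch (the buckets and IAM role are placeholders, and versioning must already be enabled on both buckets):

```python
import boto3

s3 = boto3.client("s3")

# Replicate every new object from the primary bucket to a bucket in
# the DR region, landing replicas in Standard-IA to save cost.
s3.put_bucket_replication(
    Bucket="example-primary-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [{
            "ID": "dr-replication",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                "Bucket": "arn:aws:s3:::example-dr-bucket",
                "StorageClass": "STANDARD_IA",
            },
        }],
    },
)
```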