AWS Storage Strategies - Deep Dive.
Scope:
- Intro,
- Core AWS Storage Categories,
- Functional domain 1: Object Storage,
- Functional domain 2: Instance Store,
- Functional domain 3: Block Storage,
- Functional domain 4: File Storage,
- Functional domain 5: Archival Storage,
- Data Management & Performance Tuning,
- Workload-Specific Architecture Patterns,
- Security & Governance,
- Storage Architectures by Use Case,
- Disaster Recovery Strategy,
- Full AWS Storage Architecture Diagram.
1. Core AWS Storage Categories
- AWS storage falls under 5 functional domains:
Functional domain 1: Object Storage
Amazon S3 (Simple Storage Service)
- Designed for:
- Massive scale,
- Durability (11 9's, i.e. 99.999999999%),
- Multi-AZ availability.
- Not POSIX compliant.
- Best for:
- Data lakes,
- ML datasets,
- backups,
- logs,
- HPC staging,
- container artifacts,
- static hosting.
Key S3 Features for Engineers
- Storage Classes: Standard, Standard-IA, One Zone-IA, Glacier Instant Retrieval, Glacier Flexible Retrieval, Glacier Deep Archive.
- Lifecycle Policies → automate tiering & expiration.
- Versioning → object-level snapshot history.
- Object Lock (WORM) → security/compliance.
- S3 Access Points → multi-tenant access patterns.
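- S3 Multi-Region Access Points → a single global endpoint that routes requests to the closest bucket; the buckets behind it are kept in sync with S3 replication rules.
- S3 Express One Zone → ultra-low-latency object access (single-digit milliseconds) within one Availability Zone.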
- Multipart Upload / Transfer Acceleration for large files.
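As a rough illustration of the lifecycle policies mentioned above, the sketch below builds a rule that tiers objects to Standard-IA, then to Deep Archive, and finally expires them. The helper name `build_lifecycle_rule`, the `logs/` prefix, the day counts, and the bucket name in the comment are all assumptions made for the example; the dictionary shape follows what boto3's `put_bucket_lifecycle_configuration` accepts.

```python
import json


def build_lifecycle_rule(prefix, ia_after_days, archive_after_days, expire_after_days):
    """Return one S3 lifecycle rule: Standard -> Standard-IA -> Deep Archive -> delete.

    Hypothetical helper for illustration; it only builds the request payload.
    """
    # S3 requires objects to be at least 30 days old before moving to Standard-IA.
    if ia_after_days < 30:
        raise ValueError("transition to STANDARD_IA needs at least 30 days")
    if not (ia_after_days < archive_after_days < expire_after_days):
        raise ValueError("days must satisfy IA < archive < expiration")
    return {
        "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


config = {"Rules": [build_lifecycle_rule("logs/", 30, 90, 365)]}
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the payload could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Keeping the rule as plain data makes it easy to review in a pull request before anything touches a real bucket.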
Functional domain 2: Instance Store
- Instance store volumes can scale to millions of IOPS (input/output operations per second), are tied to a single EC2 instance, and offer very low latency.
- An instance store in AWS provides temporary, block-level storage physically attached to the host computer of an Amazon EC2 instance.
- This storage is ephemeral, meaning all data stored on it is lost if the instance stops, hibernates, terminates, or if the underlying drive fails.
Key Characteristics
Temporary Storage:
- The data does not persist beyond the life of the instance or if the instance is stopped (data does persist through a simple reboot).
High Performance:
- Because the storage is physically attached to the host, it offers very low latency and high I/O performance compared to network-attached Amazon EBS volumes.
Cost-Effective:
- The cost of the instance store is included in the price of the EC2 instance itself, so there are no separate storage charges.
Instance-Specific:
- Not all EC2 instance types offer instance store volumes, and the available size, type (SSD or HDD), and quantity vary by instance type.
No Snapshots:
- You cannot take a snapshot of an instance store volume, and its data is not included in an Amazon Machine Image (AMI) created from the instance.
Common Use Cases
- Instance stores are ideal for applications requiring a high-speed, temporary scratchpad or data that can be easily recreated.
Buffering and Caching:
- Used for temporary data that requires fast access but does not need long-term retention, such as session caching or image processing pipelines.
Scratch Data:
- Ideal for computational workloads that generate temporary data for the duration of a task.
High-Performance Computing (HPC):
- Suitable for scenarios that need massive I/O operations per second (IOPS) and can tolerate potential data loss, such as big data processing or analytical workloads.
Replicated Workloads:
- Can be used for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers, where the failure of an individual instance does not compromise data availability.
Functional domain 3: Block Storage
Amazon EBS (Elastic Block Store)
- High-performance, persistent block volumes attached to EC2.
- Single-AZ unless using replication features.
Key Types:
- gp3 – General purpose SSD, baseline 3,000 IOPS.
- io2/io2 Block Express – Highest durability (99.999%) & IOPS (up to 256K).
- st1/sc1 – HDD-based volumes: st1 for throughput-optimized workloads, sc1 for cold, infrequently accessed data.
Use Cases:
- Databases, EC2 boot disks, Elasticsearch, Kafka brokers, containers needing persistent block.
Functional domain 4: File Storage
Amazon EFS (Elastic File System)
- NFSv4.0/4.1 compatible, multi-AZ, elastic.
- Highly parallel: supports tens of thousands of concurrent connections.
Modes:
- Standard / One Zone
- Performance Modes: General Purpose, Max I/O
Amazon FSx
Fully managed file systems:
- FSx for Lustre → HPC, ML training, large data pipelines.
- FSx for NetApp ONTAP → multi-protocol (NFS/SMB/iSCSI), SnapMirror, enterprise use.
- FSx for OpenZFS → low latency, ZFS snapshots, compression.
- FSx for Windows File Server → SMB for Windows workloads.
Functional domain 5: Archival Storage
Glacier Family
- Glacier Instant Retrieval – ms access
- Glacier Flexible Retrieval – minutes
- Glacier Deep Archive – hours
NB:
- Used for compliance, cold backups, log retention, disaster recovery.
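To make the trade-off between the three Glacier tiers concrete, here is a minimal sketch that picks the coldest tier whose retrieval time still fits a given tolerance. The function name `pick_archive_class` and the exact thresholds are illustrative assumptions (the 5-minute figure assumes expedited retrievals for Flexible Retrieval, and 12 hours assumes standard retrievals for Deep Archive); the returned strings are the S3 storage class identifiers.

```python
from datetime import timedelta

# Coldest tier first. Thresholds are rough, illustrative assumptions.
ARCHIVE_TIERS = [
    ("DEEP_ARCHIVE", timedelta(hours=12)),  # hours
    ("GLACIER", timedelta(minutes=5)),      # Flexible Retrieval: minutes
]


def pick_archive_class(max_wait):
    """Return the coldest storage class whose retrieval time fits max_wait."""
    for storage_class, retrieval_time in ARCHIVE_TIERS:
        if retrieval_time <= max_wait:
            return storage_class
    # Anything tighter than minutes falls back to Instant Retrieval (ms access).
    return "GLACIER_IR"


for wait in (timedelta(days=1), timedelta(minutes=30), timedelta(seconds=2)):
    print(wait, "->", pick_archive_class(wait))
```

In practice the choice also depends on retrieval fees and minimum storage durations, which this sketch deliberately leaves out.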
1.6 Edge & Hybrid Storage
AWS Storage Gateway
- File Gateway → NFS/SMB with S3 backend
- Tape Gateway → Virtual tapes → S3 Glacier
- Volume Gateway → Block replicas from on-prem to AWS
AWS DataSync
- High-speed transfer between:
- On-prem → AWS
- AWS ↔ other cloud
- AWS services (S3 ↔ EFS, FSx, etc.)
Snow Family
- Snowcone, Snowball Edge, Snowmobile: for offline transfer and edge compute at scale.
2. Data Management & Performance Tuning
2.1 S3 Performance
- Request parallelization via prefixes:
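- Modern S3 scales automatically across prefixes (no manual management needed); each prefix supports at least 3,500 write and 5,500 read requests per second, so spreading keys over more prefixes raises the aggregate rate.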
- Use multipart for >100MB objects.
- S3 Select — pushdown filtering for lower transfer costs.
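The multipart guidance above can be sketched as a small planning function. It uses the 100 MB threshold from the list together with the documented S3 multipart limits (parts between 5 MiB and 5 GiB, at most 10,000 parts per upload). The name `plan_multipart` and the 64 MiB preferred part size are assumptions chosen for the example.

```python
MIB = 1024 * 1024
MULTIPART_THRESHOLD = 100 * MIB   # threshold suggested in the list above
MIN_PART_SIZE = 5 * MIB           # S3 minimum part size (the last part may be smaller)
MAX_PART_SIZE = 5 * 1024 * MIB    # S3 maximum part size (5 GiB)
MAX_PARTS = 10_000                # S3 maximum number of parts per upload


def plan_multipart(object_size, preferred_part_size=64 * MIB):
    """Return (part_size, part_count), or None when a single PUT is enough."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if object_size <= MULTIPART_THRESHOLD:
        return None
    # Grow the part size when the preferred size would exceed 10,000 parts.
    min_size_for_part_limit = -(-object_size // MAX_PARTS)  # integer ceiling
    part_size = max(preferred_part_size, MIN_PART_SIZE, min_size_for_part_limit)
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for one multipart upload")
    part_count = -(-object_size // part_size)
    return part_size, part_count


print(plan_multipart(50 * MIB))           # small object: single PUT
print(plan_multipart(1024 * MIB))         # 1 GiB object
print(plan_multipart(1024 * 1024 * MIB))  # 1 TiB object
```

Higher-level SDK helpers usually do this planning for you; the sketch is only meant to show where the limits come from.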
2.2 EBS Performance
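- New EBS volumes deliver full performance immediately; only volumes restored from snapshots need initialization (pre-warming), or Fast Snapshot Restore, to avoid first-access latency.
- RAID 0 striping across multiple volumes for IOPS-heavy workloads (e.g., MongoDB).
- Use EBS-optimized instances (the default on most current-generation instance types) for stable latency.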
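As a back-of-the-envelope illustration of RAID 0 striping, the sketch below estimates how many volumes a stripe set needs to reach a target IOPS figure, assuming IOPS add up linearly across members. The function name is hypothetical, the default of 3,000 IOPS is the gp3 baseline quoted earlier, and the estimate ignores per-instance EBS limits, which cap what a single instance can actually drive.

```python
def volumes_for_target_iops(target_iops, per_volume_iops=3000):
    """Estimate the RAID 0 member count needed for target_iops.

    Assumes ideal linear scaling; real results depend on instance-level
    EBS bandwidth and IOPS limits.
    """
    if target_iops <= 0 or per_volume_iops <= 0:
        raise ValueError("IOPS values must be positive")
    return -(-target_iops // per_volume_iops)  # integer ceiling


for target in (3000, 20000, 64000):
    print(target, "IOPS ->", volumes_for_target_iops(target), "volume(s)")
```

Keep in mind that RAID 0 has no redundancy: losing any one member volume loses the whole array, so snapshots or application-level replication are still needed.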
2.3 EFS Performance
- In Bursting Throughput mode, burst credits can run out under sustained load; monitor the BurstCreditBalance metric.
- Use Provisioned Throughput for analytics / ML workloads.
- Use One Zone for low latency HPC pipelines.
2.4 FSx Tuning
FSx for Lustre
- Hundreds of GB/s of throughput and sub-millisecond latency.
- Integrates natively with S3 (import/export).
- Excellent for ML training, genomics, seismic processing.
3. Workload-Specific Architecture Patterns
3.1 Data Lake / Lakehouse
- S3 as central data lake storage.
- Glue Catalog / Lake Formation for schema + governance.
- Athena for SQL analysis.
- Iceberg + S3 for lakehouse tables.
- EFS/FSx for ETL staging.
Architecture decisions:
- Use S3 Intelligent-Tiering for unpredictable access.
- Use S3 Object Lock if regulatory compliance needed.
3.2 HPC (High-Performance Computing) Storage
Typical stack:
- FSx for Lustre → primary compute scratch.
- S3 → durable dataset repository.
- ParallelCluster integrates Lustre natively.
- EFA networking for low-latency HPC nodes.
Pattern:
On-prem HPC → DataSync → S3 → FSx for Lustre → HPC Compute → Results → S3
3.3 Hybrid / Multi-Cloud Storage
Patterns:
Pattern A: On-Prem NAS → S3 via File Gateway
- Offload backups
- Cold archive to Glacier
Pattern B: Multi-Cloud Dataset Sharing
- DataSync between AWS ↔ Azure Blob / GCP Storage
- S3 Multi-Region Access Points for global apps
Pattern C: DR Across Clouds
- FSx/EC2 snapshot replication → S3 → export to another cloud
- VMware Cloud on AWS for lift-and-shift workloads
4. Security & Governance
4.1 S3
- Encryption: SSE-S3, SSE-KMS, CSE-KMS.
- Block Public Access (enable at account level).
- VPC endpoints → no public internet traffic.
- Bucket policies scoped to IAM conditions like:
- aws:PrincipalOrgID
- aws:SourceVpc
- aws:SourceArn
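One way the condition keys above might be used is a bucket policy that denies any request coming from outside an AWS Organization, optionally also requiring a specific VPC. The sketch below only builds the policy document; the function name, bucket name, organization ID, and VPC ID are placeholders, not values from this article.

```python
import json


def org_only_bucket_policy(bucket, org_id, vpc_id=None):
    """Build a bucket policy that denies access from outside org_id.

    Hypothetical helper: if vpc_id is given, requests that do not arrive
    through that VPC are denied as well.
    """
    resources = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    def deny(sid, condition_key, value):
        return {
            "Sid": sid,
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": resources,
            "Condition": {"StringNotEquals": {condition_key: value}},
        }

    statements = [deny("DenyPrincipalsOutsideOrg", "aws:PrincipalOrgID", org_id)]
    if vpc_id is not None:
        statements.append(deny("DenyOutsideVpc", "aws:SourceVpc", vpc_id))
    return {"Version": "2012-10-17", "Statement": statements}


policy = org_only_bucket_policy("example-bucket", "o-exampleorgid")
print(json.dumps(policy, indent=2))
```

Broad Deny statements like these can also lock out administrators and AWS services that write to the bucket, so a policy of this kind is best tried on a non-production bucket first.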
4.2 EFS/FSx
- Enforce access via:
- Security Groups
- IAM & POSIX permissions
- KMS encryption
4.3 Backup
- AWS Backup automates protection for:
- S3
- EFS
- FSx
- RDS
- DynamoDB
- EBS
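A backup plan for the services listed above could look roughly like the following. The sketch builds the `BackupPlan` document that boto3's `create_backup_plan` accepts; the plan name, vault name, schedule, and retention values are assumptions for illustration. The validation reflects the AWS Backup rule that deletion must come at least 90 days after a move to cold storage.

```python
def build_backup_plan(name, vault, cold_after_days, delete_after_days,
                      schedule="cron(0 5 * * ? *)"):
    """Return a BackupPlan dict with a single scheduled rule.

    Hypothetical helper; note that not every resource type supports
    the transition to cold storage.
    """
    if delete_after_days < cold_after_days + 90:
        raise ValueError("DeleteAfterDays must be at least 90 days after cold storage")
    return {
        "BackupPlanName": name,
        "Rules": [
            {
                "RuleName": f"{name}-scheduled",
                "TargetBackupVaultName": vault,
                "ScheduleExpression": schedule,
                "Lifecycle": {
                    "MoveToColdStorageAfterDays": cold_after_days,
                    "DeleteAfterDays": delete_after_days,
                },
            }
        ],
    }


plan = build_backup_plan("storage-daily", "example-vault", 30, 365)
print(plan["Rules"][0]["Lifecycle"])

# With boto3 installed and credentials configured, the plan could be created with:
#   boto3.client("backup").create_backup_plan(BackupPlan=plan)
# Resources (EBS, EFS, FSx, ...) are attached separately through a backup selection.
```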
5. Storage Architectures by Use Case
5.1 DevOps / CI/CD
- S3 for artifact and container image storage
- EFS for shared build caches
- FSx for ONTAP for enterprise CI/CD systems requiring snapshots & clones
5.2 Kubernetes / EKS
Recommended:
- EBS for per-pod persistent volumes (CSI).
- EFS for shared workloads across nodes.
- FSx for Lustre for high-performance ML workloads.
- FSx for ONTAP for multi-protocol storage.
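The recommendations above can be read as a simple decision rule. The sketch below encodes it using the Kubernetes access modes `ReadWriteOnce` (one node) and `ReadWriteMany` (shared across nodes); the function name, its flags, and the order in which the checks are applied are assumptions, not an official selection guide.

```python
def suggest_eks_storage(access_mode, high_performance_ml=False, multi_protocol=False):
    """Map a workload's needs to one of the storage options listed above."""
    if multi_protocol:
        return "FSx for ONTAP"
    if high_performance_ml:
        return "FSx for Lustre"
    if access_mode == "ReadWriteMany":
        return "EFS"  # shared across pods and nodes
    if access_mode == "ReadWriteOnce":
        return "EBS"  # per-pod persistent volume via the CSI driver
    raise ValueError(f"unsupported access mode: {access_mode}")


print(suggest_eks_storage("ReadWriteOnce"))
print(suggest_eks_storage("ReadWriteMany"))
print(suggest_eks_storage("ReadWriteMany", high_performance_ml=True))
```

Real clusters often mix several of these, for example EBS for databases alongside EFS for shared configuration.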
5.3 Analytics / Big Data
- S3 → main Data lake
- Redshift → compute warehouse
- Glue → ETL
- EMR/Hadoop + FSx for Lustre → HPC-scale ETL
6. Disaster Recovery Strategy
Levels:
- Warm Standby → S3 cross-region replication + EFS One Zone → Multi-AZ.
- Pilot Light → critical S3 + FSx replication only.
- Backup & Restore → S3 Glacier as cold storage.
- Multi-Region Active-Active → S3 Multi-Region Access Points + DynamoDB Global Tables.