Thursday, May 8, 2025

Amazon EFS (Elastic File System)

 

Amazon EFS (Elastic File System) is a scalable, fully managed NFS-based file storage service designed for use with AWS Cloud services and on-premises resources.

It provides shared, elastic file storage that grows or shrinks automatically as twtech adds or removes files.

 Key Features of AWS EFS

  • Fully managed – No infrastructure to manage.
  • Elastic scaling – Automatically scales storage up/down.
  • POSIX-compliant – Works like a traditional Linux file system.
  • Multi-AZ availability – Designed for high availability and durability.
  • Shared access – Supports simultaneous access from thousands of EC2 instances.
  • Encryption – Supports encryption at rest and in transit.
  • Performance modes – General Purpose and Max I/O.
  • Throughput modes – Bursting, Provisioned, and Elastic.

Common Use Cases for Amazon EFS

1. Web Serving and Content Management

  • Shared file storage for web servers (e.g., WordPress with EC2 Auto Scaling).
  • Store images, videos, and assets that need to be accessed by multiple instances.

2. Lift-and-Shift Applications

  • Supports legacy applications that require file system access with NFS.
  • No need to refactor apps to use object storage (like S3).

3. Big Data and Analytics

  • Store large datasets used by Hadoop, Spark, and analytics tools.
  • Provides low-latency access and parallel read/write.

4. Machine Learning and AI Workloads

  • Share datasets across multiple ML training jobs and instances.
  • Useful for distributed training or batch inference.

5. Container Storage (EKS, ECS)

  • Persistent storage for Kubernetes pods using EFS CSI driver.
  • Ideal for microservices that require shared storage.

6. CI/CD Pipelines

  • Share build artifacts and test results across build agents.
  • Efficient for DevOps environments where multiple workers need access.

7. Backup and Restore

  • Use AWS Backup with EFS for simple and automated file system backups.
  • Useful for applications that generate daily data dumps or logs.

8. Media Processing Workflows

  • Store raw and processed media files.
  • Simultaneous access from rendering or transcoding applications.

 When to Use EFS vs. Alternatives

| Use Case | Use EFS | Use EBS | Use S3 |
|---|---|---|---|
| Shared access by multiple EC2s | ✅ Yes | ❌ No | ✅ Yes (via SDKs, not NFS) |
| NFS file system support | ✅ Yes | ❌ No | ❌ No |
| High performance block storage | ❌ No | ✅ Yes | ❌ No |
| Long-term archival | ❌ No | ❌ No | ✅ Yes (S3 Glacier) |
| Cost-sensitive object storage | ❌ No | ❌ No | ✅ Yes |

The difference between Amazon EFS (Elastic File System) and NFS (Network File System) lies in their nature, scope, and implementation.

Here's a clear breakdown:

 What They Are

| Aspect | Amazon EFS | NFS (Network File System) |
|---|---|---|
| Type | Fully managed cloud service | File system protocol |
| Provider | AWS | Open standard (RFCs) |
| Purpose | Scalable, shared file storage in AWS | Protocol for sharing files over a network |

 Key Differences

| Feature | Amazon EFS | NFS |
|---|---|---|
| Nature | A managed storage service that uses NFS behind the scenes | A file-sharing protocol (v2, v3, v4) used to mount remote file systems |
| Setup | No server management; AWS handles everything | Requires you to set up and manage your own NFS server |
| Scalability | Automatically scales storage and performance | Must manually scale server, storage, and performance |
| Availability | Multi-AZ high availability (EFS Standard) | Depends on how you configure your NFS server |
| Performance Modes | General Purpose or Max I/O modes | Performance is limited by server, network, and hardware configuration |
| Security & Compliance | Integrated with IAM, VPC, KMS encryption, AWS Backup | You manage authentication, access, and security policies manually |
| Access | Mount via NFSv4.0 or NFSv4.1 on Linux | Mount via NFS (various versions), on-prem or cloud |
| Use Case | Cloud-native apps, serverless, containers, multi-instance workloads | Legacy or on-prem applications requiring file-level storage |
| Cost Model | Pay-as-you-go, based on usage | Hardware/software costs, plus ongoing admin overhead |

twtech-Summary

  • Amazon EFS = a fully managed file system in the cloud that uses the NFS protocol.
  • NFS = a file access protocol, commonly used in on-prem and DIY environments.

twtech thinks of EFS as "NFS as a service" — AWS abstracts away the server and scaling management so twtech can focus on using the file system, not running it.

Addendum:

Amazon EFS: Elastic File System

• EFS is a managed NFS (network file system) that can be mounted on many EC2 instances.

• EFS works with EC2 instances across multiple AZs.

• EFS is highly available and scalable, but expensive (about 3x gp2), and billed per use.

twtech use cases for EFS

• EFS is used for content management, web serving, data sharing, and WordPress.

• EFS uses the NFSv4.1 protocol.

NFSv4.1 (Network File System version 4.1) is a protocol developed by the IETF (Internet Engineering Task Force) for sharing files over a network in a secure, scalable, and performant way. It is a significant improvement over earlier versions of NFS and includes enhancements primarily for performance, security, and support for clustered environments.

Here are the key features and components of NFSv4.1:

 Key Features of NFSv4.1

  1. Session-Based Protocol:
    • Introduces sessions, which Parallel NFS (pNFS) builds on.
    • Adds client-server session states to improve reliability and performance.
  2. Parallel NFS (pNFS):
    • Decouples metadata and data access.
    • Metadata is managed by a metadata server; actual file data can be fetched directly from storage servers in parallel.
    • Great for performance in large-scale and distributed file systems.
  3. Stateful Protocol:
    • Maintains the state (open files, locks, etc.) across sessions, unlike NFSv3.
    • Client recovery is improved after server or network failures.
  4. Lease-Based Locking:
    • Uses leases for file locking instead of the separate NLM (Network Lock Manager).
    • Built-in, more efficient locking mechanism.
  5. Security Enhancements:
    • Built on RPCSEC_GSS to support Kerberos and other secure authentication mechanisms.
    • More secure than previous versions (like NFSv3) which relied on UID/GID-based access.
  6. COMPOUND RPCs:
    • Multiple operations can be bundled into a single RPC call, reducing network round trips.
    • Increases performance, especially over high-latency links.
  7. Firewall-Friendly:
    • Operates over a single well-known TCP port (2049), making it easier to configure across firewalls/NATs.
    • No reliance on portmapper/rpcbind.
  8. Delegations:
    • Clients can cache file data and perform operations locally under certain conditions.
    • Server "delegates" rights to the client, improving performance.

Common Use Cases

  • High-performance computing (HPC)
  • Media & entertainment workflows (shared access to large files)
  • Enterprise file sharing across departments or global offices
  • Cloud environments supporting NFS-based storage

 Limitations / Considerations

  • Complexity: Especially when using pNFS, the setup can be complex.
  • Performance Tuning: May require tuning of RPC parameters, cache settings, and session behaviors.
  • Client/Server Support: Ensure both NFS client and server support v4.1 and features like pNFS if required.

• twtech uses a security group to control access to the EFS.

• EFS is compatible only with Linux-based AMIs, not Windows AMIs.

• twtech can enable encryption at rest using KMS.

• EFS exposes a POSIX file system (as on Linux) with a standard file API.

A POSIX file system refers to any file system that complies with the POSIX (Portable Operating System Interface) standard, particularly IEEE 1003.1, which defines a set of APIs and behaviors for Unix-like operating systems, including file operations.

The concept: POSIX File System

At its core, a POSIX-compliant file system:

  • Supports hierarchical directories (with / as the separator)
  • Provides standard file operations, such as:
    • open(), read(), write(), close()
    • mkdir(), rmdir(), rename(), unlink()
    • chmod(), chown(), stat(), fstat(), lstat()
  • Implements standard file types:
    • Regular files, directories, symbolic links, FIFOs, sockets, device files
  • Uses permission-based access control:
    • Read, write, execute permissions for user, group, and others (e.g., chmod 755)
  • Supports user and group ownership
  • Maintains time stamps:
    • Access time (atime), modification time (mtime), and change time (ctime)
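
As a rough illustration of the operations listed above, the following shell sketch exercises several of them in a throwaway directory. The directory and file names are made up for the example, and each command is annotated with the POSIX call it roughly corresponds to.

```shell
# Work in a temporary directory so nothing else is touched.
workdir="$(mktemp -d)"

mkdir "$workdir/data"                            # mkdir()
echo "hello" > "$workdir/data/a.txt"             # open(), write(), close()
chmod 640 "$workdir/data/a.txt"                  # chmod(): rw for user, r for group
mv "$workdir/data/a.txt" "$workdir/data/b.txt"   # rename()
ln -s b.txt "$workdir/data/link"                 # create a symbolic link

# Read the permission bits (stat()) and the content through the link (read()).
perms="$(ls -l "$workdir/data/b.txt" | cut -c1-10)"
content="$(cat "$workdir/data/link")"
echo "$perms $content"                           # prints: -rw-r----- hello

rm "$workdir/data/link" "$workdir/data/b.txt"    # unlink()
rmdir "$workdir/data" "$workdir"                 # rmdir()
```

The same commands behave the same way on any POSIX-compliant file system, which is the point of the standard.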

 Examples of POSIX-Compliant File Systems

| File System | Description |
|---|---|
| ext3/ext4 | Common Linux file systems; fully POSIX-compliant |
| XFS | High-performance journaling file system; POSIX-compliant |
| Btrfs | Advanced Linux file system with snapshotting and volume management |
| ZFS | Robust file system with advanced features (POSIX compliance depends on implementation) |
| NFSv4 | Supports POSIX semantics over the network |
| FUSE-based FS | Compliance depends on the implementation of the FUSE file system |

Importance of POSIX Compliance

  • Application Portability: Ensures that apps using file I/O behave consistently across platforms.
  • Interoperability: POSIX-compliant apps and tools (like cp, mv, rsync, etc.) work as expected.
  • Standardization: Developers can rely on a uniform API and behavior when interacting with files.

 Non-POSIX File Systems (or Partial)

Some file systems (especially distributed or cloud-native ones) may not fully support POSIX semantics:

  • Amazon S3 – Object storage, not a file system, lacks full POSIX support.
  • HDFS (Hadoop Distributed File System) – Mostly POSIX-like, but doesn't fully support random writes or file updates.
  • Google Cloud Storage, Azure Blob Storage – Not POSIX-compliant; designed for object storage.

• The file system scales automatically and is billed per use, so twtech does not need to plan capacity.

twtech EFS Performance & Storage Classes 

Amazon EFS (Elastic File System) offers scalable, managed file storage for use with AWS services and on-premises resources. It's fully POSIX-compliant and integrates with EC2, Lambda, ECS, and other services. EFS supports multiple storage classes and performance modes to balance cost, latency, and throughput depending on your workload.

 EFS Storage Classes

| Storage Class | Description | Use Case |
|---|---|---|
| EFS Standard | Default class with low latency and high throughput for frequently accessed data | Active file workloads, web servers, analytics |
| EFS Infrequent Access (IA) | Lower-cost tier for files not accessed often (costs less per GB, higher access cost) | Backup, archive, long-term logs |

 Automatic Lifecycle Management:
  • You can enable lifecycle policies to automatically move files from Standard to IA based on age (e.g., after 7, 14, 30, 60, 90 days).
  • This helps reduce cost without manual intervention.
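
A minimal sketch of how such a policy could be set from the AWS CLI. It assumes the `aws efs put-lifecycle-configuration` command and a hypothetical file system ID; the helper only prints the command instead of running it, and it accepts only the ages listed above.

```shell
# Hypothetical file system ID; replace with a real one.
EFS_ID="fs-0123456789abcdef0"

# Map an age in days to the TransitionToIA value used by the EFS API.
# Only the ages mentioned in the text are accepted in this sketch.
lifecycle_policy_value() {
  case "$1" in
    7|14|30|60|90) printf 'AFTER_%s_DAYS\n' "$1" ;;
    *) echo "unsupported age: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the CLI command that would apply the policy.
print_lifecycle_command() {
  local value
  value="$(lifecycle_policy_value "$2")" || return 1
  printf 'aws efs put-lifecycle-configuration --file-system-id %s --lifecycle-policies TransitionToIA=%s\n' "$1" "$value"
}

print_lifecycle_command "$EFS_ID" 30
```

Printing the command first makes it easy to review the file system ID and age before applying anything.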

 EFS Performance Modes

| Performance Mode | Description | Use Case |
|---|---|---|
| General Purpose (default) | Low latency, suitable for most workloads including web, dev, and analytics apps | Ideal for latency-sensitive, single-threaded apps |
| Max I/O | Higher throughput and scalability; higher latency | Large-scale, highly parallelized workloads (e.g., big data, media processing) |

Note: Performance mode is set at creation time and cannot be changed later.

 Throughput Modes

| Throughput Mode | Description | Use Case |
|---|---|---|
| Bursting (default) | Throughput scales with file system size (baseline of 50 MiB/s per TiB of storage, with bursts above that) | Suitable for spiky workloads |
| Provisioned | Specify throughput independent of file system size | For workloads needing guaranteed performance regardless of size |
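
To see why file system size matters in Bursting mode, here is a small arithmetic sketch. The baseline rate is passed in as an input because the exact figure should be checked against current AWS documentation; the example calls use 50 MiB/s per TiB as an assumed value.

```shell
# Baseline throughput in MiB/s for a file system in Bursting mode.
#   $1 = stored size in GiB
#   $2 = baseline rate in MiB/s per TiB (assumed value; verify before relying on it)
bursting_baseline_mibps() {
  awk -v size_gib="$1" -v rate_per_tib="$2" \
    'BEGIN { printf "%.2f\n", size_gib / 1024 * rate_per_tib }'
}

bursting_baseline_mibps 100 50    # 100 GiB file system, prints 4.88
bursting_baseline_mibps 2048 50   # 2 TiB file system, prints 100.00
```

A small file system gets a small baseline, which is the situation where Provisioned throughput is worth considering.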


 Cost Summary

  • Standard: Higher $/GB but lower access cost.
  • IA: ~92% cheaper per GB, but ~$0.01/GB for access.
  • To be eligible for transition to IA, a file must be at least 128 KB and must not have been accessed for the duration of the lifecycle transition setting.
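
The trade-off between storage price and access fees can be sketched with a little arithmetic. The function below uses the two figures from the summary above (IA about 92% cheaper per GB, about $0.01 per GB accessed); the Standard price of 0.30 per GB-month in the example calls is a hypothetical input, not a quoted price.

```shell
# Estimate a monthly bill for data split between Standard and IA.
#   $1 = GB stored in Standard
#   $2 = GB stored in IA
#   $3 = GB accessed in IA during the month
#   $4 = Standard price per GB-month (hypothetical input)
efs_monthly_cost() {
  awk -v std_gb="$1" -v ia_gb="$2" -v ia_access_gb="$3" -v std_price="$4" 'BEGIN {
    ia_price   = std_price * (1 - 0.92)   # IA is about 92% cheaper per GB
    access_fee = 0.01                     # about 0.01 per GB accessed in IA
    printf "%.2f\n", std_gb * std_price + ia_gb * ia_price + ia_access_gb * access_fee
  }'
}

efs_monthly_cost 1000 0 0 0.30     # everything in Standard, prints 300.00
efs_monthly_cost 100 900 50 0.30   # 90% of the data in IA, prints 52.10
```

The access fee only starts to dominate when a large share of the IA data is read every month, which is why IA suits backups and old logs.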

twtech Recommendations

| Workload Type | Suggested Settings |
|---|---|
| General apps, dev/test | Standard + General Purpose |
| Media processing, analytics | Standard + Max I/O |
| Backups, long-term storage | Standard + IA (via lifecycle) |
| High throughput, small file system | Provisioned throughput |

Project: EFS (Hands-on)

Create file system

How twtech creates EFS file systems with recommended settings.

Customize settings:

This setting is only for the dev environment.

The following configuration is good for the Prod environment (highly available): twtech automatically backs up the file system data with AWS Backup using recommended settings. Additional pricing applies.

twtech recommends that automatic backups be kept enabled.

Lifecycle management: twtech automatically saves money as access patterns change by moving files into the Infrequent Access (IA) or Archive storage class.

Encryption

twtech chooses to enable encryption of the file system's data at rest. EFS uses the AWS KMS service key (aws/elasticfilesystem) by default.

Performance settings

Throughput mode

twtech chooses a method for the file system's throughput limits.

Additional settings

Performance mode

twtech sets the file system's performance mode based on the IOPS required. File systems using Elastic throughput mode only support General Purpose performance mode.

Tags (optional)

Network access

twtech creates a dedicated security group for twtech-efs.

twtech sets inbound rules: defaults

Select the EFS security group created: twtech-efs-SG

File system policy (optional)

Policy options

twtech reviews the configuration and creates the EFS: twtech-efs
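
For reference, the console choices above have a rough AWS CLI equivalent. The sketch below assumes the `aws efs create-file-system` command with its `--encrypted`, `--performance-mode`, `--throughput-mode`, and `--tags` options. It only prints the command, and it rejects the one combination the text rules out (Elastic throughput with a performance mode other than General Purpose). The function names are made up for the example.

```shell
# Return success only for combinations the text describes as valid:
# Elastic throughput requires the General Purpose performance mode.
valid_efs_modes() {
  local performance_mode="$1" throughput_mode="$2"
  if [ "$throughput_mode" = "elastic" ] && [ "$performance_mode" != "generalPurpose" ]; then
    return 1
  fi
  return 0
}

# Print (do not run) a create-file-system command for the given modes and name.
print_create_command() {
  valid_efs_modes "$1" "$2" || { echo "invalid mode combination: $1 + $2" >&2; return 1; }
  printf 'aws efs create-file-system --encrypted --performance-mode %s --throughput-mode %s --tags Key=Name,Value=%s\n' "$1" "$2" "$3"
}

print_create_command generalPurpose elastic twtech-efs
```

Mount targets and the security group still have to be attached afterwards, as in the Network access step.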





Details of the EFS created: twtech-efs

How twtech launches instances and attaches (mounts) the EFS to EC2 instances: twtech-web-server

Select the VPC and subnet that correspond to the AZ of the EFS: us-east-2b

Configure inbound rules:

Head to the EFS console: Network tab

Head to the EC2 console: make sure the instances are running.

Head to Security groups and verify that the security group created (efs-SG) allows the NFS protocol on port 2049.
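
The rule can also be checked from an instance by testing whether TCP port 2049 is reachable. This is a sketch only: it relies on bash's `/dev/tcp` feature and the coreutils `timeout` command, and the DNS name in the example is hypothetical.

```shell
# Return success if a TCP connection to host:port opens within the timeout.
#   $1 = host (for example, a mount target IP or DNS name)
#   $2 = port (defaults to 2049, the NFS port)
#   $3 = timeout in seconds (defaults to 3)
nfs_port_open() {
  timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "${2:-2049}" 2>/dev/null
}

# Hypothetical mount target DNS name; replace with a real one.
if nfs_port_open "fs-0123456789abcdef0.efs.us-east-2.amazonaws.com"; then
  echo "port 2049 reachable"
else
  echo "port 2049 not reachable; check the security group inbound rules"
fi
```

A failure here usually points at the inbound rule or at an instance launched in a different VPC or subnet than the mount target.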


How twtech accesses the instance created without a key pair (twtech-web-server):

twtech logs in successfully to create and mount the path to the EFS (CLI).

Verify any existing EFS mount points (CLI):

ls /mnt/efs/fs1

No mount point exists.
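
Note that `ls` only shows whether the directory exists, not whether a file system is mounted on it. A slightly stricter check, assuming the `mountpoint` utility from util-linux is installed, could look like this:

```shell
# Return success if the given path is an active mount point.
# Relies on the mountpoint utility from util-linux.
is_mounted() {
  mountpoint -q "$1"
}

# Path used in this walkthrough.
if is_mounted /mnt/efs/fs1; then
  echo "/mnt/efs/fs1 is mounted"
else
  echo "/mnt/efs/fs1 is not mounted"
fi
```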

How twtech creates a mount point with files:

twtech needs elevated rights to create the mount point, so it runs the commands as the root user:

sudo su
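
The walkthrough does not show the mount commands themselves. The sketch below prints what they would typically look like with the plain NFS client, assuming the standard EFS DNS name format and the NFSv4.1 mount options that AWS recommends. The file system ID is hypothetical; the Region matches the us-east-2b AZ used above.

```shell
# Hypothetical file system ID; replace with the real one.
EFS_ID="fs-0123456789abcdef0"
EFS_REGION="us-east-2"
MOUNT_POINT="/mnt/efs/fs1"

# Build the EFS DNS name: <file-system-id>.efs.<region>.amazonaws.com
efs_dns_name() {
  printf '%s.efs.%s.amazonaws.com\n' "$1" "$2"
}

# Print (do not run) the commands that create the mount point and mount the EFS.
print_mount_commands() {
  local dns
  dns="$(efs_dns_name "$1" "$2")"
  printf 'mkdir -p %s\n' "$3"
  printf 'mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport %s:/ %s\n' "$dns" "$3"
}

print_mount_commands "$EFS_ID" "$EFS_REGION" "$MOUNT_POINT"
```

The printed commands are meant to be run as root on each instance that should share the file system.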

twtech appends (>> adds to the content) or redirects (> overwrites the content) some text to a file on the NFS mount:

echo "twtech devsecops world" >> /mnt/efs/fs1/twtech-greetings.txt

twtech verifies the content of the file on twtech-db-server: /mnt/efs/fs1/twtech-greetings.txt

cat /mnt/efs/fs1/twtech-greetings.txt

NB:

When twtech runs cat on the same mounted file on another instance (twtech-web-server), the same content is displayed.

cat /mnt/efs/fs1/twtech-greetings.txt


An EFS that is not in use can be deleted:

Confirm and delete by entering the file system ID.


