Here’s twtech’s overview of AWS DataSync, including its 2025 enhancements and
core use cases:
The Concept: AWS DataSync
AWS DataSync is a high-speed,
secure, fully managed data transfer service for moving files and objects
between on‑premises systems, edge storage, other cloud providers, and AWS
storage services like Amazon S3, Amazon EFS, and Amazon FSx.
It automates and accelerates data
migration, replication, archiving, and hybrid cloud workflows, while preserving
metadata, supporting incremental transfer, scheduling, throttling, encryption,
and data integrity validation.
Noteworthy Updates in 2025
• Kerberos Authentication for SMB (Jan 28, 2025)
SMB source locations now support Kerberos authentication (in addition to NTLM) when connecting to self-managed SMB servers, such as those that use Active Directory with Kerberos v5.
• Enhanced Mode Cross‑Cloud Transfers (May 29, 2025)
DataSync can now copy data between AWS and storage services in other clouds (such as Google Cloud Storage, Azure Blob and Azure Files, Oracle OCI, Wasabi, Cloudflare R2, Backblaze, DigitalOcean Spaces, and more) without deploying a DataSync agent. Enhanced mode provides higher throughput, scalability, and simplified setup.
• Agentless Cross‑Region Transfers (Jul 2024, still relevant)
Agentless transfers between any AWS regions, including opt‑in regions, are now possible, so you can replicate or move data across AWS Storage services without deploying or managing agents.
• DataSync Availability in AWS Secret Region (May 8, 2025)
DataSync is now supported in the AWS Secret Region, allowing secure transfers between on‑prem systems and AWS Storage services in the secret cloud.
• End of DataSync Discovery (May 20, 2025)
The Discovery feature (agent-based storage scanning) was officially deprecated and is no longer available as of May 20, 2025.
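To make the Kerberos update more concrete, here is a minimal sketch of how an SMB location with Kerberos authentication might be defined through boto3. The parameter names (`AuthenticationType`, `KerberosPrincipal`, `KerberosKeytab`, `KerberosKrb5Conf`, `DnsIpAddresses`) reflect my understanding of the `create_location_smb` call and should be checked against the current API reference. The helper names, hostnames, ARNs, and principal in the usage note are hypothetical.

```python
from typing import Any, Dict, List


def build_smb_kerberos_location_params(
    server_hostname: str,
    subdirectory: str,
    agent_arns: List[str],
    kerberos_principal: str,
    keytab_bytes: bytes,
    krb5_conf_bytes: bytes,
    dns_ip_addresses: List[str],
) -> Dict[str, Any]:
    """Assemble keyword arguments for a Kerberos-authenticated SMB location.

    Assumption: with Kerberos, the NTLM-style User/Password/Domain fields
    are left out and the keytab plus krb5.conf contents are passed as bytes.
    """
    return {
        "ServerHostname": server_hostname,
        "Subdirectory": subdirectory,
        "AgentArns": agent_arns,  # SMB locations are reached through an agent
        "AuthenticationType": "KERBEROS",
        "KerberosPrincipal": kerberos_principal,
        "KerberosKeytab": keytab_bytes,
        "KerberosKrb5Conf": krb5_conf_bytes,
        "DnsIpAddresses": dns_ip_addresses,
    }


def create_smb_kerberos_location(params: Dict[str, Any]) -> str:
    """Create the location in AWS and return its ARN (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("datasync")
    return client.create_location_smb(**params)["LocationArn"]
```

A call such as `build_smb_kerberos_location_params("fs.example.com", "/share", [agent_arn], "HOST/svc-datasync@EXAMPLE.COM", keytab, krb5_conf, ["10.0.0.2"])` only builds the request; nothing is sent to AWS until `create_smb_kerberos_location` is called.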
Core Use Cases & Benefits
- Hybrid and Cloud Migration
Migrate files/objects from on-prem NFS, SMB, HDFS, or object storage, and sync to Amazon S3, EFS, FSx, or other AWS-managed storage with encryption and integrity validation.
- Multicloud Data Pipelines
Perform cross-cloud transfers between AWS and other cloud storage providers, with no agent required, using Enhanced mode’s performance and scalability.
- Cross-Region Replication
Copy data between AWS regions, including opt‑in regions, without an agent. This is useful for disaster recovery, backup, or data locality optimization.
- Secure File Access
Kerberos support for SMB ensures compatibility with enterprise AD environments, enhancing secure authentication flows.
- Adoption in High Compliance Environments
With availability in the Secret Region and support for secure network transfer controls, DataSync fits highly regulated workloads.
Performance Insights
- Users report that DataSync can utilize available network capacity effectively—achieving up to ~10 Gbps transfers when the infrastructure supports it.
- For lighter or container-managed workloads, users suggest that running the AWS CLI or rclone sync from within containers may offer more flexibility or lower cost for smaller use cases.
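As a rough planning aid for the throughput figure above, the sketch below estimates how long a transfer would take at a given link speed. It is plain arithmetic, not a DataSync API. The function name and the `efficiency` factor (the fraction of the link a transfer actually sustains) are assumptions introduced for illustration.

```python
def estimate_transfer_hours(
    size_tib: float, link_gbps: float, efficiency: float = 1.0
) -> float:
    """Estimate transfer time in hours for size_tib tebibytes over a link.

    link_gbps is the nominal link speed in gigabits per second (10**9 bits/s).
    efficiency is a hypothetical 0-1 factor for protocol overhead and contention.
    """
    if size_tib < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link > 0, and 0 < efficiency <= 1")
    total_bits = size_tib * (2 ** 40) * 8          # TiB -> bytes -> bits
    effective_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / effective_bits_per_second / 3600
```

Under these assumptions, 10 TiB over a fully used 10 Gbps link takes roughly 2.4 hours, and roughly 3.5 hours if only 70% of the link is sustained.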
Quick Start Guide
- Define Source and Destination Locations
  - Choose SMB, NFS, HDFS, object storage, edge storage, or another cloud as the source.
  - Choose an AWS storage target such as S3, EFS, or FSx as the destination.
- Select Authentication & Mode
  - Use Kerberos or NTLM for SMB.
  - For cross-cloud or cross-region transfers, use Enhanced Mode or agentless options.
- Configure a Task
  - Set up filters (include/exclude), bandwidth throttling, scheduling, and metadata preservation.
- Run, Monitor & Validate
  - Use CloudWatch metrics, task reports (JSON logs), and CloudTrail audit logs to monitor and verify transfer integrity.
- Post-Transfer Automation
  - Trigger Lambda, SNS, or AWS Glue/Athena workflows based on task outcomes or stored reports.
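The "Configure a Task" and "Run" steps above can be sketched with boto3 roughly as follows. This is a minimal illustration, not a complete setup. It assumes the source and destination locations already exist, and the helper names, task name, ARNs, and cron expression are hypothetical. The field names (`Options`, `Excludes`, `Schedule`, `TaskMode`) follow the DataSync `create_task` call as I understand it.

```python
from typing import Any, Dict, List, Optional


def build_task_params(
    source_location_arn: str,
    destination_location_arn: str,
    name: str,
    exclude_patterns: Optional[List[str]] = None,
    schedule_expression: Optional[str] = None,
    bytes_per_second: Optional[int] = None,
    enhanced: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a DataSync task."""
    options: Dict[str, Any] = {
        # Verify only the data that was transferred in this run.
        "VerifyMode": "ONLY_FILES_TRANSFERRED",
    }
    if bytes_per_second is not None:
        # Bandwidth throttle; omitted entirely when no limit is requested.
        options["BytesPerSecond"] = bytes_per_second

    params: Dict[str, Any] = {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        "Options": options,
        "TaskMode": "ENHANCED" if enhanced else "BASIC",
    }
    if exclude_patterns:
        # Exclude filters are passed as one pipe-delimited pattern string.
        params["Excludes"] = [
            {"FilterType": "SIMPLE_PATTERN", "Value": "|".join(exclude_patterns)}
        ]
    if schedule_expression:
        params["Schedule"] = {"ScheduleExpression": schedule_expression}
    return params


def create_and_start_task(params: Dict[str, Any]) -> str:
    """Create the task, start one execution, and return the execution ARN."""
    import boto3  # imported lazily; requires AWS credentials at call time

    client = boto3.client("datasync")
    task_arn = client.create_task(**params)["TaskArn"]
    return client.start_task_execution(TaskArn=task_arn)["TaskExecutionArn"]
```

Keeping the request builder separate from the AWS call makes it easy to review or unit test the task configuration before anything is created in an account.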
Final thoughts:
AWS DataSync is a secure,
high-speed, automated solution for moving data across on‑premises, AWS, and
multicloud environments.
Key 2025 enhancements include Kerberos
support for SMB, agentless cross-cloud transfers, expanded
cross-region replication, and support in the AWS Secret Region. The
deprecated Discovery feature is no longer available as of May 20, 2025.
Useful link for documentation: https://docs.aws.amazon.com/datasync/latest/userguide/what-is-datasync.html
DataSync can synchronize data to:
• All Amazon S3 storage classes, including Glacier.
• Amazon EFS.
• Amazon FSx (Windows, Lustre, NetApp, OpenZFS).
Also worth noting:
• File permissions and metadata (NFS POSIX, SMB) are preserved, which makes subsequent synchronization easier.
• A single agent task can use up to 10 Gbps, and a bandwidth limit can be configured.
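On the bandwidth limit: link speeds are usually quoted in bits per second, while the task option that throttles DataSync (`BytesPerSecond`, as I understand the API) is expressed in bytes per second. The small sketch below shows the conversion. The helper names are hypothetical.

```python
from typing import Dict


def mbps_to_bytes_per_second(mbps: float) -> int:
    """Convert megabits per second (10**6 bits/s) to whole bytes per second."""
    if mbps <= 0:
        raise ValueError("mbps must be positive")
    return int(mbps * 1_000_000 / 8)


def throttle_options(mbps: float) -> Dict[str, int]:
    """Build an Options fragment that caps a task at the given rate."""
    return {"BytesPerSecond": mbps_to_bytes_per_second(mbps)}
```

For example, capping a task at 400 Mbps corresponds to `{"BytesPerSecond": 50000000}`.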