AWS DataSync

  • AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between storage systems and services.
  • DataSync provides end-to-end security, including encryption and integrity validation.
  • DataSync automates both the management of data-transfer processes and the infrastructure required for high-performance and secure data transfer.
  • DataSync uses a purpose-built network protocol and a parallel, multi-threaded architecture to accelerate the transfers.
  • A DataSync agent is a VM or EC2 instance that AWS DataSync uses to read from or write to a storage system. Agents are commonly used when copying data from on-premises storage to AWS.
  • A DataSync transfer is described by a task; a task execution is an individual run of that task.
  • A task can be configured with its locations (source and destination), its schedule, and how it treats metadata, deleted files, and permissions.
  • Task scheduling automatically runs tasks on the configured schedule with hourly, daily, or weekly options.
  • Each time a task is started it performs an incremental copy, transferring only the changes from the source to the destination.
  • If a task is interrupted, for instance, if the network connection goes down or the agent is restarted, the next run of the task will transfer missing files, and the data will be complete and consistent at the end of this run.
  • AWS DataSync can be used over a Direct Connect link to access public service endpoints or private VPC endpoints.
  • The amount of network bandwidth that AWS DataSync will use can be controlled by configuring the built-in bandwidth throttle.
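
The task settings described above (locations, schedule, handling of deleted files and permissions, and the bandwidth throttle) map onto the request parameters of the DataSync CreateTask API. The following is a minimal sketch, assuming boto3 is installed and that the source and destination locations already exist; the function names, the task name, and the ARNs passed in are hypothetical placeholders, and the option values are one reasonable choice rather than a recommendation.

```python
from typing import Any, Dict


def build_task_request(
    source_location_arn: str,
    destination_location_arn: str,
    schedule_expression: str = "rate(1 hour)",
    bytes_per_second: int = -1,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the DataSync CreateTask call.

    bytes_per_second is the bandwidth throttle; -1 means no limit.
    """
    return {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": "hourly-onprem-to-s3",  # hypothetical task name
        # Run the task automatically on a schedule.
        "Schedule": {"ScheduleExpression": schedule_expression},
        "Options": {
            # Copy only data that differs between source and destination.
            "TransferMode": "CHANGED",
            # Verify the integrity of the files that were transferred.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
            # Remove destination files that were deleted at the source.
            "PreserveDeletedFiles": "REMOVE",
            # Keep POSIX permissions on the copied files.
            "PosixPermissions": "PRESERVE",
            # Built-in bandwidth throttle, in bytes per second.
            "BytesPerSecond": bytes_per_second,
        },
    }


def create_and_start_task(request: Dict[str, Any]) -> str:
    """Create the task and start one task execution (needs AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    client = boto3.client("datasync")
    task_arn = client.create_task(**request)["TaskArn"]
    # Each call to start_task_execution is one individual run of the task.
    return client.start_task_execution(TaskArn=task_arn)["TaskExecutionArn"]
```

For example, `build_task_request(src_arn, dst_arn, bytes_per_second=10 * 1024 * 1024)` would describe an hourly task limited to roughly 10 MiB/s. Separating the request builder from the API call keeps the configuration easy to inspect before anything is sent to AWS.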

DataSync Supported Locations

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A company is migrating its applications to AWS. Currently, applications that run on-premises generate hundreds of terabytes of data that is stored on a shared file system. The company is running an analytics application in the cloud that runs hourly to generate insights from this data. The company needs a solution to handle the ongoing data transfer between the on-premises shared file system and Amazon S3. The solution must also be able to handle occasional interruptions in internet connectivity. Which solution should the company use for the data transfer to meet these requirements?
    1. AWS DataSync
    2. AWS Migration Hub
    3. AWS Snowball Edge Storage Optimized
    4. AWS Transfer for SFTP
