Amazon DynamoDB Backup and Restore

  • DynamoDB Backup and Restore provides fully automated on-demand backup, restore, and point-in-time recovery for data protection and archiving.
  • On-demand backup creates full backups of a DynamoDB table for data archiving, helping you meet corporate and governmental regulatory requirements.
  • Point-in-time recovery (PITR) provides continuous backups of your DynamoDB table data with per-second granularity.
  • All backups are automatically encrypted, cataloged, and easily discoverable.
  • Backups can be created for tables from a few megabytes to hundreds of terabytes of data, with no impact on performance and availability of production applications.

On-demand Backups

  • DynamoDB on-demand backup creates full backups of tables for long-term retention and for archiving to meet regulatory compliance needs.
  • On-demand backups create a snapshot of the table that DynamoDB stores and manages.
  • Backup and restore actions run with no impact on table performance or availability.
  • Backups complete in seconds regardless of the size of the table.
  • Backups are preserved regardless of table deletion and retained until they are explicitly deleted.
  • On-demand backups are cataloged and discoverable.
  • On-demand backups are charged based on their size and how long they are kept.
  • A backup can restore the entire DynamoDB table to the exact state it was in when the backup was created.
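
The create and restore steps above can be sketched with boto3 as follows. This is a minimal sketch, not a definitive implementation: the helper names and the backup naming scheme are assumptions made for illustration, and the two functions that call AWS need boto3 plus valid credentials.

```python
from datetime import datetime, timezone


def backup_name(table_name: str, now: datetime) -> str:
    """Build a sortable backup name such as 'Orders-20250131T020000Z' (hypothetical scheme)."""
    return f"{table_name}-{now.astimezone(timezone.utc):%Y%m%dT%H%M%SZ}"


def create_on_demand_backup(table_name: str) -> str:
    """Create a full on-demand backup and return its ARN."""
    import boto3  # imported lazily so backup_name() works without the SDK

    client = boto3.client("dynamodb")
    response = client.create_backup(
        TableName=table_name,
        BackupName=backup_name(table_name, datetime.now(timezone.utc)),
    )
    return response["BackupDetails"]["BackupArn"]


def restore_backup(backup_arn: str, target_table_name: str) -> None:
    """Restore a backup into a new table; the target table must not exist yet."""
    import boto3

    boto3.client("dynamodb").restore_table_from_backup(
        TargetTableName=target_table_name,
        BackupArn=backup_arn,
    )
```

The restore always creates a new table, so the application has to be pointed at the new table name once the restore completes.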

Creating On-demand Backups

  • On-demand backups can be created using two methods:

DynamoDB Native Backup

  • Can be used to back up and restore DynamoDB tables.
  • Create backups via AWS Management Console, AWS CLI, or API.
  • Limitation: DynamoDB on-demand backups cannot be copied to a different account or Region.
  • Suitable for simple backup and restore within the same account and region.

AWS Backup (Recommended)

  • AWS Backup is a fully managed data protection service that makes it easy to centralize and automate backups across AWS services, in the cloud, and on-premises.
  • Provides enhanced backup features beyond native DynamoDB backups.
  • Key Advantages:
    • Centralized Management: Configure backup schedules & policies and monitor activity for AWS resources and on-premises workloads in one place.
    • Cross-Region Backup: Copy on-demand backups across AWS Regions.
    • Cross-Account Backup: Copy on-demand backups across AWS accounts (requires enabling advanced features).
    • Independent Encryption: Encryption using an AWS KMS key that is independent of the DynamoDB table encryption key.
    • Vault Lock (WORM): Apply write-once-read-many (WORM) setting for backups using AWS Backup Vault Lock policy for compliance.
    • Cost Allocation Tags: Add cost allocation tags to on-demand backups for better cost tracking.
    • Cold Storage Tier: Transition on-demand backups to cold storage for lower costs (requires opting in to advanced features).
    • Automated Backup Plans: Create scheduled backup plans with retention policies.
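
A scheduled backup plan with a cold storage lifecycle might look like the sketch below. The plan name, rule name, schedule, and helper names are assumptions for illustration. The dictionary follows the shape accepted by the AWS Backup CreateBackupPlan API, and the check mirrors the AWS Backup rule that a backup moved to cold storage must stay there for at least 90 days.

```python
def nightly_backup_plan(vault_name: str, cold_after_days: int, delete_after_days: int) -> dict:
    """Build a BackupPlan document with one nightly rule (names are hypothetical)."""
    # Retention must exceed the cold storage transition by at least 90 days.
    if delete_after_days < cold_after_days + 90:
        raise ValueError("delete_after_days must be at least cold_after_days + 90")
    return {
        "BackupPlanName": "dynamodb-nightly",
        "Rules": [
            {
                "RuleName": "nightly",
                "TargetBackupVaultName": vault_name,
                "ScheduleExpression": "cron(0 2 * * ? *)",  # 02:00 UTC every day
                "StartWindowMinutes": 60,
                "Lifecycle": {
                    "MoveToColdStorageAfterDays": cold_after_days,
                    "DeleteAfterDays": delete_after_days,
                },
            }
        ],
    }


def create_plan(plan: dict) -> str:
    """Create the plan in AWS Backup and return its ID (needs boto3 and credentials)."""
    import boto3

    return boto3.client("backup").create_backup_plan(BackupPlan=plan)["BackupPlanId"]
```

The plan alone does not back anything up. Tables still have to be assigned to it with a backup selection (by ARN or by tag), which is left out here.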

Cross-Region and Cross-Account Restore

  • DynamoDB table data can be restored across AWS Regions such that the restored table is created in a different Region from where the source table resides.
  • Cross-Region restores are supported between:
    • AWS commercial Regions
    • AWS China Regions
    • AWS GovCloud (US) Regions
  • Cross-Account Backup and Restore: Using AWS Backup, backups can be copied across AWS accounts for disaster recovery or data migration scenarios.
  • Pricing: Pay for data transfer out of the source Region and for restoring to a new table in the destination Region.

PITR – Point-In-Time Recovery

  • DynamoDB point-in-time recovery – PITR enables automatic, continuous, incremental backup of the table with per-second granularity.
  • PITR backups are fully managed by DynamoDB.
  • PITR helps protect against accidental writes and deletes.
  • PITR can back up tables with hundreds of terabytes of data with no impact on the performance or availability of the production applications.

Configurable Recovery Period (January 2025)

  • Since January 2025, DynamoDB supports a configurable recovery period for PITR.
  • Recovery period can be set to any value between 1 and 35 days on a per-table basis.
  • Default: Recovery period is 35 days if not explicitly configured.
  • Can restore to any given second from within the configured recovery period.
  • Use Cases:
    • Shorter retention (e.g., 7 days) for cost optimization when long-term recovery is not needed.
    • Compliance requirements that mandate specific retention periods.
    • Development/test environments where shorter recovery windows are acceptable.
  • Pricing Impact: Shortening the recovery period has no impact on PITR pricing because the price is based on the size of the table and its local secondary indexes, not the retention period.
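
Setting the recovery period from code could look like the following sketch. The helper names are assumptions. RecoveryPeriodInDays is the UpdateContinuousBackups parameter introduced with this feature, so an older boto3 release may reject it.

```python
def pitr_specification(recovery_period_days: int = 35) -> dict:
    """Build the PointInTimeRecoverySpecification for a 1 to 35 day recovery period."""
    if not 1 <= recovery_period_days <= 35:
        raise ValueError("recovery period must be between 1 and 35 days")
    return {
        "PointInTimeRecoveryEnabled": True,
        "RecoveryPeriodInDays": recovery_period_days,
    }


def enable_pitr(table_name: str, recovery_period_days: int = 35) -> None:
    """Enable PITR on one table (needs a recent boto3 and AWS credentials)."""
    import boto3  # imported lazily so pitr_specification() works without the SDK

    boto3.client("dynamodb").update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification=pitr_specification(recovery_period_days),
    )
```

For example, enable_pitr("Orders", 7) would keep a 7-day window for a hypothetical Orders table.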

PITR Restore Capabilities

  • Can restore to any point in time between EarliestRestorableDateTime and LatestRestorableDateTime.
  • LatestRestorableDateTime is typically five minutes before the current time.
  • A deleted table that had PITR enabled can be recovered within 35 days of deletion (or within the configured retention period) and restored to its state just before it was deleted.
  • Restored table is created as a new, independent table (not part of the original global table if applicable).
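
A restore request can be validated against the restorable window before it is sent, as in this sketch. The function names are assumptions. The describe and restore calls are the standard boto3 DynamoDB client operations and need AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def check_restore_time(requested: datetime, earliest: datetime, latest: datetime) -> datetime:
    """Truncate to whole seconds (PITR granularity) and verify the restorable window."""
    when = requested.replace(microsecond=0)
    if not earliest <= when <= latest:
        raise ValueError(f"{when} is outside the restorable window {earliest} .. {latest}")
    return when


def restore_to(source_table: str, target_table: str, requested: datetime) -> None:
    """Restore source_table as it was at `requested` into a new table."""
    import boto3  # imported lazily so check_restore_time() works without the SDK

    client = boto3.client("dynamodb")
    description = client.describe_continuous_backups(TableName=source_table)
    pitr = description["ContinuousBackupsDescription"]["PointInTimeRecoveryDescription"]
    when = check_restore_time(
        requested, pitr["EarliestRestorableDateTime"], pitr["LatestRestorableDateTime"]
    )
    client.restore_table_to_point_in_time(
        SourceTableName=source_table,
        TargetTableName=target_table,
        RestoreDateTime=when,
    )
```

Passing UseLatestRestorableTime=True instead of RestoreDateTime restores to the most recent restorable point.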

PITR with Global Tables

  • Can enable point-in-time recovery on each local replica of a global table.
  • When restoring a global table replica, the backup restores to an independent table that is not part of the global table.
  • If using Global Tables version 2019.11.21 (Current), a new global table can be created from the restored table.

PITR Considerations

  • If PITR is disabled and later re-enabled on a table, the start time for recovery is reset.
  • Immediately after re-enabling, the table can be restored only to the LatestRestorableDateTime.
  • AWS CloudTrail logs all console and API actions for PITR for auditing and compliance.
  • PITR can be enabled or disabled at any time without impacting table performance.

Backup and Restore Best Practices

  • Use AWS Backup for Production: Leverage AWS Backup for centralized management, cross-region/cross-account capabilities, and advanced features.
  • Enable PITR for Critical Tables: Always enable PITR for production tables to protect against accidental data loss.
  • Configure Appropriate Retention: Set PITR retention period based on recovery requirements and compliance needs.
  • Test Restore Procedures: Regularly test backup restoration to ensure recovery processes work as expected.
  • Use Vault Lock for Compliance: Apply AWS Backup Vault Lock for immutable backups when required by regulations.
  • Implement Cross-Region Backups: Copy critical backups to another region for disaster recovery.
  • Tag Backups: Use cost allocation tags to track backup costs by project, environment, or team.
  • Automate Backup Plans: Create scheduled backup plans with AWS Backup for consistent data protection.
  • Monitor Backup Status: Use CloudWatch and AWS Backup monitoring to track backup success and failures.
  • Consider Cold Storage: Transition long-term backups to cold storage tier for cost savings.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A sysops engineer must create nightly backups of an Amazon DynamoDB table. Which backup methodology should the engineer use to MINIMIZE management overhead?
    1. Install the AWS CLI on an Amazon EC2 instance. Write a CLI command that creates a backup of the DynamoDB table. Create a scheduled job or task that runs the command on a nightly basis.
    2. Create an AWS Lambda function that creates a backup of the DynamoDB table. Create an Amazon CloudWatch Events rule that runs the Lambda function on a nightly basis.
    3. Create a backup plan using AWS Backup, specify a backup frequency of every 24 hours, and give the plan a nightly backup window.
    4. Configure DynamoDB backup and restore for an on-demand backup frequency of every 24 hours.
  2. A company needs to copy DynamoDB table backups to a different AWS account for disaster recovery purposes. What is the BEST solution?
    1. Use DynamoDB native backup and manually export/import data to the other account.
    2. Use AWS Backup to create backups and copy them across accounts after enabling advanced features and cross-account backup.
    3. Enable PITR and restore the table in the other account.
    4. Use AWS Data Pipeline to copy data between accounts.
  3. A company wants to protect a DynamoDB table against accidental deletions with the ability to recover data from any point in the last 7 days. What should a solutions architect recommend?
    1. Create daily on-demand backups and retain them for 7 days.
    2. Enable PITR with a recovery period configured to 7 days.
    3. Use AWS Backup with a 7-day retention policy.
    4. Enable DynamoDB Streams and store data in S3 for 7 days.
  4. A company needs to restore a DynamoDB table to a different AWS Region. The table is currently in us-east-1 and needs to be restored to eu-west-1. What is the correct approach?
    1. Enable PITR and restore directly to eu-west-1.
    2. Use DynamoDB native backup and restore to eu-west-1.
    3. Create a backup and perform a cross-Region restore to eu-west-1.
    4. Create a Global Table with a replica in eu-west-1.
  5. A company has enabled PITR on a DynamoDB table with a 35-day retention period. They want to reduce costs by shortening the retention to 14 days. What will be the impact on PITR pricing?
    1. PITR costs will be reduced by approximately 60%.
    2. PITR costs will be reduced proportionally to the retention period.
    3. There will be no impact on PITR pricing as it is based on table size, not retention period.
    4. PITR costs will increase due to more frequent backup cycles.
  6. Which of the following are advantages of using AWS Backup over DynamoDB native backups? (Select THREE)
    1. Cross-account backup and restore capabilities
    2. Faster backup creation time
    3. Ability to transition backups to cold storage tier
    4. Lower backup storage costs
    5. Centralized backup management across multiple AWS services
    6. Automatic PITR enablement
  7. A DynamoDB table with PITR enabled was accidentally deleted. How long does the company have to recover the table?
    1. 7 days from deletion
    2. 24 hours from deletion
    3. Up to 35 days (or the configured retention period) from deletion
    4. PITR cannot recover deleted tables

AWS DynamoDB Advanced Features

  • DynamoDB Secondary indexes on a table allow efficient access to data with attributes other than the primary key.
  • DynamoDB Time to Live – TTL enables a per-item timestamp to determine when an item is no longer needed.
  • DynamoDB cross-region replication allows identical copies (called replicas) of a DynamoDB table (called master table) to be maintained in one or more AWS regions.
  • DynamoDB Global Tables is a multi-master, cross-region replication capability of DynamoDB to support data access locality and regional fault tolerance for database workloads.
  • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table.
  • DynamoDB Triggers (just like database triggers) are a feature that allows the execution of custom actions based on item-level updates on a table.
  • DynamoDB Accelerator – DAX is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from ms to µs – even at millions of requests per second.
  • VPC Gateway Endpoints provide private access to DynamoDB from within a VPC without the need for an internet gateway or NAT gateway.

DynamoDB Secondary Indexes

  • DynamoDB Secondary indexes on a table allow efficient access to data with attributes other than the primary key.
  • Global secondary index – an index with a partition key and a sort key that can be different from those on the base table.
  • Local secondary index – an index that has the same partition key as the base table, but a different sort key.

DynamoDB TTL

  • DynamoDB Time to Live (TTL) enables a per-item timestamp to determine when an item is no longer needed.
  • After the date and time of the specified timestamp, DynamoDB deletes the item from the table without consuming any write throughput.
  • DynamoDB TTL is provided at no extra cost and can help reduce data storage by retaining only required data.
  • Items that are deleted from the table are also removed from any local secondary index and global secondary index in the same way as a DeleteItem operation.
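  • Expired items are typically removed from the table and indexes within a few days of expiration (older documentation cited about 48 hours), so reads can still return them in the meantime.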
  • DynamoDB Stream tracks the delete operation as a system delete and not a regular delete.
  • TTL is useful if the stored items lose relevance after a specific time, for example:
    • Remove user or sensor data after a year of inactivity in an application
    • Archive expired items to an S3 data lake via DynamoDB Streams and AWS Lambda.
    • Retain sensitive data for a certain amount of time according to contractual or regulatory obligations.
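
The TTL attribute holds an expiry time in epoch seconds. The sketch below shows how such a value could be computed and how a reader could skip items that have expired but have not been deleted yet. The attribute name expires_at and the helper names are assumptions.

```python
import time
from typing import Optional


def expires_at(ttl_seconds: int, now: Optional[float] = None) -> int:
    """Return the epoch-seconds value to store in the TTL attribute."""
    base = time.time() if now is None else now
    return int(base) + ttl_seconds


def is_live(item: dict, now: float, ttl_attribute: str = "expires_at") -> bool:
    """Treat items past their expiry as gone even if DynamoDB has not deleted them yet."""
    expiry = item.get(ttl_attribute)
    return expiry is None or expiry > now


def enable_ttl(table_name: str, ttl_attribute: str = "expires_at") -> None:
    """Turn on TTL for the table (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without the SDK

    boto3.client("dynamodb").update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification={"Enabled": True, "AttributeName": ttl_attribute},
    )
```

Items without the attribute never expire, which is why is_live returns True when it is missing.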

DynamoDB Cross-region Replication

  • DynamoDB cross-region replication allows identical copies (called replicas) of a DynamoDB table (called master table) to be maintained in one or more AWS regions.
  • Writes to the table will be automatically propagated to all replicas.
  • Cross-region replication currently supports a single master mode. A single master has one master table and one or more replica tables.
  • Read replicas are updated asynchronously as DynamoDB acknowledges a write operation as successful once it has been accepted by the master table. The write will then be propagated to each replica with a slight delay.
  • Cross-region replication can be helpful in scenarios such as:
    • Efficient disaster recovery, in case a data center failure occurs.
    • Faster reads for customers in multiple regions, by reading a DynamoDB table from the closest AWS data center.
    • Easier traffic management, to distribute the read workload across tables and thereby consume less read capacity in the master table.
    • Easy regional migration, by promoting a read replica to master.
    • Live data migration, by replicating data and, once the tables are in sync, switching the application to write to the destination region.
  • Cross-region replication costing depends on
    • Provisioned throughput (Writes and Reads)
    • Storage for the replica tables.
    • Data Transfer across regions
    • Reading data from DynamoDB Streams to keep the tables in sync.
    • Cost of EC2 instances provisioned, depending upon the instance types and region, to host the replication process.
  • NOTE: Before DynamoDB Streams and out-of-the-box cross-region replication support were available, cross-region replication was performed by defining an AWS Data Pipeline job, which used EMR internally to transfer the data.

DynamoDB Global Tables

  • DynamoDB Global Tables is a multi-master, active-active, cross-region replication capability of DynamoDB to support data access locality and regional fault tolerance for database workloads.
  • Applications can now perform reads and writes to DynamoDB in AWS regions around the world, with changes in any region propagated to every region where a table is replicated.
  • Global Tables help in building applications that take advantage of data locality to reduce overall latency.
  • Global Tables supports eventual consistency & strong consistency for same region reads, but only eventual consistency for cross-region reads.
  • Global Tables replicates data among regions within a single AWS account and currently does not support cross-account access.
  • Global Tables uses the Last Write Wins approach for conflict resolution.
  • Global Tables requires DynamoDB Streams to be enabled with the NEW_AND_OLD_IMAGES stream view type.
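
Last Write Wins can be illustrated with a toy model like the one below. It is only a sketch of the idea: the ReplicaWrite type and the resolve function are hypothetical, and the actual service resolves conflicts internally using its own timestamps.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReplicaWrite:
    region: str       # region that accepted the write
    timestamp: float  # time of the write, in epoch seconds
    value: str        # new value for the item


def resolve(writes: Iterable[ReplicaWrite]) -> ReplicaWrite:
    """Pick the winner among concurrent writes to one item: the latest timestamp."""
    return max(writes, key=lambda write: write.timestamp)


concurrent = [
    ReplicaWrite("us-east-1", 1000.0, "blue"),
    ReplicaWrite("eu-west-1", 1000.4, "green"),
]
winner = resolve(concurrent)  # the eu-west-1 write is later, so every replica converges on it
```

The earlier write is silently discarded, so applications that cannot tolerate lost updates need to avoid writing the same item from several regions at once.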

DynamoDB Streams

  • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table.
  • DynamoDB Streams stores the data for the last 24 hours, after which they are erased.
  • DynamoDB Streams maintains an ordered sequence of events per item; however, the sequence across items is not maintained.
  • Example
    • Suppose that you have a DynamoDB table tracking high scores for a game and that each item in the table represents an individual player. If you make the following three updates in this order:
      • Update 1: Change Player 1’s high score to 100 points
      • Update 2: Change Player 2’s high score to 50 points
      • Update 3: Change Player 1’s high score to 125 points
    • DynamoDB Streams will maintain the order of the Player 1 score events. However, it does not maintain order across players, so the Player 2 score event is not guaranteed to appear between the two Player 1 events.
  • DynamoDB Streams APIs help developers consume updates and receive the item-level data before and after items are changed.
  • DynamoDB Streams allow reads at up to twice the rate of the provisioned write capacity of the DynamoDB table.
  • DynamoDB Streams have to be enabled on a per-table basis.
  • DynamoDB streams support Encryption at rest to encrypt the data.
  • DynamoDB Streams is designed for No Duplicates so that every update made to the table will be represented exactly once in the stream.
  • DynamoDB Streams writes stream records in near-real time so that applications can consume these streams and take action based on the contents.
  • DynamoDB Streams can be used for multi-region replication, to keep other data stores up-to-date with the latest changes to DynamoDB, or to take actions based on the changes made to the table.
  • DynamoDB stream records can be processed using Kinesis Data Streams, Lambda, or a KCL application.
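
The per-item ordering guarantee from the high score example can be expressed as a small check. This is a toy model, not the Streams API: events are plain (item key, new value) pairs and the function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (item key, new high score)


def per_item_sequences(events: List[Event]) -> Dict[str, List[int]]:
    """Group events by item key, keeping each item's changes in arrival order."""
    sequences: Dict[str, List[int]] = defaultdict(list)
    for key, change in events:
        sequences[key].append(change)
    return dict(sequences)


def preserves_per_item_order(written: List[Event], observed: List[Event]) -> bool:
    """True if every item's changes arrive in the order they were written."""
    return per_item_sequences(written) == per_item_sequences(observed)


written = [("Player 1", 100), ("Player 2", 50), ("Player 1", 125)]
# Allowed: Player 2's event moved, but Player 1's events keep their order.
allowed = [("Player 2", 50), ("Player 1", 100), ("Player 1", 125)]
# Not allowed: Player 1's two events are swapped.
not_allowed = [("Player 1", 125), ("Player 1", 100), ("Player 2", 50)]
```

A consumer that needs ordering across players would have to impose it itself, for example by sorting on an application-level timestamp.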

DynamoDB Triggers

  • DynamoDB Triggers (just like database triggers) are a feature that allows the execution of custom actions based on item-level updates on a table.
  • DynamoDB triggers can be used in scenarios like sending notifications, updating an aggregate table, and connecting DynamoDB tables to other data sources.
  • DynamoDB Trigger flow
    • Custom logic for a DynamoDB trigger is stored in an AWS Lambda function as code.
    • A trigger for a given table can be created by associating an AWS Lambda function to the stream (via DynamoDB Streams) on a table.
    • When the table is updated, the updates are published to DynamoDB Streams.
    • In turn, AWS Lambda reads the updates from the associated stream and executes the code in the function.
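
The trigger flow above ends in a Lambda function that receives batches of stream records. A minimal handler sketch follows. The counting logic is an assumption chosen for illustration, while the Records, eventName, and userIdentity fields follow the DynamoDB Streams event format. TTL deletions are recognised by the service identity attached to the REMOVE record.

```python
def is_ttl_delete(record: dict) -> bool:
    """TTL deletions are REMOVE events performed by the DynamoDB service itself."""
    identity = record.get("userIdentity", {})
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def handler(event: dict, context: object) -> dict:
    """Count the item-level changes in one batch of stream records."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0, "TTL_DELETE": 0}
    for record in event.get("Records", []):
        if is_ttl_delete(record):
            summary["TTL_DELETE"] += 1
        else:
            summary[record["eventName"]] += 1
    return summary


sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"id": {"S": "1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"id": {"S": "1"}}}},
        {
            "eventName": "REMOVE",
            "userIdentity": {"type": "Service", "principalId": "dynamodb.amazonaws.com"},
            "dynamodb": {"Keys": {"id": {"S": "2"}}},
        },
    ]
}
```

A real trigger would replace the counting with the custom action, such as sending a notification or updating an aggregate table.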

DynamoDB Backup and Restore

  • DynamoDB on-demand backup helps create full backups of the tables for long-term retention, and archiving for regulatory compliance needs.
  • Backup and restore actions run with no impact on table performance or availability.
  • Backups are preserved regardless of table deletion and retained until they are explicitly deleted.
  • On-demand backups are cataloged, and discoverable.
  • On-demand backups can be created using
    • DynamoDB
      • DynamoDB on-demand backups cannot be copied to a different account or Region.
    • AWS Backup (Recommended)
      • is a fully managed data protection service that makes it easy to centralize and automate backups across AWS services, in the cloud, and on-premises.
      • provides enhanced backup features.
      • can configure backup schedules and policies, and monitor activity for AWS resources and on-premises workloads in one place.
      • can copy on-demand backups across AWS accounts and Regions.
      • supports encryption using an AWS KMS key that is independent of the DynamoDB table encryption key.
      • can apply a write-once-read-many (WORM) setting to backups using the AWS Backup Vault Lock policy.
      • can add cost allocation tags to on-demand backups.
      • can transition on-demand backups to cold storage for lower costs.

DynamoDB PITR – Point-In-Time Recovery

  • DynamoDB point-in-time recovery – PITR enables automatic, continuous, incremental backup of the table with per-second granularity.
  • A deleted table that had PITR enabled can be recovered within 35 days of deletion and restored to its state just before it was deleted.
  • PITR helps protect against accidental writes and deletes.
  • PITR can back up tables with hundreds of terabytes of data with no impact on the performance or availability of the production applications.

DynamoDB Accelerator – DAX

  • DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second.
  • DAX is intended for high-performance, read-heavy applications. As a write-through cache, DAX writes to DynamoDB first and then updates the item cache, so writes made through DAX are immediately reflected in the cache.
  • DAX as a managed service handles cache invalidation, data population, and cluster management.
  • DAX is API-compatible with DynamoDB. Therefore, it requires only minimal functional changes to use with an existing application.
  • DAX saves costs by reducing the read load (RCU) on DynamoDB.
  • DAX helps prevent hot partitions.
  • DAX only supports eventually consistent reads from the cache; strongly consistent read requests are passed through to DynamoDB.
  • DAX is fault-tolerant and scalable.
  • A DAX cluster has a primary node and zero or more read-replica nodes. If the primary node fails, DAX automatically fails over and elects a new primary. To scale reads, add or remove read replicas.
  • DAX supports server-side encryption.
  • DAX also supports encryption in transit, ensuring that all requests and responses between the application and the cluster are encrypted by TLS, and connections to the cluster can be authenticated by verifying the cluster's X.509 certificate.
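
The write-through and pass-through behaviour described above can be illustrated with a toy in-memory model. This is not the DAX client: the class is hypothetical, a plain dict stands in for the DynamoDB table, and cache TTLs and eviction are left out.

```python
class WriteThroughCache:
    """Toy model of an item cache sitting in front of a table."""

    def __init__(self, table: dict):
        self.table = table    # stands in for the DynamoDB table
        self.items = {}       # stands in for the DAX item cache
        self.table_reads = 0  # how many reads actually reached the table

    def put(self, key, value):
        self.table[key] = value  # the write goes to the table first
        self.items[key] = value  # then the cached copy is updated

    def get(self, key, consistent=False):
        if consistent:  # strongly consistent reads bypass the cache
            self.table_reads += 1
            return self.table.get(key)
        if key not in self.items:  # cache miss: load from the table once
            self.table_reads += 1
            self.items[key] = self.table.get(key)
        return self.items[key]
```

Repeated eventually consistent reads of the same key hit the table only once in this model, which is where the read capacity savings come from.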

VPC Endpoints

  • VPC endpoints for DynamoDB improve privacy and security, especially those dealing with sensitive workloads with compliance and audit requirements, by enabling private access to DynamoDB from within a VPC without the need for an internet gateway or NAT gateway.
  • VPC endpoints for DynamoDB support IAM policies to simplify DynamoDB access control, where access can be restricted to a specific VPC endpoint.
  • VPC endpoints can be created only for Amazon DynamoDB tables in the same AWS Region as the VPC.
  • DynamoDB Streams cannot be accessed using VPC endpoints for DynamoDB.
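
Restricting access to a specific VPC endpoint is usually done with the aws:sourceVpce condition key in an IAM policy. The sketch below builds such a policy document. The function name, statement ID, table ARN, and endpoint ID are placeholders for illustration.

```python
import json


def vpce_only_policy(table_arn: str, vpc_endpoint_id: str) -> dict:
    """Deny every DynamoDB action on the table unless it arrives through the endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpcEndpoint",
                "Effect": "Deny",
                "Action": "dynamodb:*",
                "Resource": table_arn,
                "Condition": {"StringNotEquals": {"aws:sourceVpce": vpc_endpoint_id}},
            }
        ],
    }


policy = vpce_only_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/Orders",  # placeholder ARN
    "vpce-0123456789abcdef0",                                # placeholder endpoint ID
)
policy_json = json.dumps(policy, indent=2)  # JSON text to attach to an IAM user, group, or role
```

Because this is an explicit deny, it also blocks requests from the AWS Management Console and from anywhere else outside the VPC, so it should be attached only to identities that are meant to work exclusively through the endpoint.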

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. What are the services supported by VPC endpoints, using Gateway endpoint type? Choose 2 answers
    1. Amazon S3
    2. Amazon EFS
    3. Amazon DynamoDB
    4. Amazon Glacier
    5. Amazon SQS
  2. A company has set up an application in AWS that interacts with DynamoDB. DynamoDB is currently responding in milliseconds, but the application response guidelines require it to respond within microseconds. How can the performance of DynamoDB be further improved? [SAA-C01]
    1. Use ElastiCache in front of DynamoDB
    2. Use DynamoDB inbuilt caching
    3. Use DynamoDB Accelerator
    4. Use RDS with ElastiCache instead
