AWS RDS Multi-AZ – Failover & High Availability

RDS Multi-AZ Deployment

  • RDS Multi-AZ deployments provide high availability and automatic failover support for DB instances.
  • Multi-AZ improves the durability and availability of critical systems during planned system maintenance, DB instance failure, and Availability Zone disruption.
  • A Multi-AZ DB instance deployment
    • has one standby DB instance that provides failover support but doesn’t serve read traffic.
    • In the RDS console Databases list, there is only one row for the DB instance.
    • The value of Role is Instance or Primary.
    • The value of Multi-AZ is Yes.
  • A Multi-AZ DB cluster deployment
    • has two standby DB instances that provide failover support and can also serve read traffic.
    • In the RDS console Databases list, there is a cluster-level row with three DB instance rows under it.
    • For the cluster-level row, the value of Role is Multi-AZ DB cluster.
    • For each instance-level row, the value of Role is Writer instance or Reader instance.
    • For each instance-level row, the value of Multi-AZ is 3 Zones.

RDS Multi-AZ DB Instance Deployment

  • RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different AZ.
  • RDS performs an automatic failover to the standby, so that database operations can be resumed as soon as the failover is complete.
  • RDS Multi-AZ deployment maintains the same endpoint for the DB Instance after a failover, so the application can resume database operation without the need for manual administrative intervention.
  • Multi-AZ is a high-availability feature and NOT a scaling solution for read-only scenarios; a standby replica can’t be used to serve read traffic. To service read-only traffic, use a Read Replica.
  • Multi-AZ deployments for Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon's failover technology, while SQL Server DB instances use SQL Server Database Mirroring (DBM) or Always On Availability Groups (AGs).
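
Because the endpoint stays the same after a failover, the application mostly needs to drop the broken connection and open a new one against the same endpoint. The following is a minimal sketch of that reconnect loop, not an AWS API: `connect` is a hypothetical zero-argument callable supplied by the application, and the attempt count and backoff values are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts run out.

    connect is a hypothetical callable that opens a fresh connection to
    the unchanged DB endpoint and raises ConnectionError (or a subclass)
    while the failover is still in progress.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: 1s, 2s, 4s, ... (assumed values)
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Real database drivers usually raise their own exception types, so the `except` clause would need to be adapted. Opening a new connection on each attempt matters because the endpoint name has to be resolved again once DNS points at the new primary.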

RDS Multi-AZ DB Cluster Deployment

  • RDS Multi-AZ DB cluster deployment is a high-availability deployment mode of RDS with two readable standby DB instances.
  • RDS Multi-AZ DB cluster has a writer DB instance and two reader DB instances in three separate AZs in the same AWS Region.
  • With a Multi-AZ DB cluster, RDS semi-synchronously replicates data from the writer DB instance to both of the reader DB instances using the DB engine’s native replication capabilities.
  • Multi-AZ DB clusters provide high availability, increased capacity for read workloads, and lower write latency when compared to Multi-AZ DB instance deployments.
  • In the event of an outage, RDS manages failover from the writer DB instance to one of the reader DB instances. RDS does this based on which reader DB instance has the most recent change record.

Multi-AZ DB Instance vs Multi-AZ DB Cluster

RDS Multi-AZ DB Instance vs DB Cluster

RDS Multi-AZ vs Read Replicas

RDS Multi-AZ vs Multi-Region vs Read Replicas

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A company is deploying a new two-tier web application in AWS. The company has limited staff and requires high availability, and the application requires complex queries and table joins. Which configuration provides the solution for the company’s requirements?
    1. MySQL Installed on two Amazon EC2 Instances in a single Availability Zone (does not provide High Availability out of the box)
    2. Amazon RDS for MySQL with Multi-AZ
    3. Amazon ElastiCache (Just a caching solution)
    4. Amazon DynamoDB (Not suitable for complex queries and joins)
  2. What would happen to an RDS (Relational Database Service) multi-Availability Zone deployment if the primary DB instance fails?
    1. IP of the primary DB Instance is switched to the standby DB Instance.
    2. A new DB instance is created in the standby availability zone.
    3. The canonical name record (CNAME) is changed from primary to standby.
    4. The RDS (Relational Database Service) DB instance reboots.
  3. Will my standby RDS instance be in the same Availability Zone as my primary?
    1. Only for Oracle RDS types
    2. Yes
    3. Only if configured at launch
    4. No
  4. Is creating a Read Replica of another Read Replica supported?
    1. Only in certain regions
    2. Only with MySQL based RDS
    3. Only for Oracle RDS types
    4. No
  5. A user is planning to set up the Multi-AZ feature of RDS. Which of the below mentioned conditions won’t take advantage of the Multi-AZ feature?
    1. Availability zone outage
    2. A manual failover of the DB instance using Reboot with failover option
    3. Region outage
    4. When the user changes the DB instance’s server type
  6. When you run a DB Instance as a Multi-AZ deployment, the “_____” serves database writes and reads
    1. secondary
    2. backup
    3. stand by
    4. primary
  7. When running my DB Instance as a Multi-AZ deployment, can I use the standby for read or write operations?
    1. Yes
    2. Only with MSSQL based RDS
    3. Only for Oracle RDS instances
    4. No
  8. Read Replicas require a transactional storage engine and are only supported for the _________ storage engine
    1. OracleISAM
    2. MSSQLDB
    3. InnoDB
    4. MyISAM
  9. A user is configuring the Multi-AZ feature of an RDS DB. The user came to know that this RDS DB does not use the AWS technology, but uses server mirroring to achieve replication. Which DB is the user using right now?
    1. MySQL
    2. Oracle
    3. MS SQL
    4. PostgreSQL
  10. If you have chosen Multi-AZ deployment, in the event of a planned or unplanned outage of your primary DB Instance, Amazon RDS automatically switches to the standby replica. The automatic failover mechanism simply changes the ______ record of the main DB Instance to point to the standby DB Instance.
    1. DNAME
    2. CNAME
    3. TXT
    4. MX
  11. When automatic failover occurs, Amazon RDS will emit a DB Instance event to inform you that automatic failover occurred. You can use the _____ to return information about events related to your DB Instance
    1. FetchFailure
    2. DescribeFailure
    3. DescribeEvents
    4. FetchEvents
  12. The new DB Instance that is created when you promote a Read Replica retains the backup window period.
    1. TRUE
    2. FALSE
  13. Will I be alerted when automatic failover occurs?
    1. Only if SNS configured
    2. No
    3. Yes
    4. Only if Cloudwatch configured
  14. Can I initiate a “forced failover” for my MySQL Multi-AZ DB Instance deployment?
    1. Only in certain regions
    2. Only in VPC
    3. Yes
    4. No
  15. A user is accessing RDS from an application. The user has enabled the Multi-AZ feature with the MS SQL RDS DB. During a planned outage how will AWS ensure that a switch from DB to a standby replica will not affect access to the application?
    1. RDS will have an internal IP which will redirect all requests to the new DB
    2. RDS uses DNS to switch over to standby replica for seamless transition
    3. The switch over changes Hardware so RDS does not need to worry about access
    4. RDS will have both the DBs running independently and the user has to manually switch over
  16. Which of the following is part of the failover process for a Multi-AZ Amazon Relational Database Service (RDS) instance?
    1. The failed RDS DB instance reboots.
    2. The IP of the primary DB instance is switched to the standby DB instance.
    3. The DNS record for the RDS endpoint is changed from primary to standby.
    4. A new DB instance is created in the standby availability zone.
  17. Which of these is not a reason a Multi-AZ RDS instance will failover?
    1. An Availability Zone outage
    2. A manual failover of the DB instance was initiated using Reboot with failover
    3. To autoscale to a higher instance class
    4. Master database corruption occurs
    5. The primary DB instance fails
  18. How does Amazon RDS multi Availability Zone model work?
    1. A second, standby database is deployed and maintained in a different availability zone from master, using synchronous replication.
    2. A second, standby database is deployed and maintained in a different availability zone from master using asynchronous replication.
    3. A second, standby database is deployed and maintained in a different region from master using asynchronous replication.
    4. A second, standby database is deployed and maintained in a different region from master using synchronous replication.
  19. A user is using a small MySQL RDS DB. The user is experiencing high latency due to the Multi AZ feature. Which of the below mentioned options may not help the user in this situation?
    1. Schedule the automated back up in non-working hours
    2. Use a large or higher size instance
    3. Use PIOPS
    4. Take a snapshot from standby Replica
  20. What is the charge for the data transfer incurred in replicating data between your primary and standby?
    1. No charge. It is free.
    2. Double the standard data transfer charge
    3. Same as the standard data transfer charge
    4. Half of the standard data transfer charge
  21. A user has enabled the Multi AZ feature with the MS SQL RDS database server. Which of the below mentioned statements will help the user understand the Multi AZ feature better?
    1. In a Multi AZ, AWS runs two DBs in parallel and copies the data asynchronously to the replica copy
    2. In a Multi AZ, AWS runs two DBs in parallel and copies the data synchronously to the replica copy
    3. In a Multi AZ, AWS runs just one DB but copies the data synchronously to the standby replica
    4. AWS MS SQL does not support the Multi AZ feature
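
Question 11 refers to the DescribeEvents API. As a rough sketch of how failover events might be pulled for one DB instance, the helper below calls `describe_events` on a client object that is expected to behave like `boto3.client("rds")`. The helper name, the 60-minute default window, and the instance identifier in the usage comment are assumptions for illustration, and pagination is not handled.

```python
from typing import Any, Dict, List


def recent_failover_messages(
    rds_client: Any, db_instance_id: str, minutes: int = 60
) -> List[str]:
    """Return the messages of recent failover events for one DB instance.

    rds_client is expected to behave like boto3.client("rds").
    """
    response: Dict[str, Any] = rds_client.describe_events(
        SourceIdentifier=db_instance_id,
        SourceType="db-instance",
        EventCategories=["failover"],
        Duration=minutes,  # look-back window in minutes
    )
    return [event["Message"] for event in response.get("Events", [])]


# Usage (requires boto3 and AWS credentials; "mydb" is a hypothetical name):
#   import boto3
#   print(recent_failover_messages(boto3.client("rds"), "mydb"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub in tests, without touching a real AWS account.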

Choosing the Right Data Science Specialization: Where to Focus Your Skills

In the rapidly evolving world of technology, data science stands out as a field of endless opportunities and diverse pathways. With its foundations deeply rooted in statistics, computer science, and domain-specific knowledge, data science has become indispensable for organizations seeking to make data-driven decisions. However, the vastness of this field can be overwhelming, making specialization a strategic necessity for aspiring data scientists.

This article aims to navigate through the labyrinth of data science specializations, helping you align your career with your interests, skills, and the evolving demands of the job market.

Understanding the Breadth of Data Science

Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to draw knowledge and discover insights from structured and unstructured data. It spans a wide range of activities, from data collection and cleaning to complex algorithmic computations and predictive modeling.

Key Areas Within Data Science

  • Machine Learning: This involves creating algorithms that can learn from existing data and make predictions or decisions based on it.
  • Deep Learning: A specialized subdomain of machine learning, focusing on neural networks and algorithms inspired by the structure and function of the brain.
  • Data Engineering: This is the backbone of data science, focusing on the practical aspects of data collection, storage, and retrieval.
  • Data Visualization: It involves converting complex data sets into understandable and interactive graphical representations.
  • Big Data Analytics: This deals with extracting meaningful insights from very large, diverse data sets that are often beyond the capability of traditional data-processing applications.
  • AI and Robotics: This cutting-edge field combines data science with robotics, focusing on creating machines that can perform actions/operations that typically require human intelligence.

Interconnectivity of These Areas

While these specializations are distinct, they are interconnected. For instance, data engineering is foundational for machine learning, and AI applications often rely on insights derived from big data analytics.

Factors to Consider When Choosing a Specialization

  • Personal Interests and Strengths
    • Your choice should resonate with your personal interests. If you are fascinated by how algorithms can mimic human learning, deep learning could be your calling. Alternatively, if you enjoy the challenges of handling and organizing large data sets, data engineering might suit you.
  • Industry Demand and Job Market Trends
    • It’s crucial to align your specialization with the market demand. Fields like AI and machine learning are rapidly growing and offer numerous job opportunities. Tracking industry trends can provide valuable insights into which specializations are most in demand.
  • Long-term Career Goals
    • Consider where you want to be in your career in the next five to ten years. Some specializations may offer more opportunities for growth, leadership roles, or transitions into different areas of data science.
  • Impact of Emerging Technologies
    • Emerging technologies can redefine the landscape of data science. Keeping up with these changes can help you choose a specialization that remains relevant in the future.

Deep Dive into Popular Data Science Specializations

  • Machine Learning
    • Overview and Applications: From predictive modeling in finance to recommendation systems in e-commerce, machine learning is revolutionizing various industries.
    • Required Skills and Tools: Proficiency in programming languages like Python or R, an understanding of algorithms, and familiarity with machine learning frameworks such as TensorFlow or scikit-learn are essential.
  • Data Engineering
    • Role in Data Science: Data engineers build and maintain the infrastructure that allows data scientists to analyze and utilize data effectively.
    • Key Skills and Technologies: Skills in database management, ETL (Extract, Transform, Load) processes, and knowledge of SQL, NoSQL, Hadoop, and Spark are crucial.
  • Big Data Analytics
    • Understanding Big Data: This specialization deals with analyzing extremely large data sets to discover patterns, trends, and associations, particularly relating to human behavior and interactions.
    • Tools and Techniques: Familiarity with big data platforms like Apache Hadoop and Spark, along with data mining and statistical analysis, is important.
  • AI and Robotics
    • The Frontier of Data Science: This field is at the cutting edge, developing intelligent systems capable of performing tasks that typically require human intelligence.
    • Skills and Knowledge Base: A deep understanding of AI principles, programming, and robotics is necessary, along with skills in machine learning and neural networks.

Educational Pathways for Each Specialization

  • Academic Courses and Degrees
    • Pursuing a formal education in data science or a related field can provide a strong theoretical foundation. Many universities like MIT now offer specialized courses in machine learning, AI, and big data analytics, like the Data Analysis Certificate program.
  • Online Courses and Bootcamps
    • Online platforms like Great Learning offer specialized courses that are more flexible and often industry-oriented. Bootcamps, on the other hand, provide intensive, hands-on training in specific areas of data science.
  • Certifications and Workshops
    • Professional certifications from recognized bodies can add significant value to your resume. Educational choices like the Data Science course showcase your expertise and commitment to professional development.
  • Self-learning Resources
    • The internet is replete with resources for self-learners. From online tutorials and forums to webinars and eBooks, the opportunities for self-paced learning in data science are abundant.

Building Experience in Your Chosen Specialization

  • Internships and Entry-level Positions
    • Gaining practical experience is crucial. Internships and entry-level positions provide real-world experience and help you understand the practical challenges and applications of your chosen specialization.
  • Personal and Open-source Projects
    • Working on personal data science projects or contributing to open-source projects can be a great way to apply your skills. These projects can also be a valuable addition to your portfolio.
  • Networking and Community Involvement
    • Building a professional network and participating in data science communities can lead to job opportunities and collaborations. Attending industry conferences and seminars is also a great way to stay updated and connected.
  • Industry Conferences and Seminars
    • These events are excellent for learning about the latest industry trends, best data science practices, and emerging technologies. They also offer opportunities to meet industry leaders and peers.

Future Trends and Evolving Specializations

  • Predicting the Future of Data Science
    • The field of data science is constantly evolving. Staying informed about future trends is crucial for choosing a specialization that will remain relevant and in demand.
  • Emerging Specializations and Technologies
    • Areas like quantum computing, edge analytics, and ethical AI are emerging as new frontiers in data science. These fields are likely to offer exciting new opportunities for specialization in the coming years.
  • Staying Adaptable and Continuous Learning
    • The key to a successful career in data science is adaptability and a commitment to continuous learning. The field is dynamic, and staying abreast of new developments is essential.

Conclusion

Choosing the right data science specialization is a critical decision that can shape your career trajectory. It requires careful consideration of your personal interests, the current job market, and future industry trends. Whether your passion lies in the intricate algorithms of machine learning, the structural challenges of data engineering, or the innovative frontiers of AI and robotics, there is a niche for every aspiring data scientist. The journey is one of continuous learning, adaptability, and an unwavering curiosity about the power of data. As the field continues to grow and diversify, the opportunities for data scientists are bound to expand, offering a rewarding and dynamic career path.

AWS Certified Database – Specialty (DBS-C01) Exam Learning Path

⚠️ CERTIFICATION RETIRED

AWS Certified Database – Specialty (DBS-C01) was retired on April 30, 2024. The last day to take this exam was April 29, 2024.

Certifications earned before retirement remain active for the standard three-year period but cannot be renewed.

This content remains valuable as a study guide for AWS database services regardless of certification status.

I recently revalidated my AWS Certified Database – Specialty (DBS-C01) certification just before it expired. The format and domains are pretty much the same as the previous exam; however, it has been enhanced to cover a lot of new services.

AWS Certified Database – Specialty (DBS-C01) Exam Content

AWS Certified Database – Specialty (DBS-C01) exam validates your understanding of databases, including the concepts of design, migration, deployment, access, maintenance, automation, monitoring, security, and troubleshooting, and covers the following tasks:

  • Understand and differentiate the key features of AWS database services.
  • Analyze needs and requirements to design and recommend appropriate database solutions using AWS services

Refer to AWS Database – Specialty Exam Guide

DBS-C01 Domains

AWS Certified Database – Specialty (DBS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answer options have a lot of prose and a lot of reading that needs to be done, so be sure you are prepared and manage your time well.
  • DBS-C01 exam has 65 questions to be solved in 170 minutes which gives you roughly 2 1/2 minutes to attempt each question.
  • DBS-C01 exam includes two types of questions, multiple-choice and multiple-response.
  • DBS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Specialty exams currently cost $300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all.
  • As always, having a rough architecture or mental picture of the setup helps focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Compare the remaining 2 answers to spot where they differ; that would help you reach the right answer or at least have a 50% chance of getting it right.
  • AWS exams can be taken either at a test center or online; I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Database – Specialty (DBS-C01) Exam Resources

AWS Certified Data Engineer – Associate (DEA-C01)

💡 Recommended Replacement Certification

The AWS Certified Data Engineer – Associate (DEA-C01) launched in March 2024 and is the closest active certification covering database and data services.

  • Exam domains: Data Ingestion & Transformation (34%), Data Store Management (26%), Data Operations & Support (22%), Data Security & Governance (18%)
  • Format: 65 questions, 130 minutes, $150 + tax, passing score 720
  • Key services: DynamoDB, Aurora, RDS, Redshift, S3, Glue, Kinesis, MSK, Lake Formation, DMS

AWS Database Services – Study Summary

  • AWS Certified Database – Specialty exam focuses completely on AWS data services, ranging across relational, non-relational, graph, caching, and data warehousing. It also covers their deployment, automation, migration, security, monitoring, and troubleshooting aspects.

Database Services

  • Make sure you know and cover all the services in-depth, as 80% of the exam is focused on topics like Aurora, RDS, DynamoDB
  • DynamoDB
    • is a fully managed NoSQL database service providing single-digit millisecond latency.
    • DynamoDB supports On-demand and Provisioned throughput capacity modes.
      • On-demand mode
        • provides a flexible billing option capable of serving thousands of requests per second without capacity planning
        • does not support reserved capacity
        • [Updated 2024] AWS reduced DynamoDB on-demand pricing by 50% (November 2024), making on-demand mode significantly more cost-effective.
      • Provisioned mode
        • requires you to specify the number of reads and writes per second as required by the application
        • Understand the provisioned capacity calculations
    • DynamoDB Auto Scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns.
    • Know DynamoDB Burst capacity, Adaptive capacity
    • DynamoDB Consistency mode determines the manner and timing in which the successful write or update of a data item is reflected in a subsequent read operation of that same item.
      • supports eventual and strongly consistent reads.
      • Eventually consistent reads consume half the read capacity but might return stale data, whereas strongly consistent reads consume more capacity but always return the most recent data.
    • DynamoDB secondary indexes provide efficient access to data with attributes other than the primary key.
      • An LSI uses the same partition key but a different sort key, whereas a GSI behaves like a separate table with a different partition key and/or sort key.
      • GSI can cause primary table throttling if under-provisioned.
      • Make sure you understand the difference between the Local Secondary Index and the Global Secondary Index
    • DynamoDB Global Tables is a multi-active, cross-region replication capability of DynamoDB to support data access locality and regional fault tolerance for database workloads.
    • DynamoDB Time to Live – TTL enables a per-item timestamp to determine when an item is no longer needed. (hint: know TTL can expire the data and this can be captured by using DynamoDB Streams)
    • DynamoDB cross-region replication allows identical copies (called replicas) of a DynamoDB table (called master table) to be maintained in one or more AWS regions.
    • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table.
    • DynamoDB Triggers (just like database triggers) is a feature that allows the execution of custom actions based on item-level updates on a table.
    • DynamoDB Accelerator – DAX is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement even at millions of requests per second.
      • DAX does not support fine-grained access control like DynamoDB.
    • DynamoDB Backups support PITR
      • AWS Backup can be used to back up and restore, and it supports cross-region snapshot copy as well.
    • VPC Gateway Endpoints provide private access to DynamoDB from within a VPC without the need for an internet gateway or NAT gateway
    • Understand DynamoDB Best practices (hint: selection of keys to avoid hot partitions and creation of LSI and GSI)
    • [New 2024] DynamoDB supports zero-ETL integration with Amazon Redshift (GA October 2024), enabling near real-time analytics on DynamoDB data without building ETL pipelines.
  • Aurora
    • is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases.
    • provides MySQL and PostgreSQL compatibility
    • Aurora Disaster Recovery & High Availability can be achieved using Read Replicas with very minimal downtime.
      • Aurora promotes the read replica with the highest priority tier (tier 0 is the highest); if several replicas share a tier, it promotes the largest one
    • Aurora Global Database provides cross-region read replicas for low-latency reads. Remember it is not multi-master and would not provide low-latency writes across regions the way DynamoDB Global Tables do.
    • Aurora Connection endpoints support
      • Cluster for primary read/write
      • Reader for read replicas
      • Custom for a specific group of instances
      • Instance for specific single instance – Not recommended
    • Aurora Fast Failover techniques
      • set TCP keepalives low
      • set Java DNS caching timeouts low
      • set the timeout variables used in the JDBC connection string low
      • Use the provided read and write Aurora endpoints
      • Use cluster cache management for Aurora PostgreSQL. Cluster cache management ensures that application performance is maintained if there’s a failover.
    • Aurora Serverless is an on-demand, autoscaling configuration for the MySQL-compatible and PostgreSQL-compatible editions of Aurora.
      • [Updated 2024-2025] Aurora Serverless v2 now supports scaling from 0 to 256 ACUs (Aurora Capacity Units). Scale-to-zero (November 2024) eliminates costs during inactivity, while the 256 ACU maximum (October 2024) supports larger workloads.
      • [New 2025] Aurora Serverless v2 platform version 4 delivers up to 30% better performance and 45% faster scaling at no additional cost.
    • Aurora Backtrack feature helps rewind the DB cluster to the specified time. It is not a replacement for backups.
    • Aurora Server Auditing Events for different activities cover log-in, DML, permission changes DCL, schema changes DDL, etc.
    • Aurora Cluster Cache Management helps with fast failover
    • Aurora Clone allows you to create quick and cost-effective clones
    • Aurora supports fault injection queries to simulate various failovers like node down, primary failover, etc.
    • RDS PostgreSQL and MySQL can be migrated to Aurora, by creating an Aurora Read Replica from the instance. Once the replica lag is zero, switch the DNS with no data loss
    • Aurora Database Activity Streams help stream audit logs to external services like Kinesis
    • Supports stored procedures calling Lambda functions
    • [New 2024] Aurora PostgreSQL Limitless Database (GA November 2024) enables horizontal write scaling by distributing workloads across multiple Aurora writer instances while maintaining single-database semantics and transactional consistency.
    • [New 2024] Aurora supports zero-ETL integration with Amazon Redshift, enabling near real-time analytics without building ETL pipelines.
  • Relational Database Service (RDS)
    • provides a relational database in the cloud with multiple database options.
    • RDS Snapshots, Backups, and Restore
      • restoring a DB from a snapshot does not retain the parameter group and security group
      • automated snapshots cannot be shared. Copy the automated snapshot to a manual snapshot and share the copy.
    • RDS Read Replicas
      • allow elastic scaling beyond the capacity constraints of a single DB instance for read-heavy database workloads.
      • provide increased scalability and database availability in the case of an AZ failure.
      • supports cross-region replicas.
    • RDS Multi-AZ provides high availability and automatic failover support for DB instances.
    • Understand the differences between RDS Multi-AZ vs Read Replicas
      • Multi-AZ failover can be simulated using the Reboot with failover option
      • Read Replicas require automated backups enabled
    • [New] RDS Multi-AZ DB Cluster deployments provide a primary and two readable standby instances across three AZs, offering lower write latency, faster failover (~35 seconds), and readable standbys compared to traditional Multi-AZ.
    • Understand DB components esp. DB parameter groups and DB option groups
      • Dynamic parameters are applied immediately
      • Static parameters need manual reboot.
      • Default parameter group cannot be modified. Need to create custom parameter group and associate to RDS
      • Know max connections also depends on DB instance size
    • RDS Custom automates database administration tasks and operations, while making it possible for you as a database administrator to access and customize the database environment and operating system.
    • RDS Performance Insights is a database performance tuning and monitoring feature that helps you quickly assess the load on the database, and determine when and where to take action.
    • RDS Security
      • RDS supports security groups to control who can access RDS instances
      • RDS supports data at rest encryption and SSL for data in transit encryption
      • RDS supports IAM database authentication with temporary credentials.
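      • An existing unencrypted RDS instance cannot be encrypted in place; create a snapshot -> copy it with encryption enabled -> restore the encrypted copy as a new DB instance
      • RDS PostgreSQL uses rds.force_ssl=1 to require SSL connections; clients use sslmode=verify-ca or sslmode=verify-full to verify the server certificate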
      • Know RDS Encrypted Database limitations
    • Understand RDS Monitoring and Notification
      • Know RDS supports notification events through SNS for events like database creation, deletion, snapshot creation, etc.
      • CloudWatch gathers metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.
      • Enhanced Monitoring metrics are useful to understand how different processes or threads on a DB instance use the CPU.
      • RDS Performance Insights is a database performance tuning and monitoring feature that helps illustrate the database’s performance and help analyze any issues that affect it
    • An RDS instance cannot be stopped if it has read replicas
    • [New 2024] RDS supports zero-ETL integration with Amazon Redshift for MySQL and PostgreSQL, enabling near real-time analytics.
    • [New 2026] RDS now supports ENA Express for Multi-AZ replication, improving replication performance through multiple network paths.
  • ElastiCache
    • is a managed web service that helps deploy and run Memcached or Redis protocol-compliant cache clusters in the cloud easily.
    • Understand the differences between Redis vs. Memcached
    • [New 2024] ElastiCache now supports Valkey, a community-driven, open-source fork of Redis. Valkey is the recommended engine on ElastiCache with 33% lower Serverless pricing and 20% lower node-based pricing than other engines.
    • [Updated 2026] Valkey 9.0 is now available for ElastiCache, offering improved performance. Valkey has become the default high-performance key-value datastore across major cloud providers.
    • [New] ElastiCache Serverless enables creating a cache in under a minute with automatic scaling based on traffic patterns, eliminating the need to right-size clusters.
  • Neptune
    • is a fully managed database service built for the cloud that makes it easier to build and run graph applications. Neptune provides built-in security, continuous backups, serverless compute, and integrations with other AWS services.
    • provides Neptune loader to quickly import data from S3
    • supports VPC endpoints
    • [New 2024] Neptune Analytics provides a serverless graph analytics engine for running algorithms on large graphs without managing infrastructure. Supports NetworkX integration for Python-based graph workflows.
    • Neptune Serverless automatically scales capacity based on workload demands.
  • Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service.
  • Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log.
    ⚠️ Amazon QLDB reached End of Support on July 31, 2025. All data not migrated was permanently deleted. AWS recommends migrating to Amazon Aurora PostgreSQL for audit use cases using the ledger functionality with cryptographic verification.
  • Redshift
    • is a fully managed, fast, and powerful, petabyte-scale data warehouse service. It is not covered in depth.
    • Know Redshift Best Practices w.r.t selection of Distribution style, Sort key, importing/exporting data
      • A single COPY command loads multiple files in parallel and performs better than multiple COPY commands
      • COPY command can use manifest files to load data
      • COPY command handles encrypted data
    • Know Redshift cross region encrypted snapshot copy
      • Create a new key in destination region
      • Use CreateSnapshotCopyGrant to allow Amazon Redshift to use the KMS key from the destination region.
      • In the source region, enable cross-region snapshot copy and specify the name of the copy grant created.
    • Know Redshift supports Audit logging which covers authentication attempts, connections and disconnections usually for compliance reasons.
    • [New 2024-2025] Redshift supports zero-ETL integrations from Aurora MySQL/PostgreSQL, RDS MySQL, DynamoDB, and SaaS applications (Salesforce, SAP, Zendesk), eliminating the need for custom ETL pipelines.
    • [New] Redshift Serverless provides automatic scaling and pay-per-use pricing without managing clusters.
  • Data Migration Service (DMS)
    • DMS helps in the migration of homogeneous and heterogeneous databases
    • DMS with Full load plus Change Data Capture (CDC) migration capability can be used to migrate databases with zero downtime and no data loss.
    • DMS with SCT (Schema Conversion Tool) can be used to migrate heterogeneous databases.
    • Premigration Assessment evaluates specified components of a database migration task to help identify any problems that might prevent a migration task from running as expected.
    • Multiserver assessment report evaluates multiple servers based on input that you provide for each schema definition that you want to assess.
    • DMS provides support for data validation to ensure that your data was migrated accurately from the source to the target.
    • DMS supports LOB migration as a 2-step process. It can do a full or limited LOB migration
      • In full LOB mode, AWS DMS migrates all LOBs from source to target regardless of size. Full LOB mode can be quite slow.
      • In limited LOB mode, a maximum LOB size can be set that AWS DMS should accept. Doing so allows AWS DMS to pre-allocate memory and load the LOB data in bulk. LOBs that exceed the maximum LOB size are truncated and a warning is issued to the log file. In limited LOB mode, you get significant performance gains over full LOB mode.
      • Recommended to use limited LOB mode whenever possible.
    • [New 2024] DMS Homogeneous Data Migrations is a serverless feature for like-to-like migrations (e.g., PostgreSQL to Aurora PostgreSQL) that uses native database tooling. No replication instances to manage. Supports PostgreSQL, MySQL, MariaDB, and MongoDB.
    • [New 2024] DMS Schema Conversion now uses generative AI to automatically convert up to 90% of schema objects from commercial databases to PostgreSQL.
    • [Deprecated 2026] AWS DMS Fleet Advisor reached end of support on May 20, 2026.
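
The full versus limited LOB modes described above are selected in the task settings of a DMS replication task. The following is a minimal sketch of the TargetMetadata section only, assuming the standard DMS task-settings keys (SupportLobs, FullLobMode, LimitedSizeLobMode, LobMaxSize). The helper name lob_target_metadata and the 32 KB value are illustrative, not recommendations.

```python
import json


def lob_target_metadata(max_lob_kb=None):
    """Build the TargetMetadata section of a DMS task-settings document.

    max_lob_kb=None selects full LOB mode; a positive integer selects
    limited LOB mode with that maximum LOB size in kilobytes.
    """
    if max_lob_kb is None:
        # Full LOB mode: every LOB is migrated regardless of size (slower).
        return {"SupportLobs": True, "FullLobMode": True, "LimitedSizeLobMode": False}
    if max_lob_kb <= 0:
        raise ValueError("max_lob_kb must be a positive number of kilobytes")
    return {
        "SupportLobs": True,
        "FullLobMode": False,
        "LimitedSizeLobMode": True,
        "LobMaxSize": max_lob_kb,  # LOBs larger than this are truncated
    }


# 32 KB is an illustrative value only.
task_settings = json.dumps({"TargetMetadata": lob_target_metadata(32)}, indent=2)
print(task_settings)
```

The resulting JSON string is the kind of document that would be supplied as the task settings when creating or modifying a replication task.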

Security, Identity & Compliance

  • Identity and Access Management (IAM)
  • Key Management Service (KMS)
    • is a managed encryption service that allows the creation and control of encryption keys to enable data encryption.
    • provides data at rest encryption for the databases.
  • AWS Secrets Manager
    • protects secrets needed to access applications, services, etc.
    • enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle
    • supports automatic rotation of credentials for RDS, DocumentDB, etc.
  • Secrets Manager vs. Systems Manager Parameter Store
    • Secrets Manager supports automatic rotation while SSM Parameter Store does not
    • Parameter Store is cost-effective as compared to Secrets Manager.
  • Trusted Advisor provides a check for idle RDS DB instances

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics.
    • EventBridge (CloudWatch Events) provides real-time alerts
    • CloudWatch Logs can store RDS logs with a custom retention period; by default, logs are retained indefinitely.
    • CloudWatch Application Insights supports .NET and SQL Server monitoring
  • Know CloudFormation for provisioning, in terms of
    • Stack drift – to detect the difference between the stack's expected configuration and the actual environment after any manual changes
    • Change Set – allows you to verify the changes before being propagated
    • parameters – allows you to configure variables or environment-specific values
    • Stack policy defines the update actions that can be performed on designated resources.
    • Deletion policy for RDS allows you to configure whether the resource is retained, snapshotted, or deleted when the stack is deleted
    • Supports secrets manager for DB credentials generation, storage, and easy rotation
    • Systems Manager Parameter Store for environment-specific parameters
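
Several of the CloudFormation points above (parameters, deletion policy, Secrets Manager for credentials) can be seen together in one small template. This is a sketch only, built as a Python dictionary and serialized to JSON. The logical ID AppDatabase, the secret name AppDbSecret, and the property values are hypothetical placeholders.

```python
import json

# "AppDatabase", "AppDbSecret" and all property values are placeholders.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        # Environment-specific value supplied at stack creation time.
        "DBInstanceClass": {"Type": "String", "Default": "db.t3.micro"},
    },
    "Resources": {
        "AppDatabase": {
            "Type": "AWS::RDS::DBInstance",
            # Take a final snapshot instead of simply deleting the instance.
            "DeletionPolicy": "Snapshot",
            "Properties": {
                "Engine": "mysql",
                "DBInstanceClass": {"Ref": "DBInstanceClass"},
                "AllocatedStorage": "20",
                # Dynamic references resolve the credentials from Secrets Manager.
                "MasterUsername": "{{resolve:secretsmanager:AppDbSecret:SecretString:username}}",
                "MasterUserPassword": "{{resolve:secretsmanager:AppDbSecret:SecretString:password}}",
            },
        }
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```

Creating a change set from template_body before executing it would show the proposed changes without applying them.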

On the Exam Day

  • Make sure you are relaxed and get a good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you will not be allowed to take the exam if you are more than 30 minutes late.
    • Make sure your desk is clear, with no wristwatches or external monitors; keep your phone away, and make sure nobody enters the room.

Finally, All the Best 🙂

DynamoDB with VPC Endpoints – Gateway & Interface

DynamoDB VPC Endpoint

DynamoDB with VPC Endpoints

  • By default, communications to and from DynamoDB use the HTTPS protocol, which protects network traffic by using SSL/TLS encryption.
  • A VPC endpoint for DynamoDB enables EC2 instances in the VPC to use their private IP addresses to access DynamoDB with no exposure to the public internet.
  • Traffic between the VPC and the AWS service does not leave the Amazon network.
  • EC2 instances do not require public IP addresses, an internet gateway, a NAT device, or a virtual private gateway in the VPC.

  • VPC endpoint for DynamoDB routes any requests to a DynamoDB endpoint within the Region to a private DynamoDB endpoint within the Amazon network.
  • Applications running on EC2 instances in the VPC don’t need to be modified.
  • Endpoint name remains the same, but the route to DynamoDB stays entirely within the Amazon network and does not access the public internet.
  • VPC endpoint policies can be used to control access to DynamoDB.

Types of VPC Endpoints for DynamoDB

  • DynamoDB supports two types of VPC endpoints: Gateway Endpoints and Interface Endpoints (using AWS PrivateLink).
  • Both types keep network traffic on the AWS network.
  • Gateway endpoints and interface endpoints can be used together in the same VPC.

Gateway Endpoints

  • A gateway endpoint is specified in the route table to access DynamoDB from the VPC over the AWS network.
  • Use DynamoDB public IP addresses.
  • Do not allow access from on-premises networks.
  • Do not allow access from another AWS Region.
  • Not billed – Gateway endpoints are free of charge.
  • Available only in the Region where created.
  • Supported for DynamoDB tables; DynamoDB Streams is reachable only through interface endpoints.
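
As a rough illustration of the gateway endpoint properties listed above, the helper below assembles the arguments that the EC2 CreateVpcEndpoint API expects for a DynamoDB gateway endpoint. It is a sketch that only builds the request; the function name and the example IDs are hypothetical, and no AWS call is made.

```python
def gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Return keyword arguments for an EC2 CreateVpcEndpoint request."""
    if not route_table_ids:
        raise ValueError("a gateway endpoint needs at least one route table")
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # Gateway endpoints are Regional, so the service name embeds the Region.
        "ServiceName": f"com.amazonaws.{region}.dynamodb",
        # The endpoint is added as a route target in these route tables.
        "RouteTableIds": list(route_table_ids),
    }


# Placeholder IDs for illustration only.
params = gateway_endpoint_params("vpc-0abc", "us-east-1", ["rtb-0123"])
print(params)
```

With boto3 installed and credentials configured, these arguments could be passed as `boto3.client("ec2").create_vpc_endpoint(**params)`.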

Interface Endpoints (AWS PrivateLink)

  • Announced in March 2024, DynamoDB now supports AWS PrivateLink for interface endpoints.
  • Use private IP addresses from the VPC to route requests to DynamoDB.
  • Represented by one or more elastic network interfaces (ENIs) with private IP addresses.
  • Allow access from on-premises networks via AWS Direct Connect or Site-to-Site VPN.
  • Allow cross-region access from another VPC using VPC peering or AWS Transit Gateway.
  • Billed – Interface endpoints incur hourly charges and data processing charges.
  • Support up to 50,000 requests per second per endpoint.
  • Compatible with existing gateway endpoints in the same VPC.
  • Enable simplified private network connectivity from on-premises workloads to DynamoDB.

Choosing Between Gateway and Interface Endpoints

  • Use Gateway Endpoints when:
    • Access is only needed from within the VPC.
    • Cost optimization is a priority (gateway endpoints are free).
    • Simple VPC-only connectivity is sufficient.
  • Use Interface Endpoints when:
    • Access is needed from on-premises networks via Direct Connect or VPN.
    • Cross-region access is required via VPC peering or Transit Gateway.
    • Private IP addressing is required for compliance or security policies.
    • Integration with AWS Management Console Private Access is needed.
  • Use Both Together when:
    • In-VPC applications can use the free gateway endpoint.
    • On-premises applications use interface endpoints for private connectivity.
    • This approach optimizes costs while enabling hybrid connectivity.
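
The selection rules above can be condensed into a small decision helper. This is a simplified sketch of the guidance in this section, not an official AWS rule set; the function and parameter names are made up for illustration.

```python
def choose_endpoints(in_vpc, on_premises=False, cross_region=False,
                     private_ip_required=False):
    """Return the endpoint types suggested by the guidance above."""
    chosen = set()
    # Hybrid, cross-Region, or private-IP requirements need PrivateLink.
    if on_premises or cross_region or private_ip_required:
        chosen.add("interface")
    # In-VPC traffic can use the free gateway endpoint, unless policy
    # demands private IP addressing (gateway endpoints use public IPs).
    if in_vpc and not private_ip_required:
        chosen.add("gateway")
    return sorted(chosen)


print(choose_endpoints(in_vpc=True))
print(choose_endpoints(in_vpc=True, on_premises=True))
```

The second call models the "use both together" case: in-VPC applications stay on the gateway endpoint while on-premises clients use the interface endpoint.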

DynamoDB Streams with AWS PrivateLink

  • Announced in March 2025, DynamoDB Streams now supports AWS PrivateLink.
  • Allows invoking DynamoDB Streams APIs from within the VPC without traversing the public internet.
  • Only interface endpoints are supported for DynamoDB Streams – gateway endpoints are not supported.
  • Enables private connectivity for stream processing applications running on-premises or in other regions.
  • Supports FIPS endpoints in US and Canada commercial AWS Regions (announced November 2025).
  • To use DynamoDB console with AWS Management Console Private Access, create VPC endpoints for both:
    • com.amazonaws.<region>.dynamodb
    • com.amazonaws.<region>.dynamodb-streams

DynamoDB Accelerator (DAX) with AWS PrivateLink

  • Announced in October 2025, DAX now supports AWS PrivateLink.
  • Enables secure access to DAX management APIs (CreateCluster, DescribeClusters, DeleteCluster) over private IP addresses within the VPC.
  • Customers can access DAX using private DNS names.
  • Provides private connectivity for DAX cluster management operations.

IPv6 Support

  • Announced in October 2025, DynamoDB now supports Internet Protocol version 6 (IPv6).
  • IPv6 addresses can be used in VPCs when connecting to:
    • DynamoDB tables
    • DynamoDB Streams
    • DynamoDB Accelerator (DAX)
  • IPv6 support includes both AWS PrivateLink Gateway and Interface endpoints.
  • DAX supports IPv6 addressing with IPv4-only, IPv6-only, or dual-stack networking modes.
  • Available in all commercial AWS Regions and AWS GovCloud (US) Regions.

VPC Endpoint Policies

  • Endpoint policies can be attached to VPC endpoints to control access to DynamoDB.
  • Policies specify:
    • IAM principals that can perform actions
    • Actions that can be performed
    • Resources on which actions can be performed
  • Can restrict access to specific DynamoDB tables from a VPC endpoint.
  • Useful for implementing least-privilege access controls.
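
A minimal sketch of an endpoint policy that restricts a VPC endpoint to one table is shown below. The Region, account ID, table name, and action list are placeholders; a real policy would be tuned to the application's access pattern.

```python
import json


def table_endpoint_policy(region, account_id, table_name, actions):
    """Build an endpoint policy allowing only the given actions on one table."""
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",           # who may use the endpoint
                "Action": sorted(actions),  # what they may do
                "Resource": table_arn,      # on which table
            }
        ],
    }


# Placeholder values for illustration only.
policy = table_endpoint_policy(
    "us-east-1", "123456789012", "Orders",
    ["dynamodb:Query", "dynamodb:GetItem"],
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The JSON string is the form in which a policy document is attached when creating or modifying a VPC endpoint. IAM policies still apply in addition to the endpoint policy.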

Considerations and Limitations

  • AWS PrivateLink for DynamoDB does not support:
    • Transport Layer Security (TLS) 1.1
    • Private and Hybrid Domain Name System (DNS) services
  • Network connectivity timeouts to AWS PrivateLink endpoints need to be handled by applications.
  • Interface endpoints support up to 50,000 requests per second per endpoint.
  • When using both gateway and interface endpoints together, applications must use endpoint-specific DNS names to route traffic through interface endpoints.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What are the services supported by VPC endpoints, using the Gateway endpoint type?
    1. Amazon EFS
    2. Amazon DynamoDB
    3. Amazon Glacier
    4. Amazon SQS
  2. A business application is hosted on Amazon EC2 and uses Amazon DynamoDB for its storage. The chief information security officer has directed that no application traffic between the two services should traverse the public internet. Which capability should the solutions architect use to meet the compliance requirements?
    1. AWS Key Management Service (AWS KMS)
    2. VPC endpoint
    3. Private subnet
    4. Virtual private gateway
  3. A company runs an application in the AWS Cloud and uses Amazon DynamoDB as the database. The company deploys Amazon EC2 instances to a private network to process data from the database. The company uses two NAT instances to provide connectivity to DynamoDB.
    The company wants to retire the NAT instances. A solutions architect must implement a solution that provides connectivity to DynamoDB and that does not require ongoing management. What is the MOST cost-effective solution that meets these requirements?

    1. Create a gateway VPC endpoint to provide connectivity to DynamoDB.
    2. Configure a managed NAT gateway to provide connectivity to DynamoDB.
    3. Establish an AWS Direct Connect connection between the private network and DynamoDB.
    4. Deploy an AWS PrivateLink endpoint service between the private network and DynamoDB.
  4. A company has an on-premises data center connected to AWS via AWS Direct Connect. The company needs to access DynamoDB tables from on-premises applications without traversing the public internet. What is the BEST solution?
    1. Create a gateway VPC endpoint for DynamoDB.
    2. Create an interface VPC endpoint (AWS PrivateLink) for DynamoDB.
    3. Configure a NAT gateway in the VPC.
    4. Use an internet gateway with security groups.
  5. A solutions architect needs to enable private connectivity to DynamoDB Streams for a stream processing application. Which VPC endpoint type should be used?
    1. Gateway endpoint only
    2. Interface endpoint only
    3. Either gateway or interface endpoint
    4. Both gateway and interface endpoints together
  6. A company wants to minimize costs for accessing DynamoDB from EC2 instances within the same VPC while maintaining private connectivity. What should they implement?
    1. Interface VPC endpoint
    2. Gateway VPC endpoint
    3. NAT gateway
    4. Internet gateway with security groups
  7. Which of the following are true about DynamoDB interface endpoints? (Select TWO)
    1. They support access from on-premises networks via Direct Connect or VPN.
    2. They are free of charge.
    3. They use private IP addresses from the VPC.
    4. They cannot be used with gateway endpoints in the same VPC.
    5. They support unlimited requests per second.

Amazon DynamoDB TTL – Automatically Expire & Delete Items

DynamoDB Time to Live – TTL

  • DynamoDB Time to Live – TTL enables a per-item timestamp to determine when an item is no longer needed.
  • After the date and time of the specified timestamp, DynamoDB deletes the item from the table without consuming any write throughput.

  • DynamoDB TTL is provided at no extra cost and can help reduce data storage by retaining only required data.
  • Items that are deleted from the table are also removed from any local secondary index and global secondary index in the same way as a DeleteItem operation.
  • Expired items are not deleted immediately; DynamoDB typically deletes them within a few days of their expiration time (the DynamoDB documentation has cited two days as typical).
  • Items with valid, expired TTL attributes may be deleted by the system at any time after expiration. You can still update expired items that are pending deletion, including changing or removing their TTL attributes.
  • DynamoDB Streams tracks the TTL delete operation as a system delete (service deletion), not a regular user delete. The streams record contains userIdentity.type: "Service" and userIdentity.principalId: "dynamodb.amazonaws.com".
  • TTL deletions can be identified in DynamoDB Streams only in the Region where the deletion occurred. TTL deletions replicated to global table replica regions are not identifiable in DynamoDB Streams in those replica regions.
  • TTL requirements
    • TTL attributes must use the Number data type. Other data types, such as String, are not supported and will be ignored by the TTL process.
    • TTL attributes must use the Unix epoch time format (seconds granularity). Ensure the timestamp is in seconds, not milliseconds.
  • TTL is useful if the stored items lose relevance after a specific time, for example:
    • Remove user or sensor data after a year of inactivity in an application
    • Archive expired items to an S3 data lake via DynamoDB Streams and AWS Lambda.
    • Retain sensitive data for a certain amount of time according to contractual or regulatory obligations.
    • Manage session data, temporary tokens, or short-lived cache entries.
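
The TTL requirements above (Number type, Unix epoch in seconds) are easy to get wrong when timestamps come from systems that use milliseconds. The sketch below computes a TTL value and builds an item in the low-level DynamoDB attribute format. The attribute name expires_at, the key session_id, and the millisecond heuristic are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch_seconds(created_at, days):
    """Return the expiry time as whole seconds since the Unix epoch."""
    return int((created_at + timedelta(days=days)).timestamp())


def looks_like_milliseconds(value):
    """Heuristic only: epoch seconds stay below 10**11 for millennia."""
    return value >= 10**11


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires_at = ttl_epoch_seconds(created, days=30)

# Low-level item format: the TTL attribute must be of type Number ("N").
item = {
    "session_id": {"S": "example-session"},
    "expires_at": {"N": str(expires_at)},
}
print(expires_at)  # 1706659200 (2024-01-31T00:00:00Z)
```

A value such as 1706659200000 would be flagged by looks_like_milliseconds and should be divided by 1000 before being written.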

TTL Best Practices

  • Use filter expressions in Scan and Query operations to exclude expired items that are pending deletion, as they still appear in read results until physically removed.
  • Use condition expressions to avoid writing to expired items that are pending deletion.
  • Expired items still count towards storage and read costs until they are physically deleted by the background process.
  • For Global Tables (version 2019.11.21), DynamoDB replicates TTL deletes to all replica tables. The initial TTL delete does not consume WCU in the region where expiry occurs, but replicated TTL deletes consume a replicated Write Capacity Unit (provisioned) or Replicated Write Unit (on-demand) in each replica region.
  • TTL will continue to process deletions for approximately 30 minutes after it is disabled on a table.
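
Because expired items remain readable until the background process removes them, reads usually filter them out. The following sketch builds the filter arguments for a Scan or Query using the low-level attribute format; the attribute name expires_at is a placeholder, and keeping items that have no TTL attribute at all is a design assumption.

```python
import time


def unexpired_filter(ttl_attribute, now=None):
    """Return Scan/Query arguments that exclude already-expired items."""
    now = int(time.time()) if now is None else int(now)
    return {
        # Keep items that never expire or whose expiry lies in the future.
        "FilterExpression": "attribute_not_exists(#ttl) OR #ttl > :now",
        # A placeholder name avoids clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#ttl": ttl_attribute},
        "ExpressionAttributeValues": {":now": {"N": str(now)}},
    }


params = unexpired_filter("expires_at", now=1706659200)
print(params)
```

With boto3 this could be used as `client.scan(TableName="Sessions", **params)`, where the table name is hypothetical. A filter expression is applied after items are read, so expired items still consume read capacity until they are physically deleted.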

Near Real-Time Data Eviction (Alternative Patterns)

  • DynamoDB’s native TTL deletes items within a few days (typically within two days), which may not suit time-sensitive use cases.
  • For applications requiring near real-time data eviction (less than one minute), consider using Amazon EventBridge Scheduler in combination with DynamoDB to schedule precise deletions.
  • Another pattern uses a purpose-built Global Secondary Index (GSI) for strict data management and precise eviction control.
  • These event-driven architecture patterns can reduce deletion latency from days to under one minute but require additional infrastructure.
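
One way the EventBridge Scheduler pattern could look is a one-time schedule per item that calls DynamoDB DeleteItem at the expiry time. The sketch below only assembles the CreateSchedule arguments. The universal-target ARN form, the self-deleting schedule, and all names are assumptions about one possible design, not the pattern's definitive implementation.

```python
import json
from datetime import datetime, timezone


def one_time_delete_schedule(name, table, key, expires_at, role_arn):
    """Build CreateSchedule arguments for deleting one item at expires_at."""
    if expires_at.tzinfo is None:
        raise ValueError("expires_at must be timezone-aware")
    when = expires_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    return {
        "Name": name,
        "ScheduleExpression": f"at({when})",  # one-time schedule
        "ScheduleExpressionTimezone": "UTC",
        "FlexibleTimeWindow": {"Mode": "OFF"},  # fire at the exact time
        "ActionAfterCompletion": "DELETE",      # remove the schedule afterwards
        "Target": {
            # Assumed universal target that invokes DynamoDB DeleteItem.
            "Arn": "arn:aws:scheduler:::aws-sdk:dynamodb:deleteItem",
            "RoleArn": role_arn,
            "Input": json.dumps({"TableName": table, "Key": key}),
        },
    }


# Placeholder names and ARN for illustration only.
schedule = one_time_delete_schedule(
    "expire-session-abc", "Sessions", {"session_id": {"S": "abc"}},
    datetime(2025, 1, 1, 12, 0, 30, tzinfo=timezone.utc),
    "arn:aws:iam::123456789012:role/scheduler-delete-role",
)
print(schedule["ScheduleExpression"])  # at(2025-01-01T12:00:30)
```

Each item needs its own schedule in this design, which is the additional infrastructure cost mentioned above.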

AWS Certification Exam Practice Questions

  1. A company developed an application by using AWS Lambda and Amazon DynamoDB. The Lambda function periodically pulls data from the company’s S3 bucket based on date and time tags and inserts specific values into a DynamoDB table for further processing. The company must remove data that is older than 30 days from the DynamoDB table. Which solution will meet this requirement with the MOST operational efficiency?

    1. Update the Lambda function to add the Version attribute in the DynamoDB table. Enable TTL on the DynamoDB table to expire entries that are older than 30 days based on the TTL attribute.
    2. Update the Lambda function to add the TTL attribute in the DynamoDB table. Enable TTL on the DynamoDB table to expire entries that are older than 30 days based on the TTL attribute.
    3. Use AWS Step Functions to delete entries that are older than 30 days.
    4. Use EventBridge to schedule the Lambda function to delete entries that are older than 30 days.
  2. A company stores session data in a DynamoDB table. Each session must be automatically removed exactly when it expires, with no tolerance for delay. The application requires sub-minute deletion precision. Which approach provides the MOST precise deletion timing?
    1. Enable DynamoDB TTL on the session table with the expiration timestamp attribute.
    2. Use a scheduled Lambda function running every minute to scan and delete expired items.
    3. Use Amazon EventBridge Scheduler to schedule individual delete operations for each session at its exact expiration time.
    4. Create a DynamoDB Stream with a Lambda function to delete items when they are marked as expired.
  3. A developer is implementing DynamoDB TTL for a global table replicated across three regions. Which statement correctly describes how TTL deletions are handled in global tables?
    1. TTL deletions consume Write Capacity Units in all regions including the region where expiry occurs.
    2. The initial TTL delete does not consume WCU in the region where expiry occurs, but replicated deletes consume replicated Write Capacity Units in each replica region.
    3. TTL deletions are not replicated to other regions and must be handled separately in each region.
    4. TTL deletions consume no capacity in any region as they are system operations.

DynamoDB Global Tables – Multi-Region Replication

Amazon DynamoDB Global Tables

  • DynamoDB Global Tables is a fully managed, serverless, multi-Region, multi-active database.
  • Global tables provide 99.999% availability, increased application resiliency, and improved business continuity.
  • Global table’s automatic cross-Region replication capability helps achieve fast, local read and write performance and regional fault tolerance for database workloads.
  • Applications can now perform reads and writes to DynamoDB in AWS Regions around the world, with changes in any Region propagated to every Region where a table is replicated.
  • Global Tables help in building applications that take advantage of data locality to reduce overall latency.
  • Global Tables replicates data among Regions within a single AWS account.

Global Tables Working

  • Global Table is a collection of one or more replica tables, all owned by a single AWS account.
  • A single Amazon DynamoDB global table can only have one replica table per AWS Region.
  • Each replica table stores the same set of data items, has the same table name, and the same primary key schema.
  • When an application writes data to a replica table in one Region, DynamoDB replicates the writes to other replica tables in the other AWS Regions.
  • All replicas in a global table share the same table name, primary key schema, and item data.

Consistency Modes

  • When creating a global table, a consistency mode can be chosen.
  • Global tables support two consistency modes: Multi-Region Eventual Consistency (MREC) and Multi-Region Strong Consistency (MRSC).
  • If no consistency mode is specified, the global table defaults to MREC.
  • A global table cannot contain replicas configured with different consistency modes.
  • Consistency mode cannot be changed after creation.

Multi-Region Eventual Consistency (MREC) – Default

  • MREC is the default consistency mode for global tables.
  • Item changes are asynchronously replicated to all other replicas, typically within a second or less.
  • Conflict Resolution: Uses Last Write Wins approach based on the latest internal timestamp on a per-item basis.
  • Consistency Behavior:
    • Supports eventual consistency for cross-Region reads.
    • Supports strong consistency for same-Region reads (returns latest version if item was last updated in that Region).
    • May return stale data for strongly consistent reads if the item was last updated in a different Region.
  • Recovery Point Objective (RPO): Equal to replication delay between replicas (usually a few seconds).
  • Replica Management:
    • Create by adding a replica to an existing DynamoDB table.
    • Can add replicas to expand to more Regions or remove replicas if no longer needed.
    • Can have a replica in any Region where DynamoDB is available.
    • No performance impact when adding replicas.
  • Requirements: Requires DynamoDB Streams enabled with New and Old image settings.
  • Use Cases:
    • Applications that can tolerate stale data from strongly consistent reads if data was updated in another Region.
    • Prioritize lower write and strongly consistent read latencies over multi-Region read consistency.
    • Multi-Region high availability strategy can tolerate RPO greater than zero.
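
The last-write-wins behavior of MREC can be illustrated with a toy simulation. This is a conceptual sketch, not DynamoDB's internal algorithm: the Version class, the timestamps, and the tie-break on Region name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float  # stand-in for DynamoDB's internal timestamp
    region: str


def last_write_wins(local, incoming):
    """Keep whichever version of the item was written last."""
    # Tie-break on region name is an assumption to keep the toy deterministic.
    if (local.timestamp, local.region) >= (incoming.timestamp, incoming.region):
        return local
    return incoming


# Two Regions update the same item at nearly the same time.
us_write = Version("cart=1", 1.0, "us-east-1")
eu_write = Version("cart=2", 1.2, "eu-west-1")

# After asynchronous replication, each replica resolves the conflict.
us_replica = last_write_wins(us_write, eu_write)
eu_replica = last_write_wins(eu_write, us_write)
print(us_replica.value, eu_replica.value)  # cart=2 cart=2
```

Both replicas converge on the later write, and the earlier write is silently discarded, which is why MREC suits workloads that can tolerate that outcome.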

Multi-Region Strong Consistency (MRSC) – June 2025

  • Announced in preview at AWS re:Invent 2024 and generally available in June 2025.
  • Item changes are synchronously replicated to at least one other Region before the write operation returns a successful response.
  • Zero RPO: Provides Recovery Point Objective (RPO) of zero for highest resilience.
  • Consistency Behavior:
    • Strongly consistent read operations on any MRSC replica always return the latest version of an item.
    • Conditional writes always evaluate against the latest version of an item.
    • Provides strong read-after-write consistency across all Regions.
  • Deployment Requirements:
    • Must be deployed in exactly three Regions.
    • Can configure with three replicas OR two replicas + one witness.
    • Witness: A component that contains data written to replicas and supports MRSC’s availability architecture. Cannot perform read/write operations on a witness. Witness is owned and managed by DynamoDB.
  • Regional Availability: Available in three Region sets:
    • US Region set: US East (N. Virginia), US East (Ohio), US West (Oregon)
    • EU Region set: Europe (Ireland), Europe (London), Europe (Paris), Europe (Frankfurt)
    • AP Region set: Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Osaka)
    • MRSC global tables cannot span Region sets (e.g., cannot mix US and EU Regions).
  • Creation Requirements:
    • Create by adding one replica and a witness OR two replicas to an existing DynamoDB table.
    • Table must be empty when converting to MRSC (existing items not supported).
    • Cannot add additional replicas after creation.
    • Cannot delete a single replica or witness (must delete two replicas or one replica + witness to convert back to single-Region table).
  • Write Conflicts: Write operation fails with ReplicatedWriteConflictException when attempting to modify an item already being modified in another Region. Failed writes can be retried.
  • Limitations:
    • Time to Live (TTL): Not supported
    • Local Secondary Indexes (LSIs): Not supported
    • Transactions: Not supported (TransactWriteItems and TransactGetItems return errors)
    • DynamoDB Streams: Not used for replication (can be enabled separately)
  • Performance Trade-off: Higher write and strongly consistent read latencies compared to MREC.
  • Use Cases:
    • Need strongly consistent reads across multiple Regions.
    • Prioritize global read consistency over lower write latency.
    • Multi-Region high availability strategy requires RPO of zero.
    • Financial applications, inventory management, or any system requiring strict consistency.
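
The MRSC deployment rules above (exactly three Regions, all from one Region set, optionally with one witness) can be checked before attempting to create the table. The validator below is a sketch; the Region codes are the standard codes for the Region names listed above, and the function name is hypothetical.

```python
MRSC_REGION_SETS = {
    "US": {"us-east-1", "us-east-2", "us-west-2"},
    "EU": {"eu-west-1", "eu-west-2", "eu-west-3", "eu-central-1"},
    "AP": {"ap-northeast-1", "ap-northeast-2", "ap-northeast-3"},
}


def validate_mrsc_layout(replicas, witness=None):
    """Return the Region set name if the layout is valid, else raise."""
    regions = list(replicas) + ([witness] if witness else [])
    if len(set(regions)) != len(regions):
        raise ValueError("each Region may appear only once")
    if len(regions) != 3:
        raise ValueError("MRSC needs three replicas, or two replicas and a witness")
    for name, members in MRSC_REGION_SETS.items():
        if set(regions) <= members:
            return name
    raise ValueError("all three Regions must belong to the same Region set")


print(validate_mrsc_layout(["us-east-1", "us-east-2"], witness="us-west-2"))  # US
```

A layout such as us-east-1, eu-west-1, us-west-2 raises ValueError because it spans Region sets.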

Pricing Reduction (November 2024)

  • Effective November 1, 2024, DynamoDB reduced prices for global tables by up to 67%.
  • On-demand mode: Global tables cost 67% less than before.
  • Provisioned capacity mode: Global tables cost 33% less than before.
  • Makes global tables significantly more cost-effective for multi-Region deployments.
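
As a quick worked example of the percentages above, the snippet applies each reduction to a hypothetical pre-reduction charge of 100.00 (an invented figure, in arbitrary currency units, not an actual AWS price).

```python
def discounted(old_price, reduction):
    """Apply a fractional price reduction, e.g. 0.67 for 67% less."""
    return old_price * (1 - reduction)


old_charge = 100.00  # hypothetical charge before November 1, 2024

on_demand = round(discounted(old_charge, 0.67), 2)
provisioned = round(discounted(old_charge, 0.33), 2)
print(on_demand, provisioned)  # 33.0 67.0
```

The same replicated write workload that cost 100 would cost about 33 in on-demand mode and about 67 in provisioned mode.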

Replication and Throughput

  • MREC Replication:
    • Uses DynamoDB Streams to replicate changes.
    • Streams are enabled by default on all replicas and cannot be disabled.
    • Replication process may combine multiple changes into a single replicated write.
    • Stream records are ordered per-item but ordering between items may differ between replicas.
  • MRSC Replication:
    • Does not use DynamoDB Streams for replication.
    • Streams can be enabled separately if needed.
    • Stream records are identical for every replica, including ordering.
  • Provisioned Mode:
    • Replication consumes write capacity.
    • Auto scaling settings for read and write capacities are synchronized between replicas.
    • Read capacity can be independently configured per replica using ProvisionedThroughputOverride.
  • On-demand Mode:
    • Write capacity is automatically synchronized across all replicas.
    • DynamoDB automatically adjusts capacity based on traffic.
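
As a sketch of the per-replica read override mentioned above, the helper below builds the arguments for an UpdateTable call that sets ProvisionedThroughputOverride for one replica. The table name and Region are hypothetical, and the dict is only constructed here, not sent; in a real deployment it would be passed to an SDK client.

```python
def replica_read_override(table_name, region, read_capacity_units):
    """Build UpdateTable arguments that override read capacity for one replica."""
    if read_capacity_units < 1:
        raise ValueError("read_capacity_units must be at least 1")
    return {
        "TableName": table_name,
        "ReplicaUpdates": [
            {
                "Update": {
                    "RegionName": region,
                    "ProvisionedThroughputOverride": {
                        "ReadCapacityUnits": read_capacity_units
                    },
                }
            }
        ],
    }


# "Orders" and "eu-west-1" are placeholder values for illustration.
params = replica_read_override("Orders", "eu-west-1", 200)
```

With boto3 this would typically be sent as `client.update_table(**params)`; write capacity is not part of the override because it is synchronized across replicas.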

Monitoring and Testing

  • Replication Latency (MREC only):
    • MREC global tables publish ReplicationLatency metric to CloudWatch.
    • Tracks elapsed time between item write in one replica and appearance in another replica.
    • Expressed in milliseconds for every source-destination Region pair.
    • MRSC global tables do not publish this metric (synchronous replication).
  • Fault Injection Testing:
    • Both MREC and MRSC integrate with AWS Fault Injection Service (AWS FIS).
    • Can simulate Region isolation by pausing replication to/from a selected replica.
    • Test error handling, recovery mechanisms, and multi-Region traffic shift behavior.
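
To illustrate how the ReplicationLatency metric might be read, the sketch below builds the arguments for a CloudWatch GetMetricStatistics request for one source-destination pair. The table name, Region, and time window are assumptions; the request is only constructed, not sent.

```python
from datetime import datetime, timedelta, timezone


def replication_latency_query(table_name, receiving_region, minutes=60, now=None):
    """Build GetMetricStatistics arguments for MREC replication latency."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "ReplicationLatency",
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "ReceivingRegion", "Value": receiving_region},
        ],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 60,  # one data point per minute
        "Statistics": ["Average", "Maximum"],
    }
```

The query would be issued against CloudWatch in the source Region, for example with `cloudwatch.get_metric_statistics(**query)` in boto3.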

Additional Features and Considerations

  • Time to Live (TTL):
    • MREC: Supported. TTL settings synchronized across all replicas. TTL deletes replicated to all replicas (charged for replicated deletes).
    • MRSC: Not supported.
  • Transactions:
    • MREC: Supported but only atomic within the Region where invoked. Not replicated as a unit.
    • MRSC: Not supported.
  • Point-in-Time Recovery (PITR):
    • Can be enabled on each local replica independently.
    • PITR settings are not synchronized between replicas.
  • DynamoDB Accelerator (DAX):
    • Writes to global table replicas bypass DAX, updating DynamoDB directly.
    • DAX caches can become stale and are only refreshed when cache TTL expires.
  • Settings Synchronization:
    • Always synchronized: Capacity mode, write capacity, GSI definitions, encryption, TTL (MREC)
    • Can be overridden per replica: Read capacity, table class
    • Never synchronized: Deletion protection, PITR, tags, Contributor Insights

DynamoDB Global Tables vs. Aurora Global Databases

AWS Aurora Global Database vs DynamoDB Global Tables

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion, and correction.
  1. A company is building a web application on AWS. The application requires the database to support read and write operations in multiple AWS Regions simultaneously. The database also needs to propagate data changes between Regions as the changes occur. The application must be highly available and must provide a latency of single-digit milliseconds. Which solution meets these requirements?
    1. Amazon DynamoDB global tables
    2. Amazon DynamoDB streams with AWS Lambda to replicate the data
    3. An Amazon ElastiCache for Redis cluster with cluster mode enabled and multiple shards
    4. An Amazon Aurora global database
  2. A financial services company requires a multi-Region database with zero data loss (RPO = 0) and strongly consistent reads across all Regions. Which DynamoDB global tables consistency mode should they use?
    1. Multi-Region Eventual Consistency (MREC)
    2. Multi-Region Strong Consistency (MRSC)
    3. Single-Region Strong Consistency
    4. Cross-Region Read Replicas
  3. A company wants to create a DynamoDB global table with MRSC for their inventory management system. They have existing data in a table in us-east-1. What must they do before converting to MRSC?
    1. Enable DynamoDB Streams on the table.
    2. Configure three replicas in different Regions.
    3. Empty the table of all existing data.
    4. Enable Point-in-Time Recovery (PITR).
  4. A company has a DynamoDB global table with MREC configured across us-east-1, eu-west-1, and ap-southeast-1. An item is updated simultaneously in us-east-1 and eu-west-1. How does DynamoDB resolve this conflict?
    1. The write in the primary Region takes precedence.
    2. Last Write Wins based on the latest internal timestamp.
    3. Both writes are rejected and must be retried.
    4. The write with the larger data size takes precedence.
  5. A company wants to deploy a DynamoDB global table with MRSC. They need replicas in us-east-1, eu-west-1, and ap-southeast-1. What will happen?
    1. The global table will be created successfully.
    2. The creation will fail because MRSC cannot span Region sets.
    3. The global table will be created with MREC instead.
    4. A witness will be automatically placed in a fourth Region.
  6. Which of the following features are NOT supported with DynamoDB MRSC global tables? (Select THREE)
    1. Time to Live (TTL)
    2. DynamoDB Streams
    3. Local Secondary Indexes (LSIs)
    4. Global Secondary Indexes (GSIs)
    5. Transaction operations (TransactWriteItems)
    6. Point-in-Time Recovery (PITR)
  7. A company has a DynamoDB global table with MREC. They perform a strongly consistent read in us-west-2, but the item was last updated in eu-west-1. What will the read return?
    1. The latest version of the item from eu-west-1.
    2. Potentially stale data (the version before the eu-west-1 update).
    3. An error indicating the item is being replicated.
    4. The read will be automatically redirected to eu-west-1.
  8. What is the typical replication latency for DynamoDB global tables with MREC?
    1. 5-10 seconds
    2. Within a second or less
    3. Within 5 minutes
    4. Synchronous (no latency)

Amazon DynamoDB Streams – Change Data Capture

Amazon DynamoDB Streams

  • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table.
  • DynamoDB Streams is a serverless data streaming feature that makes it straightforward to track, process, and react to item-level changes in DynamoDB tables in near real-time.
  • DynamoDB Streams stores the data for the last 24 hours, after which they are erased.
  • DynamoDB Streams maintains an ordered sequence of the events per item; however, sequence across items is not maintained.
  • Example:
    • Suppose a DynamoDB table tracks high scores for a game, and each item in the table represents an individual player. You make the following three updates in this order:
      • Update 1: Change Player 1’s high score to 100 points
      • Update 2: Change Player 2’s high score to 50 points
      • Update 3: Change Player 1’s high score to 125 points
    • DynamoDB Streams preserves the order of the two Player 1 events (100, then 125). It does not preserve order across players, so the Player 2 event is not guaranteed to appear between the two Player 1 events.
  • Applications can access this log and view the data items as they appeared before and after they were modified, in near-real time.
  • DynamoDB Streams APIs help developers consume updates and receive the item-level data before and after items are changed.
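
The high-score example above can be replayed in a few lines. This is only a simulation with plain dicts (the field names `player` and `score` are made up for the illustration): it shows one arrival order that a consumer could observe and confirms that each player's own history stays in order.

```python
from collections import defaultdict

# The three updates from the example, in the order they were made.
updates = [
    {"player": "Player 1", "score": 100},
    {"player": "Player 2", "score": 50},
    {"player": "Player 1", "score": 125},
]


def per_item_history(records):
    """Group records by item key, preserving each item's own order."""
    history = defaultdict(list)
    for record in records:
        history[record["player"]].append(record["score"])
    return dict(history)


# One possible arrival order: the Player 2 record shows up last,
# but Player 1's records still arrive as 100 and then 125.
arrival = [updates[0], updates[2], updates[1]]
print(per_item_history(arrival))
# {'Player 1': [100, 125], 'Player 2': [50]}
```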

DynamoDB Streams Features

  • Streams allow reads at up to twice the rate of the provisioned write capacity of the DynamoDB table.
  • Streams have to be enabled on a per-table basis. When enabled on a table, DynamoDB captures information about every modification to data items in the table.
  • Streams support Encryption at rest to encrypt the data.
  • Streams are designed for No Duplicates so that every update made to the table will be represented exactly once in the stream.
  • Streams write stream records in near-real time so that applications can consume these streams and take action based on the contents.
  • Stream records contain information about a data modification to a single item in a DynamoDB table.
  • Each stream record has a sequence number that reflects the order in which the record was published to the stream.

Stream View Types

  • When enabling a stream on a table, you must specify the stream view type, which determines what information is written to the stream:
  • KEYS_ONLY: Only the key attributes of the modified item.
  • NEW_IMAGE: The entire item, as it appears after it was modified.
  • OLD_IMAGE: The entire item, as it appeared before it was modified.
  • NEW_AND_OLD_IMAGES: Both the new and the old images of the item (recommended for maximum flexibility).
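
The effect of each view type can be sketched as a function that decides which images a stream record carries. This is a simplified model, not the service's implementation: items are shown as plain Python dicts rather than DynamoDB's typed attribute format, and the function name is hypothetical.

```python
VIEW_TYPES = ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES")


def stream_record_payload(view_type, keys, old_item=None, new_item=None):
    """Return the item data a stream record would carry for a view type."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown stream view type: {view_type}")
    payload = {"Keys": keys, "StreamViewType": view_type}
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES") and new_item is not None:
        payload["NewImage"] = new_item  # item as it appears after the change
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES") and old_item is not None:
        payload["OldImage"] = old_item  # item as it appeared before the change
    return payload
```

For an insert there is no old item and for a delete there is no new item, which is why both images are optional here.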

Use Cases

  • Multi-Region Replication: Keep other data stores up-to-date with the latest changes to DynamoDB (used by DynamoDB Global Tables).
  • Real-time Analytics: Stream data to analytics services for real-time insights.
  • Event-Driven Architectures: Trigger actions based on changes made to the table.
  • Data Aggregation: Aggregate data from multiple tables into a single view.
  • Audit and Compliance: Maintain audit logs of all changes to data.
  • Search Index Updates: Keep search indexes (e.g., OpenSearch) synchronized with DynamoDB data.
  • Cache Invalidation: Invalidate caches when data changes.
  • Notifications: Send notifications when specific data changes occur.

Processing DynamoDB Streams

  • Stream records can be processed using multiple methods:

AWS Lambda

  • Most common and recommended approach for processing DynamoDB Streams.
  • Lambda polls the stream and invokes the function synchronously when new records are available.
  • Lambda automatically handles scaling, retries, and error handling.
  • Supports batch processing of stream records.
  • Can filter events using event filtering to reduce invocations and costs.
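
A Lambda consumer for a stream might look like the sketch below. It assumes the event source mapping has partial batch responses enabled (ReportBatchItemFailures), so the function can report the first failed record by sequence number; `process` is a hypothetical placeholder for the real business logic.

```python
def process(record):
    """Hypothetical business logic; raise an exception to signal failure."""
    change = record["dynamodb"]
    print(record["eventName"], change["Keys"])


def handler(event, context):
    """Process a batch of stream records and report the first failure."""
    for record in event.get("Records", []):
        try:
            process(record)
        except Exception:
            # Stop at the first failure so later records for the same item
            # are not applied out of order; Lambda retries from this record.
            sequence_number = record["dynamodb"]["SequenceNumber"]
            return {"batchItemFailures": [{"itemIdentifier": sequence_number}]}
    return {"batchItemFailures": []}
```

Returning an empty `batchItemFailures` list tells Lambda that the whole batch succeeded.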

Kinesis Data Streams

  • DynamoDB can stream change data directly to Amazon Kinesis Data Streams.
  • Provides longer data retention (up to 365 days vs. 24 hours for DynamoDB Streams).
  • Enables integration with Kinesis Data Firehose, Kinesis Data Analytics, and other Kinesis consumers.
  • Supports fan-out to multiple consumers.
  • Better for high-throughput scenarios requiring multiple consumers.

Kinesis Client Library (KCL)

  • KCL can be used to build custom applications that process DynamoDB Streams.
  • DynamoDB Streams Kinesis Adapter allows KCL applications to consume DynamoDB Streams.
  • KCL 3.0 Support (June 2025): DynamoDB Streams now supports Kinesis Client Library 3.0.
    • Reduces compute costs to process streaming data by up to 33% compared to previous KCL versions.
    • Improved load balancing algorithm based on CPU utilization.
    • Enhanced performance and efficiency.
    • Note: KCL 1.x reaches end-of-support on January 30, 2026. Migrate to KCL 3.x.

AWS PrivateLink Support (March 2025)

  • Announced in March 2025, DynamoDB Streams now supports AWS PrivateLink.
  • Allows invoking DynamoDB Streams APIs from within your Amazon VPC without traversing the public internet.
  • Only interface endpoints are supported for DynamoDB Streams (gateway endpoints are not supported).
  • Enables private connectivity for stream processing applications running on-premises or in other Regions.
  • Supports FIPS endpoints in US and Canada commercial AWS Regions (announced November 2025).
  • Enhances security by keeping stream data within the AWS network.
  • Critical for compliance requirements that mandate private network connectivity.
  • Can be accessed from on-premises via AWS Direct Connect or Site-to-Site VPN.

DynamoDB Streams vs. Kinesis Data Streams

  • DynamoDB Streams:
    • 24-hour data retention
    • Automatically scales with table
    • No charge to enable; stream reads are billed per read request, except reads made by AWS Lambda triggers
    • Simpler to set up and use
    • Best for simple event-driven architectures
  • Kinesis Data Streams:
    • Up to 365 days data retention
    • Manual capacity management (or on-demand mode)
    • Additional cost for Kinesis
    • More complex but more flexible
    • Best for multiple consumers and longer retention needs
  • Recommendation: Use DynamoDB Streams for simple use cases with Lambda. Use Kinesis Data Streams for complex scenarios requiring multiple consumers or longer retention.

Best Practices

  • Choose the Right View Type: Use NEW_AND_OLD_IMAGES for maximum flexibility unless you have specific requirements.
  • Handle Duplicates: Although designed for no duplicates, implement idempotent processing logic.
  • Monitor Stream Processing: Use CloudWatch metrics to monitor Lambda invocations, errors, and iterator age.
  • Use Event Filtering: Filter events in Lambda to reduce unnecessary invocations and costs.
  • Batch Processing: Configure appropriate batch sizes for Lambda to optimize throughput and cost.
  • Error Handling: Implement proper error handling and configure dead-letter queues for failed records.
  • Consider Kinesis for Multiple Consumers: If you need multiple consumers, use Kinesis Data Streams instead.
  • Migrate to KCL 3.0: If using KCL, migrate to version 3.0 for cost savings and performance improvements.
  • Use PrivateLink for Security: Enable AWS PrivateLink for enhanced security and compliance.
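
The "Handle Duplicates" practice above can be sketched as a thin wrapper that remembers which records it has already applied, keyed on each record's eventID. The in-memory set is an assumption made to keep the example short; a real consumer would persist the seen IDs (or make the downstream write itself idempotent).

```python
class IdempotentProcessor:
    """Skip stream records whose eventID has already been handled."""

    def __init__(self, apply_change):
        self._apply_change = apply_change
        self._seen = set()  # in-memory only; lost when the process restarts

    def handle(self, record):
        """Apply the record once; return False if it was a repeat."""
        event_id = record["eventID"]
        if event_id in self._seen:
            return False
        self._apply_change(record)
        # Mark as seen only after the change was applied successfully.
        self._seen.add(event_id)
        return True
```

Marking the ID after the change is applied means a crash in between leads to a retry rather than a lost update.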

Limitations and Considerations

  • Stream records are available for only 24 hours.
  • Streams do not guarantee ordering across different items (only per-item ordering).
  • Stream records are eventually consistent with the table.
  • Enabling streams does not affect table performance.
  • Streams cannot be enabled on tables with local secondary indexes that use non-key attributes in the projection.
  • For Global Tables with MREC, streams are enabled by default and cannot be disabled.
  • For Global Tables with MRSC, streams are not used for replication but can be enabled separately.

AWS Certification Exam Practice Questions

  1. An application currently writes a large number of records to a DynamoDB table in one region. There is a requirement for a secondary application to retrieve new records written to the DynamoDB table every 2 hours and process the updates accordingly. Which of the following is an ideal way to ensure that the secondary application gets the relevant changes from the DynamoDB table?
    1. Insert a timestamp for each record and then scan the entire table for the timestamp as per the last 2 hours.
    2. Create another DynamoDB table with the records modified in the last 2 hours.
    3. Use DynamoDB Streams to monitor the changes in the DynamoDB table.
    4. Transfer records to S3 which were modified in the last 2 hours.
  2. A company needs to process DynamoDB stream records from an on-premises application without exposing traffic to the public internet. What should they implement?
    1. Use a NAT gateway to access DynamoDB Streams.
    2. Create an interface VPC endpoint for DynamoDB Streams using AWS PrivateLink.
    3. Create a gateway VPC endpoint for DynamoDB Streams.
    4. Use an internet gateway with security groups.
  3. A company wants to reduce costs for processing DynamoDB Streams using KCL. What should they do?
    1. Switch from KCL to Lambda for processing.
    2. Migrate from KCL 1.x to KCL 3.0 for up to 33% cost reduction.
    3. Reduce the number of shards in the stream.
    4. Increase the batch size for stream processing.
  4. A company needs to maintain an audit log of all changes to a DynamoDB table for 90 days. DynamoDB Streams only retains data for 24 hours. What is the BEST solution?
    1. Enable PITR on the DynamoDB table.
    2. Stream DynamoDB changes to Kinesis Data Streams with 90-day retention.
    3. Use Lambda to copy stream records to S3 every 24 hours.
    4. Create on-demand backups every 24 hours.
  5. A developer wants to capture both the old and new values of items when they are modified in a DynamoDB table. Which stream view type should they configure?
    1. KEYS_ONLY
    2. NEW_IMAGE
    3. OLD_IMAGE
    4. NEW_AND_OLD_IMAGES
  6. Which of the following statements about DynamoDB Streams are correct? (Select TWO)
    1. Stream records are available for 24 hours.
    2. Streams guarantee ordering across all items in the table.
    3. Streams maintain ordered sequence of events per item.
    4. Streams can be processed only by Lambda functions.
    5. Enabling streams impacts table write performance.
  7. A company has multiple applications that need to process the same DynamoDB change events. What is the BEST approach?
    1. Create multiple Lambda functions triggered by the same DynamoDB Stream.
    2. Stream DynamoDB changes to Kinesis Data Streams and use multiple consumers.
    3. Enable multiple DynamoDB Streams on the same table.
    4. Use DynamoDB Streams with fan-out to multiple Lambda functions.

DynamoDB Consistency – Strong vs Eventual Reads

DynamoDB Consistency

  • An AWS Region is a physical location where AWS clusters data centers. Each Region contains multiple Availability Zones (AZs), which are discrete data centers with redundant power, networking, and connectivity.
  • DynamoDB automatically replicates each table across three Availability Zones in a Region for durability.
  • DynamoDB consistency represents the manner and timing in which the successful write or update of a data item is reflected in a subsequent read operation of that same item.

DynamoDB Consistency Modes

Eventually Consistent Reads (Default)

  • Eventual consistency option maximizes the read throughput.
  • Consistency across all copies is usually reached within a second.
  • However, an eventually consistent read might not reflect the results of a recently completed write.
  • Repeating a read after a short time should return the updated data.
  • DynamoDB uses eventually consistent reads, by default.
  • Use Cases:
    • Applications that can tolerate reading slightly stale data.
    • Read-heavy workloads where throughput is more important than immediate consistency.
    • Cost-sensitive applications (eventually consistent reads are half the cost of strongly consistent reads).

Strongly Consistent Reads

  • Strongly consistent read returns a result that reflects all writes that received a successful response prior to the read.
  • Ensures that the most up-to-date data is returned.
  • Cost: Strongly consistent reads are 2x the cost of eventually consistent reads (consume twice the read capacity units).
  • Disadvantages:
    • A strongly consistent read might not be available if there is a network delay or outage. In this case, DynamoDB may return a server error (HTTP 500).
    • Strongly consistent reads may have higher latency than eventually consistent reads.
    • Not supported on Global Secondary Indexes (GSIs) – only eventually consistent reads are supported on GSIs.
    • Strongly consistent reads use more throughput capacity than eventually consistent reads.
  • Use Cases:
    • Applications requiring immediate read-after-write consistency.
    • Financial transactions or inventory management where stale data is unacceptable.
    • Scenarios where data accuracy is critical.
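
The 2x cost difference can be made concrete with a small calculation. The sketch assumes the standard read unit size of 4 KB: a strongly consistent read of an item up to 4 KB consumes one read capacity unit and an eventually consistent read consumes half of that. The function name is illustrative.

```python
import math


def read_capacity_units(item_size_bytes, mode="eventual"):
    """Estimate RCUs consumed by reading one item of the given size."""
    factor = {"eventual": 0.5, "strong": 1.0}[mode]
    # Item size is rounded up to the next 4 KB block.
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    return blocks * factor


print(read_capacity_units(6 * 1024, "strong"))    # 2.0
print(read_capacity_units(6 * 1024, "eventual"))  # 1.0
```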

Specifying Consistency Mode

  • DynamoDB allows the user to specify whether the read should be eventually consistent or strongly consistent at the time of the request.
  • Read operations (such as GetItem, Query, and Scan) provide a ConsistentRead parameter. If set to true, DynamoDB uses strongly consistent reads during the operation.
  • Default Behavior: GetItem, BatchGetItem, Query, and Scan operations perform eventually consistent reads by default.
  • Forcing Strong Consistency:
    • Query and GetItem operations can be forced to be strongly consistent by setting ConsistentRead=true.
    • Query operations cannot perform strongly consistent reads on Global Secondary Indexes (GSIs).
    • BatchGetItem operations can be forced to be strongly consistent on a per-table basis.
    • Scan operations can be forced to be strongly consistent.
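
A sketch of how a Query request might be assembled with the ConsistentRead parameter, including a guard for the GSI restriction. The helper and its `index_is_gsi` flag are hypothetical conveniences; the returned dict uses the low-level request field names and would be passed to an SDK client rather than executed here.

```python
def query_params(table_name, key_condition, values, index_name=None,
                 index_is_gsi=False, consistent=False):
    """Build Query arguments, rejecting strong reads on a GSI up front."""
    if consistent and index_name and index_is_gsi:
        raise ValueError("strongly consistent reads are not supported on a GSI")
    params = {
        "TableName": table_name,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeValues": values,
        "ConsistentRead": consistent,  # False means eventually consistent
    }
    if index_name:
        params["IndexName"] = index_name
    return params


# Placeholder table, key, and value for illustration.
params = query_params("Orders", "pk = :pk", {":pk": {"S": "user#1"}}, consistent=True)
```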

Transactional Consistency

  • DynamoDB supports transactions with full ACID (Atomicity, Consistency, Isolation, Durability) properties.
  • Transactions provide all-or-nothing execution for multiple operations across one or more tables.
  • Transaction Operations:
    • TransactWriteItems – Perform multiple write operations atomically.
    • TransactGetItems – Perform multiple read operations as a single all-or-nothing operation that returns a consistent view of the items.
  • Consistency Guarantees:
    • Atomicity: All operations in a transaction succeed or fail together.
    • Consistency: Transactions move the database from one valid state to another.
    • Isolation: Transactional operations are serializable with respect to each other and to standard single-item reads and writes.
    • Durability: Once a transaction is committed, it is durable.
  • Regional Scope: Transactional operations provide ACID guarantees within a single Region.
  • Global Tables Consideration: For Global Tables with MREC, transactions are only atomic within the Region where invoked (not replicated as a unit).
  • Cost: Transactional operations consume 2x the write capacity units compared to standard writes.
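
As an illustration of an all-or-nothing write across two tables, the sketch below builds a TransactWriteItems request that debits an account and records the transfer. Table names, attribute names, and the helper itself are hypothetical; only the request structure is shown and nothing is sent.

```python
def transfer_request(accounts_table, ledger_table, from_id, amount, transfer_id):
    """Build a TransactWriteItems request: debit an account and log the transfer."""
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": accounts_table,
                    "Key": {"AccountId": {"S": from_id}},
                    "UpdateExpression": "SET Balance = Balance - :amt",
                    # The whole transaction fails if funds are insufficient.
                    "ConditionExpression": "Balance >= :amt",
                    "ExpressionAttributeValues": {":amt": {"N": str(amount)}},
                }
            },
            {
                "Put": {
                    "TableName": ledger_table,
                    "Item": {
                        "TransferId": {"S": transfer_id},
                        "AccountId": {"S": from_id},
                        "Amount": {"N": str(amount)},
                    },
                    # Refuse to overwrite an existing ledger entry.
                    "ConditionExpression": "attribute_not_exists(TransferId)",
                }
            },
        ],
        # Makes a retried call with the same token idempotent.
        "ClientRequestToken": transfer_id,
    }
```

If either condition fails, neither the debit nor the ledger entry is written.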

Multi-Region Strong Consistency (MRSC) – June 2025

  • Previewed at AWS re:Invent 2024 and generally available since June 2025.
  • Available for DynamoDB Global Tables configured with Multi-Region Strong Consistency mode.
  • Capability: Provides strong consistency across multiple AWS Regions.
  • Guarantee: Strongly consistent reads on an MRSC table always return the latest version of an item, irrespective of the Region where the read is performed.
  • Zero RPO: Enables Recovery Point Objective (RPO) of zero for highest resilience.
  • How It Works:
    • Item changes are synchronously replicated to at least one other Region before the write operation returns success.
    • Strongly consistent reads always reflect the latest committed write across all Regions.
    • Conditional writes always evaluate against the latest version of an item globally.
  • Deployment Requirements:
    • Must be deployed in exactly three Regions.
    • Can configure with 3 replicas OR 2 replicas + 1 witness.
    • Available in three Region sets: US, EU, and AP (cannot span Region sets).
  • Trade-offs:
    • Higher write latency compared to MREC (eventual consistency) due to synchronous replication.
    • Higher strongly consistent read latency compared to MREC.
  • Use Cases:
    • Financial applications requiring global strong consistency.
    • Inventory management systems across multiple Regions.
    • Applications requiring zero data loss (RPO = 0).
    • Compliance scenarios requiring strict consistency guarantees.
  • Limitations:
    • Transactions not supported on MRSC tables.
    • TTL not supported on MRSC tables.
    • Local Secondary Indexes not supported on MRSC tables.
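
The deployment requirements above can be expressed as a small pre-flight check. This is a simplified sketch: it infers the Region set from the Region name prefix ("us", "eu", "ap"), which is an assumption made for illustration, since only specific Regions within each set are actually eligible. The function name is hypothetical.

```python
def validate_mrsc_regions(replicas, witness=None):
    """Return a list of problems with a proposed MRSC Region layout."""
    regions = list(replicas) + ([witness] if witness else [])
    errors = []
    if len(regions) != 3 or len(set(regions)) != 3:
        errors.append("MRSC needs exactly three distinct Regions "
                      "(3 replicas, or 2 replicas + 1 witness)")
    # Assumption: the prefix before the first dash identifies the Region set.
    region_sets = sorted({region.split("-")[0] for region in regions})
    if len(region_sets) > 1:
        errors.append("Regions span more than one Region set: "
                      + ", ".join(region_sets))
    return errors


print(validate_mrsc_regions(["us-east-1", "eu-west-1", "ap-southeast-1"]))
# ['Regions span more than one Region set: ap, eu, us']
```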

Consistency Comparison

Consistency Type      | Scope         | Latency    | Cost   | Use Case
Eventually Consistent | Single Region | Lowest     | 1x RCU | Read-heavy, can tolerate stale data
Strongly Consistent   | Single Region | Low-Medium | 2x RCU | Immediate consistency required
Transactional         | Single Region | Medium     | 2x WCU | ACID guarantees, multiple operations
MRSC (Global Tables)  | Multi-Region  | Higher     | Varies | Global strong consistency, zero RPO

Best Practices

  • Default to Eventually Consistent: Use eventually consistent reads by default for cost and performance optimization.
  • Use Strong Consistency Selectively: Only use strongly consistent reads when immediate consistency is required.
  • Avoid Strong Consistency on GSIs: Design data models to avoid needing strongly consistent reads on GSIs (not supported).
  • Consider Read-After-Write Patterns: If your application writes and immediately reads, use strongly consistent reads or implement retry logic.
  • Use Transactions for Multi-Item Operations: When multiple items must be updated atomically, use transactions.
  • Evaluate MRSC for Global Applications: For applications requiring global strong consistency, consider MRSC Global Tables.
  • Monitor Consistency Metrics: Use CloudWatch to monitor read/write patterns and adjust consistency settings accordingly.
  • Handle Errors Gracefully: Implement retry logic for strongly consistent reads that may fail during network issues.

AWS Certification Exam Practice Questions

  1. Which of the following statements is true about DynamoDB?
    1. Requests are eventually consistent unless otherwise specified.
    2. Requests are strongly consistent.
    3. Tables do not contain primary keys.
    4. None of the above
  2. How is provisioned throughput affected by the chosen consistency model when reading data from a DynamoDB table?
    1. Strongly consistent reads use the same amount of throughput as eventually consistent reads
    2. Strongly consistent reads use variable throughput depending on read activity
    3. Strongly consistent reads use more throughput than eventually consistent reads.
    4. Strongly consistent reads use less throughput than eventually consistent reads
  3. A company needs to perform a query on a Global Secondary Index (GSI) and requires the most up-to-date data. What consistency mode should they use?
    1. Strongly consistent reads
    2. Eventually consistent reads (GSIs do not support strongly consistent reads)
    3. Transactional reads
    4. Multi-Region strong consistency
  4. A financial application requires strong consistency across multiple AWS Regions with zero data loss (RPO = 0). Which DynamoDB feature should they use?
    1. DynamoDB Global Tables with MREC (eventual consistency)
    2. DynamoDB Global Tables with MRSC (multi-Region strong consistency)
    3. DynamoDB with strongly consistent reads
    4. DynamoDB transactions
  5. A developer needs to update multiple items across two DynamoDB tables atomically. Which feature should they use?
    1. Strongly consistent writes
    2. BatchWriteItem operation
    3. DynamoDB transactions (TransactWriteItems)
    4. Conditional writes
  6. What is the cost difference between eventually consistent reads and strongly consistent reads in DynamoDB?
    1. No difference in cost
    2. Strongly consistent reads cost 2x more (consume 2x RCU)
    3. Strongly consistent reads cost 3x more
    4. Eventually consistent reads cost 2x more
  7. Which of the following statements about DynamoDB consistency are correct? (Select TWO)
    1. Eventually consistent reads are the default for Query and GetItem operations.
    2. Strongly consistent reads are supported on Global Secondary Indexes.
    3. Strongly consistent reads may return HTTP 500 errors during network issues.
    4. Transactions provide ACID guarantees across multiple Regions.
    5. MRSC Global Tables support local secondary indexes.

Amazon DynamoDB Auto Scaling

DynamoDB Auto Scaling

  • DynamoDB Auto Scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns.
  • Application Auto Scaling enables a DynamoDB table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic, without throttling.
  • When the workload decreases, Application Auto Scaling decreases the throughput so that you don’t pay for unused provisioned capacity.
  • Auto Scaling applies to tables and indexes in provisioned capacity mode. On-demand capacity mode scales automatically and needs no Auto Scaling configuration.

DynamoDB Auto Scaling Process

  1. Application Auto Scaling policy can be created on the DynamoDB table.
  2. DynamoDB publishes consumed capacity metrics to CloudWatch.
  3. If the table’s consumed capacity exceeds the target utilization (or falls below the target) for a specific length of time, CloudWatch triggers an alarm. You can view the alarm on the console and receive notifications through Amazon Simple Notification Service (SNS).
    1. The upper threshold alarm is triggered when consumed reads or writes breach the target utilization percent for two consecutive minutes.
    2. The lower threshold alarm is triggered after traffic falls below the target utilization minus 20 percent for 15 consecutive minutes.
  4. CloudWatch alarm invokes Application Auto Scaling to evaluate the scaling policy.
  5. Application Auto Scaling issues an UpdateTable request to adjust the table’s provisioned throughput.
  6. DynamoDB processes the UpdateTable request, dynamically increasing (or decreasing) the table’s provisioned throughput capacity so that it approaches your target utilization.
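
The last step, adjusting provisioned throughput so that utilization approaches the target, comes down to simple arithmetic. The sketch below is an approximation of that idea under stated assumptions, not the actual Application Auto Scaling algorithm: it picks the capacity at which current consumption would sit at the target percentage and clamps it to the configured bounds.

```python
import math


def desired_capacity(consumed_units_per_sec, target_utilization_pct,
                     min_capacity, max_capacity):
    """Capacity at which current consumption equals the target utilization."""
    raw = math.ceil(consumed_units_per_sec * 100 / target_utilization_pct)
    # Never go below the minimum or above the maximum configured capacity.
    return max(min_capacity, min(max_capacity, raw))


print(desired_capacity(700, 70, 5, 4000))   # 1000
print(desired_capacity(7000, 70, 5, 4000))  # 4000 (capped at the maximum)
```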

Auto Scaling Configuration

  • Target Utilization: The percentage of consumed provisioned throughput at a point in time (typically 70%).
  • Minimum Capacity: The lower bound for provisioned throughput that Auto Scaling will not scale below.
  • Maximum Capacity: The upper bound for provisioned throughput that Auto Scaling will not scale above.
  • Scaling Policy: Defines how Auto Scaling responds to changes in workload.
  • Auto Scaling can be configured for:
    • Tables (read and write capacity)
    • Global Secondary Indexes (read and write capacity)
    • Each can be configured independently
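
The configuration items above map onto two Application Auto Scaling requests: one that registers the scalable target with its min/max bounds, and one that attaches a target tracking policy. The sketch below only builds the two argument dicts; the table name, index name, and policy name are hypothetical.

```python
def autoscaling_requests(table_name, min_capacity, max_capacity,
                         target_pct=70.0, capacity="Read", index_name=None):
    """Build RegisterScalableTarget and PutScalingPolicy arguments."""
    if capacity not in ("Read", "Write"):
        raise ValueError("capacity must be 'Read' or 'Write'")
    resource_id = f"table/{table_name}"
    scope = "table"
    if index_name:  # a GSI is scaled independently of its table
        resource_id += f"/index/{index_name}"
        scope = "index"
    common = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": f"dynamodb:{scope}:{capacity}CapacityUnits",
    }
    target = dict(common, MinCapacity=min_capacity, MaxCapacity=max_capacity)
    policy = dict(
        common,
        PolicyName=f"{table_name}-{capacity.lower()}-target-tracking",  # hypothetical name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": target_pct,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": f"DynamoDB{capacity}CapacityUtilization"
            },
        },
    )
    return target, policy
```

With boto3, the two dicts would typically be passed to `register_scalable_target` and `put_scaling_policy` on the `application-autoscaling` client.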

Warm Throughput (November 2024)

  • Announced in November 2024, DynamoDB now supports warm throughput for tables and indexes.
  • Warm Throughput: The read and write capacity your DynamoDB table or index can immediately support, based on historical usage.
  • Provides visibility into the number of read and write operations your table can readily handle.
  • Automatic Growth: DynamoDB automatically adjusts warm throughput values as your usage increases.
  • Pre-warming Capability: You can proactively set higher warm throughput values to prepare for anticipated traffic spikes.
    • Useful for planned events like product launches, sales events, or marketing campaigns.
    • Ensures your table is immediately ready to handle increased load from the moment the event begins.
    • Prevents throttling during sudden traffic surges.
  • Availability:
    • Available for both provisioned and on-demand tables and indexes.
    • Available in all AWS commercial Regions and AWS GovCloud (US) Regions.
  • Pricing:
    • Warm throughput values are available at no cost.
    • Pre-warming your table’s throughput incurs a charge.
  • Use Cases:
    • Peak events with 10x or 100x traffic surges in short periods.
    • Product launches or shopping events (e.g., Black Friday).
    • Marketing campaigns with predictable traffic spikes.
    • Gaming events or live streaming scenarios.
  • How It Works:
    • Each partition is limited to 1,000 write units per second and 3,000 read units per second.
    • Warm throughput indicates the current capacity available across all partitions.
    • Pre-warming increases this capacity before the traffic spike occurs.
    • Auto Scaling takes time to react; pre-warming ensures immediate readiness.
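
The per-partition limits give a rough lower bound on how many partitions a target throughput implies, which is one way to reason about how much pre-warming a planned event needs. This is a back-of-the-envelope sketch only: real partition counts also depend on data size and are managed by DynamoDB.

```python
import math

WRITE_UNITS_PER_PARTITION = 1_000
READ_UNITS_PER_PARTITION = 3_000


def min_partitions(read_units_per_sec, write_units_per_sec):
    """Lower bound on partitions needed to sustain the given throughput."""
    return max(
        1,
        math.ceil(read_units_per_sec / READ_UNITS_PER_PARTITION),
        math.ceil(write_units_per_sec / WRITE_UNITS_PER_PARTITION),
    )


print(min_partitions(30_000, 4_000))  # 10
```

The estimate assumes traffic is spread evenly; a hot key can still throttle a single partition.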

Capacity Modes

Provisioned Capacity Mode with Auto Scaling

  • You specify the number of read and write capacity units.
  • Auto Scaling automatically adjusts capacity within configured min/max bounds.
  • Best for predictable workloads with gradual changes.
  • Cost-effective when you can forecast capacity needs.
  • Supports warm throughput and pre-warming.

On-Demand Capacity Mode

  • DynamoDB automatically scales to accommodate workload.
  • No need to specify capacity units or configure Auto Scaling.
  • Pay per request (no minimum capacity).
  • Best for unpredictable workloads or new applications.
  • Supports warm throughput and pre-warming.
  • On-demand throughput pricing was reduced by 50%, effective November 1, 2024.

Auto Scaling Best Practices

  • Set Appropriate Target Utilization: 70% is recommended to provide a buffer for traffic spikes.
  • Configure Realistic Min/Max Bounds: Ensure maximum capacity can handle peak loads.
  • Use Warm Throughput for Planned Events: Pre-warm tables before anticipated traffic spikes.
  • Monitor CloudWatch Metrics: Track consumed capacity, throttled requests, and Auto Scaling activities.
  • Test Scaling Behavior: Simulate traffic patterns to validate Auto Scaling configuration.
  • Consider On-Demand for Unpredictable Workloads: Eliminates need for capacity planning.
  • Configure Alarms: Set up CloudWatch alarms for throttling events and capacity changes.
  • Review Scaling History: Analyze past scaling activities to optimize configuration.
  • Account for Partition Limits: Remember individual partition limits (1,000 WCU, 3,000 RCU).
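
To make the interaction between target utilization and min/max bounds concrete, here is a minimal sketch of the arithmetic behind target tracking. It is a simplification under stated assumptions, not the actual Application Auto Scaling algorithm: the function name is hypothetical, and the real service also applies alarm evaluation periods before acting.

```python
def target_provisioned_capacity(consumed_units: int, target_percent: int,
                                min_capacity: int, max_capacity: int) -> int:
    """Capacity that would put consumption at the target utilization,
    clamped to the configured min/max bounds."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in the range 1..100")
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    # Integer ceiling division avoids floating-point rounding surprises.
    desired = (consumed_units * 100 + target_percent - 1) // target_percent
    return max(min_capacity, min(desired, max_capacity))


if __name__ == "__main__":
    # 700 consumed units at a 70% target need 1,000 provisioned units.
    print(target_provisioned_capacity(700, 70, 100, 4_000))    # 1000
    # A peak of 5,000 consumed units would need 7,143 units, but the
    # maximum bound caps it at 4,000, so throttling is still possible.
    print(target_provisioned_capacity(5_000, 70, 100, 4_000))  # 4000
```

The second call shows why the maximum bound must be sized for peak load: Auto Scaling cannot provision beyond it, regardless of demand.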

Auto Scaling Limitations

  • Auto Scaling takes time to react to traffic changes (not instantaneous).
  • Scale-up is faster than scale-down (scale-down is triggered only after consumption stays below the target for a sustained period of about 15 minutes).
  • Individual partitions have throughput limits (1,000 WCU, 3,000 RCU per partition).
  • Hot partitions can cause throttling even with sufficient overall capacity.
  • Auto Scaling cannot prevent throttling during sudden, extreme traffic spikes (use pre-warming).
  • For Global Tables, Auto Scaling settings are synchronized across replicas.
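
The hot-partition limitation can be illustrated with a small sketch. It assumes a simplified model in which write traffic per partition is known; in practice DynamoDB does not expose partitions this directly, so the partition labels, the traffic figures, and the function name are all hypothetical.

```python
MAX_WCU_PER_PARTITION = 1_000  # per-partition write limit quoted above


def find_hot_partitions(write_units_by_partition: dict,
                        limit: int = MAX_WCU_PER_PARTITION) -> list:
    """Return the partitions whose write rate exceeds the per-partition limit."""
    return sorted(
        partition
        for partition, units in write_units_by_partition.items()
        if units > limit
    )


if __name__ == "__main__":
    provisioned_wcu = 4_000
    # Hypothetical skewed workload: most writes land on one partition.
    traffic = {"p0": 1_800, "p1": 300, "p2": 200, "p3": 100}

    total = sum(traffic.values())
    print(total <= provisioned_wcu)      # True: overall capacity is sufficient
    print(find_hot_partitions(traffic))  # ['p0']: this partition still throttles
```

Total consumption (2,400 WCU) is well under the provisioned 4,000 WCU, yet `p0` exceeds its own limit, which is why a well-distributed partition key matters more than raising table capacity.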

Monitoring Auto Scaling

  • CloudWatch Metrics:
    • ConsumedReadCapacityUnits / ConsumedWriteCapacityUnits
    • ProvisionedReadCapacityUnits / ProvisionedWriteCapacityUnits
    • ReadThrottleEvents / WriteThrottleEvents
    • UserErrors (HTTP 400 errors; does not include throttling, which is reported by ThrottledRequests and the throttle-event metrics above)
  • Auto Scaling Activity: View scaling activities in the Application Auto Scaling console.
  • Warm Throughput Values: Monitor current warm throughput via DynamoDB console or APIs.
  • Alarms: Configure CloudWatch alarms for proactive monitoring.
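
As an illustration of alarming on throttling, the sketch below builds the arguments for a CloudWatch alarm on `WriteThrottleEvents`. The helper name, alarm name, table name, and threshold are assumptions chosen for the example; the keys are intended to match the CloudWatch `PutMetricAlarm` parameters, and should be verified against the current API reference before use.

```python
def build_throttle_alarm(table_name: str, threshold: int = 0) -> dict:
    """Build keyword arguments for a CloudWatch alarm on write throttling."""
    return {
        "AlarmName": f"{table_name}-write-throttle",  # hypothetical naming scheme
        "Namespace": "AWS/DynamoDB",
        "MetricName": "WriteThrottleEvents",
        "Dimensions": [{"Name": "TableName", "Value": table_name}],
        "Statistic": "Sum",
        "Period": 60,                 # evaluate one-minute sums
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # No data points means no throttling occurred, so do not alarm.
        "TreatMissingData": "notBreaching",
    }


if __name__ == "__main__":
    alarm = build_throttle_alarm("orders")
    print(alarm["AlarmName"])  # orders-write-throttle
    # With boto3 installed and credentials configured, this could be sent as:
    #   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

A matching alarm for reads would use `ReadThrottleEvents` with the same dimensions.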

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion, and correction.
  1. An application running on Amazon EC2 instances writes data synchronously to an Amazon DynamoDB table configured for 60 write capacity units. During normal operation, the application writes 50 KB/s to the table but can scale up to 500 KB/s during peak hours. The application is currently getting throttling errors from the DynamoDB table during peak hours. What is the MOST cost-effective change to support the increased traffic with minimal changes to the application?
    1. Use Amazon SNS to manage the write operations to the DynamoDB table
    2. Change DynamoDB table configuration to 600 write capacity units
    3. Increase the number of Amazon EC2 instances to support the traffic
    4. Configure Amazon DynamoDB Auto Scaling to handle the extra demand
  2. A company is planning a major product launch that will cause a 100x traffic spike to their DynamoDB table for 2 hours. They want to ensure the table can handle the load immediately without throttling. What should they do?
    1. Configure Auto Scaling with a high maximum capacity.
    2. Switch to on-demand capacity mode.
    3. Pre-warm the table using warm throughput before the launch.
    4. Manually increase provisioned capacity before the launch.
  3. A DynamoDB table with Auto Scaling configured is experiencing throttling despite having sufficient overall capacity. What is the MOST likely cause?
    1. Auto Scaling is not configured correctly.
    2. The target utilization is set too high.
    3. Hot partitions are exceeding per-partition throughput limits.
    4. CloudWatch alarms are not triggering properly.
  4. What is the recommended target utilization percentage for DynamoDB Auto Scaling?
    1. 50%
    2. 70%
    3. 90%
    4. 100%
  5. A company wants to minimize costs for a DynamoDB table with unpredictable traffic patterns. Which capacity mode should they choose?
    1. Provisioned capacity with Auto Scaling
    2. On-demand capacity mode
    3. Provisioned capacity with manual scaling
    4. Reserved capacity
  6. Which of the following statements about DynamoDB warm throughput are correct? (Select TWO)
    1. Warm throughput values are available at no cost.
    2. Warm throughput is only available for provisioned capacity mode.
    3. Pre-warming a table incurs a charge.
    4. Warm throughput cannot be used with on-demand capacity mode.
    5. Warm throughput eliminates the need for Auto Scaling.

References