Amazon DynamoDB Streams
- DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table.
- DynamoDB Streams is a serverless data streaming feature that makes it straightforward to track, process, and react to item-level changes in DynamoDB tables in near real-time.
- DynamoDB Streams retains the records for 24 hours, after which they are removed.
- DynamoDB Streams maintains an ordered sequence of the events per item; however, sequence across items is not maintained.
- Example:
- Suppose that you have a DynamoDB table tracking high scores for a game and that each item in the table represents an individual player. If you make the following three updates in this order:
- Update 1: Change Player 1’s high score to 100 points
- Update 2: Change Player 2’s high score to 50 points
- Update 3: Change Player 1’s high score to 125 points
- DynamoDB Streams maintains the order of Player 1's score events; however, it does not maintain order across players, so Player 2's event is not guaranteed to appear between the two Player 1 events.
- Applications can access this log and view the data items as they appeared before and after they were modified, in near-real time.
- DynamoDB Streams APIs help developers consume updates and receive the item-level data before and after items are changed.
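The per-item ordering guarantee above can be sketched in a few lines. The records below are simplified stand-ins for real stream records (field names trimmed for illustration), but the grouping logic shows how a consumer can safely apply changes per item, relying only on per-item order:

```python
# Simplified stand-ins for stream records, in the order they arrive.
# Within the same item key, stream order matches modification order.
records = [
    {"Keys": {"PlayerId": "1"}, "SequenceNumber": "100", "NewScore": 100},
    {"Keys": {"PlayerId": "2"}, "SequenceNumber": "200", "NewScore": 50},
    {"Keys": {"PlayerId": "1"}, "SequenceNumber": "300", "NewScore": 125},
]

def per_item_history(records):
    """Group records by item key; each per-key list is guaranteed to be
    in modification order, while order across keys is not."""
    history = {}
    for rec in records:
        key = rec["Keys"]["PlayerId"]
        history.setdefault(key, []).append(rec["NewScore"])
    return history

print(per_item_history(records))  # {'1': [100, 125], '2': [50]}
```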
DynamoDB Streams Features
- Streams allow reads at up to twice the rate of the provisioned write capacity of the DynamoDB table.
- Streams have to be enabled on a per-table basis. When enabled on a table, DynamoDB captures information about every modification to data items in the table.
- Streams support Encryption at rest to encrypt the data.
- Streams are designed so that every update made to the table is represented exactly once in the stream (no duplicates).
- Streams write stream records in near-real time so that applications can consume these streams and take action based on the contents.
- Stream records contain information about a data modification to a single item in a DynamoDB table.
- Each stream record has a sequence number that reflects the order in which the record was published to the stream.
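Since streams are enabled per table, a minimal sketch of enabling one via the real `UpdateTable` API is shown below. The client is injected (rather than created with `boto3.client("dynamodb")`) only so the sketch has no AWS dependency; the parameter shapes match the boto3 call:

```python
def enable_stream(client, table_name, view_type="NEW_AND_OLD_IMAGES"):
    """Enable a stream on an existing table via the UpdateTable API.
    `client` is expected to behave like boto3.client("dynamodb")."""
    return client.update_table(
        TableName=table_name,
        StreamSpecification={
            "StreamEnabled": True,
            # KEYS_ONLY | NEW_IMAGE | OLD_IMAGE | NEW_AND_OLD_IMAGES
            "StreamViewType": view_type,
        },
    )
```

In practice you would pass `boto3.client("dynamodb")` as `client`; the table name here is a placeholder.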
Stream View Types
- When enabling a stream on a table, you must specify the stream view type, which determines what information is written to the stream:
- KEYS_ONLY: Only the key attributes of the modified item.
- NEW_IMAGE: The entire item, as it appears after it was modified.
- OLD_IMAGE: The entire item, as it appeared before it was modified.
- NEW_AND_OLD_IMAGES: Both the new and the old images of the item (recommended for maximum flexibility).
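The view types map directly onto fields of the stream record. The simplified record below (values shortened, but the typed attribute-value layout mirrors real records) shows what each view type would carry for a single modification:

```python
# A simplified stream record as delivered with NEW_AND_OLD_IMAGES.
# Attribute values are typed ({"S": ...}, {"N": ...}) as in real records.
record = {
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"PlayerId": {"S": "1"}},
        "OldImage": {"PlayerId": {"S": "1"}, "Score": {"N": "100"}},
        "NewImage": {"PlayerId": {"S": "1"}, "Score": {"N": "125"}},
    },
}

# What each view type exposes for this modification:
keys_only = record["dynamodb"]["Keys"]        # KEYS_ONLY
new_image = record["dynamodb"]["NewImage"]    # NEW_IMAGE
old_image = record["dynamodb"]["OldImage"]    # OLD_IMAGE

print(old_image["Score"]["N"], "->", new_image["Score"]["N"])  # 100 -> 125
```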
Use Cases
- Multi-Region Replication: Keep other data stores up-to-date with the latest changes to DynamoDB (used by DynamoDB Global Tables).
- Real-time Analytics: Stream data to analytics services for real-time insights.
- Event-Driven Architectures: Trigger actions based on changes made to the table.
- Data Aggregation: Aggregate data from multiple tables into a single view.
- Audit and Compliance: Maintain audit logs of all changes to data.
- Search Index Updates: Keep search indexes (e.g., OpenSearch) synchronized with DynamoDB data.
- Cache Invalidation: Invalidate caches when data changes.
- Notifications: Send notifications when specific data changes occur.
Processing DynamoDB Streams
- Stream records can be processed using multiple methods:
AWS Lambda
- Most common and recommended approach for processing DynamoDB Streams.
- Lambda polls the stream and invokes the function synchronously when new records are available.
- Lambda automatically handles scaling, retries, and error handling.
- Supports batch processing of stream records.
- Can filter events using event filtering to reduce invocations and costs.
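A minimal handler along these lines might look as follows, assuming Python and that `ReportBatchItemFailures` is enabled on the event source mapping; `process` is a hypothetical placeholder for your business logic:

```python
def process(record):
    """Placeholder business logic; raise to signal a failed record."""
    print(record["eventName"], record["dynamodb"]["Keys"])

def handler(event, context):
    """Process a batch of DynamoDB stream records. Returning
    batchItemFailures (with ReportBatchItemFailures enabled) tells
    Lambda to retry only from the first failed record rather than
    reprocessing the entire batch."""
    failures = []
    for record in event["Records"]:
        try:
            process(record)
        except Exception:
            failures.append(
                {"itemIdentifier": record["dynamodb"]["SequenceNumber"]}
            )
    return {"batchItemFailures": failures}
```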
Kinesis Data Streams
- DynamoDB can stream change data directly to Amazon Kinesis Data Streams.
- Provides longer data retention (up to 365 days vs. 24 hours for DynamoDB Streams).
- Enables integration with Kinesis Data Firehose, Kinesis Data Analytics, and other Kinesis consumers.
- Supports fan-out to multiple consumers.
- Better for high-throughput scenarios requiring multiple consumers.
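Routing change data to Kinesis uses the real `EnableKinesisStreamingDestination` API; a sketch is below, with the client injected so the example carries no AWS dependency (pass `boto3.client("dynamodb")` in practice, and a real Kinesis stream ARN):

```python
def stream_to_kinesis(client, table_name, stream_arn):
    """Route a table's change data to a Kinesis data stream via the
    EnableKinesisStreamingDestination API. `client` is expected to
    behave like boto3.client("dynamodb")."""
    return client.enable_kinesis_streaming_destination(
        TableName=table_name,
        StreamArn=stream_arn,
    )
```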
Kinesis Client Library (KCL)
- KCL can be used to build custom applications that process DynamoDB Streams.
- DynamoDB Streams Kinesis Adapter allows KCL applications to consume DynamoDB Streams.
- KCL 3.0 Support (June 2025): DynamoDB Streams now supports Kinesis Client Library 3.0.
- Reduces compute costs to process streaming data by up to 33% compared to previous KCL versions.
- Improved load balancing algorithm based on CPU utilization.
- Enhanced performance and efficiency.
- Note: KCL 1.x reaches end-of-support on January 30, 2026. Migrate to KCL 3.x.
AWS PrivateLink Support (March 2025)
- Announced in March 2025, DynamoDB Streams now supports AWS PrivateLink.
- Allows invoking DynamoDB Streams APIs from within your Amazon VPC without traversing the public internet.
- Only interface endpoints are supported for DynamoDB Streams (gateway endpoints are not supported).
- Enables private connectivity for stream processing applications running on-premises or in other Regions.
- Supports FIPS endpoints in US and Canada commercial AWS Regions (announced November 2025).
- Enhances security by keeping stream data within the AWS network.
- Critical for compliance requirements that mandate private network connectivity.
- Can be accessed from on-premises via AWS Direct Connect or Site-to-Site VPN.
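Creating the interface endpoint uses the standard EC2 `CreateVpcEndpoint` API; a hedged sketch is below. The service name follows the usual PrivateLink naming pattern but is an assumption here and should be verified for your Region; the client is injected so the sketch stays self-contained:

```python
def create_streams_endpoint(ec2_client, vpc_id, subnet_ids, sg_ids,
                            region="us-east-1"):
    """Create an interface VPC endpoint for DynamoDB Streams (only
    interface endpoints are supported, not gateway endpoints).
    `ec2_client` is expected to behave like boto3.client("ec2")."""
    return ec2_client.create_vpc_endpoint(
        VpcId=vpc_id,
        VpcEndpointType="Interface",
        # Assumed service name following the PrivateLink pattern;
        # confirm the exact name for your Region before use.
        ServiceName=f"com.amazonaws.{region}.dynamodb-streams",
        SubnetIds=subnet_ids,
        SecurityGroupIds=sg_ids,
    )
```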
DynamoDB Streams vs. Kinesis Data Streams
- DynamoDB Streams:
- 24-hour data retention
- Automatically scales with table
- No additional cost (included with DynamoDB)
- Simpler to set up and use
- Best for simple event-driven architectures
- Kinesis Data Streams:
- Up to 365 days data retention
- Manual capacity management (or on-demand mode)
- Additional cost for Kinesis
- More complex but more flexible
- Best for multiple consumers and longer retention needs
- Recommendation: Use DynamoDB Streams for simple use cases with Lambda. Use Kinesis Data Streams for complex scenarios requiring multiple consumers or longer retention.
Best Practices
- Choose the Right View Type: Use NEW_AND_OLD_IMAGES for maximum flexibility unless you have specific requirements.
- Handle Duplicates: Although designed for no duplicates, implement idempotent processing logic.
- Monitor Stream Processing: Use CloudWatch metrics to monitor Lambda invocations, errors, and iterator age.
- Use Event Filtering: Filter events in Lambda to reduce unnecessary invocations and costs.
- Batch Processing: Configure appropriate batch sizes for Lambda to optimize throughput and cost.
- Error Handling: Implement proper error handling and configure dead-letter queues for failed records.
- Consider Kinesis for Multiple Consumers: If you need multiple consumers, use Kinesis Data Streams instead.
- Migrate to KCL 3.0: If using KCL, migrate to version 3.0 for cost savings and performance improvements.
- Use PrivateLink for Security: Enable AWS PrivateLink for enhanced security and compliance.
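As a concrete example of the event-filtering practice above, a Lambda event source mapping accepts a `FilterCriteria` structure whose patterns are JSON-encoded strings. The sketch below builds a pattern that delivers only INSERT events (skipping MODIFY/REMOVE invocations); it would be passed as the `FilterCriteria` parameter of `create_event_source_mapping`:

```python
import json

# FilterCriteria for a Lambda event source mapping: invoke the
# function only for INSERT events on the stream.
filter_criteria = {
    "Filters": [
        {"Pattern": json.dumps({"eventName": ["INSERT"]})}
    ]
}

# Each pattern must be a JSON-encoded string, not a nested object:
print(filter_criteria["Filters"][0]["Pattern"])  # {"eventName": ["INSERT"]}
```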
Limitations and Considerations
- Stream records are available for only 24 hours.
- Streams do not guarantee ordering across different items (only per-item ordering).
- Stream records are eventually consistent with the table.
- Enabling streams does not affect table performance.
- Streams cannot be enabled on tables with local secondary indexes that use non-key attributes in the projection.
- For Global Tables with MREC, streams are enabled by default and cannot be disabled.
- For Global Tables with MRSC, streams are not used for replication but can be enabled separately.
AWS Certification Exam Practice Questions
- Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
- AWS services are updated every day, so both the questions and answers might soon be outdated; research accordingly.
- AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
- Open to further feedback, discussion and correction.
- An application currently writes a large number of records to a DynamoDB table in one region. There is a requirement for a secondary application to retrieve new records written to the DynamoDB table every 2 hours and process the updates accordingly. Which of the following is an ideal way to ensure that the secondary application gets the relevant changes from the DynamoDB table?
- Insert a timestamp for each record and then scan the entire table for the timestamp as per the last 2 hours.
- Create another DynamoDB table with the records modified in the last 2 hours.
- Use DynamoDB Streams to monitor the changes in the DynamoDB table. (Correct)
- Transfer records to S3 which were modified in the last 2 hours.
- A company needs to process DynamoDB stream records from an on-premises application without exposing traffic to the public internet. What should they implement?
- Use a NAT gateway to access DynamoDB Streams.
- Create an interface VPC endpoint for DynamoDB Streams using AWS PrivateLink. (Correct)
- Create a gateway VPC endpoint for DynamoDB Streams.
- Use an internet gateway with security groups.
- A company wants to reduce costs for processing DynamoDB Streams using KCL. What should they do?
- Switch from KCL to Lambda for processing.
- Migrate from KCL 1.x to KCL 3.0 for up to 33% cost reduction. (Correct)
- Reduce the number of shards in the stream.
- Increase the batch size for stream processing.
- A company needs to maintain an audit log of all changes to a DynamoDB table for 90 days. DynamoDB Streams only retains data for 24 hours. What is the BEST solution?
- Enable PITR on the DynamoDB table.
- Stream DynamoDB changes to Kinesis Data Streams with 90-day retention. (Correct)
- Use Lambda to copy stream records to S3 every 24 hours.
- Create on-demand backups every 24 hours.
- A developer wants to capture both the old and new values of items when they are modified in a DynamoDB table. Which stream view type should they configure?
- KEYS_ONLY
- NEW_IMAGE
- OLD_IMAGE
- NEW_AND_OLD_IMAGES (Correct)
- Which of the following statements about DynamoDB Streams are correct? (Select TWO)
- Stream records are available for 24 hours. (Correct)
- Streams guarantee ordering across all items in the table.
- Streams maintain an ordered sequence of events per item. (Correct)
- Streams can be processed only by Lambda functions.
- Enabling streams impacts table write performance.
- A company has multiple applications that need to process the same DynamoDB change events. What is the BEST approach?
- Create multiple Lambda functions triggered by the same DynamoDB Stream.
- Stream DynamoDB changes to Kinesis Data Streams and use multiple consumers. (Correct)
- Enable multiple DynamoDB Streams on the same table.
- Use DynamoDB Streams with fan-out to multiple Lambda functions.