AWS Data Analytics Services Cheat Sheet

📋 Last Updated: June 2026. This post has been updated to reflect service renamings (Kinesis Data Firehose → Amazon Data Firehose, Kinesis Data Analytics → Amazon Managed Service for Apache Flink, Elasticsearch → OpenSearch Service, QuickSight → Quick Suite), deprecations (AWS Data Pipeline, Kinesis Data Analytics for SQL), and major new features (Zero-ETL integrations, MSK Express brokers, Glue 5.0, SageMaker Lakehouse).

AWS Data Analytics Services

Kinesis Data Streams – KDS

  • enables real-time processing of streaming data at a massive scale
  • provides ordering of records per shard
  • provides an ability to read and/or replay records in the same order
  • allows multiple applications to consume the same data
  • data is replicated across three data centers within a region
  • data is preserved for 24 hours, by default, and can be extended to 365 days
  • data inserted in Kinesis can’t be deleted (immutability); it only expires at the end of the retention period
  • streams can be scaled using multiple shards, based on the partition key
  • each shard provides the capacity of 1MB/sec data input and 2MB/sec data output with 1000 PUT requests per second
  • supports two capacity modes:
    • Provisioned mode – you manage the number of shards
    • On-demand mode – automatically scales to accommodate up to 10 GB/s write and 20 GB/s read throughput per stream
  • On-demand Advantage mode (launched Nov 2025) – enables on-demand streams to handle instant throughput increases via warm throughput capability, with up to 10GB/s or 10 million events/second, eliminating over-provisioning needs and offering 60%+ cost savings for consistent workloads
  • supports record sizes up to 10 MiB (increased from 1 MiB in Oct 2025)
  • supports up to 50 enhanced fan-out consumers per stream (increased from 20 in Nov 2025)
  • Kinesis vs SQS
    • real-time processing of streaming big data vs reliable, highly scalable hosted queue for storing messages
    • ordered records, with the ability to read and/or replay records in the same order vs no ordering guarantee with standard queues (FIFO queues preserve order)
    • data retained for 24 hours by default, extendable to 365 days vs messages retained from 1 minute up to 14 days, but removed once deleted by the consumer.
    • supports multiple consumers on the same stream vs a single consumer per message, requiring multiple queues to deliver messages to multiple consumers.
  • Kinesis Producer
    • API
      • PutRecord and PutRecords are synchronous
      • PutRecords uses batching and increases throughput
      • might throw ProvisionedThroughputExceededException when writes exceed a shard’s limits. Mitigate with retries and backoff, resharding, or a better-distributed partition key.
    • KPL
      • producer supports synchronous or asynchronous use cases
      • supports inbuilt batching and retry mechanism
      • ⚠️ KPL 0.x reached end-of-support on January 30, 2026. Migrate to KPL 1.x.
    • Kinesis Agent can help monitor log files and send them to KDS
    • supports third-party libraries like Spark, Flume, Kafka connect, etc.
  • Kinesis Consumers
    • Kinesis SDK
      • Records are polled by consumers from a shard
    • Kinesis Client Library (KCL)
      • Read records from Kinesis produced with the KPL (de-aggregation)
      • supports the checkpointing feature to keep track of the application’s state and resume progress using the DynamoDB table.
      • if application receives provisioned-throughput exceptions, increase the provisioned throughput for the DynamoDB table
      • ⚠️ KCL 1.x reached end-of-support on January 30, 2026. Migrate to KCL 2.x.
    • Kinesis Connector Library – can be replaced using Firehose or Lambda
    • Third-party libraries: Spark, Log4J Appenders, Flume, Kafka Connect…
    • Amazon Data Firehose, AWS Lambda
    • Kinesis Consumer Enhanced Fan-Out
      • supports Multiple Consumer applications for the same Stream
      • provides Low Latency ~70ms
      • Higher costs
      • now supports up to 50 consumers per stream
  • Kinesis Security
    • allows access/authorization control using IAM policies
    • supports Encryption in flight using HTTPS endpoints
    • supports data encryption at rest using either server-side encryption with KMS or using client-side encryption before pushing the data to data streams.
    • supports VPC Endpoints to access within VPC
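
The per-shard limits listed above (1 MB/s in, 2 MB/s out, 1,000 PUT records per second) give a simple sizing rule for provisioned mode. The following is a minimal sketch of that arithmetic; the function name and the example workload are illustrative assumptions and are not part of any AWS SDK.

```python
import math

# Per-shard limits as quoted in this cheat sheet.
WRITE_MB_PER_SHARD = 1.0
READ_MB_PER_SHARD = 2.0
RECORDS_PER_SHARD = 1000


def estimate_shards(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    by_write = math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD)
    by_records = math.ceil(records_per_sec / RECORDS_PER_SHARD)
    by_read = math.ceil(read_mb_per_sec / READ_MB_PER_SHARD)
    return max(1, by_write, by_records, by_read)


# Hypothetical workload: 4.5 MB/s written, 3,000 records/s, and two polling
# consumers that each read the whole stream (9 MB/s of shared read traffic).
print(estimate_shards(4.5, 3000, 9.0))  # prints 5
```

The largest of the three requirements wins. Enhanced fan-out consumers get their own read throughput, so they would not count toward the shared read figure in this estimate.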

Amazon Data Firehose

(Previously known as Amazon Kinesis Data Firehose, renamed February 2024)

  • data transfer solution for delivering near real-time streaming data to destinations such as S3, Redshift, OpenSearch Service, Splunk, Snowflake, and other 3rd-party analytics services.
  • is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration
  • is Near Real Time (min. 60 secs) as it buffers incoming streaming data to a certain size or for a certain period of time before delivering it
  • supports batching, compression, and encryption of the data before loading it, which reduces the storage used at the destination and increases security
  • supports GZIP, ZIP, and SNAPPY compression formats; only GZIP is supported if the data is further loaded to Redshift.
  • supports out-of-the-box data transformation as well as custom transformation using a Lambda function to transform incoming source data and deliver the transformed data to destinations
  • uses at-least-once semantics for data delivery.
  • supports multiple producers as data sources, including Kinesis Data Streams, KPL, Kinesis Agent, the Data Firehose API via the AWS SDK, CloudWatch Logs, CloudWatch Events, and AWS IoT
  • does NOT support consumers like Spark and KCL
  • supports interface VPC endpoint to keep traffic between the VPC and Data Firehose from leaving the Amazon network.
  • Apache Iceberg Tables destination (launched 2024) – delivers streaming data directly into Apache Iceberg format tables in S3 and S3 Tables, supporting record routing to different Iceberg tables, CDC replication from databases, schema evolution, and ACID transactions.
  • Database CDC replication (Preview 2024) – supports continuous replication of database changes from MySQL and PostgreSQL directly into Apache Iceberg Tables in S3.
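
The "buffer to a certain size or for a certain period of time" behaviour described above can be pictured as a buffer with two flush triggers. This is a toy model only, not how Data Firehose is implemented; the class name, thresholds, and `deliver` callback are assumptions made for illustration.

```python
import time


class SizeOrTimeBuffer:
    """Toy buffer: flush when buffered bytes or buffer age crosses a threshold."""

    def __init__(self, max_bytes, max_seconds, deliver, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.deliver = deliver      # callback that receives one batch (a list)
        self.clock = clock          # injectable so the timing can be tested
        self.records = []
        self.size = 0
        self.started = None         # time the first buffered record arrived

    def put(self, record: bytes):
        if self.started is None:
            self.started = self.clock()
        self.records.append(record)
        self.size += len(record)
        self.flush_if_due()

    def flush_if_due(self):
        if not self.records:
            return
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.started >= self.max_seconds
        if too_big or too_old:
            self.deliver(self.records)
            self.records, self.size, self.started = [], 0, None


batches = []
buf = SizeOrTimeBuffer(max_bytes=10, max_seconds=60, deliver=batches.append)
buf.put(b"abcd")      # 4 bytes: below both thresholds, nothing delivered
buf.put(b"efghij")    # 10 bytes in total: size threshold reached, batch delivered
print(len(batches))   # prints 1
```

In this sketch the time trigger is only checked when `flush_if_due` is called, so a real system would also call it from a timer. Whichever threshold is hit first causes the delivery, which is why latency is bounded by the buffer interval.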

Kinesis Data Streams vs Amazon Data Firehose

Amazon Managed Service for Apache Flink

(Previously known as Amazon Kinesis Data Analytics, renamed August 2023)

⚠️ Kinesis Data Analytics for SQL was discontinued on January 27, 2026. Migrate to Amazon Managed Service for Apache Flink or Apache Flink Studio for real-time stream processing.

  • helps analyze streaming data, gain actionable insights, and respond to the business and customer needs in real time.
  • is a fully managed and serverless service for building and running real-time streaming applications using Apache Flink.
  • reduces the complexity of building, managing, and integrating streaming applications with other AWS services.
  • supports Apache Flink applications written in Java, Scala, Python, and SQL.
  • provides automatic scaling, high availability, and exactly-once processing semantics.
  • integrates with Kinesis Data Streams, Amazon MSK, and Amazon S3 as data sources and sinks.

Managed Streaming for Kafka – MSK

  • Amazon Managed Streaming for Apache Kafka (MSK) is an AWS streaming data service that manages Apache Kafka infrastructure and operations.
  • makes it easy for developers and DevOps managers to run Kafka applications and Kafka Connect connectors on AWS, without the need to become experts in operating Kafka.
  • operates, maintains, and scales Kafka clusters, provides enterprise-grade security features out of the box, and has built-in AWS integrations that accelerate development of streaming data applications.
  • always runs within a VPC managed by MSK and is made available to your own selected VPC, subnets, and security group when the cluster is set up.
  • IP addresses from the VPC are attached to the MSK resources through elastic network interfaces (ENIs), and all network traffic stays within the AWS network and is not accessible to the internet by default.
  • integrates with CloudWatch for monitoring, metrics, and logging.
  • MSK Serverless is a cluster type for MSK that makes it easy for you to run Apache Kafka clusters without having to manage compute and storage capacity.
  • MSK Express Brokers (GA November 2024) – a new broker type for MSK Provisioned designed to deliver:
    • up to 3x more throughput per broker (500 MBps ingress, 1000 MBps egress on m7g instances)
    • up to 20x faster scaling
    • 90% faster recovery from failures
    • up to 5x more partitions per broker
    • virtually unlimited storage with instant storage scaling
    • supports Intelligent Rebalancing for 180x faster operation performance
  • supports EBS server-side encryption using KMS to encrypt storage.
  • supports encryption in transit enabled via TLS for inter-broker communication.
  • For provisioned clusters, you have three authentication (AuthN) and authorization (AuthZ) options:
    • IAM Access Control for both AuthN/Z (recommended),
    • TLS certificate authentication (CA) for AuthN and access control lists for AuthZ
    • SASL/SCRAM for AuthN and access control lists for AuthZ.
  • For serverless clusters, IAM Access Control can be used for both authentication and authorization.

Redshift

  • Redshift is a fast, fully managed data warehouse
  • provides simple and cost-effective solutions to analyze all the data using standard SQL and the existing Business Intelligence (BI) tools.
  • manages the work needed to set up, operate, and scale a data warehouse, from provisioning the infrastructure capacity to automating ongoing administrative tasks such as backups, and patching.
  • automatically monitors your nodes and drives to help you recover from failures.
  • originally supported only Single-AZ deployments, but now also supports Multi-AZ deployments.
  • replicates all the data within the data warehouse cluster when it is loaded and also continuously backs up your data to S3.
  • attempts to maintain at least three copies of your data (the original and replica on the compute nodes and a backup in S3).
  • supports cross-region snapshot replication to another region for disaster recovery
  • Redshift supports four distribution styles: AUTO, EVEN, KEY, and ALL.
    • KEY distribution uses a single column as the distribution key (DISTKEY) and places matching values on the same node slice
    • EVEN distribution distributes the rows across the slices in a round-robin fashion, regardless of the values in any particular column
    • ALL distribution replicates the whole table to every compute node.
    • AUTO distribution lets Redshift assign an optimal distribution style based on the size of the table data
  • Redshift supports Compound and Interleaved sort keys
    • Compound key
      • is made up of all of the columns listed in the sort key definition, in the order they are listed, and is most efficient when query predicates (filters and joins) use a prefix of the sort key, i.e., a subset of the sort key columns in order.
    • Interleaved sort key
      • gives equal weight to each column in the sort key, so query predicates can use any subset of the columns that make up the sort key, in any order.
      • Not ideal for monotonically increasing attributes
  • Import/Export Data
    • UNLOAD helps copy data from Redshift table to S3
    • COPY command
      • helps copy data from S3 to Redshift
      • also supports EMR, DynamoDB, remote hosts using SSH
      • parallelized and efficient
      • can decrypt data as it is loaded from S3, if the data is encrypted
      • DON’T use multiple concurrent COPY commands to load one table from multiple files as Redshift is forced to perform a serialized load, which is much slower.
      • supports decompressing data, if the data is compressed.
    • Split the Load Data into Multiple Files
    • Load the data in sort key order to avoid needing to vacuum.
    • Use a Manifest File
      • provides Data consistency, to avoid S3 eventual consistency issues
      • helps specify different S3 locations in a more efficient way than with the use of S3 prefixes.
  • Zero-ETL Integrations (2024-2025)
    • enable near real-time analytics by connecting operational databases and applications to Redshift without building data pipelines
    • supports integrations from Aurora (MySQL/PostgreSQL), DynamoDB, RDS, and third-party applications (Salesforce, SAP, Zendesk)
    • works with both Redshift Serverless workgroups and provisioned clusters using RA3 instance types
    • includes SQL features: QUERY_ALL_STATES, TRUNCATECOLUMNS, and ACCEPTINVCHARS for zero-ETL data handling
    • integrates with Amazon SageMaker Lakehouse for unified analytics and AI/ML
  • Redshift Distribution Style determines how data is distributed across compute nodes and helps minimize the impact of the redistribution step by locating the data where it needs to be before the query is executed.
  • Redshift Enhanced VPC routing forces all COPY and UNLOAD traffic between the cluster and the data repositories through the VPC.
  • Workload management (WLM) enables users to flexibly manage priorities within workloads so that short, fast-running queries won’t get stuck in queues behind long-running queries.
  • Redshift Spectrum helps query and retrieve structured and semistructured data from files in S3 without having to load the data into Redshift tables.
    • Redshift Spectrum external tables are read-only. You can’t COPY or INSERT to an external table.
  • Federated Query feature allows querying and analyzing data across operational databases, data warehouses, and data lakes.
  • Short query acceleration (SQA) prioritizes selected short-running queries ahead of longer-running queries.
  • Redshift Serverless is a serverless option of Redshift that makes it more efficient to run and scale analytics in seconds without the need to set up and manage data warehouse infrastructure.
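
The COPY guidance above (split the load into multiple files, list them in a manifest, and load them with a single COPY rather than several concurrent ones) can be sketched as follows. The helper names, bucket, table, and role ARN are hypothetical; the manifest layout (`entries` with `url` and `mandatory`) and the `MANIFEST` and `GZIP` COPY options are used here on the assumption that the files are gzip-compressed.

```python
import json


def build_manifest(s3_urls, mandatory=True):
    """Return the JSON text of a COPY manifest listing each file explicitly."""
    entries = [{"url": url, "mandatory": mandatory} for url in s3_urls]
    return json.dumps({"entries": entries}, indent=2)


def build_copy_statement(table, manifest_url, iam_role_arn):
    """Return one COPY statement that loads every file named in the manifest."""
    return (
        f"COPY {table}\n"
        f"FROM '{manifest_url}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "GZIP\n"
        "MANIFEST;"
    )


# Hypothetical split files, possibly under different prefixes or buckets.
files = [
    "s3://example-bucket/sales/part-0000.gz",
    "s3://example-bucket/sales/part-0001.gz",
    "s3://example-archive/sales-2025/part-0000.gz",
]
print(build_manifest(files))
print(build_copy_statement(
    "sales",
    "s3://example-bucket/manifests/sales.manifest",
    "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
))
```

The manifest would be uploaded to S3 before running the statement. The string formatting here is for illustration only and should not be used with untrusted input.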

EMR

  • is a web service that utilizes a hosted Hadoop framework running on the web-scale infrastructure of EC2 and S3
  • launches all nodes for a given cluster in the same Availability Zone, which improves performance as it provides a higher data access rate.
  • seamlessly supports Reserved, On-Demand, and Spot Instances
  • consists of Master/Primary Node for management and Slave nodes, which consist of Core nodes holding data and providing compute and Task nodes for performing tasks only.
  • is fault tolerant for slave node failures and continues job execution if a slave node goes down
  • supports Persistent and Transient cluster types
    • Persistent EMR clusters continue to run after the data processing job is complete
    • Transient EMR clusters shut down when the job or the steps (series of jobs) are complete
  • supports EMRFS which allows S3 to be used as a durable HA data storage
  • EMR Serverless helps run big data frameworks such as Apache Spark and Apache Hive without configuring, managing, and scaling clusters.
    • now supports Spark Connect (2026) for interactive PySpark development from local environments, IDEs, and SageMaker Unified Studio Notebooks
    • eliminates local storage provisioning, reducing costs by up to 20%
  • Apache Spark 4.0 support (GA 2026) – includes VARIANT data type, state-management improvements, and Spark Connect, with EMR optimized runtime running workloads up to 4.5x faster than open-source Spark
  • EMR Studio is an IDE that helps data scientists and data engineers to develop, visualize, and debug data engineering and data science applications written in R, Python, Scala, and PySpark.
  • EMR Notebooks provide a managed environment, based on Jupyter Notebook, that helps prepare and visualize data, collaborate with peers, build applications, and perform interactive analysis using EMR clusters.

Glue

  • AWS Glue is a fully managed ETL service that automates the time-consuming steps of data preparation for analytics.
  • is serverless and supports a pay-as-you-go model.
  • handles provisioning, configuration, and scaling of the resources required to run the ETL jobs on a fully managed, scale-out Apache Spark environment.
  • helps set up, orchestrate, and monitor complex data flows.
  • supports custom Scala or Python code and allows importing custom libraries and JAR files into AWS Glue ETL jobs to access data sources not natively supported by AWS Glue.
  • supports server side encryption for data at rest and SSL for data in motion.
  • provides development endpoints to edit, debug, and test the code it generates.
  • AWS Glue natively supports data stored in RDS, Redshift, DynamoDB, S3, MySQL, Oracle, Microsoft SQL Server, and PostgreSQL databases in the VPC running on EC2 and Data streams from MSK, Kinesis Data Streams, and Apache Kafka.
  • Glue ETL engine extracts, transforms, and loads data, and can automatically generate Scala or Python code.
  • AWS Glue 5.0/5.1 (2024-2026):
    • provides performance-optimized Apache Spark 3.5 runtime for batch and stream processing
    • native support for open table formats: Apache Iceberg, Delta Lake, and Apache Hudi
    • Spark-native fine-grained access control (FGAC) integration with AWS Lake Formation
    • faster job start times and automatic partition pruning
    • Glue 5.1 adds support for Apache Iceberg format version 3.0, deletion vectors, and row lineage tracking
    • new worker types: G.12X, G.16X general compute, and R.1X/R.2X/R.4X/R.8X memory-optimized
  • Glue Materialized Views (2025) – Apache Iceberg-based materialized views for transforming data and accelerating query performance
  • supports generative AI assistance for data integration tasks
  • Glue Data Catalog is a central repository and persistent metadata store to store structural and operational metadata for all the data assets.
  • Glue Crawlers scan various data stores to automatically infer schemas and partition structures to populate the Data Catalog with corresponding table definitions and statistics.
  • Glue Job Bookmark tracks data that has already been processed during a previous run of an ETL job by persisting state information from the job run.
  • AWS Glue Streaming ETL enables performing ETL operations on streaming data using continuously-running jobs.
  • Glue provides a flexible scheduler that handles dependency resolution, job monitoring, and retries.
  • Glue Studio offers a graphical interface for authoring AWS Glue jobs to process data allowing you to define the flow of the data sources, transformations, and targets in the visual interface and generating Apache Spark code on your behalf.
  • Glue Data Quality helps reduce manual data quality effort by automatically measuring and monitoring the quality of data in data lakes and pipelines.
  • Glue DataBrew is a visual data preparation tool that makes it easy for data analysts and data scientists to prepare, visualize, clean, and normalize terabytes, and even petabytes of data directly from your data lake, data warehouses, and databases, including S3, Redshift, Aurora, and RDS.
  • ⚠️ AWS Glue for Ray will no longer accept new customers starting April 30, 2026. Existing customers can continue using the service. Explore Amazon EKS for similar capabilities.
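
The job bookmark idea described above (persist state from one run so the next run skips data that was already processed) can be illustrated with a small stand-alone sketch. This is a conceptual model only and says nothing about how Glue stores bookmark state internally; the function name, state file, and object keys are assumptions.

```python
import json
import os
import tempfile
from pathlib import Path


def run_incremental(available_keys, state_file, process):
    """Process only keys not recorded by a previous run, then persist state."""
    path = Path(state_file)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    new_keys = [key for key in sorted(available_keys) if key not in seen]
    for key in new_keys:
        process(key)
    # Persist the union so the next run knows what has been handled.
    path.write_text(json.dumps(sorted(seen | set(new_keys))))
    return new_keys


with tempfile.TemporaryDirectory() as tmp:
    state = os.path.join(tmp, "bookmark.json")
    # First run sees one object; second run sees the old object plus a new one.
    first = run_incremental({"2026/01/a.csv"}, state, print)
    second = run_incremental({"2026/01/a.csv", "2026/01/b.csv"}, state, print)

print(first, second)
```

The second run processes only `2026/01/b.csv`. In Glue itself, bookmarks are switched on per job with the `--job-bookmark-option` job parameter set to `job-bookmark-enable`.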

Lake Formation

  • AWS Lake Formation helps create secure data lakes, making data available for wide-ranging analytics.
  • is an integrated data lake service that helps to discover, ingest, clean, catalog, transform, and secure data and make it available for analysis and ML.
  • automatically manages access to the registered data in S3 through services including AWS Glue, Athena, Redshift, QuickSight, and EMR using Zeppelin notebooks with Apache Spark to ensure compliance with your defined policies.
  • helps configure and manage your data lake without manually integrating multiple underlying AWS services.
  • uses a shared infrastructure with AWS Glue, including console controls, ETL code creation and job monitoring, blueprints to create workflows for data ingest, the same data catalog, and a serverless architecture.
  • can manage data ingestion through AWS Glue. Data is automatically classified, and relevant data definitions, schema, and metadata are stored in the central Glue Data Catalog. Once the data is in the S3 data lake, access policies, including table-and-column-level access controls can be defined, and encryption for data at rest enforced.
  • integrates with IAM so authenticated users and roles can be automatically mapped to data protection policies that are stored in the data catalog. The IAM integration also supports Microsoft Active Directory or LDAP to federate into IAM using SAML.
  • helps centralize data access policy controls. Users and roles can be defined to control access, down to the table and column level.
  • supports fine-grained access control (FGAC) including row-level and cell-level security, and tag-based access control (LF-Tags) for scalable permission management.
  • supports private endpoints in the VPC and records all activity in AWS CloudTrail for network isolation and auditability.
  • ⚠️ Lake Formation’s Governed Tables feature was deprecated in February 2025. Use Apache Iceberg tables with Lake Formation for transactional data lake capabilities.
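
Tag-based access control (LF-Tags), mentioned above, scales because permissions are granted on tag expressions rather than on individual tables. The following toy model shows the matching logic only; it is not the Lake Formation API, and every table, tag, and principal name is hypothetical.

```python
def tag_expression_matches(resource_tags, expression):
    """True if, for every key in the expression, the resource's value is allowed."""
    return all(
        resource_tags.get(key) in allowed for key, allowed in expression.items()
    )


def accessible_tables(tables, grants, principal):
    """List tables the principal can read through any of its tag-expression grants."""
    expressions = grants.get(principal, [])
    return sorted(
        name
        for name, tags in tables.items()
        if any(tag_expression_matches(tags, expr) for expr in expressions)
    )


# Hypothetical catalog: each table carries tags instead of per-table grants.
tables = {
    "orders": {"domain": "sales", "sensitivity": "public"},
    "leads": {"domain": "sales", "sensitivity": "confidential"},
    "payroll": {"domain": "hr", "sensitivity": "confidential"},
}
# One grant: the analyst may read anything tagged domain=sales AND sensitivity=public.
grants = {"analyst": [{"domain": {"sales"}, "sensitivity": {"public"}}]}

print(accessible_tables(tables, grants, "analyst"))  # prints ['orders']
```

A new table tagged `domain=sales` and `sensitivity=public` would become visible to the analyst without any new grant, which is the point of managing permissions by tag.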

Amazon Quick Suite (formerly QuickSight)

(Amazon QuickSight evolved to Amazon Quick Suite on October 9, 2025, expanding from a single BI product to a comprehensive analytics and AI platform)

  • is a cloud-powered business analytics service that integrates BI capabilities with AI agents for business insights, deep research, and automation in one unified experience.
  • delivers fast and responsive query performance by using a robust in-memory engine (SPICE).
    • “SPICE” stands for a Super-fast, Parallel, In-memory Calculation Engine
    • can also be configured to keep the data in SPICE up-to-date as the data in the underlying sources change.
    • automatically replicates data for high availability and enables Quick Suite to scale to support users to perform simultaneous fast interactive analysis across a wide variety of AWS data sources.
  • Amazon Q in QuickSight (GA April 2024) – generative BI capabilities powered by Amazon Bedrock:
    • multi-visual data Q&A for asking questions of data not in dashboards
    • executive summaries for quick trend and insight discovery
    • automated stories – documents and slides explaining data
    • natural language generation for pixel-perfect reports
    • available to all Enterprise Edition users without separate Q add-on
  • supports
    • Excel files and flat files like CSV, TSV, CLF, ELF
    • on-premises databases like PostgreSQL, SQL Server and MySQL
    • SaaS applications like Salesforce
    • and AWS data sources such as Redshift, RDS, Aurora, Athena, and S3
  • supports various functions to format and transform the data.
  • supports assorted visualizations that facilitate different analytical approaches:
    • Comparison and distribution – Bar charts (several assorted variants)
    • Changes over time – Line graphs, Area line charts
    • Correlation – Scatter plots, Heat maps
    • Aggregation – Pie graphs, Tree maps
    • Tabular – Pivot tables
  • Amazon Quick Sight (a capability within Quick Suite) now offers a visual data preparation experience for advanced data transformations without code.

Data Pipeline

⚠️ AWS Data Pipeline – No Longer Available to New Customers (July 25, 2024)

AWS Data Pipeline is in maintenance mode and is no longer available to new customers. Console access was removed on April 30, 2023. Existing customers can continue to use the service via CLI and API.

Migration Options:

  • Amazon MWAA (Managed Workflows for Apache Airflow) – for complex workflow orchestration
  • AWS Step Functions – for serverless workflow orchestration
  • AWS Glue – for ETL-focused data movement pipelines
  • Amazon EventBridge – for event-driven scheduling

Service overview (for existing customers):

  • is an orchestration service that helps define data-driven workflows to automate and schedule regular data movement and data processing activities
  • integrates with on-premises and cloud-based storage systems
  • allows scheduling, retry, and failure logic for the workflows

Amazon OpenSearch Service

(Previously known as Amazon Elasticsearch Service, renamed September 8, 2021)

  • Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud.
  • OpenSearch Service provides
    • real-time, distributed search and analytics engine
    • ability to provision all the resources for an OpenSearch cluster and launch the cluster
    • easy-to-use cluster scaling options. Scaling an OpenSearch Service domain by adding or modifying instances and storage volumes is an online operation that does not require any downtime.
    • self-healing clusters, which automatically detect and replace failed nodes, reducing the overhead associated with self-managed infrastructures
    • domain snapshots to back up and restore domains and replicate domains across AZs
    • enhanced security with IAM, Network, Domain access policies, and fine-grained access control
    • storage volumes for the data using EBS volumes
    • ability to span cluster nodes across multiple AZs in the same region, known as zone awareness, for high availability and redundancy. OpenSearch Service automatically distributes the primary and replica shards across instances in different AZs.
    • dedicated master nodes to improve cluster stability
    • data visualization using OpenSearch Dashboards (formerly Kibana)
    • integration with CloudWatch for monitoring domain metrics
    • integration with CloudTrail for auditing configuration API calls to domains
    • integration with S3, Kinesis, and DynamoDB for loading streaming data
    • ability to handle structured and unstructured data
    • supports encryption at rest through KMS, node-to-node encryption over TLS, and the ability to require clients to communicate with HTTPS
  • Amazon OpenSearch Serverless
    • automatically scales without managing infrastructure
    • NextGen architecture (2026) – decoupled compute from storage, provisions in seconds, scales to zero when idle, up to 20x faster autoscaling, and up to 60% lower cost than provisioned clusters
    • two collection architectures: Classic (original) and NextGen (default for new collections)
  • Vector Database Capabilities
    • stores vector embeddings from LLMs for semantic/similarity search
    • supports hybrid search combining vector, lexical, and agentic retrieval
    • GPU-accelerated vector indexes for billion-scale databases (2025)
    • auto-optimized vector indexes for search quality/speed/cost tradeoffs
    • integrates with Amazon Bedrock for RAG and agentic AI applications
  • Zero-ETL integrations – direct data access from other AWS services without pipeline management
  • Extended Support – Standard Support ends Nov 7, 2025 for legacy Elasticsearch versions up to 6.7, ES 7.1-7.8, and OpenSearch 1.0-1.2
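
The similarity search over vector embeddings described above comes down to ranking stored vectors by their closeness to a query vector. The following brute-force sketch uses cosine similarity on tiny hand-made vectors; it does not call OpenSearch, and the document IDs and embeddings are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Brute-force k-nearest-neighbour search over {doc_id: embedding}."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions.
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.0, 0.0], index))  # prints ['doc-a', 'doc-b']
```

Comparing the query against every vector does not scale to billions of embeddings, which is why vector engines rely on approximate nearest-neighbour indexes instead of the exhaustive scan shown here.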

Athena

  • Amazon Athena is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats.
  • provides a simplified, flexible way to analyze petabytes of data in an S3 data lake and 30 data sources, including on-premises data sources or other cloud systems using SQL or Python without loading the data.
  • is built on open-source Trino and Presto engines and Apache Spark frameworks, with no provisioning or configuration effort required.
  • is highly available and runs queries using compute resources across multiple facilities, automatically routing queries appropriately if a particular facility is unreachable
  • can process unstructured, semi-structured, and structured datasets.
  • integrates with QuickSight for visualizing the data or creating dashboards.
  • supports various standard data formats, including CSV, TSV, JSON, ORC, Avro, and Parquet.
  • supports compressed data in Snappy, Zlib, LZO, and GZIP formats. You can improve performance and reduce costs by compressing, partitioning, and using columnar formats.
  • can handle complex analysis, including large joins, window functions, and arrays
  • uses a managed Glue Data Catalog to store information and schemas about the databases and tables that you create for the data stored in S3
  • uses schema-on-read technology, which means that the table definitions are applied to the data in S3 when queries are being applied. There’s no data loading or transformation required. Table definitions and schema can be deleted without impacting the underlying data stored in S3.
  • supports fine-grained access control with AWS Lake Formation which allows for centrally managing permissions and access control for data catalog resources in the S3 data lake.
  • integrates with Amazon SageMaker Lakehouse for governed federated queries across data sources
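
Schema-on-read, as described above, means the table definition is applied to the raw data at query time and the stored data itself is never changed. A minimal stand-alone illustration, with an in-memory string standing in for an S3 object and a made-up table definition:

```python
import csv
import io

# Raw file content standing in for an object in S3; it is never modified.
RAW_OBJECT = "1,alice,34.50\n2,bob,12.00\n3,carol,99.90\n"

# A "table definition": column names and types applied only while reading.
TABLE_DEFINITION = [("order_id", int), ("customer", str), ("amount", float)]


def scan(raw_text, definition):
    """Yield typed rows by applying the definition to each raw line."""
    for fields in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(definition, fields)}


rows = list(scan(RAW_OBJECT, TABLE_DEFINITION))
total = sum(row["amount"] for row in rows)
print(rows[0])
print(round(total, 2))

# Dropping or replacing the definition leaves RAW_OBJECT untouched; a different
# definition can be applied to the same bytes on the next query.
as_text = list(scan(RAW_OBJECT, [("c1", str), ("c2", str), ("c3", str)]))
```

Because nothing is loaded or converted up front, the work happens on every scan, which is why compressing, partitioning, and columnar formats reduce both query time and cost.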

Amazon SageMaker Lakehouse

(Launched at re:Invent 2024, GA March 2025)

  • part of the next generation of Amazon SageMaker – a unified platform for data, analytics, and AI
  • unifies data across S3 data lakes (including S3 Tables), Redshift data warehouses, and operational databases
  • supports zero-ETL integrations from Aurora, DynamoDB, RDS, and third-party applications (Salesforce, SAP, Zendesk) for near real-time data access
  • enables querying, analyzing, and joining data using Redshift, Athena, EMR, and AWS Glue
  • provides unified access through Amazon SageMaker Unified Studio – a single development experience for data engineers, data scientists, and analysts
  • supports Apache Iceberg open table format for interoperability
  • integrates with Lake Formation for fine-grained access control and governance

AWS Database Services Cheat Sheet

AWS Database Services

📋 Last Updated: June 2026

This cheat sheet has been updated to include Aurora DSQL, Aurora storage increase to 256 TiB, ElastiCache for Valkey, ElastiCache Serverless, Redshift Multi-AZ and Serverless, DynamoDB multi-Region strong consistency, zero-ETL integrations, RDS Multi-AZ DB Clusters with readable standbys, and RDS Extended Support.

Relational Database Service – RDS

  • provides Relational Database service
  • supports MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, Amazon Aurora, and IBM Db2 (added in 2023) DB engines
  • as it is a managed service, shell (root ssh) access is not provided
  • manages backups, software patching, automatic failure detection, and recovery
  • supports user-initiated manual backups and snapshots
  • daily automated backups with database transaction logs enable point-in-time recovery up to the last five minutes of database usage
  • snapshots are user-initiated storage volume snapshots of the DB instance; they back up the entire DB instance, not just individual databases, and can be restored as an independent RDS instance
  • RDS Security
    • supports encryption at rest using KMS as well as encryption in transit using SSL endpoints
    • supports IAM database authentication, which prevents the need to store static user credentials in the database, because authentication is managed externally using IAM.
    • encryption can be enabled only when the RDS DB instance is created
    • an existing unencrypted DB cannot be encrypted in place; instead, create a snapshot, create an encrypted copy of the snapshot, and restore it as an encrypted DB
    • supports Secrets Manager for storing and rotating secrets
    • for encrypted database
      • logs, snapshots, backups, read replicas are all encrypted as well
      • cross region replicas and snapshots are supported for encrypted instances
  • Multi-AZ deployment
    • provides high availability and automatic failover support and is NOT a scaling solution
    • maintains a synchronous standby replica in a different AZ
    • transaction success is returned only if the commit is successful both on the primary and the standby DB
    • Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon technology, while SQL Server DB instances use SQL Server Always On Availability Groups
    • snapshots and backups are taken from standby & eliminate I/O freezes
    • automatic failover is seamless: RDS switches to the standby instance and updates the DNS record to point to it
    • failover can be forced with the Reboot with failover option
  • Multi-AZ DB Cluster (Readable Standbys)
    • provides a primary DB instance and two readable standby DB instances in different AZs
    • standby instances can serve read traffic, providing additional read capacity
    • uses semi-synchronous replication with transaction log-based replication
    • provides faster failover (typically under 35 seconds) compared to Multi-AZ instance deployment
    • supports MySQL and PostgreSQL engines
    • offers lower write latency compared to Multi-AZ instance deployments
  • Read Replicas
    • uses the PostgreSQL, MySQL, and MariaDB DB engines’ built-in replication functionality to create a separate Read Only instance
    • updates are asynchronously copied to the Read Replica, and data might be stale
    • can help scale applications and reduce read only load
    • requires automatic backups enabled
    • replicates all databases in the source DB instance
    • for disaster recovery, can be promoted to a full fledged database
    • can be created in a different region for disaster recovery, migration and low latency across regions
    • can’t create encrypted read replicas from unencrypted DB or read replica
  • RDS does not support all the features of underlying databases, and if required the database instance can be launched on an EC2 instance
  • RDS Components
    • DB parameter groups contain engine configuration values (e.g. SSL, max connections) that can be applied to one or more DB instances of the same engine family
    • the default DB parameter group cannot be modified; create a custom one and attach it to the DB instance
    • Supports static and dynamic parameters
      • changes to dynamic parameters are applied immediately (irrespective of apply immediately setting)
      • changes to static parameters are NOT applied immediately and require a manual reboot.
  • RDS Monitoring & Notification
    • integrates with CloudWatch and CloudTrail
    • CloudWatch provides metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance
    • Performance Insights is a database performance tuning and monitoring feature that helps illustrate the database’s performance and help analyze any issues that affect it
    • supports RDS Event Notification, which uses SNS to send notifications when an RDS event such as instance creation, deletion, or snapshot creation occurs
  • RDS Blue/Green Deployments
    • creates a staging (green) environment that mirrors the production (blue) environment
    • enables safer database updates, major version upgrades, and schema changes with minimal downtime (under 5 seconds)
    • supports Aurora MySQL, Aurora PostgreSQL, RDS for MySQL, RDS for MariaDB, and RDS for PostgreSQL
    • now supports Aurora Global Database (2025)
  • RDS Extended Support
    • allows running databases on a major engine version up to 3 years past its RDS end of standard support date at an additional cost
    • provides critical security and bug fixes after the community ends support for a major version
    • databases are automatically enrolled if not upgraded before the end of standard support date
  • Zero-ETL Integrations
    • RDS for MySQL and Aurora support zero-ETL integration with Amazon Redshift
    • enables near real-time analytics on transactional data without building ETL pipelines
    • data is automatically replicated to Amazon Redshift within seconds of being written
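
The snapshot, encrypted copy, and restore sequence listed under RDS Security can be sketched as an ordered plan of API calls. This is a minimal sketch, not a definitive procedure: it assumes the boto3 RDS client operations `create_db_snapshot`, `copy_db_snapshot`, and `restore_db_instance_from_db_snapshot`, while the instance identifier, snapshot names, and KMS key alias are hypothetical. It only builds the request parameters and sends nothing to AWS.

```python
def encryption_migration_plan(source_db, kms_key_id, suffix="encrypted"):
    """Return the ordered (operation, kwargs) steps to get an encrypted copy of a DB."""
    plain_snapshot = f"{source_db}-plain-snap"          # hypothetical naming scheme
    encrypted_snapshot = f"{source_db}-{suffix}-snap"   # hypothetical naming scheme
    return [
        # 1. Snapshot the existing unencrypted instance.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": source_db,
            "DBSnapshotIdentifier": plain_snapshot,
        }),
        # 2. Copy the snapshot, supplying a KMS key so the copy is encrypted.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snapshot,
            "TargetDBSnapshotIdentifier": encrypted_snapshot,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new, encrypted instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{source_db}-{suffix}",
            "DBSnapshotIdentifier": encrypted_snapshot,
        }),
    ]


plan = encryption_migration_plan("orders-db", "alias/rds-key")
for operation, kwargs in plan:
    print(operation, kwargs)
```

With a real client, each step could be dispatched as `getattr(rds_client, operation)(**kwargs)`, waiting for each snapshot to become available before moving to the next step.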

⚠️ RDS Custom for Oracle – End of Support (March 31, 2027)

AWS will end support for Amazon RDS Custom for Oracle on March 31, 2027. After this date, you will no longer be able to access the RDS Custom for Oracle console or resources.

Migration Options: Migrate to Amazon RDS for Oracle (standard) or run Oracle on Amazon EC2 bare metal instances.

Aurora

  • is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases
  • is a managed service and handles time-consuming tasks such as provisioning, patching, backup, recovery, failure detection and repair
  • is a proprietary technology from AWS (not open sourced)
  • provides PostgreSQL and MySQL compatibility
  • is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of PostgreSQL on RDS
  • scales storage automatically in increments of 10GB, up to 256 TiB (increased from 128 TiB in July 2025) with no impact to database performance. Storage is striped across 100s of volumes.
  • no need to provision storage in advance.
  • provides self-healing storage. Data blocks and disks are continuously scanned for errors and repaired automatically.
  • provides instantaneous failover
  • replicates each chunk of the database volume six ways across three Availability Zones i.e. 6 copies of the data across 3 AZ
    • requires 4 out of 6 copies for writes
    • requires 3 out of 6 copies for reads
  • costs more than RDS (20% more) – but is more efficient
  • Read Replicas
    • can have 15 replicas while MySQL has 5, and the replication process is faster (sub 10 ms replica lag)
    • share the same data volume as the primary instance in the same AWS Region, so there is virtually no replication lag
    • supports Automated failover for master in less than 30 seconds
    • supports Cross Region Replication using either physical or logical replication.
  • Security
    • supports Encryption at rest using KMS
    • supports Encryption in flight using SSL (same process as MySQL or Postgres)
    • Automated backups, snapshots and replicas are also encrypted
    • Possibility to authenticate using IAM token (same method as RDS)
    • supports protecting the instance with security groups
    • does not support SSH access to the underlying servers
  • Aurora I/O-Optimized
    • a cluster configuration that provides predictable pricing with no charges for I/O operations
    • ideal for I/O-intensive applications such as e-commerce, payment processing, and SaaS applications
    • can deliver up to 40% cost savings for I/O-intensive workloads
    • supports both Aurora Serverless and provisioned instances
    • can switch between I/O-Optimized and Standard configurations (once every 30 days to I/O-Optimized, back to Standard anytime)
  • Aurora Serverless
    • provides automated database instantiation and on-demand autoscaling based on actual usage
    • provides a relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads
    • automatically starts up, shuts down, and scales capacity up or down based on the application’s needs. No capacity planning needed
    • Pay per second, can be more cost-effective
    • Aurora Serverless v1 reached end of life on March 31, 2025 – all clusters have been migrated to Aurora Serverless v2 (now simply called “Aurora Serverless”)
    • Aurora Serverless (v2) supports features like read replicas, Multi-AZ, Global Database, and logical replication that v1 did not
    • supports scale to zero capability and up to 30% better performance with smarter scaling (2026 enhancement)
  • Aurora Global Database
    • allows a single Aurora database to span multiple AWS regions.
    • provides Physical replication, which uses dedicated infrastructure that leaves the databases entirely available to serve the application
    • supports 1 Primary Region (read / write)
    • replicates across up to 5 secondary (read-only) regions, replication lag is less than 1 second
    • supports up to 16 Read Replicas per secondary region
    • recommended for low-latency global reads and disaster recovery with an RTO of < 1 minute
    • supports managed failover (Global Database Failover) which automates the cross-Region failover process, reducing operational overhead (introduced August 2023)
    • supports Blue/Green Deployments for Global Database (2025) for safer major version upgrades across all regions
    • supports a global writer endpoint for simplified application connectivity
  • Aurora Backtrack
    • Backtracking “rewinds” the DB cluster to the specified time
    • Backtracking performs in place restore and does not create a new instance. There is a minimal downtime associated with it.
  • Aurora Clone feature allows quick and cost-effective creation of Aurora Cluster duplicates
  • supports parallel or distributed query using Aurora Parallel Query, which refers to the ability to push down and distribute the computational load of a single query across thousands of CPUs in Aurora’s storage layer.
  • Aurora Optimized Reads
    • delivers up to 8x improved query latency for applications with datasets exceeding instance memory
    • uses local NVMe-based storage on Graviton-based instances to extend caching capacity
    • available for both PostgreSQL and MySQL compatible editions
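
The 4-of-6 write and 3-of-6 read quorum described above can be checked with a few lines of arithmetic. This is an illustrative sketch of the quorum rule only, not Aurora's storage implementation; the function names are made up for the example.

```python
COPIES = 6        # six copies of each chunk, spread across three AZs
WRITE_QUORUM = 4  # copies that must acknowledge a write
READ_QUORUM = 3   # copies needed for a read


def can_write(failed_copies):
    """Writes succeed while at least WRITE_QUORUM copies are healthy."""
    return COPIES - failed_copies >= WRITE_QUORUM


def can_read(failed_copies):
    """Reads succeed while at least READ_QUORUM copies are healthy."""
    return COPIES - failed_copies >= READ_QUORUM


# Read and write quorums overlap, so a read always reaches
# at least one copy that holds the latest acknowledged write.
assert WRITE_QUORUM + READ_QUORUM > COPIES

for failed in range(COPIES + 1):
    print(f"failed={failed} write={can_write(failed)} read={can_read(failed)}")
```

With two copies lost (for example, a whole AZ), both reads and writes continue; with three copies lost, reads continue but writes do not.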

Amazon Aurora DSQL (New – GA May 2025)

  • a serverless, distributed SQL database optimized for transaction processing
  • positioned by AWS as the fastest serverless distributed SQL database, with active-active high availability
  • provides PostgreSQL compatibility (subset of features)
  • designed for 99.99% availability in single-Region and 99.999% availability in multi-Region configurations
  • delivers strong consistency for all reads and writes to any Regional endpoint
  • provides virtually unlimited scalability with zero infrastructure management and zero downtime maintenance
  • AWS claims reads and writes up to 4x faster than other popular distributed SQL databases
  • employs an active-active deployment model where all database resources function as peers capable of handling both read and write traffic
  • supports up to 256 TiB of storage per database cluster
  • ideal for globally distributed applications requiring strong consistency, such as financial transactions, gaming, and SaaS applications

DynamoDB

  • fully managed NoSQL database service
  • synchronously replicates data across three facilities in an AWS Region, giving high availability and data durability
  • runs exclusively on SSDs to provide high I/O performance
  • provides provisioned table reads and writes
  • automatically partitions, reallocates, and re-partitions the data and provisions additional server capacity as data or throughput changes
  • creates and maintains indexes for the primary key attributes for efficient access to data in the table
  • DynamoDB Table classes currently support
    • DynamoDB Standard table class is the default and is recommended for the vast majority of workloads.
    • DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA) table class which is optimized for tables where storage is the dominant cost.
  • supports Secondary Indexes
    • allows querying attributes other than the primary key attributes without impacting performance.
    • are automatically maintained as sparse objects
  • Local secondary index vs Global secondary index
    • shares partition key + different sort key vs different partition + sort key
    • search limited to partition vs across all partition
    • unique attributes vs non-unique attributes
    • linked to the base table vs independent separate index
    • only created during the base table creation vs can be created later
    • cannot be deleted after creation vs can be deleted
    • consumes provisioned throughput capacity of the base table vs independent throughput
    • returns all attributes for item vs only projected attributes
    • Eventually or Strongly vs Only Eventually consistent reads
    • size limited to 10 GB per partition key value vs unlimited
  • DynamoDB Consistency
    • provides Eventually consistent (by default) or Strongly Consistent option to be specified during a read operation
    • supports Strongly consistent reads for a few operations like Query, GetItem, and BatchGetItem using the ConsistentRead parameter
  • DynamoDB Throughput Capacity
    • supports On-demand and Provisioned read/write capacity modes
    • Provisioned mode requires the number of reads and writes per second as required by the application to be specified
    • On-demand mode provides flexible billing option capable of serving thousands of requests per second without capacity planning
    • On-demand pricing reduced by 50% in November 2024
    • supports switching from provisioned to on-demand up to 4 times in a rolling 24-hour period (2025 improvement)
  • DynamoDB Auto Scaling helps dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns.
  • DynamoDB Adaptive capacity is a feature that enables DynamoDB to run imbalanced workloads indefinitely.
  • DynamoDB Global Tables
    • provide multi-active, cross-region replication capability of DynamoDB to support data access locality and regional fault tolerance for database workloads.
    • provide up to 99.999% availability
    • Multi-Region Strong Consistency (MRSC) – GA June 2025
      • enables applications to always read the latest version of data from any Region in a global table
      • provides zero RPO (Recovery Point Objective) for the highest application resilience
      • removes the need to manage consistency across multiple Regions manually
      • slightly higher write latencies compared to eventually consistent (MREC) mode
    • Global tables pricing reduced by up to 67% in November 2024
  • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table
  • DynamoDB Time to Live (TTL)
    • enables a per-item timestamp that determines when an item expires
    • expired items are deleted from the table without consuming any write throughput.
  • DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second.
  • DynamoDB Triggers (just like database triggers) are a feature that allows the execution of custom actions based on item-level updates on a table.
  • VPC Gateway Endpoints provide private access to DynamoDB from within a VPC without the need for an internet gateway or NAT gateway.
  • DynamoDB Zero-ETL Integrations
    • Zero-ETL with Amazon Redshift (GA October 2024) – automatically replicates DynamoDB tables into Redshift for SQL analytics without building ETL pipelines
    • Zero-ETL with Amazon OpenSearch Service – provides seamless, code-free data replication for vector search and near real-time analytics
    • enables analytics on DynamoDB data without impacting production workload performance
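
The per-item TTL timestamp mentioned above is a Number attribute holding an expiry time in Unix epoch seconds. The sketch below shows one way to stamp items and to filter expired ones on read. The attribute name `expires_at`, the key name `pk`, and both helper functions are assumptions for illustration, not part of any DynamoDB API.

```python
import time


def item_with_ttl(key, payload, ttl_seconds, now=None):
    """Build an item whose `expires_at` attribute holds an epoch-seconds expiry."""
    now = int(time.time()) if now is None else int(now)
    item = {"pk": key, **payload}
    item["expires_at"] = now + ttl_seconds  # TTL attribute: Number, epoch seconds
    return item


def is_expired(item, now=None):
    """True once the item's expiry time has been reached."""
    now = int(time.time()) if now is None else int(now)
    return item["expires_at"] <= now


session = item_with_ttl("session#42", {"user": "alice"}, ttl_seconds=3600, now=1_700_000_000)
print(session)
```

Because expired items are removed by a background process some time after expiry rather than at the exact instant, a read-side check like `is_expired` can be useful when stale items must never be served.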

ElastiCache

  • managed web service that provides in-memory caching to deploy and run Valkey, Redis OSS, or Memcached protocol-compliant cache clusters
  • ElastiCache for Valkey (Recommended – default since October 2024)
    • Valkey is an open-source fork of Redis OSS 7.2, maintained by the Linux Foundation with contributions from AWS, Google, Microsoft, and others
    • is a drop-in replacement for Redis OSS – supports the same data structures, commands, and protocols
    • all features available with Redis OSS 7.2 are available in Valkey 7.2 and above
    • AWS recommends Valkey for new deployments and offers migration paths from existing Redis OSS clusters
    • like Redis OSS, supports Multi-AZ, Read Replicas and Snapshots
    • supports cluster mode for horizontal scaling
  • ElastiCache with Redis OSS
    • available up to version 7.1 (ElastiCache does not offer Redis OSS 7.2, the last BSD-licensed upstream release); now a maintenance track with no active new feature development from AWS
    • Redis 8.0+ is licensed under AGPLv3, which is not supported by ElastiCache
    • Standard support for versions 4 and 5 ends January 31, 2026; clusters will be enrolled in Extended Support after that date
    • like RDS, supports Multi-AZ, Read Replicas and Snapshots
    • Read Replicas are created across AZ within same region using Redis’s asynchronous replication technology
    • Multi-AZ differs from RDS as there is no standby, but if the primary goes down a Read Replica is promoted as primary
    • allows snapshots for backup and restore
    • AOF can be enabled for recovery scenarios, to recover the data in case the node fails or service crashes. But it does not help in case the underlying hardware fails
    • enabling Multi-AZ is a better approach to fault tolerance than relying on AOF
  • ElastiCache with Memcached
    • can be scaled up by increasing size and scaled out by adding nodes
    • nodes can span across multiple AZs within the same region
    • cached data is spread across the nodes, and a node failure will always result in some data loss from the cluster
    • supports auto discovery
    • every node should be homogeneous and of the same instance type
  • ElastiCache Valkey/Redis vs Memcached
    • complex data objects vs simple key value storage
    • persistent vs non persistent, pure caching
    • automatic failover with Multi-AZ vs Multi-AZ not supported
    • scaling using Read Replicas vs using multiple nodes
    • backup & restore supported vs not supported
  • ElastiCache Serverless (launched November 2023)
    • creates a cache in under a minute with zero capacity planning
    • instantly scales capacity based on application traffic patterns
    • provides zero infrastructure management and zero downtime maintenance
    • supports Valkey 7.2+, Redis OSS 7.0+, and Memcached 1.6+
    • pay-per-use pricing based on data stored and requests executed
    • automatically provisions resources across multiple AZs for high availability
  • can be used for state management to keep the web application stateless
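
To see why a Memcached node failure loses only part of the cached data, consider how keys are spread across nodes. The sketch below uses simple modulo hashing as a stand-in. That placement scheme is an assumption for illustration (real clients often use consistent hashing), and the node and key names are hypothetical.

```python
import hashlib


def node_for(key, nodes):
    """Pick a node by hashing the key (simple modulo placement, for illustration)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]
keys = [f"user:{i}" for i in range(12)]
placement = {key: node_for(key, nodes) for key in keys}

# If node-b fails, only the keys that lived on it are lost; the rest stay cached.
lost = [key for key, node in placement.items() if node == "node-b"]
print(f"{len(lost)} of {len(keys)} keys lost:", lost)
```

Adding more nodes shrinks the share of keys any single failure can take out, which is one reason to scale out rather than up.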

Redshift

  • fully managed, fast and powerful, petabyte scale data warehouse service
  • uses replication and continuous backups to enhance availability and improve data durability and can automatically recover from node and component failures
  • provides Massively Parallel Processing (MPP) by distributing and parallelizing queries across multiple physical resources
  • uses columnar data storage, which improves query performance and allows advanced compression techniques
  • now supports Multi-AZ deployments for RA3 clusters (GA 2024), running the data warehouse in two AZs simultaneously with 99.99% SLA
  • spot instances are NOT an option
  • Redshift Serverless
    • enables running and scaling analytics without provisioning or managing clusters
    • automatically scales compute up or down based on workload demands
    • AI-driven scaling and optimization (default for new workgroups since April 2026) uses machine learning to predict compute needs and automatically adjust resources
    • offers minimum capacity as low as 4 RPUs for cost-effective development workloads
    • supports Serverless Reservations (2025) for discounted pricing and cost predictability
    • pay-as-you-go pricing based on compute used
  • Zero-ETL Integrations
    • supports zero-ETL from Aurora MySQL, Aurora PostgreSQL, RDS for MySQL, DynamoDB, and self-managed databases
    • automatically replicates data from source to Redshift without building ETL pipelines
    • enables near real-time analytics on transactional data
  • Enhanced Security Defaults (2025)
    • new clusters default to public accessibility disabled, encryption enabled, and secure connections enforced

AWS Identity Services Cheat Sheet

AWS Identity and Security Services

IAM – Identity & Access Management

  • securely control access to AWS services and resources
  • helps create and manage user identities and grant permissions for those users to access AWS resources
  • helps create groups for multiple users with similar permissions
  • not appropriate for application authentication
  • is Global and does not need to be migrated to a different region
  • helps define policies
    • in JSON format
    • all permissions are implicitly denied by default
    • an explicit deny in any policy overrides any allow (the most restrictive policy wins)
  • IAM Role
    • helps grant and delegate access to users and services without the need to create permanent credentials
    • IAM users or AWS services can assume a role to obtain temporary security credentials that can be used to make AWS API calls
    • needs a trust policy to define who can assume the role and a permission policy to define what the user or service can access
    • used with Security Token Service (STS), a lightweight web service that provides temporary, limited privilege credentials for IAM users or for authenticated federated users
    • IAM role scenarios
      • Service access for e.g. EC2 to access S3 or DynamoDB
      • Cross Account access for users
        • with a user within the same account
        • with a user in another AWS account owned by the same owner
        • with a user from a third-party AWS account, using an External ID for enhanced security
      • Identity Providers & Federation
        • AssumeRoleWithWebIdentity – Web Identity Federation, where the user is authenticated by an external identity provider such as Amazon, Google, or any OpenID Connect-compatible IdP
        • AssumeRoleWithSAML – identity provider using SAML 2.0, where the user is authenticated using on-premises Active Directory, OpenLDAP, or any SAML 2.0-compliant IdP
        • AssumeRole (recommended) or GetFederationToken – For other Identity Providers, use Identity Broker to authenticate and provide temporary Credentials
  • IAM Best Practices
    • Do not use Root account for anything other than billing
    • Create Individual IAM users
    • Use groups to assign permissions to IAM users
    • Grant least privilege
    • Use IAM roles for applications on EC2
    • Delegate using roles instead of sharing credentials
    • Rotate credentials regularly
    • Use Policy conditions for increased granularity
    • Use CloudTrail to keep a history of activity
    • Enforce a strong IAM password policy for IAM users
    • Remove all unused users and credentials
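
The split between a role's trust policy (who may assume it) and its permission policy (what it may do) can be sketched as two JSON documents. This is a minimal sketch using the standard IAM policy grammar; the account ID, External ID value, bucket name, and helper function names are hypothetical.

```python
import json


def trust_policy(principal, external_id=None):
    """Trust policy: WHO may assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": principal,
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # External ID condition for third-party cross-account access.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


def s3_read_policy(bucket):
    """Permission policy: WHAT the role may do once assumed."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }


# Service access: EC2 instances may assume the role.
ec2_trust = trust_policy({"Service": "ec2.amazonaws.com"})
# Third-party cross-account access guarded by an External ID.
partner_trust = trust_policy({"AWS": "arn:aws:iam::111122223333:root"}, external_id="example-id")

print(json.dumps(ec2_trust, indent=2))
print(json.dumps(s3_read_policy("example-bucket"), indent=2))
```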

AWS Organizations

  • is an account management service that enables consolidating multiple AWS accounts into an organization that can be centrally managed.
  • includes consolidated billing and account management capabilities that help meet the budgetary, security, and compliance needs of the business.
  • an administrator of an organization can create new accounts in the organization and invite existing accounts to join it.
  • enables you to
    • Automate AWS account creation and management, and provision resources with AWS CloudFormation StackSets.
    • Maintain a secure environment with policies and management of AWS security services
    • Govern access to AWS services, resources, and regions
    • Centrally manage policies across multiple AWS accounts
    • Audit your environment for compliance 
    • View and manage costs with consolidated billing 
    • Configure AWS services across multiple accounts 
  • supports Service Control Policies – SCPs
    • offer central control over the maximum available permissions for all of the accounts in your organization, ensuring member accounts stay within the organization’s access control guidelines.
    • are one type of policy that helps manage the organization.
    • are available only in an organization that has all features enabled, and aren’t available if the organization has enabled only the consolidated billing features.
    • are NOT sufficient for granting access to the accounts in the organization.
    • define a guardrail for what actions accounts within the organization root or OU can do, but IAM policies still need to be attached to the users and roles in the organization’s accounts to grant permissions to them.
    • effective permissions are the logical intersection between what is allowed by the SCP and what is allowed by the IAM and resource-based policies.
    • with an SCP attached to member accounts, identity-based and resource-based policies grant permissions to entities only if those policies and the SCP allow the action
    • don’t affect users or roles in the management account; they affect only the member accounts in your organization.
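
The "logical intersection" rule for SCPs and IAM policies can be illustrated with plain sets. This is a deliberately simplified sketch: it treats each policy as a flat set of allowed action names and ignores wildcards, explicit denies, conditions, and the management-account exemption. The function name and the sample actions are illustrative.

```python
def effective_permissions(scp_allowed, iam_allowed):
    """Actions a principal can perform: allowed by BOTH the SCP and the IAM policies."""
    return set(scp_allowed) & set(iam_allowed)


scp = {"s3:GetObject", "s3:PutObject", "ec2:DescribeInstances"}
iam = {"s3:GetObject", "dynamodb:GetItem"}

allowed = effective_permissions(scp, iam)
print(sorted(allowed))

# An SCP on its own grants nothing: with no IAM policy, nothing is allowed.
print(effective_permissions(scp, set()))
```

Here `dynamodb:GetItem` is granted by IAM but blocked by the SCP guardrail, while `s3:PutObject` is permitted by the SCP but never granted by IAM, so neither is usable.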

AWS Directory Services

  • gives applications in AWS access to Active Directory services
  • different from SAML + AD, where the access is granted to AWS services through Temporary Credentials
  • Simple AD
    • least expensive but does not support Microsoft AD advanced features
    • provides a Samba 4 Microsoft Active Directory compatible standalone directory service on AWS
    • No single point of Authentication or Authorization, as a separate copy is maintained
    • trust relationships cannot be set up between Simple AD and other Active Directory domains
    • Don’t use it, if the requirement is to leverage access and control through centralized authentication service
  • AD Connector
    • acts as a hosted proxy service for instances in AWS to connect to on-premises Active Directory
    • enables consistent enforcement of existing security policies, such as password expiration, password history, and account lockouts, whether users are accessing resources on-premises or in the AWS cloud
    • needs VPN connectivity (or Direct Connect)
    • integrates with existing RADIUS-based MFA solutions to enable multi-factor authentication
    • does not cache data which might lead to latency
  • Read-only Domain Controllers (RODCs)
    • act as a read-only Active Directory
    • hold a copy of the Active Directory Domain Services (AD DS) database and respond to authentication requests
    • they cannot be written to and are typically deployed in locations where physical security cannot be guaranteed
    • helps maintain a single point to authentication & authorization controls, however needs to be synced
  • Writable Domain Controllers
    • are expensive to set up
    • operate in a multi-master model; changes can be made on any writable server in the forest, and those changes are replicated to servers throughout the entire forest

AWS Single Sign-On SSO

  • is a cloud-based single sign-on (SSO) service that makes it easy to centrally manage SSO access to all of the AWS accounts and cloud applications.
  • helps manage access and permissions to commonly used third-party software as a service (SaaS) applications, AWS SSO-integrated applications as well as custom applications that support SAML 2.0.
  • includes a user portal where the end-users can find and access all their assigned AWS accounts, cloud applications, and custom applications in one place.

Amazon Cognito

  • Amazon Cognito provides authentication, authorization, and user management for the web and mobile apps.
  • Users can sign in directly with a username and password, or through a third party such as Facebook, Amazon, Google, or Apple.
  • Cognito has two main components.
    • User pools are user directories that provide sign-up and sign-in options for the app users.
    • Identity pools enable you to grant the users access to other AWS services.
  • Cognito Sync helps synchronize data across a user’s devices so that their app experience remains consistent when they switch between devices or upgrade to a new device.

AWS Security Services Cheat Sheet

AWS Identity and Security Services

AWS IAM Identity Center (Successor to AWS SSO)

  • is a centralized workforce identity management service that provides single sign-on (SSO) access to multiple AWS accounts and business applications.
  • was renamed from AWS Single Sign-On (AWS SSO) in July 2022.
  • enables administrators to define, customize, and assign fine-grained access across AWS accounts and applications.
  • provides workforce users a portal to access AWS accounts and cloud applications assigned to them.
  • supports integration with external identity providers (IdPs) like Microsoft Active Directory, Okta, and Microsoft Entra ID (formerly Azure AD).
  • simplifies multi-account access management through AWS Organizations integration.
  • provides temporary credentials instead of long-term IAM user credentials.
  • supports attribute-based access control (ABAC) for fine-grained permissions.

Key Management Service – KMS

  • is a managed encryption service that allows the creation and control of encryption keys to enable data encryption.
  • provides a highly available key storage, management, and auditing solution to encrypt the data across AWS services & within applications.
  • uses hardware security modules (HSMs) that are FIPS 140-3 Security Level 3 certified (upgraded from FIPS 140-2 in May 2023).
  • seamlessly integrates with several AWS services to make encrypting data in those services easy.
  • supports multi-region keys, which are AWS KMS keys in different AWS Regions. Multi-Region keys are not global and each multi-region key needs to be replicated and managed independently.
  • supports External Key Store (XKS) capability (November 2022) allowing customers to store and control encryption keys on-premises or outside AWS cloud while using AWS KMS.
  • provides three key store options: Default KMS key store, CloudHSM custom key store, and External key store (XKS).

CloudHSM

  • provides secure cryptographic key storage to customers by making hardware security modules (HSMs) available in the AWS cloud
  • helps manage your own encryption keys using FIPS 140-3 Level 3 validated HSMs (upgraded from FIPS 140-2).
  • single tenant, dedicated physical device to securely generate, store, and manage cryptographic keys used for data encryption
  • are inside the VPC (not EC2-classic) & isolated from the rest of the network
  • can use VPC peering to connect to CloudHSM from multiple VPCs
  • integrated with Amazon Redshift and Amazon RDS for Oracle
  • EBS volume encryption, S3 object encryption and key management can be done with CloudHSM but requires custom application scripting
  • a single HSM is NOT fault-tolerant; build a cluster of HSMs, because if a standalone HSM fails, all of its keys are lost
  • enables quick scaling by adding and removing HSM capacity on-demand, with no up-front costs.
  • automatically load balances requests and securely duplicates keys stored in any HSM to all of the other HSMs in the cluster.
  • expensive; prefer AWS Key Management Service (KMS) if cost is a criterion.

AWS Payment Cryptography

  • is a managed service for payment processing cryptographic operations (launched June 2023).
  • provides payment-specific HSMs that replace on-premises payment hardware security modules.
  • helps meet PCI (Payment Card Industry) security requirements and compliance needs.
  • supports cryptographic operations like PIN generation, validation, and credit/debit card security code processing.
  • manages underlying physical HSM infrastructure and key management automatically.
  • integrates with AWS IAM for authorization and AWS CloudTrail for auditing.
  • enables payment processing workloads to move to the cloud securely.
  • provides elastic scaling for payment cryptography operations.

AWS Private Certificate Authority (Private CA)

  • is a managed private certificate authority service for issuing and managing private SSL/TLS certificates.
  • removes upfront investment and ongoing maintenance costs of operating your own private CA.
  • supports two operating modes: General-purpose mode (certificates with any validity period) and Short-lived certificate mode (certificates valid up to 7 days, launched February 2023).
  • integrates with AWS Certificate Manager (ACM) for automated certificate provisioning and renewal.
  • supports Private CA Connector for Active Directory (September 2023) enabling AWS Private CA as drop-in replacement for self-managed enterprise CAs without local agents.
  • provides audit and compliance support through AWS CloudTrail integration.
  • enables certificate-based authentication for services like Amazon WorkSpaces.

AWS WAF

  • is a web application firewall that helps monitor the HTTP/HTTPS traffic and allows controlling access to the content.
  • helps protect web applications from attacks by allowing rules configuration that allow, block, or monitor (count) web requests based on defined conditions. These conditions include IP addresses, HTTP headers, HTTP body, URI strings, SQL injection and cross-site scripting.
  • helps define Web ACLs, each of which is a combination of Rules; a Rule is a combination of Conditions and an Action to block or allow
  • integrated with CloudFront, Application Load Balancer (ALB), API Gateway services commonly used to deliver content and applications
  • supports custom origins outside of AWS, when integrated with CloudFront
  • provides AWS WAF Fraud Control with three capabilities:
    • Account Takeover Prevention (ATP) – Launched February 2022, protects login pages against credential stuffing attacks
    • Account Creation Fraud Prevention (ACFP) – Launched June 2023, detects and blocks automated bot-based account creation
    • Bot Control – Detects and controls common bots and targeted bots that use advanced evasion techniques
  • supports Challenge and CAPTCHA actions for bot mitigation at no additional cost with Fraud Control.
  • offers usage-based tiered pricing for Fraud Control (introduced June 2023).
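
The Web ACL, Rule, Condition and Action relationship can be sketched in plain Python. This is an illustrative model, not the AWS WAF API: `Rule`, `evaluate`, and the request dictionary shape are hypothetical names. It assumes rules are checked in priority order, a matching allow or block ends evaluation, count only records the match, and a default action applies otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]


@dataclass
class Rule:
    name: str
    priority: int                          # lower number is evaluated first
    condition: Callable[[Request], bool]   # e.g. IP match, URI match
    action: str                            # "ALLOW", "BLOCK", or "COUNT"


def evaluate(rules: List[Rule], request: Request,
             default_action: str = "ALLOW") -> Tuple[str, List[str]]:
    """Return the final action and the names of rules that only counted the request."""
    counted: List[str] = []
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.condition(request):
            continue
        if rule.action == "COUNT":
            counted.append(rule.name)      # COUNT does not stop evaluation
            continue
        return rule.action, counted        # ALLOW or BLOCK ends evaluation
    return default_action, counted


# Hypothetical Web ACL with two rules.
rules = [
    Rule("count-admin-paths", 0, lambda r: r["uri"].startswith("/admin"), "COUNT"),
    Rule("block-bad-ip", 1, lambda r: r["ip"] == "203.0.113.7", "BLOCK"),
]
```

Running a rule in count mode first is a common way to observe what it would match before switching it to block.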

AWS Verified Access

  • provides VPN-less, secure access to corporate applications (GA April 2023).
  • implements Zero Trust security model for application access without traditional VPN.
  • validates each application request against identity and device security requirements before granting access.
  • integrates with identity providers (IdPs) and device management systems for authentication and authorization.
  • uses Cedar policy language for fine-grained access control policies.
  • supports AWS WAF integration for additional web application protection.
  • provides signed identity context to end applications for additional security.
  • simplifies remote access management and improves user experience compared to VPN.
  • eliminates VPN infrastructure management overhead.

Amazon Verified Permissions

  • is a fully managed fine-grained authorization service for custom applications (GA June 2023).
  • uses Cedar, an open-source policy language released May 2023, for defining authorization policies.
  • enables developers to externalize authorization logic from application code.
  • provides centralized policy management and administration.
  • offers millisecond-latency authorization decisions with provably correct results.
  • supports policy validation using automated reasoning to prevent misconfigurations.
  • integrates with identity providers for user and group information.
  • enables fine-grained permissions based on user attributes, resource attributes, and context.
  • provides policy versioning and audit capabilities.
  • follows “explicit permit” and “forbid overrides permit” principles.
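
The "explicit permit" and "forbid overrides permit" principles can be illustrated with a few lines of Python. This is a simplified sketch, not Cedar syntax or the Verified Permissions API; `is_authorized` and the request fields are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]
Policy = Tuple[str, Callable[[Request], bool]]   # (effect, matches)


def is_authorized(policies: List[Policy], request: Request) -> bool:
    """Allow only if at least one permit matches and no forbid matches."""
    effects = {effect for effect, matches in policies if matches(request)}
    if "forbid" in effects:
        return False             # forbid overrides permit
    return "permit" in effects   # no matching permit means implicit deny


# Hypothetical policies: editors may edit, but nothing archived may be touched.
policies: List[Policy] = [
    ("permit", lambda r: r["role"] == "editor" and r["action"] == "edit"),
    ("forbid", lambda r: r["resource_status"] == "archived"),
]
```

With these policies, an editor editing an active document is allowed, the same request against an archived document is denied by the forbid, and a viewer is denied because no permit matches.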

AWS Secrets Manager

  • helps protect secrets needed to access applications, services, and IT resources.
  • enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle.
  • secures secrets by encrypting them with encryption keys managed using AWS KMS.
  • offers native secret rotation with built-in integration for RDS, Redshift, and DocumentDB.
  • supports Lambda functions to extend secret rotation to other types of secrets, including API keys and OAuth tokens.
  • supports IAM and resource-based policies for fine-grained access control to secrets and centralized secret rotation audit for resources in the AWS Cloud, third-party services, and on-premises.
  • enables secret replication in multiple AWS regions (launched March 2021) to support multi-region applications and disaster recovery scenarios.
  • automatically keeps replica secrets in sync with primary secret including rotation.
  • supports private access using VPC Interface endpoints

AWS Shield

  • is a managed service that provides protection against Distributed Denial of Service (DDoS) attacks for applications running on AWS
  • provides protection for all AWS customers against common and most frequently occurring infrastructure (layer 3 and 4) attacks like SYN/UDP floods, reflection attacks, and others to support high availability of applications on AWS.
  • provides AWS Shield Advanced with additional protections against more sophisticated and larger attacks for applications running on EC2, ELB, CloudFront, AWS Global Accelerator, and Route 53.

Amazon GuardDuty

  • offers threat detection that enables continuous monitoring and protects the AWS accounts and workloads.
  • is a Regional service
  • analyzes continuous streams of meta-data generated from AWS accounts and network activity found in AWS CloudTrail Events, EKS audit logs, VPC Flow Logs, and DNS Logs.
  • integrated threat intelligence
  • combines machine learning, anomaly detection, network monitoring, and malicious file discovery, utilizing both AWS-developed and industry-leading third-party sources to help protect workloads and data on AWS
  • supports suppression rules, trusted IP lists, and threat lists.
  • provides Malware Protection to detect malicious files on EBS volumes
  • provides EKS Runtime Monitoring (March 2023) using fully managed EKS add-on for visibility into container runtime activities (file access, process execution, network connections).
  • provides RDS Protection (March 2023) for profiling and monitoring access activity to Amazon Aurora databases.
  • provides Lambda Protection for monitoring AWS Lambda function invocations and runtime behavior.
  • can identify specific containers within EKS clusters that are potentially compromised and detect privilege escalation attempts.
  • operates completely independently from the resources so there is no risk of performance or availability impacts on the workloads.

Amazon Inspector

  • is a vulnerability management service that continuously scans the AWS workloads for vulnerabilities
  • automatically discovers and scans EC2 instances and container images residing in Elastic Container Registry (ECR) for software vulnerabilities and unintended network exposure.
  • supports AWS Lambda function scanning for vulnerabilities in application code and dependencies.
  • provides CI/CD integration (November 2023) with open-source plugins for Jenkins, TeamCity, and other CI/CD tools to scan container images at build time.
  • enables vulnerability scanning directly from CI/CD pipelines wherever they are running without activating Inspector service.
  • scans Lambda functions on each deployment or update of application code or dependencies.
  • creates a finding, when a software vulnerability or network issue is discovered, that describes the vulnerability, rates its severity, identifies the affected resource, and provides remediation guidance.
  • is a Regional service.
  • requires Systems Manager (SSM) agent to be installed and enabled for EC2 scanning.

Amazon Security Lake

  • is a fully managed security data lake service (GA November 2023).
  • automatically centralizes security data from AWS environments, SaaS providers, on-premises, and cloud sources into a purpose-built data lake.
  • normalizes security data into the Open Cybersecurity Schema Framework (OCSF) standard format.
  • aggregates data from AWS services like CloudTrail, VPC Flow Logs, Route 53 logs, and third-party sources.
  • enables comprehensive security data analysis across entire organization.
  • automatically collects data for existing and new accounts with multi-account support.
  • stores security data in customer’s own AWS account for data ownership and control.
  • integrates with analytics tools like Amazon Athena, Amazon OpenSearch, and third-party SIEM solutions.
  • supports cross-region data aggregation for centralized security monitoring.
  • pricing based on data ingestion volume and normalization (no charge for third-party or custom data).

Amazon Detective

  • helps analyze, investigate, and quickly identify the root cause of potential security issues or suspicious activities.
  • automatically collects log data from the AWS resources and uses machine learning, statistical analysis, and graph theory to build a linked set of data to easily conduct faster and more efficient security investigations.
  • enables customers to view summaries and analytical data associated with CloudTrail logs, EKS audit logs, VPC Flow Logs.
  • provides detailed summaries, analysis, and visualizations of the behaviors and interactions amongst your AWS accounts, EC2 instances, AWS users, roles, and IP addresses.
  • maintains up to a year of aggregated data
  • is a Regional service and needs to be enabled on a region-by-region basis.
  • is a multi-account service that aggregates data from monitored member accounts under a single administrative account within the same region.
  • has no impact on the performance or availability of the AWS infrastructure since it retrieves the log data and findings directly from the AWS services.

AWS Security Hub

  • a cloud security posture management service that performs security best practice checks, aggregates alerts, and enables automated remediation.
  • collects security data from across AWS accounts, services, and supported third-party partner products and helps you analyze your security trends and identify the highest priority security issues.
  • is Regional but supports cross-region aggregation of findings.
  • automatically runs continuous, account-level configuration and security checks based on AWS best practices and industry standards which include CIS Foundations, PCI DSS.
  • consolidates the security findings across accounts and provider products and displays results on the Security Hub console.
  • supports integration with Amazon EventBridge. Custom actions can be defined when a finding is received.
  • has multi-account management through AWS Organizations integration, which allows delegating an administrator account for the organization.
  • works with AWS Config to perform most of its security checks for controls
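
As a sketch of the EventBridge integration, the pattern below selects imported Security Hub findings with high or critical severity. The pattern fields follow the Security Hub event format as best understood here; the `matches` helper is a hypothetical, heavily simplified stand-in for EventBridge matching (exact values and nested keys only) and exists just to make the example checkable locally.

```python
import json

# Event pattern for an EventBridge rule (would be passed as JSON).
pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {"findings": {"Severity": {"Label": ["HIGH", "CRITICAL"]}}},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: leaf lists hold allowed values, dicts recurse,
    and a list inside the event matches if any element matches."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        candidates = actual if isinstance(actual, list) else [actual]
        if isinstance(expected, dict):
            if not any(isinstance(c, dict) and matches(expected, c) for c in candidates):
                return False
        elif not any(c in expected for c in candidates):
            return False
    return True


def sample_event(label: str) -> dict:
    """Build a minimal, made-up event carrying one finding."""
    return {
        "source": "aws.securityhub",
        "detail-type": "Security Hub Findings - Imported",
        "detail": {"findings": [{"Severity": {"Label": label}}]},
    }


pattern_json = json.dumps(pattern)
```

A rule with this pattern could target a Lambda function or SNS topic to drive automated remediation or notification.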

Amazon Macie

  • Macie is a data security service that discovers sensitive data by using machine learning and pattern matching, provides visibility into data security risks, and enables automated protection against those risks.
  • provides an inventory of the S3 buckets and automatically evaluates and monitors the buckets for security and access control.
  • automates the discovery, classification, and reporting of sensitive data.
  • generates a finding for you to review and remediate as necessary if it detects a potential issue with the security or privacy of the data, such as a bucket that becomes publicly accessible.
  • provides multi-account support using AWS Organizations to enable Macie across all of the accounts.
  • is a regional service and must be enabled on a region-by-region basis and helps view findings across all the accounts within each Region.
  • supports VPC Interface Endpoints to access Macie privately from a VPC without an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.

AWS Artifact

  • is a self-service audit artifact retrieval portal that provides customers with on-demand access to AWS’ compliance documentation and agreements
  • can use AWS Artifact Reports to download AWS security and compliance documents, such as AWS ISO certifications, Payment Card Industry (PCI), and System and Organization Control (SOC) reports.

AWS Security Services – Practice Questions

  1. A company needs to manage encryption keys with FIPS 140-3 Level 3 compliance and wants AWS to handle the infrastructure. Which service should they use?
    • A. AWS CloudHSM
    • B. AWS KMS ✓
    • C. AWS Secrets Manager
    • D. AWS Certificate Manager
  2. A financial institution needs to process payment card transactions in the cloud while meeting PCI compliance requirements. Which service should they use?
    • A. AWS CloudHSM
    • B. AWS KMS
    • C. AWS Payment Cryptography ✓
    • D. AWS Private CA
  3. A company wants to provide secure access to corporate applications without using VPN. Which service implements Zero Trust access?
    • A. AWS Client VPN
    • B. AWS Verified Access ✓
    • C. AWS Direct Connect
    • D. AWS PrivateLink
  4. A development team needs to externalize authorization logic from their application and use fine-grained permissions. Which service should they use?
    • A. AWS IAM
    • B. Amazon Cognito
    • C. Amazon Verified Permissions ✓
    • D. AWS IAM Identity Center
  5. A company needs to centralize security data from multiple AWS accounts and third-party sources for analysis. Which service should they use?
    • A. AWS Security Hub
    • B. Amazon Security Lake ✓
    • C. Amazon Detective
    • D. AWS CloudTrail
  6. Which AWS service can detect runtime threats in EKS containers including file access and process execution?
    • A. Amazon Inspector
    • B. AWS Security Hub
    • C. Amazon GuardDuty ✓
    • D. Amazon Detective
  7. A company wants to scan container images for vulnerabilities in their CI/CD pipeline before deployment. Which service supports this?
    • A. AWS Config
    • B. Amazon Inspector ✓
    • C. AWS Security Hub
    • D. Amazon GuardDuty
  8. Which service can protect login pages from credential stuffing attacks and account takeover attempts?
    • A. AWS Shield
    • B. AWS WAF Fraud Control ✓
    • C. Amazon GuardDuty
    • D. AWS Firewall Manager
  9. A company needs to replicate secrets across multiple regions for disaster recovery. Which service supports this?
    • A. AWS Systems Manager Parameter Store
    • B. AWS Secrets Manager ✓
    • C. AWS KMS
    • D. AWS Certificate Manager
  10. Which service was renamed from AWS Single Sign-On (SSO) in July 2022?
    • A. AWS IAM
    • B. Amazon Cognito
    • C. AWS IAM Identity Center ✓
    • D. AWS Directory Service

AWS Certification Exam Cheat Sheet

AWS Certification Exams cover a wide range of topics and services, down to minute details of features, patterns, anti-patterns, and their integration with other services. This blog post is a quick summary of the services and key points for a quick glance before you appear for the exam.

AWS Global Infrastructure

AWS Region, AZs, Edge locations

  • Each region is a separate geographic area, completely independent, isolated from the other regions & helps achieve the greatest possible fault tolerance and stability
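  • Communication between regions is not automatic; traffic between AWS resources in different regions is carried over the AWS global network backbone rather than the public Internet (AWS China Regions excepted)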
  • Each region has multiple Availability Zones
  • Each AZ is physically isolated, geographically separated from each other and designed as an independent failure zone
  • AZs are connected with low-latency private links (not public internet)
  • Edge locations are locations maintained by AWS through a worldwide network of data centers for the distribution of content to reduce latency.

AWS Local Zones

  • AWS Local Zones place select AWS services closer to end-users, which allows running highly-demanding applications that require single-digit millisecond latencies to the end-users such as media & entertainment content creation, real-time gaming, machine learning etc.
  • AWS Local Zones provide a high-bandwidth, secure connection between local workloads and those running in the AWS Region, allowing you to seamlessly connect to the full range of in-region services through the same APIs and tool sets.

AWS Wavelength

  • AWS infrastructure deployments embed AWS compute and storage services within the telecommunications providers’ datacenters and help seamlessly access the breadth of AWS services in the region.
  • AWS Wavelength brings services to the edge of the 5G network, without leaving the mobile provider’s network reducing the extra network hops, minimizing the latency to connect to an application from a mobile device.

AWS Outposts

  • AWS Outposts bring native AWS services, infrastructure, and operating models to virtually any data center, co-location space, or on-premises facility.
  • AWS Outposts is designed for connected environments and can be used to support workloads that need to remain on-premises due to low latency, compliance or local data processing needs.

Refer details @ AWS Global Infrastructure

AWS Services

AWS Organizations

  • AWS Organizations offers policy-based management for multiple AWS accounts
  • Organizations allows creation of groups of accounts and then apply policies to those groups
  • Organizations enables you to centrally manage policies across multiple accounts, without requiring custom scripts and manual processes.
  • Organizations helps simplify the billing for multiple accounts by enabling the setup of a single payment method for all the accounts in the organization through consolidated billing

Consolidated Billing

  • Paying account with multiple linked accounts
  • Paying account is independent and should be used only for billing purposes
  • Paying account cannot access resources of other accounts unless explicitly given access through cross-account roles
  • All linked accounts are independent, with a soft limit of 20 linked accounts
  • One bill per AWS account
  • provides Volume pricing discount for usage across the accounts
  • allows unused Reserved Instances to be applied across the group
  • Free tier is not applicable across the accounts
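
The volume pricing benefit comes from aggregating usage before applying tiered rates. A minimal arithmetic sketch follows; the tier sizes and prices are made up for illustration and are not AWS rates.

```python
from typing import List, Optional, Tuple

# Hypothetical tiers: first 100 units at 10 cents each, everything above at 8 cents.
TIERS: List[Tuple[Optional[int], int]] = [(100, 10), (None, 8)]


def tiered_cost(units: int, tiers: List[Tuple[Optional[int], int]] = TIERS) -> int:
    """Return the cost in cents for `units` under simple tiered pricing."""
    cost = 0
    remaining = units
    for size, price in tiers:
        used = remaining if size is None else min(remaining, size)
        cost += used * price
        remaining -= used
        if remaining == 0:
            break
    return cost


# Two linked accounts, each using 80 units.
billed_separately = tiered_cost(80) + tiered_cost(80)   # neither reaches the cheaper tier
billed_together = tiered_cost(80 + 80)                  # combined usage reaches the cheaper tier
```

Under these assumed tiers the separate bills total 1600 cents while the consolidated bill is 1480 cents.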

Tags & Resource Groups

  • are metadata, specified as key/value pairs with the AWS resources
  • are for labelling purposes and help manage and organize resources
  • can be inherited by resources created through Auto Scaling, CloudFormation, Elastic Beanstalk, etc.
  • can be used for
    • Cost allocation to categorize and track the AWS costs
    • Conditional Access Control policy to define permission to allow or deny access on resources based on tags
  • Resource Group is a collection of resources that share one or more tags
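
Tag-based conditional access control is usually expressed as an IAM policy with a condition on the resource tag. The sketch below builds such a policy document in Python; the tag key and value, the chosen EC2 actions, and the `tag_scoped_policy` helper are illustrative assumptions.

```python
import json


def tag_scoped_policy(tag_key: str, tag_value: str) -> dict:
    """Allow starting/stopping EC2 instances only when they carry the given tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                "Resource": "*",
                "Condition": {
                    # aws:ResourceTag/<key> compares against the tag on the resource
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


policy = tag_scoped_policy("Environment", "dev")
policy_json = json.dumps(policy, indent=2)
```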

IDS/IPS

  • Promiscuous mode is not allowed, as AWS and the hypervisor will not deliver any traffic to an instance that is not specifically addressed to that instance
  • IDS/IPS strategies
    • Host Based Firewall – Forward Deployed IDS where the IDS itself is installed on the instances
    • Host Based Firewall – Traffic Replication where IDS agents installed on instances send/duplicate the data to a centralized IDS system
    • In-Line Firewall – Inbound IDS/IPS Tier (like a WAF configuration) which identifies and drops suspect packets

DDOS Mitigation

  • Minimize the Attack surface
    • use ELB/CloudFront/Route 53 to distribute load
    • maintain resources in private subnets and use Bastion servers
  • Scale to absorb the attack
    • scaling helps buy time to analyze and respond to an attack
    • auto scaling with ELB to handle increase in load to help absorb attacks
    • CloudFront, Route 53 inherently scales as per the demand
  • Safeguard exposed resources
    • use Route 53 for aliases to hide source IPs and Private DNS
    • use CloudFront geo restriction and Origin Access Identity
    • use WAF as part of the infrastructure
  • Learn normal behavior (IDS/WAF)
    • analyze and benchmark to define rules on normal behavior
    • use CloudWatch
  • Create a plan for attacks

AWS Services Region, AZ, Subnet VPC limitations

  • Services like IAM (user, role, group, SSL certificate), Route 53, STS are Global and available across regions
  • All other AWS services are limited to a Region or to resources within a Region and do not copy data across regions unless explicitly configured
  • AMI are limited to region and need to be copied over to other region
  • EBS volumes are limited to the Availability Zone, and can be migrated by creating snapshots and copying them to another region
  • Reserved Instances are limited to an Availability Zone (they can now be moved to another Availability Zone) but cannot be migrated to another region
  • RDS instances are limited to the region and can be recreated in a different region by either using snapshots or promoting a Read Replica
  • Placement groups are scoped by Availability Zone, depending on the type
    • Cluster Placement groups are limited to single Availability Zones
    • Spread Placement groups can span across multiple Availability Zones
  • S3 data is replicated within the region and can be moved to another region using cross-region replication
  • DynamoDB maintains data within the region, and it can be replicated to another region using DynamoDB cross-region replication (using DynamoDB Streams) or Data Pipeline using EMR (old method)
  • Redshift clusters span a single Availability Zone only, and can be recreated in another AZ using snapshots

Disaster Recovery Whitepaper

  • RTO is the time it takes after a disruption to restore a business process to its service level; RPO is the acceptable amount of data loss, measured in time, before the disaster occurs
  • Techniques (RTO & RPO decrease and the cost goes up going down the list)
    • Backup & Restore – Data is backed up and restored, with nothing running
    • Pilot light – Only minimal critical services like RDS are running and the rest of the services can be recreated and scaled during recovery
    • Warm Standby – Fully functional site with minimal configuration is available and can be scaled during recovery
    • Multi-Site – Fully functional site with identical configuration is available and processes the load
  • Services
    • Region and AZ to launch services across multiple facilities
    • EC2 instances with the ability to scale and launch across AZs
    • EBS with Snapshot to recreate volumes in different AZ or region
    • AMI to quickly launch preconfigured EC2 instances
    • ELB and Auto Scaling to scale and launch instances across AZs
    • VPC to create private, isolated section
    • Elastic IP address as static IP address
    • ENI with pre-allocated MAC address
    • Route 53 is highly available and scalable DNS service to distribute traffic across EC2 instances and ELB in different AZs and regions
    • Direct Connect for faster data transfer (takes time to set up and is more expensive than VPN)
    • S3 and Glacier (with RTO of 3-5 hours) provide durable storage
    • RDS snapshots and Multi AZ support and Read Replicas across regions
    • DynamoDB with cross region replication
    • Redshift snapshots to recreate the cluster
    • Storage Gateway to backup the data in AWS
    • Import/Export to move large amount of data to AWS (if internet speed is the bottleneck)
    • CloudFormation, Elastic Beanstalk and OpsWorks as orchestration tools to automate and recreate the infrastructure
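
A quick way to reason about RTO and RPO is to compare a plan against the targets. The sketch below assumes worst-case data loss equals the gap between backups and that recovery time is dominated by the restore step; the function name and the sample numbers are hypothetical.

```python
from datetime import timedelta


def check_objectives(backup_interval: timedelta, restore_time: timedelta,
                     rpo: timedelta, rto: timedelta) -> dict:
    """Compare a recovery plan against its RPO/RTO targets."""
    return {
        "worst_case_data_loss": backup_interval,   # data written since the last backup
        "rpo_met": backup_interval <= rpo,
        "rto_met": restore_time <= rto,
    }


# Hypothetical Backup & Restore plan: nightly backups, six-hour restore.
result = check_objectives(
    backup_interval=timedelta(hours=24),
    restore_time=timedelta(hours=6),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
)
```

Here the RTO target is met but the RPO target is not, which would point toward more frequent backups or a technique further down the list, such as Pilot light with continuous replication.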

 

AWS Certification – Application Services – Cheat Sheet

SQS

  • extremely scalable queue service and potentially handles millions of messages
  • helps build fault tolerant, distributed loosely coupled applications
  • stores copies of the messages on multiple servers for redundancy and high availability
  • standard queues guarantee at-least-once delivery but not exactly-once delivery, which might result in duplicate messages (FIFO queues provide exactly-once processing)
  • standard queues do not maintain or guarantee message order, and if needed sequencing information needs to be added to the message itself (FIFO queues preserve message order)
  • supports multiple readers and writers interacting with the same queue at the same time
  • holds message for 4 days, by default, and can be changed from 1 min – 14 days after which the message is deleted
  • message needs to be explicitly deleted by the consumer once processed
  • allows send, receive and delete batching which helps club up to 10 messages in a single batch while charging price for a single message
  • handles visibility of the message to multiple consumers using Visibility Timeout, where the message once read by a consumer is not visible to the other consumers till the timeout occurs
  • can handle load and performance requirements by scaling the worker instances as the demand changes (Job Observer pattern)
  • retrieves messages using short polling or long polling (short vs long)
    • returns immediately vs waits for a fixed time, e.g. 20 secs, for messages to arrive
    • might not return all messages as it samples a subset of servers vs returns all available messages
    • needs repeated polling requests vs helps save cost by holding a longer connection with fewer empty responses
  • supports delay queues to make messages available after a certain delay, which can be used to differentiate handling of lower-priority messages
  • supports dead letter queues, to redirect messages which failed to process after certain attempts instead of being processed repeatedly
  • Design Patterns
    • Job Observer Pattern can help coordinate number of EC2 instances with number of job requests (Queue Size) automatically thus Improving cost effectiveness and performance
    • Priority Queue Pattern can be used to setup different queues with different handling either by delayed queues or low scaling capacity for handling messages in lower priority queues
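
Visibility timeout, explicit deletion, and dead letter queues interact as in the toy in-memory model below. This is not the SQS API; `MiniQueue`, its fields, and the explicit `now` clock argument are hypothetical and exist only to show the lifecycle of a message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    body: str
    receive_count: int = 0
    invisible_until: float = 0.0


@dataclass
class MiniQueue:
    visibility_timeout: float = 30.0
    max_receive_count: int = 3
    messages: List[Message] = field(default_factory=list)
    dead_letters: List[Message] = field(default_factory=list)

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self, now: float) -> Optional[Message]:
        for msg in list(self.messages):
            if msg.invisible_until > now:
                continue                      # still hidden from other consumers
            if msg.receive_count >= self.max_receive_count:
                self.messages.remove(msg)     # too many failed attempts
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            msg.invisible_until = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg: Message) -> None:
        self.messages.remove(msg)             # consumer deletes after successful processing
```

A consumer that receives a message but never deletes it will see it reappear after the visibility timeout, and after `max_receive_count` receives the message moves to the dead letter list instead of being retried forever.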

SNS

  • delivery or sending of messages to subscribing endpoints or clients
  • publisher-subscriber model
  • Producers (publishers) communicate asynchronously with consumers (subscribers) by producing and sending a message to a topic
  • supports Email (plain or JSON), HTTP/HTTPS, SMS, SQS
  • supports Mobile Push Notifications to push notifications directly to mobile devices through services like Amazon Device Messaging (ADM), Apple Push Notification Service (APNS), Google Cloud Messaging (GCM, since replaced by Firebase Cloud Messaging), etc.
  • order is not guaranteed and No recall available
  • integrated with Lambda to invoke functions on notifications
  • for Email notifications, use SNS or SES directly, SQS does not work
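
The publisher-subscriber model can be sketched with a toy in-memory topic. This is not the SNS API; `Topic` and the list-based inboxes are hypothetical stand-ins for a topic and its subscribed endpoints (for example two SQS queues in a fan-out setup).

```python
from typing import Callable, List


class Topic:
    """Toy topic: every published message is pushed to all subscribers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, endpoint: Callable[[str], None]) -> None:
        self._subscribers.append(endpoint)

    def publish(self, message: str) -> int:
        """Deliver the message to every subscriber; return how many were notified."""
        for deliver in self._subscribers:
            deliver(message)
        return len(self._subscribers)


orders = Topic("orders")
billing_inbox: List[str] = []
shipping_inbox: List[str] = []
orders.subscribe(billing_inbox.append)
orders.subscribe(shipping_inbox.append)
delivered = orders.publish("order-42 created")
```

The publisher only knows the topic; adding a third consumer is a new subscription, with no change to the publishing code.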

SWF

  • orchestration service to coordinate work across distributed components
  • helps define tasks, stores, assigns tasks to workers, define logic, tracks and monitors the task and maintains workflow state in a durable fashion
  • helps define tasks which can be executed on AWS cloud or on-premises
  • helps coordinating tasks across the application which involves managing intertask dependencies, scheduling, and concurrency in accordance with the logical flow of the application
  • supports built-in retries, timeouts and logging
  • supports manual tasks
  • Characteristics
    • deliver exactly once
    • uses long polling, which reduces number of polls without results
    • Visibility of task state via API
    • Timers, signals, markers, child workflows
    • supports versioning
    • keeps workflow history for a user-specified time
  • AWS SWF vs AWS SQS
    • task-oriented vs message-oriented
    • track of all tasks and events vs needs custom handling

SES

  • highly scalable and cost-effective email service
  • uses content filtering technologies to scan outgoing emails to check standards and email content for spam and malware
  • supports full fledged emails to be sent as compared to SNS where only the message is sent in Email
  • ideal for sending bulk emails at scale
  • guarantees first hop
  • eliminates the need to support custom software or applications to do heavy lifting of email transport

AWS Networking & Content Delivery Services Cheat Sheet

AWS Networking & Content Delivery Services

Virtual Private Cloud – VPC

  • helps define a logically isolated dedicated virtual network within AWS
  • provides control of IP addressing using CIDR block from a minimum of /28 to a maximum of /16 block size
  • supports IPv4 and IPv6 addressing
  • primary CIDR block cannot be changed once created, but the VPC can be extended by associating secondary IPv4 CIDR blocks to the VPC
  • Components
    • Internet gateway (IGW) provides access to the Internet
    • Virtual gateway (VGW) provides access to the on-premises data center through VPN and Direct Connect connections
    • VPC can have only one IGW and VGW
    • Route tables determine network traffic routing from the subnet
    • Ability to create a subnet with VPC CIDR block
    • A Network Address Translation (NAT) server provides outbound Internet access for EC2 instances in private subnets
    • Elastic IP addresses are static, persistent public IP addresses
    • Instances launched in the VPC will have a Private IP address and can have a Public or an Elastic IP address associated with it
    • Security Groups and NACLs help define security
    • Flow logs – Capture information about the IP traffic going to and from network interfaces in your VPC
  • Tenancy option for instances
    • shared, by default, allows instances to be launched on shared tenancy
    • dedicated allows instances to be launched on a dedicated hardware
  • Route Tables
    • defines rules, termed as routes, which determine where network traffic from the subnet would be routed
    • Each VPC has a Main Route table and can have multiple custom route tables created
    • Every route table contains a local route that enables communication within a VPC which cannot be modified or deleted
    • Route priority is decided by matching the most specific route in the route table that matches the traffic
  • Subnets
    • map to AZs and do not span across AZs
    • have a CIDR range that is a portion of the whole VPC.
    • CIDR ranges cannot overlap between subnets within the VPC.
    • AWS reserves 5 IP addresses in each subnet – first 4 and last one
    • Each subnet is associated with a route table which defines its behavior
      • Public subnets – inbound/outbound Internet connectivity via IGW
      • Private subnets – outbound Internet connectivity via a NAT device or VGW
      • Protected subnets – no outbound connectivity and used for regulated workloads
  • Elastic Network Interface (ENI)
    • a default ENI, eth0, is attached to an instance which cannot be detached with one or more secondary detachable ENIs (eth1-ethn)
    • has primary private, one or more secondary private, public, Elastic IP address, security groups, MAC address and source/destination check flag attributes associated
    • An ENI in one subnet can be attached to an instance in the same or another subnet, in the same AZ and the same VPC
    • Security group membership of an ENI can be changed
    • an ENI with a pre-allocated MAC address can be used for applications with special licensing requirements
  • Security Groups vs NACLs – Network Access Control Lists
    • Stateful vs Stateless
    • At instance level vs At subnet level
    • Only allows Allow rule vs Allows both Allow and Deny rules
    • Evaluated as a Whole vs Evaluated in defined Order
  • Elastic IP
    • is a static IP address designed for dynamic cloud computing.
    • is associated with an AWS account, and not a particular instance
    • can be remapped from one instance to another instance
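    • is charged like any public IPv4 address whether or not it is in use (since February 2024); previously it was charged only for non-usage, i.e. when not linked to any instance or when the associated instance was in a stopped state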
  • NAT
    • allows internet access to instances in the private subnets.
    • performs the function of both address translation and port address translation (PAT)
    • a NAT instance needs the source/destination check flag to be disabled, as it is not the actual source or destination of the traffic
    • NAT gateway is an AWS managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort
    • are not supported for IPv6 traffic
    • NAT Gateway supports private NAT with fixed private IPs.
    • Regional NAT Gateway (announced Nov 2025) automatically expands across Availability Zones based on workload footprint, providing simplified setup, enhanced security, and automatic high availability without manual multi-AZ configuration.
  • Egress-Only Internet Gateways
    • allows outbound communication over IPv6 from instances in the VPC to the Internet, and prevents the Internet from initiating an IPv6 connection with your instances
    • supports IPv6 traffic only
  • Shared VPCs
    • allows multiple AWS accounts to create their application resources, such as EC2 instances, RDS databases, Redshift clusters, and AWS Lambda functions, into shared, centrally-managed VPCs
  • VPC Encryption Controls (announced Nov 2025)
    • allows enforcing encryption in transit for network traffic within the VPC
    • provides centralized encryption policy enforcement and monitoring capabilities
    • supports monitor and enforce modes to audit and enforce encryption compliance
    • transitioned to paid feature starting March 2026
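
The Security Groups vs NACLs comparison above can be sketched in plain Python. This is a toy model rather than any AWS API: the `Rule` class, the function names, and the sample addresses are hypothetical, each rule covers a single port, and statefulness (return traffic) is not modeled. It only contrasts ordered, first-match evaluation with whole-set, allow-only evaluation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    number: int   # rule number; only meaningful for NACLs in this sketch
    cidr: str     # source CIDR the rule applies to
    port: int     # single destination port, to keep the sketch small
    allow: bool   # security groups only ever hold allow=True rules


def _matches(rule: Rule, src_ip: str, port: int) -> bool:
    return ip_address(src_ip) in ip_network(rule.cidr) and rule.port == port


def nacl_allows(rules: list[Rule], src_ip: str, port: int) -> bool:
    """Ordered evaluation: lowest rule number first, the first match decides."""
    for rule in sorted(rules, key=lambda r: r.number):
        if _matches(rule, src_ip, port):
            return rule.allow
    return False  # nothing matched: implicit deny


def security_group_allows(rules: list[Rule], src_ip: str, port: int) -> bool:
    """Whole-set evaluation: any matching allow rule permits the traffic."""
    return any(_matches(rule, src_ip, port) for rule in rules if rule.allow)


# Hypothetical rule sets: the NACL blocks one address before allowing everyone.
NACL_RULES = [
    Rule(100, "0.0.0.0/0", 443, True),
    Rule(90, "203.0.113.7/32", 443, False),
]
SG_RULES = [Rule(0, "0.0.0.0/0", 443, True)]
```

With these rules the NACL rejects 203.0.113.7 on port 443 because rule 90 is evaluated before rule 100, while the security group has no way to express that exception.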

VPC Peering

  • allows routing of traffic between the peer VPCs using private IP addresses with no IGW or VGW required.
  • No single point of failure and bandwidth bottlenecks
  • supports inter-region VPC peering
  • Limitations
    • IP space or CIDR blocks cannot overlap
    • cannot be transitive
    • supports a one-to-one relationship between two VPCs and has to be explicitly peered.
    • does not support edge-to-edge routing.
    • supports only one connection between any two VPCs
  • Private DNS values cannot be resolved
  • Security groups from peered VPC can now be referred to, however, the VPC should be in the same region.
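
Two of the peering limitations above (no overlapping CIDR blocks, no transitive routing) are easy to check offline. The following is a minimal sketch using only the standard library; the function names and the VPC identifiers are hypothetical, and it does not call any AWS API.

```python
from ipaddress import ip_network


def can_peer(cidrs_a: list[str], cidrs_b: list[str]) -> bool:
    """Return True when no CIDR block of VPC A overlaps any block of VPC B."""
    nets_a = [ip_network(c) for c in cidrs_a]
    nets_b = [ip_network(c) for c in cidrs_b]
    return not any(a.overlaps(b) for a in nets_a for b in nets_b)


def can_route(peerings: set[frozenset[str]], src: str, dst: str) -> bool:
    """Peering is not transitive: traffic flows only over a direct peering."""
    return frozenset({src, dst}) in peerings


# Hypothetical topology: A <-> B and B <-> C are peered, A <-> C is not.
PEERINGS = {frozenset({"vpc-a", "vpc-b"}), frozenset({"vpc-b", "vpc-c"})}
```

Here `can_route(PEERINGS, "vpc-a", "vpc-c")` is false even though both VPCs are peered with `vpc-b`, which is the situation a Transit Gateway is usually brought in to solve.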

VPC Endpoints

  • enables private connectivity from VPC to supported AWS services and VPC endpoint services powered by PrivateLink
  • does not require a public IP address, access over the Internet, NAT device, a VPN connection, or Direct Connect
  • traffic between VPC & AWS service does not leave the Amazon network
  • are virtual devices.
  • are horizontally scaled, redundant, and highly available VPC components that allow communication between instances in the VPC and services without imposing availability risks or bandwidth constraints on the network traffic.
  • Gateway Endpoints
    • is a gateway that is a target for a specified route in the route table, used for traffic destined to a supported AWS service.
    • only S3 and DynamoDB are currently supported
  • Interface Endpoints OR Private Links
    • is an elastic network interface with a private IP address that serves as an entry point for traffic destined to a supported service
    • supported services include AWS services, services hosted by other AWS customers and partners in their own VPCs (referred to as endpoint services), and supported AWS Marketplace partner services.
    • Private Links
      • provide fine-grained access control
      • provides a point-to-point integration.
      • supports overlapping CIDR blocks.
      • supports transitive routing
    • Access to VPC Resources over PrivateLink (announced Dec 2024) – allows sharing any VPC resource using AWS RAM and accessing them privately using VPC endpoints, without requiring the resource to sit behind a NLB.

CloudFront

  • provides low latency and high data transfer speeds for the distribution of static, dynamic web, or streaming content to web users.
  • delivers the content through a worldwide network of data centers called Edge Locations or Point of Presence (PoPs)
  • keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
  • dramatically reduces the number of network hops that users’ requests must pass through
  • supports multiple origin server options, such as AWS-hosted services (e.g. S3, EC2, ELB) or an on-premises server, which store the original, definitive version of the objects
  • a single distribution can have multiple origins, and the Path pattern in a cache behavior determines which origin each request is routed to
  • Web distribution supports static, dynamic web content, on-demand using progressive download & HLS, and live streaming video content
  • RTMP distributions were deprecated and removed on December 31, 2020. Use Web distributions with HTTP-based streaming protocols (HLS, DASH) instead.
  • supports HTTPS using either
    • dedicated IP address, which is expensive as a dedicated IP address is assigned to each CloudFront edge location
    • Server Name Indication (SNI), which is free but supported by modern browsers only with the domain name available in the request header
  • For E2E HTTPS connection,
    • Viewers -> CloudFront needs either a certificate issued by CA or ACM
    • CloudFront -> Origin needs a certificate issued by ACM for ELB and by CA for other origins
  • Security
    • Origin Access Control (OAC) is the recommended method to restrict content from S3 origin to be accessible from CloudFront only. OAC supports SSE-KMS, all HTTP methods, and all AWS Regions.
      • Origin Access Identity (OAI) is the legacy method. OAI creation was deprecated in 2024 and new distributions (as of March 2026) can only use OAC. Existing OAI configurations continue to work but migration to OAC is recommended.
    • supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
    • Signed URLs
      • to restrict access to individual files, for e.g., an installation download for your application.
      • users using a client, for e.g. a custom HTTP client, that doesn’t support cookies
    • Signed Cookies
      • provide access to multiple restricted files, for e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
      • don’t want to change the current URLs
    • integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
  • supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
    • only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
    • does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
  • object removal from the cache
    • would be removed upon expiry (TTL) from the cache, by default 24 hrs
    • can be invalidated explicitly, at a cost; however, viewers might continue to see the old version until it also expires from caches outside CloudFront (for e.g., browser caches)
    • objects can be invalidated only for Web distribution
    • use versioning or change object name, to serve a different version
    • Tag-based cache invalidation (announced May 2026) – allows tagging cached objects via origin response headers or S3 metadata and invalidating them by tag directly through the CloudFront API.
  • supports adding or modifying custom headers before the request is sent to origin which can be used to
    • validate if a user is accessing the content from CDN
    • identifying CDN from which the request was forwarded, in case of multiple CloudFront distributions
    • for viewers not supporting CORS to return the Access-Control-Allow-Origin header for every request
  • supports Partial GET requests using range header to download objects in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
  • supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
  • supports different price classes to include all regions, or only the least expensive regions and other regions without the most expensive regions
  • supports access logs, which contain detailed information about every user request
  • Edge Compute
    • CloudFront Functions – lightweight JavaScript functions for simple request/response transformations (URL rewrites, header manipulation, redirects) executed at viewer request/response events with sub-millisecond latency
    • Lambda@Edge – more powerful compute for complex processing at origin request/response and viewer request/response events
    • CloudFront KeyValueStore (launched 2023) – a globally distributed, low-latency data store that CloudFront Functions can read at runtime for dynamic routing, A/B testing, feature flags, and geo-routing without redeploying function code
  • CloudFront Flat-Rate Pricing Plans – combine CDN, AWS WAF, DDoS protection, bot management, Route 53 DNS, CloudWatch Logs ingestion, serverless edge compute, and S3 storage credits into a single monthly price
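
How a path pattern selects an origin can be sketched as an ordered first-match lookup. This is an approximation under stated assumptions: the origin names and patterns are hypothetical, and `fnmatchcase` is used as a stand-in for CloudFront's `*` and `?` wildcards (it also accepts `[...]` character sets, which is not claimed to match CloudFront's behavior).

```python
from fnmatch import fnmatchcase

# (path pattern, origin id) pairs in precedence order; all names are hypothetical.
BEHAVIORS = [
    ("/api/*", "alb-origin"),
    ("/images/*.jpg", "s3-images-origin"),
]
DEFAULT_ORIGIN = "s3-site-origin"  # stands in for the default cache behavior


def pick_origin(path: str) -> str:
    """Return the origin of the first cache behavior whose pattern matches."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):  # case-sensitive match
            return origin
    return DEFAULT_ORIGIN
```

A request for `/api/users` would go to the load balancer origin, `/images/cat.jpg` to the image bucket, and anything else to the default origin.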

AWS VPN

  • AWS Site-to-Site VPN provides secure IPSec connections from on-premise computers or services to AWS over the Internet
  • is cheap and quick to set up; however, its performance depends on the Internet connection
  • delivers high availability by using two tunnels across multiple Availability Zones within the AWS global network
  • VPN requires a Virtual Private Gateway – VGW and Customer Gateway – CGW for communication
  • VPN connection is terminated on VGW on AWS
  • Only one VGW can be attached to a VPC at a time
  • VGW supports both static and dynamic routing using Border Gateway Protocol (BGP)
  • VGW supports AES-256 and SHA-2 for data encryption and integrity
  • AWS Client VPN is a managed client-based VPN service that enables secure access to AWS resources and resources in the on-premises network.
  • AWS VPN does not allow accessing the Internet through IGW or NAT Gateway, peered VPC resources, or VPC Gateway Endpoints from on-premises.
  • AWS VPN allows accessing the Internet through NAT Instance and VPC Interface Endpoints from on-premises.

Direct Connect

  • is a network service that uses a private dedicated network connection to connect to AWS services.
  • helps reduce costs (long term), increases bandwidth, and provides a more consistent network experience than internet-based connections.
  • supports Dedicated and Hosted connections
    • Dedicated connection is made through a 1 Gbps, 10 Gbps, or 100 Gbps Ethernet port dedicated to a single customer.
    • Hosted connections are sourced from an AWS Direct Connect Partner that has a network link between themselves and AWS.
  • provides Virtual Interfaces
    • Private VIF to access instances within a VPC via VGW
    • Public VIF to access non VPC services
    • Transit VIF to access one or more Amazon VPC Transit Gateways associated with Direct Connect gateways, enabling connectivity to multiple VPCs through a single VIF
  • takes time to set up, possibly months, and should not be considered as an option if a quick turnaround is required
  • does not provide redundancy by itself; use either a second Direct Connect connection or an IPSec VPN connection as a backup
  • Virtual Private Gateway is on the AWS side and Customer Gateway is on the Customer side
  • route propagation is enabled on VGW and not on CGW
  • A link aggregation group (LAG) is a logical interface that uses the link aggregation control protocol (LACP) to aggregate multiple dedicated connections at a single AWS Direct Connect endpoint and treat them as a single, managed connection
  • VIF Rate Limiters (announced June 2026) on dedicated connections help prevent network congestion caused by unexpected traffic spikes on a VIF that could consume all available bandwidth impacting other VIFs on the same connection.
  • Direct Connect vs VPN IPSec
    • Expensive to Setup and Takes time vs Cheap & Immediate
    • Dedicated private connections vs Internet
    • Reduced data transfer charges vs Internet data transfer charges
    • Consistent performance vs Internet inherent variability
    • Does not provide Redundancy by itself vs Provides Redundancy

Route 53

  • provides highly available and scalable DNS, Domain Registration Service, and health-checking web services
  • Reliable and cost-effective way to route end users to Internet applications
  • Supports multi-region and backup architectures for High availability. ELB is limited to region and does not support multi-region HA architecture.
  • supports private Intranet facing DNS service
  • internal resource record sets only work for requests originating from within the VPC and currently cannot extend to on-premise
  • Global propagation of any changes made to the DNS records within ~ 1min
  • supports Alias resource record sets, which are a Route 53 extension to DNS.
    • It’s similar to a CNAME resource record set, but can be created both for the root domain (zone apex, e.g. example.com) and for subdomains (e.g. www.example.com).
    • supports ELB load balancers, CloudFront distributions, Elastic Beanstalk environments, API Gateways, VPC interface endpoints, and S3 buckets that are configured as websites.
  • CNAME resource record sets can be created only for subdomains and cannot be mapped to the zone apex record
  • supports Private DNS to provide an authoritative DNS within the VPCs without exposing the DNS records (including the name of the resource and its IP address(es)) to the Internet.
  • Split-view (Split-horizon) DNS enables mapping the same domain publicly and privately. Requests are routed as per the origin.
  • Routing policy
    • Simple routing – simple round-robin policy
    • Weighted routing – assign weights to resource records sets to specify the proportion for e.g. 80%:20%
    • Latency based routing – helps improve global applications as requests are sent to the location that offers the lowest latency; because the choice is based on latency, it cannot guarantee that users from the same geography are always served from the same location, so it does not suit compliance-driven placement
    • Geolocation routing – specify geographic locations by continent, country, or state (states are limited to the US); accuracy depends on the IP-to-location mapping
    • Geoproximity routing policy – Use to route traffic based on the location of the resources and, optionally, shift traffic from resources in one location to resources in another.
    • Multivalue answer routing policy – Use to respond to DNS queries with up to eight healthy records selected at random.
    • Failover routing – failover to a backup site if the primary site fails and becomes unreachable
    • IP-based routing – route traffic based on the IP address of the client making the DNS query
  • Weighted, Latency and Geolocation can be used for Active-Active while Failover routing can be used for Active-Passive multi-region architecture
  • Traffic Flow is an easy-to-use and cost-effective global traffic management service. Traffic Flow supports versioning and helps create policies that route traffic based on the constraints they care most about, including latency, endpoint health, load, geoproximity, and geography.
  • Route 53 Resolver is a regional DNS service that helps with hybrid DNS
    • Inbound Endpoints are used to resolve DNS queries from an on-premises network to AWS
    • Outbound Endpoints are used to resolve DNS queries from AWS to an on-premises network
    • Resolver endpoints now support DNS delegation for private hosted zones (June 2025)
  • Route 53 Profiles – enables sharing DNS configurations (private hosted zone associations, Resolver rules, and Resolver DNS Firewall rule group associations) across VPCs and accounts using AWS RAM
  • Accelerated Recovery (announced Nov 2025) – provides a 60-minute recovery time objective (RTO) for regaining the ability to make DNS changes to public hosted zones during regional disruptions in US East (N. Virginia)
  • PrivateLink Support (announced Nov 2025) – allows making changes to DNS infrastructure (hosted zones, records, health checks) without using the public internet
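
The weighted routing policy above (e.g. 80%:20%) comes down to simple arithmetic: each record's share is its weight divided by the sum of all weights. The sketch below illustrates that calculation locally; the record names and function names are hypothetical, health checks are ignored, and it is not Route 53's implementation.

```python
import random


def traffic_shares(weights: dict[str, int]) -> dict[str, float]:
    """Expected fraction of DNS answers per record: weight / sum of weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("this sketch needs at least one non-zero weight")
    return {name: weight / total for name, weight in weights.items()}


def pick_record(weights: dict[str, int], rng: random.Random) -> str:
    """Answer one simulated DNS query with a weighted random choice."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical blue/green split matching the 80%:20% example.
WEIGHTS = {"blue": 80, "green": 20}
```

Shifting traffic during a deployment then means changing only the weights, for example moving from 80:20 to 50:50 and finally to 0:100.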

AWS Global Accelerator

  • is a networking service that helps you improve the availability and performance of the applications to global users.
  • utilizes the Amazon global backbone network, improving the performance of the applications by lowering first-byte latency, and jitter, and increasing throughput as compared to the public internet.
  • provides two static IP addresses serviced by independent network zones that provide a fixed entry point to the applications and eliminate the complexity of managing specific IP addresses for different AWS Regions and AZs.
  • always routes user traffic to the optimal endpoint based on performance, reacting instantly to changes in application health, the user’s location, and configured policies
  • improves performance for a wide range of applications over TCP or UDP by proxying packets at the edge to applications running in one or more AWS Regions.
  • is a good fit for non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP, as well as for HTTP use cases that specifically require static IP addresses or deterministic, fast regional failover.
  • integrates with AWS Shield for DDoS protection
  • uses a global network of 130+ Points of Presence in 95+ cities across 53+ countries
  • supports dual-stack Network Load Balancers as endpoints
  • supports endpoints in 33 AWS Regions (as of 2025)
  • integrates with AWS Load Balancer Controller for Kubernetes (announced 2025)

Transit Gateway – TGW

  • is a highly available and scalable service to consolidate the AWS VPC routing configuration for a region with a hub-and-spoke architecture.
  • acts as a Regional virtual router and is a network transit hub that can be used to interconnect VPCs and on-premises networks.
  • traffic always stays on the global AWS backbone, data is automatically encrypted, and never traverses the public internet, thereby reducing threat vectors, such as common exploits and DDoS attacks.
  • is a Regional resource and can connect VPCs within the same AWS Region.
  • TGWs across the same or different regions can peer with each other.
  • provides simpler VPC-to-VPC communication management over VPC Peering with a large number of VPCs.
  • scales elastically based on the volume of network traffic.
  • supports security group referencing (announced Sept 2024) – allows creating inbound security rules that reference security groups defined in other VPCs attached to the same Transit Gateway within the same Region.
  • supports per-AZ metrics delivered to CloudWatch and Path MTU Discovery (PMTUD) for both IPv4 and IPv6 (announced Nov 2024).
  • supports Transit Gateway Flow Logs for monitoring and logging network traffic between transit gateways.
  • supports Flexible Cost Allocation (announced Nov 2025) – provides versatile cost allocation options through a central metering policy beyond the default sender-pay model.
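
The claim that a Transit Gateway is simpler to manage than VPC Peering for a large number of VPCs can be made concrete with a little arithmetic: a full mesh of n VPCs needs n(n-1)/2 peering connections, while a hub-and-spoke design needs one attachment per VPC. The function names below are hypothetical.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def tgw_attachments(vpc_count: int) -> int:
    """Hub-and-spoke: one Transit Gateway attachment per VPC."""
    return vpc_count
```

For 10 VPCs that is 45 peering connections versus 10 attachments, and the gap widens quadratically as more VPCs are added.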

Amazon VPC Lattice

  • is a fully managed application networking service that connects, monitors, and secures communications between services and resources across VPCs and accounts.
  • simplifies service-to-service connectivity without requiring VPC peering, Transit Gateway, or PrivateLink NLBs.
  • automatically manages network connectivity and application-layer routing between services across different VPCs and AWS accounts.
  • supports connectivity to TCP resources, such as databases, domain names, and IP addresses across VPCs and accounts.
  • integrates with AWS IAM for service-to-service authentication and authorization using Auth policies.
  • removes the NLB requirement that PrivateLink imposes on providers and supports cross-VPC/cross-account connectivity without CIDR coordination.
  • terminates TLS at the data plane so callers do not need to manage certificates.
  • provides built-in observability with access logs, connection logs, and traffic metrics.
  • Key concepts:
    • Service Network – a logical boundary for a collection of services that can communicate with each other
    • Service – represents an application unit that is independently deployable
    • Target Groups – collection of resources (instances, IPs, Lambda, ALB) for routing
    • Resource Configurations – define TCP resources (databases, IPs, domain names) accessible through VPC Lattice
  • Use cases:
    • Microservices connectivity across multiple VPCs/accounts
    • Secure service-to-service communication with zero trust
    • Alternative to VPC Peering and Transit Gateway for application-layer connectivity
    • Replacement for AWS App Mesh (which reaches EOL on September 30, 2026)

Amazon VPC IP Address Manager (IPAM)

  • is a VPC feature that allows you to plan, track, and monitor IP addresses for AWS workloads.
  • organizes IP addresses by routing and security requirements while automating allocation to VPCs, replacing manual spreadsheet-based tracking.
  • tracks AWS accounts and VPCs, eliminating IP bookkeeping overhead.
  • supports management at both VPC and subnet CIDR levels.
  • integrates with AWS Organizations for cross-account IP address management.
  • supports provisioning Amazon-provided contiguous IPv4 blocks into publicly scoped regional pools for use with EIPs, NLBs, and NAT Gateways.
  • Public IP Insights – free feature that simplifies monitoring, analysis, and auditing of public IPv4 addresses.
  • IPAM Policies – define public IPv4 allocation strategies and automate prefix lists.
  • integrates with ALB for predictable IP address blocks for internet-facing ALBs (March 2025).
  • IPAM Advanced Tier – includes Infoblox integration (Nov 2025) for managing AWS IP addresses through existing Infoblox workflows.
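
The kind of bookkeeping that IPAM automates, handing out non-overlapping CIDR blocks from a pool, can be illustrated with the standard library. This is only a sketch of the underlying arithmetic, not the IPAM API; the function name and the example pool are assumptions.

```python
from ipaddress import ip_network
from itertools import islice


def carve_vpc_cidrs(pool: str, vpc_prefix: int, count: int) -> list[str]:
    """Take the first `count` non-overlapping blocks of size /vpc_prefix from a pool."""
    candidates = ip_network(pool).subnets(new_prefix=vpc_prefix)
    blocks = list(islice(candidates, count))
    if len(blocks) < count:
        raise ValueError(f"pool {pool} cannot fit {count} blocks of /{vpc_prefix}")
    return [str(block) for block in blocks]
```

For example, `carve_vpc_cidrs("10.0.0.0/8", 16, 3)` yields three /16 blocks that could be assigned to three VPCs without any overlap, which also keeps them eligible for peering later.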

AWS Network Firewall

  • is a managed, stateful network firewall and intrusion detection and prevention service for all Amazon VPCs.
  • scales automatically with network traffic, requiring no infrastructure management.
  • provides Layer 7 firewall capabilities with deep packet inspection.
  • supports flexible rules engine for fine-grained control of VPC network traffic.
  • provides active threat defense using AWS managed rules to block evasive C2 channels, malicious URLs, and other threat vectors.
  • supports Suricata-compatible IPS rules for known bad signatures and traffic patterns.
  • includes Network Firewall Proxy for granular security controls to inspect and filter VPC outbound connections, preventing data exfiltration and malware intrusion.
  • integrates with AWS Firewall Manager for centralized policy management across accounts.
  • can be combined with VPC Lattice for comprehensive security (VPC Lattice for HTTP/S with identity-based controls, Network Firewall for other traffic types).

AWS Cloud WAN

  • is a managed WAN service that provides a central dashboard to connect and manage branch offices, data centers, VPN connections, SD-WAN, VPCs, and Transit Gateways.
  • uses network policies to create a global network spanning multiple locations and networks, removing the need for different technologies.
  • provides a single console and set of APIs to manage networks across AWS Regions.
  • supports attaching Direct Connect gateways directly, without requiring an intermediate Transit Gateway (announced Nov 2024).
  • supports Routing Policy for advanced traffic control (announced Nov 2025) – enables controlled routing environments, minimizing route reachability blast radius.
  • supports Service Insertion for inspection and security appliance integration.
  • supports PMTUD for both IPv4 and IPv6 (announced Nov 2024).
  • supports AWS PrivateLink and IPv6 for management endpoint connectivity (announced March 2025).
  • available in AWS GovCloud (US) Regions.

AWS Verified Access

  • provides secure access to corporate applications and resources without requiring a VPN.
  • implements zero trust principles by evaluating each access request based on user identity and device security posture rather than network location.
  • uses the Cedar policy language for defining fine-grained access policies.
  • supports secure access to resources over non-HTTP(S) protocols (announced Feb 2025) – enables VPN-less access to TCP-based resources like SSH, RDP, and databases.
  • continuously monitors active connections and terminates connections when security requirements aren’t met.
  • integrates with third-party identity providers and device management solutions.
  • can be used with PrivateLink-backed services to provide authorized internet-based access while maintaining security boundaries.

AWS Management Tools Cheat Sheet

AWS Organizations

  • AWS Organizations is an account management service that enables consolidating multiple AWS accounts into an organization that can be created and centrally managed.
  • AWS Organizations enables you to
    • Automate AWS account creation and management, and provision resources with AWS CloudFormation Stacksets
    • Maintain a secure environment with policies and management of AWS security services
    • Govern access to AWS services, resources, and regions
    • Centrally manage policies across multiple AWS accounts
    • Audit your environment for compliance
    • View and manage costs with consolidated billing
    • Configure AWS services across multiple accounts

CloudFormation

  • gives developers and systems administrators an easy way to create and manage a collection of related AWS resources
  • Resources can be updated, deleted, and modified in an orderly, controlled and predictable fashion, in effect applying version control to the AWS infrastructure the same way it is done for software code
  • CloudFormation Template is an architectural diagram, in JSON or YAML format, and Stack is the end result of that diagram, which is actually provisioned
  • template can be used to set up the resources consistently and repeatedly across multiple regions and consists of
    • List of AWS resources and their configuration values
    • An optional template file format version number
    • An optional list of template parameters (input values supplied at stack creation time)
    • An optional list of output values like public IP address using the Fn::GetAtt function
    • An optional list of data tables used to lookup static configuration values for e.g., AMI names per AZ
  • supports Chef & Puppet Integration to deploy and configure right down the application layer
  • supports Bootstrap scripts to install packages, files, and services on the EC2 instances by simply describing them in the CF template
  • automatic rollback on error feature is enabled, by default, which will cause all the AWS resources that CF created successfully for a stack up to the point where an error occurred to be deleted
  • provides a WaitCondition resource to block the creation of other resources until a completion signal is received from an external source
  • allows DeletionPolicy attribute to be defined for resources in the template
    • retain to preserve resources like S3 even after stack deletion
    • snapshot to backup resources like RDS after stack deletion
  • DependsOn attribute to specify that the creation of a specific resource follows another
  • Service role is an IAM role that allows AWS CloudFormation to make calls to resources in a stack on the user’s behalf
  • Nested stacks separate out reusable, common components into dedicated templates that can be mixed and matched, while still being created as a single, unified stack
  • Change Sets presents a summary or preview of the proposed changes that CloudFormation will make when a stack is updated
  • Drift detection enables you to detect whether a stack’s actual configuration differs, or has drifted, from its expected configuration.
  • Termination protection helps prevent a stack from being accidentally deleted.
  • Stack policy can prevent stack resources from being unintentionally updated or deleted during a stack update.
  • StackSets extends the functionality of stacks by enabling you to create, update, or delete stacks across multiple accounts and Regions with a single operation.
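
The template structure described above (parameters, resources, outputs, plus the DeletionPolicy and DependsOn attributes) can be sketched by building a small template as a Python dictionary and serializing it to JSON. The logical names `BucketName`, `LogBucket`, and `LogTopic` are hypothetical, and the template is illustrative rather than a recommended design.

```python
import json


def build_template() -> dict:
    """Assemble a minimal CloudFormation template as a plain dictionary."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        # Input value supplied at stack creation time.
        "Parameters": {"BucketName": {"Type": "String"}},
        "Resources": {
            "LogBucket": {
                "Type": "AWS::S3::Bucket",
                # Keep the bucket even if the stack is deleted.
                "DeletionPolicy": "Retain",
                "Properties": {"BucketName": {"Ref": "BucketName"}},
            },
            "LogTopic": {
                "Type": "AWS::SNS::Topic",
                # Create the topic only after the bucket exists.
                "DependsOn": "LogBucket",
            },
        },
        # Output value resolved with the Fn::GetAtt function.
        "Outputs": {"BucketArn": {"Value": {"Fn::GetAtt": ["LogBucket", "Arn"]}}},
    }


template_body = json.dumps(build_template(), indent=2)
```

The resulting `template_body` string is what would be handed to CloudFormation to create a stack; the same file can be reused across regions by supplying a different `BucketName` parameter.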

Elastic Beanstalk

  • makes it easier for developers to quickly deploy and manage applications in the AWS cloud.
  • automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling and application health monitoring
  • CloudFormation supports Elastic Beanstalk
  • provisions resources to support
    • a web application that handles HTTP(S) requests or
    • a web application that handles background-processing (worker) tasks
  • supports Out Of the Box
    • Apache Tomcat for Java applications
    • Apache HTTP Server for PHP applications
    • Apache HTTP server for Python applications
    • Nginx or Apache HTTP Server for Node.js applications
    • Passenger for Ruby applications
    • Microsoft IIS 7.5 for .NET applications
    • Single and Multi Container Docker
  • supports custom AMI to be used
  • is designed to support multiple running environments such as one for Dev, QA, Pre-Prod and Production.
  • supports versioning and stores and tracks application versions over time allowing easy rollback to prior version
  • can provision RDS DB instance and connectivity information is exposed to the application by environment variables, but is NOT recommended for production setup as the RDS is tied up with the Elastic Beanstalk lifecycle and if deleted, the RDS instance would be deleted as well
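
When Elastic Beanstalk provisions the RDS instance, the application reads the connection details from environment variables. A minimal sketch of that lookup follows, assuming the documented `RDS_*` variable names; the function name and the returned dictionary keys are hypothetical, and no database driver is involved.

```python
import os
from typing import Mapping


def rds_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect database connection settings from Beanstalk-provided variables."""
    return {
        "host": env["RDS_HOSTNAME"],
        "port": int(env["RDS_PORT"]),
        "dbname": env["RDS_DB_NAME"],
        "user": env["RDS_USERNAME"],
        "password": env["RDS_PASSWORD"],
    }
```

Passing the environment in as a parameter keeps the function testable and makes it easy to switch to a database created outside the Beanstalk environment, which is the setup usually preferred for production.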

OpsWorks

  • is a configuration management service that helps to configure and operate applications in a cloud enterprise by using Chef
  • helps deploy and monitor applications in stacks with multiple layers
  • supports preconfigured layers for Applications, Databases, Load Balancers, Caching
  • OpsWorks Stacks provides a set of lifecycle events – Setup, Configure, Deploy, Undeploy, and Shutdown – which automatically run a specified set of recipes at the appropriate time on each instance
  • Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps, running scripts, and so on
  • OpsWorks Stacks runs the recipes for each layer, even if the instance belongs to multiple layers
  • supports Auto Healing and Auto Scaling to monitor instance health, and provision new instances

CloudWatch

  • allows monitoring of AWS resources and applications in real time, collecting and tracking preconfigured or custom metrics, and configuring alarms to send notifications or make resource changes based on defined rules
  • does not aggregate data across regions
  • stores the log data indefinitely, and the retention can be changed for each log group at any time
  • alarm history is stored for only 14 days
  • can be used as an alternative to S3 to store logs with the ability to configure Alarms and generate metrics, however logs cannot be made public
  • Alarms exist only in the created region and the Alarm actions must reside in the same region as well

CloudTrail

  • records API calls made for the AWS account from the AWS Management Console, SDKs, CLI, and higher-level AWS services
  • supports many AWS services and tracks who did what, from where, and when
  • can be enabled on a per-region basis, a region can include global services (like IAM, STS etc), and applies to all the supported services within that region
  • log files from different regions can be sent to the same S3 bucket
  • can be integrated with SNS to notify when log files are available, and with a CloudWatch Logs log group to send notifications when specific API events occur
  • call history enables security analysis, resource change tracking, troubleshooting and compliance auditing
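
The "who, what, when, and from where" that CloudTrail tracks maps onto a handful of fields in each log record. The sketch below pulls them out of a log file's JSON text; the function name is hypothetical and the sample record contains made-up values, trimmed to the fields used here.

```python
import json


def summarize(log_text: str) -> list[dict]:
    """Reduce each CloudTrail record to who / what / when / where."""
    records = json.loads(log_text).get("Records", [])
    return [
        {
            "who": r.get("userIdentity", {}).get("arn"),
            "what": f'{r.get("eventSource")}:{r.get("eventName")}',
            "when": r.get("eventTime"),
            "where": r.get("sourceIPAddress"),
            "region": r.get("awsRegion"),
        }
        for r in records
    ]


# Hypothetical, heavily trimmed log file content for illustration only.
SAMPLE_LOG = json.dumps({
    "Records": [{
        "eventTime": "2026-01-15T10:30:00Z",
        "eventSource": "ec2.amazonaws.com",
        "eventName": "StopInstances",
        "awsRegion": "us-east-1",
        "sourceIPAddress": "198.51.100.10",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
    }]
})
```

Running `summarize(SAMPLE_LOG)` gives one small dictionary per API call, which is a convenient shape for filtering during security analysis or change tracking.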

AWS Identity & Security Services Cheat Sheet

AWS Identity and Security Services

📌 Last Updated: June 2026 — Includes AWS Security Hub reimagined (re:Invent 2025), AWS Security Agent (GA March 2026), mandatory MFA enforcement for all root users, GuardDuty Extended Threat Detection, and IAM Identity Center multi-Region replication.

AWS Identity & Security Services Overview

AWS Security, Identity, and Compliance services provide a comprehensive set of tools to help protect data, accounts, and workloads. These services are organized into the following categories:

Identity and Access Management

  • AWS Identity and Access Management (IAM) – Securely manage access to AWS services and resources using users, groups, roles, and policies
  • AWS IAM Identity Center (formerly AWS SSO) – Centrally manage SSO access to multiple AWS accounts and business applications
    • Now supports multi-Region replication (Feb 2026) for high availability
    • Supports IPv6 dual-stack endpoints
  • Amazon Cognito – Customer identity and access management (CIAM) for web and mobile apps
    • Now supports passwordless authentication with passkeys (FIDO2/WebAuthn), email OTP, and SMS OTP (Nov 2024)
    • New feature tiers: Essentials and Plus (Nov 2024)
    • Managed Login for pre-built authentication UIs
  • Amazon Verified Permissions – Scalable, fine-grained authorization using Cedar policy language for custom applications
  • AWS Resource Access Manager (RAM) – Securely share AWS resources across accounts and within AWS Organizations
  • AWS Directory Service – Managed Microsoft Active Directory in the AWS Cloud

Detection and Response

  • Amazon GuardDuty – Intelligent threat detection that continuously monitors for malicious activity
    • Extended Threat Detection (re:Invent 2024) – AI/ML-powered attack sequence identification across multiple data sources
    • Now covers EC2, ECS, EKS, S3, and IAM attack sequences
    • Custom entity lists for domain-based threat intelligence (Sept 2025)
  • Amazon Detective – Analyze, investigate, and identify the root cause of security findings using ML and graph theory
  • Amazon Inspector – Automated vulnerability management for EC2 instances and container images in ECR
  • AWS Security Hub – Cloud security posture management (CSPM) and unified security operations
    • Reimagined at re:Invent 2025 – Unifies GuardDuty, Inspector, and other services into a single experience
    • Near real-time analytics and risk prioritization (GA Dec 2025)
    • Extended Plan (GA Feb 2026) – Full-stack enterprise security with 21 curated partner solutions across 9 categories
    • Expanding to multicloud environments
  • AWS Security Agent (GA March 2026) – AI-powered frontier agent for proactive application security
    • Automated security reviews tailored to organizational requirements
    • On-demand context-aware penetration testing
    • Full repository code scanning (Preview May 2026)
    • Operates like a human penetration tester – identifies, exploits, and validates vulnerabilities

Data Protection

Network and Application Protection

  • AWS WAF – Web application firewall to protect against common web exploits and bots
  • AWS Shield – Managed DDoS protection (Standard and Advanced tiers)
  • AWS Network Firewall – Managed network firewall for VPC with stateful inspection and IPS
  • AWS Firewall Manager – Centrally configure and manage firewall rules across accounts in AWS Organizations

Security Data Management and Compliance

  • Amazon Security Lake – Centralize security data from AWS, SaaS, and on-premises sources using the OCSF standard
    • Achieved FedRAMP High and Moderate authorization (April 2025)
  • AWS Audit Manager – Continuously audit AWS usage for risk and compliance assessment
  • AWS Artifact – On-demand access to AWS security and compliance reports

Key Updates (2024-2026)

  • MFA Enforcement (2024-2025) – AWS now mandates MFA for all root users across all account types. MFA prevents over 99% of password-related attacks.
  • AWS Security Hub Reimagined (re:Invent 2025) – Completely redesigned to unify security services into a single experience with near real-time analytics and AI-driven risk prioritization.
  • AWS Security Agent (GA March 2026) – First AI-powered frontier agent for autonomous application security testing and code scanning.
  • GuardDuty Extended Threat Detection (re:Invent 2024) – AI/ML attack sequence identification now covers EC2, ECS, and EKS workloads.
  • IAM Identity Center Multi-Region (Feb 2026) – Replicate identity center configuration across multiple AWS Regions for high availability.
  • Amazon Cognito Passwordless (Nov 2024) – Native passkey support with FIDO2/WebAuthn, email OTP, and SMS OTP authentication.
  • Centralized Root Access Management (Nov 2024) – Centrally manage root credentials and perform privileged tasks across AWS Organizations member accounts.
  • Agentic AI Security Framework (2025) – New Agentic AI Security Scoping Matrix for securing autonomous AI systems.

AWS Certification Relevance

  • Solutions Architect (Associate/Professional) – IAM, VPC security, encryption, Security Hub, GuardDuty
  • Security Specialty – All services in depth, including Security Lake, Detective, Macie, Inspector
  • SysOps Administrator – Security Hub, Config, GuardDuty, IAM best practices
  • Developer Associate – Cognito, IAM roles, KMS, Secrets Manager
  • DevOps Professional – Security automation, Inspector, Security Hub integrations