AWS Certification Exam Cheat Sheet

AWS Certification Exams cover a lot of topics and a wide range of services, with minute details on features, patterns, anti-patterns and their integration with other services. This blog post is a quick summary of the services and key points, for a quick glance before you appear for the exam.

AWS Global Infrastructure

AWS Region, AZs, Edge locations

  • Each region is a separate geographic area, completely independent, isolated from the other regions & helps achieve the greatest possible fault tolerance and stability
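  • Communication between regions goes across the public Internet unless a private path is configured (for e.g. inter-region VPC peering, which stays on the AWS network)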
  • Each region has multiple Availability Zones
  • Each AZ is physically isolated, geographically separated from each other and designed as an independent failure zone
  • AZs are connected with low-latency private links (not public internet)
  • Edge locations are locations maintained by AWS through a worldwide network of data centers for the distribution of content to reduce latency.

AWS Local Zones

  • AWS Local Zones place select AWS services closer to end-users, which allows running highly-demanding applications that require single-digit millisecond latencies to the end-users such as media & entertainment content creation, real-time gaming, machine learning etc.
  • AWS Local Zones provide a high-bandwidth, secure connection between local workloads and those running in the AWS Region, allowing you to seamlessly connect to the full range of in-region services through the same APIs and tool sets.

AWS Wavelength

  • AWS Wavelength deployments embed AWS compute and storage services within the telecommunications providers’ datacenters and help seamlessly access the breadth of AWS services in the region.
  • AWS Wavelength brings services to the edge of the 5G network without leaving the mobile provider’s network, reducing the extra network hops and minimizing the latency to connect to an application from a mobile device.

AWS Outposts

  • AWS Outposts bring native AWS services, infrastructure, and operating models to virtually any data center, co-location space, or on-premises facility.
  • AWS Outposts is designed for connected environments and can be used to support workloads that need to remain on-premises due to low latency, compliance or local data processing needs.

Refer details @ AWS Global Infrastructure

AWS Services

AWS Organizations

  • AWS Organizations offers policy-based management for multiple AWS accounts
  • Organizations allows creating groups of accounts and then applying policies to those groups
  • Organizations enables you to centrally manage policies across multiple accounts, without requiring custom scripts and manual processes.
  • Organizations helps simplify the billing for multiple accounts by enabling the setup of a single payment method for all the accounts in the organization through consolidated billing

Consolidated Billing

  • Paying account with multiple linked accounts
  • Paying account is independent and should be only used for billing purposes
  • Paying account cannot access resources of other accounts unless explicitly given access through Cross Account roles
  • All linked accounts are independent, with a soft limit of 20
  • One bill for all the accounts in the organization
  • provides Volume pricing discount for usage across the accounts
  • allows unused Reserved Instances to be applied across the group
  • Free tier is not applicable across the accounts
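
The volume pricing point can be made concrete with a little arithmetic. The sketch below uses made-up tier sizes and prices (assumptions for illustration only, not actual AWS rates) to compare billing two accounts separately against billing their combined usage.

```python
# Hypothetical tiered price list: (tier size in GB, price per GB).
# These numbers are illustrative assumptions, not real AWS pricing.
TIERS = [(10_000, 0.10), (40_000, 0.08), (float("inf"), 0.06)]


def tiered_cost(usage_gb: float) -> float:
    """Charge usage tier by tier; higher volume reaches the cheaper tiers."""
    cost, remaining = 0.0, usage_gb
    for tier_size, price in TIERS:
        billed = min(remaining, tier_size)
        cost += billed * price
        remaining -= billed
        if remaining <= 0:
            break
    return cost


separate = tiered_cost(8_000) + tiered_cost(8_000)  # each account billed alone
combined = tiered_cost(16_000)                      # usage aggregated first
print(f"separate: {separate:.2f}, consolidated: {combined:.2f}")
```

With these assumed tiers, neither account leaves the first tier on its own, while the aggregated usage spills into the cheaper second tier, which is where the saving comes from.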

Tags & Resource Groups

  • are metadata, specified as key/value pairs, attached to AWS resources
  • are for labelling purposes and help in managing and organizing resources
  • can be inherited when resources are created from Auto Scaling, CloudFormation, Elastic Beanstalk etc
  • can be used for
    • Cost allocation to categorize and track the AWS costs
    • Conditional Access Control policy to define permission to allow or deny access on resources based on tags
  • Resource Group is a collection of resources that share one or more tags
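
As a rough illustration of tag-based conditional access control, the snippet below builds an IAM policy document that allows starting and stopping EC2 instances only when they carry a given tag. The tag key and value (Environment / Dev) are hypothetical, and the policy is a minimal sketch rather than a recommended production policy.

```python
import json

# Hypothetical tag key/value used only for illustration.
TAG_KEY, TAG_VALUE = "Environment", "Dev"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "arn:aws:ec2:*:*:instance/*",
            # Allow only when the instance carries the matching tag.
            "Condition": {
                "StringEquals": {f"aws:ResourceTag/{TAG_KEY}": TAG_VALUE}
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```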

IDS/IPS

  • Promiscuous mode is not allowed, as AWS and the Hypervisor will not deliver any traffic to an instance that is not specifically addressed to that instance
  • IDS/IPS strategies
    • Host Based Firewall – Forward Deployed IDS where the IDS itself is installed on the instances
    • Host Based Firewall – Traffic Replication where IDS agents installed on instances which send/duplicate the data to a centralized IDS system
    • In-Line Firewall – Inbound IDS/IPS Tier (like a WAF configuration) which identifies and drops suspect packets

DDoS Mitigation

  • Minimize the Attack surface
    • use ELB/CloudFront/Route 53 to distribute load
    • maintain resources in private subnets and use Bastion servers
  • Scale to absorb the attack
    • scaling helps buy time to analyze and respond to an attack
    • auto scaling with ELB to handle increase in load to help absorb attacks
    • CloudFront, Route 53 inherently scales as per the demand
  • Safeguard exposed resources
    • use Route 53 for aliases to hide source IPs and Private DNS
    • use CloudFront geo restriction and Origin Access Identity
    • use WAF as part of the infrastructure
  • Learn normal behavior (IDS/WAF)
    • analyze and benchmark to define rules on normal behavior
    • use CloudWatch
  • Create a plan for attacks

AWS Services Region, AZ, Subnet VPC limitations

  • Services like IAM (user, role, group, SSL certificate), Route 53, STS are Global and available across regions
  • All other AWS services are limited to a Region or within a Region and do not copy data across regions unless explicitly configured
  • AMIs are limited to a region and need to be copied over to another region
  • EBS volumes are limited to the Availability Zone, and can be migrated by creating snapshots and copying them to another region
  • Reserved Instances are limited to an Availability Zone (they can now be moved to another Availability Zone) and cannot be migrated to another region
  • RDS instances are limited to the region and can be recreated in a different region by either using snapshots or promoting a Read Replica
  • Placement groups scope depends on the strategy
    • Cluster Placement groups are limited to a single Availability Zone
    • Spread Placement groups can span across multiple Availability Zones
  • S3 data is replicated within the region and can be moved to another region using cross region replication
  • DynamoDB maintains data within the region and can be replicated to another region using DynamoDB cross region replication (using DynamoDB Streams) or Data Pipeline using EMR (old method)
  • Redshift Clusters span a single Availability Zone only, and can be created in another AZ using snapshots

Disaster Recovery Whitepaper

  • RTO is the time it takes after a disruption to restore a business process to its service level, and RPO is the acceptable amount of data loss, measured in time, before the disaster occurs
  • Techniques (RTO & RPO reduce and the cost goes up as we go down the list)
    • Backup & Restore – Data is backed up and restored, with nothing running
    • Pilot light – Only minimal critical service like RDS is running and rest of the services can be recreated and scaled during recovery
    • Warm Standby – Fully functional site with minimal configuration is available and can be scaled during recovery
    • Multi-Site – Fully functional site with identical configuration is available and processes the load
  • Services
    • Region and AZ to launch services across multiple facilities
    • EC2 instances with the ability to scale and launch across AZs
    • EBS with Snapshot to recreate volumes in different AZ or region
    • AMI to quickly launch preconfigured EC2 instances
    • ELB and Auto Scaling to scale and launch instances across AZs
    • VPC to create private, isolated section
    • Elastic IP address as static IP address
    • ENI with pre allocated Mac Address
    • Route 53 is highly available and scalable DNS service to distribute traffic across EC2 instances and ELB in different AZs and regions
    • Direct Connect for speedy data transfer (takes time to set up and is more expensive than VPN)
    • S3 and Glacier (with RTO of 3-5 hours) provide durable storage
    • RDS snapshots and Multi AZ support and Read Replicas across regions
    • DynamoDB with cross region replication
    • Redshift snapshots to recreate the cluster
    • Storage Gateway to backup the data in AWS
    • Import/Export to move large amount of data to AWS (if internet speed is the bottleneck)
    • CloudFormation, Elastic Beanstalk and OpsWorks as orchestration tools to automate and recreate the infrastructure
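
The RTO/RPO definitions above can be turned into a quick sanity check. The sketch below assumes a simple Backup & Restore setup where the worst-case data loss equals the backup interval; the function names and all numbers are hypothetical.

```python
def worst_case_rpo_hours(backup_interval_hours: float) -> float:
    """Data written right after a backup is lost if disaster hits just before the next one."""
    return backup_interval_hours


def meets_objectives(backup_interval_hours: float, restore_hours: float,
                     rpo_target_hours: float, rto_target_hours: float) -> bool:
    """True when both the data-loss (RPO) and downtime (RTO) targets are satisfied."""
    return (worst_case_rpo_hours(backup_interval_hours) <= rpo_target_hours
            and restore_hours <= rto_target_hours)


# Hypothetical targets: RPO of 4 hours, RTO of 8 hours, restore takes 5 hours.
nightly = meets_objectives(24, 5, rpo_target_hours=4, rto_target_hours=8)
hourly = meets_objectives(1, 5, rpo_target_hours=4, rto_target_hours=8)
print(nightly, hourly)
```

Under these assumptions nightly backups miss the RPO target even though the restore time fits the RTO, which is the kind of gap that pushes a design from Backup & Restore towards Pilot light or Warm Standby.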

 

AWS Certification – Application Services – Cheat Sheet

SQS

  • extremely scalable queue service and potentially handles millions of messages
  • helps build fault tolerant, distributed loosely coupled applications
  • stores copies of the messages on multiple servers for redundancy and high availability
  • Standard queues guarantee At-Least-Once Delivery but not Exactly-Once Delivery, which might result in duplicate messages; FIFO queues remove this limitation
  • Standard queues do not maintain or guarantee message order, and if needed, sequencing information needs to be added to the message itself; FIFO queues preserve order
  • supports multiple readers and writers interacting with the same queue at the same time
  • holds message for 4 days, by default, and can be changed from 1 min – 14 days after which the message is deleted
  • message needs to be explicitly deleted by the consumer once processed
  • allows send, receive and delete batching, which helps club up to 10 messages in a single batch while being charged as a single request
  • handles visibility of the message to multiple consumers using Visibility Timeout, where the message once read by a consumer is not visible to the other consumers till the timeout occurs
  • can handle load and performance requirements by scaling the worker instances as the demand changes (Job Observer pattern)
  • supports short and long polling when retrieving messages (short polling vs long polling)
    • returns immediately vs waits for a fixed time, for e.g. 20 secs
    • might not return all messages as it samples a subset of servers vs returns all available messages
    • repetitive polling vs fewer empty responses, which helps save cost
  • supports delay queues to make messages available after a certain delay, which can be used to differentiate the handling of lower priority queues
  • supports dead letter queues, to redirect messages which failed to process after certain attempts instead of being processed repeatedly
  • Design Patterns
    • Job Observer Pattern can help coordinate number of EC2 instances with number of job requests (Queue Size) automatically thus Improving cost effectiveness and performance
    • Priority Queue Pattern can be used to setup different queues with different handling either by delayed queues or low scaling capacity for handling messages in lower priority queues
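
The interplay of visibility timeout, explicit deletion, and dead letter queues described above can be sketched with a toy in-memory model. This is not the SQS API; the class names, the `max_receive_count` parameter, and the explicit `now` clock are all assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Message:
    body: str
    receive_count: int = 0
    invisible_until: float = 0.0  # message is hidden from consumers until this time


@dataclass
class ToyQueue:
    visibility_timeout: float = 30.0
    max_receive_count: int = 3  # receives allowed before dead-lettering
    messages: list = field(default_factory=list)
    dead_letters: list = field(default_factory=list)

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self, now: float):
        """Return the next visible message, or None if nothing is visible."""
        for msg in list(self.messages):
            if msg.invisible_until > now:
                continue  # another consumer is still working on it
            if msg.receive_count >= self.max_receive_count:
                # Failed too many times: park it instead of redelivering.
                self.messages.remove(msg)
                self.dead_letters.append(msg)
                continue
            msg.receive_count += 1
            msg.invisible_until = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg: Message) -> None:
        """Consumers must delete explicitly once processing succeeded."""
        self.messages.remove(msg)


q = ToyQueue(visibility_timeout=30, max_receive_count=2)
q.send("job-1")
first = q.receive(now=0)    # delivered, hidden until t=30
hidden = q.receive(now=10)  # None: still inside the visibility timeout
second = q.receive(now=30)  # redelivered because it was never deleted
third = q.receive(now=60)   # None: receive limit reached, moved to dead letters
```

Because the consumer in this walk-through never calls `delete`, the same message is handed out again once the timeout lapses, which mirrors why consumers should be prepared for duplicates.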

SNS

  • delivery or sending of messages to subscribing endpoints or clients
  • publisher-subscriber model
  • Publishers communicate asynchronously with subscribers by producing and sending a message to a topic
  • supports Email (plain or JSON), HTTP/HTTPS, SMS, SQS
  • supports Mobile Push Notifications to push notifications directly to mobile devices through services like Amazon Device Messaging (ADM), Apple Push Notification Service (APNS), Google Cloud Messaging (GCM, now Firebase Cloud Messaging) etc.
  • order is not guaranteed and No recall available
  • integrated with Lambda to invoke functions on notifications
  • for Email notifications, use SNS or SES directly, SQS does not work

SWF

  • orchestration service to coordinate work across distributed components
  • helps define tasks, stores and assigns tasks to workers, defines logic, tracks and monitors tasks, and maintains workflow state in a durable fashion
  • helps define tasks which can be executed on AWS cloud or on-premises
  • helps coordinate tasks across the application, which involves managing intertask dependencies, scheduling, and concurrency in accordance with the logical flow of the application
  • supports built-in retries, timeouts and logging
  • supports manual tasks
  • Characteristics
    • deliver exactly once
    • uses long polling, which reduces number of polls without results
    • Visibility of task state via API
    • Timers, signals, markers, child workflows
    • supports versioning
    • keeps workflow history for a user-specified time
  • AWS SWF vs AWS SQS
    • task-oriented vs message-oriented
    • track of all tasks and events vs needs custom handling

SES

  • highly scalable and cost-effective email service
  • uses content filtering technologies to scan outgoing emails to check standards and email content for spam and malware
  • supports full-fledged emails to be sent, as compared to SNS where only the message is sent in the Email
  • ideal for sending bulk emails at scale
  • guarantees first hop
  • eliminates the need to support custom software or applications to do heavy lifting of email transport

AWS Networking & Content Delivery Services Cheat Sheet

AWS Networking & Content Delivery Services

Virtual Private Cloud – VPC

  • helps define a logically isolated dedicated virtual network within AWS
  • provides control of IP addressing using CIDR block from a minimum of /28 to a maximum of /16 block size
  • supports IPv4 and IPv6 addressing
  • primary CIDR block cannot be changed once the VPC is created
  • can be extended by associating secondary IPv4 CIDR blocks to the VPC
  • Components
    • Internet gateway (IGW) provides access to the Internet
    • Virtual Private Gateway (VGW) provides access to the on-premises data center through VPN and Direct Connect connections
    • A VPC can have only one IGW and one VGW attached
    • Route tables determine network traffic routing from the subnet
    • Ability to create a subnet with VPC CIDR block
    • A Network Address Translation (NAT) server provides outbound Internet access for EC2 instances in private subnets
    • Elastic IP addresses are static, persistent public IP addresses
    • Instances launched in the VPC will have a Private IP address and can have a Public or an Elastic IP address associated with it
    • Security Groups and NACLs help define security
    • Flow logs – Capture information about the IP traffic going to and from network interfaces in your VPC
  • Tenancy option for instances
    • shared, by default, allows instances to be launched on shared tenancy
    • dedicated allows instances to be launched on a dedicated hardware
  • Route Tables
    • defines rules, termed as routes, which determine where network traffic from the subnet would be routed
    • Each VPC has a Main Route table and can have multiple custom route tables created
    • Every route table contains a local route that enables communication within a VPC which cannot be modified or deleted
    • Route priority is decided by matching the most specific route in the route table that matches the traffic
  • Subnets
    • map to AZs and do not span across AZs
    • have a CIDR range that is a portion of the whole VPC.
    • CIDR ranges cannot overlap between subnets within the VPC.
    • AWS reserves 5 IP addresses in each subnet – first 4 and last one
    • Each subnet is associated with a route table which define its behavior
      • Public subnets – inbound/outbound Internet connectivity via IGW
      • Private subnets – outbound Internet connectivity via a NAT device or VGW
      • Protected subnets – no outbound connectivity and used for regulated workloads
  • Elastic Network Interface (ENI)
    • a default ENI, eth0, is attached to an instance and cannot be detached; one or more secondary ENIs (eth1-ethn) can be attached and detached
    • has attributes such as a primary private IP address, one or more secondary private IP addresses, public IP address, Elastic IP address, security groups, MAC address and source/destination check flag
    • An ENI in one subnet can be attached to an instance in the same or another subnet, in the same AZ and the same VPC
    • Security group membership of an ENI can be changed
    • with pre-allocated Mac Address can be used for applications with special licensing requirements
  • Security Groups vs NACLs – Network Access Control Lists
    • Stateful vs Stateless
    • At instance level vs At subnet level
    • Supports only Allow rules vs Supports both Allow and Deny rules
    • Evaluated as a Whole vs Evaluated in defined Order
  • Elastic IP
    • is a static IP address designed for dynamic cloud computing.
    • is associated with an AWS account, and not a particular instance
    • can be remapped from one instance to another instance
    • is charged for non-usage, i.e. if it is not associated with any instance or the associated instance is in a stopped state
  • NAT
    • allows internet access to instances in the private subnets.
    • performs the function of both address translation and port address translation (PAT)
    • NAT Instance needs the source/destination check flag to be disabled, as it is not the actual source or destination of the traffic
    • NAT gateway is an AWS managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort
    • NAT devices are not supported for IPv6 traffic
    • NAT Gateway supports private NAT with fixed private IPs.
    • Regional NAT Gateway (announced Nov 2025) automatically expands across Availability Zones based on workload footprint, providing simplified setup, enhanced security, and automatic high availability without manual multi-AZ configuration.
  • Egress-Only Internet Gateways
    • outbound communication over IPv6 from instances in the VPC to the Internet, and prevents the Internet from initiating an IPv6 connection with your instances
    • supports IPv6 traffic only
  • Shared VPCs
    • allows multiple AWS accounts to create their application resources, such as EC2 instances, RDS databases, Redshift clusters, and AWS Lambda functions, into shared, centrally-managed VPCs
  • VPC Encryption Controls (announced Nov 2025)
    • allows enforcing encryption in transit for network traffic within the VPC
    • provides centralized encryption policy enforcement and monitoring capabilities
    • supports monitor and enforce modes to audit and enforce encryption compliance
    • transitioned to paid feature starting March 2026
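
Two of the VPC rules above lend themselves to a worked example: the five reserved addresses per subnet, and route priority by most specific match. The sketch below uses Python's standard `ipaddress` module; the route table contents and the target IDs are made up for illustration.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one


def usable_ips(subnet_cidr: str) -> int:
    """Addresses left for instances after the AWS-reserved ones are removed."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def pick_route(route_table: dict, destination_ip: str) -> str:
    """Return the target of the most specific (longest-prefix) matching route."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    _network, target = max(matches, key=lambda m: m[0].prefixlen)
    return target


# Hypothetical route table for a public subnet in a 10.0.0.0/16 VPC.
routes = {
    "10.0.0.0/16": "local",
    "172.31.0.0/16": "pcx-example",  # made-up peering connection ID
    "0.0.0.0/0": "igw-example",      # made-up internet gateway ID
}

print(usable_ips("10.0.1.0/24"))        # 256 - 5 = 251
print(usable_ips("10.0.2.0/28"))        # 16 - 5 = 11
print(pick_route(routes, "10.0.1.25"))  # local
print(pick_route(routes, "8.8.8.8"))    # igw-example
```

The smallest allowed subnet (/28) therefore leaves only 11 usable addresses, which is worth remembering when sizing subnets for load balancers or scaling groups.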

VPC Peering

  • allows routing of traffic between the peer VPCs using private IP addresses with no IGW or VGW required.
  • No single point of failure and bandwidth bottlenecks
  • supports inter-region VPC peering
  • Limitations
    • IP space or CIDR blocks cannot overlap
    • cannot be transitive
    • supports a one-to-one relationship between two VPCs and has to be explicitly peered.
    • does not support edge-to-edge routing.
    • supports only one connection between any two VPCs
  • Private DNS values cannot be resolved
  • Security groups from peered VPC can now be referred to, however, the VPC should be in the same region.
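
The two peering limitations that come up most often, non-overlapping CIDR blocks and no transitive routing, can be checked with a few lines of Python. This is a minimal sketch; the VPC names and CIDR ranges are hypothetical.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Peering requires the two CIDR blocks not to overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def reachable(peerings: set, src: str, dst: str) -> bool:
    """Only directly peered VPCs can talk; there is no hop through a third VPC."""
    return frozenset((src, dst)) in peerings


# Hypothetical setup: A<->B and B<->C are peered, A<->C is not.
peerings = {frozenset(("vpc-a", "vpc-b")), frozenset(("vpc-b", "vpc-c"))}

print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # True
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # False, ranges overlap
print(reachable(peerings, "vpc-a", "vpc-b"))     # True
print(reachable(peerings, "vpc-a", "vpc-c"))     # False, peering is not transitive
```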

VPC Endpoints

  • enables private connectivity from VPC to supported AWS services and VPC endpoint services powered by PrivateLink
  • does not require a public IP address, access over the Internet, NAT device, a VPN connection, or Direct Connect
  • traffic between VPC & AWS service does not leave the Amazon network
  • are virtual devices.
  • are horizontally scaled, redundant, and highly available VPC components that allow communication between instances in the VPC and services without imposing availability risks or bandwidth constraints on the network traffic.
  • Gateway Endpoints
    • is a gateway that is a target for a specified route in the route table, used for traffic destined to a supported AWS service.
    • only S3 and DynamoDB are currently supported
  • Interface Endpoints or PrivateLink
    • is an elastic network interface with a private IP address that serves as an entry point for traffic destined to a supported service
    • supported services include AWS services, services hosted by other AWS customers and partners in their own VPCs (referred to as endpoint services), and supported AWS Marketplace partner services.
    • Private Links
      • provide fine-grained access control
      • provides a point-to-point integration.
      • supports overlapping CIDR blocks.
      • supports transitive routing
    • Access to VPC Resources over PrivateLink (announced Dec 2024) – allows sharing any VPC resource using AWS RAM and accessing them privately using VPC endpoints, without requiring the resource to sit behind a NLB.

CloudFront

  • provides low latency and high data transfer speeds for the distribution of static, dynamic web, or streaming content to web users.
  • delivers the content through a worldwide network of data centers called Edge Locations or Point of Presence (PoPs)
  • keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
  • dramatically reduces the number of network hops that users’ requests must pass through
  • supports multiple origin server options, like an AWS hosted service, for e.g. S3, EC2, ELB, or an on-premises server, which stores the original, definitive version of the objects
  • single distribution can have multiple origins, and the Path pattern in a cache behavior determines which requests are routed to which origin
  • Web distribution supports static, dynamic web content, on-demand using progressive download & HLS, and live streaming video content
  • RTMP distributions were deprecated and removed on December 31, 2020. Use Web distributions with HTTP-based streaming protocols (HLS, DASH) instead.
  • supports HTTPS using either
    • dedicated IP address, which is expensive as a dedicated IP address is assigned to each CloudFront edge location
    • Server Name Indication (SNI), which is free but supported by modern browsers only with the domain name available in the request header
  • For E2E HTTPS connection,
    • Viewers -> CloudFront needs either a certificate issued by CA or ACM
    • CloudFront -> Origin needs a certificate issued by ACM for ELB and by CA for other origins
  • Security
    • Origin Access Control (OAC) is the recommended method to restrict content from S3 origin to be accessible from CloudFront only. OAC supports SSE-KMS, all HTTP methods, and all AWS Regions.
      • Origin Access Identity (OAI) is the legacy method. OAI creation was deprecated in 2024 and new distributions (as of March 2026) can only use OAC. Existing OAI configurations continue to work but migration to OAC is recommended.
    • supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
    • Signed URLs
      • to restrict access to individual files, for e.g., an installation download for your application.
      • users using a client, for e.g. a custom HTTP client, that doesn’t support cookies
    • Signed Cookies
      • provide access to multiple restricted files, for e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
      • don’t want to change the current URLs
    • integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
  • supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
    • only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
    • does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
  • object removal from the cache
    • would be removed upon expiry (TTL) from the cache, by default 24 hrs
    • can be invalidated explicitly, but has a cost associated, however, might continue to see the old version until it expires from those caches
    • objects can be invalidated only for Web distribution
    • use versioning or change object name, to serve a different version
    • Tag-based cache invalidation (announced May 2026) – allows tagging cached objects via origin response headers or S3 metadata and invalidating them by tag directly through the CloudFront API.
  • supports adding or modifying custom headers before the request is sent to origin which can be used to
    • validate if a user is accessing the content from CDN
    • identifying CDN from which the request was forwarded, in case of multiple CloudFront distributions
    • for viewers not supporting CORS to return the Access-Control-Allow-Origin header for every request
  • supports Partial GET requests using range header to download objects in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
  • supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
  • supports different price classes to include all regions, or only the least expensive regions and other regions without the most expensive regions
  • supports access logs which contain detailed information about every user request for web distributions
  • Edge Compute
    • CloudFront Functions – lightweight JavaScript functions for simple request/response transformations (URL rewrites, header manipulation, redirects) executed at viewer request/response events with sub-millisecond latency
    • Lambda@Edge – more powerful compute for complex processing at origin request/response and viewer request/response events
    • CloudFront KeyValueStore (launched 2023) – a globally distributed, low-latency data store that CloudFront Functions can read at runtime for dynamic routing, A/B testing, feature flags, and geo-routing without redeploying function code
  • CloudFront Flat-Rate Pricing Plans – combine CDN, AWS WAF, DDoS protection, bot management, Route 53 DNS, CloudWatch Logs ingestion, serverless edge compute, and S3 storage credits into a single monthly price

AWS VPN

  • AWS Site-to-Site VPN provides secure IPSec connections from on-premise computers or services to AWS over the Internet
  • is cheap, and quick to set up however it depends on the Internet speed
  • delivers high availability by using two tunnels across multiple Availability Zones within the AWS global network
  • VPN requires a Virtual Private Gateway – VGW and Customer Gateway – CGW for communication
  • VPN connection is terminated on VGW on AWS
  • Only one VGW can be attached to a VPC at a time
  • VGW supports both static and dynamic routing using Border Gateway Protocol (BGP)
  • VGW supports AES-256 and SHA-2 for data encryption and integrity
  • AWS Client VPN is a managed client-based VPN service that enables secure access to AWS resources and resources in the on-premises network.
  • AWS VPN does not allow accessing the Internet through IGW or NAT Gateway, peered VPC resources, or VPC Gateway Endpoints from on-premises.
  • AWS VPN allows accessing the Internet through NAT Instance and VPC Interface Endpoints from on-premises.

Direct Connect

  • is a network service that uses a private dedicated network connection to connect to AWS services.
  • helps reduce costs (long term), increases bandwidth, and provides a more consistent network experience than internet-based connections.
  • supports Dedicated and Hosted connections
    • Dedicated connection is made through a 1 Gbps, 10 Gbps, or 100 Gbps Ethernet port dedicated to a single customer.
    • Hosted connections are sourced from an AWS Direct Connect Partner that has a network link between themselves and AWS.
  • provides Virtual Interfaces
    • Private VIF to access instances within a VPC via VGW
    • Public VIF to access non VPC services
    • Transit VIF to access one or more Amazon VPC Transit Gateways associated with Direct Connect gateways, enabling connectivity to multiple VPCs through a single VIF
  • requires time to set up, probably months, and should not be considered as an option if a quick turnaround is needed
  • does not provide redundancy, use either second direct connection or IPSec VPN connection
  • Virtual Private Gateway is on the AWS side and Customer Gateway is on the Customer side
  • route propagation is enabled on VGW and not on CGW
  • A link aggregation group (LAG) is a logical interface that uses the link aggregation control protocol (LACP) to aggregate multiple dedicated connections at a single AWS Direct Connect endpoint and treat them as a single, managed connection
  • VIF Rate Limiters (announced June 2026) on dedicated connections help prevent network congestion caused by unexpected traffic spikes on a VIF that could consume all available bandwidth impacting other VIFs on the same connection.
  • Direct Connect vs VPN IPSec
    • Expensive to Setup and Takes time vs Cheap & Immediate
    • Dedicated private connections vs Internet
    • Reduced data transfer cost vs Internet data transfer cost
    • Consistent performance vs Internet inherent variability
    • Do not provide Redundancy vs Provides Redundancy

Route 53

  • provides highly available and scalable DNS, Domain Registration Service, and health-checking web services
  • Reliable and cost-effective way to route end users to Internet applications
  • Supports multi-region and backup architectures for High availability. ELB is limited to region and does not support multi-region HA architecture.
  • supports private Intranet facing DNS service
  • internal resource record sets only work for requests originating from within the VPC and currently cannot extend to on-premises
  • Global propagation of any changes made to the DNS records within ~ 1min
  • supports Alias resource record sets, which are a Route 53 extension to DNS.
    • It’s similar to a CNAME resource record set, but can be created both for the root domain – zone apex, for e.g. example.com, and for subdomains, for e.g. www.example.com.
    • supports ELB load balancers, CloudFront distributions, Elastic Beanstalk environments, API Gateways, VPC interface endpoints, and S3 buckets that are configured as websites.
  • CNAME resource record sets can be created only for subdomains and cannot be mapped to the zone apex record
  • supports Private DNS to provide an authoritative DNS within the VPCs without exposing the DNS records (including the name of the resource and its IP address(es)) to the Internet.
  • Split-view (Split-horizon) DNS enables mapping the same domain publicly and privately. Requests are routed as per the origin.
  • Routing policy
    • Simple routing – simple round-robin policy
    • Weighted routing – assign weights to resource records sets to specify the proportion for e.g. 80%:20%
    • Latency based routing – helps improve global applications as requests are served from the location with minimal latency; it is based on latency alone and cannot guarantee that users from the same geography will be served from the same location, for e.g. for compliance reasons
    • Geolocation routing – specify geographic locations by continent, country, or state (states limited to the US); depends on the accuracy of IP-to-location mapping
    • Geoproximity routing policy – Use to route traffic based on the location of the resources and, optionally, shift traffic from resources in one location to resources in another.
    • Multivalue answer routing policy – Use to respond to DNS queries with up to eight healthy records selected at random.
    • Failover routing – failover to a backup site if the primary site fails and becomes unreachable
    • IP-based routing – route traffic based on the IP address of the client making the DNS query
  • Weighted, Latency and Geolocation can be used for Active-Active while Failover routing can be used for Active-Passive multi-region architecture
  • Traffic Flow is an easy-to-use and cost-effective global traffic management service. Traffic Flow supports versioning and helps create policies that route traffic based on the constraints you care most about, including latency, endpoint health, load, geoproximity, and geography.
  • Route 53 Resolver is a regional DNS service that helps with hybrid DNS
    • Inbound Endpoints are used to resolve DNS queries from an on-premises network to AWS
    • Outbound Endpoints are used to resolve DNS queries from AWS to an on-premises network
    • Resolver endpoints now support DNS delegation for private hosted zones (June 2025)
  • Route 53 Profiles – enables sharing DNS configurations (private hosted zone associations, Resolver rules, and Resolver DNS Firewall rule group associations) across VPCs and accounts using AWS RAM
  • Accelerated Recovery (announced Nov 2025) – provides a 60-minute recovery time objective (RTO) for regaining the ability to make DNS changes to public hosted zones during regional disruptions in US East (N. Virginia)
  • PrivateLink Support (announced Nov 2025) – allows making changes to DNS infrastructure (hosted zones, records, health checks) without using the public internet
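
The weighted routing policy listed above comes down to simple proportion arithmetic. The following is a minimal sketch of that arithmetic only, not Route 53's implementation; the record names and the `traffic_share` and `pick_record` helpers are hypothetical and exist purely for illustration.

```python
import random
from collections import Counter


def traffic_share(weights):
    """Return the fraction of queries each record is expected to receive."""
    total = sum(weights.values())
    if total == 0:
        # This sketch simply rejects an all-zero weight set.
        raise ValueError("at least one record needs a non-zero weight")
    return {name: weight / total for name, weight in weights.items()}


def pick_record(weights, rng):
    """Pick one record at random, proportionally to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical record names with the 80:20 split mentioned in the text.
weights = {"blue.example.com": 80, "green.example.com": 20}
shares = traffic_share(weights)

# Simulate 10,000 DNS answers with a fixed seed for repeatability.
rng = random.Random(42)
sample = Counter(pick_record(weights, rng) for _ in range(10_000))
```

Shifting the weights gradually (for example 90:10, then 50:50, then 0:100) is a common way to use this policy for blue/green style cutovers.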

AWS Global Accelerator

  • is a networking service that helps you improve the availability and performance of the applications to global users.
  • utilizes the Amazon global backbone network, improving the performance of the applications by lowering first-byte latency, and jitter, and increasing throughput as compared to the public internet.
  • provides two static IP addresses serviced by independent network zones that provide a fixed entry point to the applications and eliminate the complexity of managing specific IP addresses for different AWS Regions and AZs.
  • always routes user traffic to the optimal endpoint based on performance, reacting instantly to changes in application health, the user’s location, and configured policies
  • improves performance for a wide range of applications over TCP or UDP by proxying packets at the edge to applications running in one or more AWS Regions.
  • is a good fit for non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP, as well as for HTTP use cases that specifically require static IP addresses or deterministic, fast regional failover.
  • integrates with AWS Shield for DDoS protection
  • uses a global network of 130+ Points of Presence in 95+ cities across 53+ countries
  • supports dual-stack Network Load Balancers as endpoints
  • supports endpoints in 33 AWS Regions (as of 2025)
  • integrates with AWS Load Balancer Controller for Kubernetes (announced 2025)

Transit Gateway – TGW

  • is a highly available and scalable service to consolidate the AWS VPC routing configuration for a region with a hub-and-spoke architecture.
  • acts as a Regional virtual router and is a network transit hub that can be used to interconnect VPCs and on-premises networks.
  • traffic always stays on the global AWS backbone, data is automatically encrypted, and never traverses the public internet, thereby reducing threat vectors, such as common exploits and DDoS attacks.
  • is a Regional resource and can connect VPCs within the same AWS Region.
  • TGWs across the same or different regions can peer with each other.
  • provides simpler VPC-to-VPC communication management over VPC Peering with a large number of VPCs.
  • scales elastically based on the volume of network traffic.
  • supports security group referencing (announced Sept 2024) – allows creating inbound security rules that reference security groups defined in other VPCs attached to the same Transit Gateway within the same Region.
  • supports per-AZ metrics delivered to CloudWatch and Path MTU Discovery (PMTUD) for both IPv4 and IPv6 (announced Nov 2024).
  • supports Transit Gateway Flow Logs for monitoring and logging network traffic between transit gateways.
  • supports Flexible Cost Allocation (announced Nov 2025) – provides versatile cost allocation options through a central metering policy beyond the default sender-pay model.
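
The claim that Transit Gateway is simpler to manage than VPC Peering for a large number of VPCs can be made concrete with a little arithmetic. The sketch below assumes every VPC must reach every other VPC; because VPC peering is not transitive, that requires a full mesh, whereas a hub-and-spoke design needs one attachment per VPC. The function names are illustrative.

```python
def full_mesh_peering_connections(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC connects to one Transit Gateway hub."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count


# Compare both designs for a few fleet sizes.
comparison = {
    n: (full_mesh_peering_connections(n), transit_gateway_attachments(n))
    for n in (3, 10, 50)
}
```

With 10 VPCs the mesh already needs 45 peering connections against 10 attachments, and the gap grows quadratically from there.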

Amazon VPC Lattice

  • is a fully managed application networking service that connects, monitors, and secures communications between services and resources across VPCs and accounts.
  • simplifies service-to-service connectivity without requiring VPC peering, Transit Gateway, or PrivateLink NLBs.
  • automatically manages network connectivity and application-layer routing between services across different VPCs and AWS accounts.
  • supports connectivity to TCP resources, such as databases, domain names, and IP addresses across VPCs and accounts.
  • integrates with AWS IAM for service-to-service authentication and authorization using Auth policies.
  • removes the NLB requirement that PrivateLink imposes on providers and supports cross-VPC/cross-account connectivity without CIDR coordination.
  • terminates TLS at the data plane so callers do not need to manage certificates.
  • provides built-in observability with access logs, connection logs, and traffic metrics.
  • Key concepts:
    • Service Network – a logical boundary for a collection of services that can communicate with each other
    • Service – represents an application unit that is independently deployable
    • Target Groups – collection of resources (instances, IPs, Lambda, ALB) for routing
    • Resource Configurations – define TCP resources (databases, IPs, domain names) accessible through VPC Lattice
  • Use cases:
    • Microservices connectivity across multiple VPCs/accounts
    • Secure service-to-service communication with zero trust
    • Alternative to VPC Peering and Transit Gateway for application-layer connectivity
    • Replacement for AWS App Mesh (end of support on September 30, 2026)

Amazon VPC IP Address Manager (IPAM)

  • is a VPC feature that allows you to plan, track, and monitor IP addresses for AWS workloads.
  • organizes IP addresses by routing and security requirements while automating allocation to VPCs, replacing manual spreadsheet-based tracking.
  • tracks AWS accounts and VPCs, eliminating IP bookkeeping overhead.
  • supports management at both VPC and subnet CIDR levels.
  • integrates with AWS Organizations for cross-account IP address management.
  • supports provisioning Amazon-provided contiguous IPv4 blocks into publicly scoped regional pools for use with EIPs, NLBs, and NAT Gateways.
  • Public IP Insights – free feature that simplifies monitoring, analysis, and auditing of public IPv4 addresses.
  • IPAM Policies – define public IPv4 allocation strategies and automate prefix lists.
  • integrates with ALB for predictable IP address blocks for internet-facing ALBs (March 2025).
  • IPAM Advanced Tier – includes Infoblox integration (Nov 2025) for managing AWS IP addresses through existing Infoblox workflows.
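
The kind of bookkeeping that IPAM automates (carving non-overlapping CIDRs out of a pool and checking for conflicts) can be sketched with Python's standard `ipaddress` module. This is a local illustration of the idea, not the IPAM API; the pool range and the `carve_pool` and `overlaps_any` helpers are assumptions made for the example.

```python
import ipaddress


def carve_pool(pool_cidr: str, new_prefix: int, count: int):
    """Allocate `count` equally sized, non-overlapping CIDRs from a pool."""
    pool = ipaddress.ip_network(pool_cidr)
    subnets = pool.subnets(new_prefix=new_prefix)
    allocations = []
    for _ in range(count):
        try:
            allocations.append(next(subnets))
        except StopIteration:
            raise ValueError("pool exhausted") from None
    return allocations


def overlaps_any(candidate: str, existing) -> bool:
    """Return True if the candidate CIDR overlaps any existing allocation."""
    net = ipaddress.ip_network(candidate)
    return any(net.overlaps(other) for other in existing)


# Hypothetical top-level pool split into three /20 VPC ranges.
vpc_cidrs = carve_pool("10.0.0.0/16", 20, 3)
```

A manually proposed range can then be checked with `overlaps_any` before it is handed to a team, which is the spreadsheet step IPAM is meant to replace.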

AWS Network Firewall

  • is a managed, stateful network firewall and intrusion detection and prevention service for all Amazon VPCs.
  • scales automatically with network traffic, requiring no infrastructure management.
  • provides Layer 7 firewall capabilities with deep packet inspection.
  • supports flexible rules engine for fine-grained control of VPC network traffic.
  • provides active threat defense using AWS managed rules to block evasive C2 channels, malicious URLs, and other threat vectors.
  • supports Suricata-compatible IPS rules for known bad signatures and traffic patterns.
  • includes Network Firewall Proxy for granular security controls to inspect and filter VPC outbound connections, preventing data exfiltration and malware intrusion.
  • integrates with AWS Firewall Manager for centralized policy management across accounts.
  • can be combined with VPC Lattice for comprehensive security (VPC Lattice for HTTP/S with identity-based controls, Network Firewall for other traffic types).

AWS Cloud WAN

  • is a managed WAN service that provides a central dashboard to connect and manage branch offices, data centers, VPN connections, SD-WAN, VPCs, and Transit Gateways.
  • uses network policies to create a global network spanning multiple locations and networks, removing the need for different technologies.
  • provides a single console and set of APIs to manage networks across AWS Regions.
  • supports direct Direct Connect gateway attachments without requiring an intermediate Transit Gateway (announced Nov 2024).
  • supports Routing Policy for advanced traffic control (announced Nov 2025) – enables controlled routing environments, minimizing route reachability blast radius.
  • supports Service Insertion for inspection and security appliance integration.
  • supports PMTUD for both IPv4 and IPv6 (announced Nov 2024).
  • supports AWS PrivateLink and IPv6 for management endpoint connectivity (announced March 2025).
  • available in AWS GovCloud (US) Regions.

AWS Verified Access

  • provides secure access to corporate applications and resources without requiring a VPN.
  • implements zero trust principles by evaluating each access request based on user identity and device security posture rather than network location.
  • uses the Cedar policy language for defining fine-grained access policies.
  • supports secure access to resources over non-HTTP(S) protocols (announced Feb 2025) – enables VPN-less access to TCP-based resources like SSH, RDP, and databases.
  • continuously monitors active connections and terminates connections when security requirements aren’t met.
  • integrates with third-party identity providers and device management solutions.
  • can be used with PrivateLink-backed services to provide authorized internet-based access while maintaining security boundaries.

AWS Management Tools Cheat Sheet

AWS Organizations

  • AWS Organizations is an account management service that enables consolidating multiple AWS accounts into an organization that can be created and centrally managed.
  • AWS Organizations enables you to
    • Automate AWS account creation and management, and provision resources with AWS CloudFormation Stacksets
    • Maintain a secure environment with policies and management of AWS security services
    • Govern access to AWS services, resources, and regions
    • Centrally manage policies across multiple AWS accounts
    • Audit your environment for compliance
    • View and manage costs with consolidated billing
    • Configure AWS services across multiple accounts

CloudFormation

  • gives developers and systems administrators an easy way to create and manage a collection of related AWS resources
  • Resources can be updated, deleted, and modified in an orderly, controlled and predictable fashion, in effect applying version control to the AWS infrastructure in the same way as is done for software code
  • CloudFormation Template is an architectural diagram, in JSON or YAML format, and Stack is the end result of that diagram, which is actually provisioned
  • template can be used to set up the resources consistently and repeatedly across multiple regions and consists of
    • List of AWS resources and their configuration values
    • An optional template file format version number
    • An optional list of template parameters (input values supplied at stack creation time)
    • An optional list of output values like public IP address using the Fn::GetAtt function
    • An optional list of data tables used to lookup static configuration values for e.g., AMI names per AZ
  • supports Chef & Puppet Integration to deploy and configure right down to the application layer
  • supports Bootstrap scripts to install packages, files, and services on the EC2 instances by simply describing them in the CF template
  • automatic rollback on error feature is enabled, by default, which will cause all the AWS resources that CF created successfully for a stack up to the point where an error occurred to be deleted
  • provides a WaitCondition resource to block the creation of other resources until a completion signal is received from an external source
  • allows DeletionPolicy attribute to be defined for resources in the template
    • retain to preserve resources like S3 even after stack deletion
    • snapshot to backup resources like RDS after stack deletion
  • DependsOn attribute to specify that the creation of a specific resource follows another
  • Service role is an IAM role that allows AWS CloudFormation to make calls to resources in a stack on the user’s behalf
  • Nested stacks separate out reusable, common components into dedicated templates, which can be mixed and matched and then combined into a single, unified stack
  • Change Sets present a summary or preview of the proposed changes that CloudFormation will make when a stack is updated
  • Drift detection enables you to detect whether a stack’s actual configuration differs, or has drifted, from its expected configuration.
  • Termination protection helps prevent a stack from being accidentally deleted.
  • Stack policy can prevent stack resources from being unintentionally updated or deleted during a stack update.
  • StackSets extends the functionality of stacks by enabling you to create, update, or delete stacks across multiple accounts and Regions with a single operation.
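
To make the template anatomy described above more tangible, here is a minimal sketch that builds a small template as a Python dictionary and serializes it to JSON. It shows the format version, a parameter, a resource with a `DeletionPolicy`, and an output that uses `Fn::GetAtt`. The logical names (`BucketName`, `LogBucket`, `LogBucketArn`) are made up for the example.

```python
import json

template = {
    # Optional template file format version.
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative stack: one S3 bucket retained on stack deletion",
    # Optional parameters: input values supplied at stack creation time.
    "Parameters": {
        "BucketName": {
            "Type": "String",
            "Description": "Name of the bucket to create",
        }
    },
    # Resources and their configuration values.
    "Resources": {
        "LogBucket": {
            "Type": "AWS::S3::Bucket",
            # Keep the bucket even after the stack is deleted.
            "DeletionPolicy": "Retain",
            "Properties": {"BucketName": {"Ref": "BucketName"}},
        }
    },
    # Optional outputs, here read from the resource with Fn::GetAtt.
    "Outputs": {
        "LogBucketArn": {"Value": {"Fn::GetAtt": ["LogBucket", "Arn"]}}
    },
}

# JSON text that could be saved to a file and used as a template body.
template_body = json.dumps(template, indent=2)
```

The same structure can be written in YAML; only the serialization differs.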

Elastic Beanstalk

  • makes it easier for developers to quickly deploy and manage applications in the AWS cloud.
  • automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling and application health monitoring
  • CloudFormation supports Elastic Beanstalk
  • provisions resources to support
    • a web application that handles HTTP(S) requests or
    • a web application that handles background-processing (worker) tasks
  • supports Out Of the Box
    • Apache Tomcat for Java applications
    • Apache HTTP Server for PHP applications
    • Apache HTTP server for Python applications
    • Nginx or Apache HTTP Server for Node.js applications
    • Passenger for Ruby applications
    • Microsoft IIS 7.5 for .NET applications
    • Single and Multi Container Docker
  • supports custom AMI to be used
  • is designed to support multiple running environments such as one for Dev, QA, Pre-Prod and Production.
  • supports versioning and stores and tracks application versions over time allowing easy rollback to prior version
  • can provision RDS DB instance and connectivity information is exposed to the application by environment variables, but is NOT recommended for production setup as the RDS is tied up with the Elastic Beanstalk lifecycle and if deleted, the RDS instance would be deleted as well
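
When Elastic Beanstalk provisions the RDS instance, the connection details reach the application as environment variables such as `RDS_HOSTNAME`, `RDS_PORT`, `RDS_DB_NAME`, `RDS_USERNAME`, and `RDS_PASSWORD`. The sketch below shows one way an application might read them; the `rds_settings` helper and the shape of the returned dictionary are assumptions for illustration.

```python
import os

RDS_VARS = ("RDS_HOSTNAME", "RDS_PORT", "RDS_DB_NAME", "RDS_USERNAME", "RDS_PASSWORD")


def rds_settings(environ=os.environ):
    """Collect database connection settings from the environment."""
    missing = [name for name in RDS_VARS if name not in environ]
    if missing:
        raise KeyError(f"missing RDS environment variables: {', '.join(missing)}")
    return {
        "host": environ["RDS_HOSTNAME"],
        "port": int(environ["RDS_PORT"]),
        "dbname": environ["RDS_DB_NAME"],
        "user": environ["RDS_USERNAME"],
        "password": environ["RDS_PASSWORD"],
    }
```

Reading settings this way also keeps working if the database is later decoupled from the environment, as long as the same variable names are set manually.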

OpsWorks

  • is a configuration management service that helps to configure and operate applications in a cloud enterprise by using Chef
  • helps deploy and monitor applications in stacks with multiple layers
  • supports preconfigured layers for Applications, Databases, Load Balancers, Caching
  • OpsWorks Stacks provides a set of lifecycle events – Setup, Configure, Deploy, Undeploy, and Shutdown – each of which automatically runs a specified set of recipes at the appropriate time on each instance
  • Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps, running scripts, and so on
  • OpsWorks Stacks runs the recipes for each layer, even if the instance belongs to multiple layers
  • supports Auto Healing and Auto Scaling to monitor instance health, and provision new instances

CloudWatch

  • allows monitoring of AWS resources and applications in real time, collecting and tracking preconfigured or custom metrics, and configuring alarms to send notifications or make resource changes based on defined rules
  • does not aggregate data across regions
  • stores the log data indefinitely, and the retention can be changed for each log group at any time
  • alarm history is stored for only 14 days
  • can be used as an alternative to S3 to store logs with the ability to configure Alarms and generate metrics, however logs cannot be made public
  • Alarms exist only in the created region and the Alarm actions must reside in the same region as well

CloudTrail

  • records API calls made on the AWS account from the AWS Management Console, SDKs, CLI, and higher-level AWS services
  • supports many AWS services and tracks who did what, from where, and when
  • can be enabled on a per-region basis; a region's trail can include global services (like IAM, STS etc.) and applies to all the supported services within that region
  • log files from different regions can be sent to the same S3 bucket
  • can be integrated with SNS to notify when log files become available, and with a CloudWatch Logs log group to trigger notifications when specific API events occur
  • call history enables security analysis, resource change tracking, trouble shooting and compliance auditing

AWS Identity & Security Services Cheat Sheet

AWS Identity and Security Services

📌 Last Updated: June 2026 — Includes AWS Security Hub reimagined (re:Invent 2025), AWS Security Agent (GA March 2026), mandatory MFA enforcement for all root users, GuardDuty Extended Threat Detection, and IAM Identity Center multi-Region replication.

AWS Identity Services Cheat Sheet

AWS Security Services Cheat Sheet

AWS Identity & Security Services Overview

AWS Security, Identity, and Compliance services provide a comprehensive set of tools to help protect data, accounts, and workloads. These services are organized into the following categories:

Identity and Access Management

  • AWS Identity and Access Management (IAM) – Securely manage access to AWS services and resources using users, groups, roles, and policies
  • AWS IAM Identity Center (formerly AWS SSO) – Centrally manage SSO access to multiple AWS accounts and business applications
    • Now supports multi-Region replication (Feb 2026) for high availability
    • Supports IPv6 dual-stack endpoints
  • Amazon Cognito – Customer identity and access management (CIAM) for web and mobile apps
    • Now supports passwordless authentication with passkeys (FIDO2/WebAuthn), email OTP, and SMS OTP (Nov 2024)
    • New feature tiers: Essentials and Plus (Nov 2024)
    • Managed Login for pre-built authentication UIs
  • Amazon Verified Permissions – Scalable, fine-grained authorization using Cedar policy language for custom applications
  • AWS Resource Access Manager (RAM) – Securely share AWS resources across accounts and within AWS Organizations
  • AWS Directory Service – Managed Microsoft Active Directory in the AWS Cloud

Detection and Response

  • Amazon GuardDuty – Intelligent threat detection that continuously monitors for malicious activity
    • Extended Threat Detection (re:Invent 2024) – AI/ML-powered attack sequence identification across multiple data sources
    • Now covers EC2, ECS, EKS, S3, and IAM attack sequences
    • Custom entity lists for domain-based threat intelligence (Sept 2025)
  • Amazon Detective – Analyze, investigate, and identify root cause of security findings using ML and graph theory
  • Amazon Inspector – Automated vulnerability management for EC2 instances and container images in ECR
  • AWS Security Hub – Cloud security posture management (CSPM) and unified security operations
    • Reimagined at re:Invent 2025 – Unifies GuardDuty, Inspector, and other services into a single experience
    • Near real-time analytics and risk prioritization (GA Dec 2025)
    • Extended Plan (GA Feb 2026) – Full-stack enterprise security with 21 curated partner solutions across 9 categories
    • Expanding to multicloud environments
  • AWS Security Agent (GA March 2026) – AI-powered frontier agent for proactive application security
    • Automated security reviews tailored to organizational requirements
    • On-demand context-aware penetration testing
    • Full repository code scanning (Preview May 2026)
    • Operates like a human penetration tester – identifies, exploits, and validates vulnerabilities

Data Protection

Network and Application Protection

  • AWS WAF – Web application firewall to protect against common web exploits and bots
  • AWS Shield – Managed DDoS protection (Standard and Advanced tiers)
  • AWS Network Firewall – Managed network firewall for VPC with stateful inspection and IPS
  • AWS Firewall Manager – Centrally configure and manage firewall rules across accounts in AWS Organizations

Security Data Management and Compliance

  • Amazon Security Lake – Centralize security data from AWS, SaaS, on-premises using OCSF standard
    • Achieved FedRAMP High and Moderate authorization (April 2025)
  • AWS Audit Manager – Continuously audit AWS usage for risk and compliance assessment
  • AWS Artifact – On-demand access to AWS security and compliance reports

Key Updates (2024-2026)

  • MFA Enforcement (2024-2025) – AWS now mandates MFA for all root users across all account types. Prevents over 99% of password-related attacks.
  • AWS Security Hub Reimagined (re:Invent 2025) – Completely redesigned to unify security services into a single experience with near real-time analytics and AI-driven risk prioritization.
  • AWS Security Agent (GA March 2026) – First AI-powered frontier agent for autonomous application security testing and code scanning.
  • GuardDuty Extended Threat Detection (re:Invent 2024) – AI/ML attack sequence identification now covers EC2, ECS, EKS workloads.
  • IAM Identity Center Multi-Region (Feb 2026) – Replicate identity center configuration across multiple AWS Regions for high availability.
  • Amazon Cognito Passwordless (Nov 2024) – Native passkey support with FIDO2/WebAuthn, email OTP, and SMS OTP authentication.
  • Centralized Root Access Management (Nov 2024) – Centrally manage root credentials and perform privileged tasks across AWS Organizations member accounts.
  • Agentic AI Security Framework (2025) – New Agentic AI Security Scoping Matrix for securing autonomous AI systems.

AWS Certification Relevance

  • Solutions Architect (Associate/Professional) – IAM, VPC security, encryption, Security Hub, GuardDuty
  • Security Specialty – All services in depth, including Security Lake, Detective, Macie, Inspector
  • SysOps Administrator – Security Hub, Config, GuardDuty, IAM best practices
  • Developer Associate – Cognito, IAM roles, KMS, Secrets Manager
  • DevOps Professional – Security automation, Inspector, Security Hub integrations

AWS Elastic Transcoder – Certification

AWS Elastic Transcoder

  • Amazon Elastic Transcoder is a highly scalable, easy-to-use and cost-effective way for developers and businesses to convert (or “transcode”) video files from their source format into versions that will play back on multiple devices like smartphones, tablets and PCs.
  • Elastic Transcoder is for any customer with media assets stored in S3 for e.g. developers creating apps or websites that publish user-generated content, enterprises and educational establishments converting training and communication videos, and content owners and broadcasters needing to convert media assets into web-friendly formats.
  • Elastic Transcoder features
    • can be used to convert files from different media formats into H.264/AAC/MP4 files at different resolutions, bitrates, and frame rates, and set up transcoding pipelines to transcode files in parallel.
    • can be configured to overlay up to four graphics, known as watermarks, over a video during transcoding
    • can be configured to transcode captions, or subtitles, from one format to another and supports embedded and sidebar caption types
    • provides clip stitching ability to stitch together parts, or clips, from multiple input files to create a single output
    • can be configured to create Thumbnails
  • Elastic Transcoder is integrated with CloudTrail, an AWS service that captures information about every request that is sent to the Elastic Transcoder API by your AWS account, including your IAM users

Elastic Transcoder Components

  • Presets
    • are templates that contain most of the settings for transcoding media files from one format to another.
    • Elastic Transcoder includes some default presets for common formats and ability to create customized presets
  • Jobs
    • do the work of transcoding and convert a file into up to 30 formats.
    • takes the input file to be transcoded, names of the transcoded files and several other settings as input
    • For each transcoded format a preset needs to be specified
  • Pipelines
    • are queues that manage the transcoding jobs.
    • Elastic Transcoder processes the jobs in the order they are added, transcoding each one into its requested format or formats.
    • can be paused to temporarily stop processing jobs
  • Notifications
    • help keep you apprised of the status of a job, i.e. when it has started, completed, or encountered a warning or error
    • eliminate the need for polling to determine when a job has finished and can be configured during pipeline creation
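
The relationship between presets, jobs, and pipelines can be pictured with a toy in-memory model. This is not the Elastic Transcoder API; the `Job` and `Pipeline` classes, their fields, and the preset names are hypothetical, and the only rules encoded are the ones stated above: each output needs a preset, a job has at most 30 outputs, and a pipeline is a queue that processes jobs in order and can be paused.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_OUTPUTS_PER_JOB = 30  # limit quoted in the text above


@dataclass
class Job:
    input_key: str
    outputs: dict  # output file name -> preset name

    def __post_init__(self):
        if not self.outputs:
            raise ValueError("a job needs at least one output")
        if len(self.outputs) > MAX_OUTPUTS_PER_JOB:
            raise ValueError("a job can produce at most 30 outputs")
        if any(not preset for preset in self.outputs.values()):
            raise ValueError("every output needs a preset")


@dataclass
class Pipeline:
    name: str
    paused: bool = False
    _queue: deque = field(default_factory=deque, init=False, repr=False)

    def submit(self, job: Job) -> None:
        """Add a job to the end of the queue."""
        self._queue.append(job)

    def process_next(self):
        """Return the oldest queued job, or None if paused or empty."""
        if self.paused or not self._queue:
            return None
        return self._queue.popleft()
```

Submitting two jobs and calling `process_next()` returns them in submission order, while setting `paused = True` holds the remaining jobs in the queue until it is cleared.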

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Your website is serving on-demand training videos to your workforce. Videos are uploaded monthly in high resolution MP4 format. Your workforce is distributed globally, often on the move, and using company-provided tablets that require the HTTP Live Streaming (HLS) protocol to watch a video. Your company has no video transcoding expertise and, if required, you might need to pay for a consultant. How do you implement the most cost-efficient architecture without compromising high availability and quality of video delivery?
    1. Elastic Transcoder to transcode original high-resolution MP4 videos to HLS. S3 to host videos with Lifecycle Management to archive original files to Glacier after a few days. CloudFront to serve HLS transcoded videos from S3
    2. A video transcoding pipeline running on EC2 using SQS to distribute tasks and Auto Scaling to adjust the number of nodes depending on the length of the queue. S3 to host videos with Lifecycle Management to archive all files to Glacier after a few days. CloudFront to serve HLS transcoded videos from Glacier
    3. Elastic Transcoder to transcode original high-resolution MP4 videos to HLS. EBS volumes to host videos and EBS snapshots to incrementally backup original files after a few days. CloudFront to serve HLS transcoded videos from EC2.
    4. A video transcoding pipeline running on EC2 using SQS to distribute tasks and Auto Scaling to adjust the number of nodes depending on the length of the queue. EBS volumes to host videos and EBS snapshots to incrementally backup original files after a few days. CloudFront to serve HLS transcoded videos from EC2

AWS CloudSearch – Certification

AWS CloudSearch

  • CloudSearch is a fully-managed, full-featured search service in the AWS Cloud that makes it easy to set up, manage, and scale a search solution
  • CloudSearch
    • automatically provisions the required resources
    • deploys a highly tuned search index
    • easy configuration and can be up & running in less than one hour
    • search and ability to upload searchable data
    • automatically scales for data and traffic
    • self-healing clusters, and
    • high availability with Multi-AZ
  • CloudSearch uses Apache Solr as the underlying text search engine and
    • can be used to index and search both structured and unstructured data.
    • content can come from multiple sources and can include database fields along with files in a variety of formats, web pages, and so on.
    • supports indexing features like algorithmic stemming, dictionary stemming, stopword dictionary
    • can support customizable result ranking i.e. relevancy
    • supports search features for text search, different query types (range, boolean etc), sorting, facets for filtering, grouping etc
    • supports enhanced features for auto suggestions, highlighting, spatial search, fuzzy search etc
  • CloudSearch supports Multi-AZ option and it deploys additional instances in a second AZ in the same region.
  • CloudSearch can offer significantly lower total cost of ownership compared to operating and managing your own search environment

CloudSearch Search Domains, Data & Indexing

CloudSearch Architecture

  • Search domain is a data container and a set of services that make the data searchable
    • Document service that allows data uploading to domain for indexing
    • Search service that enables search requests against the indexed data
    • Configuration service for controlling the domain’s behavior (including relevance ranking)
  • Search domain can’t be automatically migrated from one region to another. New domain in the target region needs to be created, configured and data uploaded, and then the original domain deleted
  • Indexed data to be made searchable
    • can be submitted through a REST based web service url
    • has to be in JSON or XML format
    • is represented as a document with a unique document ID and multiple fields, which are either searched on or simply retrieved
  • CloudSearch generates a search index from the document data according to the index fields configured for the domain
  • Data updates can be submitted to add, update, and delete documents
  • Data can be uploaded over a secure, encrypted HTTPS (SSL/TLS) connection
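
A JSON document batch of the kind described above is a list of operations, each with a `type` (`add` or `delete`), a document `id`, and, for additions, a `fields` object. The following sketch only builds such a batch locally and does not upload it; the document IDs, field names, and helper functions are made up for the example.

```python
import json


def add_op(doc_id: str, fields: dict) -> dict:
    """Operation that adds a document or replaces one with the same ID."""
    return {"type": "add", "id": doc_id, "fields": fields}


def delete_op(doc_id: str) -> dict:
    """Operation that removes a document from the index."""
    return {"type": "delete", "id": doc_id}


# Hypothetical documents for a newspaper archive.
batch = [
    add_op("page_1901_03_14_p1", {"title": "Front page", "year": 1901}),
    add_op("page_1901_03_14_p2", {"title": "Local news", "year": 1901}),
    delete_op("page_1899_07_02_p4"),
]

# JSON payload that would be sent to the domain's document service.
payload = json.dumps(batch)
```

Re-submitting an `add` with an existing ID is how an update is expressed, which is why a single batch can carry adds, updates, and deletes together.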

CloudSearch Auto Scaling

CloudSearch Scaling

  • Search domains scale in two dimensions: data and traffic
  • A search instance is a single search engine in the cloud that indexes documents and responds to search requests with a finite amount of RAM and CPU resources for indexing data and processing requests.
  • Search domain can have one or more search partitions, portion of the data which fits on a single search instance, and the number of search partitions can change as the documents are indexed
  • CloudSearch can determine the size and number of search instances required to deliver low latency, high throughput search performance
  • When a search domain is created, a single instance is deployed
  • CloudSearch automatically scales the domain by adding instances as the volume of data or traffic increases
  • Scaling for data
    • CloudSearch handles scaling for data by
      • Vertical scaling by increasing the size of the instance, when the amount of data exceeds a single search instance
      • Horizontal scaling using search partitions, when the amount of data exceeds the capacity of the largest search instance type
    • Number of search instances required to hold the index partitions is sometimes referred to as the domain’s width.
    • CloudSearch reduces the number of partitions and size of search instances if the amount of data reduces
  • Scaling for traffic
    • CloudSearch handles Scaling for traffic by
      • Vertical scaling by increasing the size of the instance, when the amount of traffic exceeds a single search instance
      • Horizontal scaling by deploying a duplicate search instance to provide additional processing power i.e. the complete number of partitions are duplicated
    • CloudSearch reduces the number of duplicate search instances and the size of search instances if the traffic reduces
    • Number of duplicate search instances is sometimes referred to as the domain’s depth.
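
The two scaling dimensions combine multiplicatively: the width is the number of partitions needed to hold the index, the depth is the number of copies of each partition, and the instance count is their product. The sketch below works through that arithmetic; the capacity figures and the `search_instances` helper are hypothetical and are not CloudSearch limits.

```python
import math


def search_instances(index_size_gb: float, partition_capacity_gb: float, copies: int):
    """Return (width, depth, total instances) for a search domain.

    width  = partitions needed to hold the index
    depth  = copies of each partition serving traffic
    """
    if partition_capacity_gb <= 0 or copies < 1:
        raise ValueError("capacity must be positive and copies at least 1")
    width = max(1, math.ceil(index_size_gb / partition_capacity_gb))
    depth = copies
    return width, depth, width * depth


# Hypothetical domain: 250 GB index, 100 GB per partition, 2 copies for traffic.
width, depth, total = search_instances(250, 100, 2)
```

Here the index needs 3 partitions, and duplicating the complete set once more for traffic gives 6 instances in total.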

CloudSearch Search Features

  • CloudSearch provides features to index and search both structured data and plain text, as well as unstructured data like PDF and Word documents
  • CloudSearch provides near real-time indexing for document updates
  • Indexing features include
    • tokenization,
    • stopwords,
    • stemming and
    • synonyms
  • Search features include
    • faceted search, free text search, Boolean search expressions,
    • customizable relevance ranking, query time rank expressions,
    • grouping
    • field weighting, searching and sorting
    • Other features like
      • Autocomplete suggestions
      • Highlighting
      • Geospatial search
      • New data types: date, double, 64 bit signed int, LatLon
      • Dynamic fields
      • Index field statistics
      • Sloppy phrase search
      • Term boosting
      • Enhanced range searching for all field types
      • Search filters that don’t affect relevance
      • Support for multiple query parsers: simple, structured, lucene, dismax
      • Query parser configuration options

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A newspaper organization has an on-premises application which allows the public to search its back catalogue and retrieve individual newspaper pages via a website written in Java. They have scanned the old newspapers into JPEGs (approx. 17TB) and used Optical Character Recognition (OCR) to populate a commercial search product. The hosting platform and software is now end of life and the organization wants to migrate its archive to AWS and produce a cost efficient architecture and still be designed for availability and durability. Which is the most appropriate?
    1. Use S3 with reduced redundancy to store and serve the scanned files, install the commercial search application on EC2 Instances and configure with auto-scaling and an Elastic Load Balancer. (Reusing Commercial search application which is nearing end of life not a good option for cost)
    2. Model the environment using CloudFormation. Use an EC2 instance running Apache webserver and an open source search application, stripe multiple standard EBS volumes together to store the JPEGs and search index. (storing JPEGs on EBS volumes not cost effective also answer does not address Open source solution availability)
    3. Use S3 with standard redundancy to store and serve the scanned files, use CloudSearch for query processing, and use Elastic Beanstalk to host the website across multiple availability zones. (Cost effective S3 storage, CloudSearch for Search and Highly available and durable web application)
    4. Use a single-AZ RDS MySQL instance to store the search index and the JPEG images, and use an EC2 instance to serve the website and translate user queries into SQL. (MySQL not an ideal solution to store index and JPEG images for cost and performance)
    5. Use a CloudFront download distribution to serve the JPEGs to the end users and install the current commercial search product, along with a Java Container for the website on EC2 instances and use Route53 with DNS round-robin. (Web Application not scalable, what’s the source for JPEG files through CloudFront)

AWS High Availability & Fault Tolerance Architecture – Certification

AWS High Availability & Fault Tolerance Architecture

📅 Content Update – June 2025

This post has been updated to reflect modern AWS HA/FT services and best practices including AWS Resilience Hub, Application Recovery Controller (ARC), Fault Injection Service (FIS), Multi-AZ DB Clusters, DynamoDB Global Tables with Multi-Region Strong Consistency (MRSC), and current ELB types (ALB/NLB/GWLB).

  • Amazon Web Services provides services and infrastructure to build reliable, fault-tolerant, and highly available systems in the cloud.
  • Fault-tolerance defines the ability for a system to remain in operation even if some of the components used to build the system fail.
  • Most of the higher-level services, such as S3, DynamoDB, SQS, and ELB, have been built with fault tolerance and high availability in mind.
  • Services that provide basic infrastructure, such as EC2 and EBS, provide specific features, such as availability zones, elastic IP addresses, and snapshots, that a fault-tolerant and highly available system must take advantage of and use correctly.

NOTE: Topic mainly for Professional Exam Only

Regions & Availability Zones

  • Amazon Web Services are available in geographic Regions and with multiple Availability Zones (AZs) within a region, which provide easy access to redundant deployment locations.
  • AZs are distinct geographical locations that are engineered to be insulated from failures in other AZs.
  • Regions and AZs help achieve greater fault tolerance by distributing the application geographically and help build multi-site solutions.
  • AZs provide inexpensive, low latency network connectivity to other Availability Zones in the same Region. All traffic between AZs is encrypted.
  • By placing EC2 instances in multiple AZs, an application can be protected from failure at a single data center.
  • It is important to run independent application stacks in more than one AZ, either in the same region or in another region, so that if one zone fails, the application in the other zone can continue to run.
  • AWS recommends deploying production workloads across at least 3 AZs for optimal fault isolation and static stability.
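
The reasoning about surviving the loss of one AZ can be turned into a small capacity calculation. The sketch below is a simplification under stated assumptions: instances are spread evenly across AZs, only a single AZ fails at a time, and the function names are hypothetical helpers rather than any AWS API.

```python
import math


def instances_per_az(required: int, zones: int) -> int:
    """Instances to run in each AZ so that losing any one AZ still
    leaves at least `required` instances in service."""
    if zones < 2:
        raise ValueError("need at least 2 AZs to survive the loss of one")
    return math.ceil(required / (zones - 1))


def survives_one_az_loss(per_az: int, zones: int, required: int) -> bool:
    """True if the AZs that remain still hold `required` instances."""
    return per_az * (zones - 1) >= required


# Cost of keeping 8 instances available as the number of AZs grows.
for zones in (2, 3, 5):
    per_az = instances_per_az(required=8, zones=zones)
    print(zones, "AZs:", per_az, "per AZ,", per_az * zones, "total")
# 2 AZs: 8 per AZ, 16 total
# 3 AZs: 4 per AZ, 12 total
# 5 AZs: 2 per AZ, 10 total
```

Spreading across more AZs lowers the overhead needed to absorb a single-AZ failure, which is one reason a layout with three or more AZs is usually cheaper than doubling capacity in two.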

Amazon Machine Image – AMIs

  • EC2 is a web service within Amazon Web Services that provides computing resources.
  • Amazon Machine Image (AMI) provides a Template that can be used to define the service instances.
  • Template basically contains a software configuration (i.e., OS, application server, and applications) and is applied to an instance type.
  • AMI can either contain all the software, applications and the code bundled or can be configured to have a bootstrap script (user data) to install the same on startup.
  • A single AMI can be used to create server resources of different instance types and start creating new instances or replacing failed instances.
  • EC2 Image Builder can automate the creation, testing, and distribution of AMIs across regions, enabling faster recovery through pre-built golden images.

Auto Scaling

  • Auto Scaling helps to automatically scale EC2 capacity up or down based on defined rules.
  • Auto Scaling also enables addition of more instances in response to an increasing load; and when those instances are no longer needed, they will be automatically terminated.
  • Auto Scaling enables terminating server instances at will, knowing that replacement instances will be automatically launched.
  • Auto Scaling can work across multiple AZs within an AWS Region.
  • Predictive Scaling uses machine learning to proactively scale out ASGs ahead of anticipated demand spikes, improving availability and reducing the need for over-provisioning.
  • Target Tracking Scaling policies provide a simplified way to configure dynamic scaling based on a specific metric target (e.g., average CPU utilization at 50%).
  • Auto Scaling groups support warm pools to pre-initialize instances for faster scaling, reducing cold-start times during demand surges.
  • Amazon Application Recovery Controller (ARC) supports zonal autoshift with EC2 Auto Scaling, automatically shifting traffic away from impaired AZs.
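
As a rough illustration of the target tracking policy mentioned above (average CPU utilization at 50%), the sketch below only builds the request parameters as a plain dictionary. The group name "web-asg" and the policy naming scheme are assumptions made for the example; the commented lines show how the parameters might be passed to boto3's put_scaling_policy call if boto3 and credentials are available.

```python
def target_tracking_policy(asg_name: str, target_cpu: float = 50.0) -> dict:
    """Build keyword arguments for a target tracking scaling policy."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        },
    }


params = target_tracking_policy("web-asg")  # "web-asg" is a made-up group name
print(params["TargetTrackingConfiguration"]["TargetValue"])  # 50.0

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**params)
```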

Elastic Load Balancing – ELB

  • Elastic Load Balancing is an effective way to increase the availability of a system and distributes incoming traffic to applications across several EC2 instances.
  • ELB supports health checks on hosts, distribution of traffic to EC2 instances across multiple availability zones, and dynamic addition and removal of EC2 hosts from the load-balancing rotation.
  • Elastic Load Balancing detects unhealthy instances within its pool and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored or replaced, for example by Auto Scaling.
  • Auto Scaling and Elastic Load Balancing are an ideal combination – while ELB gives a single DNS name for addressing, Auto Scaling ensures there is always the right number of healthy EC2 instances to accept requests.
  • ELB can be used to balance across instances in multiple AZs of a region.
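
The health-check behaviour described above can be pictured with a toy model. This is not how ELB is implemented: the classes, the instance IDs, and the simple round-robin rotation are all assumptions chosen to show that unhealthy targets are skipped while healthy ones in other AZs keep serving.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    instance_id: str
    az: str
    healthy: bool = True


@dataclass
class SimpleBalancer:
    targets: list
    _next: int = field(default=0, repr=False)

    def route(self) -> str:
        """Return the next healthy target in rotation."""
        for _ in range(len(self.targets)):
            target = self.targets[self._next % len(self.targets)]
            self._next += 1
            if target.healthy:
                return target.instance_id
        raise RuntimeError("no healthy targets available")


lb = SimpleBalancer([
    Target("i-a", "az-1"),
    Target("i-b", "az-2"),
    Target("i-c", "az-1"),
])
lb.targets[1].healthy = False  # simulate a failed health check on i-b
print([lb.route() for _ in range(4)])  # ['i-a', 'i-c', 'i-a', 'i-c']
```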

ELB Types

  • Application Load Balancer (ALB) – Layer 7 (HTTP/HTTPS); supports path-based routing, host-based routing, mutual TLS authentication (2023), one-click AWS WAF integration, URL and host header rewrites (2025), Automatic Target Weights, and LCU Capacity Reservation for handling sharp traffic spikes.
  • Network Load Balancer (NLB) – Layer 4 (TCP/UDP/TLS); ultra-low latency, static IPs per AZ, weighted target groups for blue/green deployments, and subnet removal/addition capability (2025).
  • Gateway Load Balancer (GWLB) – Layer 3 gateway + Layer 4 load balancer; used to deploy, scale, and manage third-party virtual network appliances (firewalls, IDS/IPS).
  • Classic Load Balancer (CLB) – Previous generation; deprecated for new workloads. AWS recommends migrating to ALB or NLB. CLBs in EC2-Classic were retired in August 2022.

Elastic IPs – EIPs

  • Elastic IP addresses are public static IP addresses that can be mapped programmatically between instances within a region.
  • EIPs are associated with the AWS account and not with a specific instance or lifetime of an instance.
  • Elastic IP addresses can be used for instances and services that require consistent endpoints, such as master databases, central file servers, and EC2-hosted load balancers.
  • Elastic IP addresses can be used to work around host or availability zone failures by quickly remapping the address to another running instance or a replacement instance that was just started.

Reserved Instances & Savings Plans

  • Reserved Instances provide a lower cost than On-Demand in exchange for a term commitment; zonal Reserved Instances also reserve computing capacity in a specific AZ.
  • Savings Plans provide a more flexible pricing model with up to 72% savings in exchange for committing to a consistent amount of compute usage (measured in $/hour) over a 1 or 3-year term.
  • On-Demand Capacity Reservations (ODCRs) ensure EC2 capacity is available in a specific AZ when needed for HA without requiring a term commitment.

Elastic Block Store – EBS

  • Elastic Block Store (EBS) offers persistent off-instance storage volumes that persist independently from the life of an instance and are about an order of magnitude more durable than on-instance storage.
  • EBS volumes store data redundantly and are automatically replicated within a single availability zone.
  • EBS helps in failover scenarios where if an EC2 instance fails and needs to be replaced, the EBS volume can be attached to the new EC2 instance.
  • Valuable data should never be stored only on instance (ephemeral) storage without proper backups, replication, or the ability to re-create the data.
  • EBS Multi-Attach (for io1/io2 volumes) allows a single volume to be attached to up to 16 Nitro-based instances within the same AZ for shared storage HA scenarios.

EBS Snapshots

  • EBS volumes are highly reliable, but to further mitigate the possibility of a failure and increase durability, point-in-time Snapshots can be created to store data on volumes in S3, which is then replicated to multiple AZs.
  • Snapshots can be used to create new EBS volumes, which are an exact replica of the original volume at the time the snapshot was taken.
  • Snapshots provide an effective way to deal with disk failures or other host-level issues, as well as with problems affecting an AZ.
  • Snapshots are incremental and back up only changes since the previous snapshot, so it is advisable to hold on to recent snapshots.
  • Snapshots are tied to the region, while EBS volumes are tied to a single AZ.
  • EBS Snapshots Archive provides up to 75% lower storage costs for snapshots stored 90+ days and rarely accessed.
  • Fast Snapshot Restore (FSR) eliminates the need for initializing volumes from snapshots, enabling full-performance volumes immediately upon creation for faster failover.

Relational Database Service – RDS

  • RDS makes it easy to run relational databases in the cloud.
  • RDS Multi-AZ instance deployments provision a synchronous standby replica in a different AZ, providing high availability and automatic failover protection.
  • In case of a failover scenario, the standby is promoted to be the primary seamlessly and will handle the database operations.
  • RDS Multi-AZ DB Cluster deployments (for MySQL and PostgreSQL) provide a primary instance and two readable standby instances across 3 AZs. This offers improved write latency, faster failover (typically under 35 seconds), and the standby instances can serve read traffic.
  • Automated backups, enabled by default, provide point-in-time recovery for the database instance.
  • RDS will back up your database and transaction logs and store both for a user-specified retention period.
  • In addition to the automated backups, manual RDS backups can also be performed which are retained until explicitly deleted.
  • Backups help recover from higher-level faults such as unintentional data modification, either by operator error or by bugs in the application.
  • RDS Read Replicas provide read-only replicas of the database and the ability to scale out beyond the capacity of a single database deployment for read-heavy database workloads.
  • RDS Read Replicas are a scalability solution, not a High Availability solution. However, cross-region Read Replicas can be manually promoted for disaster recovery.
  • Amazon RDS now supports ENA Express for Multi-AZ replication (2026), using Scalable Reliable Datagram (SRD) to improve replication performance by distributing traffic across multiple network paths.

Simple Storage Service – S3

  • S3 provides a highly durable (99.999999999% / 11 9s), fault-tolerant and redundant object store.
  • S3 stores objects redundantly on multiple devices across multiple facilities in an S3 Region.
  • S3 is a great storage solution for somewhat static or slow-changing objects, such as images, videos, and other static media.
  • S3 also supports edge caching and streaming of these assets by interacting with the Amazon CloudFront service.
  • S3 Cross-Region Replication (CRR) automatically replicates objects to a bucket in another region, enabling disaster recovery and low-latency access for globally distributed users.
  • S3 Express One Zone delivers up to 10x faster performance with single-digit millisecond latency for frequently accessed data, but note it stores data in a single AZ (not suitable as the sole copy for fault tolerance).

Simple Queue Service – SQS

  • Simple Queue Service (SQS) is a highly reliable distributed messaging system that can serve as the backbone of a fault-tolerant application.
  • SQS is engineered to provide “at least once” delivery of all messages in standard queues. FIFO queues provide exactly-once processing and strict message ordering.
  • Messages sent to a queue are retained for 4 days by default (configurable up to 14 days) or until they are read and deleted by the application.
  • Messages can be polled by multiple workers and processed, while SQS takes care that a request is processed by only one worker at a time using a configurable time interval called visibility timeout.
  • If the number of messages in a queue starts to grow or if the average time to process a message becomes too high, workers can be scaled upwards by simply adding additional EC2 instances.
  • Dead-letter queues (DLQs) capture messages that cannot be processed successfully. DLQ redrive allows moving messages back to source queues for reprocessing.
  • FIFO queues support up to 70,000 messages per second with high throughput mode and up to 120K in-flight messages (increased from 20K in November 2024).
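
The visibility timeout and "at least once" delivery described above can be sketched with a small in-memory model. It is not the SQS API: the class, its methods, and the explicit `now` clock argument are assumptions made so the behaviour is easy to follow and test.

```python
class VisibilityQueue:
    """Toy queue that hides a received message for a fixed interval."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}         # message id -> body
        self._invisible_until = {}  # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._messages[msg_id] = body
        self._invisible_until[msg_id] = 0.0
        return msg_id

    def receive(self, now: float):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        """Remove a message once a worker has finished processing it."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)


q = VisibilityQueue(visibility_timeout=30)
q.send("resize image 42")
print(q.receive(now=0))    # (0, 'resize image 42') goes to worker A
print(q.receive(now=10))   # None: hidden while worker A is processing
print(q.receive(now=31))   # (0, 'resize image 42') again: A never deleted it
q.delete(0)
print(q.receive(now=100))  # None: deleted after successful processing
```

Because a message reappears when the worker does not delete it in time, consumers of a standard queue are usually written to tolerate processing the same message more than once.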

Route 53

  • Amazon Route 53 is a highly available and scalable DNS web service.
  • Queries for the domain are automatically routed to the nearest DNS server and thus are answered with the best possible performance.
  • Route 53 resolves requests for your domain name (for example, www.example.com) to your Elastic Load Balancer, as well as your zone apex record (example.com).
  • Route 53 supports multiple routing policies for HA: Failover (active-passive), Latency-based, Weighted, Geolocation, Geoproximity (expanded to public/private hosted zones in 2024), and Multivalue Answer.
  • Route 53 health checks can monitor endpoint health and trigger DNS failover automatically.
  • Route 53 Accelerated Recovery (2026) ensures customers can continue making DNS changes even during regional AWS outages, providing greater predictability for mission-critical applications.
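
Combining latency-based routing with health checks, as listed above, can be pictured as "pick the fastest endpoint that is still healthy". The sketch below is a simplified model, not Route 53's actual resolution logic; the record layout, endpoint names, and latency figures are made up for the example.

```python
def resolve(records: list) -> str:
    """Return the healthy endpoint with the lowest latency."""
    healthy = [r for r in records if r["healthy"]]
    if not healthy:
        raise LookupError("no healthy endpoints")
    return min(healthy, key=lambda r: r["latency_ms"])["endpoint"]


records = [
    {"endpoint": "elb-us-east-1.example.com", "latency_ms": 20, "healthy": True},
    {"endpoint": "elb-eu-west-1.example.com", "latency_ms": 90, "healthy": True},
]
print(resolve(records))  # elb-us-east-1.example.com

records[0]["healthy"] = False  # health check on the closest endpoint fails
print(resolve(records))  # elb-eu-west-1.example.com
```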

CloudFront

  • CloudFront can be used to deliver website content, including dynamic, static and streaming content using a global network of edge locations.
  • Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
  • CloudFront is optimized to work with other Amazon Web Services, like S3 and EC2.
  • CloudFront also works seamlessly with any non-AWS origin server, which stores the original, definitive versions of your files.
  • CloudFront Functions run lightweight JavaScript at edge locations for request/response customization. Lambda@Edge provides full compute capabilities at Regional Edge Caches.
  • VPC Origins allow CloudFront to fetch content directly from private resources within a VPC without exposing them to the public internet.
  • Origin Shield acts as an additional caching layer to reduce the load on origins and improve cache hit ratios for multi-region architectures.

DynamoDB Global Tables

  • DynamoDB Global Tables provide a fully managed, multi-Region, multi-active database solution for globally distributed applications.
  • Global Tables automatically replicate data across your choice of AWS Regions. Every replica table in every Region can accept both reads and writes.
  • Changes made to an item in one Region are typically replicated to all other replica Regions within a second.
  • Multi-Region Strong Consistency (MRSC), generally available since June 2025, provides zero RPO (Recovery Point Objective) by enabling strongly consistent reads from any Region. This is the highest level of application resilience for DynamoDB.
  • Global Tables now support cross-account replication (2026), enabling multi-account multi-region architectures.
  • Global Tables replace the previous cross-region replication approach (DynamoDB Streams-based) with a fully managed, zero-administration solution.

AWS Resilience Hub

  • AWS Resilience Hub is a central location to define, track, and manage the resilience of applications.
  • It enables you to define resilience goals (RTO/RPO), assess your resilience posture against those goals, and implement recommendations based on the AWS Well-Architected Framework.
  • Resilience Hub performs automated resilience assessments and identifies gaps in your architecture, such as missing Multi-AZ deployments or lack of backup strategies.
  • Integrates with AWS Fault Injection Service (FIS) to run chaos experiments directly from the Resilience Hub console.
  • The next generation of Resilience Hub (GA May 2026) uses generative AI to provide a structured resilience journey for SRE and development teams.

AWS Fault Injection Service (FIS)

  • AWS FIS is a managed chaos engineering service that enables you to perform controlled fault injection experiments on your AWS workloads.
  • FIS helps simulate real-world failures (AZ disruptions, instance failures, network degradation, API throttling) to validate fault tolerance of your architecture.
  • Supports actions targeting EC2, ECS, EKS, RDS, Lambda functions (native integration since October 2024), and more.
  • Amazon.com ran 733 AWS FIS experiments to prepare for Prime Day 2024.
  • Experiments can be generated using natural language through Amazon Bedrock integration (2025).

Amazon Application Recovery Controller (ARC)

  • ARC helps manage and coordinate recovery for applications across AWS Regions and Availability Zones.
  • Zonal Shift allows you to quickly shift traffic for a resource (ALB, NLB, EKS, Auto Scaling group) away from an impaired AZ to healthy AZs.
  • Zonal Autoshift enables AWS to automatically shift traffic away from an AZ when internal telemetry detects a potential impairment — without manual intervention.
  • Routing Controls provide manual override capabilities for cross-region failover of applications.
  • Zonal shift and zonal autoshift are available at no additional cost.
  • Supported resources include ALB, NLB, EC2 Auto Scaling groups, EKS clusters, and Karpenter (2026).
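
The effect of a zonal shift can be sketched as removing one AZ from the set that receives traffic and checking whether the remaining AZs still have enough capacity. This is a conceptual model only: the function names, AZ names, and instance counts are assumptions, and the real service operates on managed resources rather than a dictionary.

```python
def zonal_shift(instances_by_az: dict, impaired_az: str) -> dict:
    """Return the AZs (and instance counts) that keep receiving traffic."""
    remaining = {az: n for az, n in instances_by_az.items() if az != impaired_az}
    if not remaining:
        raise ValueError("shifting away would leave no AZ to serve traffic")
    return remaining


def has_capacity(instances_by_az: dict, required: int) -> bool:
    """True if the given AZs together hold at least `required` instances."""
    return sum(instances_by_az.values()) >= required


fleet = {"az-a": 4, "az-b": 4, "az-c": 4}  # hypothetical fleet layout
after_shift = zonal_shift(fleet, impaired_az="az-b")
print(after_shift)                            # {'az-a': 4, 'az-c': 4}
print(has_capacity(after_shift, required=8))  # True
```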

AWS Certification Exam Practice Questions

  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You are moving an existing traditional system to AWS, and during the migration discover that there is a master server which is a single point of failure. Having examined the implementation of the master server you realize there is not enough time during migration to re-engineer it to be highly available, though you do discover that it stores its state in a local MySQL database. In order to minimize down-time you select RDS to replace the local database and configure master to use it. What steps would best allow you to create a self-healing architecture? [PROFESSIONAL]
    1. Migrate the local database into multi-AZ RDS database. Place master node into a multi-AZ auto-scaling group with a minimum of one and maximum of one with health checks.
    2. Replicate the local database into a RDS read replica. Place master node into a Cross-Zone ELB with a minimum of one and maximum of one with health checks. (Read Replica does not provide HA and write capability and ELB does not have feature for Min and Max 1 and Cross Zone allows just the equal distribution of load across instances)
    3. Migrate the local database into multi-AZ RDS database. Place master node into a Cross-Zone ELB with a minimum of one and maximum of one with health checks. (ELB does not have feature for Min and Max 1 and Cross Zone allows just the equal distribution of load across instances)
    4. Replicate the local database into a RDS read replica. Place master node into a multi-AZ auto-scaling group with a minimum of one and maximum of one with health checks. (Read Replica does not provide HA and write capability)
  2. You are designing Internet connectivity for your VPC. The Web servers must be available on the Internet. The application must have a highly available architecture. Which alternatives should you consider? (Choose 2 answers)
    1. Configure a NAT instance in your VPC. Create a default route via the NAT instance and associate it with all subnets. Configure a DNS A record that points to the NAT instance public IP address (NAT is for internet connectivity for instances in private subnet)
    2. Configure a CloudFront distribution and configure the origin to point to the private IP addresses of your Web servers. Configure a Route53 CNAME record to your CloudFront distribution.
    3. Place all your web servers behind ELB. Configure a Route53 CNAME to point to the ELB DNS name.
    4. Assign EIPs to all web servers. Configure a Route53 record set with all EIPs. With health checks and DNS failover.
  3. When deploying a highly available 2-tier web application on AWS, which combination of AWS services meets the requirements? 1. AWS Direct Connect 2. Amazon Route 53 3. AWS Storage Gateway 4. Elastic Load Balancing 5. Amazon EC2 6. Auto Scaling 7. Amazon VPC 8. AWS CloudTrail [PROFESSIONAL]
    1. 2,4,5 and 6
    2. 3,4,5 and 8
    3. 1 through 8
    4. 1,3,5 and 7
    5. 1,2,5 and 6
  4. Company A has hired you to assist with the migration of an interactive website that allows registered users to rate local restaurants. Updates to the ratings are displayed on the home page, and ratings are updated in real time. Although the website is not very popular today, the company anticipates that it will grow rapidly over the next few weeks. They want the site to be highly available. The current architecture consists of a single Windows Server 2008 R2 web server and a MySQL database running on Linux. Both reside inside an on-premises hypervisor. What would be the most efficient way to transfer the application to AWS, ensuring performance and high-availability? [PROFESSIONAL]
    1. Export web files to an Amazon S3 bucket in us-west-1. Run the website directly out of Amazon S3. Launch a multi-AZ MySQL Amazon RDS instance in us-west-1a. Import the data into Amazon RDS from the latest MySQL backup. Use Route 53 and create an alias record pointing to the elastic load balancer. (It’s an interactive website; although it can be implemented using the JavaScript SDK, it’s a migration and the application would need changes. Also no use of ELB if hosted on S3)
    2. Launch two Windows Server 2008 R2 instances in us-west-1b and two in us-west-1a. Copy the web files from on premises web server to each Amazon EC2 web server, using Amazon S3 as the repository. Launch a multi-AZ MySQL Amazon RDS instance in us-west-2a. Import the data into Amazon RDS from the latest MySQL backup. Create an elastic load balancer to front your web servers. Use Route 53 and create an alias record pointing to the elastic load balancer. (Although RDS instance is in a different region which will impact performance, this is the only option that works.)
    3. Use AWS VM Import/Export to create an Amazon Elastic Compute Cloud (EC2) Amazon Machine Image (AMI) of the web server. Configure Auto Scaling to launch two web servers in us-west-1a and two in us-west-1b. Launch a Multi-AZ MySQL Amazon Relational Database Service (RDS) instance in us-west-1b. Import the data into Amazon RDS from the latest MySQL backup. Use Amazon Route 53 to create a hosted zone and point an A record to the elastic load balancer. (does not create a load balancer)
    4. Use AWS VM Import/Export to create an Amazon EC2 AMI of the web server. Configure auto-scaling to launch two web servers in us-west-1a and two in us-west-1b. Launch a multi-AZ MySQL Amazon RDS instance in us-west-1a. Import the data into Amazon RDS from the latest MySQL backup. Create an elastic load balancer to front your web servers. Use Amazon Route 53 and create an A record pointing to the elastic load balancer. (Need to create an aliased record without which the Route 53 pointing to ELB would not work)
  5. Your company runs a customer facing event registration site. This site is built with a 3-tier architecture with web and application tier servers and a MySQL database. The application requires 6 web tier servers and 6 application tier servers for normal operation, but can run on a minimum of 65% server capacity and a single MySQL database. When deploying this application in a region with three availability zones (AZs) which architecture provides high availability? [PROFESSIONAL]
    1. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed across 2 AZs with 3 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB, and one RDS (Relational Database Service) instance deployed with read replicas in the other AZ.
    2. A web tier deployed across 3 AZs with 2 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 3 AZs with 2 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and one RDS (Relational Database Service) Instance deployed with read replicas in the two other AZs.
    3. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 2 AZs with 3 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and a Multi-AZ RDS (Relational Database Service) deployment.
    4. A web tier deployed across 3 AZs with 2 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer). And an application tier deployed across 3 AZs with 2 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB. And a Multi-AZ RDS (Relational Database Service) deployment.
  6. For a 3-tier, customer facing, inclement weather site utilizing a MySQL database running in a Region which has two AZs which architecture provides fault tolerance within the region for the application that minimally requires 6 web tier servers and 6 application tier servers running in the web and application tiers and one MySQL database? [PROFESSIONAL]
    1. A web tier deployed across 2 AZs with 6 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed across 2 AZs with 6 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB, and a Multi-AZ RDS (Relational Database Service) deployment. (As it needs Fault Tolerance with minimal 6 servers always available)
    2. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 2 AZs with 3 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and a Multi-AZ RDS (Relational Database Service) deployment.
    3. A web tier deployed across 2 AZs with 3 EC2 (Elastic Compute Cloud) instances in each AZ inside an Auto Scaling Group behind an ELB (elastic load balancer) and an application tier deployed across 2 AZs with 6 EC2 instances in each AZ inside an Auto Scaling Group behind an ELB and one RDS (Relational Database Service) Instance deployed with read replicas in the other AZs.
    4. A web tier deployed in 1 AZ with 6 EC2 (Elastic Compute Cloud) instances inside an Auto Scaling Group behind an ELB (elastic load balancer), and an application tier deployed in the same AZ with 6 EC2 instances inside an Auto Scaling Group behind an ELB and a Multi-AZ RDS (Relational Database Service) deployment, with 6 stopped web tier EC2 instances and 6 stopped application tier EC2 instances all in the other AZ ready to be started if any of the running instances in the first AZ fails.
  7. You are designing a system which needs, at minimum, 8 m4.large instances operating to service traffic. When designing a system for high availability in the us-east-1 region, which has 6 Availability Zones, your company needs to be able to handle the loss of a full availability zone. How should you distribute the servers, to save as much cost as possible, assuming all of the EC2 nodes are properly linked to an ELB? Your VPC account can utilize us-east-1’s AZ’s a through f, inclusive.
    1. 3 servers in each of AZ’s a through d, inclusive.
    2. 8 servers in each of AZ’s a and b.
    3. 2 servers in each of AZ’s a through e, inclusive. (You need to design for N+1 redundancy on Availability Zones. ZONE_COUNT = (REQUIRED_INSTANCES / INSTANCE_COUNT_PER_ZONE) + 1. To minimize cost, spread the instances across as many zones as you can. By using a through e, you are allocating 5 zones. Using 2 instances per zone, you have 10 total instances. If a single zone fails, you have 4 zones left, with 2 instances each, for a total of 8 instances. By spreading out as much as possible, you have increased cost by only 25% and significantly de-risked an availability zone failure.)
    4. 4 servers in each of AZ’s a through c, inclusive.
  8. You need your API backed by DynamoDB to stay online during a total regional AWS failure. You can tolerate a couple minutes of lag or slowness during a large failure event, but the system should recover with normal operation after those few minutes. What is a good approach? [PROFESSIONAL]
    1. Set up DynamoDB Global Tables in a multi-active configuration across two regions. Create an Auto Scaling Group behind an ELB in each of the two regions. Add a Route53 Latency DNS Record with DNS Failover, using the ELBs in the two regions as the resource records. (Use DynamoDB Global Tables (multi-active replication) with two ELBs and ASGs with Route53 Failover and Latency DNS. Note: DynamoDB Global Tables now also support Multi-Region Strong Consistency (MRSC) for zero RPO since June 2025.)
    2. Set up a DynamoDB Multi-Region table. Create an Auto Scaling Group behind an ELB in each of the two regions DynamoDB is running in. Add a Route53 Latency DNS Record with DNS Failover, using the ELBs in the two regions as the resource records. (This is now essentially correct with DynamoDB Global Tables being the multi-region solution. However at the time of the question, this option was considered incorrect.)
    3. Set up a DynamoDB Multi-Region table. Create a cross-region ELB pointing to a cross-region Auto Scaling Group, and direct a Route53 Latency DNS Record with DNS Failover to the cross-region ELB. (No such thing as Cross Region ELB or cross-region ASG)
    4. Set up DynamoDB cross-region replication in a master-standby configuration, with a single standby in another region. Create a cross-region ELB pointing to a cross-region Auto Scaling Group, and direct a Route53 Latency DNS Record with DNS Failover to the cross-region ELB. (No such thing as cross-region ELB or cross-region ASG)
  9. You are putting together a WordPress site for a local charity and you are using a combination of Route53, Elastic Load Balancers, EC2 & RDS. You launch your EC2 instance, download WordPress and set up the connection string in the configuration files so that it can communicate with RDS. When you browse to your URL, however, nothing happens. Which of the following could NOT be the cause of this?
    1. You have forgotten to open port 80/443 on your security group in which the EC2 instance is placed.
    2. Your elastic load balancer has a health check, which is checking a webpage that does not exist; therefore your EC2 instance is not in service.
    3. You have not configured an ALIAS for your A record to point to your elastic load balancer
    4. You have locked port 22 down to your specific IP address therefore users cannot access your site using HTTP/HTTPS
  10. A development team that is currently doing a nightly six-hour build, which is lengthening over time, on-premises with a large and mostly underutilized server would like to transition to a continuous integration model of development on AWS with multiple builds triggered within the same day. However, they are concerned about cost, security and how to integrate with existing on-premises applications such as their LDAP and email servers, which cannot move off-premises. The development environment needs a source code repository, a project management system with a MySQL database, resources for performing the builds, and a storage location for QA to pick up builds from. What AWS services combination would you recommend to meet the development team’s requirements? [PROFESSIONAL]
    1. A Bastion host Amazon EC2 instance running a VPN server for access from on-premises, Amazon EC2 for the source code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIP for the source code repository and project management system, Amazon SQS for a build queue, An Amazon Auto Scaling group of Amazon EC2 instances for performing builds and Amazon Simple Email Service for sending the build output. (A bastion host is not meant for VPN connectivity; also, SES should not be used for build output)
    2. An AWS Storage Gateway for connecting on-premises software applications with cloud-based storage securely, Amazon EC2 for the source code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIPs for the source code repository and project management system, Amazon Simple Notification Service for a notification initiated build, An Auto Scaling group of Amazon EC2 instances for performing builds and Amazon S3 for the build output. (Storage Gateway does not provide secure connectivity, still needs VPN. SNS alone cannot handle builds)
    3. An AWS Storage Gateway for connecting on-premises software applications with cloud-based storage securely, Amazon EC2 for the source code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIPs for the source code repository and project management system, Amazon SQS for a build queue, An Amazon Elastic Map Reduce (EMR) cluster of Amazon EC2 instances for performing builds and Amazon CloudFront for the build output. (Storage Gateway does not provide secure connectivity, still needs VPN. EMR is not suited to performing builds, which need ordinary EC2 instances)
    4. A VPC with a VPN Gateway back to their on-premises servers, Amazon EC2 for the source-code repository with attached Amazon EBS volumes, Amazon EC2 and Amazon RDS MySQL for the project management system, EIPs for the source code repository and project management system, SQS for a build queue, An Auto Scaling group of EC2 instances for performing builds and S3 for the build output. (VPN gateway is required for secure connectivity. SQS for build queue and EC2 for builds)
  11. Which of the following AWS services and features are essential for building a modern, highly available fault-tolerant architecture? (Choose 3) [NEW – 2025]
    1. Amazon Application Recovery Controller (ARC) with zonal autoshift
    2. AWS CloudTrail
    3. AWS Fault Injection Service (FIS) for resilience testing
    4. RDS Multi-AZ DB Cluster with readable standbys
    5. Amazon Inspector
  12. A company needs its DynamoDB-backed application to survive a complete regional failure with zero data loss (zero RPO). Which approach best achieves this requirement? [NEW – 2025]
    1. Use DynamoDB Streams to replicate data to another region manually.
    2. Use DynamoDB point-in-time recovery (PITR) with cross-region backups.
    3. Use DynamoDB Global Tables with Multi-Region Strong Consistency (MRSC). (MRSC, GA since June 2025, enables zero RPO with strongly consistent reads from any region.)
    4. Use DynamoDB On-Demand backup and restore to a secondary region.
  13. An application runs behind an Application Load Balancer across 3 AZs. During an AZ impairment detected by AWS, what feature can automatically redirect traffic away from the affected AZ without manual intervention? [NEW – 2025]
    1. Route 53 health check failover
    2. ALB Cross-Zone load balancing
    3. Amazon Application Recovery Controller (ARC) zonal autoshift (ARC zonal autoshift automatically shifts traffic away from an impaired AZ when AWS internal telemetry detects issues, without requiring manual intervention.)
    4. Auto Scaling AZ rebalancing
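
The N+1 reasoning in Question 7 can be checked with a short calculation. The following is a minimal Python sketch, not part of the original question: the function names and the `OPTIONS` table are illustrative, and the only inputs taken from the question are the 8 required instances and the four answer options. It keeps the options that still leave 8 instances after losing one zone and picks the one with the fewest servers.

```python
import math

def survives_zone_loss(required, zones, per_zone):
    """True if `required` instances remain after losing any single zone."""
    return (zones - 1) * per_zone >= required

def min_per_zone(required, zones):
    """Smallest per-zone count that still leaves `required` after one zone fails."""
    if zones < 2:
        raise ValueError("at least two zones are needed to survive a zone failure")
    return math.ceil(required / (zones - 1))

# Answer options from Question 7 as (zone count, servers per zone).
OPTIONS = {1: (4, 3), 2: (2, 8), 3: (5, 2), 4: (3, 4)}
REQUIRED = 8

def cheapest_option(required, options):
    """Return (option number, total servers) for the cheapest surviving option."""
    totals = {
        number: zones * per_zone
        for number, (zones, per_zone) in options.items()
        if survives_zone_loss(required, zones, per_zone)
    }
    best = min(totals, key=totals.get)
    return best, totals[best]

if __name__ == "__main__":
    print(cheapest_option(REQUIRED, OPTIONS))
    # Total servers needed for 2 to 6 zones, to show the effect of rounding up.
    for zones in range(2, 7):
        per_zone = min_per_zone(REQUIRED, zones)
        print(zones, per_zone, zones * per_zone)
```

All four options survive a single zone failure, so the choice comes down to cost: option 3 needs 10 servers against 12 or 16 for the others. Because the per-zone count is rounded up, more zones is not automatically cheaper: using all six zones would still need 2 servers per zone, for 12 in total.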

AWS Intrusion Detection & Prevention System IDS/IPS

  • An Intrusion Prevention System (IPS)
    • is an appliance that monitors and analyzes network traffic to detect malicious patterns and potentially harmful packets, and prevents vulnerability exploits
    • Most IPS offer firewall, unified threat management and routing capabilities
  • An Intrusion Detection System (IDS)
    • is an appliance or capability that continuously monitors the environment
    • sends alerts when it detects malicious activity, policy violations or network & system attacks from someone attempting to break into or compromise the system
    • produces reports for analysis.

Approaches for AWS IDS/IPS

Network Tap or SPAN

  • Traditional approach involves using a network Test Access Point (TAP) or Switch Port Analyzer (SPAN) to access & monitor all network traffic.
  • Connection between the AWS Internet Gateway (IGW) and the Elastic Load Balancer would be an ideal place to capture all network traffic.
  • However, there is no place to plug this in between IGW and ELB as there are no SPAN ports, network taps, or a concept of Layer 2 bridging

Packet Sniffing

  • It is not possible for a virtual instance running in promiscuous mode to receive or sniff traffic that is intended for a different virtual instance.
  • While interfaces can be placed into promiscuous mode, the hypervisor will not deliver any traffic to an instance that is not addressed to it.
  • Even two virtual instances that are owned by the same customer located on the same physical host cannot listen to each other’s traffic
  • So, sniffing other instances’ traffic through promiscuous mode does not work

Host Based Firewall – Forward Deployed IDS

  • Deploy a network-based IDS on every instance you deploy
  • IDS workload scales with your infrastructure
  • Host-based security software works well with highly distributed and scalable application architectures because network packet inspection is distributed across the entire software fleet
  • However, a CPU-intensive process is deployed onto every single machine.

Host Based Firewall – Traffic Replication

  • An Agent is deployed on every instance to capture & replicate traffic for centralized analysis
  • Actual workload of network traffic analysis is not performed on the instance but on a separate server
  • Traffic capture and replication is still CPU-intensive (particularly on Windows machines)
  • It significantly increases the internal network traffic in the environment as every inbound packet is duplicated in the transfer from the instance that captures the traffic to the instance that analyzes the traffic
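
To put a rough number on the extra internal traffic, here is a minimal Python sketch. The fleet size and per-instance bandwidth are hypothetical figures chosen for illustration, and the model assumes each inbound packet is copied exactly once to the analysis server.

```python
def replication_overhead_mbps(instance_count, inbound_mbps_per_instance, copies=1):
    """Extra internal traffic created when each inbound packet is copied `copies` times."""
    return instance_count * inbound_mbps_per_instance * copies

def network_amplification(instance_count, inbound_mbps_per_instance, copies=1):
    """Ratio of total traffic (original plus replicated) to the original inbound traffic."""
    original = instance_count * inbound_mbps_per_instance
    extra = replication_overhead_mbps(instance_count, inbound_mbps_per_instance, copies)
    return (original + extra) / original

if __name__ == "__main__":
    # Hypothetical fleet: 100 instances, each receiving 50 Mbps of inbound traffic.
    print(replication_overhead_mbps(100, 50))
    print(network_amplification(100, 50))
```

With these assumed figures the analysis tier has to absorb as much traffic as the whole fleet receives, which is why it becomes a scaling concern of its own.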

In-Line Firewall – Inbound IDS Tier

  • Add another tier to the application architecture where a load balancer sends all inbound traffic to a tier of instances that perform the network analysis, e.g. a third-party solution such as Fortinet FortiGate
  • IDS workload is now isolated to a horizontally scalable tier in the architecture
  • However, you have to maintain and manage another mission-critical elastic tier in the architecture

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A web company is looking to implement an intrusion detection and prevention system into their deployed VPC. This platform should have the ability to scale to thousands of instances running inside of the VPC. How should they architect their solution to achieve these goals?
    1. Configure an instance with monitoring software and the elastic network interface (ENI) set to promiscuous mode packet sniffing to see all traffic across the VPC. (A virtual instance running in promiscuous mode cannot receive or “sniff” traffic intended for other instances)
    2. Create a second VPC and route all traffic from the primary application VPC through the second VPC where the scalable virtualized IDS/IPS platform resides.
    3. Configure servers running in the VPC using the host-based ‘route’ commands to send all traffic through the platform to a scalable virtualized IDS/IPS (host based routing is not allowed)
    4. Configure each host with an agent that collects all network traffic and sends that traffic to the IDS/IPS platform for inspection.
  2. You are designing an intrusion detection/prevention (IDS/IPS) solution for a customer web application in a single VPC. You are considering the options for implementing IDS/IPS protection for traffic coming from the Internet. Which of the following options would you consider? (Choose 2 answers)
    1. Implement IDS/IPS agents on each instance running in the VPC
    2. Configure an instance in each subnet to switch its network interface card to promiscuous mode and analyze network traffic. (A virtual instance running in promiscuous mode cannot receive or “sniff” traffic intended for other instances)
    3. Implement Elastic Load Balancing with SSL listeners in front of the web applications (ELB with SSL does not serve as IDS/IPS)
    4. Implement a reverse proxy layer in front of web servers and configure IDS/IPS agents on each reverse proxy server

AWS Risk and Compliance – Whitepaper – Certification

AWS Risk and Compliance Whitepaper Overview

  • AWS Risk and Compliance Whitepaper is intended to provide information to assist AWS customers with integrating AWS into their existing control framework supporting their IT environment.
  • AWS communicates its security and control environment relevant to customers by:
    • Obtaining industry certifications and independent third-party attestations described in this document
    • Publishing information about the AWS security and control practices in whitepapers and web site content
    • Providing certificates, reports, and other documentation directly to AWS customers under NDA (as required)

Shared Responsibility model

  • AWS’ part in the shared responsibility includes
    • providing its services on a highly secure and controlled platform and providing a wide array of security features customers can use
    • relieving the customer’s operational burden, as AWS operates, manages and controls the components from the host operating system and virtualization layer down to the physical security of the facilities in which the service operates
    • relieving the customer’s burden of operating controls by managing those controls associated with the physical infrastructure deployed in the AWS environment
  • Customers’ responsibility includes
    • configuring their IT environments in a secure and controlled manner for their purposes
    • responsibility for and management of the guest operating system (including updates and security patches), other associated application software, as well as the configuration of the AWS-provided security group firewall
    • meeting stringent compliance requirements by leveraging technology such as host-based firewalls, host-based intrusion detection/prevention, encryption and key management

Risk and Compliance Governance

  • AWS provides a wide range of information regarding its IT control environment to customers through white papers, reports, certifications, and other third-party attestations
  • AWS customers are required to continue to maintain adequate governance over the entire IT control environment regardless of how IT is deployed.
  • Leading practices include
    • an understanding of required compliance objectives and requirements (from relevant sources),
    • establishment of a control environment that meets those objectives and requirements,
    • an understanding of the validation required based on the organization’s risk tolerance,
    • and verification of the operating effectiveness of their control environment.
  • Strong customer compliance and governance might include the following basic approach:
    • Review information available from AWS together with other information to understand as much of the entire IT environment as possible, and then document all compliance requirements.
    • Design and implement control objectives to meet the enterprise compliance requirements.
    • Identify and document controls owned by outside parties.
    • Verify that all control objectives are met and all key controls are designed and operating effectively.
  • Approaching compliance governance in this manner helps companies gain a better understanding of their control environment and will help clearly delineate the verification activities to be performed.

AWS Certifications, Programs, Reports, and Third-Party Attestations

  • AWS engages with external certifying bodies and independent auditors to provide customers with considerable information regarding the policies, processes, and controls established and operated by AWS.
  • AWS provides third-party attestations, certifications, Service Organization Controls (SOC) reports and other relevant compliance reports directly to its customers under NDA.

Key Risk and Compliance Questions

  • Shared Responsibility
    • AWS controls the physical components of the technology it provides.
    • Customer owns and controls everything else, including control over connection points and transmissions
  • Auditing IT
    • Auditing for most layers and controls above the physical controls remains the responsibility of the customer
    • AWS ISO 27001 and other certifications are available for auditors to review
    • AWS-defined logical and physical controls are documented in the SOC 1 Type II report and available for review by audit and compliance teams
  • Data location
    • AWS customers control the physical region in which their data and servers are located
    • AWS replicates the data only within the region
    • AWS will not move customers’ content from the selected Regions without notifying the customer, unless required to comply with the law or requests of governmental entities
  • Data center tours
    • As AWS hosts multiple customers, AWS does not allow data center tours by customers, as this exposes a wide range of customers to physical access of a third party.
    • An independent and competent auditor validates the presence and operation of controls as part of the AWS SOC 1 Type II report.
    • This third-party validation provides customers with the independent perspective of the effectiveness of controls in place.
    • AWS customers that have signed a non-disclosure agreement with AWS may request a copy of the SOC 1 Type II report.
  • Third-party access
    • AWS strictly controls access to data centers, even for internal employees.
    • Third parties are not provided access to AWS data centers except when explicitly approved by the appropriate AWS data center manager per the AWS access policy
  • Multi-tenancy
    • AWS environment is a virtualized, multi-tenant environment.
    • AWS has implemented security management processes, PCI controls, and other security controls designed to isolate each customer from other customers.
    • AWS systems are designed to prevent customers from accessing physical hosts or instances not assigned to them by filtering through the virtualization software.
  • Hypervisor vulnerabilities
    • Amazon EC2 utilizes a highly customized version of Xen hypervisor.
    • Hypervisor is regularly assessed for new and existing vulnerabilities and attack vectors by internal and external penetration teams, and is well suited for maintaining strong isolation between guest virtual machines
  • Vulnerability management
    • AWS is responsible for patching systems supporting the delivery of service to customers, such as the hypervisor and networking services
  • Encryption
    • AWS allows customers to use their own encryption mechanisms for nearly all the services, including S3, EBS, SimpleDB, and EC2.
    • IPSec tunnels to VPC are also encrypted
  • Data isolation
    • All data stored by AWS on behalf of customers has strong tenant isolation security and control capabilities
  • Composite services
    • AWS does not leverage any third-party cloud providers to deliver AWS services to customers.
  • Distributed Denial Of Service (DDoS) attacks
    • AWS network provides significant protection against traditional network security issues and the customer can implement further protection
  • Data portability
    • AWS allows customers to move data as needed on and off AWS storage
  • Service provider & customer business continuity
    • AWS operates a business continuity program
    • AWS data centers incorporate physical protection against environmental risks.
    • AWS’ physical protection against environmental risks has been validated by an independent auditor and has been certified
    • AWS provides customers with the capability to implement a robust continuity plan with multi-Region/AZ deployment architectures, backups, and data redundancy replication
  • Capability to scale
    • AWS cloud is distributed, highly secure and resilient, giving customers massive scale potential.
    • Customers may scale up or down, paying for only what they use
  • Service availability
    • AWS commits to high levels of availability in its service level agreements (SLAs), e.g. 99.9% for S3
  • Application Security
    • AWS system development lifecycle incorporates industry best practices which include formal design reviews by the AWS Security Team, source code analysis, threat modeling and completion of a risk assessment
    • AWS does not generally outsource development of software.
  • Threat and Vulnerability Management
    • AWS Security regularly engages independent security firms to perform external vulnerability threat assessments
    • AWS Security regularly scans all Internet-facing service endpoint IP addresses for vulnerabilities, but these scans do not include customer instances
    • AWS Security notifies the appropriate parties to remediate any identified vulnerabilities.
    • Customers can request permission to conduct scans and Penetration tests of their cloud infrastructure as long as they are limited to the customer’s instances and do not violate the AWS Acceptable Use Policy. Advance approval for these types of scans is required
  • Data Security
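
As a rough illustration of what an availability percentage such as the 99.9% quoted for S3 allows, the arithmetic can be sketched in Python. The 30-day month is an assumption made for this example, and the function name is illustrative; the actual SLA terms define the measurement period and how downtime is counted.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime per period that still meet the availability percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100

if __name__ == "__main__":
    # 99.9% over an assumed 30-day month works out to roughly 43 minutes.
    print(round(allowed_downtime_minutes(99.9), 1))
    print(round(allowed_downtime_minutes(99.99), 1))
```

Each additional nine cuts the permitted downtime by a factor of ten, which is a quick way to compare availability commitments.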

AWS Certification Exam Practice Questions

  1. When preparing for a compliance assessment of your system built inside of AWS, what are three best practices for you to prepare for an audit? (Choose 3 answers)
    1. Gather evidence of your IT operational controls (Customer still needs to gather all the IT operation controls in line with their environment)
    2. Request and obtain applicable third-party audited AWS compliance reports and certifications (Customers can request the reports and certifications produced by AWS’s third-party auditors or can request more information about AWS Compliance)
    3. Request and obtain a compliance and security tour of an AWS data center for a pre-assessment security review (AWS does not allow data center tour)
    4. Request and obtain approval from AWS to perform relevant network scans and in-depth penetration tests of your system’s Instances and endpoints (AWS requires prior approval to be taken to perform penetration tests)
    5. Schedule meetings with AWS’s third-party auditors to provide evidence of AWS compliance that maps to your control objectives (Customers can request the reports and certifications produced by AWS’s third-party auditors or can request more information about AWS Compliance)
  2. In the shared security model, AWS is responsible for which of the following security best practices (check all that apply) :
    1. Penetration testing
    2. Operating system account security management
    3. Threat modeling
    4. User group access management
    5. Static code analysis
  3. You are running a web application on AWS consisting of the following components: an Elastic Load Balancer (ELB), an Auto Scaling group of EC2 instances running Linux/PHP/Apache, and Relational Database Service (RDS) MySQL. Which security measures fall into AWS’s responsibility?
    1. Protect the EC2 instances against unsolicited access by enforcing the principle of least-privilege access (Customer owned)
    2. Protect against IP spoofing or packet sniffing
    3. Assure all communication between EC2 instances and ELB is encrypted (Customer owned)
    4. Install latest security patches on ELB, RDS and EC2 instances (Customer owned)
  4. Which of the following statements is true about achieving PCI certification on the AWS platform? (Choose 2)
    1. Your organization owns the compliance initiatives related to anything placed on the AWS infrastructure
    2. Amazon EC2 instances must run on a single-tenancy environment (dedicated instance)
    3. AWS manages card-holder environments
    4. AWS Compliance provides assurance related to the underlying infrastructure
