I recently cleared the AWS Certified Solutions Architect Professional exam with 93% after almost 2 months of preparation.
Topic Level Scoring:
1.0 High Availability and Business Continuity: 100%
2.0 Costing: 75%
3.0 Deployment Management: 100%
4.0 Network Design: 85%
5.0 Data Storage: 90%
6.0 Security: 92%
7.0 Scalability & Elasticity: 100%
8.0 Cloud Migration & Hybrid Architecture: 85%
The AWS Solutions Architect – Professional exam is quite exhaustive, with 77 questions in 180 minutes, and covers a lot of AWS services and how they work and integrate together. However, the questions are a bit old and have not kept pace with the fast-changing AWS enhancements.
If looking for Associate Preparation Guide, please refer
AWS Certification Exams cover a lot of topics and a wide range of services with minute details for features, patterns, anti patterns and their integration with other services. This blog post is just to have a quick summary of all the services and key points for a quick glance before you appear for the exam
AWS Global Infrastructure
AWS Region, AZs, Edge locations
Each region is a separate geographic area, completely independent, isolated from the other regions & helps achieve the greatest possible fault tolerance and stability
Communication between regions is across the public Internet
Each region has multiple Availability Zones
Each AZ is physically isolated, geographically separated from each other and designed as an independent failure zone
AZs are connected with low-latency private links (not public internet)
Edge locations are locations maintained by AWS through a worldwide network of data centers for the distribution of content to reduce latency.
AWS Local Zones
AWS Local Zones place select AWS services closer to end-users, which allows running highly-demanding applications that require single-digit millisecond latencies to the end-users such as media & entertainment content creation, real-time gaming, machine learning etc.
AWS Local Zones provide a high-bandwidth, secure connection between local workloads and those running in the AWS Region, allowing you to seamlessly connect to the full range of in-region services through the same APIs and tool sets.
AWS Wavelength
AWS Wavelength infrastructure deployments embed AWS compute and storage services within the telecommunications providers’ data centers and help seamlessly access the breadth of AWS services in the region.
AWS Wavelength brings services to the edge of the 5G network, without leaving the mobile provider’s network reducing the extra network hops, minimizing the latency to connect to an application from a mobile device.
AWS Outposts bring native AWS services, infrastructure, and operating models to virtually any data center, co-location space, or on-premises facility.
AWS Outposts is designed for connected environments and can be used to support workloads that need to remain on-premises due to low latency, compliance or local data processing needs.
Kinesis Data Firehose
data transfer solution for delivering real-time streaming data to destinations such as S3, Redshift, Elasticsearch Service, and Splunk.
is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration
is Near Real Time (min. 60 secs) as it buffers incoming streaming data to a certain size or for a certain period of time before delivering it
supports batching, compression, and encryption of the data before loading it, minimizing the amount of storage used at the destination and increasing security
supports GZIP, ZIP, and SNAPPY compression formats; only GZIP is supported if the data is further loaded to Redshift.
supports out-of-the-box data transformation as well as custom transformation using a Lambda function to transform incoming source data and deliver the transformed data to destinations
uses at least once semantics for data delivery.
supports multiple producers as datasource, which include Kinesis data stream, KPL, Kinesis Agent, or the Kinesis Data Firehose API using the AWS SDK, CloudWatch Logs, CloudWatch Events, or AWS IoT
does NOT support consumers like Spark and KCL
supports interface VPC endpoint to keep traffic between the VPC and Kinesis Data Firehose from leaving the Amazon network.
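The buffering behaviour described above (deliver once either a size or a time threshold is reached, compressing the batch first) can be sketched in plain Python. This is only a toy model and not the Firehose API: the class name `ToyBuffer`, its parameters, and the thresholds are assumptions made for illustration, and unlike the real service it only checks the interval when a new record arrives.

```python
import gzip


class ToyBuffer:
    """Toy model of size-or-interval buffering (not the real Firehose API)."""

    def __init__(self, max_bytes, max_seconds):
        self.max_bytes = max_bytes      # hypothetical size threshold
        self.max_seconds = max_seconds  # hypothetical interval threshold
        self.records = []
        self.size = 0
        self.started_at = None
        self.delivered = []             # each entry is one gzip-compressed batch

    def put(self, record: bytes, now: float):
        """Add a record; flush if either threshold has been reached."""
        if self.started_at is None:
            self.started_at = now
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes or now - self.started_at >= self.max_seconds:
            self.flush()

    def flush(self):
        """Compress the batch and hand it to the (simulated) destination."""
        if not self.records:
            return
        self.delivered.append(gzip.compress(b"\n".join(self.records)))
        self.records, self.size, self.started_at = [], 0, None


buf = ToyBuffer(max_bytes=10, max_seconds=60)
buf.put(b"aaaa", now=0)     # below both thresholds, nothing delivered yet
buf.put(b"bbbbbbb", now=1)  # size threshold reached, batch is delivered
buf.put(b"cc", now=2)       # starts a new batch
buf.put(b"dd", now=70)      # interval threshold reached, batch is delivered
print(len(buf.delivered))
```

The delay between a record arriving and its batch being flushed is why the service is described as near real time rather than real time.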
Kinesis Data Streams vs Kinesis Data Firehose
Kinesis Data Analytics
helps analyze streaming data, gain actionable insights, and respond to the business and customer needs in real time.
reduces the complexity of building, managing, and integrating streaming applications with other AWS services
Compound sort key
is made up of all of the columns listed in the sort key definition, in the order they are listed, and is most efficient when the query’s conditions, such as filters and joins, use a prefix of the sort key, i.e. a subset of the sort key columns in order.
Interleaved sort key
gives equal weight to each column in the sort key, so query predicates can use any subset of the columns that make up the sort key, in any order.
Not ideal for monotonically increasing attributes
Column encodings CANNOT be changed once created.
supports query queues for Workload Management, in order to manage concurrency and resource planning. It is a best practice to have separate queues for long running resource-intensive queries and fast queries that don’t require big amounts of memory and CPU
QuickSight
is a very fast, easy-to-use, cloud-powered business analytics service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from data, anytime, on any device.
delivers fast and responsive query performance by using a robust in-memory engine (SPICE).
“SPICE” stands for a Super-fast, Parallel, In-memory Calculation Engine
can also be configured to keep the data in SPICE up-to-date as the data in the underlying sources change.
automatically replicates data for high availability and enables QuickSight to scale to support users to perform simultaneous fast interactive analysis across a wide variety of AWS data sources.
supports various data sources, including
Excel files and flat files like CSV, TSV, CLF, ELF
on-premises databases like PostgreSQL, SQL Server and MySQL
SaaS applications like Salesforce
and AWS data sources such as Redshift, RDS, Aurora, Athena, and S3
supports various functions to format and transform the data.
supports assorted visualizations that facilitate different analytical approaches:
Comparison and distribution – Bar charts (several assorted variants)
Elasticsearch Service is a managed service that makes it easy to deploy, operate, and scale Elasticsearch clusters in the AWS Cloud.
real-time, distributed search and analytics engine
ability to provision all the resources for the Elasticsearch cluster and launch the cluster
easy-to-use cluster scaling options. Scaling an Elasticsearch Service domain by adding or modifying instances and storage volumes is an online operation that does not require any downtime.
provides self-healing clusters, which automatically detect and replace failed Elasticsearch nodes, reducing the overhead associated with self-managed infrastructures
domain snapshots to back up and restore ES domains and replicate domains across AZs
enhanced security with IAM, Network, Domain access policies, and fine-grained access control
storage volumes for the data using EBS volumes
ability to span cluster nodes across multiple AZs in the same region, known as zone awareness, for high availability and redundancy. Elasticsearch Service automatically distributes the primary and replica shards across instances in different AZs.
dedicated master nodes to improve cluster stability
data visualization using the Kibana tool
integration with CloudWatch for monitoring ES domain metrics
integration with CloudTrail for auditing configuration API calls to ES domains
integration with S3, Kinesis, and DynamoDB for loading streaming data
ability to handle structured and unstructured data
supports encryption at rest through KMS, node-to-node encryption over TLS, and the ability to require clients to communicate over HTTPS
SQS
stores copies of the messages on multiple servers for redundancy and high availability
guarantees At-Least-Once Delivery, but does not guarantee Exactly-Once Delivery, which might result in duplicate messages (not true anymore with the introduction of FIFO queues)
does not maintain or guarantee message order, and if needed, sequencing information needs to be added to the message itself (not true anymore with the introduction of FIFO queues)
supports multiple readers and writers interacting with the same queue at the same time
holds message for 4 days, by default, and can be changed from 1 min – 14 days after which the message is deleted
message needs to be explicitly deleted by the consumer once processed
allows send, receive and delete batching, which helps club up to 10 messages in a single batch while being charged as a single request
handles visibility of the message to multiple consumers using Visibility Timeout, where the message once read by a consumer is not visible to the other consumers till the timeout occurs
can handle load and performance requirements by scaling the worker instances as the demand changes (Job Observer pattern)
supports short and long polling for receiving messages (short polling vs long polling)
returns immediately vs waits for a fixed time, e.g. 20 secs
might not return all messages as it samples a subset of servers vs returns all available messages
repetitive polling vs helps save cost with a long-lived connection
supports delay queues to make messages available after a certain delay, which can be used to differentiate from priority queues
supports dead letter queues, to redirect messages which failed to process after certain attempts instead of being processed repeatedly
Job Observer Pattern can help coordinate the number of EC2 instances with the number of job requests (queue size) automatically, thus improving cost effectiveness and performance
Priority Queue Pattern can be used to set up different queues with different handling, either by delay queues or low scaling capacity for handling messages in lower priority queues
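To make visibility timeout, explicit deletion, and dead letter queues a little more concrete, here is a minimal in-memory sketch. It is not the SQS API: `ToyQueue`, `max_receives`, and the timings are hypothetical names and values chosen for illustration only.

```python
class ToyQueue:
    """In-memory toy illustrating visibility timeout and a dead letter queue."""

    def __init__(self, visibility_timeout, max_receives):
        self.visibility_timeout = visibility_timeout
        self.max_receives = max_receives   # hypothetical redrive threshold
        self.messages = {}                 # id -> [body, visible_at, receive_count]
        self.dead_letters = []
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = [body, 0.0, 0]
        self.next_id += 1

    def receive(self, now):
        """Return (id, body) of one visible message, or None."""
        for msg_id, entry in list(self.messages.items()):
            body, visible_at, count = entry
            if visible_at > now:
                continue                   # still hidden from other consumers
            if count >= self.max_receives:
                # received too often without being deleted: move to the DLQ
                self.dead_letters.append(body)
                del self.messages[msg_id]
                continue
            entry[1] = now + self.visibility_timeout
            entry[2] = count + 1
            return msg_id, body
        return None

    def delete(self, msg_id):
        """Consumers must delete explicitly once processing succeeds."""
        self.messages.pop(msg_id, None)


q = ToyQueue(visibility_timeout=30, max_receives=2)
q.send("job-1")
first = q.receive(now=0)     # consumer A gets the message
hidden = q.receive(now=10)   # consumer B sees nothing during the timeout
again = q.receive(now=31)    # A never deleted it, so it is delivered again
q.receive(now=62)            # receive limit reached, message goes to the DLQ
print(first, hidden, again, q.dead_letters)
```

The redelivery at `now=31` is the at-least-once behaviour in action: a consumer that crashes before deleting the message causes a duplicate delivery, so processing should be idempotent.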
SNS
delivery or sending of messages to subscribing endpoints or clients
Publishers communicate asynchronously with subscribers by producing and sending a message to a topic
supports Email (plain or JSON), HTTP/HTTPS, SMS, SQS
supports Mobile Push Notifications to push notifications directly to mobile devices through services like Amazon Device Messaging (ADM), Apple Push Notification Service (APNS), Google Cloud Messaging (GCM), etc.
order is not guaranteed and No recall available
integrated with Lambda to invoke functions on notifications
for Email notifications, use SNS or SES directly, SQS does not work
SWF
orchestration service to coordinate work across distributed components
helps define tasks, stores and assigns tasks to workers, defines logic, tracks and monitors tasks, and maintains workflow state in a durable fashion
helps define tasks which can be executed on AWS cloud or on-premises
helps coordinate tasks across the application, which involves managing intertask dependencies, scheduling, and concurrency in accordance with the logical flow of the application
supports built-in retries, timeouts and logging
supports manual tasks
delivers each task exactly once
uses long polling, which reduces number of polls without results
Visibility of task state via API
Timers, signals, markers, child workflows
keeps workflow history for a user-specified time
AWS SWF vs AWS SQS
task-oriented vs message-oriented
track of all tasks and events vs needs custom handling
SES
highly scalable and cost-effective email service
uses content filtering technologies to scan outgoing emails to check standards and email content for spam and malware
supports full-fledged emails to be sent, as compared to SNS where only the message is sent in the email
ideal for sending bulk emails at scale
guarantees first hop
eliminates the need to support custom software or applications to do heavy lifting of email transport
Aurora Global Database
allows a single Aurora database to span multiple AWS regions.
provides Physical replication, which uses dedicated infrastructure that leaves the databases entirely available to serve the application
supports 1 Primary Region (read / write)
replicates across up to 5 secondary (read-only) regions, replication lag is less than 1 second
supports up to 16 Read Replicas per secondary region
recommended for low-latency global reads and disaster recovery with an RTO of < 1 minute
failover is not automated; if the primary region becomes unavailable, a secondary region can be manually removed from the Aurora Global Database and promoted to take full reads and writes. The application needs to be updated to point to the newly promoted region.
supports parallel or distributed query using Aurora Parallel Query, which refers to the ability to push down and distribute the computational load of a single query across thousands of CPUs in Aurora’s storage layer.
VPC
helps define a logically isolated dedicated virtual network within AWS
provides control of IP addressing using CIDR block from a minimum of /28 to maximum of /16 block size
supports IPv4 and IPv6 addressing
could not originally be extended once created, but can now be extended by associating secondary IPv4 CIDR blocks with the VPC
Internet gateway (IGW) provides access to the Internet
Virtual private gateway (VGW) provides access to the on-premises data center through VPN and Direct Connect connections
VPC can have only one IGW and VGW
Route tables determine where network traffic from subnet is directed
Ability to create subnet with VPC CIDR block
A Network Address Translation (NAT) server provides outbound Internet access for EC2 instances in private subnets
Elastic IP addresses are static, persistent public IP addresses
Instances launched in the VPC will have a Private IP address and can have a Public or an Elastic IP address associated with them
Security Groups and NACLs help define security
Flow logs – Capture information about the IP traffic going to and from network interfaces in your VPC
Tenancy option for instances
shared, by default, allows instances to be launched on shared tenancy
dedicated allows instances to be launched on a dedicated hardware
Route Tables
define rules, termed as routes, which determine where network traffic from the subnet would be routed
Each VPC has a Main Route table, and can have multiple custom route tables created
Every route table contains a local route that enables communication within a VPC which cannot be modified or deleted
Route priority is decided by matching the most specific route in the route table that matches the traffic
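The "most specific route wins" rule is longest-prefix matching, which can be sketched with the standard `ipaddress` module. The route table below and its target names (`igw-example`, `vgw-example`, `pcx-example`) are made up for illustration.

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target name
ROUTES = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "igw-example",
    "172.16.0.0/12": "vgw-example",
    "172.16.5.0/24": "pcx-example",
}


def select_route(destination_ip, routes=ROUTES):
    """Return the target of the most specific (longest-prefix) matching route."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # the route with the longest prefix length is the most specific one
    network, target = max(matches, key=lambda m: m[0].prefixlen)
    return target


for ip in ("10.0.3.7", "172.16.5.9", "172.16.9.9", "8.8.8.8"):
    print(ip, "->", select_route(ip))
```

Note how `172.16.5.9` matches three routes (`/0`, `/12`, `/24`) and the `/24` is chosen, while the default route only catches traffic nothing else matches.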
Subnets
map to AZs and do not span across AZs
have a CIDR range that is a portion of the whole VPC.
CIDR ranges cannot overlap between subnets within the VPC.
AWS reserves 5 IP addresses in each subnet – first 4 and last one
Each subnet is associated with a route table which defines its behavior
Public subnets – inbound/outbound Internet connectivity via IGW
Private subnets – outbound Internet connectivity via a NAT or VGW
Protected subnets – no outbound connectivity and used for regulated workloads
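The subnet sizing rules above are easy to check with a few lines of Python. This is a rough sketch using the standard `ipaddress` module; the function names are my own and the CIDR values are only examples.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one


def usable_addresses(subnet_cidr):
    """Number of addresses left for instances after the AWS reservation."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def validate_subnets(vpc_cidr, subnet_cidrs):
    """Check each subnet lies inside the VPC and that no two subnets overlap."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(c) for c in subnet_cidrs]
    for s in subnets:
        if not s.subnet_of(vpc):
            return False
    for i, a in enumerate(subnets):
        for b in subnets[i + 1:]:
            if a.overlaps(b):
                return False
    return True


print(usable_addresses("10.0.1.0/24"))
print(validate_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"]))
```

For the smallest allowed block, a /28, only 11 of the 16 addresses remain usable, which is worth remembering when sizing small subnets.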
Elastic Network Interface (ENI)
a default ENI, eth0, is attached to an instance and cannot be detached; one or more secondary ENIs (eth1-ethn) can be attached and detached
has primary private, one or more secondary private, public, Elastic IP address, security groups, MAC address and source/destination check flag attributes associated
An ENI in one subnet can be attached to an instance in the same or another subnet, in the same AZ and the same VPC
Security group membership of an ENI can be changed
an ENI with a pre-allocated MAC address can be used for applications with special licensing requirements
Security Groups vs Network Access Control Lists
Stateful vs Stateless
At instance level vs At subnet level
Only allows Allow rule vs Allows both Allow and Deny rules
Evaluated as a Whole vs Evaluated in defined Order
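The "evaluated as a whole vs evaluated in defined order" difference can be illustrated with a small sketch. It only models port matching for a single direction (so the stateful vs stateless difference is not shown), and the rule tuples are a simplified, hypothetical representation rather than the real rule format.

```python
def nacl_allows(rules, port):
    """NACL: rules are checked in ascending rule-number order; first match wins."""
    for number, low, high, action in sorted(rules):
        if low <= port <= high:
            return action == "allow"
    return False  # the implicit final rule denies everything else


def security_group_allows(rules, port):
    """Security group: allow rules only; any matching rule permits the traffic."""
    return any(low <= port <= high for low, high in rules)


# Hypothetical NACL rules: (rule number, from port, to port, action)
nacl = [(200, 0, 65535, "allow"), (100, 22, 22, "deny")]
# Hypothetical security group rules: (from port, to port)
sg = [(443, 443), (80, 80)]

print(nacl_allows(nacl, 22), nacl_allows(nacl, 443))
print(security_group_allows(sg, 443), security_group_allows(sg, 22))
```

Swapping the two NACL rule numbers would let port 22 through, because the broad allow would then match first. Rule order never matters for the security group.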
Elastic IP
is a static IP address designed for dynamic cloud computing.
is associated with the AWS account, and not a particular instance
can be remapped from one instance to another instance
is charged when not in use, i.e. when it is not associated with any instance or the associated instance is in the stopped state
NAT
allows internet access to instances in a private subnet
performs the function of both address translation and port address translation (PAT)
a NAT instance needs the source/destination check flag to be disabled as it is not the actual source or destination of the traffic
NAT gateway is an AWS managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort
are not supported for IPv6 traffic
Egress-Only Internet Gateways
outbound communication over IPv6 from instances in the VPC to the Internet, and prevents the Internet from initiating an IPv6 connection with your instances
supports IPv6 traffic only
Shared VPCs
allow multiple AWS accounts to create their application resources, such as EC2 instances, RDS databases, Redshift clusters, and AWS Lambda functions, in shared, centrally-managed VPCs
VPC Peering
allows routing of traffic between the peer VPCs using private IP addresses and no IGW or VGW required
No single point of failure and bandwidth bottlenecks
originally could not span across regions, but inter-region VPC peering is now supported
IP space or CIDR blocks cannot overlap
cannot be transitive, one-to-one relationship between two VPC
Only one between any two VPCs and have to be explicitly peered
Private DNS values cannot be resolved
Security groups from a peered VPC can now be referenced in ingress and egress rules, but only if the peered VPC is in the same region; otherwise use the CIDR block instead
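Two of the peering rules above, no overlapping CIDR blocks and no transitive routing, can be expressed as a quick sketch. The VPC names and the set-of-pairs representation are assumptions for illustration only.

```python
import ipaddress


def can_peer(cidr_a, cidr_b):
    """Peering requires non-overlapping CIDR blocks."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def can_route(peerings, src, dst):
    """Traffic flows only over a direct peering; there is no transitive routing."""
    return frozenset((src, dst)) in peerings


# Hypothetical VPC names: A<->B and B<->C are peered, A<->C is not
peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}

print(can_peer("10.0.0.0/16", "10.1.0.0/16"))
print(can_route(peerings, "A", "B"), can_route(peerings, "A", "C"))
```

Even though A reaches B and B reaches C, A cannot reach C until an explicit A<->C peering is created.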
VPC Endpoints
enable you to privately connect a VPC to supported AWS services and VPC endpoint services powered by PrivateLink
does not require a public IP address, access over the Internet, NAT device, a VPN connection or Direct Connect
traffic between VPC & AWS service does not leave the Amazon network
are virtual devices.
are horizontally scaled, redundant, and highly available VPC components that allow communication between instances in your VPC and services without imposing availability risks or bandwidth constraints on your network traffic.
Gateway Endpoint
is a gateway that is a target for a specified route in the route table, used for traffic destined to a supported AWS service.
only S3 and DynamoDB are currently supported
Interface Endpoint
is an elastic network interface with a private IP address that serves as an entry point for traffic destined to a supported service
services supported include API Gateway, AWS CloudFormation, CloudWatch, CloudWatch Events, CloudWatch Logs, AWS CodeBuild, AWS CodeCommit, AWS Config, EC2 API, Elastic Load Balancing API, Elastic Container Registry, Elastic Container Service, AWS Key Management Service, Kinesis Data Streams, SageMaker, SageMaker Runtime, SageMaker Notebook Instance, AWS Secrets Manager, AWS Security Token Service, AWS Service Catalog, SNS, SQS, and AWS Systems Manager
CloudFront
provides low latency and high data transfer speeds for distribution of static, dynamic web, or streaming content to web users.
delivers the content through a worldwide network of data centers called Edge Locations or Point of Presence (PoPs)
keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.
dramatically reduces the number of network hops that users’ requests must pass through
supports multiple origin server options, like AWS hosted services, e.g. S3, EC2, ELB, or an on-premises server, which store the original, definitive version of the objects
single distribution can have multiple origins and Path pattern in a cache behavior determines which requests are routed to the origin
Web distribution supports static, dynamic web content, on-demand using progressive download & HLS, and live streaming video content
supports HTTPS using either
dedicated IP address, which is expensive as a dedicated IP address is assigned to each CloudFront edge location
Server Name Indication (SNI), which is free but supported by modern browsers only with the domain name available in the request header
For E2E HTTPS connection,
Viewers -> CloudFront needs either a self-signed certificate or a certificate issued by CA or ACM
CloudFront -> Origin needs certificate issued by ACM for ELB and by CA for other origins
Origin Access Identity (OAI) can be used to restrict the content from S3 origin to be accessible from CloudFront only
supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
supports Signed URLs and Signed Cookies to restrict access to private content
Signed URLs are suited
to restrict access to individual files, e.g., an installation download for your application.
when users are using a client, e.g. a custom HTTP client, that doesn’t support cookies
Signed Cookies are suited
to provide access to multiple restricted files, e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
when you don’t want to change the current URLs
integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
object removal from the cache
would be removed upon expiry (TTL) from the cache, by default 24 hrs
can be invalidated explicitly, but this has a cost associated; also, users who have the object cached locally or behind a caching proxy might continue to see the old version until it expires from those caches
objects can be invalidated only for Web distribution
use versioning or change object name, to serve a different version
supports adding or modifying custom headers before the request is sent to origin which can be used to
validate if a user is accessing the content from CDN
identifying CDN from which the request was forwarded, in case of multiple CloudFront distributions
for viewers not supporting CORS to return the Access-Control-Allow-Origin header for every request
supports Partial GET requests using range header to download objects in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
supports different price classes to include all regions, or only the least expensive regions and other regions without the most expensive regions
supports access logs which contain detailed information about every user request for both web and RTMP distribution
Direct Connect & VPN
VPN
provides secure IPSec connections from on-premises computers or services to AWS over the Internet
is quick to set up and cheap; however, it depends on the Internet speed
Direct Connect
is a network service that provides an alternative to using the Internet to utilize AWS services by using a private dedicated network connection
provides Virtual Interfaces
Private VIF to access instances within a VPC via VGW
Public VIF to access non-VPC services
requires time to set up, probably months, and should not be considered as an option if the turnaround time is short
does not provide redundancy, use either second direct connection or IPSec VPN connection
Virtual Private Gateway is on the AWS side and Customer Gateway is on the Customer side
route propagation is enabled on VGW and not on CGW
Direct Connect vs VPN IPSec
Expensive to Setup and Takes time vs Cheap & Immediate
Dedicated private connections vs Internet
Reduced data transfer rate vs Internet data transfer cost
Consistent performance vs Internet inherent variability
Do not provide Redundancy vs Provides Redundancy
Route 53
Highly available and scalable DNS & Domain Registration Service
Reliable and cost-effective way to route end users to Internet applications
Supports multi-region and backup architectures for High availability. ELB , limited to region, does not support multi region HA architecture
supports private Intranet facing DNS service
internal resource record sets only work for requests originating from within the VPC and currently cannot extend to on-premises
Global propagation of any changes made to the DNS records within ~ 1min
Route 53 can create an alias resource record set that points to ELB, S3, or CloudFront. An alias resource record set is a Route 53 extension to DNS. It’s similar to a CNAME resource record set, but works both for the root domain (zone apex), e.g. example.com, and for subdomains, e.g. www.example.com.
CNAME resource record sets can be created only for subdomains and cannot be mapped to the zone apex record
Route 53 Split-view (Split-horizon) DNS enables you to access an internal version of your website using the same domain name that is used publicly
Simple routing – simple round robin policy
Weighted round robin – assign weights to resource records sets to specify the proportion for e.g. 80%:20%
Latency based routing – helps improve global applications as requests are sent to the server in the location with minimal latency; it is based on latency and cannot guarantee that users from the same geography will be served from the same location, e.g. for compliance reasons
Geolocation routing – Specify geographic locations by continent, country, state limited to US, is based on IP accuracy
Failover routing – failover to a backup site if the primary site fails and becomes unreachable
Weighted, Latency and Geolocation can be used for Active-Active while Failover routing can be used for Active-Passive multi region architecture
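The weighted and failover policies boil down to simple arithmetic and a health check, as this small sketch shows. It does not call Route 53; the function names and the record names `blue` and `green` are hypothetical.

```python
import random


def traffic_share(weights):
    """Expected share per record: weight / sum of all weights."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_record(weights, rng=random):
    """Pick one record at random, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def failover(primary_healthy):
    """Active-passive: answer with the secondary only when the primary is unhealthy."""
    return "primary" if primary_healthy else "secondary"


weights = {"blue": 80, "green": 20}   # hypothetical record names
print(traffic_share(weights))
print(pick_record(weights))
print(failover(primary_healthy=False))
```

Weights are relative, so 80:20 and 4:1 give the same split, and a weight of 0 takes a record out of rotation as long as other records have non-zero weights.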
CloudFormation
gives developers and systems administrators an easy way to create and manage a collection of related AWS resources
Resources can be updated, deleted, and modified in an orderly, controlled and predictable fashion, in effect applying version control to the AWS infrastructure as code done for software code
CloudFormation Template is an architectural diagram, in JSON or YAML format, and Stack is the end result of that diagram, which is actually provisioned
template can be used to set up the resources consistently and repeatedly over and over across multiple regions and consists of
List of AWS resources and their configuration values
An optional template file format version number
An optional list of template parameters (input values supplied at stack creation time)
An optional list of output values like public IP address using the Fn::GetAtt function
An optional list of data tables used to lookup static configuration values for e.g., AMI names per AZ
supports Chef & Puppet Integration to deploy and configure right down the application layer
supports Bootstrap scripts to install packages, files, and services on the EC2 instances by simply describing them in the CF template
automatic rollback on error feature is enabled, by default, which will cause all the AWS resources that CF created successfully for a stack up to the point where an error occurred to be deleted
provides a WaitCondition resource to block the creation of other resources until a completion signal is received from an external source
allows DeletionPolicy attribute to be defined for resources in the template
retain to preserve resources like S3 even after stack deletion
snapshot to backup resources like RDS after stack deletion
DependsOn attribute to specify that the creation of a specific resource follows another
Service role is an IAM role that allows AWS CloudFormation to make calls to resources in a stack on the user’s behalf
Nested stacks can separate out reusable, common components and create dedicated templates to mix and match different templates but use nested stacks to create a single, unified stack
Change Sets presents a summary or preview of the proposed changes that CloudFormation will make when a stack is updated
Drift detection enables you to detect whether a stack’s actual configuration differs, or has drifted, from its expected configuration.
Termination protection helps prevent a stack from being accidentally deleted.
Stack policy can prevent stack resources from being unintentionally updated or deleted during a stack update.
StackSets extends the functionality of stacks by enabling you to create, update, or delete stacks across multiple accounts and Regions with a single operation.
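To tie the template sections together, here is a minimal template built as a Python dictionary and serialized to JSON. It is only an illustrative sketch that has not been deployed: the logical names (`LogBucket`, `WebServer`, `RegionToAmi`), the instance type default, and the AMI ID `ami-00000000` are placeholders.

```python
import json

# Illustrative template; logical names and the AMI ID are placeholders.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        # input value supplied at stack creation time
        "InstanceType": {"Type": "String", "Default": "t3.micro"},
    },
    "Mappings": {
        # lookup table for static configuration values
        "RegionToAmi": {"us-east-1": {"Ami": "ami-00000000"}},
    },
    "Resources": {
        "LogBucket": {
            "Type": "AWS::S3::Bucket",
            "DeletionPolicy": "Retain",   # kept after the stack is deleted
        },
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "DependsOn": "LogBucket",     # created only after the bucket
            "Properties": {
                "InstanceType": {"Ref": "InstanceType"},
                "ImageId": {
                    "Fn::FindInMap": ["RegionToAmi", {"Ref": "AWS::Region"}, "Ami"]
                },
            },
        },
    },
    "Outputs": {
        "PublicIp": {"Value": {"Fn::GetAtt": ["WebServer", "PublicIp"]}},
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```

Only the Resources section is mandatory; the format version, parameters, mappings, and outputs are the optional parts listed above.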
Elastic Beanstalk
makes it easier for developers to quickly deploy and manage applications in the AWS cloud.
automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling and application health monitoring
CloudFormation supports ElasticBeanstalk
provisions resources to support
a web application that handles HTTP(S) requests or
a web application that handles background-processing (worker) tasks
supports Out Of the Box
Apache Tomcat for Java applications
Apache HTTP Server for PHP applications
Apache HTTP server for Python applications
Nginx or Apache HTTP Server for Node.js applications
Passenger for Ruby applications
Microsoft IIS 7.5 for .NET applications
Single and Multi Container Docker
supports custom AMI to be used
is designed to support multiple running environments such as one for Dev, QA, Pre-Prod and Production.
supports versioning and stores and tracks application versions over time allowing easy rollback to prior version
can provision RDS DB instance and connectivity information is exposed to the application by environment variables, but is NOT recommended for production setup as the RDS is tied up with the Elastic Beanstalk lifecycle and if deleted, the RDS instance would be deleted as well
OpsWorks
is a configuration management service that helps to configure and operate applications in a cloud enterprise by using Chef
helps deploy and monitor applications in stacks with multiple layers
supports preconfigured layers for Applications, Databases, Load Balancers, Caching
OpsWorks Stacks features a set of lifecycle events – Setup, Configure, Deploy, Undeploy, and Shutdown – which automatically run a specified set of recipes at the appropriate time on each instance
Layers depend on Chef recipes to handle tasks such as installing packages on instances, deploying apps, running scripts, and so on
OpsWorks Stacks runs the recipes for each layer, even if the instance belongs to multiple layers
supports Auto Healing and Auto Scaling to monitor instance health, and provision new instances
CloudWatch
allows monitoring of AWS resources and applications in real time, collecting and tracking pre-configured or custom metrics, and configuring alarms to send notifications or make resource changes based on defined rules
does not aggregate data across regions
stores the log data indefinitely, and the retention can be changed for each log group at any time
alarm history is stored for only 14 days
can be used as an alternative to S3 to store logs with the ability to configure Alarms and generate metrics; however, logs cannot be made public
Alarms exist only in the created region and the Alarm actions must reside in the same region as well
CloudTrail
records API calls for the AWS account made from the AWS Management Console, SDKs, CLI, and higher-level AWS services
supports many AWS services and tracks who did what, from where, and when
can be enabled on a per-region basis, a region can include global services (like IAM, STS etc), and is applicable to all the supported services within that region
log files from different regions can be sent to the same S3 bucket
can be integrated with SNS to notify logs availability, CloudWatch logs log group for notifications when specific API events occur
call history enables security analysis, resource change tracking, trouble shooting and compliance auditing
IAM
securely controls access to AWS services and resources
helps create and manage user identities and grant permissions for those users to access AWS resources
helps create groups for multiple users with similar permissions
not appropriate for application authentication
is Global and does not need to be migrated to a different region
helps define Policies,
in JSON format
all permissions are implicitly denied by default
most restrictive policy wins (an explicit deny overrides any allow)
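The policy rules above (implicit deny by default, most restrictive policy wins) can be sketched as a small evaluator. This is a simplified model: it only looks at Effect, Action, and Resource, uses shell-style wildcard matching as an approximation, and ignores conditions, principals, and permission boundaries. The two sample policies are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical policies in IAM's JSON policy shape.
ALLOW_S3 = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}
DENY_DELETE = {
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Deny",
        "Action": "s3:DeleteObject",
        "Resource": "arn:aws:s3:::logs/*",
    },
}


def _as_list(value):
    """Policies allow a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def _matches(statement, action, resource):
    """Check whether a statement applies to the action and resource."""
    action_ok = any(fnmatchcase(action, p) for p in _as_list(statement["Action"]))
    resource_ok = any(fnmatchcase(resource, p) for p in _as_list(statement["Resource"]))
    return action_ok and resource_ok


def is_allowed(policies, action, resource):
    """Default deny; an explicit Deny beats any Allow."""
    allowed = False
    for policy in policies:
        for statement in _as_list(policy["Statement"]):
            if not _matches(statement, action, resource):
                continue
            if statement["Effect"] == "Deny":
                return False  # explicit deny always wins
            allowed = True
    return allowed
```

With only the allow policy attached, S3 reads succeed and anything outside S3 is implicitly denied; adding the deny policy blocks deletes in the logs bucket even though `s3:*` is allowed.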
IAM Role helps grant and delegate access to users and services without the need of creating permanent credentials
IAM users or AWS services can assume a role to obtain temporary security credentials that can be used to make AWS API calls
needs a Trust policy to define who can assume the role and a Permission policy to define what the user or service can access
used with Security Token Service (STS), a lightweight web service that provides temporary, limited privilege credentials for IAM users or for authenticated federated users
IAM role scenarios
Service access, e.g. EC2 accessing S3 or DynamoDB
Cross Account access for users
with user within the same account
with user within an AWS account owned by the same owner
with user from a Third Party AWS account with External ID for enhanced security
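For the third-party cross-account scenario, the role's trust policy is where the External ID is enforced. The sketch below builds such a trust policy as a Python dict; the account ID and External ID passed in are placeholders chosen by the caller, and the function name is hypothetical.

```python
import json


def cross_account_trust_policy(trusted_account_id, external_id=None):
    """Build a role trust policy allowing another account to assume the role.

    When `external_id` is given, the caller must pass the same value to
    sts:AssumeRole, which is the extra check used for third-party access.
    """
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        statement["Condition"] = {
            "StringEquals": {"sts:ExternalId": external_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # Placeholder account ID and External ID for illustration.
    print(json.dumps(cross_account_trust_policy("111122223333", "example-id"), indent=2))
```

The permission policy that defines what the role can do is attached separately; the trust policy only answers who may assume it.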
Identity Providers & Federation
Web Identity Federation, where the user can be authenticated using external Identity providers like Amazon, Google or any OpenID Connect (OIDC) compatible IdP using AssumeRoleWithWebIdentity
Identity Provider using SAML 2.0, where the user can be authenticated using on-premises Active Directory, OpenLDAP or any SAML 2.0 compliant IdP using AssumeRoleWithSAML
For other Identity Providers, use Identity Broker to authenticate and provide temporary Credentials using AssumeRole (recommended) or GetFederationToken
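The three federation options above boil down to a lookup from the kind of identity provider to the STS call used. A small sketch, where the category keys (`web_identity`, `saml`, `custom_broker`) are made-up labels for this example:

```python
# Made-up category labels mapped to the STS API named in the notes above.
STS_API_BY_PROVIDER = {
    "web_identity": "AssumeRoleWithWebIdentity",  # Amazon, Google, OIDC IdPs
    "saml": "AssumeRoleWithSAML",                 # AD, SAML 2.0 compliant IdPs
    "custom_broker": "AssumeRole",                # identity broker (recommended call)
}


def sts_api_for(provider_kind):
    """Return the STS API to use for a given identity provider category."""
    try:
        return STS_API_BY_PROVIDER[provider_kind]
    except KeyError:
        raise ValueError(f"unknown provider kind: {provider_kind}") from None
```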
IAM Best Practices
Do not use Root account for anything other than billing
Create Individual IAM users
Use groups to assign permissions to IAM users
Grant least privilege
Use IAM roles for applications on EC2
Delegate using roles instead of sharing credentials
Rotate credentials regularly
Use Policy conditions for increased granularity
Use CloudTrail to keep a history of activity
Enforce a strong IAM password policy for IAM users
Remove all unused users and credentials
CloudHSM provides secure cryptographic key storage to customers by making hardware security modules (HSMs) available in the AWS cloud
single tenant, dedicated physical device to securely generate, store, and manage cryptographic keys used for data encryption
are inside the VPC (not EC2-classic) & isolated from the rest of the network
can use VPC peering to connect to CloudHSM from multiple VPCs
integrated with Amazon Redshift and Amazon RDS for Oracle
EBS volume encryption, S3 object encryption and key management can be done with CloudHSM but requires custom application scripting
is NOT fault tolerant on its own; a cluster needs to be built, because if a single HSM fails all the keys are lost
expensive, prefer AWS Key Management Service (KMS) if cost is a criterion
AWS Directory Services
gives applications in AWS access to Active Directory services
different from SAML + AD, where the access is granted to AWS services through Temporary Credentials
Simple AD is the least expensive option but does not support Microsoft AD advanced features
provides a Samba 4 Microsoft Active Directory compatible standalone directory service on AWS
No single point of Authentication or Authorization, as a separate copy is maintained
trust relationships cannot be set up between Simple AD and other Active Directory domains
Don’t use it, if the requirement is to leverage access and control through centralized authentication service
AD Connector acts just as a hosted proxy service for instances in AWS to connect to on-premises Active Directory
enables consistent enforcement of existing security policies, such as password expiration, password history, and account lockouts, whether users are accessing resources on-premises or in the AWS cloud
needs VPN connectivity (or Direct Connect)
integrates with existing RADIUS-based MFA solutions to enable multi-factor authentication
does not cache data which might lead to latency
Read-only Domain Controllers (RODCs)
works out as a Read-only Active Directory
holds a copy of the Active Directory Domain Service (AD DS) database and responds to authentication requests
they cannot be written to and are typically deployed in locations where physical security cannot be guaranteed
helps maintain a single point to authentication & authorization controls, however needs to be synced
Writable Domain Controllers
are expensive to set up
operate in a multi-master model; changes can be made on any writable server in the forest, and those changes are replicated to servers throughout the entire forest
AWS WAF is a web application firewall that helps monitor HTTP/HTTPS traffic and allows controlling access to the content.
helps protect web applications from attacks by allowing rules configuration that allow, block, or monitor (count) web requests based on defined conditions. These conditions include IP addresses, HTTP headers, HTTP body, URI strings, SQL injection and cross-site scripting.
helps define Web ACLs, each of which is a combination of Rules, where each Rule is a combination of Conditions and an Action to block or allow
integrated with CloudFront, Application Load Balancer (ALB), API Gateway services commonly used to deliver content and applications
supports custom origins outside of AWS, when integrated with CloudFront
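The Web ACL structure (rules made of conditions plus an action) can be sketched as below. This is a simplified model, not the AWS WAF API: it assumes rules are checked in order, the first matching allow or block rule decides the outcome, count rules only record a match, and a default action applies when nothing matches. The rule names and IP range are illustrative.

```python
from ipaddress import ip_address, ip_network


def ip_in(cidr):
    """Condition: the request's source IP falls inside the CIDR range."""
    network = ip_network(cidr)
    return lambda request: ip_address(request["ip"]) in network


def uri_contains(fragment):
    """Condition: the request URI contains the given string."""
    return lambda request: fragment in request["uri"]


def evaluate(web_acl, request):
    """Return (action, counted_rule_names) for a request."""
    counted = []
    for rule in web_acl["rules"]:
        if all(condition(request) for condition in rule["conditions"]):
            if rule["action"] == "count":
                counted.append(rule["name"])  # monitor only, keep evaluating
                continue
            return rule["action"], counted
    return web_acl["default_action"], counted


# Illustrative Web ACL: watch admin URIs, block one documentation IP range.
WEB_ACL = {
    "default_action": "allow",
    "rules": [
        {"name": "watch-admin", "action": "count",
         "conditions": [uri_contains("/admin")]},
        {"name": "block-bad-range", "action": "block",
         "conditions": [ip_in("198.51.100.0/24")]},
    ],
}
```

A request from 198.51.100.7 to /admin/login is counted by the first rule and then blocked by the second.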
Third Party WAF
act as filters that apply a set of rules to web traffic to cover exploits like XSS and SQL injection and also help build resiliency against DDoS by mitigating HTTP GET or POST floods
WAF provides a lot of features like OWASP Top 10 protection, HTTP rate limiting, whitelisting or blacklisting, inspection and identification of requests with abnormal patterns, CAPTCHA etc.
a WAF sandwich pattern can be implemented where an autoscaled WAF sits between the Internet and Internal Load Balancer
AWS Shield is a managed service that provides protection against Distributed Denial of Service (DDoS) attacks for applications running on AWS
provides protection for all AWS customers against common and most frequently occurring infrastructure (layer 3 and 4) attacks like SYN/UDP floods, reflection attacks, and others to support high availability of applications on AWS.
provides AWS Shield Advanced with additional protections against more sophisticated and larger attacks for applications running on EC2, ELB, CloudFront, AWS Global Accelerator, and Route 53
GuardDuty offers threat detection that enables continuous monitoring and protects the AWS accounts and workloads.
analyzes continuous streams of metadata generated from AWS account and network activity found in AWS CloudTrail Events, VPC Flow Logs, and DNS Logs.
uses integrated threat intelligence such as known malicious IP addresses, anomaly detection, and machine learning to identify threats more accurately.
operates completely independently from the resources so there is no risk of performance or availability impacts to the workloads
Amazon Inspector is an automated security assessment service that helps test the network accessibility of EC2 instances and the security state of the applications running on the instances.
helps automate security vulnerability assessments throughout the development and deployment pipeline or against static production systems
AWS Artifact is a self-service audit artifact retrieval portal that provides customers with on-demand access to AWS’ compliance documentation and agreements
can use AWS Artifact Reports to download AWS security and compliance documents, such as AWS ISO certifications, Payment Card Industry (PCI), and System and Organization Control (SOC) reports.
ELB is a managed load balancing service that scales automatically
distributes incoming application traffic across multiple EC2 instances
is a distributed system that is fault tolerant and actively monitored; AWS scales it as per the demand
are engineered to not be a single point of failure
may need to be Pre-Warmed if the demand is expected to spike, especially during load testing. AWS documentation no longer mentions this.
supports routing traffic to instances in multiple AZs in the same region
performs Health Checks to route traffic only to the healthy instances
supports Listeners with HTTP, HTTPS, SSL, TCP protocols
has an associated IPv4 and dual stack DNS name
can offload the work of encryption and decryption (SSL termination) so that the EC2 instances can focus on their main work
supports Cross Zone load balancing to help route traffic evenly across all EC2 instances regardless of the AZs they reside in
to help identify the IP address of a client
supports Proxy Protocol header for TCP/SSL connections
supports X-Forward headers for HTTP/HTTPS connections
supports Sticky Sessions (session affinity) to bind a user’s session to a specific application instance,
it is not fault tolerant, if an instance is lost the information is lost
requires HTTP/HTTPS listener and does not work with TCP
requires SSL termination on ELB as it uses the headers
supports Connection draining to help complete the in-flight requests in case an instance is deregistered
For High Availability, it is recommended to attach one subnet per AZ for at least two AZs, even if the instances are in a single subnet.
supports Static/Elastic IP (NLB only)
supports IPv4 & IPv6; VPC originally did not support IPv6, but it does now.
HTTPS listener does not support Client Side Certificate
For SSL termination at backend instances or support for Client Side Certificate use TCP for connections from the client to the ELB, use the SSL protocol for connections from the ELB to the back-end application, and deploy certificates on the back-end instances handling requests
supports a single SSL certificate, so for multiple SSL certificates multiple ELBs need to be created
uses Server Name Indication (SNI) to support multiple SSL certificates
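For the client IP notes above (X-Forwarded headers for HTTP/HTTPS listeners, Proxy Protocol for TCP/SSL listeners), here is a minimal sketch of how a back-end application might read each. It assumes the human-readable Proxy Protocol version 1 line format and takes the left-most X-Forwarded-For entry as the client; that entry is client-supplied and can be spoofed, so treat it as a hint rather than proof. Function names are illustrative.

```python
def client_ip_from_xff(header_value):
    """Return the left-most address in an X-Forwarded-For header.

    The header is a comma-separated chain: client, proxy1, proxy2, ...
    """
    return header_value.split(",")[0].strip()


def parse_proxy_protocol_v1(line):
    """Parse a Proxy Protocol v1 line.

    Expected shape: PROXY <proto> <src_ip> <dst_ip> <src_port> <dst_port>
    The short "PROXY UNKNOWN" form is not handled in this sketch.
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError("not a Proxy Protocol v1 TCP header")
    _, protocol, src_ip, dst_ip, src_port, dst_port = parts
    return {
        "protocol": protocol,
        "client_ip": src_ip,
        "client_port": int(src_port),
        "proxy_ip": dst_ip,
        "proxy_port": int(dst_port),
    }
```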
ALB supports HTTP/2, which is enabled natively. Clients that support HTTP/2 can connect over TLS
supports WebSockets and Secure WebSockets natively
supports Request tracing, by default.
request tracing can be used to track HTTP requests from clients to targets or other services.
Load balancer upon receiving a request from a client, adds or updates the X-Amzn-Trace-Id header before sending the request to the target
supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port.
supports Sticky Sessions (Session Affinity) using load balancer generated cookies, to route requests from the same client to the same target
supports SSL termination, to decrypt the request on ALB before sending it to the underlying targets.
supports layer 7 specific features like X-Forwarded-For headers to help determine the actual client IP, port and protocol
automatically scales its request handling capacity in response to incoming application traffic.
supports hybrid load balancing, to route traffic to instances in VPC and an on-premises location
provides High Availability, by allowing more than one AZ to be specified
integrates with ACM to provision and bind an SSL/TLS certificate to the load balancer thereby making the entire SSL offload process very easy
supports multiple certificates for the same domain to a secure listener
supports IPv6 addressing, for an Internet facing load balancer
supports Cross-zone load balancing, which is always enabled and cannot be disabled.
supports Security Groups to control the traffic allowed to and from the load balancer.
provides Access Logs, to record all requests sent to the load balancer, and store the logs in S3 for later analysis in compressed format
provides Delete Protection, to prevent the ALB from accidental deletion
supports Connection Idle Timeout – ALB maintains two connections for each request one with the Client (front end) and one with the target instance (back end). If no data has been sent or received by the time that the idle timeout period elapses, ALB closes the front-end connection
integrates with CloudWatch to provide metrics such as request counts, error counts, error types, and request latency
integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configuration based on IP addresses, HTTP headers, and custom URI strings
integrates with CloudTrail to receive a history of ALB API calls made on the AWS account
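The X-Amzn-Trace-Id header from the request tracing notes above is a semicolon-separated list of key=value fields (for example Root and Self). A minimal parsing sketch, with an illustrative function name and a sample header value in the documented shape:

```python
def parse_trace_id(header_value):
    """Split an X-Amzn-Trace-Id header into its key=value fields."""
    fields = {}
    for part in header_value.split(";"):
        key, _, value = part.strip().partition("=")
        if key:
            fields[key] = value
    return fields


if __name__ == "__main__":
    sample = "Self=1-67891234-12456789abcdef012345678;Root=1-67891233-abcdef012345678912345678"
    print(parse_trace_id(sample)["Root"])
```

A target can log the Root field to correlate a request across the load balancer and downstream services.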
NLB handles volatile workloads and scales to millions of requests per second, without the need of pre-warming
offers extremely low latencies for latency-sensitive applications.
provides static IP/Elastic IP addresses for the load balancer
allows registering targets by IP address, including targets outside the VPC (on-premises) for the load balancer.
supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port.
monitors the health of its registered targets and routes the traffic only to healthy targets
cross-zone load balancing can be enabled only after creating the NLB
preserves client side source IP allowing the back-end to see client IP address. Target groups can be created with target type as instance ID or IP address. If targets are registered by instance ID, the source IP addresses of the clients are preserved and provided to the applications. If targets are registered by IP address, the source IP addresses are the private IP addresses of the load balancer nodes.
supports both network and application target health checks.
supports long-lived TCP connections ideal for WebSocket type of applications
supports Zonal Isolation, which is designed for application architectures in a single zone and can be enabled in a single AZ to support architectures that require zonal isolation
Auto Scaling & ELB can be used for High Availability and Redundancy by spanning Auto Scaling groups across multiple AZs within a region and then setting up ELB to distribute incoming traffic across those AZs
With Auto Scaling, use ELB health check with the instances to ensure that traffic is routed only to the healthy instances
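The combination of health checks and multi-AZ distribution can be illustrated with a toy router. This is only a model of the behaviour (plain round-robin over instances that pass health checks), not how ELB is implemented; the instance IDs and AZ names are placeholders.

```python
from itertools import cycle

# Placeholder fleet spread across two AZs; one instance is failing checks.
INSTANCES = [
    {"id": "i-a1", "az": "us-east-1a", "healthy": True},
    {"id": "i-a2", "az": "us-east-1a", "healthy": False},
    {"id": "i-b1", "az": "us-east-1b", "healthy": True},
]


def healthy_targets(instances):
    """Keep only the instances that currently pass health checks."""
    return [instance for instance in instances if instance["healthy"]]


def route(instances, request_count):
    """Distribute requests round-robin across healthy instances only."""
    targets = healthy_targets(instances)
    if not targets:
        raise RuntimeError("no healthy instances to route to")
    rotation = cycle(targets)
    return [next(rotation)["id"] for _ in range(request_count)]
```

In this model the unhealthy instance receives no traffic, and Auto Scaling would be expected to replace it so that capacity is restored in that AZ.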