Table of Contents

AWS Compute Services Cheat Sheet

Elastic Compute Cloud – EC2

Elastic Load Balancer

Auto Scaling

AWS Auto Scaling & ELB

Lambda

AWS Compute – Additional Services

AWS Compute Services Cheat Sheet

AWS Compute Services

Elastic Compute Cloud – EC2

provides scalable computing capacity
Features
- Virtual computing environments, known as EC2 instances
- Preconfigured templates for EC2 instances, known as Amazon Machine Images (AMIs), that package the bits needed for the server (including the operating system and additional software)
- Various configurations of CPU, memory, storage, and networking capacity for your instances, known as Instance types
- Secure login information for your instances using key pairs (public-private keys where private is kept by user)
- Storage volumes for temporary data that’s deleted when you stop or terminate your instance, known as Instance store volumes
- Persistent storage volumes for data using Elastic Block Store (EBS)
- Multiple physical locations for your resources, such as instances and EBS volumes, known as Regions and Availability Zones
- A firewall to specify the protocols, ports, and source IP ranges that can reach your instances using Security Groups
- Static IP addresses, known as Elastic IP addresses
- Metadata, known as tags, can be created and assigned to EC2 resources
- Virtual networks that are logically isolated from the rest of the AWS cloud, and can optionally connect to on-premises network, known as Virtual private clouds (VPCs)

Amazon Machine Image – AMI

- template from which EC2 instances can be launched quickly
- does NOT span across regions, and needs to be copied
- can be shared with other specific AWS accounts or made public

Instance Types

T for applications needing general usage
- T2 instances are Burstable Performance Instances that provide a baseline level of CPU performance with the ability to burst above the baseline.
- T2 instances accumulate CPU Credits when they are idle, and consume CPU Credits when they are active.
- T2 Unlimited Instances can sustain high CPU performance for as long as a workload needs it at an additional cost.
- T4g instances are powered by AWS Graviton2 processors and provide the next generation low cost burstable general purpose instance type.
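
The credit mechanism for burstable instances can be illustrated with a small simulation. This is a minimal sketch, not AWS's accounting logic: it assumes a single-vCPU instance, and the default earn rate and maximum balance are illustrative assumptions that should be checked against the documentation for the instance type in question. The function name is hypothetical.

```python
def simulate_cpu_credits(utilization_per_minute, earn_per_hour=6.0,
                         max_balance=144.0, start_balance=0.0):
    """Track a burstable instance's CPU credit balance minute by minute.

    One CPU credit is one vCPU running at 100% for one minute, so a minute
    at utilization u (0.0 to 1.0) spends u credits on a single-vCPU instance.
    earn_per_hour and max_balance are illustrative assumptions.
    """
    balance = start_balance
    history = []
    for utilization in utilization_per_minute:
        balance += earn_per_hour / 60.0  # credits earned during this minute
        balance -= utilization           # credits spent during this minute
        # The balance can neither go negative nor exceed the cap.
        balance = max(0.0, min(balance, max_balance))
        history.append(balance)
    return history


# One idle hour followed by five minutes at full load.
history = simulate_cpu_credits([0.0] * 60 + [1.0] * 5)
print(round(history[59], 2), round(history[-1], 2))
```

The sketch only clamps the balance at zero; a real instance without credits is limited to its baseline (or, in Unlimited mode, keeps bursting at extra cost).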

R for applications needing more RAM or Memory
- R8g instances powered by Graviton4 processors deliver up to 30% better performance over Graviton3-based instances for memory-intensive workloads.
C for applications needing more Compute
- C8g instances powered by Graviton4 and C8i instances powered by Intel Xeon 6 processors represent the latest generation (2024-2025).
M for applications needing balanced (moderate) performance across both Memory and CPU
- M8g instances powered by Graviton4 and M8i instances powered by Intel Xeon 6 processors are the latest generation (2024-2025).

I for applications needing more IOPS
G for applications needing more GPU
P for applications needing GPU-accelerated computing for ML/AI
- P5 and P5e instances for high-performance ML training and inference workloads.
Graviton-based instances (suffix “g”, e.g., C8g, M8g, R8g) are powered by AWS-designed Arm processors and provide the best price performance for most workloads.

Instance Purchasing Option

On-Demand Instances
- pay for instances and compute capacity that you use by the hour or second
- no long-term commitments or up-front payments
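
As a rough illustration of per-second On-Demand billing, the sketch below assumes a 60-second minimum charge per run (which applies to per-second-billed operating systems; others are billed by the hour). The hourly rate is a made-up value and the function names are hypothetical.

```python
import math


def billed_seconds(runtime_seconds, minimum_seconds=60):
    """Seconds charged for one run, assuming a 60-second minimum."""
    if runtime_seconds <= 0:
        return 0
    return max(math.ceil(runtime_seconds), minimum_seconds)


def on_demand_cost(runtime_seconds, hourly_rate):
    """Cost of one run at a given hourly rate (hypothetical rate)."""
    return billed_seconds(runtime_seconds) * hourly_rate / 3600.0


# A 10-second run is charged as 60 seconds.
print(billed_seconds(10))
print(on_demand_cost(1800, hourly_rate=0.10))
```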
Reserved Instances
- provides lower hourly running costs by providing a billing discount (up to 72%)
- capacity reservation is applied to instances
- suited if consistent, heavy, predictable usage
- provides benefits with Consolidated Billing
- can be modified to switch Availability Zones or the instance size within the same instance family, given the instance size footprint (Normalization factor) remains the same
- pay for the entire term regardless of the usage
- is not a physical instance that is launched, but rather a billing discount applied to the use of On-Demand Instances
- available in Standard and Convertible options
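
The footprint rule for modifying Reserved Instances can be checked with simple arithmetic. The sketch below assumes the commonly published normalization factors (small = 1, doubling with each size step); verify them against the current AWS table before relying on them. The helper names are hypothetical.

```python
# Normalization factors per instance size (assumed from AWS's published table).
NORMALIZATION_FACTOR = {
    "nano": 0.25, "micro": 0.5, "small": 1, "medium": 2, "large": 4,
    "xlarge": 8, "2xlarge": 16, "4xlarge": 32, "8xlarge": 64,
}


def footprint(reservations):
    """Total footprint of a list of (size, count) pairs."""
    return sum(NORMALIZATION_FACTOR[size] * count for size, count in reservations)


def is_valid_modification(current, target):
    """A modification is allowed only if the total footprint is unchanged."""
    return footprint(current) == footprint(target)


# One xlarge (8 units) can be split into two large (2 x 4 units).
print(is_valid_modification([("xlarge", 1)], [("large", 2)]))
# One 2xlarge (16 units) cannot become three large (12 units).
print(is_valid_modification([("2xlarge", 1)], [("large", 3)]))
```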

Savings Plans
- flexible pricing model offering savings up to 72% on compute usage in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a 1 or 3 year term
- Compute Savings Plans apply to EC2, Fargate, and Lambda usage regardless of instance family, size, AZ, region, OS, or tenancy
- EC2 Instance Savings Plans apply to a specific instance family within a region
- recommended over Reserved Instances for new workloads due to greater flexibility
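
The $/hour commitment can be made concrete with a simplified calculation. This sketch assumes a single flat discount rate across all usage (real plans have per-instance rates) and that usage beyond what the commitment covers is billed at On-Demand rates. All dollar values and the function name are hypothetical.

```python
def hourly_bill(on_demand_usage, commitment, discount):
    """Total charge for one hour under a Savings Plan (simplified).

    on_demand_usage: what the hour's usage would cost at On-Demand rates
    commitment:      committed spend in $/hour, always charged
    discount:        fractional plan discount, e.g. 0.30 for 30%
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # On-Demand-equivalent usage that the commitment pays for.
    covered_on_demand = commitment / (1.0 - discount)
    if on_demand_usage <= covered_on_demand:
        return commitment  # the commitment is paid even if underused
    # Usage above the covered amount falls back to On-Demand pricing.
    return commitment + (on_demand_usage - covered_on_demand)


print(hourly_bill(on_demand_usage=20.0, commitment=7.0, discount=0.30))
```

The underuse branch shows why a commitment should track the steady baseline rather than the peak.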
~~Scheduled Reserved Instances~~
- ⚠️ Scheduled Reserved Instances are no longer available for purchase. AWS does not have capacity available and has no plans to make it available in the future. Use On-Demand Capacity Reservations instead.
On-Demand Capacity Reservations
- reserve compute capacity for instances in a specific AZ for any duration
- can be created to start immediately or scheduled for a future date
- ensures access to EC2 capacity when needed, independent of billing discounts
- can be combined with Savings Plans or Reserved Instances for cost savings

Capacity Blocks for ML
- reserve GPU instances (P4d, P5, P5e, Trn1) for ML workloads up to 8 weeks in advance
- durations of up to 6 months in cluster sizes of 1 to 64 instances
- supports instant start times and extensions
- instances are placed in EC2 UltraClusters for low-latency networking

Spot Instances
- cost-effective choice (up to 90% discount) but does NOT guarantee availability
- suited for applications that are flexible about when they run and can handle interruptions by storing state externally
- provides a two-minute warning before the instance is terminated, so that any unsaved work can be saved
- ~~Spot blocks can also be launched with a required duration, which are not interrupted due to changes in the Spot price~~ Spot Blocks (Defined Duration) are no longer available for new customers.
- Spot Fleet is a collection, or fleet, of Spot Instances, and optionally On-Demand Instances, which attempts to launch the number of Spot and On-Demand Instances to meet the specified target capacity
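
An application can react to the two-minute warning by reading the interruption notice, which EC2 exposes through instance metadata (at `spot/instance-action`) as a small JSON document with `action` and `time` fields. The sketch below only parses such a document and does not perform the HTTP request; the function name and the sample payload are illustrative assumptions.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(instance_action_json, now=None):
    """Return (action, seconds remaining) from a Spot interruption notice."""
    notice = json.loads(instance_action_json)
    action_time = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return notice["action"], (action_time - now).total_seconds()


# Sample payload (made up) evaluated against a fixed clock.
sample = '{"action": "terminate", "time": "2024-01-01T00:02:00Z"}'
fixed_now = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
print(seconds_until_interruption(sample, now=fixed_now))
```

In practice a worker would poll the metadata endpoint every few seconds and checkpoint its state once a notice appears.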

Dedicated Instances
- is a tenancy option that enables instances to run in VPC on hardware that’s isolated, dedicated to a single customer
Dedicated Host
- is a physical server with EC2 instance capacity fully dedicated to your use
- helps meet compliance requirements and reduce costs by allowing use of existing server-bound software licenses
Light, Medium, and Heavy Utilization Reserved Instances are no longer available for purchase and were part of the Previous Generation AWS EC2 purchasing model

Enhanced Networking

results in higher bandwidth, higher packets per second (PPS) performance, lower latency, consistency, scalability, and lower jitter
supported using Single Root I/O Virtualization (SR-IOV) only on supported instance types
is supported only with a VPC and HVM virtualization type

available by default on Amazon Linux AMIs but can be installed on other AMIs as well
no additional charge for using enhanced networking
Note: EC2-Classic has been fully retired (August 2023). All instances now run in VPC only.

Placement Group

Cluster Placement Group
- provide low latency, High-Performance Computing via 10Gbps network
- is a logical grouping of instances within a Single AZ
- don’t span availability zones, can span multiple subnets but subnets must be in the same AZ
- can span across peered VPCs for the same Availability Zones
- An existing instance can be moved to a placement group, or moved from one placement group to another, or removed from a placement group, given it is in the stopped state.
- for capacity errors, stop and start the instances in the placement group
- use homogeneous instance types which support enhanced networking and launch all the instances at once

Spread Placement Groups
- is a group of instances that are each placed on distinct underlying hardware i.e. each instance on a distinct rack across AZ
- recommended for applications that have a small number of critical instances that should be kept separate from each other.
- reduces the risk of simultaneous failures that might occur when instances share the same underlying hardware.
Partition Placement Groups
- is a group of instances spread across partitions i.e. group of instances spread across racks across AZs
- reduces the likelihood of correlated hardware failures for the application.
- can be used to spread deployment of large distributed and replicated workloads, such as HDFS, HBase, and Cassandra, across distinct hardware

EC2 Monitoring

CloudWatch provides monitoring for EC2 instances
Status monitoring helps quickly determine whether EC2 has detected any problems that might prevent instances from running applications.

Status monitoring includes
- System Status checks – indicate issues with the underlying hardware
- Instance Status checks – indicate issues with the underlying instance.

Elastic Load Balancer

Managed load balancing service and scales automatically
distributes incoming application traffic across multiple EC2 instances
is a distributed system that is fault tolerant and actively monitored; AWS scales it as per the demand
is engineered to not be a single point of failure
supports Load Balancer Capacity Unit (LCU) Reservation to proactively set a minimum capacity for ALB and NLB, complementing auto-scaling for planned traffic events (launched Nov 2024)

supports routing traffic to instances in multiple AZs in the same region
performs Health Checks to route traffic only to the healthy instances
supports Listeners with HTTP, HTTPS, SSL, TCP protocols
has an associated IPv4 and dual stack DNS name
can offload the work of encryption and decryption (SSL termination) so that the EC2 instances can focus on their main work

supports Cross Zone load balancing to help route traffic evenly across all EC2 instances regardless of the AZs they reside in
to help identify the IP address of a client
- supports Proxy Protocol header for TCP/SSL connections
- supports X-Forwarded headers for HTTP/HTTPS connections
supports Sticky Sessions (session affinity) to bind a user’s session to a specific application instance
- it is not fault tolerant, if an instance is lost the information is lost
- requires HTTP/HTTPS listener and does not work with TCP
- requires SSL termination on ELB as it uses the headers
supports Connection draining to help complete the in-flight requests in case an instance is deregistered
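
On the application side, the client address carried in the X-Forwarded-For header mentioned above can be extracted as follows. This is a minimal sketch with a hypothetical function name. It returns the left-most entry, which is conventionally the original client, but a client can supply that entry itself, so it should not be used for security decisions without considering which hops are trusted.

```python
def client_ip_from_xff(header_value):
    """Return the left-most address in an X-Forwarded-For header, or None."""
    if not header_value:
        return None
    # The header is a comma-separated list; each proxy appends the
    # address it received the request from.
    first = header_value.split(",")[0].strip()
    return first or None


# Example with documentation-range addresses.
print(client_ip_from_xff("203.0.113.7, 10.0.0.12"))
```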

For High Availability, it is recommended to attach one subnet per AZ for at least two AZs, even if the instances are in a single subnet.
supports Static/Elastic IP (NLB only)
IPv4 & IPv6 support. VPC supports IPv6.
HTTPS listener does not support Client Side Certificate
For SSL termination at backend instances or support for Client Side Certificate use TCP for connections from the client to the ELB, use the SSL protocol for connections from the ELB to the back-end application, and deploy certificates on the back-end instances handling requests
Uses Server Name Indication (SNI) to support multiple SSL certificates
Supports four types: Application Load Balancer (ALB), Network Load Balancer (NLB), Gateway Load Balancer (GWLB), and Classic Load Balancer (CLB – previous generation)

Application Load Balancer

supports HTTP and HTTPS (Secure HTTP) protocols
supports HTTP/2, which is enabled natively. Clients that support HTTP/2 can connect over TLS
supports WebSockets and Secure WebSockets natively
supports Request tracing, by default.
- request tracing can be used to track HTTP requests from clients to targets or other services.
- Load balancer upon receiving a request from a client, adds or updates the X-Amzn-Trace-Id header before sending the request to the target
supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port.
supports Sticky Sessions (Session Affinity) using load balancer generated cookies, to route requests from the same client to the same target
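
A target can read the X-Amzn-Trace-Id header described above to correlate its own logs with the load balancer's. The header is a semicolon-separated list of `key=value` fields such as `Root` and `Self`. The parser below is a minimal sketch with a hypothetical function name, and the sample value is made up for illustration.

```python
def parse_trace_header(value):
    """Split an X-Amzn-Trace-Id header into a dict of its fields."""
    fields = {}
    for part in value.split(";"):
        key, separator, field_value = part.strip().partition("=")
        if separator:  # skip fragments without "="
            fields[key] = field_value
    return fields


sample = "Self=1-67891234-12456789abcdef012345678;Root=1-67891233-abcdef012345678912345678"
print(parse_trace_header(sample)["Root"])
```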

supports SSL termination, to decrypt the request on ALB before sending it to the underlying targets.
supports layer 7 specific features like X-Forwarded-For headers to help determine the actual client IP, port and protocol
automatically scales its request handling capacity in response to incoming application traffic.

supports hybrid load balancing, to route traffic to instances in VPC and an on-premises location
provides High Availability, by allowing more than one AZ to be specified
integrates with ACM to provision and bind an SSL/TLS certificate to the load balancer, thereby making the entire SSL offload process very easy

supports multiple certificates for the same domain to a secure listener
supports IPv6 addressing, for an Internet facing load balancer
supports dual-stack without public IPv4, enabling clients to connect using only IPv6 addresses without needing public IPv4 addresses (launched May 2024)
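supports Cross-zone load balancing, which is always enabled at the load balancer level and cannot be disabled there (it can be turned off at the target group level)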
supports Security Groups to control the traffic allowed to and from the load balancer.

provides Access Logs, to record all requests sent to the load balancer, and store the logs in S3 in compressed format for later analysis
provides Delete Protection, to prevent the ALB from accidental deletion
supports Connection Idle Timeout – ALB maintains two connections for each request: one with the client (front end) and one with the target instance (back end). If no data has been sent or received by the time the idle timeout period elapses, ALB closes the front-end connection

integrates with CloudWatch to provide metrics such as request counts, error counts, error types, and request latency
integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configuration based on IP addresses, HTTP headers, and custom URI strings
integrates with CloudTrail to receive a history of ALB API calls made on the AWS account

back-end server authentication is NOT supported
does not provide Static, Elastic IP addresses

Network Load Balancer

handles volatile workloads and scales to millions of requests per second, without the need for pre-warming
offers extremely low latencies for latency-sensitive applications.
provides static IP/Elastic IP addresses for the load balancer
allows registering targets by IP address, including targets outside the VPC (on-premises) for the load balancer.
supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port.

monitors the health of its registered targets and routes the traffic only to healthy targets
cross-zone load balancing can be enabled only after creating the NLB
preserves the client-side source IP, allowing the back-end to see the client IP address. Target groups can be created with target type as instance ID or IP address. If targets are registered by instance ID, the source IP addresses of the clients are preserved and provided to the applications. If targets are registered by IP address, the source IP addresses are the private IP addresses of the load balancer nodes.
supports both network and application target health checks.
supports long-lived TCP connections ideal for WebSocket type of applications

supports Zonal Isolation, which is designed for application architectures in a single zone and can be enabled in a single AZ to support architectures that require zonal isolation
supports sticky sessions using source IP affinity at the target group level to route traffic from the same client to the same target
supports removing Availability Zones after creation, enabling subnet reconfiguration without recreating the NLB (launched Feb 2025)

supports weighted target groups for blue/green and canary deployments without multiple load balancers (launched 2025)
supports QUIC protocol in passthrough mode, enabling low-latency forwarding of QUIC traffic while preserving session stickiness through QUIC Connection ID (launched Nov 2025)
supports UDP over IPv6 for dualstack load balancers (launched Nov 2024)

Gateway Load Balancer

enables deployment, scaling, and management of third-party virtual appliances such as firewalls, intrusion detection/prevention systems, and deep packet inspection systems
provides one gateway for distributing traffic across multiple virtual appliances while scaling them up or down based on demand
operates at Layer 3 (Network layer) and listens for all IP packets across all ports

uses the GENEVE protocol on port 6081 to encapsulate traffic
supports flow stickiness using 2-tuple, 3-tuple, or 5-tuple hash
configurable TCP idle timeout from 60 to 6000 seconds
decreases potential points of failure in network and increases availability
use cases include centralized network security inspection, traffic mirroring, and compliance monitoring
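
The idea behind flow stickiness is that every packet of a flow hashes to the same appliance. The sketch below illustrates this with a generic hash. The actual hash used by Gateway Load Balancer is not given in the text, so the SHA-256 choice, the function name, and the tuple layout are assumptions made for illustration only.

```python
import hashlib


def pick_target(flow, targets, tuple_size=5):
    """Map a flow to a target using a 2-, 3-, or 5-tuple hash.

    flow is (src_ip, dst_ip, protocol, src_port, dst_port).
    """
    if tuple_size == 5:
        key = flow       # addresses, protocol, and both ports
    elif tuple_size == 3:
        key = flow[:3]   # source IP, destination IP, protocol
    elif tuple_size == 2:
        key = flow[:2]   # source IP, destination IP
    else:
        raise ValueError("tuple_size must be 2, 3, or 5")
    digest = hashlib.sha256("|".join(map(str, key)).encode()).digest()
    return targets[int.from_bytes(digest[:8], "big") % len(targets)]


appliances = ["fw-a", "fw-b", "fw-c"]  # hypothetical appliance names
flow = ("10.0.1.5", "192.0.2.10", "tcp", 44321, 443)
print(pick_target(flow, appliances) == pick_target(flow, appliances))
```

With a 2-tuple hash, all connections between the same pair of addresses land on one appliance regardless of ports, which matters for appliances that need to see related flows together.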

Auto Scaling

ensures correct number of EC2 instances are always running to handle the load by scaling up or down automatically as demand changes
cannot span multiple regions.
attempts to distribute instances evenly between the AZs that are enabled for the Auto Scaling group
performs health checks using either EC2 status checks or ELB health checks to determine the health of an instance, and terminates an unhealthy instance and launches a new one to replace it

can be scaled using manual scaling, scheduled scaling, dynamic scaling (target tracking, step, simple) or predictive scaling
Predictive Scaling uses machine learning to predict future traffic based on historical patterns and proactively launches instances ahead of demand, ideal for applications with recurring traffic spikes
Target Tracking scaling now features highly responsive scaling policies that adapt to unique application usage patterns and support high-resolution CloudWatch metrics (enhanced Nov 2024)

cooldown period helps ensure instances are not launched or terminated before the previous scaling activity takes effect to allow the newly launched instances to start handling traffic and reduce load
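
Target tracking and the cooldown period can be pictured with a proportional rule: scale capacity by the ratio of the observed metric to its target, then hold off on further changes until the cooldown has elapsed. This is a simplified sketch for intuition, not the algorithm AWS runs; the function names are hypothetical and the 300-second cooldown is an assumed default.

```python
import math


def desired_capacity(current_capacity, metric_value, target_value,
                     min_size, max_size):
    """Capacity that would bring the metric back to its target (simplified)."""
    proposed = math.ceil(current_capacity * metric_value / target_value)
    # Respect the group's configured bounds.
    return max(min_size, min(max_size, proposed))


def in_cooldown(last_activity_ts, now_ts, cooldown_seconds=300):
    """True while a previous scaling activity is still taking effect."""
    return (now_ts - last_activity_ts) < cooldown_seconds


# 4 instances at 80% average CPU with a 50% target.
print(desired_capacity(4, 80.0, 50.0, min_size=1, max_size=10))
print(in_cooldown(last_activity_ts=0, now_ts=100))
```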

AWS Auto Scaling & ELB

Auto Scaling & ELB can be used for High Availability and Redundancy by spanning Auto Scaling groups across multiple AZs within a region and then setting up ELB to distribute incoming traffic across those AZs
With Auto Scaling, use ELB health check with the instances to ensure that traffic is routed only to the healthy instances

Lambda

offers Serverless computing that allows applications and services to be built and run without thinking about servers.
helps run code without provisioning or managing servers, where you pay only for the compute time when the code is running.
is priced on a pay-per-use basis and there are no charges when the code is not running.

performs all the operational and administrative activities on your behalf, including capacity provisioning, monitoring fleet health, applying security patches to the underlying compute resources, deploying code, running a web service front end, and monitoring and logging the code.
does not provide access to the underlying compute infrastructure.
handles scalability and availability as it
- provides easy scaling and high availability to the code without additional effort on your part.
- is designed to process events within milliseconds.
- is designed to run many instances of the functions in parallel.
- is designed to use replication and redundancy to provide high availability for both the service and the functions it operates.
- has no maintenance windows or scheduled downtimes for either.
- has a default safety throttle for the number of concurrent executions per account per region.
- has a higher latency immediately after a function is created, or updated, or if it has not been used recently.
- for any function updates, there is a brief window of time, less than a minute, when requests would be served by both versions
Security
- stores code in S3 and encrypts it at rest and performs additional integrity checks while the code is in use.
- each function runs in its own isolated environment, with its own resources and file system view
- supports Code Signing using AWS Signer, which offers trust and integrity controls that enable you to verify that only unaltered code from approved developers is deployed in the functions.
Functions must complete execution within 900 seconds (15 minutes). The default timeout is 3 seconds. The timeout can be set to any value between 1 and 900 seconds.
Supports up to 10,240 MB (10 GB) of memory per function.
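
Because of the execution time limit, a function that works through a batch may want to stop before the timeout rather than be cut off mid-item. The sketch below uses the `get_remaining_time_in_millis()` method of the Python context object; the event shape, the 500 ms safety margin, and the placeholder work are assumptions made for illustration.

```python
def handler(event, context):
    """Process items until the function is close to its configured timeout."""
    safety_margin_ms = 500  # hypothetical buffer before the timeout
    items = event.get("items", [])
    processed = []
    for item in items:
        # Stop early so the partial result can still be returned.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break
        processed.append(item * 2)  # placeholder for real work
    return {"processed": processed, "remaining": len(items) - len(processed)}


class FakeContext:
    """Stand-in for the Lambda context object, for local runs only."""

    def __init__(self, remaining_values):
        self._values = list(remaining_values)

    def get_remaining_time_in_millis(self):
        return self._values.pop(0)


print(handler({"items": [1, 2, 3]}, FakeContext([3000, 1000, 100])))
```

Unprocessed items would typically be handed to a queue or a follow-up invocation.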
AWS Step Functions can help coordinate a series of Lambda functions in a specific order. Multiple functions can be invoked sequentially, passing the output of one to the other, and/or in parallel, while the state is being maintained by Step Functions.
AWS X-Ray helps to trace functions, which provides insights such as service overhead, function init time, and function execution time.
Lambda Provisioned Concurrency provides greater control over the performance of serverless applications.
Lambda@Edge allows you to run code across AWS locations globally without provisioning or managing servers, responding to end-users at the lowest network latency.
Lambda Extensions allow integration of Lambda with other third-party tools for monitoring, observability, security, and governance.
Compute Savings Plan can help save money for Lambda executions.
CodePipeline and CodeDeploy can be used to automate the serverless application release process.
RDS Proxy provides a highly available database proxy that manages thousands of concurrent connections to relational databases.
Supports Elastic File System (EFS), to provide a shared, external, persistent, scalable volume using a fully managed elastic NFS file system without the need for provisioning or capacity management.
Supports Function URLs, a built-in HTTPS endpoint that can be invoked using the browser, curl, and any HTTP client.
Lambda SnapStart reduces cold start latency from several seconds to sub-second for Java, Python, and .NET functions by taking a snapshot of the initialized execution environment (GA for Python & .NET in Nov 2024).
Lambda Durable Functions enable building resilient multi-step applications and AI workflows that can execute for up to one year, automatically checkpoint progress, suspend execution during long-running tasks, and recover from failures without custom state management code (launched Dec 2025).
Lambda Managed Instances enables running Lambda functions on EC2 instances (including Graviton4, GPU, network-optimized) while maintaining Lambda’s operational simplicity, with access to EC2 commitment-based pricing (Savings Plans, Reserved Instances) for up to 72% cost savings (launched Nov 2025).
Supports runtimes including Node.js 24, Python 3.12+, Java 25, .NET 8, and more.

AWS Compute – Additional Services

Amazon ECS (Elastic Container Service) – fully managed container orchestration service
- Supports Fargate (serverless) and EC2 launch types
- ECS Express Mode (launched Nov 2025) – streamlines deployment of containerized workloads by automatically setting up load balancing, auto scaling, networking, and monitoring with simplified APIs
- ECS Managed Instances (launched Sep 2025) – fully managed compute option for broader EC2 instance access without infrastructure overhead
- Supports predictive scaling (launched Nov 2024)
Amazon EKS (Elastic Kubernetes Service) – managed Kubernetes service for running containers at scale
AWS Fargate – serverless compute engine for containers that works with both ECS and EKS, removing the need to manage underlying infrastructure
AWS Batch – fully managed batch computing service for running batch jobs at any scale
~~AWS App Runner~~ – ⚠️ No longer accepting new customers as of April 30, 2026. Existing services continue to operate. AWS recommends migrating to Amazon ECS Express Mode. No new features planned.


22 thoughts on “AWS Compute Services Cheat Sheet – EC2, Lambda, ECS”

Sanjay Naikwadi says:

January 11, 2018 at 2:07 am

“VPC does not support IPv6” – VPC support IPV6
1. jayendrapatil says:
  
  January 11, 2018 at 11:30 pm
  
  Thanks Sanjay, updated the same.
Julian says:

June 2, 2018 at 6:11 pm

Thank you for the blog, it’s really helpful!!

I think there’s been an update on placement groups:

There are now Spread Placement Groups which can span availability zones. (As well as Cluster Placement Groups which are single AZ)
1. jayendrapatil says:
  
  June 4, 2018 at 7:20 pm
  
  Thanks Julian, the update was long pending. Have updated the post now.
Rajeev Kumar Sinha says:

August 16, 2018 at 8:20 pm

Nicely written blog, so much helpful info gathered at one place, Kudos for this hard work.
Vihan Agarwal says:

August 22, 2018 at 11:32 pm

“However, if you have attached one or more load balancers or target groups to the Auto Scaling group and a load balancer reports that an instance is unhealthy, it does not consider the instance unhealthy and therefore it does not replace it.”

I would add this to the AutoScaling Section.

Reference: https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-add-elb-healthcheck.html
1. jayendrapatil says:
  
  November 26, 2018 at 12:39 pm
  
  Thanks Vihan, will check and update the same.
khizer says:

August 31, 2018 at 1:48 am

Please update the ec2 pricing:

Per Second Billing
With per-second billing, you pay for only what you use. It takes cost of unused minutes and seconds in an hour off of the bill, so you can focus on improving your applications instead of maximizing usage to the hour. Especially, if you manage instances running for irregular periods of time, such as dev/testing, data processing, analytics, batch processing and gaming applications, can benefit.

EC2 usage are billed on one second increments, with a minimum of 60 seconds. Similarly, provisioned storage for EBS volumes will be billed per-second increments, with a 60 second minimum. Per-second billing is available for instances launched in:

On-Demand, Reserved and Spot forms
All regions and Availability Zones
Amazon Linux and Ubuntu
1. jayendrapatil says:
  
  October 1, 2018 at 6:42 pm
  
  Thanks khizer, the pricing model needs update. Will cover that soon.
tapun says:

October 4, 2018 at 8:40 pm

Thank you so much for your great work!

“existing instances cannot be moved into an existing placement group”

It seems it is now possible now.

You can move an existing instance to a placement group, move an instance from one placement group to another, or remove an instance from a placement group. Before you begin, the instance must be in the stopped state.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html#change-instance-placement-group
1. jayendrapatil says:
  
  October 15, 2018 at 6:53 pm
  
  Thanks Tapun, yup as per the latest enchancement it is now possible.
Jacques says:

February 18, 2019 at 2:15 pm

“supports a single SSL certificate, so for multiple SSL certificate multiple ELBs need to be created” — This is no longer true.

https://aws.amazon.com/cn/blogs/aws/new-application-load-balancer-sni/

Thanks for your post.
Rasesh Patel says:

October 6, 2019 at 10:51 am

Hello Jayendra,

I see you have listed Cheat sheet point for ELB(Load Balancer ) but you title it them with EBS(Block Storage). I just found little bit confusing to start with, so just wanted to let you know.

Thanks.
-Rasesh
1. jayendrapatil says:
  
  October 7, 2019 at 6:55 pm
  
  Thanks Rasesh, corrected the same.
Chintan P Mangukiya says:

November 9, 2019 at 6:23 pm

HI Jayendra,

I find these bits confusing “supports a single SSL certificate, so for multiple SSL certificates multiple ELBs need to be created”.

However, I think it now allows multiple certificates along with a default certificate and does SNI based search to find a suitable certificate. Is it the right context?

https://docs.aws.amazon.com/en_pv/elasticloadbalancing/latest/application/create-https-listener.html#sni-certificate-list

Thanks,
Chintan
1. jayendrapatil says:
  
  June 9, 2020 at 7:41 pm
  
  It was for CLB – Classic Load balancer. ALB and NLB supports SNI and multiple SSL certificates.
Bobby says:

December 2, 2019 at 12:22 pm

Hi Jayendra

Apart from the pricing class like on demand, reserved and spot Instances I come across some other pricing models like Scheduled reserved instances, on demand capacity instances. It is right?

I hope Pricing models are updated with the latest .
1. jayendrapatil says:
  
  June 9, 2020 at 7:40 pm
  
  Hi Bobby, they are updated.
mgtroyas says:

December 15, 2019 at 9:07 pm

Fantastic content. Just a suggestion:

[ELB] supports a single SSL certificate, so for multiple SSL certificate multiple ELBs need to be created

Now SNI certificates are supported.
1. jayendrapatil says:
  
  June 9, 2020 at 7:40 pm
  
  Thats right, sni helps multiple certs and its supported by ALB and NLB.
Laxman says:

June 27, 2020 at 3:50 pm

Hi Jayendra,

–> supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port. <–

Is it NLB or ALB. can you pls check once.

Thanks,
Laxman.
1. jayendrapatil says:
  
  July 3, 2020 at 1:02 pm
  
  Thanks Laxman, it ALB. Corrected the same.