AWS Auto Scaling Policies

Maintain a Steady Count of Instances

  • Auto Scaling ensures a steady minimum (or desired if specified) count of Instances will always be running.
  • If an instance is found unhealthy, Auto Scaling will terminate the Instance and launch a new one.
  • ASG determines the health state of each instance by periodically checking the results of EC2 instance status checks.
  • If the ASG is associated with an Elastic Load Balancer and configured to use the Elastic Load Balancing health check, Auto Scaling determines the health status of the instances by checking the results of both the EC2 instance status checks and the Elastic Load Balancing instance health checks.
  • Auto Scaling marks an instance unhealthy and launches a replacement if
    • the instance is in a state other than running,
    • the system status is impaired, or
    • Elastic Load Balancing reports the instance state as OutOfService.
  • After an instance has been marked unhealthy as a result of an EC2 or ELB health check, it is almost immediately scheduled for replacement. It never automatically recovers its health.
  • For an unhealthy instance, the health status can be set back to healthy manually, but an error occurs if the instance is already terminating (see the sketch after this list).
  • Because the interval between marking an instance unhealthy and its actual termination is so small, attempting to set an instance’s health status back to healthy is probably useful only for a suspended group.
  • When the instance is terminated, any associated Elastic IP addresses are disassociated and are not automatically associated with the new instance.
  • Elastic IP addresses must be associated with the new instance manually.
  • Similarly, when the instance is terminated, its attached EBS volumes are detached and must be attached to the new instance manually.
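
A minimal boto3 sketch of manually setting an instance's health status back to Healthy; this typically only makes sense for a suspended group, and the instance ID is a hypothetical placeholder. The call fails if the instance is already terminating.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Mark the instance healthy again; ShouldRespectGracePeriod=False applies the
    # change immediately instead of waiting for the health check grace period.
    autoscaling.set_instance_health(
        InstanceId="i-0123456789abcdef0",   # hypothetical instance ID
        HealthStatus="Healthy",
        ShouldRespectGracePeriod=False,
    )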

Manual Scaling

  • Manual scaling can be performed by
    • Changing the desired capacity limit of the ASG
    • Attaching/Detaching instances to the ASG
  • Attaching/Detaching an EC2 instance can be done only if
    • Instance is in the running state.
    • AMI used to launch the instance must still exist.
    • Instance is not a member of another ASG.
    • Instance is in the same Availability Zone as the ASG.
    • If the ASG is associated with a load balancer, the instance and the load balancer must both be in the same VPC.
  • Auto Scaling increases the desired capacity of the group by the number of instances being attached. But if the number of instances being attached plus the desired capacity exceeds the maximum size, the request fails.
  • When detaching instances, an option is provided to decrement the desired capacity of the ASG by the number of instances being detached. If you choose not to decrement the capacity, Auto Scaling launches new instances to replace the ones that were detached (a boto3 sketch of attaching and detaching follows this list).
  • If an instance is detached from an ASG that is also registered with a load balancer, the instance is deregistered from the load balancer. If connection draining is enabled for the load balancer, Auto Scaling waits for the in-flight requests to complete.
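
A minimal boto3 sketch of the manual scaling operations above; the group name and instance ID are hypothetical placeholders.

    import boto3

    autoscaling = boto3.client("autoscaling")
    ASG = "my-asg"                          # hypothetical group name
    INSTANCE = "i-0123456789abcdef0"        # hypothetical instance ID

    # Manual scaling by changing the desired capacity
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=ASG,
        DesiredCapacity=4,
        HonorCooldown=False,   # manual scaling ignores the cooldown by default
    )

    # Attach a running instance; the desired capacity is incremented by 1
    autoscaling.attach_instances(
        AutoScalingGroupName=ASG,
        InstanceIds=[INSTANCE],
    )

    # Detach an instance and decrement the desired capacity so that no
    # replacement instance is launched
    autoscaling.detach_instances(
        AutoScalingGroupName=ASG,
        InstanceIds=[INSTANCE],
        ShouldDecrementDesiredCapacity=True,
    )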

Scheduled Scaling

  • Scaling based on a schedule allows you to scale the application in response to predictable load changes, e.g., the last day of the month or the last day of a financial year.
  • Scheduled scaling requires the configuration of scheduled actions, which tell Auto Scaling to perform a scaling action at a specified time in the future, along with the new minimum, maximum, and desired size the group should have (see the sketch after this list).
  • Auto Scaling guarantees the order of execution for scheduled actions within the same group, but not for scheduled actions across groups.
  • Multiple scheduled actions can be specified, but each must have a unique time value; actions scheduled for the same or overlapping times are rejected.
  • Cooldown periods are not supported.
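
A minimal boto3 sketch of scheduled actions for a recurring weekly pattern, assuming a hypothetical group named my-asg; the Recurrence fields are cron expressions in UTC.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Scale up every Thursday at 08:00 UTC
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="my-asg",       # hypothetical group name
        ScheduledActionName="scale-up-thursday",
        Recurrence="0 8 * * 4",              # cron format, UTC
        MinSize=4,
        MaxSize=10,
        DesiredCapacity=6,
    )

    # Scale back down every Friday at 18:00 UTC
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="my-asg",
        ScheduledActionName="scale-down-friday",
        Recurrence="0 18 * * 5",
        MinSize=2,
        MaxSize=10,
        DesiredCapacity=2,
    )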

Dynamic Scaling

  • Allows automatic scaling in response to changing demand, e.g., scale out when the CPU utilization of the instances goes above 70% and scale in when it goes below 30%
  • ASG uses a combination of alarms & policies to determine when the conditions for scaling are met.
    • An alarm is an object that watches over a single metric over a specified time period. When the value of the metric breaches the defined threshold for the specified number of time periods, the alarm performs one or more actions (such as sending messages to Auto Scaling).
    • A policy is a set of instructions that tells Auto Scaling how to respond to alarm messages.
  • The dynamic scaling process works as below (a minimal sketch follows at the end of this section):
    1. CloudWatch monitors the specified metrics for all the instances in the Auto Scaling Group.
    2. Changes are reflected in the metrics as the demand grows or shrinks
    3. When the change in the metrics breaches the threshold of the CloudWatch alarm, the CloudWatch alarm performs an action. Depending on the breach, the action is a message sent to either the scale-in policy or the scale-out policy
    4. After the Auto Scaling policy receives the message, Auto Scaling performs the scaling activity for the ASG.
    5. This process continues until you delete either the scaling policies or the ASG.
  • When a scaling policy is executed, if the capacity calculation produces a number outside of the minimum and maximum size range of the group, EC2 Auto Scaling ensures that the new capacity never goes outside of the minimum and maximum size limits.
  • When the desired capacity reaches the maximum size limit, scaling out stops. If demand drops and capacity decreases, Auto Scaling can scale out again.
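
A minimal boto3 sketch of the alarm-and-policy wiring described above, assuming a hypothetical group named my-asg: a simple scaling policy is created first, and its ARN is set as the alarm action so CloudWatch sends the scale-out message to Auto Scaling when average CPU stays above 70%.

    import boto3

    autoscaling = boto3.client("autoscaling")
    cloudwatch = boto3.client("cloudwatch")

    # Simple scaling policy: add one instance when triggered
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",       # hypothetical group name
        PolicyName="scale-out-on-high-cpu",
        PolicyType="SimpleScaling",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=1,
        Cooldown=300,
    )

    # CloudWatch alarm that sends a message to the policy when CPU > 70%
    cloudwatch.put_metric_alarm(
        AlarmName="my-asg-high-cpu",
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-asg"}],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=70.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[policy["PolicyARN"]],
    )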

Dynamic Scaling Policy Types

Target tracking scaling

  • Increase or decrease the current capacity of the group based on a target value for a specific metric.

Step scaling

  • Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.

Simple scaling

  • Increase or decrease the current capacity of the group based on a single scaling adjustment.
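
A minimal boto3 sketch of a target tracking policy and a step scaling policy, assuming a hypothetical group named my-asg; the step boundaries are illustrative values relative to the associated alarm threshold.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Target tracking: keep the group's average CPU around 50%
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",       # hypothetical group name
        PolicyName="target-50-percent-cpu",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 50.0,
        },
    )

    # Step scaling: add 1 instance for a small breach, 2 for a larger one
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",
        PolicyName="step-scale-out",
        PolicyType="StepScaling",
        AdjustmentType="ChangeInCapacity",
        StepAdjustments=[
            {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 15.0, "ScalingAdjustment": 1},
            {"MetricIntervalLowerBound": 15.0, "ScalingAdjustment": 2},
        ],
    )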

Multiple Policies

  • ASG can have more than one scaling policy attached at any given time.
  • Each ASG would have at least two policies: one to scale the architecture out and another to scale the architecture in.
  • If an ASG has multiple policies, there is always a chance that both policies can instruct the Auto Scaling to Scale Out or Scale In at the same time.
  • When these situations occur, Auto Scaling chooses the policy that has the greatest impact, i.e., the one that results in the largest capacity, for both scale out and scale in. For example, if two policies are triggered at the same time and Policy 1 instructs Auto Scaling to scale out by 1 instance while Policy 2 instructs it to scale out by 2 instances, Auto Scaling uses Policy 2 and scales out by 2 instances, as it has the greater impact.

Predictive Scaling

  • Predictive scaling can be used to increase the number of EC2 instances in the ASG in advance of daily and weekly patterns in traffic flows.
  • Predictive scaling is well suited for situations where you have:
    • Cyclical traffic, such as high use of resources during regular business hours and low use of resources during evenings and weekends
    • Recurring on-and-off workload patterns, such as batch processing, testing, or periodic data analysis
    • Applications that take a long time to initialize, causing a noticeable latency impact on application performance during scale-out events
  • Predictive scaling provides proactive scaling that can help scale faster by launching capacity in advance of forecasted load, compared to using only dynamic scaling, which is reactive in nature.
  • Predictive scaling uses machine learning to predict capacity requirements based on historical data from CloudWatch. The machine learning algorithm consumes the available historical data and calculates the capacity that best fits the historical load pattern, and then continuously learns based on new data to make future forecasts more accurate.
  • Predictive scaling supports forecast only mode so that you can evaluate the forecast before you allow predictive scaling to actively scale capacity.
  • When you are ready to start scaling with predictive scaling, switch the policy from forecast only mode to forecast and scale mode (see the sketch after this list).
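
A minimal boto3 sketch of a predictive scaling policy started in forecast only mode, assuming a hypothetical group named my-asg.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Predictive scaling policy in forecast only mode; switch Mode to
    # "ForecastAndScale" once the forecast looks reasonable
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",       # hypothetical group name
        PolicyName="predictive-cpu-forecast",
        PolicyType="PredictiveScaling",
        PredictiveScalingConfiguration={
            "MetricSpecifications": [
                {
                    "TargetValue": 50.0,
                    "PredefinedMetricPairSpecification": {
                        "PredefinedMetricType": "ASGCPUUtilization"
                    },
                }
            ],
            "Mode": "ForecastOnly",
        },
    )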

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A user has created a web application with Auto Scaling. The user is regularly monitoring the application and he observed that the traffic is highest on Thursday and Friday between 8 AM to 6 PM. What is the best solution to handle scaling in this case?
    1. Add a new instance manually by 8 AM Thursday and terminate the same by 6 PM Friday
    2. Schedule Auto Scaling to scale up by 8 AM Thursday and scale down after 6 PM on Friday
    3. Schedule a policy which may scale up every day at 8 AM and scales down by 6 PM
    4. Configure a batch process to add an instance by 8 AM and remove it by Friday 6 PM
  2. A customer has a website which shows all the deals available across the market. The site experiences a load of 5 large EC2 instances generally. However, a week before Thanksgiving vacation they encounter a load of almost 20 large instances. The load during that period varies over the day based on the office timings. Which of the below mentioned solutions is cost effective as well as help the website achieve better performance?
    1. Keep only 10 instances running and manually launch 10 instances every day during office hours.
    2. Setup to run 10 instances during the pre-vacation period and only scale up during the office time by launching 10 more instances using the AutoScaling schedule.
    3. During the pre-vacation period setup a scenario where the organization has 15 instances running and 5 instances to scale up and down using Auto Scaling based on the network I/O policy.
    4. During the pre-vacation period setup 20 instances to run continuously.
  3. A user has setup Auto Scaling with ELB on the EC2 instances. The user wants to configure that whenever the CPU utilization is below 10%, Auto Scaling should remove one instance. How can the user configure this?
    1. The user can get an email using SNS when the CPU utilization is less than 10%. The user can use the desired capacity of Auto Scaling to remove the instance
    2. Use CloudWatch to monitor the data and Auto Scaling to remove the instances using scheduled actions
    3. Configure CloudWatch to send a notification to Auto Scaling Launch configuration when the CPU utilization is less than 10% and configure the Auto Scaling policy to remove the instance
    4. Configure CloudWatch to send a notification to the Auto Scaling group when the CPU Utilization is less than 10% and configure the Auto Scaling policy to remove the instance

References

Auto_Scaling_Options

AWS Auto Scaling Lifecycle

Auto Scaling Lifecycle

  • Instances launched through the Auto Scaling group have a different lifecycle than that of other EC2 instances
  • Auto Scaling lifecycle starts when the Auto Scaling group launches an instance and puts it into service.
  • Auto Scaling lifecycle ends when the instance is terminated either by the user, or the Auto Scaling group takes it out of service and terminates it
  • AWS charges for the instances as soon as they are launched, including the time they are not yet InService

Auto Scaling Lifecycle Transition

Auto Scaling Lifecycle Hooks

  • Auto Scaling Lifecycle hooks enable performing custom actions by pausing instances as an Auto Scaling group launches or terminates them
  • Each Auto Scaling group can have multiple lifecycle hooks. However, there is a limit on the number of hooks per Auto Scaling group
  • Auto Scaling scale out event flow
    • Instances start in the Pending state
    • If an autoscaling:EC2_INSTANCE_LAUNCHING lifecycle hook is added, the state is moved to Pending:Wait
    • After the lifecycle action is completed, instances enter the Pending:Proceed state
    • When the instances are fully configured, they are attached to the Auto Scaling group and moved to the InService state
  • Auto Scaling scale in event flow
    • Instances are detached from the Auto Scaling group and enter the Terminating state.
    • If an autoscaling:EC2_INSTANCE_TERMINATING lifecycle hook is added, the state is moved to Terminating:Wait
    • After the lifecycle action is completed, the instances enter the Terminating:Proceed state.
    • When the instances are fully terminated, they enter the Terminated state.
  • During the scale out and scale in events, instances are put into a wait state (Pending:Wait or Terminating:Wait) and are paused until either a continue action happens or the timeout period ends.
  • By default, the instance remains in a wait state for one hour, which can be extended by restarting the timeout period by recording a heartbeat.
  • If the task finishes before the timeout period ends, the lifecycle action can be marked completed and it continues the launch or termination process.
  • After the wait period, the Auto Scaling group continues the launch or terminate process (Pending:Proceed or Terminating:Proceed)
  • Custom actions can be implemented using
    • a CloudWatch Events target to invoke a Lambda function when a lifecycle action occurs. The event contains information about the instance that is launching or terminating, and a token that can be used to control the lifecycle action.
    • a notification target (CloudWatch Events, SNS, SQS) for the lifecycle hook, which receives the message from EC2 Auto Scaling. The message contains information about the instance that is launching or terminating, and a token that can be used to control the lifecycle action.
    • a script that runs on the instance as the instance starts. The script can control the lifecycle action using the ID of the instance on which it runs (see the sketch after this list).
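
A minimal boto3 sketch of a launch lifecycle hook and the calls a bootstrap script (or Lambda function) could use to control it; the group name and instance ID are hypothetical placeholders.

    import boto3

    autoscaling = boto3.client("autoscaling")
    ASG = "my-asg"                          # hypothetical group name

    # Pause newly launched instances in Pending:Wait for up to 10 minutes
    autoscaling.put_lifecycle_hook(
        AutoScalingGroupName=ASG,
        LifecycleHookName="wait-for-bootstrap",
        LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
        HeartbeatTimeout=600,
        DefaultResult="ABANDON",   # terminate the instance if nothing completes the hook
    )

    # If more time is needed, restart the timeout period with a heartbeat
    autoscaling.record_lifecycle_action_heartbeat(
        AutoScalingGroupName=ASG,
        LifecycleHookName="wait-for-bootstrap",
        InstanceId="i-0123456789abcdef0",    # hypothetical instance ID
    )

    # Signal completion so the instance moves to Pending:Proceed and then InService
    autoscaling.complete_lifecycle_action(
        AutoScalingGroupName=ASG,
        LifecycleHookName="wait-for-bootstrap",
        InstanceId="i-0123456789abcdef0",
        LifecycleActionResult="CONTINUE",
    )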

Auto Scaling Lifecycle Hooks Considerations

  • Keeping Instances in a Wait State
    • Instances remain in a wait state for a finite period of time.
    • Default is 1 hour (3600 seconds) with the max being 48 hours or 100 times the heartbeat timeout, whichever is smaller.
    • Time can be adjusted using
      • the complete-lifecycle-action (CompleteLifecycleAction) command to continue to the next state if the action finishes before the timeout period ends
      • the put-lifecycle-hook command with the --heartbeat-timeout parameter to set the heartbeat timeout for the lifecycle hook during its creation
      • restarting the timeout period by recording a heartbeat, using the record-lifecycle-action-heartbeat (RecordLifecycleActionHeartbeat) command
  • Cooldowns and Custom Actions
    • Cooldown period helps ensure that the Auto Scaling group does not launch or terminate more instances than needed
    • Cooldown period starts when the instance enters the InService state. Any suspended scaling actions resume after the cooldown period expires
  • Health Check Grace Period
    • Health check grace period does not start until the lifecycle hook completes and the instance enters the InService state
  • Lifecycle Action Result
    • Result of the lifecycle hook is either ABANDON or CONTINUE
    • If the instance is launching,
      • CONTINUE indicates a successful action, and the instance can be put into service.
      • ABANDON indicates the custom actions were unsuccessful, and that the instance can be terminated.
    • If the instance is terminating,
      • ABANDON and CONTINUE allow the instance to terminate.
      • However, ABANDON stops any remaining actions from other lifecycle hooks, while CONTINUE allows them to complete
  • Spot Instances
    • Lifecycle hooks can be used with Spot Instances. However, a lifecycle hook does not prevent an instance from terminating due to a change in the Spot Price, which can happen at any time

Enter and Exit Standby

  • An instance in the InService state can be moved to the Standby state.
  • The Standby state enables you to remove the instance from service, troubleshoot or make changes to it, and then put it back into service.
  • Instances in the Standby state continue to be managed by the Auto Scaling group. However, they are not an active part of the application until they are put back into service (see the sketch after this list).
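
A minimal boto3 sketch of moving an instance into and out of the Standby state; the group name and instance ID are hypothetical placeholders.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Move an InService instance into Standby for troubleshooting; decrementing
    # the desired capacity prevents a replacement from being launched
    autoscaling.enter_standby(
        AutoScalingGroupName="my-asg",           # hypothetical group name
        InstanceIds=["i-0123456789abcdef0"],     # hypothetical instance ID
        ShouldDecrementDesiredCapacity=True,
    )

    # Put the instance back into service when done
    autoscaling.exit_standby(
        AutoScalingGroupName="my-asg",
        InstanceIds=["i-0123456789abcdef0"],
    )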

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Your application is running on EC2 in an Auto Scaling group. Bootstrapping is taking 20 minutes to complete. You find out that instances are shown as InService although the bootstrapping has not completed. How can you make sure that new instances are not added until the bootstrapping has finished. Choose the correct answer:
    1. Create a CloudWatch alarm with an SNS topic to send alarms to your DevOps engineer.
    2. Create a lifecycle hook to keep the instance in pending:wait state until the bootstrapping has finished and then put the instance in pending:proceed state.
    3. Increase the number of instances in your Auto Scaling group.
    4. Create a lifecycle hook to keep the instance in standby state until the bootstrapping has finished and then put the instance in pending:proceed state.
  2. When a scale out event occurs, the Auto Scaling group launches the required number of EC2 instances using its assigned launch configuration. What instance state do these instances start in? Choose the correct answer:
    1. pending:wait
    2. InService
    3. Pending
    4. Terminating
  3. With AWS Auto Scaling, once we apply a hook and the action is complete or the default wait state timeout runs out, the state changes to what, depending on which hook we have applied and what the instance is doing? Select two. Choose the 2 correct answers:
    1. pending:proceed
    2. pending:wait
    3. terminating:wait
    4. terminating:proceed
  4. For AWS Auto Scaling, what is the first transition state an existing instance enters after leaving steady state in Standby mode?
    1. Detaching
    2. Terminating:Wait
    3. Pending (You can put any instance that is in an InService state into a Standby state. This enables you to remove the instance from service, troubleshoot or make changes to it, and then put it back into service. Instances in a Standby state continue to be managed by the Auto Scaling group. However, they are not an active part of your application until you put them back into service. Refer link)
    4. EnteringStandby
  5. For AWS Auto Scaling, what is the first transition state an instance enters after leaving steady state when scaling in due to health check failure or decreased load?
    1. Terminating (When Auto Scaling responds to a scale in event, it terminates one or more instances. These instances are detached from the Auto Scaling group and enter the Terminating state. Refer link)
    2. Detaching
    3. Terminating:Wait
    4. EnteringStandby

References

AutoScalingGroupLifecycle

 

AWS Compute Services Cheat Sheet

AWS Compute Services

Elastic Cloud Compute – EC2

  • provides scalable computing capacity
  • Features
    • Virtual computing environments, known as EC2 instances
    • Preconfigured templates for EC2 instances, known as Amazon Machine Images (AMIs), that package the bits needed for the server (including the operating system and additional software)
    • Various configurations of CPU, memory, storage, and networking capacity for your instances, known as Instance types
    • Secure login information for your instances using key pairs (public-private keys where private is kept by user)
    • Storage volumes for temporary data that’s deleted when you stop or terminate your instance, known as Instance store volumes
    • Persistent storage volumes for data using Elastic Block Store (EBS)
    • Multiple physical locations for your resources, such as instances and EBS volumes, known as Regions and Availability Zones
    • A firewall to specify the protocols, ports, and source IP ranges that can reach your instances using Security Groups
    • Static IP addresses, known as Elastic IP addresses
    • Metadata, known as tags, can be created and assigned to EC2 resources
    • Virtual networks that are logically isolated from the rest of the AWS cloud, and can optionally connect to on-premises network, known as Virtual private clouds (VPCs)

Amazon Machine Image – AMI

  • template from which EC2 instances can be launched quickly
  • does NOT span across regions, and needs to be copied
  • can be shared with other specific AWS accounts or made public

Instance Types

  • T for applications needing general usage
    • T2 instances are Burstable Performance Instances that provide a baseline level of CPU performance with the ability to burst above the baseline.
    • T2 instances accumulate CPU Credits when they are idle, and consume CPU Credits when they are active.
    • T2 Unlimited Instances can sustain high CPU performance for as long as a workload needs it at an additional cost.
  • R for applications needing more RAM or Memory
  • C for applications needing more Compute
  • M for applications needing Medium or Moderate performance on both Memory and CPU
  • I for applications needing more IOPS
  • G for applications needing more GPU

Instance Purchasing Option

  • On-Demand Instances
    • pay for instances and compute capacity that you use by the hour
    • no long-term commitments or up-front payments
  • Reserved Instances
    • provides lower hourly running costs by providing a billing discount
    • capacity reservation is applied to instances
    • suited if consistent, heavy, predictable usage
    • provides benefits with Consolidate Billing
    • can be modified to switch Availability Zones or the instance size within the same instance type, given the instance size footprint (Normalization factor) remains the same
    • pay for the entire term regardless of the usage
    • is not a physical instance that is launched, but rather a billing discount applied to the use of On-Demand Instances
  • Scheduled Reserved Instances
    • enable capacity reservations purchase that recurs on a daily, weekly, or monthly basis, with a specified start time and duration, for a one-year term.
    • Charges are incurred for the time that the instances are scheduled, even if they are not used
    • good choice for workloads that do not run continuously, but do run on a regular schedule
  • Spot Instances
    • cost-effective choice but does NOT guarantee availability
    • applications flexible in the timing when they can run and also able to handle interruption by storing the state externally
    • provides a two-minute warning if the instance is to be terminated to save any unsaved work
    • Spot blocks can also be launched with a required duration, which are not interrupted due to changes in the Spot price
    • Spot Fleet is a collection, or fleet, of Spot Instances, and optionally On-Demand Instances, which attempts to launch the number of Spot and On-Demand Instances to meet the specified target capacity
  • Dedicated Instances
    • is a tenancy option that enables instances to run in VPC on hardware that’s isolated, dedicated to a single customer
  • Dedicated Host
    • is a physical server with EC2 instance capacity fully dedicated to your use
  • Light, Medium, and Heavy Utilization Reserved Instances are no longer available for purchase and were part of the Previous Generation AWS EC2 purchasing model

Enhanced Networking

  • results in higher bandwidth, higher packet per second (PPS) performance, lower latency, lower jitter, better consistency, and scalability
  • supported using Single Root I/O Virtualization (SR-IOV) only on supported instance types
  • is supported only with a VPC (not EC2-Classic) and the HVM virtualization type; it is available by default on Amazon Linux AMIs but can be installed on other AMIs as well

Placement Group

  • Cluster Placement Group
    • provides low latency, High-Performance Computing networking via a 10 Gbps network
    • is a logical grouping on instances within a Single AZ
    • don’t span availability zones, can span multiple subnets but subnets must be in the same AZ
    • can span across peered VPCs for the same Availability Zones
    • an existing instance can be moved into a placement group, moved from one placement group to another, or removed from a placement group, provided it is in the stopped state
    • for capacity errors, stop and start the instances in the placement group
    • use homogenous instance types which support enhanced networking and launch all the instances at once
  • Spread Placement Groups
    • is a group of instances that are each placed on distinct underlying hardware, i.e., each instance on a distinct rack, and can span multiple AZs
    • recommended for applications that have a small number of critical instances that should be kept separate from each other.
    • reduces the risk of simultaneous failures that might occur when instances share the same underlying hardware.
  • Partition Placement Groups
    • is a group of instances spread across partitions i.e. group of instances spread across racks across AZs
    • reduces the likelihood of correlated hardware failures for the application.
    • can be used to spread deployment of large distributed and replicated workloads, such as HDFS, HBase, and Cassandra, across distinct hardware

EC2 Monitoring

  • CloudWatch provides monitoring for EC2 instances
  • Status monitoring helps quickly determine whether EC2 has detected any problems that might prevent instances from running applications.
  • Status monitoring includes
    • System Status checks – indicate issues with the underlying hardware
    • Instance Status checks – indicate issues with the underlying instance.

Elastic Load Balancer

  • Managed load balancing service and scales automatically
  • distributes incoming application traffic across multiple EC2 instances
  • is a distributed system that is fault tolerant and actively monitored; AWS scales it as per the demand
  • are engineered to not be a single point of failure
  • may need to be pre-warmed (by contacting AWS) if the demand is expected to spike, especially during load testing; the AWS documentation no longer mentions this
  • supports routing traffic to instances in multiple AZs in the same region
  • performs Health Checks to route traffic only to the healthy instances
  • support Listeners with HTTP, HTTPS, SSL, TCP protocols
  • has an associated IPv4 and dual stack DNS name
  • can offload the work of encryption and decryption (SSL termination) so that the EC2 instances can focus on their main work
  • supports Cross Zone load balancing to help route traffic evenly across all EC2 instances regardless of the AZs they reside in
  • to help identify the IP address of a client
    • supports Proxy Protocol header for TCP/SSL connections
    • supports X-Forward headers for HTTP/HTTPS connections
  • supports Sticky Sessions (session affinity) to bind a user’s session to a specific application instance
    • it is not fault tolerant; if an instance is lost, the session information is lost
    • requires an HTTP/HTTPS listener and does not work with TCP
    • requires SSL termination on the ELB, as it relies on inspecting the HTTP headers/cookies
  • supports Connection draining to help complete the in-flight requests in case an instance is deregistered
  • For High Availability, it is recommended to attach one subnet per AZ for at least two AZs, even if the instances are in a single subnet.
  • supports Static/Elastic IP (NLB only)
  • supports IPv4 & IPv6 (dual stack); VPC now supports IPv6 as well
  • HTTPS listener does not support Client Side Certificate
  • For SSL termination at backend instances or support for Client Side Certificate use TCP for connections from the client to the ELB, use the SSL protocol for connections from the ELB to the back-end application, and deploy certificates on the back-end instances handling requests
  • a Classic Load Balancer supports a single SSL certificate, so for multiple SSL certificates multiple ELBs need to be created
  • ALB & NLB use Server Name Indication (SNI) to support multiple SSL certificates on a single listener

Application Load Balancer

  • supports HTTP and HTTPS (Secure HTTP) protocols
  • supports HTTP/2, which is enabled natively. Clients that support HTTP/2 can connect over TLS
  • supports WebSockets and Secure WebSockets natively
  • supports Request tracing, by default.
    • request tracing can be used to track HTTP requests from clients to targets or other services.
    • Upon receiving a request from a client, the load balancer adds or updates the X-Amzn-Trace-Id header before sending the request to the target
  • supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port.
  • supports Sticky Sessions (Session Affinity) using load balancer generated cookies, to route requests from the same client to the same target
  • supports SSL termination, to decrypt the request on ALB before sending it to the underlying targets.
  • supports layer 7 specific features like X-Forwarded-For headers to help determine the actual client IP, port and protocol
  • automatically scales its request handling capacity in response to incoming application traffic.
  • supports hybrid load balancing, to route traffic to instances in VPC and an on-premises location
  • provides High Availability, by allowing more than one AZ to be specified
  • integrates with ACM to provision and bind a SSL/TLS certificate to the load balancer thereby making the entire SSL offload process very easy
  • supports multiple certificates for the same domain to a secure listener
  • supports IPv6 addressing, for an Internet facing load balancer
  • supports Cross-zone load balancing, and cannot be disabled.
  • supports Security Groups to control the traffic allowed to and from the load balancer.
  • provides Access Logs, to record all requests sent to the load balancer, and stores the logs in S3 for later analysis in compressed format
  • provides Delete Protection, to prevent the ALB from accidental deletion
  • supports Connection Idle Timeout – ALB maintains two connections for each request one with the Client (front end) and one with the target instance (back end). If no data has been sent or received by the time that the idle timeout period elapses, ALB closes the front-end connection
  • integrates with CloudWatch to provide metrics such as request counts, error counts, error types, and request latency
  • integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configuration based on IP addresses, HTTP headers, and custom URI strings
  • integrates with CloudTrail to receive a history of ALB API calls made on the AWS account
  • back-end server authentication is NOT supported
  • does not provide Static, Elastic IP addresses

Network Load Balancer

  • handles volatile workloads and scale to millions of requests per second, without the need of pre-warming
  • offers extremely low latencies for latency-sensitive applications.
  • provides static IP/Elastic IP addresses for the load balancer
  • allows registering targets by IP address, including targets outside the VPC (on-premises) for the load balancer.
  • supports containerized applications. Using Dynamic port mapping, ECS can select an unused port when scheduling a task and register the task with a target group using this port.
  • monitors the health of its registered targets and routes the traffic only to healthy targets
  • cross-zone load balancing can be enabled only after creating the NLB
  • preserves the client-side source IP, allowing the back-end to see the client IP address. Target groups can be created with the target type as instance ID or IP address. If targets are registered by instance ID, the source IP addresses of the clients are preserved and provided to the applications. If targets are registered by IP address, the source IP addresses are the private IP addresses of the load balancer nodes.
  • supports both network and application target health checks.
  • supports long-lived TCP connections ideal for WebSocket type of applications
  • supports Zonal Isolation, which is designed for application architectures in a single zone and can be enabled in a single AZ to support architectures that require zonal isolation
  • does not support sticky sessions

Auto Scaling

  • ensures correct number of EC2 instances are always running to handle the load by scaling up or down automatically as demand changes
  • cannot span multiple regions.
  • attempts to distribute instances evenly between the AZs that are enabled for the Auto Scaling group
  • performs checks either using EC2 status checks or can use ELB health checks to determine the health of an instance and terminates the instance if unhealthy, to launch a new instance
  • can be scaled using manual scaling, scheduled scaling or demand based scaling
  • cooldown period helps ensure instances are not launched or terminated before the previous scaling activity takes effect to allow the newly launched instances to start handling traffic and reduce load

AWS Auto Scaling & ELB

  • Auto Scaling & ELB can be used for High Availability and Redundancy by spanning Auto Scaling groups across multiple AZs within a region and then setting up ELB to distribute incoming traffic across those AZs
  • With Auto Scaling, use ELB health check with the instances to ensure that traffic is routed only to the healthy instances

Lambda

  • offers Serverless computing that allows applications and services to be built and run without thinking about servers.
  • helps run code without provisioning or managing servers, where you pay only for the compute time when the code is running.
  • is priced on a pay-per-use basis and there are no charges when the code is not running.
  • performs all the operational and administrative activities on your behalf, including capacity provisioning, monitoring fleet health, applying security patches to the underlying compute resources, deploying code, running a web service front end, and monitoring and logging the code.
  • does not provide access to the underlying compute infrastructure.
  • handles scalability and availability as it
    • provides easy scaling and high availability to the code without additional effort on your part.
    • is designed to process events within milliseconds.
    • is designed to run many instances of the functions in parallel.
    • is designed to use replication and redundancy to provide high availability for both the service and the functions it operates.
    • has no maintenance windows or scheduled downtimes for either.
    • has a default safety throttle for the number of concurrent executions per account per region.
    • has a higher latency immediately after a function is created, or updated, or if it has not been used recently.
    • for any function updates, there is a brief window of time, less than a minute, when requests would be served by both versions
  • Security
    • stores code in S3 and encrypts it at rest and performs additional integrity checks while the code is in use.
    • each function runs in its own isolated environment, with its own resources and file system view
    • supports Code Signing using AWS Signer, which offers trust and integrity controls that enable you to verify that only unaltered code from approved developers is deployed in the functions.
  • Functions must complete execution within 900 seconds. The default timeout is 3 seconds, and the timeout can be set to any value between 1 and 900 seconds.
  • AWS Step Functions can help coordinate a series of Lambda functions in a specific order. Multiple functions can be invoked sequentially, passing the output of one to the other, and/or in parallel, while the state is being maintained by Step Functions.
  • AWS X-Ray helps to trace functions, which provides insights such as service overhead, function init time, and function execution time.
  • Lambda Provisioned Concurrency provides greater control over the performance of serverless applications.
  • Lambda@Edge allows you to run code across AWS locations globally without provisioning or managing servers, responding to end-users at the lowest network latency.
  • Lambda Extensions allow integration of Lambda with other third-party tools for monitoring, observability, security, and governance.
  • Compute Savings Plan can help save money for Lambda executions.
  • CodePipeline and CodeDeploy can be used to automate the serverless application release process.
  • RDS Proxy provides a highly available database proxy that manages thousands of concurrent connections to relational databases.
  • Supports Elastic File System (EFS), to provide a shared, external, persistent, scalable volume using a fully managed elastic NFS file system without the need for provisioning or capacity management.
  • Supports Function URLs, a built-in HTTPS endpoint that can be invoked using the browser, curl, and any HTTP client.

AWS Auto Scaling & ELB

Auto Scaling & ELB

  • Auto Scaling & ELB
    • makes it easy to route traffic across a dynamically changing fleet of EC2 instances
    • acts as a single point of contact for all incoming traffic to the instances in an Auto Scaling group.
  • Auto Scaling dynamically adds and removes EC2 instances, while Elastic Load Balancing manages incoming requests by optimally routing traffic so that no one instance is overwhelmed
  • Auto Scaling helps to automatically increase the number of EC2 instances when the user demand goes up, and decrease the number of EC2 instances when demand goes down
  • ELB service helps to distribute the incoming web traffic (called the load) automatically among all the running EC2 instances
  • ELB uses load balancers to monitor traffic and handle requests that come through the Internet.


Attaching/Detaching ELB with Auto Scaling Group

  • Auto Scaling integrates with Elastic Load Balancing and enables attaching one or more load balancers to an existing Auto Scaling group.
  • ELB registers the EC2 instance using its IP address and routes requests to the primary IP address of the primary interface (eth0) of the instance.
  • After the ELB is attached, it automatically registers the instances in the group and distributes incoming traffic across the instances
  • When ELB is detached, it enters the Removing state while deregistering the instances in the group.
  • If connection draining is enabled, ELB waits for in-flight requests to complete before deregistering the instances.
  • Instances remain running after they are deregistered from the ELB
  • Auto Scaling adds instances to the ELB as they are launched, but this can be suspended. Instances launched during the suspension period are not added to the load balancer after resumption and must be registered manually (see the sketch after this list).
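
A minimal boto3 sketch of attaching a target group to an ASG and suspending/resuming the AddToLoadBalancer process; the group name and target group ARN are hypothetical placeholders.

    import boto3

    autoscaling = boto3.client("autoscaling")
    ASG = "my-asg"                           # hypothetical group name

    # Attach an ALB/NLB target group (Classic Load Balancers use
    # attach_load_balancers with LoadBalancerNames instead)
    autoscaling.attach_load_balancer_target_groups(
        AutoScalingGroupName=ASG,
        TargetGroupARNs=[
            "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123",  # hypothetical ARN
        ],
    )

    # Temporarily stop registering newly launched instances with the load balancer
    autoscaling.suspend_processes(
        AutoScalingGroupName=ASG,
        ScalingProcesses=["AddToLoadBalancer"],
    )

    # Resume; instances launched during the suspension still need manual registration
    autoscaling.resume_processes(
        AutoScalingGroupName=ASG,
        ScalingProcesses=["AddToLoadBalancer"],
    )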

High Availability & Redundancy

  • Auto Scaling can span across multiple AZs, within the same region.
  • When one AZ becomes unhealthy or unavailable, Auto Scaling launches new instances in an unaffected AZ.
  • When the unhealthy AZ recovers, Auto Scaling redistributes the traffic across all the healthy AZ.
  • Elastic Load balancer can be set up to distribute incoming requests across EC2 instances in a single AZ or multiple AZs within a region.
  • Using Auto Scaling & ELB by spanning Auto Scaling groups across multiple AZs within a region and then setting up ELB to distribute incoming traffic across those AZs helps take advantage of the safety and reliability of geographic redundancy.
  • Incoming traffic is load balanced equally across all the AZs enabled for ELB.

Health Checks

  • Auto Scaling group determines the health state of each instance by periodically checking the results of EC2 instance status checks.
  • Auto Scaling marks the instance as unhealthy and replaces the instance if the instance fails the EC2 instance status check.
  • ELB also performs health checks on the EC2 instances that are registered with it, e.g., verifying that the application is available by pinging a health check page
  • ELB health check with the instances should be used to ensure that traffic is routed only to the healthy instances.
  • Auto Scaling, by default, does not replace the instance, if the ELB health check fails.
  • After a load balancer is registered with an Auto Scaling group, the group can be configured to use the results of the ELB health check, in addition to the EC2 instance status checks, to determine the health of the EC2 instances in the Auto Scaling group (see the sketch after this list).
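
A minimal boto3 sketch of switching an ASG to use ELB health checks, assuming a hypothetical group named my-asg.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Use ELB health checks in addition to EC2 status checks, with a grace period
    # so instances are not marked unhealthy while they are still bootstrapping
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName="my-asg",       # hypothetical group name
        HealthCheckType="ELB",
        HealthCheckGracePeriod=300,
    )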

Monitoring

  • Elastic Load Balancing sends data about the load balancers and EC2 instances to CloudWatch. CloudWatch collects data about the performance of your resources and presents it as metrics.
  • After registering one or more load balancers with the Auto Scaling group, the Auto Scaling group can be configured to use ELB metrics (such as request latency or request count) to scale the application automatically.

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A company is building a two-tier web application to serve dynamic transaction-based content. The data tier is leveraging an Online Transactional Processing (OLTP) database. What services should you leverage to enable an elastic and scalable web tier?
    1. Elastic Load Balancing, Amazon EC2, and Auto Scaling
    2. Elastic Load Balancing, Amazon RDS with Multi-AZ, and Amazon S3
    3. Amazon RDS with Multi-AZ and Auto Scaling
    4. Amazon EC2, Amazon DynamoDB, and Amazon S3
  2. You have been given a scope to deploy some AWS infrastructure for a large organization. The requirements are that you will have a lot of EC2 instances but may need to add more when the average utilization of your Amazon EC2 fleet is high and conversely remove them when CPU utilization is low. Which AWS services would be best to use to accomplish this?
    1. Amazon CloudFront, Amazon CloudWatch and Elastic Load Balancing
    2. Auto Scaling, Amazon CloudWatch and AWS CloudTrail
    3. Auto Scaling, Amazon CloudWatch and Elastic Load Balancing
    4. Auto Scaling, Amazon CloudWatch and AWS Elastic Beanstalk
  3. A user has configured ELB with Auto Scaling. The user suspended the Auto Scaling AddToLoadBalancer process (which adds instances to the load balancer) for a while. What will happen to the instances launched during the suspension period?
    1. The instances will not be registered with ELB and the user has to manually register when the process is resumed
    2. The instances will be registered with ELB only once the process has resumed
    3. Auto Scaling will not launch the instance during this period due to process suspension
    4. It is not possible to suspend only the AddToLoadBalancer process
  4. You have an Auto Scaling group associated with an Elastic Load Balancer (ELB). You have noticed that instances launched via the Auto Scaling group are being marked unhealthy due to an ELB health check, but these unhealthy instances are not being terminated. What do you need to do to ensure that instances marked unhealthy by the ELB will be terminated and replaced?
    1. Change the thresholds set on the Auto Scaling group health check
    2. Add an Elastic Load Balancing health check to your Auto Scaling group
    3. Increase the value for the Health check interval set on the Elastic Load Balancer
    4. Change the health check set on the Elastic Load Balancer to use TCP rather than HTTP checks
  5. You are responsible for a web application that consists of an Elastic Load Balancing (ELB) load balancer in front of an Auto Scaling group of Amazon Elastic Compute Cloud (EC2) instances. For a recent deployment of a new version of the application, a new Amazon Machine Image (AMI) was created, and the Auto Scaling group was updated with a new launch configuration that refers to this new AMI. During the deployment, you received complaints from users that the website was responding with errors. All instances passed the ELB health checks. What should you do in order to avoid errors for future deployments? (Choose 2 answer) [PROFESSIONAL]
    1. Add an Elastic Load Balancing health check to the Auto Scaling group. Set a short period for the health checks to operate as soon as possible in order to prevent premature registration of the instance to the load balancer.
    2. Enable EC2 instance CloudWatch alerts to change the launch configuration’s AMI to the previous one. Gradually terminate instances that are using the new AMI.
    3. Set the Elastic Load Balancing health check configuration to target a part of the application that fully tests application health and returns an error if the tests fail.
    4. Create a new launch configuration that refers to the new AMI, and associate it with the group. Double the size of the group, wait for the new instances to become healthy, and reduce back to the original size. If new instances do not become healthy, associate the previous launch configuration.
    5. Increase the Elastic Load Balancing Unhealthy Threshold to a higher value to prevent an unhealthy instance from going into service behind the load balancer.
  6. What is the order of most-to-least rapidly-scaling (fastest to scale first)? A) EC2 + ELB + Auto Scaling B) Lambda C) RDS
    1. B, A, C (Lambda is designed to scale instantly. EC2 + ELB + Auto Scaling require single-digit minutes to scale out. RDS will take at least 15 minutes, and will apply OS patches or any other updates when applied.)
    2. C, B, A
    3. C, A, B
    4. A, C, B
  7. A user has hosted an application on EC2 instances. The EC2 instances are configured with ELB and Auto Scaling. The application server session time out is 2 hours. The user wants to configure connection draining to ensure that all in-flight requests are supported by ELB even though the instance is being deregistered. What time out period should the user specify for connection draining?
    1. 5 minutes
    2. 1 hour (max allowed is 3600 secs that is close to 2 hours to keep the in flight requests alive)
    3. 30 minutes
    4. 2 hours

References

AWS Auto Scaling with ELB

AWS Auto Scaling

Auto Scaling Overview

  • Auto Scaling provides the ability to ensure a correct number of EC2 instances are always running to handle the load of the application
  • Auto Scaling helps
    • achieve better fault tolerance, better availability, and cost management.
    • specify scaling policies that can be used to launch and terminate EC2 instances to handle any increase or decrease in demand.
  • Auto Scaling attempts to distribute instances evenly between the AZs that are enabled for the Auto Scaling group.
  • Auto Scaling does this by attempting to launch new instances in the AZ with the fewest instances. If the attempt fails, it attempts to launch the instances in another AZ until it succeeds.

Auto Scaling Components

Auto Scaling Groups – ASG

  • Auto Scaling groups are the core of Auto Scaling and contain a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of automatic scaling and management.
  • ASG requires
    • Launch configuration OR Launch Template
      • determine the EC2 template to use for launching the instance
    • Minimum & Maximum capacity
      • determine the boundaries of the group’s capacity when a scaling policy is applied
      • the number of instances cannot grow above the maximum or shrink below the minimum
    • Desired capacity
      • to determine the number of instances the ASG must maintain at all times. If missing, it equals the minimum size. 
      • Desired capacity is different from minimum capacity.
      • An Auto Scaling group’s desired capacity is the default number of instances that should be running. A group’s minimum capacity is the fewest number of instances the group can have running
    • Availability Zones or Subnets in which the instances will be launched.
    • Metrics & Health Checks
      • metrics to determine when it should launch or terminate instances and health checks to determine if the instance is healthy or not
  • ASG starts by launching a desired capacity of instances and maintains this number by performing periodic health checks.
  • If an instance becomes unhealthy, the ASG terminates and launches a new instance.
  • ASG can also use scaling policies to increase or decrease the number of instances automatically to meet changing demands
  • An ASG can contain EC2 instances in one or more AZs within the same region.
  • ASGs cannot span multiple regions.
  • ASG can launch On-Demand Instances, Spot Instances, or both when configured to use a launch template.
  • To merge separate single-zone ASGs into a single ASG spanning multiple AZs, rezone one of the single-zone groups into a multi-zone group, and then delete the other groups. This process works for groups with or without a load balancer, as long as the new multi-zone group is in one of the same AZs as the original single-zone groups.
  • ASG can be associated with a single launch configuration or template
  • As the Launch Configuration can’t be modified once created, the only way to update the Launch Configuration for an ASG is to create a new one and associate it with the ASG.
  • When the launch configuration for the ASG is changed, any new instances launched, use the new configuration parameters, but the existing instances are not affected.
  • An ASG can be deleted from the CLI only if it has no running instances; otherwise, the minimum and desired capacity must first be set to 0. This is handled automatically when deleting an ASG from the AWS Management Console (a boto3 sketch of creating an ASG follows this list).
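
A minimal boto3 sketch of creating an ASG from a launch template; all names, subnet IDs, and sizes are hypothetical placeholders.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # ASG using a launch template, spread across two AZs via their subnets
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="my-asg",       # hypothetical names/IDs throughout
        LaunchTemplate={"LaunchTemplateName": "my-template", "Version": "$Latest"},
        MinSize=2,
        MaxSize=10,
        DesiredCapacity=2,
        VPCZoneIdentifier="subnet-0aaa,subnet-0bbb",
        HealthCheckType="EC2",
        HealthCheckGracePeriod=300,
    )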

Launch Configuration

  • Launch configuration is an instance configuration template that an ASG uses to launch EC2 instances.
  • Launch configuration is similar to EC2 configuration and involves the selection of the Amazon Machine Image (AMI), block devices, key pair, instance type, security groups, user data, EC2 instance monitoring, instance profile, kernel, ramdisk, the instance tenancy, whether the instance has a public IP address, and is EBS-optimized.
  • Launch configuration can be associated with multiple ASGs
  • Launch configuration can’t be modified after creation and needs to be created new if any modification is required.
  • Basic or detailed monitoring for the instances in the ASG can be enabled when a launch configuration is created.
  • By default, basic monitoring is enabled when you create the launch configuration using the AWS Management Console, and detailed monitoring is enabled when you create the launch configuration using the AWS CLI or an API
  • AWS recommends using Launch Template instead.

Launch Template

  • A Launch Template is similar to a launch configuration, with additional features, and is recommended by AWS.
  • Launch Template allows multiple versions of a template to be defined.
  • With versioning, a subset of the full set of parameters can be defined and then reused to create other templates or template versions, e.g., a default template that defines common configuration parameters can be created, and the remaining parameters can be specified as part of another version of the same template.
  • Launch Template allows the selection of both Spot and On-Demand Instances or multiple instance types.
  • Launch templates support EC2 Dedicated Hosts. Dedicated Hosts are physical servers with EC2 instance capacity that are dedicated to your use.
  • Launch templates provide the following features
    • Support for multiple instance types and purchase options in a single ASG.
    • Launching Spot Instances with the capacity-optimized allocation strategy.
    • Support for launching instances into existing Capacity Reservations through an ASG.
    • Support for unlimited mode for burstable performance instances.
    • Support for Dedicated Hosts.
    • Combining CPU architectures such as Intel, AMD, and ARM (Graviton2)
    • Improved governance through IAM controls and versioning.
    • Automating instance deployment with Instance Refresh.
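
A minimal boto3 sketch of a launch template with a second version that overrides only the AMI; all names and IDs are hypothetical placeholders.

    import boto3

    ec2 = boto3.client("ec2")

    # Base template with the common configuration
    ec2.create_launch_template(
        LaunchTemplateName="my-template",    # hypothetical names/IDs throughout
        LaunchTemplateData={
            "ImageId": "ami-0123456789abcdef0",
            "InstanceType": "t3.micro",
            "SecurityGroupIds": ["sg-0123456789abcdef0"],
        },
    )

    # New version that only overrides the AMI; the ASG can point at "$Latest"
    ec2.create_launch_template_version(
        LaunchTemplateName="my-template",
        SourceVersion="1",
        LaunchTemplateData={"ImageId": "ami-0fedcba9876543210"},
    )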

Auto Scaling Launch Configuration vs Launch Template

Auto Scaling Policies

Refer blog post @ Auto Scaling Policies

Auto Scaling Cooldown Period

  • Auto Scaling Cooldown period is a configurable setting for the ASG that helps to ensure that Auto Scaling doesn’t launch or terminate additional instances before the previous scaling activity takes effect and allows the newly launched instances to start handling traffic and reduce load
  • When the ASG dynamically scales using a simple scaling policy and launches an instance, Auto Scaling suspends scaling activities until the cooldown period (default 300 seconds) completes, before resuming scaling activities
  • Example Use Case
    • You configure a scale out alarm to increase the capacity, if the CPU utilization increases more than 80%
    • A CPU spike occurs and causes the alarm to be triggered, Auto Scaling launches a new instance
    • However, it would take time for the newly launched instance to be configured, instantiated, and started, let’s say 5 mins
    • Without a cooldown period, if another CPU spike occurred, Auto Scaling would launch another instance, and this could continue for those 5 minutes until the previously launched instance is up and running and handling traffic
    • With a cooldown period, Auto Scaling would suspend the activity for the specified time period enabling the newly launched instance to start handling traffic and reduce the load.
    • After the cooldown period, Auto Scaling resumes acting on the alarms
  • When manually scaling the ASG, the default is not to wait for the cooldown period but can be overridden to honour the cooldown period.
  • Note that if an instance becomes unhealthy, Auto Scaling does not wait for the cooldown period to complete before replacing the unhealthy instance.
  • Cooldown periods are automatically applied to dynamic scaling activities for simple scaling policies and are not supported for step scaling policies (see the sketch after this list).
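
A minimal boto3 sketch of setting the group's default cooldown and honouring it during manual scaling, assuming a hypothetical group named my-asg.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Set the group's default cooldown (applies to simple scaling policies)
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName="my-asg",       # hypothetical group name
        DefaultCooldown=300,
    )

    # Manual scaling skips the cooldown unless HonorCooldown is set to True
    autoscaling.set_desired_capacity(
        AutoScalingGroupName="my-asg",
        DesiredCapacity=5,
        HonorCooldown=True,
    )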

Auto Scaling Termination Policy

  • Termination policy helps Auto Scaling decide which instances it should terminate first when Auto Scaling automatically scales in.
  • Auto Scaling specifies a default termination policy and also provides the ability to create a customized one.

Default Termination Policy

Default termination policy helps ensure that the network architecture spans AZs evenly and instances are selected for termination as follows:

  1. Selection of Availability Zone
    • selects the AZ, in multiple AZs environments, with the most instances and at least one instance that is not protected from scale in.
    • selects the AZ with instances that use the oldest launch configuration, if there is more than one AZ with the same number of instances
  2. Selection of an Instance within the Availability Zone
    • terminates the unprotected instance using the oldest launch configuration if one exists.
    • terminates the unprotected instance closest to the next billing hour, if multiple instances use the oldest launch configuration. This helps maximize the use of EC2 instances that have an hourly charge while minimizing the number of hours billed for EC2 usage.
    • terminates instances at random, if more than one unprotected instance is closest to the next billing hour.

Customized Termination Policy

  1. Auto Scaling first assesses the AZs for any imbalance. If an AZ has more instances than the other AZs used by the group, it applies the specified termination policy to the instances in the imbalanced AZ.
  2. If the Availability Zones used by the group are balanced, then Auto Scaling applies the specified termination policy.
  3. The following customized termination policies are supported (a configuration sketch follows this list)
    1. OldestInstance – terminates the oldest instance in the group and can be useful to upgrade to new instance types
    2. NewestInstance – terminates the newest instance in the group and can be useful when testing a new launch configuration
    3. OldestLaunchConfiguration – terminates instances that have the oldest launch configuration
    4. OldestLaunchTemplate – terminates instances that have the oldest launch template
    5. ClosestToNextInstanceHour – terminates instances that are closest to the next billing hour and helps to maximize the use of your instances and manage costs.
    6. Default – terminates as per the default termination policy
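
A sketch of configuring termination policies on an existing group with boto3 (the group name is a placeholder); when multiple policies are specified, they are evaluated in the order listed.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Evaluate OldestLaunchConfiguration first, then ClosestToNextInstanceHour as a tie-breaker
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-asg",   # hypothetical group name
    TerminationPolicies=["OldestLaunchConfiguration", "ClosestToNextInstanceHour"],
)
```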

Instance Refresh

  • Instance refresh can be used to update the instances in the ASG instead of manually replacing instances a few at a time.
  • An instance refresh can be helpful when you have a new AMI or a new user data script.
  • Instance refresh also allows configuring the minimum healthy percentage, instance warmup, and checkpoints.
  • To use an instance refresh (a boto3 sketch follows these steps)
    • Create a new launch template that specifies the new AMI or user data script.
    • Start an instance refresh to begin updating the instances in the group immediately.
    • EC2 Auto Scaling starts performing a rolling replacement of the instances.
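
A sketch of starting an instance refresh with boto3, assuming a hypothetical group and a launch template that already carries the new AMI or user data.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.start_instance_refresh(
    AutoScalingGroupName="web-asg",                 # hypothetical group name
    Strategy="Rolling",                             # rolling replacement of instances
    DesiredConfiguration={
        "LaunchTemplate": {"LaunchTemplateName": "web-template", "Version": "$Latest"}
    },
    Preferences={
        "MinHealthyPercentage": 90,                 # keep at least 90% of capacity in service
        "InstanceWarmup": 300,                      # seconds before a new instance counts as healthy
    },
)
```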

Instance Protection

  • Instance protection controls whether Auto Scaling can terminate a particular instance or not.
  • Instance protection can be enabled on an ASG or an individual instance, at any time (see the sketch at the end of this section).
  • Instances launched within an ASG with Instance protection enabled would inherit the property.
  • Instance protection starts as soon as the instance is InService and if the Instance is detached, it loses its Instance protection
  • If all instances in an ASG are protected from scale in and a scale-in event occurs, the desired capacity is decremented but Auto Scaling can’t terminate any instance.
  • Instance protection does not protect instances in the following cases
    • Manual termination through the EC2 console, the terminate-instances command, or the TerminateInstances API.
    • If it fails health checks and must be replaced
    • Spot instances in an ASG from interruption
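
A sketch of enabling scale-in protection at the group level and on an individual running instance with boto3 (group name and instance ID are placeholders).

```python
import boto3

autoscaling = boto3.client("autoscaling")

# New instances launched by the group inherit scale-in protection
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-asg",                 # hypothetical group name
    NewInstancesProtectedFromScaleIn=True,
)

# Protection can also be toggled on specific running instances
autoscaling.set_instance_protection(
    AutoScalingGroupName="web-asg",
    InstanceIds=["i-0123456789abcdef0"],            # placeholder instance ID
    ProtectedFromScaleIn=True,
)
```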

Standby State

Auto Scaling allows putting InService instances into the Standby state, during which the instance is still part of the ASG but does not serve any requests. This can be used to troubleshoot or update an instance and then return it to service (a boto3 sketch follows the list below).

  • An instance can be put into Standby state and it will continue to remain in the Standby state unless exited.
  • Auto Scaling, by default, decrements the desired capacity for the group and prevents it from launching a new instance. If no decrement is selected, it would launch a new instance
  • When the instance is in the standby state, the instance can be updated or used for troubleshooting.
  • If a load balancer is associated with Auto Scaling, the instance is automatically deregistered when the instance is in Standby state and registered again when the instance exits the Standby state
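
A sketch of moving an instance into Standby for troubleshooting and returning it to service with boto3 (names and IDs are placeholders).

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Remove the instance from service without launching a replacement
autoscaling.enter_standby(
    AutoScalingGroupName="web-asg",                 # hypothetical group name
    InstanceIds=["i-0123456789abcdef0"],            # placeholder instance ID
    ShouldDecrementDesiredCapacity=True,            # False would launch a replacement instance
)

# ... troubleshoot or update the instance ...

# Put the instance back into service (it is re-registered with the load balancer)
autoscaling.exit_standby(
    AutoScalingGroupName="web-asg",
    InstanceIds=["i-0123456789abcdef0"],
)
```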

Suspension

  • Auto Scaling processes can be suspended and then resumed. This can be very useful to investigate a configuration problem or debug an issue with the application, without triggering the Auto Scaling processes (a boto3 sketch follows the process list below).
  • Auto Scaling also performs Administrative Suspension where it would suspend processes for ASGs if the ASG has been trying to launch instances for over 24 hours but has not succeeded in launching any instances.
  • Auto Scaling processes include
    • Launch – Adds a new EC2 instance to the group, increasing its capacity.
    • Terminate – Removes an EC2 instance from the group, decreasing its capacity.
    • HealthCheck – Checks the health of the instances.
    • ReplaceUnhealthy – Terminates instances that are marked as unhealthy and subsequently creates new instances to replace them.
    • AlarmNotification – Accepts notifications from CloudWatch alarms that are associated with the group. If suspended, Auto Scaling does not automatically execute policies that would be triggered by an alarm
    • ScheduledActions – Performs scheduled actions that you create.
    • AddToLoadBalancer – Adds instances to the load balancer when they are launched.
    • InstanceRefresh – Terminates and replaces instances using the instance refresh feature.
    • AZRebalance – Balances the number of EC2 instances in the group across the Availability Zones in the region.
      • If an AZ either is removed from the ASG or becomes unhealthy or unavailable, Auto Scaling launches new instances in an unaffected AZ before terminating the unhealthy or unavailable instances
      • When the unhealthy AZ returns to a healthy state, Auto Scaling automatically redistributes the instances evenly across the Availability Zones for the group.
      • Note that if you suspend AZRebalance and a scale out or scale in event occurs, Auto Scaling still tries to balance the Availability Zones, e.g. during scale out it launches the instance in the Availability Zone with the fewest instances.
      • If you suspend Launch, AZRebalance neither launches new instances nor terminates existing instances. This is because AZRebalance terminates instances only after launching the replacement instances.
      • If you suspend Terminate, the ASG can grow up to 10% larger than its maximum size, because Auto Scaling allows this temporarily during rebalancing activities. If it cannot terminate instances, your ASG could remain above its maximum size until the Terminate process is resumed
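
A sketch of suspending and resuming selected processes with boto3, e.g. to stop alarm-driven scaling and unhealthy-instance replacement while debugging; the group name is a placeholder, and omitting ScalingProcesses would suspend all processes.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Stop CloudWatch-alarm-driven scaling and replacement of unhealthy instances
autoscaling.suspend_processes(
    AutoScalingGroupName="web-asg",                 # hypothetical group name
    ScalingProcesses=["AlarmNotification", "ReplaceUnhealthy"],
)

# Resume the same processes once the investigation is complete
autoscaling.resume_processes(
    AutoScalingGroupName="web-asg",
    ScalingProcesses=["AlarmNotification", "ReplaceUnhealthy"],
)
```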

Auto Scaling Lifecycle

Refer to blog post @ Auto Scaling Lifecycle

Autoscaling & ELB

Refer to blog post @ Autoscaling & ELB

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. A user is trying to setup a scheduled scaling activity using Auto Scaling. The user wants to setup the recurring schedule. Which of the below mentioned parameters is not required in this case?
    1. Maximum size
    2. Auto Scaling group name
    3. End time
    4. Recurrence value
  2. A user has configured Auto Scaling with 3 instances. The user had created a new AMI after updating one of the instances. If the user wants to terminate two specific instances to ensure that Auto Scaling launches instances with the new launch configuration, which command should he run?
    1. as-delete-instance-in-auto-scaling-group <Instance ID> –no-decrement-desired-capacity
    2. as-terminate-instance-in-auto-scaling-group <Instance ID> –update-desired-capacity
    3. as-terminate-instance-in-auto-scaling-group <Instance ID> –decrement-desired-capacity
    4. as-terminate-instance-in-auto-scaling-group <Instance ID> –no-decrement-desired-capacity
  3. A user is planning to scale up an application by 8 AM and scale down by 7 PM daily using Auto Scaling. What should the user do in this case?
    1. Setup the scaling policy to scale up and down based on the CloudWatch alarms
    2. User should increase the desired capacity at 8 AM and decrease it by 7 PM manually
    3. User should setup a batch process which launches the EC2 instance at a specific time
    4. Setup scheduled actions to scale up or down at a specific time
  4. An organization has setup Auto Scaling with ELB. Due to some manual error, one of the instances got rebooted. Thus, it failed the Auto Scaling health check. Auto Scaling has marked it for replacement. How can the system admin ensure that the instance does not get terminated?
    1. Update the Auto Scaling group to ignore the instance reboot event
    2. It is not possible to change the status once it is marked for replacement
    3. Manually add that instance to the Auto Scaling group after reboot to avoid replacement
    4. Change the health of the instance to healthy using the Auto Scaling commands
  5. A user has configured Auto Scaling with the minimum capacity as 2 and the desired capacity as 2. The user is trying to terminate one of the existing instances with the command: as-terminate-instance-in-auto-scaling-group <Instance ID> –decrement-desired-capacity. What will Auto Scaling do in this scenario?
    1. Terminates the instance and does not launch a new instance
    2. Terminates the instance and updates the desired capacity to 1
    3. Terminates the instance and updates the desired capacity & minimum size to 1
    4. Throws an error
  6. An organization has configured Auto Scaling for hosting their application. The system admin wants to understand the Auto Scaling health check process. If the instance is unhealthy, Auto Scaling launches an instance and terminates the unhealthy one. What is the order of execution?
    1. Auto Scaling launches a new instance first and then terminates the unhealthy instance
    2. Auto Scaling performs the launch and terminate processes in a random order
    3. Auto Scaling launches and terminates the instances simultaneously
    4. Auto Scaling terminates the instance first and then launches a new instance
  7. A user has configured ELB with Auto Scaling. The user suspended the Auto Scaling terminate process only for a while. What will happen to the availability zone rebalancing process (AZRebalance) during this period?
    1. Auto Scaling will not launch or terminate any instances
    2. Auto Scaling will allow the instances to grow more than the maximum size
    3. Auto Scaling will keep launching instances till the maximum instance size
    4. It is not possible to suspend the terminate process while keeping the launch active
  8. An organization has configured Auto Scaling with ELB. There is a memory issue in the application which is causing CPU utilization to go above 90%. The higher CPU usage triggers an event for Auto Scaling as per the scaling policy. If the user wants to find the root cause inside the application without triggering a scaling activity, how can he achieve this?
    1. Stop the scaling process until research is completed
    2. It is not possible to find the root cause from that instance without triggering scaling
    3. Delete Auto Scaling until research is completed
    4. Suspend the scaling process until research is completed
  9. A user has configured ELB with Auto Scaling. The user suspended the Auto Scaling Alarm Notification (which notifies Auto Scaling for CloudWatch alarms) process for a while. What will Auto Scaling do during this period?
    1. AWS will not receive the alarms from CloudWatch
    2. AWS will receive the alarms but will not execute the Auto Scaling policy
    3. Auto Scaling will execute the policy but it will not launch the instances until the process is resumed
    4. It is not possible to suspend the AlarmNotification process
  10. An organization has configured two single availability zones. The Auto Scaling groups are configured in separate zones. The user wants to merge the groups such that one group spans across multiple zones. How can the user configure this?
    1. Run the command as-join-auto-scaling-group to join the two groups
    2. Run the command as-update-auto-scaling-group to configure one group to span across zones and delete the other group
    3. Run the command as-copy-auto-scaling-group to join the two groups
    4. Run the command as-merge-auto-scaling-group to merge the groups
  11. An organization has configured Auto Scaling with ELB. One of the instance health check returns the status as Impaired to Auto Scaling. What will Auto Scaling do in this scenario?
    1. Perform a health check until cool down before declaring that the instance has failed
    2. Terminate the instance and launch a new instance
    3. Notify the user using SNS for the failed state
    4. Notify ELB to stop sending traffic to the impaired instance
  12. A user has setup an Auto Scaling group. The group has failed to launch a single instance for more than 24 hours. What will happen to Auto Scaling in this condition
    1. Auto Scaling will keep trying to launch the instance for 72 hours
    2. Auto Scaling will suspend the scaling process
    3. Auto Scaling will start an instance in a separate region
    4. The Auto Scaling group will be terminated automatically
  13. A user is planning to setup infrastructure on AWS for the Christmas sales. The user is planning to use Auto Scaling based on the schedule for proactive scaling. What advise would you give to the user?
    1. It is good to schedule now because if the user forgets later on it will not scale up
    2. The scaling should be setup only one week before Christmas
    3. Wait till end of November before scheduling the activity
    4. It is not advisable to use scheduled based scaling
  14. A user is trying to setup a recurring Auto Scaling process. The user has setup one process to scale up every day at 8 am and scale down at 7 PM. The user is trying to setup another recurring process which scales up on the 1st of every month at 8 AM and scales down the same day at 7 PM. What will Auto Scaling do in this scenario
    1. Auto Scaling will execute both processes but will add just one instance on the 1st
    2. Auto Scaling will add two instances on the 1st of the month
    3. Auto Scaling will schedule both the processes but execute only one process randomly
    4. Auto Scaling will throw an error since there is a conflict in the schedule of two separate Auto Scaling Processes
  15. A sys admin is trying to understand the Auto Scaling activities. Which of the below mentioned processes is not performed by Auto Scaling?
    1. Reboot Instance
    2. Schedule Actions
    3. Replace Unhealthy
    4. Availability Zone Re-Balancing
  16. You have started a new job and are reviewing your company’s infrastructure on AWS. You notice one web application where they have an Elastic Load Balancer in front of web instances in an Auto Scaling Group. When you check the metrics for the ELB in CloudWatch you see four healthy instances in Availability Zone (AZ) A and zero in AZ B. There are zero unhealthy instances. What do you need to fix to balance the instances across AZs?
    1. Set the ELB to only be attached to another AZ
    2. Make sure Auto Scaling is configured to launch in both AZs
    3. Make sure your AMI is available in both AZs
    4. Make sure the maximum size of the Auto Scaling Group is greater than 4
  17. You have been asked to leverage Amazon VPC EC2 and SQS to implement an application that submits and receives millions of messages per second to a message queue. You want to ensure your application has sufficient bandwidth between your EC2 instances and SQS. Which option will provide the most scalable solution for communicating between the application and SQS?
    1. Ensure the application instances are properly configured with an Elastic Load Balancer
    2. Ensure the application instances are launched in private subnets with the EBS-optimized option enabled
    3. Ensure the application instances are launched in public subnets with the associate-public-IP-address=true option enabled
    4. Launch application instances in private subnets with an Auto Scaling group and Auto Scaling triggers configured to watch the SQS queue size
  18. You have decided to change the Instance type for instances running in your application tier that are using Auto Scaling. In which area below would you change the instance type definition?
    1. Auto Scaling launch configuration
    2. Auto Scaling group
    3. Auto Scaling policy
    4. Auto Scaling tags
  19. A user is trying to delete an Auto Scaling group from CLI. Which of the below mentioned steps are to be performed by the user?
    1. Terminate the instances with the ec2-terminate-instance command
    2. Terminate the Auto Scaling instances with the as-terminate-instance command
    3. Set the minimum size and desired capacity to 0
    4. There is no need to change the capacity. Run the as-delete-group command and it will reset all values to 0
  20. A user has created a web application with Auto Scaling. The user is regularly monitoring the application and he observed that the traffic is highest on Thursday and Friday between 8 AM to 6 PM. What is the best solution to handle scaling in this case?
    1. Add a new instance manually by 8 AM Thursday and terminate the same by 6 PM Friday
    2. Schedule Auto Scaling to scale up by 8 AM Thursday and scale down after 6 PM on Friday
    3. Schedule a policy which may scale up every day at 8 AM and scales down by 6 PM
    4. Configure a batch process to add an instance by 8 AM and remove it by Friday 6 PM
  21. A user has configured the Auto Scaling group with the minimum capacity as 3 and the maximum capacity as 5. When the user configures the AS group, how many instances will Auto Scaling launch?
    1. 3
    2. 0
    3. 5
    4. 2
  22. A sys admin is maintaining an application on AWS. The application is installed on EC2 and the user has configured ELB and Auto Scaling. Considering future load increase, the user is planning to launch new servers proactively so that they get registered with ELB. How can the user add these instances with Auto Scaling?
    1. Increase the desired capacity of the Auto Scaling group
    2. Increase the maximum limit of the Auto Scaling group
    3. Launch an instance manually and register it with ELB on the fly
    4. Decrease the minimum limit of the Auto Scaling group
  23. In reviewing the auto scaling events for your application you notice that your application is scaling up and down multiple times in the same hour. What design choice could you make to optimize for the cost while preserving elasticity? Choose 2 answers.
    1. Modify the Amazon CloudWatch alarm period that triggers your auto scaling scale down policy.
    2. Modify the Auto scaling group termination policy to terminate the oldest instance first.
    3. Modify the Auto scaling policy to use scheduled scaling actions.
    4. Modify the Auto scaling group cool down timers.
    5. Modify the Auto scaling group termination policy to terminate newest instance first.
  24. You have a business critical two tier web app currently deployed in two availability zones in a single region, using Elastic Load Balancing and Auto Scaling. The app depends on synchronous replication (very low latency connectivity) at the database layer. The application needs to remain fully available even if one application Availability Zone goes off-line, and Auto scaling cannot launch new instances in the remaining Availability Zones. How can the current architecture be enhanced to ensure this? [PROFESSIONAL]
    1. Deploy in two regions using Weighted Round Robin (WRR), with Auto Scaling minimums set for 100% peak load per region.
    2. Deploy in three AZs, with Auto Scaling minimum set to handle 50% peak load per zone.
    3. Deploy in three AZs, with Auto Scaling minimum set to handle 33% peak load per zone. (Loss of one AZ will handle only 66% if the autoscaling also fails)
    4. Deploy in two regions using Weighted Round Robin (WRR), with Auto Scaling minimums set for 50% peak load per region.
  25. A user has created a launch configuration for Auto Scaling where CloudWatch detailed monitoring is disabled. The user wants to now enable detailed monitoring. How can the user achieve this?
    1. Update the Launch config with CLI to set InstanceMonitoringDisabled = false
    2. The user should change the Auto Scaling group from the AWS console to enable detailed monitoring
    3. Update the Launch config with CLI to set InstanceMonitoring.Enabled = true
    4. Create a new Launch Config with detail monitoring enabled and update the Auto Scaling group
  26. A user has created an Auto Scaling group with default configurations from CLI. The user wants to setup the CloudWatch alarm on the EC2 instances, which are launched by the Auto Scaling group. The user has setup an alarm to monitor the CPU utilization every minute. Which of the below mentioned statements is true?
    1. It will fetch the data at every minute but the four data points [corresponding to 4 minutes] will not have value since the EC2 basic monitoring metrics are collected every five minutes
    2. It will fetch the data at every minute as detailed monitoring on EC2 will be enabled by the default launch configuration of Auto Scaling
    3. The alarm creation will fail since the user has not enabled detailed monitoring on the EC2 instances
    4. The user has to first enable detailed monitoring on the EC2 instances to support alarm monitoring at every minute
  27. A customer has a website which shows all the deals available across the market. The site experiences a load of 5 large EC2 instances generally. However, a week before Thanksgiving vacation they encounter a load of almost 20 large instances. The load during that period varies over the day based on the office timings. Which of the below mentioned solutions is cost effective as well as help the website achieve better performance?
    1. Keep only 10 instances running and manually launch 10 instances every day during office hours.
    2. Setup to run 10 instances during the pre-vacation period and only scale up during the office time by launching 10 more instances using the AutoScaling schedule.
    3. During the pre-vacation period setup a scenario where the organization has 15 instances running and 5 instances to scale up and down using Auto Scaling based on the network I/O policy.
    4. During the pre-vacation period setup 20 instances to run continuously.
  28. When Auto Scaling is launching a new instance based on condition, which of the below mentioned policies will it follow?
    1. Based on the criteria defined with cross zone Load balancing
    2. Launch an instance which has the highest load distribution
    3. Launch an instance in the AZ with the fewest instances
    4. Launch an instance in the AZ which has the highest instances
  29. The user has created multiple AutoScaling groups. The user is trying to create a new AS group but it fails. How can the user know that he has reached the AS group limit specified by AutoScaling in that region?
    1. Run the command: as-describe-account-limits
    2. Run the command: as-describe-group-limits
    3. Run the command: as-max-account-limits
    4. Run the command: as-list-account-limits
  30. A user is trying to save some cost on the AWS services. Which of the below mentioned options will not help him save cost?
    1. Delete the unutilized EBS volumes once the instance is terminated
    2. Delete the Auto Scaling launch configuration after the instances are terminated (Auto Scaling Launch config does not cost anything)
    3. Release the elastic IP if not required once the instance is terminated
    4. Delete the AWS ELB after the instances are terminated
  31. To scale up the AWS resources using manual Auto Scaling, which of the below mentioned parameters should the user change?
    1. Maximum capacity
    2. Desired capacity
    3. Preferred capacity
    4. Current capacity
  32. For AWS Auto Scaling, what is the first transition state an existing instance enters after leaving steady state in Standby mode?
    1. Detaching
    2. Terminating:Wait
    3. Pending (You can put any instance that is in an InService state into a Standby state. This enables you to remove the instance from service, troubleshoot or make changes to it, and then put it back into service. Instances in a Standby state continue to be managed by the Auto Scaling group. However, they are not an active part of your application until you put them back into service. Refer link)
    4. EnteringStandby
  33. For AWS Auto Scaling, what is the first transition state an instance enters after leaving steady state when scaling in due to health check failure or decreased load?
    1. Terminating (When Auto Scaling responds to a scale in event, it terminates one or more instances. These instances are detached from the Auto Scaling group and enter the Terminating state. Refer link)
    2. Detaching
    3. Terminating:Wait
    4. EnteringStandby
  34. A user has setup Auto Scaling with ELB on the EC2 instances. The user wants to configure that whenever the CPU utilization is below 10%, Auto Scaling should remove one instance. How can the user configure this?
    1. The user can get an email using SNS when the CPU utilization is less than 10%. The user can use the desired capacity of Auto Scaling to remove the instance
    2. Use CloudWatch to monitor the data and Auto Scaling to remove the instances using scheduled actions
    3. Configure CloudWatch to send a notification to Auto Scaling Launch configuration when the CPU utilization is less than 10% and configure the Auto Scaling policy to remove the instance
    4. Configure CloudWatch to send a notification to the Auto Scaling group when the CPU Utilization is less than 10% and configure the Auto Scaling policy to remove the instance
  35. A user has enabled detailed CloudWatch metric monitoring on an Auto Scaling group. Which of the below mentioned metrics will help the user identify the total number of instances in an Auto Scaling group including pending, terminating and running instances?
    1. GroupTotalInstances (Refer link)
    2. GroupSumInstances
    3. It is not possible to get a count of all the three metrics together. The user has to find the individual number of running, terminating and pending instances and sum it
    4. GroupInstancesCount
  36. Your startup wants to implement an order fulfillment process for selling a personalized gadget that needs an average of 3-4 days to produce with some orders taking up to 6 months you expect 10 orders per day on your first day. 1000 orders per day after 6 months and 10,000 orders after 12 months. Orders coming in are checked for consistency then dispatched to your manufacturing plant for production quality control packaging shipment and payment processing. If the product does not meet the quality standards at any stage of the process employees may force the process to repeat a step. Customers are notified via email about order status and any critical issues with their orders such as payment failure. Your case architecture includes AWS Elastic Beanstalk for your website with an RDS MySQL instance for customer data and orders. How can you implement the order fulfillment process while making sure that the emails are delivered reliably? [PROFESSIONAL]
    1. Add a business process management application to your Elastic Beanstalk app servers and re-use the RDS database for tracking order status use one of the Elastic Beanstalk instances to send emails to customers.
    2. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1 Use the decider instance to send emails to customers.
    3. Use SWF with an Auto Scaling group of activity workers and a decider instance in another Auto Scaling group with min/max=1 use SES to send emails to customers.
    4. Use an SQS queue to manage all process tasks Use an Auto Scaling group of EC2 Instances that poll the tasks and execute them. Use SES to send emails to customers.

References

AWS_Auto_Scaling_Developer_Guide

 

AWS Autoscaling Troubleshooting

Exam Question Scenario

EC2 instances fail to launch with Autoscaling configuration

Description

  • Autoscaling configuration requires the following:-
  • Autoscaling launch configuration, which allows you to select the
    • AMI
    • Instance type
    • IAM role (optional)
    • Security group
    • Key pair file
  • Autoscaling group configuration, which allows you to select the AZs to be used to launch the EC2 instances with the selected launch configuration (a boto3 sketch of both follows below)
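
A minimal boto3 sketch of the two pieces above, a launch configuration and an ASG that references it; all names, IDs, and AZs are placeholders.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Launch configuration: the AMI, instance type, key pair, and security group must all exist
autoscaling.create_launch_configuration(
    LaunchConfigurationName="web-lc",               # hypothetical name
    ImageId="ami-0123456789abcdef0",                # placeholder AMI
    InstanceType="t3.micro",
    KeyName="my-key-pair",                          # placeholder key pair
    SecurityGroups=["sg-0123456789abcdef0"],        # placeholder security group
)

# ASG: references the launch configuration and the AZs to launch into
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",                 # hypothetical name
    LaunchConfigurationName="web-lc",
    MinSize=1, MaxSize=3, DesiredCapacity=2,
    AvailabilityZones=["us-east-1a", "us-east-1b"],
)
```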

Troubleshooting key points:-

  • AMI id does not exist or is still pending and cannot be used to launch instances
  • Security group provided in the launch configuration does not exist
  • Key pair associated with the EC2 instance does not exist
  • Autoscaling group not found or is incorrectly configured
  • AZ configured for the Autoscaling group is no longer supported because it is not available
  • Invalid EBS block device mappings
  • Instance type is not supported in the AZ
  • Capacity limits reached, either because of the restriction on the number of instances of that type that can be launched in a region or because AWS is not able to provision the specified instance type in the AZ (e.g. no more Spot or On-Demand instance availability); the exact cause is reported in the group’s scaling activity history (see the sketch below)
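
When launches fail for any of the reasons above, the cause is recorded in the scaling activity history; a sketch of inspecting it with boto3 (the group name is a placeholder).

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Each failed launch shows up as an activity with a StatusCode of Failed or Cancelled
activities = autoscaling.describe_scaling_activities(
    AutoScalingGroupName="web-asg"                  # hypothetical group name
)["Activities"]

for activity in activities:
    if activity["StatusCode"] in ("Failed", "Cancelled"):
        # StatusMessage carries the error, e.g. an invalid AMI or a missing security group
        print(activity["StatusCode"], activity.get("StatusMessage", ""), activity["Cause"])
```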

References

More details @ AWS Autoscaling Developer Guide