AWS Automated Backups – Certification

AWS Automated Backups

  • AWS allows automated backups for
    • RDS
    • ElastiCache – Redis only
    • Redshift
  • AWS does not perform automated backups of EC2 EBS volumes; these need to be scripted manually
  • AWS stores the backups and snapshots in S3

RDS Backups

  • RDS supports automated backups as well as manual snapshots
  • Automated Backups
    • enable point-in-time recovery of the DB Instance
    • perform a full daily backup and capture transaction logs (as updates to the DB instance are made)
    • are performed during the defined preferred backup window and are retained for a user-specified period of time called the retention period (default 1 day, with a max of 35 days)
    • When a point-in-time recovery is initiated, transaction logs are applied to the most appropriate daily backup in order to restore the DB instance to the specific requested time.
    • allow a point-in-time restore with the ability to specify any second during the retention period, up to the Latest Restorable Time
    • are deleted when the DB instance is deleted
  • Snapshots
    • are user-initiated and enable you to back up the DB instance in a known state as frequently as needed, and then restore it to that specific state at any time.
    • can be created with the AWS Management Console or by using the CreateDBSnapshot API call (see the sketch after this list).
    • are not deleted when the DB instance is deleted
  • Automated backups and snapshots can result in a performance hit if Multi-AZ is not enabled
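
As an illustration of the snapshot and point-in-time restore behavior above, a minimal boto3 sketch (instance and snapshot identifiers are placeholders):

    import boto3

    rds = boto3.client('rds')

    # Manual snapshot (survives deletion of the DB instance)
    rds.create_db_snapshot(
        DBInstanceIdentifier='mydb',             # placeholder instance name
        DBSnapshotIdentifier='mydb-known-state'
    )

    # Point-in-time restore from automated backups, up to the
    # Latest Restorable Time, into a new DB instance
    rds.restore_db_instance_to_point_in_time(
        SourceDBInstanceIdentifier='mydb',
        TargetDBInstanceIdentifier='mydb-restored',
        UseLatestRestorableTime=True
    )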

ElastiCache Automated Backups

  • ElastiCache supports automated backups for Redis clusters only
  • ElastiCache creates a backup of the cluster on a daily basis
  • Snapshots can degrade performance, so they should be performed during the least busy part of the day
  • Backups are performed during the defined backup period and are retained for the backup retention limit specified, with a maximum of 35 days
  • ElastiCache also allows manual snapshots of the cluster
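
A minimal boto3 sketch of a manual Redis snapshot and of adjusting the backup window and retention limit (the cluster id and window values are placeholders):

    import boto3

    ec = boto3.client('elasticache')

    # Manual snapshot of a Redis cluster (not supported for Memcached)
    ec.create_snapshot(
        CacheClusterId='my-redis',          # placeholder cluster id
        SnapshotName='my-redis-manual'
    )

    # Configure the daily backup window and retention (max 35 days)
    ec.modify_cache_cluster(
        CacheClusterId='my-redis',
        SnapshotWindow='03:00-04:00',       # least busy part of the day
        SnapshotRetentionLimit=7,
        ApplyImmediately=True
    )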

Redshift Automated Backups

  • Amazon Redshift enables automated backups by default
  • Redshift replicates all the data within your data warehouse cluster when it is loaded and also continuously backs up the data to S3
  • Redshift retains backups for 1 day which can be extended to max 35 days
  • Snapshots are incremental, and Redshift only backs up data that has changed, so most snapshots use up only a small amount of storage
  • Redshift also allows manual snapshots of the data warehouse
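
A minimal boto3 sketch of extending the automated snapshot retention and taking a manual snapshot (cluster and snapshot identifiers are placeholders):

    import boto3

    rs = boto3.client('redshift')

    # Extend the automated snapshot retention from the 1-day default (max 35)
    rs.modify_cluster(
        ClusterIdentifier='my-warehouse',    # placeholder cluster name
        AutomatedSnapshotRetentionPeriod=35
    )

    # Manual snapshot of the data warehouse
    rs.create_cluster_snapshot(
        SnapshotIdentifier='my-warehouse-manual',
        ClusterIdentifier='my-warehouse'
    )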

EC2 EBS Backups

  • EBS does not provide automated backups
  • EBS snapshots can be created by using the AWS Management Console, the command line interface (CLI), or the APIs (see the sketch after this list)
  • Taking snapshots can degrade performance
  • Snapshots are stored on S3
  • EBS snapshots are incremental and block-based, and they consume space only for changed data after the initial snapshot is created
  • Data can be restored from snapshots by creating a volume from the snapshot
  • EBS snapshots are region specific and can be copied between AWS regions
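
Since EBS backups have to be scripted, a minimal boto3 sketch of snapshot creation, restore, and cross-region copy (the volume id, regions, and AZ are placeholders):

    import boto3

    ec2 = boto3.client('ec2', region_name='us-east-1')

    # Incremental, block-based snapshot of a volume (stored in S3)
    snap = ec2.create_snapshot(
        VolumeId='vol-0123456789abcdef0',    # placeholder volume id
        Description='nightly backup'
    )

    # Restore by creating a new volume from the snapshot
    ec2.create_volume(
        SnapshotId=snap['SnapshotId'],
        AvailabilityZone='us-east-1a'
    )

    # Snapshots are region specific, but can be copied across regions;
    # the copy is issued from the destination region
    ec2_west = boto3.client('ec2', region_name='us-west-2')
    ec2_west.copy_snapshot(
        SourceRegion='us-east-1',
        SourceSnapshotId=snap['SnapshotId'],
        Description='cross-region copy'
    )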

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which two AWS services provide out-of-the-box user configurable automatic backup-as-a-service and backup rotation options? Choose 2 answers
    1. Amazon S3
    2. Amazon RDS
    3. Amazon EBS
    4. Amazon Redshift
  2. You have been asked to automate many routine systems administrator backup and recovery activities. Your current plan is to leverage AWS-managed solutions as much as possible and automate the rest with the AWS CLI and scripts. Which task would be best accomplished with a script?
    1. Creating daily EBS snapshots with a monthly rotation of snapshots
    2. Creating daily RDS snapshots with a monthly rotation of snapshots
    3. Automatically detect and stop unused or underutilized EC2 instances
    4. Automatically add Auto Scaled EC2 instances to an Amazon Elastic Load Balancer

AWS Billing and Cost Management – Certification

AWS Billing and Cost Management

  • AWS Billing and Cost Management is the service used to pay your AWS bill, monitor your usage, and budget your costs

Analyzing Costs with Graphs

  • AWS provides the Cost Explorer tool, which allows filtering graphs by API operations, Availability Zones, AWS service, custom cost allocation tags, EC2 instance type, purchase options, region, usage type, usage type groups, or, if Consolidated Billing is used, by linked account.

Budgets

  • Budgets can be used to track AWS costs to see usage-to-date and current estimated charges from AWS
  • Budgets use the cost visualization provided by Cost Explorer to show the status of the budgets and to provide forecasts of your estimated costs.
  • Budgets can be used to create CloudWatch alarms that notify when you go over your budgeted amounts, or when the estimated costs exceed budgets
  • Notifications can be sent to an SNS topic and to email addresses associated with your budget notification
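
A hedged boto3 sketch of creating a cost budget with an email notification (the account id, amount, and address are placeholders):

    import boto3

    budgets = boto3.client('budgets')

    budgets.create_budget(
        AccountId='123456789012',            # placeholder account id
        Budget={
            'BudgetName': 'monthly-cost-budget',
            'BudgetLimit': {'Amount': '1000', 'Unit': 'USD'},
            'TimeUnit': 'MONTHLY',
            'BudgetType': 'COST'
        },
        NotificationsWithSubscribers=[{
            # Notify when actual costs exceed 80% of the budgeted amount
            'Notification': {
                'NotificationType': 'ACTUAL',
                'ComparisonOperator': 'GREATER_THAN',
                'Threshold': 80.0,
                'ThresholdType': 'PERCENTAGE'
            },
            'Subscribers': [{'SubscriptionType': 'EMAIL',
                             'Address': 'finance@example.com'}]
        }]
    )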

Cost Allocation Tags

  • Tags can be used to organize AWS resources, and cost allocation tags to track the AWS costs on a detailed level.
  • Upon cost allocation tags activation, AWS uses the cost allocation tags to organize the resource costs on the cost allocation report making it easier to categorize and track your AWS costs.
  • AWS provides two types of cost allocation tags:
    • AWS-generated tags – AWS defines, creates, and applies these for you
    • user-defined tags – you define, create, and apply these yourself
  • Both types of tags must be activated separately before they can appear in Cost Explorer or on a cost allocation report

Alerts on Cost Limits

  • CloudWatch can be used to create billing alerts when the AWS costs exceed specified thresholds
  • When the usage exceeds threshold amounts, AWS sends an email notification
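
A minimal boto3 sketch of such a billing alarm on the EstimatedCharges metric; note that billing metric data is only published to us-east-1 (the threshold and SNS topic ARN are placeholders):

    import boto3

    # Billing metrics are only published to the us-east-1 region
    cw = boto3.client('cloudwatch', region_name='us-east-1')

    cw.put_metric_alarm(
        AlarmName='billing-over-1000-usd',
        Namespace='AWS/Billing',
        MetricName='EstimatedCharges',
        Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
        Statistic='Maximum',
        Period=21600,                        # 6 hours
        EvaluationPeriods=1,
        Threshold=1000.0,
        ComparisonOperator='GreaterThanOrEqualToThreshold',
        AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts']
    )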

Consolidated Billing

Refer to My Blog Post about Consolidated Billing

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. An organization has been using AWS for a few months. The finance team wants to visualize the pattern of AWS spending. Which of the below AWS tools will help with this requirement?
    1. AWS Cost Manager
    2. AWS Cost Explorer (Check Cost Explorer)
    3. AWS CloudWatch
    4. AWS Consolidated Billing (Will not help visualize)
  2. Your company wants to understand where cost is coming from in the company’s production AWS account. There are a number of applications and services running at any given time. Without expending too much initial development time, how best can you give the business a good understanding of which applications cost the most per month to operate?
    1. Create an automation script, which periodically creates AWS Support tickets requesting detailed intra-month information about your bill.
    2. Use custom CloudWatch Metrics in your system, and put a metric data point whenever cost is incurred.
    3. Use AWS Cost Allocation Tagging for all resources, which support it. Use the Cost Explorer to analyze costs throughout the month. (Refer link)
    4. Use the AWS Price API and constantly running resource inventory scripts to calculate total price based on multiplication of consumed resources over time.
  3. You need to know when you spend $1000 or more on AWS. What’s the easy way for you to see that notification?
    1. AWS CloudWatch Events tied to API calls, when certain thresholds are exceeded, publish to SNS.
    2. Scrape the billing page periodically and pump into Kinesis.
    3. AWS CloudWatch Metrics + Billing Alarm + Lambda event subscription. When a threshold is exceeded, email the manager.
    4. Scrape the billing page periodically and publish to SNS.
  4. A user is planning to use AWS services for his web application. If the user is trying to set up his own billing management system for AWS, how can he configure it?
    1. Set up programmatic billing access. Download and parse the bill as per the requirement
    2. It is not possible for the user to create his own billing management service with AWS
    3. Enable the AWS CloudWatch alarm which will provide APIs to download the alarm data
    4. Use AWS billing APIs to download the usage report of each service from the AWS billing console
  5. An organization is setting up programmatic billing access for their AWS account. Which of the below mentioned services is not required or enabled when the organization wants to use programmatic access?
    1. Programmatic access
    2. AWS bucket to hold the billing report
    3. AWS billing alerts
    4. Monthly Billing report
  6. A user has setup a billing alarm using CloudWatch for $200. The usage of AWS exceeded $200 after some days. The user wants to increase the limit from $200 to $400? What should the user do?
    1. Create a new alarm of $400 and link it with the first alarm
    2. It is not possible to modify the alarm once it has crossed the usage limit
    3. Update the alarm to set the limit at $400 instead of $200 (Refer link)
    4. Create a new alarm for the additional $200 amount
  7. A user is trying to configure the CloudWatch billing alarm. Which of the below mentioned steps should be performed by the user for the first time alarm creation in the AWS Account Management section?
    1. Enable Receiving Billing Reports
    2. Enable Receiving Billing Alerts
    3. Enable AWS billing utility
    4. Enable CloudWatch Billing Threshold

References

AWS Billing & Cost Management – User Guide

AWS RDS Monitoring & Notification – Certification

AWS RDS Monitoring & Notification

  • RDS integrates with CloudWatch and provides metrics for monitoring
  • CloudWatch alarms can be created over a single metric that sends an SNS message when the alarm changes state
  • RDS also provides SNS notification whenever any RDS event occurs

CloudWatch RDS Monitoring

  • RDS DB instance can be monitored using CloudWatch, which collects and processes raw data from RDS into readable, near real-time metrics.
  • The statistics are recorded for a period of two weeks, so that you can access historical information and gain a better perspective on how the service is performing.
  • By default, RDS metric data is automatically sent to Amazon CloudWatch in 1-minute periods
  • CloudWatch RDS Metrics
    • BinLogDiskUsage – Amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.
    • CPUUtilization – Percentage of CPU utilization.
    • DatabaseConnections – Number of database connections in use.
    • DiskQueueDepth – The number of outstanding IOs (read/write requests) waiting to access the disk.
    • FreeableMemory – Amount of available random access memory.
    • FreeStorageSpace – Amount of available storage space.
    • ReplicaLag – Amount of time a Read Replica DB instance lags behind the source DB instance. Applies to MySQL, MariaDB, and PostgreSQL Read Replicas.
    • SwapUsage – Amount of swap space used on the DB instance.
    • ReadIOPS – Average number of disk read I/O operations per second.
    • WriteIOPS – Average number of disk write I/O operations per second.
    • ReadLatency – Average amount of time taken per disk read I/O operation.
    • WriteLatency – Average amount of time taken per disk write I/O operation.
    • ReadThroughput – Average number of bytes read from disk per second.
    • WriteThroughput – Average number of bytes written to disk per second.
    • NetworkReceiveThroughput – Incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
    • NetworkTransmitThroughput – Outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
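
A minimal boto3 sketch of reading one of these metrics, e.g. ReplicaLag, via the CloudWatch API (the instance identifier is a placeholder):

    import boto3
    from datetime import datetime, timedelta

    cw = boto3.client('cloudwatch')

    stats = cw.get_metric_statistics(
        Namespace='AWS/RDS',
        MetricName='ReplicaLag',
        Dimensions=[{'Name': 'DBInstanceIdentifier', 'Value': 'mydb-replica'}],
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=60,                 # RDS publishes metrics in 1-minute periods
        Statistics=['Average']
    )
    for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
        print(point['Timestamp'], point['Average'])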

RDS Event Notification

  • RDS uses SNS to provide notification when an RDS event occurs
  • RDS groups the events into categories, which can be subscribed to so that a notification is sent when an event in that category occurs.
  • Event categories for a DB instance, DB cluster, DB snapshot, DB cluster snapshot, DB security group, or DB parameter group can be subscribed to
  • Event notifications are sent to the email addresses provided during subscription creation
  • Notifications can easily be turned off without deleting the subscription by setting the Enabled radio button to No in the RDS console, or by setting the Enabled parameter to false using the CLI or RDS API (see the sketch below).
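
A minimal boto3 sketch of creating and then disabling such a subscription (the names, topic ARN, and event categories are placeholders):

    import boto3

    rds = boto3.client('rds')

    # Subscribe to failover and failure events for one DB instance
    rds.create_event_subscription(
        SubscriptionName='mydb-events',      # placeholder name
        SnsTopicArn='arn:aws:sns:us-east-1:123456789012:rds-events',
        SourceType='db-instance',
        SourceIds=['mydb'],
        EventCategories=['failover', 'failure'],
        Enabled=True
    )

    # Turn notifications off without deleting the subscription
    rds.modify_event_subscription(
        SubscriptionName='mydb-events',
        Enabled=False
    )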

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You run a web application with the following components Elastic Load Balancer (ELB), 3 Web/Application servers, 1 MySQL RDS database with read replicas, and Amazon Simple Storage Service (Amazon S3) for static content. Average response time for users is increasing slowly. What three CloudWatch RDS metrics will allow you to identify if the database is the bottleneck? Choose 3 answers
    1. The number of outstanding IOs waiting to access the disk
    2. The amount of write latency
    3. The amount of disk space occupied by binary logs on the master.
    4. The amount of time a Read Replica DB Instance lags behind the source DB Instance
    5. The average number of disk I/O operations per second.
  2. Typically, you want your application to check whether a request generated an error before you spend any time processing results. The easiest way to find out if an error occurred is to look for an __________ node in the response from the Amazon RDS API.
    1. Incorrect
    2. Error
    3. FALSE
  3. In the Amazon CloudWatch, which metric should I be checking to ensure that your DB Instance has enough free storage space?
    1. FreeStorage
    2. FreeStorageSpace
    3. FreeStorageVolume
    4. FreeDBStorageSpace
  4. A user is receiving a notification from the RDS DB whenever there is a change in the DB security group. The user does not want to receive these notifications for a month, but does not want to delete the notification. How can the user configure this?
    1. Change the Disable button for notification to “Yes” in the RDS console
    2. Set the send mail flag to false in the DB event notification console
    3. The only option is to delete the notification from the console
    4. Change the Enable button for notification to “No” in the RDS console
  5. A sys admin is planning to subscribe to the RDS event notifications. For which of the below mentioned source categories the subscription cannot be configured?
    1. DB security group
    2. DB snapshot
    3. DB options group
    4. DB parameter group
  6. A user is planning to setup notifications on the RDS DB for a snapshot. Which of the below mentioned event categories is not supported by RDS for this snapshot source type?
    1. Backup (Refer link)
    2. Creation
    3. Deletion
    4. Restoration
  7. A system admin is planning to setup event notifications on RDS. Which of the below mentioned services will help the admin setup notifications?
    1. AWS SES
    2. AWS Cloudtrail
    3. AWS CloudWatch
    4. AWS SNS
  8. A user has setup an RDS DB with Oracle. The user wants to get notifications when someone modifies the security group of that DB. How can the user configure that?
    1. It is not possible to get the notifications on a change in the security group
    2. Configure SNS to monitor security group changes
    3. Configure event notification on the DB security group
    4. Configure the CloudWatch alarm on the DB for a change in the security group
  9. It is advised that you watch the Amazon CloudWatch “_____” metric (available via the AWS Management Console or Amazon Cloud Watch APIs) carefully and recreate the Read Replica should it fall behind due to replication errors.
    1. Write Lag
    2. Read Replica
    3. Replica Lag
    4. Single Replica

AWS RDS Security – Certification

AWS RDS Security

  • AWS provides multiple features to provide RDS security
    • DB instance can be hosted in a VPC for the greatest possible network access control
    • IAM policies can be used to assign permissions that determine who is allowed to manage RDS resources
    • Security groups allow you to control what IP addresses or EC2 instances can connect to the databases on a DB instance
    • Secure Socket Layer (SSL) connections with DB instances
    • RDS encryption to secure RDS instances and snapshots at rest.
    • Network encryption and transparent data encryption (TDE) with Oracle DB instances

RDS Authentication and Access Control

  • IAM can be used to control which RDS operations each individual user has permission to call

Encrypting RDS Resources

  • RDS encrypted instances use the industry standard AES-256 encryption algorithm to encrypt data on the server that hosts the RDS instance
  • RDS then handles authentication of access and decryption of this data with a minimal impact on performance, and with no need to modify your database client applications
  • Data at Rest Encryption
    • can be enabled on RDS instances to encrypt the underlying storage
    • encryption keys are managed by KMS
    • can be enabled only during instance creation
    • once enabled, the encryption keys cannot be changed
    • if the key is lost, the DB can only be restored from the backup
  • Once encryption is enabled for an RDS instance,
    • logs are encrypted
    • snapshots are encrypted
    • automated backups are encrypted
    • read replicas are encrypted
  • Cross-region replicas and snapshot copies do not work, since the key is only available in a single region
  • RDS DB Snapshot considerations
    • A DB snapshot encrypted using a KMS encryption key can be copied
    • Copying an encrypted DB snapshot, results in an encrypted copy of the DB snapshot
    • When copying, DB snapshot can either be encrypted with the same KMS encryption key as the original DB snapshot, or a different KMS encryption key to encrypt the copy of the DB snapshot.
    • An unencrypted DB snapshot can be copied to an encrypted snapshot, a quick way to add encryption to a previously unencrypted DB instance.
    • Encrypted snapshot can be restored only to an encrypted DB instance
    • If a KMS encryption key is specified when restoring from an unencrypted DB cluster snapshot, the restored DB cluster is encrypted using the specified KMS encryption key
    • Copying an encrypted snapshot shared from another AWS account, requires access to the KMS encryption key used to encrypt the DB snapshot.
    • Because KMS encryption keys are specific to the region that they are created in, encrypted snapshot cannot be copied to another region
  • Transparent Data Encryption (TDE)
    • Automatically encrypts the data before it is written to the underlying storage device and decrypts it when it is read from the storage device
    • is supported by Oracle and SQL Server
      • Oracle requires key storage outside of the KMS and integrates with CloudHSM for this
      • SQL Server requires a key, which is managed by RDS
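
A minimal boto3 sketch of the create-time encryption and encrypted snapshot copy described above (identifiers, the password, and the KMS key ARN are placeholders):

    import boto3

    rds = boto3.client('rds')

    # Encryption can only be enabled when the instance is created
    rds.create_db_instance(
        DBInstanceIdentifier='mydb-encrypted',
        DBInstanceClass='db.t3.medium',
        Engine='mysql',
        MasterUsername='admin',
        MasterUserPassword='change-me-please',   # placeholder
        AllocatedStorage=100,
        StorageEncrypted=True,
        KmsKeyId='arn:aws:kms:us-east-1:123456789012:key/placeholder'
    )

    # Copying a snapshot with a KMS key produces an encrypted copy; this is
    # also the quick way to add encryption to a previously unencrypted snapshot
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier='mydb-unencrypted-snap',
        TargetDBSnapshotIdentifier='mydb-encrypted-snap',
        KmsKeyId='arn:aws:kms:us-east-1:123456789012:key/placeholder'
    )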

SSL to Encrypt a Connection to a DB Instance

  • Encrypt connections using SSL for data in transit between the applications and the DB instance
  • Amazon RDS creates an SSL certificate and installs the certificate on the DB instance when RDS provisions the instance.
  • SSL certificates are signed by a certificate authority. The SSL certificate includes the DB instance endpoint as the Common Name (CN) for the SSL certificate to guard against spoofing attacks
  • While SSL offers security benefits, be aware that SSL encryption is a compute-intensive operation and will increase the latency of the database connection.
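
A minimal sketch of an SSL connection, assuming the third-party mysql-connector-python package and the RDS CA bundle downloaded to a local file (the endpoint and credentials are placeholders):

    import mysql.connector  # third-party: pip install mysql-connector-python

    conn = mysql.connector.connect(
        host='mydb.abc123.us-east-1.rds.amazonaws.com',  # placeholder endpoint
        user='admin',
        password='change-me-please',
        # Verify the server certificate against the RDS CA bundle, so the
        # endpoint in the certificate CN guards against spoofing
        ssl_ca='rds-combined-ca-bundle.pem'
    )
    conn.close()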

RDS Security Groups

  • Security groups control the access that traffic has in and out of a DB instance
  • VPC security groups act like a firewall controlling network access to your DB instance.
  • VPC security groups can be configured and associated with the DB instance to allow access from an IP address range, port, or EC2 security group
  • Database security groups default to a “deny all” access mode and customers must specifically authorize network ingress.
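
A minimal boto3 sketch of authorizing ingress to a DB security group from an application tier's EC2 security group only (the security group ids are placeholders):

    import boto3

    ec2 = boto3.client('ec2')

    # Ingress defaults to "deny all"; explicitly allow MySQL traffic to the
    # DB security group only from the application tier's security group
    ec2.authorize_security_group_ingress(
        GroupId='sg-0db11111111111111',          # placeholder DB security group
        IpPermissions=[{
            'IpProtocol': 'tcp',
            'FromPort': 3306,
            'ToPort': 3306,
            'UserIdGroupPairs': [{'GroupId': 'sg-0app2222222222222'}]
        }]
    )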

Master User Account Privileges

  • When a new DB instance is created, the default master user gets certain privileges for that DB instance
  • Subsequently, other users with permissions can be created

Event Notification

  • Event notifications can be configured for important events that occur on the DB instance
  • Notifications can be received for a variety of important events that can occur on the RDS instance, such as the instance being shut down, a backup being started, a failover occurring, the security group being changed, or storage space running low

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Can I encrypt connections between my application and my DB Instance using SSL?
    1. No
    2. Yes
    3. Only in VPC
    4. Only in certain regions
  2. Which of these configuration or deployment practices is a security risk for RDS?
    1. Storing SQL function code in plaintext
    2. Non-Multi-AZ RDS instance
    3. Having RDS and EC2 instances exist in the same subnet
    4. RDS in a public subnet (Making RDS accessible to the public internet in a public subnet poses a security risk, by making your database directly addressable and spammable. DB instances deployed within a VPC can be configured to be accessible from the Internet or from EC2 instances outside the VPC. If a VPC security group specifies a port access such as TCP port 22, you would not be able to access the DB instance because the firewall for the DB instance provides access only via the IP addresses specified by the DB security groups the instance is a member of and the port defined when the DB instance was created. Refer link)

References

AWS RDS User Guide – Security


AWS Blue Green Deployment – Certification

AWS Blue Green Deployment

  • Blue/green deployments provide near zero-downtime release and rollback capabilities.
  • Blue/green deployment works by shifting traffic between two identical environments that are running different versions of the application
    • Blue environment represents the current application version serving production traffic.
    • In parallel, the green environment is staged running a different version of your application.
    • After the green environment is ready and tested, production traffic is redirected from blue to green.
    • If any problems are identified, you can roll back by reverting traffic back to the blue environment.

NOTE: Advanced Topic required for DevOps Professional Exam Only

AWS Services

Route 53

  • Route 53 is a highly available and scalable authoritative DNS service that routes user requests
  • Route 53 with its DNS service allows administrators to direct traffic by simply updating DNS records in the hosted zone
  • TTLs for resource records can be shortened, which allows record changes to propagate faster to clients

Elastic Load Balancing

  • Elastic Load Balancing distributes incoming application traffic across EC2 instances
  • Elastic Load Balancing scales in response to incoming requests, performs health checking against Amazon EC2 resources, and naturally integrates with other AWS tools, such as Auto Scaling.
  • ELB also helps perform health checks of EC2 instances to route traffic only to the healthy instances

Auto Scaling

  • Auto Scaling allows different versions of launch configuration, which define templates used to launch EC2 instances, to be attached to an Auto Scaling group to enable blue/green deployment.
  • Auto Scaling’s termination policies and Standby state enable blue/green deployment
    • Termination policies in Auto Scaling groups determine which EC2 instances to remove during a scaling action.
    • Auto Scaling also allows instances to be placed in Standby state, instead of termination, which helps with quick rollback when required
  • Auto Scaling with Elastic Load Balancing can be used to balance and scale the traffic

Elastic Beanstalk

  • Elastic Beanstalk makes it easy to run multiple versions of the application and provides capabilities to swap the environment URLs, facilitating blue/green deployment.
  • Elastic Beanstalk supports Auto Scaling and Elastic Load Balancing, both of which enable blue/green deployment

OpsWorks

  • OpsWorks has the concept of stacks, which are logical groupings of AWS resources that share a common purpose and should be logically managed together
  • Stacks are made of one or more layers, with each layer representing a set of EC2 instances that serve a particular purpose, such as serving applications or hosting a database server.
  • OpsWorks simplifies cloning entire stacks when preparing for blue/green environments.

CloudFormation

  • CloudFormation helps describe the AWS resources through JSON formatted templates and provides automation capabilities for provisioning blue/green environments and facilitating updates to switch traffic, whether through Route 53 DNS, Elastic Load Balancing, etc
  • CloudFormation provides infrastructure as code strategy, where infrastructure is provisioned and managed using code and software development techniques, such as version control and continuous integration, in a manner similar to how application code is treated

CloudWatch

  • CloudWatch monitoring can provide early detection of application health in blue/green deployments

Deployment Techniques

DNS Routing using Route 53

  • Route 53 DNS service can help switch traffic from the blue environment to the green and vice versa, if rollback is necessary
  • Route 53 can help either switch the traffic completely or through a weighted distribution
  • Weighted distribution
    • helps distribute a percentage of traffic to the green environment and gradually update the weights until the green environment carries the full production traffic
    • provides the ability to perform canary analysis where a small percentage of production traffic is introduced to a new environment
    • helps manage cost by using auto scaling for instances to scale based on the actual demand
  • Route 53 can handle Public or Elastic IP address, Elastic Load Balancer, Elastic Beanstalk environment web tiers etc.
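
A minimal boto3 sketch of the weighted distribution switch (the hosted zone id, record names, and ELB DNS names are placeholders):

    import boto3

    r53 = boto3.client('route53')

    def set_weights(blue_weight, green_weight):
        # Two weighted records with the same name; Route 53 splits traffic
        # in proportion to the weights (e.g. 90/10 for canary analysis)
        r53.change_resource_record_sets(
            HostedZoneId='Z0PLACEHOLDER',
            ChangeBatch={'Changes': [
                {'Action': 'UPSERT', 'ResourceRecordSet': {
                    'Name': 'www.example.com', 'Type': 'CNAME',
                    'SetIdentifier': 'blue', 'Weight': blue_weight,
                    'TTL': 60,   # short TTL so changes propagate quickly
                    'ResourceRecords': [{'Value': 'blue-elb.example.aws'}]}},
                {'Action': 'UPSERT', 'ResourceRecordSet': {
                    'Name': 'www.example.com', 'Type': 'CNAME',
                    'SetIdentifier': 'green', 'Weight': green_weight,
                    'TTL': 60,
                    'ResourceRecords': [{'Value': 'green-elb.example.aws'}]}}
            ]}
        )

    set_weights(90, 10)   # canary: 10% of traffic to the green environment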

Auto Scaling Group Swap Behind Elastic Load Balancer

AWS Blue Green Deployment - Auto Scaling Group

  • Elastic Load Balancing with Auto Scaling to manage EC2 resources as per the demand can be used for Blue Green deployments
  • Multiple Auto Scaling groups can be attached to the Elastic Load Balancer
  • Green ASG can be attached to an existing ELB while Blue ASG is already attached to the ELB to serve traffic
  • The ELB would start routing requests to the Green group, as for an HTTP/S listener it uses a least outstanding requests routing algorithm
  • Green group capacity can be increased to process more traffic, while the Blue group capacity can be reduced either by terminating the instances or by putting the instances in a standby mode
  • Standby is a good option because, if rollback to the blue environment is needed, blue server instances can be put back in service and they’re ready to go
  • If no issues with the Green group, the blue group can be decommissioned by adjusting the group size to zero
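
A minimal boto3 sketch of the attach-and-standby steps above (group, ELB, and instance identifiers are placeholders):

    import boto3

    asg = boto3.client('autoscaling')

    # Attach the green group to the ELB already serving the blue group
    asg.attach_load_balancers(
        AutoScalingGroupName='green-asg',        # placeholder names
        LoadBalancerNames=['prod-elb']
    )

    # Put blue instances in Standby instead of terminating them,
    # keeping them ready for a quick rollback
    asg.enter_standby(
        AutoScalingGroupName='blue-asg',
        InstanceIds=['i-0123456789abcdef0'],
        ShouldDecrementDesiredCapacity=True
    )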

Update Auto Scaling Group Launch Configurations

AWS Blue Green Deployment - Auto Scaling Launch

  • Auto Scaling groups have their own launch configurations which define template for EC2 instances to be launched
  • An Auto Scaling group can have only one launch configuration at a time, and it can’t be modified. If modification is needed, a new launch configuration can be created and attached to the existing Auto Scaling group
  • After a new launch configuration is in place, any new instances that are launched use the new launch configuration parameters, but existing instances are not affected.
  • When Auto Scaling removes instances (referred to as scaling in) from the group, the default termination policy is to remove instances with the oldest launch configuration
  • To deploy the new version of the application in the green environment, update the Auto Scaling group with the new launch configuration, and then scale the Auto Scaling group to twice its original size.
  • Then, shrink the Auto Scaling group back to the original size
  • To perform a rollback, update the Auto Scaling group with the old launch configuration. Then, do the preceding steps in reverse
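
A minimal boto3 sketch of this launch-configuration swap, assuming an original group size of 4 (names and the AMI id are placeholders):

    import boto3

    asg = boto3.client('autoscaling')

    # Launch configurations are immutable, so create a new one for v2
    asg.create_launch_configuration(
        LaunchConfigurationName='app-v2',        # placeholder name/AMI
        ImageId='ami-0123456789abcdef0',
        InstanceType='t3.medium'
    )

    # Attach it and scale the group to twice its original size; new instances
    # use the new launch configuration, existing ones are unaffected
    asg.update_auto_scaling_group(
        AutoScalingGroupName='prod-asg',
        LaunchConfigurationName='app-v2',
        MaxSize=8,
        DesiredCapacity=8                        # originally 4
    )

    # After validation, shrink back; the default termination policy removes
    # instances with the oldest launch configuration first
    asg.update_auto_scaling_group(
        AutoScalingGroupName='prod-asg',
        DesiredCapacity=4
    )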

Elastic Beanstalk Application Environment Swap

AWS Blue Green Deployment - Elastic Beanstalk

  • Elastic Beanstalk’s multiple-environment and environment URL swap features help enable Blue/Green deployment
  • Elastic Beanstalk can be used to host the blue environment exposed via URL to access the environment
  • Elastic Beanstalk provides several deployment policies, ranging from policies that perform an in-place update on existing instances, to immutable deployment using a set of new instances.
  • Elastic Beanstalk performs an in-place update when the application versions are updated; however, the application may become unavailable to users for a short period of time
  • To avoid the downtime, a new version can be deployed to a separate Green environment with its own URL, launched with the existing environment’s configuration
  • Elastic Beanstalk’s Swap Environment URLs feature can be used to promote the green environment to serve production traffic
  • Elastic Beanstalk performs a DNS switch, which typically takes a few minutes
  • To perform a rollback, invoke Swap Environment URL again.
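
A minimal boto3 sketch of the URL swap; running the same call again performs the rollback (environment names are placeholders):

    import boto3

    eb = boto3.client('elasticbeanstalk')

    # Promote green to production by swapping CNAMEs (a DNS switch);
    # invoke the same call again to roll back
    eb.swap_environment_cnames(
        SourceEnvironmentName='myapp-blue',      # placeholder environments
        DestinationEnvironmentName='myapp-green'
    )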

Clone a Stack in AWS OpsWorks and Update DNS

  • OpsWorks can be used to create
    • Blue environment stack with the current version of the application and serving production traffic
    • Green environment stack with the newer version of the application and is not receiving any traffic
  • To promote the green environment/stack into production, update the DNS records to point to the green environment/stack’s load balancer (as sketched below)
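
A minimal boto3 sketch of cloning a stack; the subsequent DNS update could be done with Route 53 as sketched earlier (the stack id and role ARN are placeholders):

    import boto3

    ow = boto3.client('opsworks', region_name='us-east-1')

    # Clone the blue stack to create the green stack
    green = ow.clone_stack(
        SourceStackId='blue-stack-id',           # placeholder id/ARN
        ServiceRoleArn='arn:aws:iam::123456789012:role/aws-opsworks-service-role'
    )
    print(green['StackId'])
    # ...deploy the new version to the green stack, then update the DNS
    # record for the application to the green stack's load balancer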

AWS Blue Green deployment patterns

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. What is server immutability?
    1. Not updating a server after creation. (During the new release, a new set of EC2 instances are rolled out by terminating older instances and are disposable. EC2 instance usage is considered temporary or ephemeral in nature for the period of deployment until the current release is active)
    2. The ability to change server counts.
    3. Updating a server after creation.
    4. The inability to change server counts.
  2. You need to deploy a new application version to production. Because the deployment is high-risk, you need to roll the new version out to users over a number of hours, to make sure everything is working correctly. You need to be able to control the proportion of users seeing the new version of the application down to the percentage point. You use ELB and EC2 with Auto Scaling Groups and custom AMIs with your code pre-installed assigned to Launch Configurations. There are no database-level changes during your deployment. You have been told you cannot spend too much money, so you must not increase the number of EC2 instances much at all during the deployment, but you also need to be able to switch back to the original version of code quickly if something goes wrong. What is the best way to meet these requirements?
    1. Create a second ELB, Auto Scaling Launch Configuration, and Auto Scaling Group using the Launch Configuration. Create AMIs with all code pre-installed. Assign the new AMI to the second Auto Scaling Launch Configuration. Use Route53 Weighted Round Robin Records to adjust the proportion of traffic hitting the two ELBs. (Use Weighted Round Robin DNS Records and reverse proxies allow such fine-grained tuning of traffic splits. Blue-Green option does not meet the requirement that we mitigate costs and keep overall EC2 fleet size consistent, so we must select the 2 ELB and ASG option with WRR DNS tuning)
    2. Use the Blue-Green deployment method to enable the fastest possible rollback if needed. Create a full second stack of instances and cut the DNS over to the new stack of instances, and change the DNS back if a rollback is needed. (Full second stack is expensive)
    3. Create AMIs with all code pre-installed. Assign the new AMI to the Auto Scaling Launch Configuration, to replace the old one. Gradually terminate instances running the old code (launched with the old Launch Configuration) and allow the new AMIs to boot to adjust the traffic balance to the new code. On rollback, reverse the process by doing the same thing, but changing the AMI on the Launch Config back to the original code. (Cannot modify the existing launch config)
    4. Migrate to use AWS Elastic Beanstalk. Use the established and well-tested Rolling Deployment setting AWS provides on the new Application Environment, publishing a zip bundle of the new code and adjusting the wait period to spread the deployment over time. Re-deploy the old code bundle to rollback if needed.
  3. When thinking of AWS Elastic Beanstalk, the ‘Swap Environment URLs’ feature most directly aids in what?
    1. Immutable Rolling Deployments
    2. Mutable Rolling Deployments
    3. Canary Deployments
    4. Blue-Green Deployments (Complete switch from one environment to other)
  4. You were just hired as a DevOps Engineer for a startup. Your startup uses AWS for 100% of their infrastructure. They currently have no automation at all for deployment, and they have had many failures while trying to deploy to production. The company has told you deployment process risk mitigation is the most important thing now, and you have a lot of budget for tools and AWS resources. Their stack: 2-tier API Data stored in DynamoDB or S3, depending on type, Compute layer is EC2 in Auto Scaling Groups, They use Route53 for DNS pointing to an ELB, An ELB balances load across the EC2 instances. The scaling group properly varies between 4 and 12 EC2 servers. Which of the following approaches, given this company’s stack and their priorities, best meets the company’s needs?
    1. Model the stack in AWS Elastic Beanstalk as a single Application with multiple Environments. Use Elastic Beanstalk’s Rolling Deploy option to progressively roll out application code changes when promoting across environments. (Does not support DynamoDB also need Blue Green deployment for zero downtime deployment as cost is not a constraint)
    2. Model the stack in 3 CloudFormation templates: Data layer, compute layer, and networking layer. Write stack deployment and integration testing automation following Blue-Green methodologies.
    3. Model the stack in AWS OpsWorks as a single Stack, with 1 compute layer and its associated ELB. Use Chef and App Deployments to automate Rolling Deployment. (Does not support DynamoDB also need Blue Green deployment for zero downtime deployment as cost is not a constraint)
    4. Model the stack in 1 CloudFormation template, to ensure consistency and dependency graph resolution. Write deployment and integration testing automation following Rolling Deployment methodologies. (Need Blue Green deployment for zero downtime deployment as cost is not a constraint)
  5. You are building out a layer in a software stack on AWS that needs to be able to scale out to react to increased demand as fast as possible. You are running the code on EC2 instances in an Auto Scaling Group behind an ELB. Which application code deployment method should you use?
    1. SSH into new instances those come online, and deploy new code onto the system by pulling it from an S3 bucket, which is populated by code that you refresh from source control on new pushes. (is slow and manual)
    2. Bake an AMI when deploying new versions of code, and use that AMI for the Auto Scaling Launch Configuration. (Pre baked AMIs can help to get started quickly)
    3. Create a Dockerfile when preparing to deploy a new version to production and publish it to S3. Use UserData in the Auto Scaling Launch configuration to pull down the Dockerfile from S3 and run it when new instances launch. (is slow)
    4. Create a new Auto Scaling Launch Configuration with UserData scripts configured to pull the latest code at all times. (is slow)
  6. You company runs a complex customer relations management system that consists of around 10 different software components all backed by the same Amazon Relational Database (RDS) database. You adopted AWS OpsWorks to simplify management and deployment of that application and created an AWS OpsWorks stack with layers for each of the individual components. An internal security policy requires that all instances should run on the latest Amazon Linux AMI and that instances must be replaced within one month after the latest Amazon Linux AMI has been released. AMI replacements should be done without incurring application downtime or capacity problems. You decide to write a script to be run as soon as a new Amazon Linux AMI is released. Which solutions support the security policy and meet your requirements? Choose 2 answers
    1. Assign a custom recipe to each layer, which replaces the underlying AMI. Use AWS OpsWorks life-cycle events to incrementally execute this custom recipe and update the instances with the new AMI.
    2. Create a new stack and layers with identical configuration, add instances with the latest Amazon Linux AMI specified as a custom AMI to the new layer, switch DNS to the new stack, and tear down the old stack. (Blue-Green Deployment)
    3. Identify all Amazon Elastic Compute Cloud (EC2) instances of your AWS OpsWorks stack, stop each instance, replace the AMI ID property with the ID of the latest Amazon Linux AMI ID, and restart the instance. To avoid downtime, make sure not more than one instance is stopped at the same time.
    4. Specify the latest Amazon Linux AMI as a custom AMI at the stack level, terminate instances of the stack and let AWS OpsWorks launch new instances with the new AMI.
    5. Add new instances with the latest Amazon Linux AMI specified as a custom AMI to all AWS OpsWorks layers of your stack, and terminate the old ones.
  7. Your company runs an event management SaaS application that uses Amazon EC2, Auto Scaling, Elastic Load Balancing, and Amazon RDS. Your software is installed on instances at first boot, using a tool such as Puppet or Chef, which you also use to deploy small software updates multiple times per week. After a major overhaul of your software, you roll out version 2.0 new, much larger version of the software of your running instances. Some of the instances are terminated during the update process. What actions could you take to prevent instances from being terminated in the future? (Choose two)
    1. Use the zero downtime feature of Elastic Beanstalk to deploy new software releases to your existing instances. (No such feature, you can perform environment url swap)
    2. Use AWS CodeDeploy. Create an application and a deployment targeting the Auto Scaling group. Use CodeDeploy to deploy and update the application in the future. (Refer link)
    3. Run “aws autoscaling suspend-processes” before updating your application. (Refer link)
    4. Use the AWS Console to enable termination protection for the current instances. (Termination protection does not work with Auto Scaling)
    5. Run “aws autoscaling detach-load-balancers” before updating your application. (Does not prevent Auto Scaling to terminate the instances)

References

AWS Blue/Green Deployment Whitepaper

AWS DynamoDB Secondary Indexes – Certification

AWS DynamoDB Secondary Indexes

  • DynamoDB provides fast access to items in a table by specifying primary key values
  • Secondary indexes on a table allow efficient access to data with attributes other than the primary key
  • Secondary index
    • is a data structure that contains a subset of attributes from a table
    • is associated with exactly one table, from which it obtains its data
    • requires an alternate key for the index partition key and sort key
    • additionally can define projected attributes which are copied from the base table into the index along with the primary key attributes
    • is automatically maintained by DynamoDB
    • whenever items in the base table are added, modified, or deleted, any indexes on that table are also updated to reflect these changes
  • DynamoDB supports two types of secondary indexes
    • Global secondary index – an index with a partition key and a sort key that can be different from those on the base table
    • Local secondary index – an index that has the same partition key as the base table, but a different sort key

Global Secondary Indexes

  • DynamoDB creates and maintains indexes for the primary key attributes for efficient access of data in the table, which allows applications to quickly retrieve data by specifying primary key values.
  • Global Secondary Indexes (GSI) are indexes that contain partition or composite partition-and-sort keys that can be different from the keys in the table on which the index is based.
  • Global secondary index is considered “global” because queries on the index can span all items in a table, across all partitions.
  • Multiple secondary indexes can be created on a table, and queries issued against these indexes.
  • Applications benefit from having one or more secondary keys available to allow efficient access to data with attributes other than the primary key.
  • GSIs support non-unique attributes, which increases query flexibility by enabling queries against any non-key attribute in the table
  • GSIs support eventual consistency; DynamoDB automatically and asynchronously handles item additions, updates, and deletes in a GSI when corresponding changes are made to the table
  • Data in a secondary index consists of the GSI alternate key, the table’s primary key, and attributes that are projected, or copied, from the table into the index.
  • Attributes that are part of an item in a table, but not part of the GSI key, primary key of the table, or projected attributes are not returned on querying the GSI index
  • GSIs manage throughput independently of the table they are based on and the provisioned throughput for the table and each associated GSI needs to be specified at creation time
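
A minimal boto3 sketch of adding a GSI, with its own provisioned throughput, to an existing table (table, attribute, and index names are placeholders):

    import boto3

    ddb = boto3.client('dynamodb')

    ddb.update_table(
        TableName='GameScores',                  # placeholder table
        AttributeDefinitions=[
            {'AttributeName': 'GameTitle', 'AttributeType': 'S'},
            {'AttributeName': 'TopScore', 'AttributeType': 'N'}
        ],
        GlobalSecondaryIndexUpdates=[{
            'Create': {
                'IndexName': 'GameTitleIndex',
                # Partition and sort key differ from the base table's keys
                'KeySchema': [
                    {'AttributeName': 'GameTitle', 'KeyType': 'HASH'},
                    {'AttributeName': 'TopScore', 'KeyType': 'RANGE'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                # GSI throughput is provisioned independently of the table
                'ProvisionedThroughput': {'ReadCapacityUnits': 5,
                                          'WriteCapacityUnits': 5}
            }
        }]
    )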

Local Secondary Indexes

  • Local secondary indexes are indexes that have the same partition key as the table, but a different sort key.
  • A local secondary index is “local” because every partition of the index is scoped to a table partition that has the same partition key.
  • LSIs allow searches using a secondary sort key in place of the table’s sort key, expanding the number of attributes that can be used for efficient queries
  • LSIs are updated automatically when the primary index is updated, and reads support both strongly and eventually consistent options
  • LSIs can only be queried via the Query API
  • LSIs cannot be added to existing tables at this time
  • LSIs cannot be modified once created at this time
  • LSIs cannot be removed from a table once created at this time
  • LSI consumes provisioned throughput capacity as part of the table with which it is associated
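
A minimal boto3 sketch contrasting with the GSI example: the LSI must be declared at table creation and is queried via the Query API, optionally with strong consistency (all names are placeholders):

    import boto3

    ddb = boto3.client('dynamodb')

    # LSIs must be defined at table creation time and share the table's
    # partition key and provisioned throughput
    ddb.create_table(
        TableName='Orders',                      # placeholder table
        AttributeDefinitions=[
            {'AttributeName': 'CustomerId', 'AttributeType': 'S'},
            {'AttributeName': 'OrderDate', 'AttributeType': 'S'},
            {'AttributeName': 'OrderTotal', 'AttributeType': 'N'}
        ],
        KeySchema=[
            {'AttributeName': 'CustomerId', 'KeyType': 'HASH'},
            {'AttributeName': 'OrderDate', 'KeyType': 'RANGE'}
        ],
        LocalSecondaryIndexes=[{
            'IndexName': 'OrderTotalIndex',
            'KeySchema': [
                {'AttributeName': 'CustomerId', 'KeyType': 'HASH'},   # same partition key
                {'AttributeName': 'OrderTotal', 'KeyType': 'RANGE'}   # alternate sort key
            ],
            'Projection': {'ProjectionType': 'KEYS_ONLY'}
        }],
        ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
    )

    # LSIs are queried via the Query API; strong consistency is supported
    ddb.query(
        TableName='Orders',
        IndexName='OrderTotalIndex',
        KeyConditionExpression='CustomerId = :c AND OrderTotal > :t',
        ExpressionAttributeValues={':c': {'S': 'cust-1'}, ':t': {'N': '100'}},
        ConsistentRead=True
    )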

DynamoDB Secondary Indexes - GSI vs LSI

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. In DynamoDB, a secondary index is a data structure that contains a subset of attributes from a table, along with an alternate key to support ____ operations.
    1. None of the above
    2. Both
    3. Query
    4. Scan
  2. In regard to DynamoDB, what is the Global secondary index?
    1. An index with a hash and range key that can be different from those on the table
    2. An index that has the same range key as the table, but a different hash key
    3. An index that has the same hash key and range key as the table
    4. An index that has the same hash key as the table, but a different range key
  3. In regard to DynamoDB, can I modify the index once it is created?
    1. Yes, if it is a primary hash key index
    2. Yes, if it is a Global secondary index
    3. No
    4. Yes, if it is a local secondary index
  4. When thinking of DynamoDB, what are true of Global Secondary Key properties?
    1. The partition key and sort key can be different from the table.
    2. Only the partition key can be different from the table.
    3. Either the partition key or the sort key can be different from the table, but not both.
    4. Only the sort key can be different from the table.

AWS Key Management Service – KMS – Certification

AWS Key Management Service – KMS

  • AWS KMS is a managed encryption service that enables encryption of data easily
  • KMS provides a highly available key storage, management, and auditing solution to encrypt the data across AWS services & within applications
  • KMS is seamlessly integrated with several other AWS services to make encrypting data in those services easy
  • KMS Keys are only stored and used in the region in which they are created. They cannot be transferred to another region
  • KMS enforces usage and management policies to control which IAM users and roles, from your account or other accounts, can manage and use keys
  • KMS is integrated with CloudTrail, so all requests to use the keys are logged to understand who used which key when
  • KMS allows rotation of the keys,
    • if keys generated by KMS are rotated automatically by KMS, data does not need to be re-encrypted. KMS keeps previous versions of keys to use for decryption of data encrypted under an old version of a key. All new encryption requests against a key in AWS KMS are encrypted under the newest version of the key.
    • if manually rotated, data has to be re-encrypted depending on the application’s configuration
    • Automatic key rotation is not supported for imported keys

KMS Working

  • KMS centrally manages and securely stores the keys
  • Keys can be generated or imported from your key management infrastructure
  • Keys can be used from within the applications and supported AWS services to protect the data, but the key never leaves AWS KMS.
  • Data is submitted to AWS KMS to be encrypted, or decrypted, under keys that you control.
  • Usage policies on these keys can be set that determine which users can use them to encrypt and decrypt data.

Envelope encryption

  • AWS cloud services integrated with AWS KMS use a method called envelope encryption to protect the data.
  • Envelope encryption is an optimized method for encrypting data that uses two different keys
  • With envelope encryption
    • A data key is generated and used by the AWS service to encrypt each piece of data or resource.
    • Data key is encrypted under a master key that you define in AWS KMS.
    • Encrypted data key is then stored by the AWS service.
    • For data decryption by the AWS service, the encrypted data key is passed to AWS KMS and decrypted under the master key it was originally encrypted under, so the service can then decrypt your data.
  • While KMS supports sending data of up to 4 KB to be encrypted directly, envelope encryption can offer significant performance benefits
  • When data is encrypted directly with KMS, it must be transferred over the network
  • Envelope encryption reduces the network load for the application or AWS cloud service, as only the request and fulfillment of the data key through KMS must go over the network
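
A minimal sketch of the envelope pattern, using boto3 for the KMS calls and the third-party cryptography package for the local encryption (the key alias is a placeholder):

    import base64
    import boto3
    from cryptography.fernet import Fernet  # third-party: pip install cryptography

    kms = boto3.client('kms')

    # 1. Generate a data key under a KMS master key; KMS returns the key both
    #    in plaintext and encrypted under the master key
    dk = kms.generate_data_key(KeyId='alias/my-app-key', KeySpec='AES_256')

    # 2. Encrypt the data locally with the plaintext data key, then discard it;
    #    only the encrypted data key needs to be stored alongside the data
    f = Fernet(base64.urlsafe_b64encode(dk['Plaintext']))
    ciphertext = f.encrypt(b'secret payload')
    encrypted_data_key = dk['CiphertextBlob']

    # 3. To decrypt, send only the small encrypted data key to KMS...
    plaintext_key = kms.decrypt(CiphertextBlob=encrypted_data_key)['Plaintext']

    # 4. ...and decrypt the data locally, so the payload itself never has to
    #    cross the network
    data = Fernet(base64.urlsafe_b64encode(plaintext_key)).decrypt(ciphertext)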

KMS Features

  • Create keys with a unique alias and description
  • Import your own keys
  • Control which IAM users and roles can manage keys
  • Control which IAM users and roles can use keys to encrypt & decrypt data
  • Choose to have AWS KMS automatically rotate keys on an annual basis
  • Temporarily disable keys so they cannot be used by anyone
  • Re-enable disabled keys
  • Delete keys that you no longer use
  • Audit use of keys by inspecting logs in AWS CloudTrail
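
A few of these operations as a minimal boto3 sketch (the alias is a placeholder):

    import boto3

    kms = boto3.client('kms')

    # Create a key with a description and a friendly alias
    key_id = kms.create_key(Description='app data key')['KeyMetadata']['KeyId']
    kms.create_alias(AliasName='alias/my-app-key', TargetKeyId=key_id)

    # Have KMS rotate the key automatically on an annual basis
    kms.enable_key_rotation(KeyId=key_id)

    # Temporarily disable, then re-enable, the key
    kms.disable_key(KeyId=key_id)
    kms.enable_key(KeyId=key_id)

    # Deletion is scheduled with a waiting period rather than immediate
    kms.schedule_key_deletion(KeyId=key_id, PendingWindowInDays=7)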

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You are designing a personal document-archiving solution for your global enterprise with thousands of employees. Each employee has potentially gigabytes of data to be backed up in this archiving solution. The solution will be exposed to the employees as an application, where they can just drag and drop their files to the archiving system. Employees can retrieve their archives through a web interface. The corporate network has high bandwidth AWS Direct Connect connectivity to AWS. You have regulatory requirements that all data needs to be encrypted before being uploaded to the cloud. How do you implement this in a highly available and cost efficient way?
    1. Manage encryption keys on-premise in an encrypted relational database. Set up an on-premises server with sufficient storage to temporarily store files and then upload them to Amazon S3, providing a client-side master key. (Storing temporary increases cost and not a high availability option)
    2. Manage encryption keys in a Hardware Security Module (HSM) appliance on-premise server with sufficient storage to temporarily store, encrypt, and upload files directly into amazon Glacier. (Not cost effective)
    3. Manage encryption keys in amazon Key Management Service (KMS), upload to amazon simple storage service (s3) with client-side encryption using a KMS customer master key ID and configure Amazon S3 lifecycle policies to store each object using the amazon glacier storage tier. (With CSE-KMS the encryption happens at client side before the object is upload to S3 and KMS is cost effective as well)
    4. Manage encryption keys in an AWS CloudHSM appliance. Encrypt files prior to uploading on the employee desktop and then upload directly into amazon glacier (Not cost effective)
  2. An AWS customer is deploying an application that is composed of an Auto Scaling group of EC2 instances. The customer’s security policy requires that every outbound connection from these instances to any other service within the customer’s Virtual Private Cloud must be authenticated using a unique x.509 certificate that contains the specific instance-id. In addition, an x.509 certificate must be signed by the customer’s key management service in order to be trusted for authentication.
    Which of the following configurations will support these requirements?

    1. Configure an IAM Role that grants access to an Amazon S3 object containing a signed certificate and configure the Auto Scaling group to launch instances with this role. Have the instances bootstrap get the certificate from Amazon S3 upon first boot.
    2. Embed a certificate into the Amazon Machine Image that is used by the Auto Scaling group Have the launched instances generate a certificate signature request with the instance’s assigned instance-id to the Key management service for signature.
    3. Configure the Auto Scaling group to send an SNS notification of the launch of a new instance to the trusted key management service. Have the Key management service generate a signed certificate and send it directly to the newly launched instance.
    4. Configure the launched instances to generate a new certificate upon first boot. Have the Key management service poll the AutoScaling group for associated instances and send new instances a certificate signature that contains the specific instance-id.

IAM Role – Identity Providers and Federation – Certification

IAM Role – Identity Providers and Federation

  • Identity providers can be used to grant external user identities permissions to AWS resources without those identities having to be created within your AWS account.
  • External user identities can be authenticated either through the organization’s authentication system or through a well-known identity provider such as Login with Amazon, Google, etc.
  • Identity providers help keep the AWS account secure without the need to distribute or embed long-term security credentials in the application
  • To use an IdP, you create an IAM identity provider entity to establish a trust relationship between your AWS account and the IdP.
  • IAM supports IdPs that are compatible with OpenID Connect (OIDC) or SAML 2.0 (Security Assertion Markup Language 2.0)

Web Identity Federation

Complete Process Flow

IAM Web Identity Federation

  1. Mobile or Web Application needs to be configured with the IdP which gives each application a unique ID or client ID (also called audience)
  2. Create an Identity Provider entity for OIDC compatible IdP in IAM.
  3. Create IAM role and define the
    1. Trust policy –  specify the IdP (like Amazon) as the Principal (the trusted entity), and include a Condition that matches the IdP assigned app ID
    2. Permission policy – specify the permissions the application can assume
  4. Application calls the sign-in interface for the IdP to login
  5. IdP authenticates the user and returns an authentication token (OAuth access token or OIDC ID token) with information about the user to the application
  6. Application then makes an unsigned call to the STS service with the AssumeRoleWithWebIdentity action to request temporary security credentials.
  7. Application passes the IdP’s authentication token along with the Amazon Resource Name (ARN) for the IAM role created for that IdP.
  8. AWS verifies that the token is trusted and valid and if so, returns temporary security credentials (access key, secret access key, session token, expiry time) to the application that have the permissions for the role that you name in the request.
  9. STS response also includes metadata about the user from the IdP, such as the unique user ID that the IdP associates with the user.
  10. Using the Temporary credentials, the application makes signed requests to AWS
  11. User ID information from the identity provider can distinguish users in the app for e.g., objects can be put into S3 folders that include the user ID as prefixes or suffixes. This lets you create access control policies that lock the folder so only the user with that ID can access it.
  12. Application can cache the temporary security credentials and refresh them before their expiry accordingly. Temporary credentials, by default, are good for an hour.
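
As a rough sketch of steps 6-10, the boto3 snippet below exchanges an IdP-issued token for temporary credentials and uses them to make a signed S3 call; the role ARN, bucket name, and token value are hypothetical placeholders.

```python
import boto3

idp_token = "<OIDC ID token returned by the IdP in step 5>"  # placeholder

sts = boto3.client("sts")  # AssumeRoleWithWebIdentity is an unsigned call (step 6)

resp = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/WebIdentityAppRole",  # hypothetical role (step 7)
    RoleSessionName="app-user-session",
    WebIdentityToken=idp_token,
    DurationSeconds=3600,  # temporary credentials valid for 1 hour (the default)
)

creds = resp["Credentials"]                    # step 8
user_id = resp["SubjectFromWebIdentityToken"]  # IdP's unique user ID (steps 9 and 11)

# Signed request using the temporary credentials (step 10)
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3.list_objects_v2(Bucket="my-app-bucket", Prefix=f"users/{user_id}/")  # per-user prefix (placeholder bucket)
```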

Interactive Website provides a very good way to understand the flow

Mobile or Web Identity Federation with Cognito

Amazon Cognito

  • Use Amazon Cognito as the identity broker for almost all web identity federation scenarios
  • Amazon Cognito is easy to use and provides additional capabilities like anonymous (unauthenticated) access
  • Amazon Cognito also helps synchronizing user data across devices and providers

Web Identity Federation using Cognito

SAML 2.0-based Federation

  • AWS supports identity federation with SAML 2.0 (Security Assertion Markup Language 2.0), an open standard that many identity providers (IdPs) use.
  • SAML 2.0 based federation feature enables federated single sign-on (SSO), so users can log into the AWS Management Console or call the AWS APIs without having to create an IAM user for everyone in your organization.
  • By using SAML, the process of configuring federation with AWS can be simplified by using the IdP’s service instead of writing custom identity proxy code.
  • This is useful in organizations that have integrated their identity systems (such as Windows Active Directory or OpenLDAP) with software that can produce SAML assertions to provide information about user identity and permissions (such as Active Directory Federation Services or Shibboleth)

Complete Process Flow

SAML based Federation

  1. Create a SAML provider entity in AWS using the SAML metadata document provided by the organization’s IdP to establish a “trust” between your AWS account and the IdP
  2. SAML metadata document includes the issuer name, a creation date, an expiration date, and keys that AWS can use to validate authentication responses (assertions) from your organization.
  3. Create IAM roles which defines
    1. Trust policy with the SAML provider as the principal, which establishes a trust relationship between the organization and AWS
    2. Permission policy establishes what users from the organization are allowed to do in AWS
  4. SAML trust is completed by configuring the Organization’s IdP with information about AWS and the role(s) that you want the federated users to use. This is referred to as configuring relying party trust between your IdP and AWS
  5. Application calls the sign-in interface for the Organization IdP to login
  6. IdP authenticates the user and generates a SAML authentication response which includes assertions that identify the user and include attributes about the user
  7. Application then makes an unsigned call to the STS service with the AssumeRoleWithSAML action to request temporary security credentials (see the sketch after this list).
  8. Application passes the ARN of the SAML provider, the ARN of the role to assume, the SAML assertion about the current user returned by IdP and the time for which the credentials should be valid. An optional IAM Policy parameter can be provided to further restrict the permissions to the user
  9. AWS verifies that the SAML assertion is trusted and valid and if so, returns temporary security credentials (access key, secret access key, session token, expiry time) to the application that have the permissions for the role named in the request.
  10. STS response also includes metadata about the user from the IdP, such as the unique user ID that the IdP associates with the user.
  11. Using the Temporary credentials, the application makes signed requests to AWS to access the services
  12. Application can cache the temporary security credentials and refresh them before their expiry accordingly. Temporary credentials, by default, are good for an hour.
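
A minimal boto3 sketch of steps 7-9, with hypothetical role and SAML provider ARNs and a placeholder assertion:

```python
import boto3

saml_assertion = "<base64-encoded SAML assertion returned by the IdP in step 6>"  # placeholder

sts = boto3.client("sts")  # AssumeRoleWithSAML is also an unsigned call (step 7)

resp = sts.assume_role_with_saml(
    RoleArn="arn:aws:iam::123456789012:role/SAMLFederatedRole",       # hypothetical role (step 8)
    PrincipalArn="arn:aws:iam::123456789012:saml-provider/MyOrgIdP",  # hypothetical SAML provider
    SAMLAssertion=saml_assertion,
    DurationSeconds=3600,  # how long the credentials should be valid
)

creds = resp["Credentials"]  # access key, secret access key, session token, expiry time (step 9)
```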

SAML 2.0 based federation can also be used to grant access to the federated users to the AWS Management console. This requires the use of the AWS SSO endpoint instead of directly calling the AssumeRoleWithSAML API. The endpoint calls the API for the user and returns a URL that automatically redirects the user’s browser to the AWS Management Console.

Complete Process Flow

SAML based SSO to AWS Console

  1. User browses to the organization’s portal and selects the option to go to the AWS Management Console.
  2. Portal performs the function of the identity provider (IdP) that handles the exchange of trust between the organization and AWS.
  3. Portal verifies the user’s identity in the organization.
  4. Portal generates a SAML authentication response that includes assertions that identify the user and include attributes about the user.
  5. Portal sends this response to the client browser.
  6. Client browser is redirected to the AWS SSO endpoint and posts the SAML assertion.
  7. AWS SSO endpoint handles the call for the AssumeRoleWithSAML API action on the user’s behalf and requests temporary security credentials from STS and creates a console sign-in URL that uses those credentials.
  8. AWS sends the sign-in URL back to the client as a redirect.
  9. Client browser is redirected to the AWS Management Console. If the SAML authentication response includes attributes that map to multiple IAM roles, the user is first prompted to select the role to use for access to the console.

Custom Identity broker Federation


  • If the organization doesn’t have a SAML-compatible IdP, a Custom Identity Broker can be used to provide the access
  • Custom Identity Broker should perform the following steps (sketched in code after this list)
    • Verify that the user is authenticated by the local identity system.
    • Call the AWS Security Token Service (AWS STS) AssumeRole (recommended) or GetFederationToken (default expiration of 12 hours, maximum of 36 hours) APIs to obtain temporary security credentials for the user.
    • Temporary credentials limit the permissions the user has on the AWS resources
    • Call an AWS federation endpoint and supply the temporary security credentials to get a sign-in token.
    • Construct a URL for the console that includes the token.
    • URL that the federation endpoint provides is valid for 15 minutes after it is created.
    • Give the URL to the user or invoke the URL on the user’s behalf.
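
The sketch below strings these broker steps together using the documented federation endpoint pattern, assuming the user has already been authenticated against the local identity system; the user name, scoped-down policy, Issuer, and Destination values are placeholders.

```python
import json
import urllib.parse

import boto3
import requests

sts = boto3.client("sts")  # the broker itself runs with long-term credentials

# Obtain temporary credentials for the federated user (scoped down by the policy)
token = sts.get_federation_token(
    Name="federated-user",  # hypothetical display name
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "*"}],
    }),
    DurationSeconds=3600,
)
creds = token["Credentials"]

# Exchange the temporary credentials for a sign-in token at the federation endpoint
session_json = json.dumps({
    "sessionId": creds["AccessKeyId"],
    "sessionKey": creds["SecretAccessKey"],
    "sessionToken": creds["SessionToken"],
})
signin_token = requests.get(
    "https://signin.aws.amazon.com/federation",
    params={"Action": "getSigninToken", "Session": session_json},
).json()["SigninToken"]

# Construct the console URL (valid for 15 minutes) and hand it to the user
login_url = (
    "https://signin.aws.amazon.com/federation?Action=login"
    "&Issuer=" + urllib.parse.quote("https://example.com")       # placeholder issuer
    + "&Destination=" + urllib.parse.quote("https://console.aws.amazon.com/")
    + "&SigninToken=" + signin_token
)
```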

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A photo-sharing service stores pictures in Amazon Simple Storage Service (S3) and allows application sign-in using an OpenID Connect-compatible identity provider. Which AWS Security Token Service approach to temporary access should you use for the Amazon S3 operations?
    1. SAML-based Identity Federation
    2. Cross-Account Access
    3. AWS IAM users
    4. Web Identity Federation
  2. Which technique can be used to integrate AWS IAM (Identity and Access Management) with an on-premise LDAP (Lightweight Directory Access Protocol) directory service?
    1. Use an IAM policy that references the LDAP account identifiers and the AWS credentials.
    2. Use SAML (Security Assertion Markup Language) to enable single sign-on between AWS and LDAP
    3. Use AWS Security Token Service from an identity broker to issue short-lived AWS credentials. (Refer Link)
    4. Use IAM roles to automatically rotate the IAM credentials when LDAP credentials are updated.
    5. Use the LDAP credentials to restrict a group of users from launching specific EC2 instance types.
  3. You are designing a photo-sharing mobile app. The application will store all pictures in a single Amazon S3 bucket. Users will upload pictures from their mobile device directly to Amazon S3 and will be able to view and download their own pictures directly from Amazon S3. You want to configure security to handle potentially millions of users in the most secure manner possible. What should your server-side application do when a new user registers on the photo-sharing mobile application? [PROFESSIONAL]
    1. Create a set of long-term credentials using AWS Security Token Service with appropriate permissions. Store these credentials in the mobile app and use them to access Amazon S3.
    2. Record the user’s information in Amazon RDS and create a role in IAM with appropriate permissions. When the user uses their mobile app, create temporary credentials using the AWS Security Token Service ‘AssumeRole’ function. Store these credentials in the mobile app’s memory and use them to access Amazon S3. Generate new credentials the next time the user runs the mobile app.
    3. Record the user’s information in Amazon DynamoDB. When the user uses their mobile app, create temporary credentials using AWS Security Token Service with appropriate permissions. Store these credentials in the mobile app’s memory and use them to access Amazon S3. Generate new credentials the next time the user runs the mobile app.
    4. Create an IAM user. Assign appropriate permissions to the IAM user. Generate an access key and secret key for the IAM user, store them in the mobile app and use these credentials to access Amazon S3.
    5. Create an IAM user. Update the bucket policy with appropriate permissions for the IAM user. Generate an access key and secret key for the IAM user, store them in the mobile app and use these credentials to access Amazon S3.
  4. Your company has recently extended its datacenter into a VPC on AWS to add burst computing capacity as needed. Members of your Network Operations Center need to be able to go to the AWS Management Console and administer Amazon EC2 instances as necessary. You don’t want to create new IAM users for each NOC member and make those users sign in again to the AWS Management Console. Which option below will meet the needs for your NOC members? [PROFESSIONAL]
    1. Use OAuth 2.0 to retrieve temporary AWS security credentials to enable your NOC members to sign in to the AWS Management Console.
    2. Use Web Identity Federation to retrieve AWS temporary security credentials to enable your NOC members to sign in to the AWS Management Console.
    3. Use your on-premises SAML 2.0-compliant identity provider (IdP) to grant the NOC members federated access to the AWS Management Console via the AWS single sign-on (SSO) endpoint.
    4. Use your on-premises SAML 2.0-compliant identity provider (IDP) to retrieve temporary security credentials to enable NOC members to sign in to the AWS Management Console
  5. A corporate web application is deployed within an Amazon Virtual Private Cloud (VPC) and is connected to the corporate data center via an IPsec VPN. The application must authenticate against the on-premises LDAP server. After authentication, each logged-in user can only access an Amazon Simple Storage Service (S3) keyspace specific to that user. Which two approaches can satisfy these objectives? (Choose 2 answers) [PROFESSIONAL]
    1. Develop an identity broker that authenticates against IAM Security Token Service to assume an IAM role in order to get temporary AWS security credentials. The application calls the identity broker to get AWS temporary security credentials with access to the appropriate S3 bucket. (Needs to authenticate against LDAP and not IAM)
    2. The application authenticates against LDAP and retrieves the name of an IAM role associated with the user. The application then calls the IAM Security Token Service to assume that IAM role. The application can use the temporary credentials to access the appropriate S3 bucket. (Authenticates with LDAP and calls the AssumeRole)
    3. Develop an identity broker that authenticates against LDAP and then calls IAM Security Token Service to get IAM federated user credentials The application calls the identity broker to get IAM federated user credentials with access to the appropriate S3 bucket. (Custom Identity broker implementation, with authentication with LDAP and using federated token)
    4. The application authenticates against LDAP. The application then calls the AWS Identity and Access Management (IAM) Security Token Service to log in to IAM using the LDAP credentials. The application can use the IAM temporary credentials to access the appropriate S3 bucket. (Can’t log in to IAM using LDAP credentials)
    5. The application authenticates against IAM Security Token Service using the LDAP credentials. The application uses those temporary AWS security credentials to access the appropriate S3 bucket. (Need to authenticate with LDAP)
  6. Company B is launching a new game app for mobile devices. Users will log into the game using their existing social media account to streamline data capture. Company B would like to directly save player data and scoring information from the mobile app to a DynamoDB table named Score Data. When a user saves their game, the progress data will be stored to the Game State S3 bucket. What is the best approach for storing data to DynamoDB and S3? [PROFESSIONAL]
    1. Use an EC2 Instance that is launched with an EC2 role providing access to the Score Data DynamoDB table and the GameState S3 bucket that communicates with the mobile app via web services.
    2. Use temporary security credentials that assume a role providing access to the Score Data DynamoDB table and the Game State S3 bucket using web identity federation
    3. Use Login with Amazon allowing users to sign in with an Amazon account providing the mobile app with access to the Score Data DynamoDB table and the Game State S3 bucket.
    4. Use an IAM user with access credentials assigned a role providing access to the Score Data DynamoDB table and the Game State S3 bucket for distribution with the mobile app.
  7. A user has created a mobile application which makes calls to DynamoDB to fetch certain data. The application is using the DynamoDB SDK and root account access/secret access key to connect to DynamoDB from mobile. Which of the below mentioned statements is true with respect to the best practice for security in this scenario?
    1. User should create a separate IAM user for each mobile application and provide DynamoDB access with it
    2. User should create an IAM role with DynamoDB and EC2 access. Attach the role with EC2 and route all calls from the mobile through EC2
    3. The application should use an IAM role with web identity federation which validates calls to DynamoDB with identity providers, such as Google, Amazon, and Facebook
    4. Create an IAM Role with DynamoDB access and attach it with the mobile application
  8. You are managing the AWS account of a big organization. The organization has more than 1,000 employees and wants to provide access to various services to most of the employees. Which of the below mentioned options is the best possible solution in this case?
    1. The user should create a separate IAM user for each employee and provide access to them as per the policy
    2. The user should create an IAM role and attach STS with the role. The user should attach that role to the EC2 instance and setup AWS authentication on that server
    3. The user should create IAM groups as per the organization’s departments and add each user to the group for better access control
    4. Attach an IAM role with the organization’s authentication service to authorize each user for various AWS services
  9. Your Fortune 500 company has undertaken a TCO analysis evaluating the use of Amazon S3 versus acquiring more hardware. The outcome was that all employees would be granted access to use Amazon S3 for storage of their personal documents. Which of the following will you need to consider so you can set up a solution that incorporates single sign-on from your corporate AD or LDAP directory and restricts access for each user to a designated user folder in a bucket? (Choose 3 Answers) [PROFESSIONAL]
    1. Setting up a federation proxy or identity provider
    2. Using AWS Security Token Service to generate temporary tokens
    3. Tagging each folder in the bucket
    4. Configuring IAM role
    5. Setting up a matching IAM user for every user in your corporate directory that needs access to a folder in the bucket
  10. An AWS customer is deploying a web application that is composed of a front-end running on Amazon EC2 and of confidential data that is stored on Amazon S3. The customer’s security policy requires that all access operations to this sensitive data must be authenticated and authorized by a centralized access management system that is operated by a separate security team. In addition, the web application team that owns and administers the EC2 web front-end instances is prohibited from having any ability to access the data in a way that circumvents this centralized access management system. Which of the following configurations will support these requirements? [PROFESSIONAL]
    1. Encrypt the data on Amazon S3 using a CloudHSM that is operated by the separate security team. Configure the web application to integrate with the CloudHSM for decrypting approved data access operations for trusted end-users. (S3 doesn’t integrate directly with CloudHSM, also there is no centralized access management system control)
    2. Configure the web application to authenticate end-users against the centralized access management system. Have the web application provision trusted users with STS tokens entitling the download of approved data directly from Amazon S3 (Controlled access, and admins cannot access the data as it needs authentication)
    3. Have the separate security team create an IAM role that is entitled to access the data on Amazon S3. Have the web application team provision their instances with this role while denying their IAM users access to the data on Amazon S3 (Web team would have access to the data)
    4. Configure the web application to authenticate end-users against the centralized access management system using SAML. Have the end-users authenticate to IAM using their SAML token and download the approved data directly from S3. (Not the way SAML auth works, and it is not certain the centralized access management system is SAML-compliant)
  11. What is web identity federation?
    1. Use of an identity provider like Google or Facebook to become an AWS IAM User.
    2. Use of an identity provider like Google or Facebook to exchange for temporary AWS security credentials.
    3. Use of AWS IAM User tokens to log in as a Google or Facebook user.
    4. Use of AWS STS Tokens to log in as a Google or Facebook user.
  12. Games-R-Us is launching a new game app for mobile devices. Users will log into the game using their existing Facebook account and the game will record player data and scoring information directly to a DynamoDB table. What is the most secure approach for signing requests to the DynamoDB API?
    1. Create an IAM user with access credentials that are distributed with the mobile app to sign the requests
    2. Distribute the AWS root account access credentials with the mobile app to sign the requests
    3. Request temporary security credentials using web identity federation to sign the requests
    4. Establish cross account access between the mobile app and the DynamoDB table to sign the requests
  13. You are building a mobile app for consumers to post cat pictures online. You will be storing the images in AWS S3. You want to run the system very cheaply and simply. Which one of these options allows you to build a photo sharing application without needing to worry about scaling expensive uploads processes, authentication/authorization and so forth?
    1. Build the application out using AWS Cognito and web identity federation to allow users to log in using Facebook or Google Accounts. Once they are logged in, the secret token passed to that user is used to directly access resources on AWS, like AWS S3. (Amazon Cognito is a superset of the functionality provided by web identity federation. Refer link)
    2. Use JWT or SAML compliant systems to build authorization policies. Users log in with a username and password, and are given a token they can use indefinitely to make calls against the photo infrastructure.
    3. Use AWS API Gateway with a constantly rotating API Key to allow access from the client-side. Construct a custom build of the SDK and include S3 access in it.
    4. Create an AWS OAuth service domain and grant public signup and access to the domain. During setup, add at least one major social media site as a trusted Identity Provider for users.
  14. The Marketing Director in your company asked you to create a mobile app that lets users post sightings of good deeds known as random acts of kindness in 80-character summaries. You decided to write the application in JavaScript so that it would run on the broadest range of phones, browsers, and tablets. Your application should provide access to Amazon DynamoDB to store the good deed summaries. Initial testing of a prototype shows that there aren’t large spikes in usage. Which option provides the most cost-effective and scalable architecture for this application? [PROFESSIONAL]
    1. Provide the JavaScript client with temporary credentials from the Security Token Service using a Token Vending Machine (TVM) on an EC2 instance to provide signed credentials mapped to an Amazon Identity and Access Management (IAM) user allowing DynamoDB puts and S3 gets. You serve your mobile application out of an S3 bucket enabled as a web site. Your client updates DynamoDB. (Single EC2 instance not a scalable architecture)
    2. Register the application with a Web Identity Provider like Amazon, Google, or Facebook, create an IAM role for that provider, and set up permissions for the IAM role to allow S3 gets and DynamoDB puts. You serve your mobile application out of an S3 bucket enabled as a web site. Your client updates DynamoDB. (Can work with JavaScript SDK, is scalable and cost effective)
    3. Provide the JavaScript client with temporary credentials from the Security Token Service using a Token Vending Machine (TVM) to provide signed credentials mapped to an IAM user allowing DynamoDB puts. You serve your mobile application out of Apache EC2 instances that are load-balanced and autoscaled. Your EC2 instances are configured with an IAM role that allows DynamoDB puts. Your server updates DynamoDB. (Is Scalable but Not cost effective)
    4. Register the JavaScript application with a Web Identity Provider like Amazon, Google, or Facebook, create an IAM role for that provider, and set up permissions for the IAM role to allow DynamoDB puts. You serve your mobile application out of Apache EC2 instances that are load-balanced and autoscaled. Your EC2 instances are configured with an IAM role that allows DynamoDB puts. Your server updates DynamoDB. (Is Scalable but Not cost effective)

References

AWS IAM User Guide – Identity Providers and Federation

AWS Elastic Map Reduce – EMR – Certification

AWS EMR

  • Amazon EMR is a web service that utilizes a hosted Hadoop framework running on the web-scale infrastructure of EC2 and S3
  • EMR enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data
  • EMR
    • uses Apache Hadoop as its distributed data processing engine, which is an open source, Java software that supports data-intensive distributed applications running on large clusters of commodity hardware
    • is ideal for problems that necessitate fast and efficient processing of large amounts of data
    • lets the focus be on crunching or analyzing big data without having to worry about time-consuming set-up, management or tuning of Hadoop clusters or the compute capacity
    • can help perform data-intensive tasks for applications such as web indexing, data mining, log file analysis, machine learning, financial analysis, scientific simulation, and bioinformatics research etc
    • provides web service interface to launch the clusters and monitor processing-intensive computation on clusters
    • is a batch-processing framework that measures the common processing time duration in hours to days; if the use case requires processing in real time or within minutes, Apache Spark or Storm would be a better option
  • EMR seamlessly supports On-Demand, Spot, and Reserved Instances
  • EMR launches all nodes for a given cluster in the same EC2 Availability Zone, which improves performance as it provides higher data access rate
  • EMR supports different EC2 instance types including Standard, High CPU, High Memory, Cluster Compute, High I/O, and High Storage
    • Standard Instances have memory to CPU ratios suitable for most general-purpose applications.
    • High CPU instances have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications
    • High Memory instances offer large memory sizes for high throughput applications
    • Cluster Compute instances have proportionally high CPU with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications
    • High Storage instances offer 48 TB of storage across 24 disks and are ideal for applications that require sequential access to very large data sets such as data warehousing and log processing
  • EMR charges in hourly increments i.e. once the cluster is running, charges apply for the entire hour
  • EMR integrates with CloudTrail to record AWS API calls

NOTE: Topic mainly for Solution Architect Professional Exam Only

EMR Architecture

  • Amazon EMR uses industry proven, fault-tolerant Hadoop software as its data processing engine
  • Hadoop is an open source, Java software that supports data-intensive distributed applications running on large clusters of commodity hardware
  • Hadoop splits the data into multiple subsets and assigns each subset to more than one EC2 instance. So, if an EC2 instance fails to process one subset of data, the results of another Amazon EC2 instance can be used
  • EMR consists of Master node, one or more Slave nodes
    • Master Node
      • EMR currently does not support automatic failover of the master nodes or master node state recovery
      • If master node goes down, the EMR cluster will be terminated and the job needs to be re-executed
    • Slave Nodes – Core nodes and Task nodes
      • Core nodes
        • host persistent data using Hadoop Distributed File System (HDFS) and run Hadoop tasks
        • can be increased in an existing cluster
      • Task nodes
        • only run Hadoop tasks
        • can be increased or decreased in an existing cluster
      • EMR is fault tolerant for slave failures and continues job execution if a slave node goes down.
      • Currently, EMR does not automatically provision another node to take over failed slaves
  • EMR supports Bootstrap actions, which
    • allow users to run custom set-up prior to the execution of the cluster
    • can be used to install software or configure instances before running the cluster (see the sketch below)
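
As a rough illustration, the boto3 sketch below launches a cluster with a bootstrap action; the release label, instance types, script path, and role names are placeholder assumptions.

```python
import boto3

emr = boto3.client("emr")

emr.run_job_flow(
    Name="example-cluster",
    ReleaseLabel="emr-5.36.0",                 # placeholder release label
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,                    # 1 master + 2 core nodes
        "KeepJobFlowAliveWhenNoSteps": False,  # transient cluster: shut down after steps finish
    },
    BootstrapActions=[{
        "Name": "install-custom-software",
        # Runs on every node before Hadoop starts (placeholder script)
        "ScriptBootstrapAction": {"Path": "s3://my-bucket/bootstrap.sh"},
    }],
    JobFlowRole="EMR_EC2_DefaultRole",  # default instance profile name
    ServiceRole="EMR_DefaultRole",      # default service role name
)
```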

EMR Security

  • EMR cluster starts with different security groups for Master and Slaves
    • Master security group
      • has a port open for communication with the service.
      • has a SSH port open to allow direct SSH into the instances, using the key specified at startup
    • Slave security group
      • only allows interaction with the master instance
      • SSH access to the slave nodes is obtained by first SSHing into the master node and then SSHing from there to the slave node
    • Security groups can be configured with different access rules

EMR Security Encryption

  • EMR enables the use of a security configuration
    • which helps to encrypt data at-rest, data in-transit, or both
    • can be used to specify settings for S3 encryption with the EMR file system (EMRFS), local disk encryption, and in-transit encryption
    • is stored in EMR rather than the cluster configuration, making it reusable (a creation sketch follows this section)
    • gives flexibility to choose from several options, including keys managed by AWS KMS, keys managed by S3, and keys and certificates from custom providers that you supply
  • At-rest Encryption for S3 with EMRFS
    • EMRFS supports Server-side (SSE-S3, SSE-KMS) and Client-side encryption (CSE-KMS or CSE-Custom)
    • S3 SSE and CSE encryption with EMRFS are mutually exclusive; either one can be selected but not both
    • Transport layer security (TLS) encrypts EMRFS objects in-transit between EMR cluster nodes & S3
  • At-rest Encryption for Local Disks
    • Open-source HDFS Encryption
      • HDFS exchanges data between cluster instances during distributed processing, and also reads from and writes data to instance store volumes and the EBS volumes attached to instances
      • Open-source Hadoop encryption options are activated
        • Secure Hadoop RPC is set to “Privacy”, which uses the Simple Authentication and Security Layer (SASL).
        • Data encryption on HDFS block data transfer is set to true and is configured to use AES 256 encryption.
    • LUKS – in addition to HDFS encryption, the Amazon EC2 instance store volumes (except boot volumes) and the attached Amazon EBS volumes of cluster instances are encrypted using LUKS
  • In-Transit Data Encryption
    • Encryption artifacts for in-transit encryption can be supplied in one of two ways:
      • either by providing a zipped file of certificates that you upload to S3,
      • or by referencing a custom Java class that provides encryption artifacts
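
A hedged boto3 sketch of creating such a reusable security configuration, enabling SSE-S3 for EMRFS, KMS-based local disk encryption, and in-transit encryption from a zipped certificate file; the KMS key ARN and S3 paths are placeholders.

```python
import json

import boto3

emr = boto3.client("emr")

emr.create_security_configuration(
    Name="example-security-config",
    SecurityConfiguration=json.dumps({
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "AtRestEncryptionConfiguration": {
                # S3 (EMRFS) encryption: SSE-S3 here; SSE-KMS / CSE are alternatives
                "S3EncryptionConfiguration": {"EncryptionMode": "SSE-S3"},
                # Local disk encryption via a KMS key (placeholder ARN)
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/placeholder",
                },
            },
            "EnableInTransitEncryption": True,
            "InTransitEncryptionConfiguration": {
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": "s3://my-bucket/certs.zip",  # zipped certificates (placeholder)
                },
            },
        },
    }),
)
```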

EMR Cluster Types

  • EMR has two cluster types, transient and persistent
  • Transient EMR Clusters
    • Transient EMR clusters are clusters that shut down when the job or the steps (series of jobs) are complete
    • Transient EMR clusters can be used in situations
      • where the total number of EMR processing hours per day is < 24 hours and it’s beneficial to shut down the cluster when it’s not being used.
      • where S3, rather than HDFS, is used as the primary data storage.
      • where job processing is intensive, iterative data processing.
  • Persistent EMR Clusters
    • Persistent EMR clusters continue to run after the data processing job is complete
    • Persistent EMR clusters can be used in situations
      • frequently run processing jobs where it’s beneficial to keep the cluster running after the previous job.
      • processing jobs have an input-output dependency on one another.
      • In rare cases when it is more cost effective to store the data on HDFS instead of S3

EMR Best Practices

  • Data Migration
    • Two tools – S3DistCp and DistCp – can be used to move data stored on the local (data center) HDFS storage to S3, from S3 to HDFS, and from local disk (non-HDFS) to S3
    • AWS Import/Export and Direct Connect can also be considered for moving data
  • Data Collection
    • Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, & moving large amounts of log data
    • Flume agents can be installed on the data sources (web-servers, app servers etc) and data shipped to the collectors which can then be stored in persistent storage like S3 or HDFS
  • Data Aggregation
    • Data aggregation refers to techniques for gathering individual data records (for e.g. log records) and combining them into a large bundle of data files i.e. creating a large file from small files
    • Hadoop, on which EMR runs, generally performs better with fewer large files compared to many small files
    • Hadoop splits the file on HDFS on multiple nodes, while for the data in S3 it uses the HTTP Range header query to split the files which helps improve performance by supporting parallelization
    • Log collectors like Flume and Fluentd can be used to aggregate data before copying it to the final destination (S3 or HDFS)
    • Data aggregation has following benefits
      • Improves data ingest scalability by reducing the number of times needed to upload data to AWS
      • Reduces the number of files stored on S3 (or HDFS), which inherently helps provide better performance when processing data
      • Provides a better compression ratio as compressing large, highly compressible files is often more effective than compressing a large number of smaller files.
  • Data compression
    • Data compression can be used at the input as well as intermediate outputs from the mappers
    • Data compression helps
      • Lower storage costs
      • Lower bandwidth cost for data transfer
      • Better data processing performance by moving less data between data storage location, mappers, and reducers
      • Better data processing performance by compressing the data that EMR writes to disk, i.e. achieving better performance by writing to disk less frequently
    • Data compression can have an impact on the Hadoop data-splitting logic, as some compression techniques like gzip are not splittable
    • Data Compression Techniques
  • Data Partitioning
    • Data partitioning helps in data optimizations and lets you create unique buckets of data and eliminate the need for a data processing job to read the entire data set
    • Data can be partitioned by
      • Data type (time series)
      • Data processing frequency (per hour, per day, etc.)
      • Data access and query pattern (query on time vs. query on geo location)
  • Cost Optimization
    • AWS offers different pricing models for EC2 instances
      • On-Demand instances
        • are a good option if using transient EMR jobs or if the EMR hourly usage is less than 17% of the time
      • Reserved instances
        • are a good option for a persistent EMR cluster or if the EMR hourly usage is more than 17% of the time, as they are more cost-effective
      • Spot instances
        • can be a cost-effective mechanism to add compute capacity
        • can be used where the data persists on S3
        • can be used to add extra task capacity with Task nodes, and
        • are not suited for the Master node (if it is lost, the cluster is lost) or for Core nodes (they host data; if lost, the data needs to be recovered to rebalance the HDFS cluster)
    • A common architecture pattern can be used (sketched below):
      • Run master node on On-Demand or Reserved Instances (if running persistent EMR clusters).
      • Run a portion of the EMR cluster on core nodes using On-Demand or Reserved Instances and
      • the rest of the cluster on task nodes using Spot Instances.
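
A sketch of this pattern using boto3 instance groups follows; the instance types, counts, and Spot bid price are placeholder assumptions.

```python
import boto3

emr = boto3.client("emr")

emr.run_job_flow(
    Name="cost-optimized-cluster",
    ReleaseLabel="emr-5.36.0",  # placeholder release label
    Instances={
        "InstanceGroups": [
            # Master and Core on On-Demand, since losing them loses the cluster/data
            {"Name": "master", "InstanceRole": "MASTER",
             "Market": "ON_DEMAND", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE",
             "Market": "ON_DEMAND", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            # Task nodes on Spot: extra capacity that can be lost safely
            {"Name": "task", "InstanceRole": "TASK",
             "Market": "SPOT", "BidPrice": "0.10", "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```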

EMR – S3 vs HDFS

  • Storing data on S3 provides several benefits
    • inherent features like high availability, durability, lifecycle management, data encryption, and archival of data to Glacier
    • cost-effectiveness, as storing data in S3 is cheaper compared to HDFS with its replication factor
    • ability to use Transient EMR clusters and shut down the clusters after the job is completed, with data being maintained in S3
    • ability to use Spot instances without having to worry about losing them at any time
    • data durability against HDFS node failures, even where node failures exceed the HDFS replication factor
    • easier data ingestion, as ingesting a high-throughput data stream to S3 is much simpler than ingesting to HDFS

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You require the ability to analyze a large amount of data, which is stored on Amazon S3 using Amazon Elastic Map Reduce. You are using the cc2.8xlarge instance type, whose CPUs are mostly idle during processing. Which of the below would be the most cost-efficient way to reduce the runtime of the job? [PROFESSIONAL]
    1. Create smaller files on Amazon S3.
    2. Add additional cc2.8xlarge instances by introducing a task group.
    3. Use smaller instances that have higher aggregate I/O performance.
    4. Create fewer, larger files on Amazon S3.
  2. A customer’s nightly EMR job processes a single 2-TB data file stored on Amazon Simple Storage Service (S3). The Amazon Elastic Map Reduce (EMR) job runs on two On-Demand core nodes and three On-Demand task nodes. Which of the following may help reduce the EMR job completion time? Choose 2 answers [PROFESSIONAL]
    1. Use three Spot Instances rather than three On-Demand instances for the task nodes.
    2. Change the input split size in the MapReduce job configuration.
    3. Use a bootstrap action to present the S3 bucket as a local filesystem.
    4. Launch the core nodes and task nodes within an Amazon Virtual Private Cloud.
    5. Adjust the number of simultaneous mapper tasks.
    6. Enable termination protection for the job flow.
  3. Your department creates regular analytics reports from your company’s log files. All log data is collected in Amazon S3 and processed by daily Amazon Elastic Map Reduce (EMR) jobs that generate daily PDF reports and aggregated tables in CSV format for an Amazon Redshift data warehouse. Your CFO requests that you optimize the cost structure for this system. Which of the following alternatives will lower costs without compromising average performance of the system or data integrity for the raw data? [PROFESSIONAL]
    1. Use reduced redundancy storage (RRS) for PDF and CSV data in Amazon S3. Add Spot instances to Amazon EMR jobs. Use Reserved Instances for Amazon Redshift. (Only Spot instances impacts performance)
    2. Use reduced redundancy storage (RRS) for all data in S3. Use a combination of Spot instances and Reserved Instances for Amazon EMR jobs. Use Reserved Instances for Amazon Redshift (Combination of Spot and Reserved Instances will guarantee performance and help reduce cost. Also, RRS would reduce cost and guarantee data integrity, which is different from data durability)
    3. Use reduced redundancy storage (RRS) for all data in Amazon S3. Add Spot Instances to Amazon EMR jobs. Use Reserved Instances for Amazon Redshift (Only Spot instances impacts performance)
    4. Use reduced redundancy storage (RRS) for PDF and CSV data in S3. Add Spot Instances to EMR jobs. Use Spot Instances for Amazon Redshift. (Spot instances impacts performance and Spot instance not available for Redshift)
  4. A research scientist is planning for the one-time launch of an Elastic MapReduce cluster and is encouraged by her manager to minimize the costs. The cluster is designed to ingest 200TB of genomics data with a total of 100 Amazon EC2 instances and is expected to run for around four hours. The resulting data set must be stored temporarily until archived into an Amazon RDS Oracle instance. Which option will help save the most money while meeting requirements? [PROFESSIONAL]
    1. Store ingest and output files in Amazon S3. Deploy on-demand for the master and core nodes and spot for the task nodes.
    2. Optimize by deploying a combination of on-demand, RI and spot-pricing models for the master, core and task nodes. Store ingest and output files in Amazon S3 with a lifecycle policy that archives them to Amazon Glacier. (Master and Core must be RI or On Demand. Cannot be Spot)
    3. Store the ingest files in Amazon S3 RRS and store the output files in S3. Deploy Reserved Instances for the master and core nodes and on-demand for the task nodes. (Need better durability for ingest file. Spot instances can be used for task nodes for cost saving. RI will not provide cost saving in this case)
    4. Deploy on-demand master, core and task nodes and store ingest and output files in Amazon S3 RRS (Input should be in S3 standard, as re-ingesting the input data might end up being more costly than holding the data for a limited time in standard S3)
  5. Your company sells consumer devices and needs to record the first activation of all sold devices. Devices are not activated until the information is written on a persistent database. Activation data is very important for your company and must be analyzed daily with a MapReduce job. The execution time of the data analysis process must be less than three hours per day. Devices are usually sold evenly during the year, but when a new device model is out, there is a predictable peak in activations, that is, for a few days there are 10 times or even 100 times more activations than on an average day. Which of the following databases and analysis framework would you implement to better optimize costs and performance for this workload? [PROFESSIONAL]
    1. Amazon RDS and Amazon Elastic MapReduce with Spot instances.
    2. Amazon DynamoDB and Amazon Elastic MapReduce with Spot instances.
    3. Amazon RDS and Amazon Elastic MapReduce with Reserved instances.
    4. Amazon DynamoDB and Amazon Elastic MapReduce with Reserved instances

AWS ElastiCache – Certification

AWS ElastiCache

  • AWS ElastiCache is a managed web service that lets you easily deploy and run Memcached or Redis protocol-compliant cache clusters in the cloud
  • ElastiCache is available in two flavours: Memcached and Redis
  • ElastiCache helps
    • simplify and offload the management, monitoring, and operation of in-memory cache environments, enabling the engineering resources to focus on developing applications
    • automate common administrative tasks required to operate a distributed cache environment
    • improve the performance of web applications by allowing retrieval of information from a fast, managed, in-memory caching system, instead of relying entirely on slower disk-based databases
    • not only improve load & response times to user actions and queries, but also reduce the cost associated with scaling web applications
    • automatically detect and replace failed cache nodes, providing a resilient system that mitigates the risk of overloaded databases, which can slow website and application load times
    • provide enhanced visibility into key performance metrics associated with the cache nodes through integration with CloudWatch
  • Code, applications, and popular tools already using Memcached or Redis environments work seamlessly, as ElastiCache is protocol-compliant with Memcached and Redis
  • ElastiCache provides in-memory caching which can
    • significantly improve latency and throughput for many
      • read-heavy application workloads for e.g. social networking, gaming, media sharing and Q&A portals or
      • compute-intensive workloads such as a recommendation engine
    • improve application performance by storing critical pieces of data in memory for low-latency access.
    • be used to cache results of I/O-intensive database queries or the results of computationally-intensive calculations (a cache-aside sketch follows this list).
  • ElastiCache currently allows access only from the EC2 network and cannot be accessed from outside networks like on-premises servers
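
To make the caching idea concrete, here is an illustrative cache-aside sketch against a Redis endpoint using the redis-py client; the endpoint, key scheme, TTL, and fetch_from_db stub are all hypothetical.

```python
import json

import redis

# Placeholder ElastiCache Redis endpoint
cache = redis.Redis(host="my-cluster.abc123.0001.use1.cache.amazonaws.com", port=6379)

def fetch_from_db(user_id):
    """Stub for a slower disk-based database query (hypothetical)."""
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)        # fast in-memory lookup
    if cached is not None:
        return json.loads(cached)  # cache hit
    user = fetch_from_db(user_id)  # cache miss: fall back to the database
    cache.setex(key, 300, json.dumps(user))  # cache the result for 5 minutes
    return user
```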

AWS ElastiCache Redis vs Memcached

Redis

  • Redis is an open source, BSD licensed, advanced key-value cache & store
  • ElastiCache enables the management, monitoring and operation of a Redis node; creation, deletion and modification of the node
  • ElastiCache for Redis can be used as a primary in-memory key-value data store, providing fast, sub millisecond data performance, high availability and scalability up to 16 nodes plus up to 5 read replicas, each of up to 3.55 TiB of in-memory data
  • ElastiCache for Redis supports
    • Redis Master/Slave replication.
    • Multi-AZ operation by creating read replicas in another AZ
    • Backup and Restore feature for persistence by snapshotting
  • ElastiCache for Redis can be vertically scaled upwards by selecting a larger node type, however it cannot be scaled down
  • Parameter group can be specified for Redis during installation, which acts as a “container” for Redis configuration values that can be applied to one or more Redis primary clusters
  • Append Only File (AOF)
    • provides persistence and can be enabled for recovery scenarios
    • if a node restarts or the service crashes, Redis will replay the updates from an AOF file, thereby recovering the data lost due to the restart or crash
    • cannot protect against all failure scenarios, because if the underlying hardware fails, a new server would be provisioned and the AOF file will no longer be available to recover the data
    • enabling Redis Multi-AZ is a better approach to fault tolerance, as failing over to a read replica is much faster than rebuilding the primary from an AOF file

Redis Read Replica

  • Read Replicas help provide Read scaling and handling failures
  • Read Replicas are kept in sync with the Primary node using Redis’s asynchronous replication technology
  • Redis Read Replicas can help
    • Horizontal scaling beyond the compute or I/O capacity of a single primary node for read-heavy workloads.
    • Serving read traffic while the primary is unavailable either being down due to failure or maintenance
    • Data protection scenarios to promote a Read Replica as primary node, in case the primary node or the AZ of the primary node fails
  • ElastiCache supports initiated or forced failover where it flips the DNS record for the primary node to point at the read replica, which is in turn promoted to become the new primary
  • Read replicas cannot span regions; they may only be provisioned in the same or a different AZ of the same region as the primary cache node

Redis Multi-AZ

  • ElastiCache for Redis shard consists of a primary and up to 5 read replicas
  • Redis asynchronously replicates the data from the primary node to the read replicas
  • ElastiCache for Redis Multi-AZ mode
    • provides enhanced availability and reduced administration, as the node failover is automatic (a provisioning sketch follows this section)
    • impact on the ability to read/write to the primary is limited to the time it takes for automatic failover to complete
    • no longer needs monitoring of Redis nodes and manually initiating a recovery in the event of a primary node disruption
  • During certain types of planned maintenance, or in the unlikely event of ElastiCache node failure or AZ failure,
    • it automatically detects the failure,
    • selects the read replica with the smallest asynchronous replication lag to the primary and promotes it to become the new primary node
    • it will also propagate the DNS changes so that the primary endpoint remains the same
  • If Multi-AZ is not enabled,
    • ElastiCache monitors the primary node
    • in case the node becomes unavailable or unresponsive, it will repair the node by acquiring new service resources
    • it propagates the DNS endpoint changes to redirect the node’s existing DNS name to point to the new service resources.
    • If the primary node cannot be healed, you will have the choice to promote one of the read replicas to be the new primary
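
A provisioning sketch for such a Multi-AZ replication group via boto3, with placeholder identifiers, AZ names, and node type:

```python
import boto3

ec = boto3.client("elasticache")

ec.create_replication_group(
    ReplicationGroupId="my-redis-group",  # placeholder
    ReplicationGroupDescription="primary plus two read replicas",
    Engine="redis",
    CacheNodeType="cache.m5.large",       # placeholder node type
    NumCacheClusters=3,                   # 1 primary + 2 read replicas
    PreferredCacheClusterAZs=["us-east-1a", "us-east-1b", "us-east-1c"],  # spread across AZs
    AutomaticFailoverEnabled=True,        # promote a replica automatically on failure
)
```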

Redis Backup & Restore

  • Backup and Restore allows users to create snapshots of the Redis clusters
  • Snapshots can be used for recovery, restoration, archiving purpose or warm start an ElastiCache for Redis cluster with preloaded data
  • Snapshots are created on a per-cluster basis using Redis’ native mechanism, which creates and stores an RDB file as the snapshot
  • Increased latencies might be encountered at the node for a brief period while taking a snapshot; it is therefore recommended to take snapshots from a Read Replica, minimizing the performance impact (see the sketch below)
  • Snapshots can be created either automatically (if configured) or manually
  • When an ElastiCache for Redis cluster is deleted, its automatic snapshots are removed; manual snapshots, however, are retained
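
A minimal boto3 sketch of taking a manual snapshot from a read replica node, as suggested above; the identifiers are placeholders.

```python
import boto3

ec = boto3.client("elasticache")

ec.create_snapshot(
    CacheClusterId="my-redis-group-002",        # a read replica node (placeholder)
    SnapshotName="my-redis-backup-2024-01-01",  # placeholder snapshot name
)
```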

Memcached

  • Memcached is an in-memory key-value store for small chunks of arbitrary data
  • ElastiCache for Memcached can be used to cache a variety of objects
    • from content in persistent data stores (such as RDS, DynamoDB, or self-managed databases hosted on EC2) to
    • dynamically generated web pages (with Nginx for example), or
    • transient session data that may not require a persistent backing store
  • ElastiCache for Memcached
    • can be scaled Vertically by increasing the node type size
    • can be scaled Horizontally by adding and removing nodes
    • does not support persistence of data
  • An ElastiCache for Memcached cluster can have
    • nodes spanning multiple AZs within the same region
    • a maximum of 20 nodes per cluster, with a maximum of 100 nodes per region (a soft limit that can be extended)
  • ElastiCache for Memcached supports auto discovery, which enables automatic discovery of cache nodes by clients when they are added to or removed from an ElastiCache cluster (a cross-AZ provisioning sketch follows this section)
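
A sketch of provisioning a Memcached cluster whose nodes span AZs, with placeholder identifiers and AZ names:

```python
import boto3

ec = boto3.client("elasticache")

ec.create_cache_cluster(
    CacheClusterId="my-memcached",   # placeholder
    Engine="memcached",
    CacheNodeType="cache.m5.large",  # placeholder node type
    NumCacheNodes=4,
    AZMode="cross-az",               # spread nodes across AZs in the region
    PreferredAvailabilityZones=[     # one entry per node (placeholders)
        "us-east-1a", "us-east-1b", "us-east-1a", "us-east-1b",
    ],
)
```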

ElastiCache Mitigating Failures

  • ElastiCache deployments should be planned so that failures have a minimal impact upon your application and data
  • Mitigating Failures when Running Memcached
    • Mitigating Node Failures
      • spread the cached data over more nodes
      • as Memcached does not support replication, a node failure will always result in some data loss from the cluster
      • having more nodes will reduce the proportion of cache data lost
    • Mitigating Availability Zone Failures
      • locate the nodes in as many availability zones as possible; if an AZ fails, only the data cached in that AZ is lost, not the data cached in the other AZs
  • Mitigating Failures when Running Redis
    • Mitigating Cluster Failures
      • Redis Append Only Files (AOF)
        • enable AOF so whenever data is written to the Redis cluster, a corresponding transaction record is written to a Redis AOF
        • when the Redis process restarts, ElastiCache creates a replacement cluster, provisions it, and repopulates it with data from the AOF file
        • the recovery is time consuming
        • the AOF file can get big
        • using AOF cannot protect you from all failure scenarios
      • Redis Replication Groups
        • A Redis replication group is comprised of a single primary cluster which your application can both read from and write to, and from 1 to 5 read-only replica clusters.
        • Data written to the primary cluster is also asynchronously updated on the read replica clusters
        • When a Read Replica fails, ElastiCache detects the failure, replaces the instance in the same AZ and synchronizes with the Primary Cluster
        • With Redis Multi-AZ with Automatic Failover enabled, ElastiCache detects the primary cluster failure and promotes the read replica with the least replication lag to primary
        • If Multi-AZ with Auto Failover is disabled, ElastiCache detects the primary cluster failure, creates a new primary, and syncs the new primary with one of the existing replicas
    • Mitigating Availability Zone Failures
      • locate the clusters in as many availability zones as possible

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What does Amazon ElastiCache provide?
    1. A service by this name doesn’t exist. Perhaps you mean Amazon CloudCache.
    2. A virtual server with a huge amount of memory.
    3. A managed In-memory cache service
    4. An Amazon EC2 instance with the Memcached software already pre-installed.
  2. You are developing a highly available web application using stateless web servers. Which services are suitable for storing session state data? Choose 3 answers.
    1. Elastic Load Balancing
    2. Amazon Relational Database Service (RDS)
    3. Amazon CloudWatch
    4. Amazon ElastiCache
    5. Amazon DynamoDB
    6. AWS Storage Gateway
  3. Which statement best describes ElastiCache?
    1. Reduces the latency by splitting the workload across multiple AZs
    2. A simple web services interface to create and store multiple data sets, query your data easily, and return the results
    3. Offload the read traffic from your database in order to reduce latency caused by read-heavy workload
    4. Managed service that makes it easy to set up, operate and scale a relational database in the cloud
  4. Our company is getting ready to do a major public announcement of a social media site on AWS. The website is running on EC2 instances deployed across multiple Availability Zones with a Multi-AZ RDS MySQL Extra Large DB Instance. The site performs a high number of small reads and writes per second and relies on an eventual consistency model. After comprehensive tests you discover that there is read contention on RDS MySQL. Which are the best approaches to meet these requirements? (Choose 2 answers)
    1. Deploy ElastiCache in-memory cache running in each availability zone
    2. Implement sharding to distribute load to multiple RDS MySQL instances
    3. Increase the RDS MySQL Instance size and Implement provisioned IOPS
    4. Add an RDS MySQL read replica in each availability zone
  5. You are using ElastiCache Memcached to store session state and cache database queries in your infrastructure. You notice in CloudWatch that Evictions and Get Misses are both very high. What two actions could you take to rectify this? Choose 2 answers
    1. Increase the number of nodes in your cluster
    2. Tweak the max_item_size parameter
    3. Shrink the number of nodes in your cluster
    4. Increase the size of the nodes in the cluster
  6. You have been tasked with moving an ecommerce web application from a customer’s datacenter into a VPC. The application must be fault tolerant as well as highly scalable. Moreover, the customer is adamant that service interruptions not affect the user experience. As you near launch, you discover that the application currently uses multicast to share session state between web servers. In order to handle session state within the VPC, you choose to:
    1. Store session state in Amazon ElastiCache for Redis (scalable and makes the web applications stateless)
    2. Create a mesh VPN between instances and allow multicast on it
    3. Store session state in Amazon Relational Database Service (RDS solution not highly scalable)
    4. Enable session stickiness via Elastic Load Balancing (affects user experience if the instance goes down)
  7. When you are designing to support a 24-hour flash sale, which one of the following methods best describes a strategy to lower the latency while keeping up with unusually heavy traffic?
    1. Launch enhanced networking instances in a placement group to support the heavy traffic (only improves internal communication)
    2. Apply Service Oriented Architecture (SOA) principles instead of a 3-tier architecture (just simplifies architecture)
    3. Use Elastic Beanstalk to enable blue-green deployment (only minimizes download for applications and ease of rollback)
    4. Use ElastiCache as in-memory storage on top of DynamoDB to store user sessions (scalable, faster read/writes and in memory storage)
  8. You are configuring your company’s application to use Auto Scaling and need to move user state information. Which of the following AWS services provides a shared data store with durability and low latency?
    1. AWS ElastiCache Memcached (does not provide durability as if the node is gone the data is gone)
    2. Amazon Simple Storage Service
    3. Amazon EC2 instance storage
    4. Amazon DynamoDB
  9. Your application is using an ELB in front of an Auto Scaling group of web/application servers deployed across two AZs and a Multi-AZ RDS Instance for data persistence. The database CPU is often above 80% usage and 90% of I/O operations on the database are reads. To improve performance you recently added a single-node Memcached ElastiCache Cluster to cache frequent DB query results. In the next weeks the overall workload is expected to grow by 30%. Do you need to change anything in the architecture to maintain the high availability for the application with the anticipated additional load, and why?
    1. You should deploy two Memcached ElastiCache Clusters in different AZs because the RDS Instance will not be able to handle the load if the cache node fails.
    2. If the cache node fails the automated ElastiCache node recovery feature will prevent any availability impact. (does not provide high availability, as data is lost if the node is lost)
    3. Yes you should deploy the Memcached ElastiCache Cluster with two nodes in the same AZ as the RDS DB master instance to handle the load if one cache node fails. (Single AZ affects availability, as the DB is Multi-AZ and would be overloaded if the AZ goes down)
    4. No if the cache node fails you can always get the same data from the DB without having any availability impact. (Will overload the database affecting availability)
  10. A read-only news reporting site with a combined web and application tier and a database tier that receives large and unpredictable traffic demands must be able to respond to these traffic fluctuations automatically. What AWS services should be used to meet these requirements?
    1. Stateless instances for the web and application tier synchronized using ElastiCache Memcached in an autoscaling group monitored with CloudWatch and RDS with read replicas.
    2. Stateful instances for the web and application tier in an autoscaling group monitored with CloudWatch and RDS with read replicas (Stateful instances will not allow for scaling)
    3. Stateful instances for the web and application tier in an autoscaling group monitored with CloudWatch and multi-AZ RDS (Stateful instances will not allow for scaling & multi-AZ is for high availability and not scaling)
    4. Stateless instances for the web and application tier synchronized using ElastiCache Memcached in an autoscaling group monitored with CloudWatch and multi-AZ RDS (multi-AZ is for high availability and not scaling)
  11. You have written an application that uses the Elastic Load Balancing service to spread traffic to several web servers. Your users complain that they are sometimes forced to login again in the middle of using your application, after they have already logged in. This is not behavior you have designed. What is a possible solution to prevent this happening?
    1. Use instance memory to save session state.
    2. Use instance storage to save session state.
    3. Use EBS to save session state.
    4. Use ElastiCache to save session state.
    5. Use Glacier to save session state.