AWS Automated Backups – Certification

AWS Automated Backups

  • AWS allows automated backups for
    • RDS
    • ElastiCache – Redis only
    • Redshift
  • AWS does not perform automated backups for EC2 EBS volumes and needs to be manually scripted
  • AWS stores the backups and snapshots in S3

RDS Backups

  • RDS supports automated backups as well as manual snapshots
  • Automated Backups
    • enable point-in-time recovery of the DB Instance
    • perform a full daily backup and captures transaction logs (as updates to your DB instance are made
    • are performed during the defined preferred backup window and is retained for user-specified period of time called the retention period (default 1 day with a max of 35 days)
    • When a point-in-time recovery is initiated, transaction logs are applied to the most appropriate daily backup in order to restore the DB instance to the specific requested time.
    • allows a point-in-time restore and an ability to specify any second during the retention period, up to the Latest Restorable Time
    • are deleted when the DB instance is deleted
  • Snapshots
    • are user-initiated and enable to back up the DB instance in a known state as frequently as needed, and then restored to that specific state at any time.
    • can be created with the AWS Management Console or by using the CreateDBSnapshot API call.
    • are not deleted when the DB instance is deleted
  • Automated backups and snapshots can result in a performance hit, if Multi-AZ is not enabled

ElastiCache Automated Backups

  • ElastiCache supports Automated backups for Redis cluster only
  • ElastiCache creates a backup of the cluster on a daily basis
  • Snapshot will degrade performance, so should be performed during least bust part of the day
  • Backups are performed during the Backup period and retained for backup retention limit defined, with a maximum of 35 days
  • ElastiCache also allows manual snapshots of the cluster

Redshift Automated Backups

  • Amazon Redshift enables automated backups, by default
  • Redshift replicates all the data within your data warehouse cluster when it is loaded and also continuously backs up the data to S3
  • Redshift retains backups for 1 day which can be extended to max 35 days
  • Redshift only backs up data that has changed and are incremental so most snapshots use up a small amount of storage
  • Redshift also allows manual snapshots of the data warehouse

EC2 EBS Backups

  • EBS does not provide automated backups
  • EBS snapshots can be created by using the AWS Management Console, the command line interface (CLI), or the APIs
  • Backups degrade performance
  • Stored on S3
  • EBS Snapshots are incremental and block-based, and they consume space only for changed data after the initial snapshot is created
  • Data can be restored from snapshots by created a volume from the snapshot
  • EBS snapshots are region specific and can be copied between AWS regions

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Which two AWS services provide out-of-the-box user configurable automatic backup-as-a-service and backup rotation options? Choose 2 answers
    1. Amazon S3
    2. Amazon RDS
    3. Amazon EBS
    4. Amazon Redshift
  2. You have been asked to automate many routine systems administrator backup and recovery activities. Your current plan is to leverage AWS-managed solutions as much as possible and automate the rest with the AWS CLI and scripts. Which task would be best accomplished with a script?
    1. Creating daily EBS snapshots with a monthly rotation of snapshots
    2. Creating daily RDS snapshots with a monthly rotation of snapshots
    3. Automatically detect and stop unused or underutilized EC2 instances
    4. Automatically add Auto Scaled EC2 instances to an Amazon Elastic Load Balancer

AWS Billing and Cost Management – Certification

AWS Billing and Cost Management

  • AWS Billing and Cost Management is the service that you use to pay AWS bill, monitor your usage, and budget your costs

Analyzing Costs with Graphs

  • AWS provides Cost Explorer tool which allows filter graphs by API operations, Availability Zones, AWS service, custom cost allocation tags, EC2 instance type, purchase options, region, usage type, usage type groups, or, if Consolidated Billing used, by linked account.

Budgets

  • Budgets can be used to track AWS costs to see usage-to-date and current estimated charges from AWS
  • Budgets use the cost visualization provided by Cost Explorer to show the status of the budgets and to provide forecasts of your estimated costs.
  • Budgets can be used to create CloudWatch alarms that notify when you go over your budgeted amounts, or when the estimated costs exceed budgets
  • Notifications can be sent to an SNS topic and to email addresses associated with your budget notification

Cost Allocation Tags

  • Tags can be used to organize AWS resources, and cost allocation tags to track the AWS costs on a detailed level.
  • Upon cost allocation tags activation, AWS uses the cost allocation tags to organize the resource costs on the cost allocation report making it easier to categorize and track your AWS costs.
  • AWS provides two types of cost allocation tags,
    • an AWS-generated tag AWS defines, creates, and applies the AWS-generated tag for you,
    • and user-defined tags that you define, create,
  • Both types of tags must be activated separately before they can appear in Cost Explorer or on a cost allocation report

Alerts on Cost Limits

  • CloudWatch can be used to create billing alerts when the AWS costs exceed specified thresholds
  • When the usage exceeds threshold amounts, AWS sends an email notification

Consolidated Billing

Refer to My Blog Post about Consolidated Billing

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. An organization is using AWS since a few months. The finance team wants to visualize the pattern of AWS spending. Which of the below AWS tool will help for this requirement?
    • AWS Cost Manager
    • AWS Cost Explorer (Check Cost Explorer)
    • AWS CloudWatch
    • AWS Consolidated Billing (Will not help visualize)
  2. Your company wants to understand where cost is coming from in the company’s production AWS account. There are a number of applications and services running at any given time. Without expending too much initial development time, how best can you give the business a good understanding of which applications cost the most per month to operate?
    1. Create an automation script, which periodically creates AWS Support tickets requesting detailed intra-month information about your bill.
    2. Use custom CloudWatch Metrics in your system, and put a metric data point whenever cost is incurred.
    3. Use AWS Cost Allocation Tagging for all resources, which support it. Use the Cost Explorer to analyze costs throughout the month. (Refer link)
    4. Use the AWS Price API and constantly running resource inventory scripts to calculate total price based on multiplication of consumed resources over time.
  3. You need to know when you spend $1000 or more on AWS. What’s the easy way for you to see that notification?
    1. AWS CloudWatch Events tied to API calls, when certain thresholds are exceeded, publish to SNS.
    2. Scrape the billing page periodically and pump into Kinesis.
    3. AWS CloudWatch Metrics + Billing Alarm + Lambda event subscription. When a threshold is exceeded, email the manager.
    4. Scrape the billing page periodically and publish to SNS.
  4. A user is planning to use AWS services for his web application. If the user is trying to set up his own billing management system for AWS, how can he configure it?
    1. Set up programmatic billing access. Download and parse the bill as per the requirement
    2. It is not possible for the user to create his own billing management service with AWS
    3. Enable the AWS CloudWatch alarm which will provide APIs to download the alarm data
    4. Use AWS billing APIs to download the usage report of each service from the AWS billing console
  5. An organization is setting up programmatic billing access for their AWS account. Which of the below mentioned services is not required or enabled when the organization wants to use programmatic access?
    1. Programmatic access
    2. AWS bucket to hold the billing report
    3. AWS billing alerts
    4. Monthly Billing report
  6. A user has setup a billing alarm using CloudWatch for $200. The usage of AWS exceeded $200 after some days. The user wants to increase the limit from $200 to $400? What should the user do?
    1. Create a new alarm of $400 and link it with the first alarm
    2. It is not possible to modify the alarm once it has crossed the usage limit
    3. Update the alarm to set the limit at $400 instead of $200 (Refer link)
    4. Create a new alarm for the additional $200 amount
  7. A user is trying to configure the CloudWatch billing alarm. Which of the below mentioned steps should be performed by the user for the first time alarm creation in the AWS Account Management section?
    1. Enable Receiving Billing Reports
    2. Enable Receiving Billing Alerts
    3. Enable AWS billing utility
    4. Enable CloudWatch Billing Threshold

References

AWS_Billing_&_Cost_Management – User_Guide

AWS RDS Monitoring & Notification – Certification

AWS RDS Monitoring & Notification

  • RDS integrates with CloudWatch and provides metrics for monitoring
  • CloudWatch alarms can be created over a single metric that sends an SNS message when the alarm changes state
  • RDS also provides SNS notification whenever any RDS event occurs

CloudWatch RDS Monitoring

  • RDS DB instance can be monitored using CloudWatch, which collects and processes raw data from RDS into readable, near real-time metrics.
  • The statistics are recorded for a period of two weeks, so that you can access historical information and gain a better perspective on how the service is performing.
  • By default, RDS metric data is automatically sent to Amazon CloudWatch in 1-minute periods
  • CloudWatch RDS Metrics
    • BinLogDiskUsage – Amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.
    • CPUUtilization – Percentage of CPU utilization.
    • DatabaseConnections – Number of database connections in use.
    • DiskQueueDepth – The number of outstanding IOs (read/write requests) waiting to access the disk.
    • FreeableMemory – Amount of available random access memory.
    • FreeStorageSpace – Amount of available storage space.
    • ReplicaLag – Amount of time a Read Replica DB instance lags behind the source DB instance. Applies to MySQL, MariaDB, and PostgreSQL Read Replicas.
    • SwapUsage – Amount of swap space used on the DB instance.
    • ReadIOPS – Average number of disk I/O operations per second.
    • WriteIOPS – Average number of disk I/O operations per second.
    • ReadLatency – Average amount of time taken per disk I/O operation.
    • WriteLatency – Average amount of time taken per disk I/O operation.
    • ReadThroughput – Average number of bytes read from disk per second.
    • WriteThroughput – Average number of bytes written to disk per second.
    • NetworkReceiveThroughput – Incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
    • NetworkTransmitThroughput – Outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

RDS Event Notification

  • RDS uses the SNS to provide notification when an RDS event occurs
  • RDS groups the events into categories, which can be subscribed so that a notification is sent when an event in that category occurs.
  • Event category for a DB instance, DB cluster, DB snapshot, DB cluster snapshot, DB security group or for a DB parameter group can be subscribed
  • Event notifications are sent to the email addresses provided during subscription creation
  • Subscription can be easily turn off notification without deleting a subscription by setting the Enabled radio button to No in the RDS console or by setting the Enabled parameter to false using the CLI or RDS API.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You run a web application with the following components Elastic Load Balancer (ELB), 3 Web/Application servers, 1 MySQL RDS database with read replicas, and Amazon Simple Storage Service (Amazon S3) for static content. Average response time for users is increasing slowly. What three CloudWatch RDS metrics will allow you to identify if the database is the bottleneck? Choose 3 answers
    1. The number of outstanding IOs waiting to access the disk
    2. The amount of write latency
    3. The amount of disk space occupied by binary logs on the master.
    4. The amount of time a Read Replica DB Instance lags behind the source DB Instance
    5. The average number of disk I/O operations per second.
  2. Typically, you want your application to check whether a request generated an error before you spend any time processing results. The easiest way to find out if an error occurred is to look for an __________ node in the response from the Amazon RDS API.
    1. Incorrect
    2. Error
    3. FALSE
  3. In the Amazon CloudWatch, which metric should I be checking to ensure that your DB Instance has enough free storage space?
    1. FreeStorage
    2. FreeStorageSpace
    3. FreeStorageVolume
    4. FreeDBStorageSpace
  4. A user is receiving a notification from the RDS DB whenever there is a change in the DB security group. The user does not want to receive these notifications for only a month. Thus, he does not want to delete the notification. How can the user configure this?
    1. Change the Disable button for notification to “Yes” in the RDS console
    2. Set the send mail flag to false in the DB event notification console
    3. The only option is to delete the notification from the console
    4. Change the Enable button for notification to “No” in the RDS console
  5. A sys admin is planning to subscribe to the RDS event notifications. For which of the below mentioned source categories the subscription cannot be configured?
    1. DB security group
    2. DB snapshot
    3. DB options group
    4. DB parameter group
  6. A user is planning to setup notifications on the RDS DB for a snapshot. Which of the below mentioned event categories is not supported by RDS for this snapshot source type?
    1. Backup (Refer link)
    2. Creation
    3. Deletion
    4. Restoration
  7. A system admin is planning to setup event notifications on RDS. Which of the below mentioned services will help the admin setup notifications?
    1. AWS SES
    2. AWS Cloudtrail
    3. AWS CloudWatch
    4. AWS SNS
  8. A user has setup an RDS DB with Oracle. The user wants to get notifications when someone modifies the security group of that DB. How can the user configure that?
    1. It is not possible to get the notifications on a change in the security group
    2. Configure SNS to monitor security group changes
    3. Configure event notification on the DB security group
    4. Configure the CloudWatch alarm on the DB for a change in the security group
  9. It is advised that you watch the Amazon CloudWatch “_____” metric (available via the AWS Management Console or Amazon Cloud Watch APIs) carefully and recreate the Read Replica should it fall behind due to replication errors.
    1. Write Lag
    2. Read Replica
    3. Replica Lag
    4. Single Replica

AWS RDS Security – Certification

AWS RDS Security

  • AWS provides multiple features to provide RDS security
    • DB instance can be hosted in a VPC for the greatest possible network access control
    • IAM policies can be used to assign permissions that determine who is allowed to manage RDS resources
    • Security groups allow to control what IP addresses or EC2 instances can connect to the databases on a DB instance
    • Secure Socket Layer (SSL) connections with DB instances
    • RDS encryption to secure RDS instances and snapshots at rest.
    • Network encryption and transparent data encryption (TDE) with Oracle DB instances

RDS Authentication and Access Control

  • IAM can be used to control which RDS operations each individual user has permission to call

Encrypting RDS Resources

  • RDS encrypted instances use the industry standard AES-256 encryption algorithm to encrypt data on the server that hosts the RDS instance
  • RDS then handles authentication of access and decryption of this data with a minimal impact on performance, and with no need to modify your database client applications
  • Data at Rest Encryption
    • can be enabled on RDS instances to encrypt the underlying storage
    • encryption keys are managed by KMS
    • can be enabled only during instance creation
    • once enabled, the encryption keys cannot be changed
    • if the key is lost, the DB can only be restored from the backup
  • Once encryption is enabled for an RDS instance,
    • logs are encrypted
    • snapshots are encrypted
    • automated backups are encrypted
    • read replicas are encrypted
  • Cross region replicas and snapshots copy does not work since the key is only available in a single region
  • RDS DB Snapshot considerations
    • DB snapshot encrypted using an KMS encryption key can be copied
    • Copying an encrypted DB snapshot, results in an encrypted copy of the DB snapshot
    • When copying, DB snapshot can either be encrypted with the same KMS encryption key as the original DB snapshot, or a different KMS encryption key to encrypt the copy of the DB snapshot.
    • An unencrypted DB snapshot can be copied to an encrypted snapshot, a quick way to add encryption to a previously encrypted DB instance.
    • Encrypted snapshot can be restored only to an encrypted DB instance
    • If a KMS encryption key is specified when restoring from an unencrypted DB cluster snapshot, the restored DB cluster is encrypted using the specified KMS encryption key
    • Copying an encrypted snapshot shared from another AWS account, requires access to the KMS encryption key used to encrypt the DB snapshot.
    • Because KMS encryption keys are specific to the region that they are created in, encrypted snapshot cannot be copied to another region
  • Transparent Data Encryption (TDE)
    • Automatically encrypts the data before it is written to the underlying storage device and decrypts when it is read  from the storage device
    • is supported by Oracle and SQL Server
      • Oracle requires key storage outside of the KMS and integrates with CloudHSM for this
      • SQL Server requires a key but is managed by RDS

SSL to Encrypt a Connection to a DB Instance

  • Encrypt connections using SSL for data in transit between the applications and the DB instance
  • Amazon RDS creates an SSL certificate and installs the certificate on the DB instance when RDS provisions the instance.
  • SSL certificates are signed by a certificate authority. SSL certificate includes the DB instance endpoint as the Common Name (CN) for the SSL certificate to guard against spoofing attacks
  • While SSL offers security benefits, be aware that SSL encryption is a compute-intensive operation and will increase the latency of the database connection.

RDS Security Groups

  • Security groups control the access that traffic has in and out of a DB instance
  • VPC security groups act like a firewall controlling network access to your DB instance.
  • VPC security groups can be configured and associated with the DB instance to allow access from an IP address range, port, or EC2 security group
  • Database security groups default to a “deny all” access mode and customers must specifically authorize network ingress.

Master User Account Privileges

  • When you create a new DB instance, the default master user that used gets certain privileges for that DB instance
  • Subsequently, other users with permissions can be created

Event Notification

  • Event notifications can be configured for important events that occur on the DB instance
  • Notifications of a variety of important events that can occur on the RDS instance, such as whether the instance was shut down, a backup was started, a failover occurred, the security group was changed, or your storage space is low can be received

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Can I encrypt connections between my application and my DB Instance using SSL?
    1. No
    2. Yes
    3. Only in VPC
    4. Only in certain regions
  2. Which of these configuration or deployment practices is a security risk for RDS?
    1. Storing SQL function code in plaintext
    2. Non-Multi-AZ RDS instance
    3. Having RDS and EC2 instances exist in the same subnet
    4. RDS in a public subnet (Making RDS accessible to the public internet in a public subnet poses a security risk, by making your database directly addressable and spammable. DB instances deployed within a VPC can be configured to be accessible from the Internet or from EC2 instances outside the VPC. If a VPC security group specifies a port access such as TCP port 22, you would not be able to access the DB instance because the firewall for the DB instance provides access only via the IP addresses specified by the DB security groups the instance is a member of and the port defined when the DB instance was created. Refer link)

References

AWS_RDS_User_Guide – Security

 

AWS Blue Green Deployment – Certification

AWS Blue Green Deployment

  • Blue/green deployments provide near zero-downtime release and rollback capabilities.
  • Blue/green deployment works by shifting traffic between two identical environments that are running different versions of the application
    • Blue environment represents the current application version serving production traffic.
    • In parallel, the green environment is staged running a different version of your application.
    • After the green environment is ready and tested, production traffic is redirected from blue to green.
    • If any problems are identified, you can roll back by reverting traffic back to the blue environment.

NOTE: Advanced Topic required for DevOps Professional Exam Only

AWS Services

Route 53

  • Route 53 is a highly available and scalable authoritative DNS service that route user requests
  • Route 53 with its DNS service allows administrators to direct traffic by simply updating DNS records in the hosted zone
  • TTL can be adjusted for resource records to be shorter which allow record changes to propagate faster to clients

Elastic Load Balancing

  • Elastic Load Balancing distributes incoming application traffic across EC2 instances
  • Elastic Load Balancing scales in response to incoming requests, performs health checking against Amazon EC2 resources, and naturally integrates with other AWS tools, such as Auto Scaling.
  • ELB also helps perform health checks of EC2 instances to route traffic only to the healthy instances

Auto Scaling

  • Auto Scaling allows different versions of launch configuration, which define templates used to launch EC2 instances, to be attached to an Auto Scaling group to enable blue/green deployment.
  • Auto Scaling’s termination policies and Standby state enable blue/green deployment
    • Termination policies in Auto Scaling groups to determine which EC2 instances to remove during a scaling action.
    • Auto Scaling also allows instances to be placed in Standby state, instead of termination, which helps with quick rollback when required
  • Auto Scaling with Elastic Load Balancing can be used to balance and scale the traffic

Elastic Beanstalk

  • Elastic Beanstalk makes it easy to run multiple versions of the application and provides capabilities to swap the environment URLs, facilitating blue/green deployment.
  • Elastic Beanstalk supports Auto Scaling and Elastic Load Balancing, both of which enable blue/green deployment

OpsWorks

  • OpsWorks has the concept of stacks, which are logical groupings of AWS resources with a common purpose & should be logically managed together
  • Stacks are made of one or more layers with each layer represents a set of EC2 instances that serve a particular purpose, such as serving applications or hosting a database server.
  • OpsWorks simplifies cloning entire stacks when preparing for blue/green environments.

CloudFormation

  • CloudFormation helps describe the AWS resources through JSON formatted templates and provides automation capabilities for provisioning blue/green environments and facilitating updates to switch traffic, whether through Route 53 DNS, Elastic Load Balancing, etc
  • CloudFormation provides infrastructure as code strategy, where infrastructure is provisioned and managed using code and software development techniques, such as version control and continuous integration, in a manner similar to how application code is treated

CloudWatch

  • CloudWatch monitoring can provide early detection of application health in blue/green deployments

Deployment Techniques

DNS Routing using Route 53

  • Route 53 DNS service can help switch traffic from the blue environment to the green and vice versa, if rollback is necessary
  • Route 53 can help either switch the traffic completely or through a weighted distribution
  • Weighted distribution
    • helps distribute percentage of traffic to go to the green environment and gradually update the weights until the green environment carries the full production traffic
    • provides the ability to perform canary analysis where a small percentage of production traffic is introduced to a new environment
    • helps manage cost by using auto scaling for instances to scale based on the actual demand
  • Route 53 can handle Public or Elastic IP address, Elastic Load Balancer, Elastic Beanstalk environment web tiers etc.

Auto Scaling Group Swap Behind Elastic Load Balancer

AWS Blue Green Deployment - Auto Scaling Group

  • Elastic Load Balancing with Auto Scaling to manage EC2 resources as per the demand can be used for Blue Green deployments
  • Multiple Auto Scaling groups can be attached to the Elastic Load Balancer
  • Green ASG can be attached to an existing ELB while Blue ASG is already attached to the ELB to serve traffic
  • ELB would start routing requests to the Green Group as for HTTP/S listener it uses a least outstanding requests routing algorithm
  • Green group capacity can be increased to process more traffic while the Blue group capacity can be reduced either by terminating the instances or by putting the instances in a standby mode
  • Standby is a good option because if roll back to the blue environment needed, blue server instances can be put back in service and they’re ready to go
  • If no issues with the Green group, the blue group can be decommissioned by adjusting the group size to zero

Update Auto Scaling Group Launch Configurations

AWS Blue Green Deployment - Auto Scaling Launch

  • Auto Scaling groups have their own launch configurations which define template for EC2 instances to be launched
  • Auto Scaling group can have only one launch configuration at a time, and it can’t be modified. If needs modification, a new launch configuration can be created and attached to the existing Auto Scaling Group
  • After a new launch configuration is in place, any new instances that are launched use the new launch configuration parameters, but existing instances are not affected.
  • When Auto Scaling removes instances (referred to as scaling in) from the group, the default termination policy is to remove instances with the oldest launch configuration
  • To deploy the new version of the application in the green environment, update the Auto Scaling group with the new launch configuration, and then scale the Auto Scaling group to twice its original size.
  • Then, shrink the Auto Scaling group back to the original size
  • To perform a rollback, update the Auto Scaling group with the old launch configuration. Then, do the preceding steps in reverse

Elastic Beanstalk Application Environment Swap

AWS Blue Green Deployment - Elastic Beanstalk

  • Elastic Beanstalk multiple environment and environment url swap feature helps enable Blue Green deployment
  • Elastic Beanstalk can be used to host the blue environment exposed via URL to access the environment
  • Elastic Beanstalk provides several deployment policies, ranging from policies that perform an in-place update on existing instances, to immutable deployment using a set of new instances.
  • Elastic Beanstalk performs an in-place update when the application versions are updated, however application may become unavailable to users for a short period of time.
  • To avoid the downtime, a new version can be deployed to a separate Green environment with its own URL, launched with the existing environment’s configuration
  • Elastic Beanstalk’s Swap Environment URLs feature can be used to promote the green environment to serve production traffic
  • Elastic Beanstalk performs a DNS switch, which typically takes a few minutes
  • To perform a rollback, invoke Swap Environment URL again.

Clone a Stack in AWS OpsWorks and Update DNS

  • OpsWorks can be used to create
    • Blue environment stack with the current version of the application and serving production traffic
    • Green environment stack with the newer version of the application and is not receiving any traffic
  • To promote to the green environment/stack into production, update DNS records to point to the green environment/stack’s load balancer

AWS Blue Green deployment patterns

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What is server immutability?
    1. Not updating a server after creation. (During the new release, a new set of EC2 instances are rolled out by terminating older instances and are disposable. EC2 instance usage is considered temporary or ephemeral in nature for the period of deployment until the current release is active)
    2. The ability to change server counts.
    3. Updating a server after creation.
    4. The inability to change server counts.
  2. You need to deploy a new application version to production. Because the deployment is high-risk, you need to roll the new version out to users over a number of hours, to make sure everything is working correctly. You need to be able to control the proportion of users seeing the new version of the application down to the percentage point. You use ELB and EC2 with Auto Scaling Groups and custom AMIs with your code pre-installed assigned to Launch Configurations. There are no database-level changes during your deployment. You have been told you cannot spend too much money, so you must not increase the number of EC2 instances much at all during the deployment, but you also need to be able to switch back to the original version of code quickly if something goes wrong. What is the best way to meet these requirements?
    1. Create a second ELB, Auto Scaling Launch Configuration, and Auto Scaling Group using the Launch Configuration. Create AMIs with all code pre-installed. Assign the new AMI to the second Auto Scaling Launch Configuration. Use Route53 Weighted Round Robin Records to adjust the proportion of traffic hitting the two ELBs. (Use Weighted Round Robin DNS Records and reverse proxies allow such fine-grained tuning of traffic splits. Blue-Green option does not meet the requirement that we mitigate costs and keep overall EC2 fleet size consistent, so we must select the 2 ELB and ASG option with WRR DNS tuning)
    2. Use the Blue-Green deployment method to enable the fastest possible rollback if needed. Create a full second stack of instances and cut the DNS over to the new stack of instances, and change the DNS back if a rollback is needed. (Full second stack is expensive)
    3. Create AMIs with all code pre-installed. Assign the new AMI to the Auto Scaling Launch Configuration, to replace the old one. Gradually terminate instances running the old code (launched with the old Launch Configuration) and allow the new AMIs to boot to adjust the traffic balance to the new code. On rollback, reverse the process by doing the same thing, but changing the AMI on the Launch Config back to the original code. (Cannot modify the existing launch config)
    4. Migrate to use AWS Elastic Beanstalk. Use the established and well-tested Rolling Deployment setting AWS provides on the new Application Environment, publishing a zip bundle of the new code and adjusting the wait period to spread the deployment over time. Re-deploy the old code bundle to rollback if needed.
  3. When thinking of AWS Elastic Beanstalk, the ‘Swap Environment URLs’ feature most directly aids in what?
    1. Immutable Rolling Deployments
    2. Mutable Rolling Deployments
    3. Canary Deployments
    4. Blue-Green Deployments (Complete switch from one environment to other)
  4. You were just hired as a DevOps Engineer for a startup. Your startup uses AWS for 100% of their infrastructure. They currently have no automation at all for deployment, and they have had many failures while trying to deploy to production. The company has told you deployment process risk mitigation is the most important thing now, and you have a lot of budget for tools and AWS resources. Their stack: 2-tier API Data stored in DynamoDB or S3, depending on type, Compute layer is EC2 in Auto Scaling Groups, They use Route53 for DNS pointing to an ELB, An ELB balances load across the EC2 instances. The scaling group properly varies between 4 and 12 EC2 servers. Which of the following approaches, given this company’s stack and their priorities, best meets the company’s needs?
    1. Model the stack in AWS Elastic Beanstalk as a single Application with multiple Environments. Use Elastic Beanstalk’s Rolling Deploy option to progressively roll out application code changes when promoting across environments. (Does not support DynamoDB also need Blue Green deployment for zero downtime deployment as cost is not a constraint)
    2. Model the stack in 3 CloudFormation templates: Data layer, compute layer, and networking layer. Write stack deployment and integration testing automation following Blue-Green methodologies.
    3. Model the stack in AWS OpsWorks as a single Stack, with 1 compute layer and its associated ELB. Use Chef and App Deployments to automate Rolling Deployment. (Does not support DynamoDB also need Blue Green deployment for zero downtime deployment as cost is not a constraint)
    4. Model the stack in 1 CloudFormation template, to ensure consistency and dependency graph resolution. Write deployment and integration testing automation following Rolling Deployment methodologies. (Need Blue Green deployment for zero downtime deployment as cost is not a constraint)
  5. You are building out a layer in a software stack on AWS that needs to be able to scale out to react to increased demand as fast as possible. You are running the code on EC2 instances in an Auto Scaling Group behind an ELB. Which application code deployment method should you use?
    1. SSH into new instances those come online, and deploy new code onto the system by pulling it from an S3 bucket, which is populated by code that you refresh from source control on new pushes. (is slow and manual)
    2. Bake an AMI when deploying new versions of code, and use that AMI for the Auto Scaling Launch Configuration. (Pre baked AMIs can help to get started quickly)
    3. Create a Dockerfile when preparing to deploy a new version to production and publish it to S3. Use UserData in the Auto Scaling Launch configuration to pull down the Dockerfile from S3 and run it when new instances launch. (is slow)
    4. Create a new Auto Scaling Launch Configuration with UserData scripts configured to pull the latest code at all times. (is slow)
  6. You company runs a complex customer relations management system that consists of around 10 different software components all backed by the same Amazon Relational Database (RDS) database. You adopted AWS OpsWorks to simplify management and deployment of that application and created an AWS OpsWorks stack with layers for each of the individual components. An internal security policy requires that all instances should run on the latest Amazon Linux AMI and that instances must be replaced within one month after the latest Amazon Linux AMI has been released. AMI replacements should be done without incurring application downtime or capacity problems. You decide to write a script to be run as soon as a new Amazon Linux AMI is released. Which solutions support the security policy and meet your requirements? Choose 2 answers
    1. Assign a custom recipe to each layer, which replaces the underlying AMI. Use AWS OpsWorks life-cycle events to incrementally execute this custom recipe and update the instances with the new AMI.
    2. Create a new stack and layers with identical configuration, add instances with the latest Amazon Linux AMI specified as a custom AMI to the new layer, switch DNS to the new stack, and tear down the old stack. (Blue-Green Deployment)
    3. Identify all Amazon Elastic Compute Cloud (EC2) instances of your AWS OpsWorks stack, stop each instance, replace the AMI ID property with the ID of the latest Amazon Linux AMI ID, and restart the instance. To avoid downtime, make sure not more than one instance is stopped at the same time.
    4. Specify the latest Amazon Linux AMI as a custom AMI at the stack level, terminate instances of the stack and let AWS OpsWorks launch new instances with the new AMI.
    5. Add new instances with the latest Amazon Linux AMI specified as a custom AMI to all AWS OpsWorks layers of your stack, and terminate the old ones.
  7. Your company runs an event management SaaS application that uses Amazon EC2, Auto Scaling, Elastic Load Balancing, and Amazon RDS. Your software is installed on instances at first boot, using a tool such as Puppet or Chef, which you also use to deploy small software updates multiple times per week. After a major overhaul of your software, you roll out version 2.0 new, much larger version of the software of your running instances. Some of the instances are terminated during the update process. What actions could you take to prevent instances from being terminated in the future? (Choose two)
    1. Use the zero downtime feature of Elastic Beanstalk to deploy new software releases to your existing instances. (No such feature, you can perform environment url swap)
    2. Use AWS CodeDeploy. Create an application and a deployment targeting the Auto Scaling group. Use CodeDeploy to deploy and update the application in the future. (Refer link)
    3. Run “aws autoscaling suspend-processes” before updating your application. (Refer link)
    4. Use the AWS Console to enable termination protection for the current instances. (Termination protection does not work with Auto Scaling)
    5. Run “aws autoscaling detach-load-balancers” before updating your application. (Does not prevent Auto Scaling to terminate the instances)

References

AWS Blue/Green Deployment Whitepaper

AWS Solution Architect Professional Certification Exam Preparation Guide

I recently cleared the AWS Solution Architect Professional Certification Exam with 93% after almost 2 months of preparation

Topic Level Scoring:
1.0 High Availability and Business Continuity: 100%
2.0 Costing: 75%
3.0 Deployment Management: 100%
4.0 Network Design: 85%
5.0 Data Storage: 90%
6.0 Security: 92%
7.0 Scalability & Elasticity: 100%
8.0 Cloud Migration & Hybrid Architecture: 85%

NOTE: if you are looking for Solution Architect & SysOps Associate preparation guide please check my blog post

Solution Architect – Professional exam is quite an exhaustive exam with 77 questions in 180 minutes and covers a lot of AWS services and the combinations how they work and integrate together. However, the questions are bit old and has not kept pace with the fast changing AWS enhancements

Exam covers a lot of topics and their integration with other AWS Services, would suggest to know

AWS Certification Exam Cheat Sheet

AWS Certification Exam Cheat Sheet

AWS Certification Exams cover a lot of topics and a wide range of services with minute details for features, patterns, anti patterns and their integration with other services. This blog post is just to have a quick summary of all the services and key points for a quick glance before you appear for the exam

AWS Region, AZs, Edge locations

  • Each region is a separate geographic area, completely independent, isolated from the other regions & helps achieve the greatest possible fault tolerance and stability
  • Communication between regions is across the public Internet
  • Each region has multiple Availability Zones
  • Each AZ is physically isolated, geographically separated from each other and designed as an independent failure zone
  • AZs are connected with low-latency private links (not public internet)
  • Edge locations are locations maintained by AWS through a worldwide network of data centers for the distribution of content to reduce latency.

AWS Services

Consolidate Billing

  • Paying account with multiple linked accounts
  • Paying account is independent and should be only used for billing purpose
  • Paying account cannot access resources of other accounts unless given exclusively access through Cross Account roles
  • All linked accounts are independent and soft limit of 20
  • One bill per AWS account
  • provides Volume pricing discount for usage across the accounts
  • allows unused Reserved Instances to be applied across the group
  • Free tier is not applicable across the accounts

Tags & Resource Groups

  • are metadata, specified as key/value pairs with the AWS resources
  • are for labelling purposes and helps managing, organizing resources
  • can be inherited when created resources created from Auto Scaling, Cloud Formation, Elastic Beanstalk etc
  • can be used for
    • Cost allocation to categorize and track the AWS costs
    • Conditional Access Control policy to define permission to allow or deny access on resources based on tags
  • Resource Group is a collection of resources that share one or more tags

IDS/IPS

  • Promiscuous mode is not allowed, as AWS and Hypervisor will not deliver any traffic to instances this is not specifically addressed to the instance
  • IDS/IPS strategies
    • Host Based Firewall – Forward Deployed IDS where the IDS itself is installed on the instances
    • Host Based Firewall – Traffic Replication where IDS agents installed on instances which send/duplicate the data to a centralized IDS system
    • In-Line Firewall – Inbound IDS/IPS Tier (like a WAF configuration) which identifies and drops suspect packets

DDOS Mitigation

  • Minimize the Attack surface
    • use ELB/CloudFront/Route 53 to distribute load
    • maintain resources in private subnets and use Bastion servers
  • Scale to absorb the attack
    • scaling helps buy time to analyze and respond to an attack
    • auto scaling with ELB to handle increase in load to help absorb attacks
    • CloudFront, Route 53 inherently scales as per the demand
  • Safeguard exposed resources
    • user Route 53 for aliases to hide source IPs and Private DNS
    • use CloudFront geo restriction and Origin Access Identity
    • use WAF as part of the infrastructure
  • Learn normal behavior (IDS/WAF)
    • analyze and benchmark to define rules on normal behavior
    • use CloudWatch
  • Create a plan for attacks

AWS Services Region, AZ, Subnet VPC limitations

  • Services like IAM (user, role, group, SSL certificate), Route 53, STS are Global and available across regions
  • All other AWS services are limited to Region or within Region and do not exclusively copy data across regions unless configured
  • AMI are limited to region and need to be copied over to other region
  • EBS volumes are limited to the Availability Zone, and can be migrated by creating snapshots and copying them to another region
  • Reserved instances are limited to Availability Zone and cannot be migrated to another region
  • RDS instances are limited to the region and can be recreated in a different region by either using snapshots or promoting a Read Replica
  • Placement groups are limited to the Availability Zone
  • S3 data is replicated within the region and can be move to another region using cross region replication
  • DynamoDB maintains data within the region can be replicated to another region using DynamoDB cross region replication (using DynamoDB streams) or Data Pipeline using EMR (old method)
  • Redshift Cluster span within an Availability Zone only, and can be created in other AZ using snapshots

Disaster Recovery Whitepaper

  • RTO is the time it takes after a disruption to restore a business process to its service level and RPO acceptable amount of data loss measured in time before the disaster occurs
  • Techniques (RTO & RPO reduces and the Cost goes up as we go down)
    • Backup & Restore – Data is backed up and restored, within nothing running
    • Pilot light – Only minimal critical service like RDS is running and rest of the services can be recreated and scaled during recovery
    • Warm Standby – Fully functional site with minimal configuration is available and can be scaled during recovery
    • Multi-Site – Fully functional site with identical configuration is available and processes the load
  • Services
    • Region and AZ to launch services across multiple facilities
    • EC2 instances with the ability to scale and launch across AZs
    • EBS with Snapshot to recreate volumes in different AZ or region
    • AMI to quickly launch preconfigured EC2 instances
    • ELB and Auto Scaling to scale and launch instances across AZs
    • VPC to create private, isolated section
    • Elastic IP address as static IP address
    • ENI with pre allocated Mac Address
    • Route 53 is highly available and scalable DNS service to distribute traffic across EC2 instances and ELB in different AZs and regions
    • Direct Connect for speed data transfer (takes time to setup and expensive then VPN)
    • S3 and Glacier (with RTO of 3-5 hours) provides durable storage
    • RDS snapshots and Multi AZ support and Read Replicas across regions
    • DynamoDB with cross region replication
    • Redshift snapshots to recreate the cluster
    • Storage Gateway to backup the data in AWS
    • Import/Export to move large amount of data to AWS (if internet speed is the bottleneck)
    • CloudFormation, Elastic Beanstalk and Opsworks as orchestration tools for automation and recreate the infrastructure

 

AWS Certification – Analytics Services – Cheat Sheet

Data Pipeline

  • orchestration service that helps define data-driven workflows to automate and schedule regular data movement and data processing activities
  • integrates with on-premises and cloud-based storage systems
  • allows scheduling, retry, and failure logic for the workflows

EMR

  • is a web service that utilizes a hosted Hadoop framework running on the web-scale infrastructure of EC2 and S3
  • launches all nodes for a given cluster in the same Availability Zone, which improves performance as it provides higher data access rate
  • seamlessly supports Reserved, On-Demand and Spot Instances
  • consists of Master Node for management and Slave nodes, which consists of Core nodes holding data and Task nodes for performing tasks only
  • is fault tolerant for slave node failures and continues job execution if a slave node goes down
  • does not automatically provision another node to take over failed slaves
  • supports Persistent and Transient cluster types
    • Persistent which continue to run
    • Transient which terminates once the job steps are completed
  • supports EMRFS which allows S3 to be used as a durable HA data storage

Kinesis

  • enables real-time processing of streaming data at massive scale
  • provides ordering of records, as well as the ability to read and/or replay records in the same order to multiple Kinesis applications
  • data is replicated across three data centers within a region and preserved for 24 hours, by default and can be extended to 7 days
  • streams can be scaled using multiple shards, based on the partition key, with each shard providing the capacity of 1MB/sec data input and 2MB/sec data output with 1000 PUT requests per second
  • Kinesis vs SQS
    • real-time processing of streaming big data vs reliable, highly scalable hosted queue for storing messages
    • ordered records, as well as the ability to read and/or replay records in the same order vs no guarantee on data ordering (with the standard queues before the FIFO queue feature was released)
    • data storage up to 24 hours, extended to 7 days vs up to 4 days, can be configured from 1 minute to 14 days but cleared if deleted by the consumer
    • supports multiple consumers vs single consumer at a time and requires multiple queues to deliver message to multiple consumers

AWS Certification – Application Services – Cheat Sheet

SQS

  • extremely scalable queue service and potentially handles millions of messages
  • helps build fault tolerant, distributed loosely coupled applications
  • stores copies of the messages on multiple servers for redundancy and high availability
  • guarantees At-Least-Once Delivery, but does not guarantee Exact One Time Delivery which might result in duplicate messages (Not true anymore with the introduction of FIFO queues)
  • does not maintain or guarantee message order, and if needed sequencing information needs to be added to the message itself (Not true anymore with the introduction of FIFO queues)
  • supports multiple readers and writers interacting with the same queue as the same time
  • holds message for 4 days, by default, and can be changed from 1 min – 14 days after which the message is deleted
  • message needs to be explicitly deleted by the consumer once processed
  • allows send, receive and delete batching which helps club up to 10 messages in a single batch while charging price for a single message
  • handles visibility of the message to multiple consumers using Visibility Timeout, where the message once read by a consumer is not visible to the other consumers till the timeout occurs
  • can handle load and performance requirements by scaling the worker instances as the demand changes (Job Observer pattern)
  • message sample allowing short and long polling
    • returns immediately vs waits for fixed time for e.g. 20 secs
    • might not return all messages as it samples a subset of servers vs returns all available messages
    • repetitive vs helps save cost with long connection
  • supports delay queues to make messages available after a certain delay, can you used to differentiate from priority queues
  • supports dead letter queues, to redirect messages which failed to process after certain attempts instead of being processed repeatedly
  • Design Patterns
    • Job Observer Pattern can help coordinate number of EC2 instances with number of job requests (Queue Size) automatically thus Improving cost effectiveness and performance
    • Priority Queue Pattern can be used to setup different queues with different handling either by delayed queues or low scaling capacity for handling messages in lower priority queues

SNS

  • delivery or sending of messages to subscribing endpoints or clients
  • publisher-subscriber model
  • Producers and Consumers communicate asynchronously with subscribers by producing and sending a message to a topic
  • supports Email (plain or JSON), HTTP/HTTPS, SMS, SQS
  • supports Mobile Push Notifications to push notifications directly to mobile devices with services like Amazon Device Messaging (ADM), Apple Push Notification Service (APNS), Google Cloud Messaging (GCM) etc. supported
  • order is not guaranteed and No recall available
  • integrated with Lambda to invoke functions on notifications
  • for Email notifications, use SNS or SES directly, SQS does not work

SWF

  • orchestration service to coordinate work across distributed components
  • helps define tasks, stores, assigns tasks to workers, define logic, tracks and monitors the task and maintains workflow state in a durable fashion
  • helps define tasks which can be executed on AWS cloud or on-premises
  • helps coordinating tasks across the application which involves managing intertask dependencies, scheduling, and concurrency in accordance with the logical flow of the application
  • supports built-in retries, timeouts and logging
  • supports manual tasks
  • Characteristics
    • deliver exactly once
    • uses long polling, which reduces number of polls without results
    • Visibility of task state via API
    • Timers, signals, markers, child workflows
    • supports versioning
    • keeps workflow history for a user-specified time
  • AWS SWF vs AWS SQS
    • task-oriented vs message-oriented
    • track of all tasks and events vs needs custom handling

SES

  • highly scalable and cost-effective email service
  • uses content filtering technologies to scan outgoing emails to check standards and email content for spam and malware
  • supports full fledged emails to be sent as compared to SNS where only the message is sent in Email
  • ideal for sending bulk emails at scale
  • guarantees first hop
  • eliminates the need to support custom software or applications to do heavy lifting of email transport

AWS Certification – Database Services – Cheat Sheet

RDS

  • provides Relational Database service
  • supports MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, and the new, MySQL-compatible Amazon Aurora DB engine
  • as it is a managed service, shell (root ssh) access is not provided
  • manages backups, software patching, automatic failure detection, and recovery
  • supports use initiated manual backups and snapshots
  • daily automated backups with database transaction logs enables Point in Time recovery up to the last five minutes of database usage
  • snapshots are user-initiated storage volume snapshot of DB instance, backing up the entire DB instance and not just individual databases that can be restored as a independent RDS instance
  • support encryption at rest using KMS as well as encryption in transit using SSL endpoints
  • for encrypted database
    • logs, snapshots, backups, read replicas are all encrypted as well
    • cross region replicas and snapshots does not work across region
  • Multi-AZ deployment
    • provides high availability and automatic failover support and is NOT a scaling solution
    • maintains a synchronous standby replica in a different AZ
    • transaction success is returned only if the commit is successful both on the primary and the standby DB
    • Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon technology, while SQL Server DB instances use SQL Server Mirroring
    • snapshots and backups are taken from standby & eliminate I/O freezes
    • during automatic failover, its seamless and RDS switches to the standby instance and updates the DNS record to point to standby
    • failover can be forced with the Reboot with failover option
  • Read Replicas
    • uses the PostgreSQL, MySQL, and MariaDB DB engines’ built-in replication functionality to create a separate Read Only instance
    • updates are asynchronously copied to the Read Replica, and data might be stale
    • can help scale applications and reduce read only load 
    • requires automatic backups enabled
    • replicates all databases in the source DB instance
    • for disaster recovery, can be promoted to a full fledged database
    • can be created in a different region for MySQL, Postgres and MariaDB, for disaster recovery, migration and low latency across regions
  • RDS does not support all the features of underlying databases, and if required the database instance can be launched on an EC2 instance
  • RMAN (Recovery Manager) can be used for Oracles backup and recovery when running on an EC2 instance

DynamoDB

  • fully managed NoSQL database service
  • synchronously replicates data across three facilities in an AWS Region, giving high availability and data durability
  • runs exclusively on SSDs to provide high I/O performance
  • provides provisioned table reads and writes
  • automatically partitions, reallocates and re-partitions the data and provisions additional server capacity as data or throughput changes
  • provides Eventually consistent (by default) or Strongly Consistent option to be specified during an read operation
  • creates and maintains indexes for the primary key attributes for efficient access of data in the table
  • supports secondary indexes
    • allows querying attributes other then the primary key attributes without impacting performance.
    • are automatically maintained as sparse objects
  • Local vs Global secondary index
    • shares partition key + different sort key vs different partition + sort key
    • search limited to partition vs across all partition
    • unique attributes vs non unique attributes
    • linked to the base table vs independent separate index
    • only created during the base table creation vs can be created later
    • cannot be deleted after creation vs can be deleted
    • consumes provisioned throughput capacity of the base table vs independent throughput
    • returns all attributes for item vs only projected attributes
    • Eventually or Strongly vs Only Eventually consistent reads
    • size limited to 10Gb per partition vs unlimited
  • supports cross region replication using DynamoDB streams which leverages Kinesis and provides time-ordered sequence of item-level changes and can help for lower RPO, lower RTO disaster recovery
  • Data Pipeline jobs with EMR can be used for disaster recovery with higher RPO, lower RTO requirements
  • supports triggers to allow execution of custom actions or notifications based on item-level updates

ElastiCache

  • managed web service that provides in-memory caching to deploy and run Memcached or Redis protocol-compliant cache clusters
  • ElastiCache with Redis,
    • like RDS, supports Multi-AZ, Read Replicas and Snapshots
    • Read Replicas are created across AZ within same region using Redis’s asynchronous replication technology
    • Multi-AZ differs from RDS as there is no standby, but if the primary goes down a Read Replica is promoted as primary
    • Read Replicas cannot span across regions, as RDS supports
    • cannot be scaled out and if scaled up cannot be scaled down
    • allows snapshots for backup and restore
    • AOF can be enabled for recovery scenarios, to recover the data in case the node fails or service crashes. But it does not help in case the underlying hardware fails
    • Enabling Redis Multi-AZ as a Better Approach to Fault Tolerance
  • ElastiCache with Memcached
    • can be scaled up by increasing size and scaled out by adding nodes
    • nodes can span across multiple AZs within the same region
    • cached data is spread across the nodes, and a node failure will always result in some data loss from the cluster
    • supports auto discovery
    • every node should be homogenous and of same instance type
  • ElastiCache Redis vs Memcached
    • complex data objects vs simple key value storage
    • persistent vs non persistent, pure caching
    • automatic failover with Multi-AZ vs Multi-AZ not supported
    • scaling using Read Replicas vs using multiple nodes
    • backup & restore supported vs not supported
  • can be used state management to keep the web application stateless

Redshift

  • fully managed, fast and powerful, petabyte scale data warehouse service
  • uses replication and continuous backups to enhance availability and improve data durability and can automatically recover from node and component failures
  • provides Massive Parallel Processing (MPP) by distributing & parallelizing queries across multiple physical resources
  • columnar data storage improving query performance and allowing advance compression techniques
  • only supports Single-AZ deployments and the nodes are available within the same AZ, if the AZ supports Redshift clusters
  • spot instances are NOT an option