AWS Automated Backups – Certification

AWS Automated Backups

  • AWS allows automated backups for
    • RDS
    • ElastiCache – Redis only
    • Redshift
  • AWS does not perform automated backups of EC2 EBS volumes; these need to be scripted manually
  • AWS stores the backups and snapshots in S3

RDS Backups

  • RDS supports automated backups as well as manual snapshots
  • Automated Backups
    • enable point-in-time recovery of the DB Instance
    • perform a full daily backup and capture transaction logs (as updates to the DB instance are made)
    • are performed during the defined preferred backup window and are retained for a user-specified period of time called the retention period (default 1 day, with a max of 35 days)
    • When a point-in-time recovery is initiated, transaction logs are applied to the most appropriate daily backup in order to restore the DB instance to the specific requested time.
    • allow a point-in-time restore with the ability to specify any second during the retention period, up to the Latest Restorable Time
    • are deleted when the DB instance is deleted
  • Snapshots
    • are user-initiated and enable you to back up the DB instance in a known state as frequently as needed, and then restore it to that specific state at any time.
    • can be created with the AWS Management Console or by using the CreateDBSnapshot API call (see the sketch after this list).
    • are not deleted when the DB instance is deleted
  • Automated backups and snapshots can result in a performance hit if Multi-AZ is not enabled
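
As an illustration of the snapshot and point-in-time restore behavior above, a minimal boto3 sketch (instance and snapshot identifiers are placeholders):

    import boto3

    rds = boto3.client('rds')

    # Manual snapshot (survives deletion of the DB instance)
    rds.create_db_snapshot(
        DBInstanceIdentifier='mydb',             # placeholder instance name
        DBSnapshotIdentifier='mydb-known-state'
    )

    # Point-in-time restore from automated backups, up to the
    # Latest Restorable Time, into a new DB instance
    rds.restore_db_instance_to_point_in_time(
        SourceDBInstanceIdentifier='mydb',
        TargetDBInstanceIdentifier='mydb-restored',
        UseLatestRestorableTime=True
    )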

ElastiCache Automated Backups

  • ElastiCache supports automated backups for Redis clusters only
  • ElastiCache creates a backup of the cluster on a daily basis
  • Snapshots can degrade performance, so they should be performed during the least busy part of the day
  • Backups are performed during the defined backup period and are retained for the backup retention limit specified, with a maximum of 35 days
  • ElastiCache also allows manual snapshots of the cluster
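
A minimal boto3 sketch of a manual Redis snapshot and of adjusting the backup window and retention limit (the cluster id and window values are placeholders):

    import boto3

    ec = boto3.client('elasticache')

    # Manual snapshot of a Redis cluster (not supported for Memcached)
    ec.create_snapshot(
        CacheClusterId='my-redis',          # placeholder cluster id
        SnapshotName='my-redis-manual'
    )

    # Configure the daily backup window and retention (max 35 days)
    ec.modify_cache_cluster(
        CacheClusterId='my-redis',
        SnapshotWindow='03:00-04:00',       # least busy part of the day
        SnapshotRetentionLimit=7,
        ApplyImmediately=True
    )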

Redshift Automated Backups

  • Amazon Redshift enables automated backups by default
  • Redshift replicates all the data within your data warehouse cluster when it is loaded and also continuously backs up the data to S3
  • Redshift retains backups for 1 day which can be extended to max 35 days
  • Snapshots are incremental, and Redshift only backs up data that has changed, so most snapshots use up only a small amount of storage
  • Redshift also allows manual snapshots of the data warehouse
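
A minimal boto3 sketch of extending the automated snapshot retention and taking a manual snapshot (cluster and snapshot identifiers are placeholders):

    import boto3

    rs = boto3.client('redshift')

    # Extend the automated snapshot retention from the 1-day default (max 35)
    rs.modify_cluster(
        ClusterIdentifier='my-warehouse',    # placeholder cluster name
        AutomatedSnapshotRetentionPeriod=35
    )

    # Manual snapshot of the data warehouse
    rs.create_cluster_snapshot(
        SnapshotIdentifier='my-warehouse-manual',
        ClusterIdentifier='my-warehouse'
    )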

EC2 EBS Backups

  • EBS does not provide automated backups
  • EBS snapshots can be created by using the AWS Management Console, the command line interface (CLI), or the APIs (see the sketch after this list)
  • Taking snapshots can degrade performance
  • Snapshots are stored on S3
  • EBS snapshots are incremental and block-based, and they consume space only for changed data after the initial snapshot is created
  • Data can be restored from snapshots by creating a volume from the snapshot
  • EBS snapshots are region specific and can be copied between AWS regions
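
Since EBS backups have to be scripted, a minimal boto3 sketch of snapshot creation, restore, and cross-region copy (the volume id, regions, and AZ are placeholders):

    import boto3

    ec2 = boto3.client('ec2', region_name='us-east-1')

    # Incremental, block-based snapshot of a volume (stored in S3)
    snap = ec2.create_snapshot(
        VolumeId='vol-0123456789abcdef0',    # placeholder volume id
        Description='nightly backup'
    )

    # Restore by creating a new volume from the snapshot
    ec2.create_volume(
        SnapshotId=snap['SnapshotId'],
        AvailabilityZone='us-east-1a'
    )

    # Snapshots are region specific, but can be copied across regions;
    # the copy is issued from the destination region
    ec2_west = boto3.client('ec2', region_name='us-west-2')
    ec2_west.copy_snapshot(
        SourceRegion='us-east-1',
        SourceSnapshotId=snap['SnapshotId'],
        Description='cross-region copy'
    )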

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Which two AWS services provide out-of-the-box user configurable automatic backup-as-a-service and backup rotation options? Choose 2 answers
    1. Amazon S3
    2. Amazon RDS
    3. Amazon EBS
    4. Amazon Redshift
  2. You have been asked to automate many routine systems administrator backup and recovery activities. Your current plan is to leverage AWS-managed solutions as much as possible and automate the rest with the AWS CLI and scripts. Which task would be best accomplished with a script?
    1. Creating daily EBS snapshots with a monthly rotation of snapshots
    2. Creating daily RDS snapshots with a monthly rotation of snapshots
    3. Automatically detect and stop unused or underutilized EC2 instances
    4. Automatically add Auto Scaled EC2 instances to an Amazon Elastic Load Balancer

AWS Billing and Cost Management – Certification

AWS Billing and Cost Management

  • AWS Billing and Cost Management is the service used to pay your AWS bill, monitor your usage, and budget your costs

Analyzing Costs with Graphs

  • AWS provides the Cost Explorer tool, which allows filtering graphs by API operations, Availability Zones, AWS service, custom cost allocation tags, EC2 instance type, purchase options, region, usage type, usage type groups, or, if Consolidated Billing is used, by linked account.

Budgets

  • Budgets can be used to track AWS costs to see usage-to-date and current estimated charges from AWS
  • Budgets use the cost visualization provided by Cost Explorer to show the status of the budgets and to provide forecasts of your estimated costs.
  • Budgets can be used to create CloudWatch alarms that notify when you go over your budgeted amounts, or when the estimated costs exceed budgets
  • Notifications can be sent to an SNS topic and to email addresses associated with your budget notification
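
A hedged boto3 sketch of creating a cost budget with an email notification (the account id, amount, and address are placeholders):

    import boto3

    budgets = boto3.client('budgets')

    budgets.create_budget(
        AccountId='123456789012',            # placeholder account id
        Budget={
            'BudgetName': 'monthly-cost-budget',
            'BudgetLimit': {'Amount': '1000', 'Unit': 'USD'},
            'TimeUnit': 'MONTHLY',
            'BudgetType': 'COST'
        },
        NotificationsWithSubscribers=[{
            # Notify when actual costs exceed 80% of the budgeted amount
            'Notification': {
                'NotificationType': 'ACTUAL',
                'ComparisonOperator': 'GREATER_THAN',
                'Threshold': 80.0,
                'ThresholdType': 'PERCENTAGE'
            },
            'Subscribers': [{'SubscriptionType': 'EMAIL',
                             'Address': 'finance@example.com'}]
        }]
    )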

Cost Allocation Tags

  • Tags can be used to organize AWS resources, and cost allocation tags to track the AWS costs on a detailed level.
  • Upon cost allocation tags activation, AWS uses the cost allocation tags to organize the resource costs on the cost allocation report making it easier to categorize and track your AWS costs.
  • AWS provides two types of cost allocation tags:
    • AWS-generated tags – AWS defines, creates, and applies these for you
    • user-defined tags – you define, create, and apply these yourself
  • Both types of tags must be activated separately before they can appear in Cost Explorer or on a cost allocation report

Alerts on Cost Limits

  • CloudWatch can be used to create billing alerts when the AWS costs exceed specified thresholds
  • When the usage exceeds threshold amounts, AWS sends an email notification
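
A minimal boto3 sketch of such a billing alarm on the EstimatedCharges metric; note that billing metric data is only published to us-east-1 (the threshold and SNS topic ARN are placeholders):

    import boto3

    # Billing metrics are only published to the us-east-1 region
    cw = boto3.client('cloudwatch', region_name='us-east-1')

    cw.put_metric_alarm(
        AlarmName='billing-over-1000-usd',
        Namespace='AWS/Billing',
        MetricName='EstimatedCharges',
        Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
        Statistic='Maximum',
        Period=21600,                        # 6 hours
        EvaluationPeriods=1,
        Threshold=1000.0,
        ComparisonOperator='GreaterThanOrEqualToThreshold',
        AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts']
    )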

Consolidated Billing

Refer to My Blog Post about Consolidated Billing

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. An organization has been using AWS for a few months. The finance team wants to visualize the pattern of AWS spending. Which of the below AWS tools will help with this requirement?
    1. AWS Cost Manager
    2. AWS Cost Explorer (Check Cost Explorer)
    3. AWS CloudWatch
    4. AWS Consolidated Billing (Will not help visualize)
  2. Your company wants to understand where cost is coming from in the company’s production AWS account. There are a number of applications and services running at any given time. Without expending too much initial development time, how best can you give the business a good understanding of which applications cost the most per month to operate?
    1. Create an automation script, which periodically creates AWS Support tickets requesting detailed intra-month information about your bill.
    2. Use custom CloudWatch Metrics in your system, and put a metric data point whenever cost is incurred.
    3. Use AWS Cost Allocation Tagging for all resources, which support it. Use the Cost Explorer to analyze costs throughout the month. (Refer link)
    4. Use the AWS Price API and constantly running resource inventory scripts to calculate total price based on multiplication of consumed resources over time.
  3. You need to know when you spend $1000 or more on AWS. What’s the easy way for you to see that notification?
    1. AWS CloudWatch Events tied to API calls, when certain thresholds are exceeded, publish to SNS.
    2. Scrape the billing page periodically and pump into Kinesis.
    3. AWS CloudWatch Metrics + Billing Alarm + Lambda event subscription. When a threshold is exceeded, email the manager.
    4. Scrape the billing page periodically and publish to SNS.
  4. A user is planning to use AWS services for his web application. If the user is trying to set up his own billing management system for AWS, how can he configure it?
    1. Set up programmatic billing access. Download and parse the bill as per the requirement
    2. It is not possible for the user to create his own billing management service with AWS
    3. Enable the AWS CloudWatch alarm which will provide APIs to download the alarm data
    4. Use AWS billing APIs to download the usage report of each service from the AWS billing console
  5. An organization is setting up programmatic billing access for their AWS account. Which of the below mentioned services is not required or enabled when the organization wants to use programmatic access?
    1. Programmatic access
    2. AWS bucket to hold the billing report
    3. AWS billing alerts
    4. Monthly Billing report
  6. A user has setup a billing alarm using CloudWatch for $200. The usage of AWS exceeded $200 after some days. The user wants to increase the limit from $200 to $400? What should the user do?
    1. Create a new alarm of $400 and link it with the first alarm
    2. It is not possible to modify the alarm once it has crossed the usage limit
    3. Update the alarm to set the limit at $400 instead of $200 (Refer link)
    4. Create a new alarm for the additional $200 amount
  7. A user is trying to configure the CloudWatch billing alarm. Which of the below mentioned steps should be performed by the user for the first time alarm creation in the AWS Account Management section?
    1. Enable Receiving Billing Reports
    2. Enable Receiving Billing Alerts
    3. Enable AWS billing utility
    4. Enable CloudWatch Billing Threshold

References

AWS Billing & Cost Management – User Guide

AWS RDS Monitoring & Notification – Certification

AWS RDS Monitoring & Notification

  • RDS integrates with CloudWatch and provides metrics for monitoring
  • CloudWatch alarms can be created over a single metric that sends an SNS message when the alarm changes state
  • RDS also provides SNS notification whenever any RDS event occurs

CloudWatch RDS Monitoring

  • RDS DB instance can be monitored using CloudWatch, which collects and processes raw data from RDS into readable, near real-time metrics.
  • The statistics are recorded for a period of two weeks, so that you can access historical information and gain a better perspective on how the service is performing.
  • By default, RDS metric data is automatically sent to Amazon CloudWatch in 1-minute periods
  • CloudWatch RDS Metrics
    • BinLogDiskUsage – Amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.
    • CPUUtilization – Percentage of CPU utilization.
    • DatabaseConnections – Number of database connections in use.
    • DiskQueueDepth – The number of outstanding IOs (read/write requests) waiting to access the disk.
    • FreeableMemory – Amount of available random access memory.
    • FreeStorageSpace – Amount of available storage space.
    • ReplicaLag – Amount of time a Read Replica DB instance lags behind the source DB instance. Applies to MySQL, MariaDB, and PostgreSQL Read Replicas.
    • SwapUsage – Amount of swap space used on the DB instance.
    • ReadIOPS – Average number of disk read I/O operations per second.
    • WriteIOPS – Average number of disk write I/O operations per second.
    • ReadLatency – Average amount of time taken per disk read I/O operation.
    • WriteLatency – Average amount of time taken per disk write I/O operation.
    • ReadThroughput – Average number of bytes read from disk per second.
    • WriteThroughput – Average number of bytes written to disk per second.
    • NetworkReceiveThroughput – Incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
    • NetworkTransmitThroughput – Outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
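
A minimal boto3 sketch of reading one of these metrics, e.g. ReplicaLag, via the CloudWatch API (the instance identifier is a placeholder):

    import boto3
    from datetime import datetime, timedelta

    cw = boto3.client('cloudwatch')

    stats = cw.get_metric_statistics(
        Namespace='AWS/RDS',
        MetricName='ReplicaLag',
        Dimensions=[{'Name': 'DBInstanceIdentifier', 'Value': 'mydb-replica'}],
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=60,                 # RDS publishes metrics in 1-minute periods
        Statistics=['Average']
    )
    for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
        print(point['Timestamp'], point['Average'])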

RDS Event Notification

  • RDS uses SNS to provide notification when an RDS event occurs
  • RDS groups the events into categories, which can be subscribed to so that a notification is sent when an event in that category occurs.
  • Event categories for a DB instance, DB cluster, DB snapshot, DB cluster snapshot, DB security group, or DB parameter group can be subscribed to
  • Event notifications are sent to the email addresses provided during subscription creation
  • Notifications can easily be turned off without deleting the subscription by setting the Enabled radio button to No in the RDS console, or by setting the Enabled parameter to false using the CLI or RDS API (see the sketch below).
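
A minimal boto3 sketch of creating and then disabling such a subscription (the names, topic ARN, and event categories are placeholders):

    import boto3

    rds = boto3.client('rds')

    # Subscribe to failover and failure events for one DB instance
    rds.create_event_subscription(
        SubscriptionName='mydb-events',      # placeholder name
        SnsTopicArn='arn:aws:sns:us-east-1:123456789012:rds-events',
        SourceType='db-instance',
        SourceIds=['mydb'],
        EventCategories=['failover', 'failure'],
        Enabled=True
    )

    # Turn notifications off without deleting the subscription
    rds.modify_event_subscription(
        SubscriptionName='mydb-events',
        Enabled=False
    )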

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You run a web application with the following components Elastic Load Balancer (ELB), 3 Web/Application servers, 1 MySQL RDS database with read replicas, and Amazon Simple Storage Service (Amazon S3) for static content. Average response time for users is increasing slowly. What three CloudWatch RDS metrics will allow you to identify if the database is the bottleneck? Choose 3 answers
    1. The number of outstanding IOs waiting to access the disk
    2. The amount of write latency
    3. The amount of disk space occupied by binary logs on the master.
    4. The amount of time a Read Replica DB Instance lags behind the source DB Instance
    5. The average number of disk I/O operations per second.
  2. Typically, you want your application to check whether a request generated an error before you spend any time processing results. The easiest way to find out if an error occurred is to look for an __________ node in the response from the Amazon RDS API.
    1. Incorrect
    2. Error
    3. FALSE
  3. In the Amazon CloudWatch, which metric should I be checking to ensure that your DB Instance has enough free storage space?
    1. FreeStorage
    2. FreeStorageSpace
    3. FreeStorageVolume
    4. FreeDBStorageSpace
  4. A user is receiving a notification from the RDS DB whenever there is a change in the DB security group. The user does not want to receive these notifications for a month, but does not want to delete the notification. How can the user configure this?
    1. Change the Disable button for notification to “Yes” in the RDS console
    2. Set the send mail flag to false in the DB event notification console
    3. The only option is to delete the notification from the console
    4. Change the Enable button for notification to “No” in the RDS console
  5. A sys admin is planning to subscribe to the RDS event notifications. For which of the below mentioned source categories the subscription cannot be configured?
    1. DB security group
    2. DB snapshot
    3. DB options group
    4. DB parameter group
  6. A user is planning to setup notifications on the RDS DB for a snapshot. Which of the below mentioned event categories is not supported by RDS for this snapshot source type?
    1. Backup (Refer link)
    2. Creation
    3. Deletion
    4. Restoration
  7. A system admin is planning to setup event notifications on RDS. Which of the below mentioned services will help the admin setup notifications?
    1. AWS SES
    2. AWS Cloudtrail
    3. AWS CloudWatch
    4. AWS SNS
  8. A user has setup an RDS DB with Oracle. The user wants to get notifications when someone modifies the security group of that DB. How can the user configure that?
    1. It is not possible to get the notifications on a change in the security group
    2. Configure SNS to monitor security group changes
    3. Configure event notification on the DB security group
    4. Configure the CloudWatch alarm on the DB for a change in the security group
  9. It is advised that you watch the Amazon CloudWatch “_____” metric (available via the AWS Management Console or Amazon Cloud Watch APIs) carefully and recreate the Read Replica should it fall behind due to replication errors.
    1. Write Lag
    2. Read Replica
    3. Replica Lag
    4. Single Replica

AWS RDS Security – Certification

AWS RDS Security

  • AWS provides multiple features to provide RDS security
    • DB instance can be hosted in a VPC for the greatest possible network access control
    • IAM policies can be used to assign permissions that determine who is allowed to manage RDS resources
    • Security groups allow you to control what IP addresses or EC2 instances can connect to the databases on a DB instance
    • Secure Socket Layer (SSL) connections with DB instances
    • RDS encryption to secure RDS instances and snapshots at rest.
    • Network encryption and transparent data encryption (TDE) with Oracle DB instances

RDS Authentication and Access Control

  • IAM can be used to control which RDS operations each individual user has permission to call

Encrypting RDS Resources

  • RDS encrypted instances use the industry standard AES-256 encryption algorithm to encrypt data on the server that hosts the RDS instance
  • RDS then handles authentication of access and decryption of this data with a minimal impact on performance, and with no need to modify your database client applications
  • Data at Rest Encryption
    • can be enabled on RDS instances to encrypt the underlying storage
    • encryption keys are managed by KMS
    • can be enabled only during instance creation
    • once enabled, the encryption keys cannot be changed
    • if the key is lost, the DB can only be restored from the backup
  • Once encryption is enabled for an RDS instance,
    • logs are encrypted
    • snapshots are encrypted
    • automated backups are encrypted
    • read replicas are encrypted
  • Cross-region replicas and snapshot copies do not work, since the key is only available in a single region
  • RDS DB Snapshot considerations
    • A DB snapshot encrypted using a KMS encryption key can be copied
    • Copying an encrypted DB snapshot, results in an encrypted copy of the DB snapshot
    • When copying, DB snapshot can either be encrypted with the same KMS encryption key as the original DB snapshot, or a different KMS encryption key to encrypt the copy of the DB snapshot.
    • An unencrypted DB snapshot can be copied to an encrypted snapshot, a quick way to add encryption to a previously unencrypted DB instance.
    • Encrypted snapshot can be restored only to an encrypted DB instance
    • If a KMS encryption key is specified when restoring from an unencrypted DB cluster snapshot, the restored DB cluster is encrypted using the specified KMS encryption key
    • Copying an encrypted snapshot shared from another AWS account, requires access to the KMS encryption key used to encrypt the DB snapshot.
    • Because KMS encryption keys are specific to the region that they are created in, encrypted snapshot cannot be copied to another region
  • Transparent Data Encryption (TDE)
    • Automatically encrypts the data before it is written to the underlying storage device and decrypts it when it is read from the storage device
    • is supported by Oracle and SQL Server
      • Oracle requires key storage outside of the KMS and integrates with CloudHSM for this
      • SQL Server requires a key, which is managed by RDS
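
A minimal boto3 sketch of the create-time encryption and encrypted snapshot copy described above (identifiers, the password, and the KMS key ARN are placeholders):

    import boto3

    rds = boto3.client('rds')

    # Encryption can only be enabled when the instance is created
    rds.create_db_instance(
        DBInstanceIdentifier='mydb-encrypted',
        DBInstanceClass='db.t3.medium',
        Engine='mysql',
        MasterUsername='admin',
        MasterUserPassword='change-me-please',   # placeholder
        AllocatedStorage=100,
        StorageEncrypted=True,
        KmsKeyId='arn:aws:kms:us-east-1:123456789012:key/placeholder'
    )

    # Copying a snapshot with a KMS key produces an encrypted copy; this is
    # also the quick way to add encryption to a previously unencrypted snapshot
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier='mydb-unencrypted-snap',
        TargetDBSnapshotIdentifier='mydb-encrypted-snap',
        KmsKeyId='arn:aws:kms:us-east-1:123456789012:key/placeholder'
    )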

SSL to Encrypt a Connection to a DB Instance

  • Encrypt connections using SSL for data in transit between the applications and the DB instance
  • Amazon RDS creates an SSL certificate and installs the certificate on the DB instance when RDS provisions the instance.
  • SSL certificates are signed by a certificate authority. The SSL certificate includes the DB instance endpoint as the Common Name (CN) for the SSL certificate to guard against spoofing attacks
  • While SSL offers security benefits, be aware that SSL encryption is a compute-intensive operation and will increase the latency of the database connection.
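
A minimal sketch of an SSL connection, assuming the third-party mysql-connector-python package and the RDS CA bundle downloaded to a local file (the endpoint and credentials are placeholders):

    import mysql.connector  # third-party: pip install mysql-connector-python

    conn = mysql.connector.connect(
        host='mydb.abc123.us-east-1.rds.amazonaws.com',  # placeholder endpoint
        user='admin',
        password='change-me-please',
        # Verify the server certificate against the RDS CA bundle, so the
        # endpoint in the certificate CN guards against spoofing
        ssl_ca='rds-combined-ca-bundle.pem'
    )
    conn.close()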

RDS Security Groups

  • Security groups control the access that traffic has in and out of a DB instance
  • VPC security groups act like a firewall controlling network access to your DB instance.
  • VPC security groups can be configured and associated with the DB instance to allow access from an IP address range, port, or EC2 security group
  • Database security groups default to a “deny all” access mode and customers must specifically authorize network ingress.
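
A minimal boto3 sketch of authorizing ingress to a DB security group from an application tier's EC2 security group only (the security group ids are placeholders):

    import boto3

    ec2 = boto3.client('ec2')

    # Ingress defaults to "deny all"; explicitly allow MySQL traffic to the
    # DB security group only from the application tier's security group
    ec2.authorize_security_group_ingress(
        GroupId='sg-0db11111111111111',          # placeholder DB security group
        IpPermissions=[{
            'IpProtocol': 'tcp',
            'FromPort': 3306,
            'ToPort': 3306,
            'UserIdGroupPairs': [{'GroupId': 'sg-0app2222222222222'}]
        }]
    )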

Master User Account Privileges

  • When a new DB instance is created, the default master user gets certain privileges for that DB instance
  • Subsequently, other users with permissions can be created

Event Notification

  • Event notifications can be configured for important events that occur on the DB instance
  • Notifications can be received for a variety of important events that can occur on the RDS instance, such as the instance being shut down, a backup being started, a failover occurring, the security group being changed, or storage space running low

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Can I encrypt connections between my application and my DB Instance using SSL?
    1. No
    2. Yes
    3. Only in VPC
    4. Only in certain regions
  2. Which of these configuration or deployment practices is a security risk for RDS?
    1. Storing SQL function code in plaintext
    2. Non-Multi-AZ RDS instance
    3. Having RDS and EC2 instances exist in the same subnet
    4. RDS in a public subnet (Making RDS accessible to the public internet in a public subnet poses a security risk, by making your database directly addressable and spammable. DB instances deployed within a VPC can be configured to be accessible from the Internet or from EC2 instances outside the VPC. If a VPC security group specifies a port access such as TCP port 22, you would not be able to access the DB instance because the firewall for the DB instance provides access only via the IP addresses specified by the DB security groups the instance is a member of and the port defined when the DB instance was created. Refer link)

References

AWS RDS User Guide – Security


AWS Blue Green Deployment – Certification

AWS Blue Green Deployment

  • Blue/green deployments provide near zero-downtime release and rollback capabilities.
  • Blue/green deployment works by shifting traffic between two identical environments that are running different versions of the application
    • Blue environment represents the current application version serving production traffic.
    • In parallel, the green environment is staged running a different version of your application.
    • After the green environment is ready and tested, production traffic is redirected from blue to green.
    • If any problems are identified, you can roll back by reverting traffic back to the blue environment.

NOTE: Advanced Topic required for DevOps Professional Exam Only

AWS Services

Route 53

  • Route 53 is a highly available and scalable authoritative DNS service that routes user requests
  • Route 53 with its DNS service allows administrators to direct traffic by simply updating DNS records in the hosted zone
  • TTLs for resource records can be shortened, which allows record changes to propagate faster to clients

Elastic Load Balancing

  • Elastic Load Balancing distributes incoming application traffic across EC2 instances
  • Elastic Load Balancing scales in response to incoming requests, performs health checking against Amazon EC2 resources, and naturally integrates with other AWS tools, such as Auto Scaling.
  • ELB also helps perform health checks of EC2 instances to route traffic only to the healthy instances

Auto Scaling

  • Auto Scaling allows different versions of launch configuration, which define templates used to launch EC2 instances, to be attached to an Auto Scaling group to enable blue/green deployment.
  • Auto Scaling’s termination policies and Standby state enable blue/green deployment
    • Termination policies in Auto Scaling groups determine which EC2 instances to remove during a scaling action.
    • Auto Scaling also allows instances to be placed in Standby state, instead of termination, which helps with quick rollback when required
  • Auto Scaling with Elastic Load Balancing can be used to balance and scale the traffic

Elastic Beanstalk

  • Elastic Beanstalk makes it easy to run multiple versions of the application and provides capabilities to swap the environment URLs, facilitating blue/green deployment.
  • Elastic Beanstalk supports Auto Scaling and Elastic Load Balancing, both of which enable blue/green deployment

OpsWorks

  • OpsWorks has the concept of stacks, which are logical groupings of AWS resources that share a common purpose and should be logically managed together
  • Stacks are made of one or more layers, with each layer representing a set of EC2 instances that serve a particular purpose, such as serving applications or hosting a database server.
  • OpsWorks simplifies cloning entire stacks when preparing for blue/green environments.

CloudFormation

  • CloudFormation helps describe the AWS resources through JSON formatted templates and provides automation capabilities for provisioning blue/green environments and facilitating updates to switch traffic, whether through Route 53 DNS, Elastic Load Balancing, etc
  • CloudFormation provides infrastructure as code strategy, where infrastructure is provisioned and managed using code and software development techniques, such as version control and continuous integration, in a manner similar to how application code is treated

CloudWatch

  • CloudWatch monitoring can provide early detection of application health in blue/green deployments

Deployment Techniques

DNS Routing using Route 53

  • Route 53 DNS service can help switch traffic from the blue environment to the green and vice versa, if rollback is necessary
  • Route 53 can help either switch the traffic completely or through a weighted distribution
  • Weighted distribution
    • helps distribute a percentage of traffic to the green environment and gradually update the weights until the green environment carries the full production traffic
    • provides the ability to perform canary analysis where a small percentage of production traffic is introduced to a new environment
    • helps manage cost by using auto scaling for instances to scale based on the actual demand
  • Route 53 can handle Public or Elastic IP address, Elastic Load Balancer, Elastic Beanstalk environment web tiers etc.
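
A minimal boto3 sketch of the weighted distribution switch (the hosted zone id, record names, and ELB DNS names are placeholders):

    import boto3

    r53 = boto3.client('route53')

    def set_weights(blue_weight, green_weight):
        # Two weighted records with the same name; Route 53 splits traffic
        # in proportion to the weights (e.g. 90/10 for canary analysis)
        r53.change_resource_record_sets(
            HostedZoneId='Z0PLACEHOLDER',
            ChangeBatch={'Changes': [
                {'Action': 'UPSERT', 'ResourceRecordSet': {
                    'Name': 'www.example.com', 'Type': 'CNAME',
                    'SetIdentifier': 'blue', 'Weight': blue_weight,
                    'TTL': 60,   # short TTL so changes propagate quickly
                    'ResourceRecords': [{'Value': 'blue-elb.example.aws'}]}},
                {'Action': 'UPSERT', 'ResourceRecordSet': {
                    'Name': 'www.example.com', 'Type': 'CNAME',
                    'SetIdentifier': 'green', 'Weight': green_weight,
                    'TTL': 60,
                    'ResourceRecords': [{'Value': 'green-elb.example.aws'}]}}
            ]}
        )

    set_weights(90, 10)   # canary: 10% of traffic to the green environment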

Auto Scaling Group Swap Behind Elastic Load Balancer

AWS Blue Green Deployment - Auto Scaling Group

  • Elastic Load Balancing with Auto Scaling to manage EC2 resources as per the demand can be used for Blue Green deployments
  • Multiple Auto Scaling groups can be attached to the Elastic Load Balancer
  • Green ASG can be attached to an existing ELB while Blue ASG is already attached to the ELB to serve traffic
  • The ELB would start routing requests to the Green group, as for an HTTP/S listener it uses a least outstanding requests routing algorithm
  • Green group capacity can be increased to process more traffic, while the Blue group capacity can be reduced either by terminating the instances or by putting the instances in a standby mode
  • Standby is a good option because, if rollback to the blue environment is needed, blue server instances can be put back in service and they’re ready to go
  • If no issues with the Green group, the blue group can be decommissioned by adjusting the group size to zero
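
A minimal boto3 sketch of the attach-and-standby steps above (group, ELB, and instance identifiers are placeholders):

    import boto3

    asg = boto3.client('autoscaling')

    # Attach the green group to the ELB already serving the blue group
    asg.attach_load_balancers(
        AutoScalingGroupName='green-asg',        # placeholder names
        LoadBalancerNames=['prod-elb']
    )

    # Put blue instances in Standby instead of terminating them,
    # keeping them ready for a quick rollback
    asg.enter_standby(
        AutoScalingGroupName='blue-asg',
        InstanceIds=['i-0123456789abcdef0'],
        ShouldDecrementDesiredCapacity=True
    )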

Update Auto Scaling Group Launch Configurations

AWS Blue Green Deployment - Auto Scaling Launch

  • Auto Scaling groups have their own launch configurations which define template for EC2 instances to be launched
  • An Auto Scaling group can have only one launch configuration at a time, and it can’t be modified. If modification is needed, a new launch configuration can be created and attached to the existing Auto Scaling group
  • After a new launch configuration is in place, any new instances that are launched use the new launch configuration parameters, but existing instances are not affected.
  • When Auto Scaling removes instances (referred to as scaling in) from the group, the default termination policy is to remove instances with the oldest launch configuration
  • To deploy the new version of the application in the green environment, update the Auto Scaling group with the new launch configuration, and then scale the Auto Scaling group to twice its original size.
  • Then, shrink the Auto Scaling group back to the original size
  • To perform a rollback, update the Auto Scaling group with the old launch configuration. Then, do the preceding steps in reverse
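
A minimal boto3 sketch of this launch-configuration swap, assuming an original group size of 4 (names and the AMI id are placeholders):

    import boto3

    asg = boto3.client('autoscaling')

    # Launch configurations are immutable, so create a new one for v2
    asg.create_launch_configuration(
        LaunchConfigurationName='app-v2',        # placeholder name/AMI
        ImageId='ami-0123456789abcdef0',
        InstanceType='t3.medium'
    )

    # Attach it and scale the group to twice its original size; new instances
    # use the new launch configuration, existing ones are unaffected
    asg.update_auto_scaling_group(
        AutoScalingGroupName='prod-asg',
        LaunchConfigurationName='app-v2',
        MaxSize=8,
        DesiredCapacity=8                        # originally 4
    )

    # After validation, shrink back; the default termination policy removes
    # instances with the oldest launch configuration first
    asg.update_auto_scaling_group(
        AutoScalingGroupName='prod-asg',
        DesiredCapacity=4
    )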

Elastic Beanstalk Application Environment Swap

AWS Blue Green Deployment - Elastic Beanstalk

  • Elastic Beanstalk’s multiple-environment and environment URL swap features help enable Blue/Green deployment
  • Elastic Beanstalk can be used to host the blue environment exposed via URL to access the environment
  • Elastic Beanstalk provides several deployment policies, ranging from policies that perform an in-place update on existing instances, to immutable deployment using a set of new instances.
  • Elastic Beanstalk performs an in-place update when the application versions are updated; however, the application may become unavailable to users for a short period of time
  • To avoid the downtime, a new version can be deployed to a separate Green environment with its own URL, launched with the existing environment’s configuration
  • Elastic Beanstalk’s Swap Environment URLs feature can be used to promote the green environment to serve production traffic
  • Elastic Beanstalk performs a DNS switch, which typically takes a few minutes
  • To perform a rollback, invoke Swap Environment URL again.
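
A minimal boto3 sketch of the URL swap; running the same call again performs the rollback (environment names are placeholders):

    import boto3

    eb = boto3.client('elasticbeanstalk')

    # Promote green to production by swapping CNAMEs (a DNS switch);
    # invoke the same call again to roll back
    eb.swap_environment_cnames(
        SourceEnvironmentName='myapp-blue',      # placeholder environments
        DestinationEnvironmentName='myapp-green'
    )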

Clone a Stack in AWS OpsWorks and Update DNS

  • OpsWorks can be used to create
    • Blue environment stack with the current version of the application and serving production traffic
    • Green environment stack with the newer version of the application and is not receiving any traffic
  • To promote the green environment/stack into production, update the DNS records to point to the green environment/stack’s load balancer (as sketched below)
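
A minimal boto3 sketch of cloning a stack; the subsequent DNS update could be done with Route 53 as sketched earlier (the stack id and role ARN are placeholders):

    import boto3

    ow = boto3.client('opsworks', region_name='us-east-1')

    # Clone the blue stack to create the green stack
    green = ow.clone_stack(
        SourceStackId='blue-stack-id',           # placeholder id/ARN
        ServiceRoleArn='arn:aws:iam::123456789012:role/aws-opsworks-service-role'
    )
    print(green['StackId'])
    # ...deploy the new version to the green stack, then update the DNS
    # record for the application to the green stack's load balancer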

AWS Blue Green deployment patterns

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. What is server immutability?
    1. Not updating a server after creation. (During the new release, a new set of EC2 instances are rolled out by terminating older instances and are disposable. EC2 instance usage is considered temporary or ephemeral in nature for the period of deployment until the current release is active)
    2. The ability to change server counts.
    3. Updating a server after creation.
    4. The inability to change server counts.
  2. You need to deploy a new application version to production. Because the deployment is high-risk, you need to roll the new version out to users over a number of hours, to make sure everything is working correctly. You need to be able to control the proportion of users seeing the new version of the application down to the percentage point. You use ELB and EC2 with Auto Scaling Groups and custom AMIs with your code pre-installed assigned to Launch Configurations. There are no database-level changes during your deployment. You have been told you cannot spend too much money, so you must not increase the number of EC2 instances much at all during the deployment, but you also need to be able to switch back to the original version of code quickly if something goes wrong. What is the best way to meet these requirements?
    1. Create a second ELB, Auto Scaling Launch Configuration, and Auto Scaling Group using the Launch Configuration. Create AMIs with all code pre-installed. Assign the new AMI to the second Auto Scaling Launch Configuration. Use Route53 Weighted Round Robin Records to adjust the proportion of traffic hitting the two ELBs. (Use Weighted Round Robin DNS Records and reverse proxies allow such fine-grained tuning of traffic splits. Blue-Green option does not meet the requirement that we mitigate costs and keep overall EC2 fleet size consistent, so we must select the 2 ELB and ASG option with WRR DNS tuning)
    2. Use the Blue-Green deployment method to enable the fastest possible rollback if needed. Create a full second stack of instances and cut the DNS over to the new stack of instances, and change the DNS back if a rollback is needed. (Full second stack is expensive)
    3. Create AMIs with all code pre-installed. Assign the new AMI to the Auto Scaling Launch Configuration, to replace the old one. Gradually terminate instances running the old code (launched with the old Launch Configuration) and allow the new AMIs to boot to adjust the traffic balance to the new code. On rollback, reverse the process by doing the same thing, but changing the AMI on the Launch Config back to the original code. (Cannot modify the existing launch config)
    4. Migrate to use AWS Elastic Beanstalk. Use the established and well-tested Rolling Deployment setting AWS provides on the new Application Environment, publishing a zip bundle of the new code and adjusting the wait period to spread the deployment over time. Re-deploy the old code bundle to rollback if needed.
  3. When thinking of AWS Elastic Beanstalk, the ‘Swap Environment URLs’ feature most directly aids in what?
    1. Immutable Rolling Deployments
    2. Mutable Rolling Deployments
    3. Canary Deployments
    4. Blue-Green Deployments (Complete switch from one environment to other)
  4. You were just hired as a DevOps Engineer for a startup. Your startup uses AWS for 100% of their infrastructure. They currently have no automation at all for deployment, and they have had many failures while trying to deploy to production. The company has told you deployment process risk mitigation is the most important thing now, and you have a lot of budget for tools and AWS resources. Their stack: 2-tier API Data stored in DynamoDB or S3, depending on type, Compute layer is EC2 in Auto Scaling Groups, They use Route53 for DNS pointing to an ELB, An ELB balances load across the EC2 instances. The scaling group properly varies between 4 and 12 EC2 servers. Which of the following approaches, given this company’s stack and their priorities, best meets the company’s needs?
    1. Model the stack in AWS Elastic Beanstalk as a single Application with multiple Environments. Use Elastic Beanstalk’s Rolling Deploy option to progressively roll out application code changes when promoting across environments. (Does not support DynamoDB also need Blue Green deployment for zero downtime deployment as cost is not a constraint)
    2. Model the stack in 3 CloudFormation templates: Data layer, compute layer, and networking layer. Write stack deployment and integration testing automation following Blue-Green methodologies.
    3. Model the stack in AWS OpsWorks as a single Stack, with 1 compute layer and its associated ELB. Use Chef and App Deployments to automate Rolling Deployment. (Does not support DynamoDB also need Blue Green deployment for zero downtime deployment as cost is not a constraint)
    4. Model the stack in 1 CloudFormation template, to ensure consistency and dependency graph resolution. Write deployment and integration testing automation following Rolling Deployment methodologies. (Need Blue Green deployment for zero downtime deployment as cost is not a constraint)
  5. You are building out a layer in a software stack on AWS that needs to be able to scale out to react to increased demand as fast as possible. You are running the code on EC2 instances in an Auto Scaling Group behind an ELB. Which application code deployment method should you use?
    1. SSH into new instances those come online, and deploy new code onto the system by pulling it from an S3 bucket, which is populated by code that you refresh from source control on new pushes. (is slow and manual)
    2. Bake an AMI when deploying new versions of code, and use that AMI for the Auto Scaling Launch Configuration. (Pre baked AMIs can help to get started quickly)
    3. Create a Dockerfile when preparing to deploy a new version to production and publish it to S3. Use UserData in the Auto Scaling Launch configuration to pull down the Dockerfile from S3 and run it when new instances launch. (is slow)
    4. Create a new Auto Scaling Launch Configuration with UserData scripts configured to pull the latest code at all times. (is slow)
  6. You company runs a complex customer relations management system that consists of around 10 different software components all backed by the same Amazon Relational Database (RDS) database. You adopted AWS OpsWorks to simplify management and deployment of that application and created an AWS OpsWorks stack with layers for each of the individual components. An internal security policy requires that all instances should run on the latest Amazon Linux AMI and that instances must be replaced within one month after the latest Amazon Linux AMI has been released. AMI replacements should be done without incurring application downtime or capacity problems. You decide to write a script to be run as soon as a new Amazon Linux AMI is released. Which solutions support the security policy and meet your requirements? Choose 2 answers
    1. Assign a custom recipe to each layer, which replaces the underlying AMI. Use AWS OpsWorks life-cycle events to incrementally execute this custom recipe and update the instances with the new AMI.
    2. Create a new stack and layers with identical configuration, add instances with the latest Amazon Linux AMI specified as a custom AMI to the new layer, switch DNS to the new stack, and tear down the old stack. (Blue-Green Deployment)
    3. Identify all Amazon Elastic Compute Cloud (EC2) instances of your AWS OpsWorks stack, stop each instance, replace the AMI ID property with the ID of the latest Amazon Linux AMI ID, and restart the instance. To avoid downtime, make sure not more than one instance is stopped at the same time.
    4. Specify the latest Amazon Linux AMI as a custom AMI at the stack level, terminate instances of the stack and let AWS OpsWorks launch new instances with the new AMI.
    5. Add new instances with the latest Amazon Linux AMI specified as a custom AMI to all AWS OpsWorks layers of your stack, and terminate the old ones.
  7. Your company runs an event management SaaS application that uses Amazon EC2, Auto Scaling, Elastic Load Balancing, and Amazon RDS. Your software is installed on instances at first boot, using a tool such as Puppet or Chef, which you also use to deploy small software updates multiple times per week. After a major overhaul of your software, you roll out version 2.0 new, much larger version of the software of your running instances. Some of the instances are terminated during the update process. What actions could you take to prevent instances from being terminated in the future? (Choose two)
    1. Use the zero downtime feature of Elastic Beanstalk to deploy new software releases to your existing instances. (No such feature, you can perform environment url swap)
    2. Use AWS CodeDeploy. Create an application and a deployment targeting the Auto Scaling group. Use CodeDeploy to deploy and update the application in the future. (Refer link)
    3. Run “aws autoscaling suspend-processes” before updating your application. (Refer link)
    4. Use the AWS Console to enable termination protection for the current instances. (Termination protection does not work with Auto Scaling)
    5. Run “aws autoscaling detach-load-balancers” before updating your application. (Does not prevent Auto Scaling to terminate the instances)

References

AWS Blue/Green Deployment Whitepaper

AWS DynamoDB Secondary Indexes – Certification

AWS DynamoDB Secondary Indexes

  • DynamoDB provides fast access to items in a table by specifying primary key values
  • Secondary indexes on a table allow efficient access to data with attributes other than the primary key
  • Secondary index
    • is a data structure that contains a subset of attributes from a table
    • is associated with exactly one table, from which it obtains its data
    • requires an alternate key for the index partition key and sort key
    • additionally can define projected attributes which are copied from the base table into the index along with the primary key attributes
    • is automatically maintained by DynamoDB
    • whenever items in the base table are added, modified, or deleted, any indexes on that table are also updated to reflect these changes
  • DynamoDB supports two types of secondary indexes
    • Global secondary index – an index with a partition key and a sort key that can be different from those on the base table
    • Local secondary index – an index that has the same partition key as the base table, but a different sort key

Global Secondary Indexes

  • DynamoDB creates and maintains indexes for the primary key attributes for efficient access of data in the table, which allows applications to quickly retrieve data by specifying primary key values.
  • Global Secondary Indexes (GSI) are indexes that contain partition or composite partition-and-sort keys that can be different from the keys in the table on which the index is based.
  • Global secondary index is considered “global” because queries on the index can span all items in a table, across all partitions.
  • Multiple secondary indexes can be created on a table, and queries issued against these indexes.
  • Applications benefit from having one or more secondary keys available to allow efficient access to data with attributes other than the primary key.
  • GSIs support non-unique attributes, which increases query flexibility by enabling queries against any non-key attribute in the table
  • GSIs support eventual consistency; DynamoDB automatically and asynchronously handles item additions, updates, and deletes in a GSI when corresponding changes are made to the table
  • Data in a secondary index consists of the GSI alternate key, the table’s primary key, and attributes that are projected, or copied, from the table into the index.
  • Attributes that are part of an item in a table, but not part of the GSI key, primary key of the table, or projected attributes are not returned on querying the GSI index
  • GSIs manage throughput independently of the table they are based on and the provisioned throughput for the table and each associated GSI needs to be specified at creation time
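
A minimal boto3 sketch of adding a GSI, with its own provisioned throughput, to an existing table (table, attribute, and index names are placeholders):

    import boto3

    ddb = boto3.client('dynamodb')

    ddb.update_table(
        TableName='GameScores',                  # placeholder table
        AttributeDefinitions=[
            {'AttributeName': 'GameTitle', 'AttributeType': 'S'},
            {'AttributeName': 'TopScore', 'AttributeType': 'N'}
        ],
        GlobalSecondaryIndexUpdates=[{
            'Create': {
                'IndexName': 'GameTitleIndex',
                # Partition and sort key differ from the base table's keys
                'KeySchema': [
                    {'AttributeName': 'GameTitle', 'KeyType': 'HASH'},
                    {'AttributeName': 'TopScore', 'KeyType': 'RANGE'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                # GSI throughput is provisioned independently of the table
                'ProvisionedThroughput': {'ReadCapacityUnits': 5,
                                          'WriteCapacityUnits': 5}
            }
        }]
    )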

Local Secondary Indexes

  • Local secondary indexes are indexes that have the same partition key as the table, but a different sort key.
  • A local secondary index is “local” because every partition of the index is scoped to a table partition that has the same partition key.
  • LSIs allow searches using a secondary sort key in place of the table’s sort key, expanding the number of attributes that can be used for efficient queries
  • LSIs are updated automatically when the primary index is updated, and reads support both strongly and eventually consistent options
  • LSIs can only be queried via the Query API
  • LSIs cannot be added to existing tables at this time
  • LSIs cannot be modified once created at this time
  • LSIs cannot be removed from a table once created at this time
  • LSI consumes provisioned throughput capacity as part of the table with which it is associated
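
A minimal boto3 sketch contrasting with the GSI example: the LSI must be declared at table creation and is queried via the Query API, optionally with strong consistency (all names are placeholders):

    import boto3

    ddb = boto3.client('dynamodb')

    # LSIs must be defined at table creation time and share the table's
    # partition key and provisioned throughput
    ddb.create_table(
        TableName='Orders',                      # placeholder table
        AttributeDefinitions=[
            {'AttributeName': 'CustomerId', 'AttributeType': 'S'},
            {'AttributeName': 'OrderDate', 'AttributeType': 'S'},
            {'AttributeName': 'OrderTotal', 'AttributeType': 'N'}
        ],
        KeySchema=[
            {'AttributeName': 'CustomerId', 'KeyType': 'HASH'},
            {'AttributeName': 'OrderDate', 'KeyType': 'RANGE'}
        ],
        LocalSecondaryIndexes=[{
            'IndexName': 'OrderTotalIndex',
            'KeySchema': [
                {'AttributeName': 'CustomerId', 'KeyType': 'HASH'},   # same partition key
                {'AttributeName': 'OrderTotal', 'KeyType': 'RANGE'}   # alternate sort key
            ],
            'Projection': {'ProjectionType': 'KEYS_ONLY'}
        }],
        ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
    )

    # LSIs are queried via the Query API; strong consistency is supported
    ddb.query(
        TableName='Orders',
        IndexName='OrderTotalIndex',
        KeyConditionExpression='CustomerId = :c AND OrderTotal > :t',
        ExpressionAttributeValues={':c': {'S': 'cust-1'}, ':t': {'N': '100'}},
        ConsistentRead=True
    )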

DynamoDB Secondary Indexes - GSI vs LSI

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. In DynamoDB, a secondary index is a data structure that contains a subset of attributes from a table, along with an alternate key to support ____ operations.
    1. None of the above
    2. Both
    3. Query
    4. Scan
  2. In regard to DynamoDB, what is the Global secondary index?
    1. An index with a hash and range key that can be different from those on the table
    2. An index that has the same range key as the table, but a different hash key
    3. An index that has the same hash key and range key as the table
    4. An index that has the same hash key as the table, but a different range key
  3. In regard to DynamoDB, can I modify the index once it is created?
    1. Yes, if it is a primary hash key index
    2. Yes, if it is a Global secondary index
    3. No
    4. Yes, if it is a local secondary index
  4. When thinking of DynamoDB, what are true of Global Secondary Key properties?
    1. The partition key and sort key can be different from the table.
    2. Only the partition key can be different from the table.
    3. Either the partition key or the sort key can be different from the table, but not both.
    4. Only the sort key can be different from the table.

AWS Key Management Service – KMS – Certification

AWS Key Management Service – KMS

  • AWS KMS is a managed encryption service that enables encryption of data easily
  • KMS provides a highly available key storage, management, and auditing solution to encrypt the data across AWS services & within applications
  • KMS is seamlessly integrated with several other AWS services to make encrypting data in those services easy
  • KMS Keys are only stored and used in the region in which they are created. They cannot be transferred to another region
  • KMS enforces usage and management policies to control which IAM users and roles, from your account or other accounts, can manage and use keys
  • KMS is integrated with CloudTrail, so all requests to use the keys are logged to understand who used which key when
  • KMS allows rotation of the keys,
    • if keys generated by KMS are rotated automatically by KMS, data does not need to be re-encrypted. KMS keeps previous versions of keys to use for decryption of data encrypted under an old version of a key. All new encryption requests against a key in AWS KMS are encrypted under the newest version of the key.
    • if manually rotated, data has to be re-encrypted depending on the application’s configuration
    • Automatic key rotation is not supported for imported keys

KMS Working

  • KMS centrally manages and securely stores the keys
  • Keys can be generated or imported from your key management infrastructure
  • Keys can be used from within the applications and supported AWS services to protect the data, but the key never leaves AWS KMS.
  • Data is submitted to AWS KMS to be encrypted, or decrypted, under keys that you control.
  • Usage policies on these keys can be set that determine which users can use them to encrypt and decrypt data.

Envelope encryption

  • AWS cloud services integrated with AWS KMS use a method called envelope encryption to protect the data.
  • Envelope encryption is an optimized method for encrypting data that uses two different keys
  • With envelope encryption
    • A data key is generated and used by the AWS service to encrypt each piece of data or resource.
    • Data key is encrypted under a master key that you define in AWS KMS.
    • Encrypted data key is then stored by the AWS service.
    • For data decryption by the AWS service, the encrypted data key is passed to AWS KMS and decrypted under the master key it was originally encrypted under, so the service can then decrypt your data.
  • While KMS supports sending data of up to 4 KB to be encrypted directly, envelope encryption can offer significant performance benefits
  • When data is encrypted directly with KMS, it must be transferred over the network
  • Envelope encryption reduces the network load for the application or AWS cloud service, as only the request and fulfillment of the data key through KMS must go over the network
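
A minimal sketch of the envelope pattern, using boto3 for the KMS calls and the third-party cryptography package for the local encryption (the key alias is a placeholder):

    import base64
    import boto3
    from cryptography.fernet import Fernet  # third-party: pip install cryptography

    kms = boto3.client('kms')

    # 1. Generate a data key under a KMS master key; KMS returns the key both
    #    in plaintext and encrypted under the master key
    dk = kms.generate_data_key(KeyId='alias/my-app-key', KeySpec='AES_256')

    # 2. Encrypt the data locally with the plaintext data key, then discard it;
    #    only the encrypted data key needs to be stored alongside the data
    f = Fernet(base64.urlsafe_b64encode(dk['Plaintext']))
    ciphertext = f.encrypt(b'secret payload')
    encrypted_data_key = dk['CiphertextBlob']

    # 3. To decrypt, send only the small encrypted data key to KMS...
    plaintext_key = kms.decrypt(CiphertextBlob=encrypted_data_key)['Plaintext']

    # 4. ...and decrypt the data locally, so the payload itself never has to
    #    cross the network
    data = Fernet(base64.urlsafe_b64encode(plaintext_key)).decrypt(ciphertext)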

KMS Features

  • Create keys with a unique alias and description
  • Import your own keys
  • Control which IAM users and roles can manage keys
  • Control which IAM users and roles can use keys to encrypt & decrypt data
  • Choose to have AWS KMS automatically rotate keys on an annual basis
  • Temporarily disable keys so they cannot be used by anyone
  • Re-enable disabled keys
  • Delete keys that you no longer use
  • Audit use of keys by inspecting logs in AWS CloudTrail
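
A few of these operations as a minimal boto3 sketch (the alias is a placeholder):

    import boto3

    kms = boto3.client('kms')

    # Create a key with a description and a friendly alias
    key_id = kms.create_key(Description='app data key')['KeyMetadata']['KeyId']
    kms.create_alias(AliasName='alias/my-app-key', TargetKeyId=key_id)

    # Have KMS rotate the key automatically on an annual basis
    kms.enable_key_rotation(KeyId=key_id)

    # Temporarily disable, then re-enable, the key
    kms.disable_key(KeyId=key_id)
    kms.enable_key(KeyId=key_id)

    # Deletion is scheduled with a waiting period rather than immediate
    kms.schedule_key_deletion(KeyId=key_id, PendingWindowInDays=7)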

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. You are designing a personal document-archiving solution for your global enterprise with thousands of employees. Each employee has potentially gigabytes of data to be backed up in this archiving solution. The solution will be exposed to the employees as an application, where they can just drag and drop their files to the archiving system. Employees can retrieve their archives through a web interface. The corporate network has high bandwidth AWS Direct Connect connectivity to AWS. You have regulatory requirements that all data needs to be encrypted before being uploaded to the cloud. How do you implement this in a highly available and cost efficient way?
    1. Manage encryption keys on-premise in an encrypted relational database. Set up an on-premises server with sufficient storage to temporarily store files and then upload them to Amazon S3, providing a client-side master key. (Storing temporary increases cost and not a high availability option)
    2. Manage encryption keys in a Hardware Security Module (HSM) appliance on-premise server with sufficient storage to temporarily store, encrypt, and upload files directly into amazon Glacier. (Not cost effective)
    3. Manage encryption keys in amazon Key Management Service (KMS), upload to amazon simple storage service (s3) with client-side encryption using a KMS customer master key ID and configure Amazon S3 lifecycle policies to store each object using the amazon glacier storage tier. (With CSE-KMS the encryption happens at client side before the object is upload to S3 and KMS is cost effective as well)
    4. Manage encryption keys in an AWS CloudHSM appliance. Encrypt files prior to uploading on the employee desktop and then upload directly into amazon glacier (Not cost effective)
  2. An AWS customer is deploying an application that is composed of an Auto Scaling group of EC2 instances. The customer’s security policy requires that every outbound connection from these instances to any other service within the customer’s Virtual Private Cloud must be authenticated using a unique x.509 certificate that contains the specific instance-id. In addition, an x.509 certificate must be signed by the customer’s key management service in order to be trusted for authentication.
    Which of the following configurations will support these requirements?

    1. Configure an IAM Role that grants access to an Amazon S3 object containing a signed certificate and configure the Auto Scaling group to launch instances with this role. Have the instances bootstrap get the certificate from Amazon S3 upon first boot.
    2. Embed a certificate into the Amazon Machine Image that is used by the Auto Scaling group Have the launched instances generate a certificate signature request with the instance’s assigned instance-id to the Key management service for signature.
    3. Configure the Auto Scaling group to send an SNS notification of the launch of a new instance to the trusted key management service. Have the Key management service generate a signed certificate and send it directly to the newly launched instance.
    4. Configure the launched instances to generate a new certificate upon first boot. Have the Key management service poll the AutoScaling group for associated instances and send new instances a certificate signature that contains the specific instance-id.

IAM Role – Identity Providers and Federation – Certification

IAM Role – Identity Providers and Federation

  • Identity providers can be used to grant external user identities permissions to AWS resources without those identities having to be created within your AWS account.
  • External user identities can be authenticated either through the organization’s authentication system or through a well-known identity provider such as Login with Amazon, Google, etc.
  • Identity providers help keep the AWS account secure without the need to distribute or embed long-term security credentials in the application
  • To use an IdP, you create an IAM identity provider entity to establish a trust relationship between your AWS account and the IdP.
  • IAM supports IdPs that are compatible with OpenID Connect (OIDC) or SAML 2.0 (Security Assertion Markup Language 2.0)

Web Identity Federation

Complete Process Flow

IAM Web Identity Federation

  1. Mobile or Web Application needs to be configured with the IdP which gives each application a unique ID or client ID (also called audience)
  2. Create an Identity Provider entity for OIDC compatible IdP in IAM.
  3. Create IAM role and define the
    1. Trust policy –  specify the IdP (like Amazon) as the Principal (the trusted entity), and include a Condition that matches the IdP assigned app ID
    2. Permission policy – specify the permissions the application can assume
  4. Application calls the sign-in interface for the IdP to login
  5. IdP authenticates the user and returns an authentication token (OAuth access token or OIDC ID token) with information about the user to the application
  6. Application then makes an unsigned call to the STS service with the AssumeRoleWithWebIdentity action to request temporary security credentials.
  7. Application passes the IdP’s authentication token along with the Amazon Resource Name (ARN) for the IAM role created for that IdP.
  8. AWS verifies that the token is trusted and valid and if so, returns temporary security credentials (access key, secret access key, session token, expiry time) to the application that have the permissions for the role that you name in the request.
  9. STS response also includes metadata about the user from the IdP, such as the unique user ID that the IdP associates with the user.
  10. Using the Temporary credentials, the application makes signed requests to AWS
  11. User ID information from the identity provider can distinguish users in the app for e.g., objects can be put into S3 folders that include the user ID as prefixes or suffixes. This lets you create access control policies that lock the folder so only the user with that ID can access it.
  12. Application can cache the temporary security credentials and refresh them before their expiry accordingly. Temporary credentials, by default, are good for an hour.
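
As a rough sketch of steps 6-10, the boto3 snippet below exchanges an IdP-issued token for temporary credentials and uses them to make a signed S3 call; the role ARN, bucket name, and token value are hypothetical placeholders.

```python
import boto3

idp_token = "<OIDC ID token returned by the IdP in step 5>"  # placeholder

sts = boto3.client("sts")  # AssumeRoleWithWebIdentity is an unsigned call (step 6)

resp = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/WebIdentityAppRole",  # hypothetical role (step 7)
    RoleSessionName="app-user-session",
    WebIdentityToken=idp_token,
    DurationSeconds=3600,  # temporary credentials valid for 1 hour (the default)
)

creds = resp["Credentials"]                    # step 8
user_id = resp["SubjectFromWebIdentityToken"]  # IdP's unique user ID (steps 9 and 11)

# Signed request using the temporary credentials (step 10)
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3.list_objects_v2(Bucket="my-app-bucket", Prefix=f"users/{user_id}/")  # per-user prefix (placeholder bucket)
```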

Interactive Website provides a very good way to understand the flow

Mobile or Web Identity Federation with Cognito

Amazon Cognito

  • Use Amazon Cognito as the identity broker for almost all web identity federation scenarios
  • Amazon Cognito is easy to use and provides additional capabilities like anonymous (unauthenticated) access
  • Amazon Cognito also helps synchronizing user data across devices and providers

Web Identity Federation using Cognito

SAML 2.0-based Federation

  • AWS supports identity federation with SAML 2.0 (Security Assertion Markup Language 2.0), an open standard that many identity providers (IdPs) use.
  • SAML 2.0 based federation feature enables federated single sign-on (SSO), so users can log into the AWS Management Console or call the AWS APIs without having to create an IAM user for everyone in your organization.
  • By using SAML, the process of configuring federation with AWS can be simplified by using the IdP’s service instead of writing custom identity proxy code.
  • This is useful in organizations that have integrated their identity systems (such as Windows Active Directory or OpenLDAP) with software that can produce SAML assertions to provide information about user identity and permissions (such as Active Directory Federation Services or Shibboleth)

Complete Process Flow

SAML based Federation

  1. Create a SAML provider entity in AWS using the SAML metadata document provided by the organization’s IdP to establish a “trust” between your AWS account and the IdP
  2. SAML metadata document includes the issuer name, a creation date, an expiration date, and keys that AWS can use to validate authentication responses (assertions) from your organization.
  3. Create IAM roles which defines
    1. Trust policy with the SAML provider as the principal, which establishes a trust relationship between the organization and AWS
    2. Permission policy establishes what users from the organization are allowed to do in AWS
  4. SAML trust is completed by configuring the Organization’s IdP with information about AWS and the role(s) that you want the federated users to use. This is referred to as configuring relying party trust between your IdP and AWS
  5. Application calls the sign-in interface for the Organization IdP to login
  6. IdP authenticates the user and generates a SAML authentication response which includes assertions that identify the user and include attributes about the user
  7. Application then makes an unsigned call to the STS service with the AssumeRoleWithSAML action to request temporary security credentials (see the sketch after this list).
  8. Application passes the ARN of the SAML provider, the ARN of the role to assume, the SAML assertion about the current user returned by IdP and the time for which the credentials should be valid. An optional IAM Policy parameter can be provided to further restrict the permissions to the user
  9. AWS verifies that the SAML assertion is trusted and valid and if so, returns temporary security credentials (access key, secret access key, session token, expiry time) to the application that have the permissions for the role named in the request.
  10. STS response also includes metadata about the user from the IdP, such as the unique user ID that the IdP associates with the user.
  11. Using the Temporary credentials, the application makes signed requests to AWS to access the services
  12. Application can cache the temporary security credentials and refresh them before their expiry accordingly. Temporary credentials, by default, are good for an hour.
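
A minimal boto3 sketch of steps 7-9, with hypothetical role and SAML provider ARNs and a placeholder assertion:

```python
import boto3

saml_assertion = "<base64-encoded SAML assertion returned by the IdP in step 6>"  # placeholder

sts = boto3.client("sts")  # AssumeRoleWithSAML is also an unsigned call (step 7)

resp = sts.assume_role_with_saml(
    RoleArn="arn:aws:iam::123456789012:role/SAMLFederatedRole",       # hypothetical role (step 8)
    PrincipalArn="arn:aws:iam::123456789012:saml-provider/MyOrgIdP",  # hypothetical SAML provider
    SAMLAssertion=saml_assertion,
    DurationSeconds=3600,  # how long the credentials should be valid
)

creds = resp["Credentials"]  # access key, secret access key, session token, expiry time (step 9)
```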

SAML 2.0 based federation can also be used to grant access to the federated users to the AWS Management console. This requires the use of the AWS SSO endpoint instead of directly calling the AssumeRoleWithSAML API. The endpoint calls the API for the user and returns a URL that automatically redirects the user’s browser to the AWS Management Console.

Complete Process Flow

SAML based SSO to AWS Console

  1. User browses to the organization’s portal and selects the option to go to the AWS Management Console.
  2. Portal performs the function of the identity provider (IdP) that handles the exchange of trust between the organization and AWS.
  3. Portal verifies the user’s identity in the organization.
  4. Portal generates a SAML authentication response that includes assertions that identify the user and include attributes about the user.
  5. Portal sends this response to the client browser.
  6. Client browser is redirected to the AWS SSO endpoint and posts the SAML assertion.
  7. AWS SSO endpoint handles the call for the AssumeRoleWithSAML API action on the user’s behalf and requests temporary security credentials from STS and creates a console sign-in URL that uses those credentials.
  8. AWS sends the sign-in URL back to the client as a redirect.
  9. Client browser is redirected to the AWS Management Console. If the SAML authentication response includes attributes that map to multiple IAM roles, the user is first prompted to select the role to use for access to the console.

Custom Identity broker Federation


  • If the organization doesn’t have a SAML-compatible IdP, a Custom Identity Broker can be used to provide the access
  • Custom Identity Broker should perform the following steps (sketched in code after this list)
    • Verify that the user is authenticated by the local identity system.
    • Call the AWS Security Token Service (AWS STS) AssumeRole (recommended) or GetFederationToken (default expiration of 12 hours, maximum of 36 hours) APIs to obtain temporary security credentials for the user.
    • Temporary credentials limit the permissions the user has on the AWS resources
    • Call an AWS federation endpoint and supply the temporary security credentials to get a sign-in token.
    • Construct a URL for the console that includes the token.
    • URL that the federation endpoint provides is valid for 15 minutes after it is created.
    • Give the URL to the user or invoke the URL on the user’s behalf.
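
The sketch below strings these broker steps together using the documented federation endpoint pattern, assuming the user has already been authenticated against the local identity system; the user name, scoped-down policy, Issuer, and Destination values are placeholders.

```python
import json
import urllib.parse

import boto3
import requests

sts = boto3.client("sts")  # the broker itself runs with long-term credentials

# Obtain temporary credentials for the federated user (scoped down by the policy)
token = sts.get_federation_token(
    Name="federated-user",  # hypothetical display name
    Policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "*"}],
    }),
    DurationSeconds=3600,
)
creds = token["Credentials"]

# Exchange the temporary credentials for a sign-in token at the federation endpoint
session_json = json.dumps({
    "sessionId": creds["AccessKeyId"],
    "sessionKey": creds["SecretAccessKey"],
    "sessionToken": creds["SessionToken"],
})
signin_token = requests.get(
    "https://signin.aws.amazon.com/federation",
    params={"Action": "getSigninToken", "Session": session_json},
).json()["SigninToken"]

# Construct the console URL (valid for 15 minutes) and hand it to the user
login_url = (
    "https://signin.aws.amazon.com/federation?Action=login"
    "&Issuer=" + urllib.parse.quote("https://example.com")       # placeholder issuer
    + "&Destination=" + urllib.parse.quote("https://console.aws.amazon.com/")
    + "&SigninToken=" + signin_token
)
```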

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A photo-sharing service stores pictures in Amazon Simple Storage Service (S3) and allows application sign-in using an OpenID Connect-compatible identity provider. Which AWS Security Token Service approach to temporary access should you use for the Amazon S3 operations?
    1. SAML-based Identity Federation
    2. Cross-Account Access
    3. AWS IAM users
    4. Web Identity Federation
  2. Which technique can be used to integrate AWS IAM (Identity and Access Management) with an on-premise LDAP (Lightweight Directory Access Protocol) directory service?
    1. Use an IAM policy that references the LDAP account identifiers and the AWS credentials.
    2. Use SAML (Security Assertion Markup Language) to enable single sign-on between AWS and LDAP
    3. Use AWS Security Token Service from an identity broker to issue short-lived AWS credentials. (Refer Link)
    4. Use IAM roles to automatically rotate the IAM credentials when LDAP credentials are updated.
    5. Use the LDAP credentials to restrict a group of users from launching specific EC2 instance types.
  3. You are designing a photo-sharing mobile app. The application will store all pictures in a single Amazon S3 bucket. Users will upload pictures from their mobile device directly to Amazon S3 and will be able to view and download their own pictures directly from Amazon S3. You want to configure security to handle potentially millions of users in the most secure manner possible. What should your server-side application do when a new user registers on the photo-sharing mobile application? [PROFESSIONAL]
    1. Create a set of long-term credentials using AWS Security Token Service with appropriate permissions. Store these credentials in the mobile app and use them to access Amazon S3.
    2. Record the user’s information in Amazon RDS and create a role in IAM with appropriate permissions. When the user uses their mobile app, create temporary credentials using the AWS Security Token Service ‘AssumeRole’ function. Store these credentials in the mobile app’s memory and use them to access Amazon S3. Generate new credentials the next time the user runs the mobile app.
    3. Record the user’s information in Amazon DynamoDB. When the user uses their mobile app, create temporary credentials using AWS Security Token Service with appropriate permissions. Store these credentials in the mobile app’s memory and use them to access Amazon S3. Generate new credentials the next time the user runs the mobile app.
    4. Create an IAM user. Assign appropriate permissions to the IAM user. Generate an access key and secret key for the IAM user, store them in the mobile app and use these credentials to access Amazon S3.
    5. Create an IAM user. Update the bucket policy with appropriate permissions for the IAM user. Generate an access key and secret key for the IAM user, store them in the mobile app and use these credentials to access Amazon S3.
  4. Your company has recently extended its datacenter into a VPC on AWS to add burst computing capacity as needed. Members of your Network Operations Center need to be able to go to the AWS Management Console and administer Amazon EC2 instances as necessary. You don’t want to create new IAM users for each NOC member and make those users sign in again to the AWS Management Console. Which option below will meet the needs for your NOC members? [PROFESSIONAL]
    1. Use OAuth 2.0 to retrieve temporary AWS security credentials to enable your NOC members to sign in to the AWS Management Console.
    2. Use Web Identity Federation to retrieve AWS temporary security credentials to enable your NOC members to sign in to the AWS Management Console.
    3. Use your on-premises SAML 2.0-compliant identity provider (IdP) to grant the NOC members federated access to the AWS Management Console via the AWS single sign-on (SSO) endpoint.
    4. Use your on-premises SAML 2.0-compliant identity provider (IDP) to retrieve temporary security credentials to enable NOC members to sign in to the AWS Management Console
  5. A corporate web application is deployed within an Amazon Virtual Private Cloud (VPC) and is connected to the corporate data center via an IPsec VPN. The application must authenticate against the on-premises LDAP server. After authentication, each logged-in user can only access an Amazon Simple Storage Service (S3) keyspace specific to that user. Which two approaches can satisfy these objectives? (Choose 2 answers) [PROFESSIONAL]
    1. Develop an identity broker that authenticates against IAM Security Token Service to assume an IAM role in order to get temporary AWS security credentials. The application calls the identity broker to get AWS temporary security credentials with access to the appropriate S3 bucket. (Needs to authenticate against LDAP and not IAM)
    2. The application authenticates against LDAP and retrieves the name of an IAM role associated with the user. The application then calls the IAM Security Token Service to assume that IAM role. The application can use the temporary credentials to access the appropriate S3 bucket. (Authenticates with LDAP and calls the AssumeRole)
    3. Develop an identity broker that authenticates against LDAP and then calls IAM Security Token Service to get IAM federated user credentials The application calls the identity broker to get IAM federated user credentials with access to the appropriate S3 bucket. (Custom Identity broker implementation, with authentication with LDAP and using federated token)
    4. The application authenticates against LDAP. The application then calls the AWS Identity and Access Management (IAM) Security Token Service to log in to IAM using the LDAP credentials. The application can use the IAM temporary credentials to access the appropriate S3 bucket. (Can’t log in to IAM using LDAP credentials)
    5. The application authenticates against IAM Security Token Service using the LDAP credentials. The application uses those temporary AWS security credentials to access the appropriate S3 bucket. (Need to authenticate with LDAP)
  6. Company B is launching a new game app for mobile devices. Users will log into the game using their existing social media account to streamline data capture. Company B would like to directly save player data and scoring information from the mobile app to a DynamoDB table named Score Data. When a user saves their game, the progress data will be stored to the Game State S3 bucket. What is the best approach for storing data to DynamoDB and S3? [PROFESSIONAL]
    1. Use an EC2 Instance that is launched with an EC2 role providing access to the Score Data DynamoDB table and the GameState S3 bucket that communicates with the mobile app via web services.
    2. Use temporary security credentials that assume a role providing access to the Score Data DynamoDB table and the Game State S3 bucket using web identity federation
    3. Use Login with Amazon allowing users to sign in with an Amazon account providing the mobile app with access to the Score Data DynamoDB table and the Game State S3 bucket.
    4. Use an IAM user with access credentials assigned a role providing access to the Score Data DynamoDB table and the Game State S3 bucket for distribution with the mobile app.
  7. A user has created a mobile application which makes calls to DynamoDB to fetch certain data. The application is using the DynamoDB SDK and root account access/secret access key to connect to DynamoDB from mobile. Which of the below mentioned statements is true with respect to the best practice for security in this scenario?
    1. User should create a separate IAM user for each mobile application and provide DynamoDB access with it
    2. User should create an IAM role with DynamoDB and EC2 access. Attach the role with EC2 and route all calls from the mobile through EC2
    3. The application should use an IAM role with web identity federation which validates calls to DynamoDB with identity providers, such as Google, Amazon, and Facebook
    4. Create an IAM Role with DynamoDB access and attach it with the mobile application
  8. You are managing the AWS account of a big organization. The organization has more than 1,000 employees and wants to provide access to various services to most of the employees. Which of the below mentioned options is the best possible solution in this case?
    1. The user should create a separate IAM user for each employee and provide access to them as per the policy
    2. The user should create an IAM role and attach STS with the role. The user should attach that role to the EC2 instance and setup AWS authentication on that server
    3. The user should create IAM groups as per the organization’s departments and add each user to the group for better access control
    4. Attach an IAM role with the organization’s authentication service to authorize each user for various AWS services
  9. Your Fortune 500 company has undertaken a TCO analysis evaluating the use of Amazon S3 versus acquiring more hardware. The outcome was that all employees would be granted access to use Amazon S3 for storage of their personal documents. Which of the following will you need to consider so you can set up a solution that incorporates single sign-on from your corporate AD or LDAP directory and restricts access for each user to a designated user folder in a bucket? (Choose 3 Answers) [PROFESSIONAL]
    1. Setting up a federation proxy or identity provider
    2. Using AWS Security Token Service to generate temporary tokens
    3. Tagging each folder in the bucket
    4. Configuring IAM role
    5. Setting up a matching IAM user for every user in your corporate directory that needs access to a folder in the bucket
  10. An AWS customer is deploying a web application that is composed of a front-end running on Amazon EC2 and of confidential data that is stored on Amazon S3. The customer’s security policy requires that all access operations to this sensitive data must be authenticated and authorized by a centralized access management system that is operated by a separate security team. In addition, the web application team that owns and administers the EC2 web front-end instances is prohibited from having any ability to access the data in a way that circumvents this centralized access management system. Which of the following configurations will support these requirements? [PROFESSIONAL]
    1. Encrypt the data on Amazon S3 using a CloudHSM that is operated by the separate security team. Configure the web application to integrate with the CloudHSM for decrypting approved data access operations for trusted end-users. (S3 doesn’t integrate directly with CloudHSM, also there is no centralized access management system control)
    2. Configure the web application to authenticate end-users against the centralized access management system. Have the web application provision trusted users with STS tokens entitling the download of approved data directly from Amazon S3 (Controlled access, and admins cannot access the data as it needs authentication)
    3. Have the separate security team create an IAM role that is entitled to access the data on Amazon S3. Have the web application team provision their instances with this role while denying their IAM users access to the data on Amazon S3 (Web team would have access to the data)
    4. Configure the web application to authenticate end-users against the centralized access management system using SAML. Have the end-users authenticate to IAM using their SAML token and download the approved data directly from S3. (Not the way SAML auth works, and it is not certain the centralized access management system is SAML-compliant)
  11. What is web identity federation?
    1. Use of an identity provider like Google or Facebook to become an AWS IAM User.
    2. Use of an identity provider like Google or Facebook to exchange for temporary AWS security credentials.
    3. Use of AWS IAM User tokens to log in as a Google or Facebook user.
    4. Use of AWS STS Tokens to log in as a Google or Facebook user.
  12. Games-R-Us is launching a new game app for mobile devices. Users will log into the game using their existing Facebook account and the game will record player data and scoring information directly to a DynamoDB table. What is the most secure approach for signing requests to the DynamoDB API?
    1. Create an IAM user with access credentials that are distributed with the mobile app to sign the requests
    2. Distribute the AWS root account access credentials with the mobile app to sign the requests
    3. Request temporary security credentials using web identity federation to sign the requests
    4. Establish cross account access between the mobile app and the DynamoDB table to sign the requests
  13. You are building a mobile app for consumers to post cat pictures online. You will be storing the images in AWS S3. You want to run the system very cheaply and simply. Which one of these options allows you to build a photo sharing application without needing to worry about scaling expensive uploads processes, authentication/authorization and so forth?
    1. Build the application out using AWS Cognito and web identity federation to allow users to log in using Facebook or Google Accounts. Once they are logged in, the secret token passed to that user is used to directly access resources on AWS, like AWS S3. (Amazon Cognito is a superset of the functionality provided by web identity federation. Refer link)
    2. Use JWT or SAML compliant systems to build authorization policies. Users log in with a username and password, and are given a token they can use indefinitely to make calls against the photo infrastructure.
    3. Use AWS API Gateway with a constantly rotating API Key to allow access from the client-side. Construct a custom build of the SDK and include S3 access in it.
    4. Create an AWS OAuth service domain and grant public signup and access to the domain. During setup, add at least one major social media site as a trusted Identity Provider for users.
  14. The Marketing Director in your company asked you to create a mobile app that lets users post sightings of good deeds known as random acts of kindness in 80-character summaries. You decided to write the application in JavaScript so that it would run on the broadest range of phones, browsers, and tablets. Your application should provide access to Amazon DynamoDB to store the good deed summaries. Initial testing of a prototype shows that there aren’t large spikes in usage. Which option provides the most cost-effective and scalable architecture for this application? [PROFESSIONAL]
    1. Provide the JavaScript client with temporary credentials from the Security Token Service using a Token Vending Machine (TVM) on an EC2 instance to provide signed credentials mapped to an Amazon Identity and Access Management (IAM) user allowing DynamoDB puts and S3 gets. You serve your mobile application out of an S3 bucket enabled as a web site. Your client updates DynamoDB. (Single EC2 instance not a scalable architecture)
    2. Register the application with a Web Identity Provider like Amazon, Google, or Facebook, create an IAM role for that provider, and set up permissions for the IAM role to allow S3 gets and DynamoDB puts. You serve your mobile application out of an S3 bucket enabled as a web site. Your client updates DynamoDB. (Can work with JavaScript SDK, is scalable and cost effective)
    3. Provide the JavaScript client with temporary credentials from the Security Token Service using a Token Vending Machine (TVM) to provide signed credentials mapped to an IAM user allowing DynamoDB puts. You serve your mobile application out of Apache EC2 instances that are load-balanced and autoscaled. Your EC2 instances are configured with an IAM role that allows DynamoDB puts. Your server updates DynamoDB. (Is Scalable but Not cost effective)
    4. Register the JavaScript application with a Web Identity Provider like Amazon, Google, or Facebook, create an IAM role for that provider, and set up permissions for the IAM role to allow DynamoDB puts. You serve your mobile application out of Apache EC2 instances that are load-balanced and autoscaled. Your EC2 instances are configured with an IAM role that allows DynamoDB puts. Your server updates DynamoDB. (Is Scalable but Not cost effective)

References

AWS IAM User Guide – Identity Providers and Federation

AWS Elastic Map Reduce – EMR – Certification

AWS EMR

  • Amazon EMR is a web service that utilizes a hosted Hadoop framework running on the web-scale infrastructure of EC2 and S3
  • EMR enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data
  • EMR
    • uses Apache Hadoop as its distributed data processing engine, which is an open source, Java software that supports data-intensive distributed applications running on large clusters of commodity hardware
    • is ideal for problems that necessitate fast and efficient processing of large amounts of data
    • lets the focus be on crunching or analyzing big data without having to worry about time-consuming set-up, management or tuning of Hadoop clusters or the compute capacity
    • can help perform data-intensive tasks for applications such as web indexing, data mining, log file analysis, machine learning, financial analysis, scientific simulation, and bioinformatics research etc
    • provides web service interface to launch the clusters and monitor processing-intensive computation on clusters
    • is a batch-processing framework that measures the common processing time duration in hours to days; if the use case requires processing in real time or within minutes, Apache Spark or Storm would be a better option
  • EMR seamlessly supports On-Demand, Spot, and Reserved Instances
  • EMR launches all nodes for a given cluster in the same EC2 Availability Zone, which improves performance as it provides higher data access rate
  • EMR supports different EC2 instance types including Standard, High CPU, High Memory, Cluster Compute, High I/O, and High Storage
    • Standard Instances have memory to CPU ratios suitable for most general-purpose applications.
    • High CPU instances have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications
    • High Memory instances offer large memory sizes for high throughput applications
    • Cluster Compute instances have proportionally high CPU with increased network performance and are well suited for High Performance Compute (HPC) applications and other demanding network-bound applications
    • High Storage instances offer 48 TB of storage across 24 disks and are ideal for applications that require sequential access to very large data sets such as data warehousing and log processing
  • EMR charges in hourly increments i.e. once the cluster is running, charges apply for the entire hour
  • EMR integrates with CloudTrail to record AWS API calls

NOTE: Topic mainly for Solution Architect Professional Exam Only

EMR Architecture

  • Amazon EMR uses industry proven, fault-tolerant Hadoop software as its data processing engine
  • Hadoop is an open source, Java software that supports data-intensive distributed applications running on large clusters of commodity hardware
  • Hadoop splits the data into multiple subsets and assigns each subset to more than one EC2 instance. So, if an EC2 instance fails to process one subset of data, the results of another Amazon EC2 instance can be used
  • EMR consists of Master node, one or more Slave nodes
    • Master Node
      • EMR currently does not support automatic failover of the master nodes or master node state recovery
      • If master node goes down, the EMR cluster will be terminated and the job needs to be re-executed
    • Slave Nodes – Core nodes and Task nodes
      • Core nodes
        • host persistent data using Hadoop Distributed File System (HDFS) and run Hadoop tasks
        • can be increased in an existing cluster
      • Task nodes
        • only run Hadoop tasks
        • can be increased or decreased in an existing cluster
      • EMR is fault tolerant for slave failures and continues job execution if a slave node goes down.
      • Currently, EMR does not automatically provision another node to take over failed slaves
  • EMR supports Bootstrap actions, which
    • allow users to run custom set-up prior to the execution of the cluster
    • can be used to install software or configure instances before running the cluster (see the sketch below)
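
As a rough illustration, the boto3 sketch below launches a cluster with a bootstrap action; the release label, instance types, script path, and role names are placeholder assumptions.

```python
import boto3

emr = boto3.client("emr")

emr.run_job_flow(
    Name="example-cluster",
    ReleaseLabel="emr-5.36.0",                 # placeholder release label
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,                    # 1 master + 2 core nodes
        "KeepJobFlowAliveWhenNoSteps": False,  # transient cluster: shut down after steps finish
    },
    BootstrapActions=[{
        "Name": "install-custom-software",
        # Runs on every node before Hadoop starts (placeholder script)
        "ScriptBootstrapAction": {"Path": "s3://my-bucket/bootstrap.sh"},
    }],
    JobFlowRole="EMR_EC2_DefaultRole",  # default instance profile name
    ServiceRole="EMR_DefaultRole",      # default service role name
)
```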

EMR Security

  • EMR cluster starts with different security groups for Master and Slaves
    • Master security group
      • has a port open for communication with the service.
      • has a SSH port open to allow direct SSH into the instances, using the key specified at startup
    • Slave security group
      • only allows interaction with the master instance
      • SSH access to the slave nodes is obtained by first SSHing into the master node and then SSHing from there to the slave node
    • Security groups can be configured with different access rules

EMR Security Encryption

  • EMR enables the use of a security configuration
    • which helps to encrypt data at-rest, data in-transit, or both
    • can be used to specify settings for S3 encryption with the EMR file system (EMRFS), local disk encryption, and in-transit encryption
    • is stored in EMR rather than the cluster configuration, making it reusable (a creation sketch follows this section)
    • gives flexibility to choose from several options, including keys managed by AWS KMS, keys managed by S3, and keys and certificates from custom providers that you supply
  • At-rest Encryption for S3 with EMRFS
    • EMRFS supports Server-side (SSE-S3, SSE-KMS) and Client-side encryption (CSE-KMS or CSE-Custom)
    • S3 SSE and CSE encryption with EMRFS are mutually exclusive; either one can be selected but not both
    • Transport layer security (TLS) encrypts EMRFS objects in-transit between EMR cluster nodes & S3
  • At-rest Encryption for Local Disks
    • Open-source HDFS Encryption
      • HDFS exchanges data between cluster instances during distributed processing, and also reads from and writes data to instance store volumes and the EBS volumes attached to instances
      • Open-source Hadoop encryption options are activated
        • Secure Hadoop RPC is set to “Privacy”, which uses the Simple Authentication and Security Layer (SASL).
        • Data encryption on HDFS block data transfer is set to true and is configured to use AES 256 encryption.
    • LUKS – in addition to HDFS encryption, the Amazon EC2 instance store volumes (except boot volumes) and the attached Amazon EBS volumes of cluster instances are encrypted using LUKS
  • In-Transit Data Encryption
    • Encryption artifacts for in-transit encryption can be supplied in one of two ways:
      • either by providing a zipped file of certificates that you upload to S3,
      • or by referencing a custom Java class that provides encryption artifacts
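
A hedged boto3 sketch of creating such a reusable security configuration, enabling SSE-S3 for EMRFS, KMS-based local disk encryption, and in-transit encryption from a zipped certificate file; the KMS key ARN and S3 paths are placeholders.

```python
import json

import boto3

emr = boto3.client("emr")

emr.create_security_configuration(
    Name="example-security-config",
    SecurityConfiguration=json.dumps({
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "AtRestEncryptionConfiguration": {
                # S3 (EMRFS) encryption: SSE-S3 here; SSE-KMS / CSE are alternatives
                "S3EncryptionConfiguration": {"EncryptionMode": "SSE-S3"},
                # Local disk encryption via a KMS key (placeholder ARN)
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/placeholder",
                },
            },
            "EnableInTransitEncryption": True,
            "InTransitEncryptionConfiguration": {
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": "s3://my-bucket/certs.zip",  # zipped certificates (placeholder)
                },
            },
        },
    }),
)
```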

EMR Cluster Types

  • EMR has two cluster types, transient and persistent
  • Transient EMR Clusters
    • Transient EMR clusters are clusters that shut down when the job or the steps (series of jobs) are complete
    • Transient EMR clusters can be used in situations
      • where the total number of EMR processing hours per day is < 24 hours and it’s beneficial to shut down the cluster when it’s not being used.
      • where S3, rather than HDFS, is used as the primary data storage.
      • where job processing is intensive, iterative data processing.
  • Persistent EMR Clusters
    • Persistent EMR clusters continue to run after the data processing job is complete
    • Persistent EMR clusters can be used in situations
      • frequently run processing jobs where it’s beneficial to keep the cluster running after the previous job.
      • processing jobs have an input-output dependency on one another.
      • In rare cases when it is more cost effective to store the data on HDFS instead of S3

EMR Best Practices

  • Data Migration
    • Two tools – S3DistCp and DistCp – can be used to move data stored on the local (data center) HDFS storage to S3, from S3 to HDFS, and from local disk (non-HDFS) to S3
    • AWS Import/Export and Direct Connect can also be considered for moving data
  • Data Collection
    • Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, & moving large amounts of log data
    • Flume agents can be installed on the data sources (web-servers, app servers etc) and data shipped to the collectors which can then be stored in persistent storage like S3 or HDFS
  • Data Aggregation
    • Data aggregation refers to techniques for gathering individual data records (for e.g. log records) and combining them into a large bundle of data files i.e. creating a large file from small files
    • Hadoop, on which EMR runs, generally performs better with fewer large files compared to many small files
    • Hadoop splits the file on HDFS on multiple nodes, while for the data in S3 it uses the HTTP Range header query to split the files which helps improve performance by supporting parallelization
    • Log collectors like Flume and Fluentd can be used to aggregate data before copying it to the final destination (S3 or HDFS)
    • Data aggregation has following benefits
      • Improves data ingest scalability by reducing the number of times needed to upload data to AWS
      • Reduces the number of files stored on S3 (or HDFS), which inherently helps provide better performance when processing data
      • Provides a better compression ratio as compressing large, highly compressible files is often more effective than compressing a large number of smaller files.
  • Data compression
    • Data compression can be used at the input as well as intermediate outputs from the mappers
    • Data compression helps
      • Lower storage costs
      • Lower bandwidth cost for data transfer
      • Better data processing performance by moving less data between data storage location, mappers, and reducers
      • Better data processing performance by compressing the data that EMR writes to disk, i.e. achieving better performance by writing to disk less frequently
    • Data compression can have an impact on the Hadoop data-splitting logic, as some compression techniques like gzip are not splittable
    • Data Compression Techniques
  • Data Partitioning
    • Data partitioning helps in data optimizations and lets you create unique buckets of data and eliminate the need for a data processing job to read the entire data set
    • Data can be partitioned by
      • Data type (time series)
      • Data processing frequency (per hour, per day, etc.)
      • Data access and query pattern (query on time vs. query on geo location)
  • Cost Optimization
    • AWS offers different pricing models for EC2 instances
      • On-Demand instances
        • are a good option if using transient EMR jobs or if the EMR hourly usage is less than 17% of the time
      • Reserved instances
        • are a good option for a persistent EMR cluster or if the EMR hourly usage is more than 17% of the time, as they are more cost-effective
      • Spot instances
        • can be a cost-effective mechanism to add compute capacity
        • can be used where the data persists on S3
        • can be used to add extra task capacity with Task nodes, and
        • are not suited for the Master node (if it is lost, the cluster is lost) or for Core nodes (they host data; if lost, the data needs to be recovered to rebalance the HDFS cluster)
    • A common architecture pattern can be used (sketched below):
      • Run master node on On-Demand or Reserved Instances (if running persistent EMR clusters).
      • Run a portion of the EMR cluster on core nodes using On-Demand or Reserved Instances and
      • the rest of the cluster on task nodes using Spot Instances.
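
A sketch of this pattern using boto3 instance groups follows; the instance types, counts, and Spot bid price are placeholder assumptions.

```python
import boto3

emr = boto3.client("emr")

emr.run_job_flow(
    Name="cost-optimized-cluster",
    ReleaseLabel="emr-5.36.0",  # placeholder release label
    Instances={
        "InstanceGroups": [
            # Master and Core on On-Demand, since losing them loses the cluster/data
            {"Name": "master", "InstanceRole": "MASTER",
             "Market": "ON_DEMAND", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE",
             "Market": "ON_DEMAND", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            # Task nodes on Spot: extra capacity that can be lost safely
            {"Name": "task", "InstanceRole": "TASK",
             "Market": "SPOT", "BidPrice": "0.10", "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```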

EMR – S3 vs HDFS

  • Storing data on S3 provides several benefits
    • inherent features like high availability, durability, lifecycle management, data encryption, and archival of data to Glacier
    • cost-effectiveness, as storing data in S3 is cheaper compared to HDFS with its replication factor
    • ability to use Transient EMR clusters and shut down the clusters after the job is completed, with data being maintained in S3
    • ability to use Spot instances without having to worry about losing them at any time
    • data durability against HDFS node failures, even where node failures exceed the HDFS replication factor
    • easier data ingestion, as ingesting a high-throughput data stream to S3 is much simpler than ingesting to HDFS

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You require the ability to analyze a large amount of data, which is stored on Amazon S3 using Amazon Elastic Map Reduce. You are using the cc2.8xlarge instance type, whose CPUs are mostly idle during processing. Which of the below would be the most cost-efficient way to reduce the runtime of the job? [PROFESSIONAL]
    1. Create smaller files on Amazon S3.
    2. Add additional cc2.8xlarge instances by introducing a task group.
    3. Use smaller instances that have higher aggregate I/O performance.
    4. Create fewer, larger files on Amazon S3.
  2. A customer’s nightly EMR job processes a single 2-TB data file stored on Amazon Simple Storage Service (S3). The Amazon Elastic Map Reduce (EMR) job runs on two On-Demand core nodes and three On-Demand task nodes. Which of the following may help reduce the EMR job completion time? Choose 2 answers [PROFESSIONAL]
    1. Use three Spot Instances rather than three On-Demand instances for the task nodes.
    2. Change the input split size in the MapReduce job configuration.
    3. Use a bootstrap action to present the S3 bucket as a local filesystem.
    4. Launch the core nodes and task nodes within an Amazon Virtual Private Cloud.
    5. Adjust the number of simultaneous mapper tasks.
    6. Enable termination protection for the job flow.
  3. Your department creates regular analytics reports from your company’s log files. All log data is collected in Amazon S3 and processed by daily Amazon Elastic Map Reduce (EMR) jobs that generate daily PDF reports and aggregated tables in CSV format for an Amazon Redshift data warehouse. Your CFO requests that you optimize the cost structure for this system. Which of the following alternatives will lower costs without compromising average performance of the system or data integrity for the raw data? [PROFESSIONAL]
    1. Use reduced redundancy storage (RRS) for PDF and CSV data in Amazon S3. Add Spot instances to Amazon EMR jobs. Use Reserved Instances for Amazon Redshift. (Only Spot instances impacts performance)
    2. Use reduced redundancy storage (RRS) for all data in S3. Use a combination of Spot instances and Reserved Instances for Amazon EMR jobs. Use Reserved Instances for Amazon Redshift (Combination of Spot and Reserved Instances will guarantee performance and help reduce cost. Also, RRS would reduce cost and guarantee data integrity, which is different from data durability)
    3. Use reduced redundancy storage (RRS) for all data in Amazon S3. Add Spot Instances to Amazon EMR jobs. Use Reserved Instances for Amazon Redshift (Only Spot instances impacts performance)
    4. Use reduced redundancy storage (RRS) for PDF and CSV data in S3. Add Spot Instances to EMR jobs. Use Spot Instances for Amazon Redshift. (Spot instances impacts performance and Spot instance not available for Redshift)
  4. A research scientist is planning for the one-time launch of an Elastic MapReduce cluster and is encouraged by her manager to minimize the costs. The cluster is designed to ingest 200TB of genomics data with a total of 100 Amazon EC2 instances and is expected to run for around four hours. The resulting data set must be stored temporarily until archived into an Amazon RDS Oracle instance. Which option will help save the most money while meeting requirements? [PROFESSIONAL]
    1. Store ingest and output files in Amazon S3. Deploy on-demand for the master and core nodes and spot for the task nodes.
    2. Optimize by deploying a combination of on-demand, RI and spot-pricing models for the master, core and task nodes. Store ingest and output files in Amazon S3 with a lifecycle policy that archives them to Amazon Glacier. (Master and Core must be RI or On Demand. Cannot be Spot)
    3. Store the ingest files in Amazon S3 RRS and store the output files in S3. Deploy Reserved Instances for the master and core nodes and on-demand for the task nodes. (Need better durability for ingest file. Spot instances can be used for task nodes for cost saving. RI will not provide cost saving in this case)
    4. Deploy on-demand master, core and task nodes and store ingest and output files in Amazon S3 RRS (Input should be in S3 standard, as re-ingesting the input data might end up being more costly than holding the data for a limited time in standard S3)
  5. Your company sells consumer devices and needs to record the first activation of all sold devices. Devices are not activated until the information is written on a persistent database. Activation data is very important for your company and must be analyzed daily with a MapReduce job. The execution time of the data analysis process must be less than three hours per day. Devices are usually sold evenly during the year, but when a new device model is out, there is a predictable peak in activations, that is, for a few days there are 10 times or even 100 times more activations than on an average day. Which of the following databases and analysis framework would you implement to better optimize costs and performance for this workload? [PROFESSIONAL]
    1. Amazon RDS and Amazon Elastic MapReduce with Spot instances.
    2. Amazon DynamoDB and Amazon Elastic MapReduce with Spot instances.
    3. Amazon RDS and Amazon Elastic MapReduce with Reserved instances.
    4. Amazon DynamoDB and Amazon Elastic MapReduce with Reserved instances

AWS ElastiCache – Certification

AWS ElastiCache

  • AWS ElastiCache is a managed web service that lets you easily deploy and run Memcached or Redis protocol-compliant cache clusters in the cloud
  • ElastiCache is available in two flavours: Memcached and Redis
  • ElastiCache helps
    • simplify and offload the management, monitoring, and operation of in-memory cache environments, enabling the engineering resources to focus on developing applications
    • automate common administrative tasks required to operate a distributed cache environment
    • improve the performance of web applications by allowing retrieval of information from a fast, managed, in-memory caching system, instead of relying entirely on slower disk-based databases
    • not only improve load & response times to user actions and queries, but also reduce the cost associated with scaling web applications
    • automatically detect and replace failed cache nodes, providing a resilient system that mitigates the risk of overloaded databases, which can slow website and application load times
    • provide enhanced visibility into key performance metrics associated with the cache nodes through integration with CloudWatch
  • Code, applications, and popular tools already using Memcached or Redis environments work seamlessly, as ElastiCache is protocol-compliant with Memcached and Redis
  • ElastiCache provides in-memory caching which can
    • significantly improve latency and throughput for many
      • read-heavy application workloads for e.g. social networking, gaming, media sharing and Q&A portals or
      • compute-intensive workloads such as a recommendation engine
    • improve application performance by storing critical pieces of data in memory for low-latency access.
    • be used to cache results of I/O-intensive database queries or the results of computationally-intensive calculations (a cache-aside sketch follows this list).
  • ElastiCache currently allows access only from the EC2 network and cannot be accessed from outside networks like on-premises servers
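
To make the caching idea concrete, here is an illustrative cache-aside sketch against a Redis endpoint using the redis-py client; the endpoint, key scheme, TTL, and fetch_from_db stub are all hypothetical.

```python
import json

import redis

# Placeholder ElastiCache Redis endpoint
cache = redis.Redis(host="my-cluster.abc123.0001.use1.cache.amazonaws.com", port=6379)

def fetch_from_db(user_id):
    """Stub for a slower disk-based database query (hypothetical)."""
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)        # fast in-memory lookup
    if cached is not None:
        return json.loads(cached)  # cache hit
    user = fetch_from_db(user_id)  # cache miss: fall back to the database
    cache.setex(key, 300, json.dumps(user))  # cache the result for 5 minutes
    return user
```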

AWS ElastiCache Redis vs Memcached

Redis

  • Redis is an open source, BSD licensed, advanced key-value cache & store
  • ElastiCache enables the management, monitoring and operation of a Redis node; creation, deletion and modification of the node
  • ElastiCache for Redis can be used as a primary in-memory key-value data store, providing fast, sub millisecond data performance, high availability and scalability up to 16 nodes plus up to 5 read replicas, each of up to 3.55 TiB of in-memory data
  • ElastiCache for Redis supports
    • Redis Master/Slave replication.
    • Multi-AZ operation by creating read replicas in another AZ
    • Backup and Restore feature for persistence by snapshotting
  • ElastiCache for Redis can be vertically scaled upwards by selecting a larger node type, however it cannot be scaled down
  • Parameter group can be specified for Redis during installation, which acts as a “container” for Redis configuration values that can be applied to one or more Redis primary clusters
  • Append Only File (AOF)
    • provides persistence and can be enabled for recovery scenarios
    • if a node restarts or the service crashes, Redis will replay the updates from an AOF file, thereby recovering the data lost due to the restart or crash
    • cannot protect against all failure scenarios, because if the underlying hardware fails, a new server would be provisioned and the AOF file will no longer be available to recover the data
    • enabling Redis Multi-AZ is a better approach to fault tolerance, as failing over to a read replica is much faster than rebuilding the primary from an AOF file

Redis Read Replica

  • Read Replicas help provide Read scaling and handling failures
  • Read Replicas are kept in sync with the Primary node using Redis’s asynchronous replication technology
  • Redis Read Replicas can help
    • Horizontal scaling beyond the compute or I/O capacity of a single primary node for read-heavy workloads.
    • Serving read traffic while the primary is unavailable either being down due to failure or maintenance
    • Data protection scenarios to promote a Read Replica as primary node, in case the primary node or the AZ of the primary node fails
  • ElastiCache supports initiated or forced failover where it flips the DNS record for the primary node to point at the read replica, which is in turn promoted to become the new primary
  • Read replicas cannot span regions; they may only be provisioned in the same or a different AZ of the same region as the primary cache node

Redis Multi-AZ

  • ElastiCache for Redis shard consists of a primary and up to 5 read replicas
  • Redis asynchronously replicates the data from the primary node to the read replicas
  • ElastiCache for Redis Multi-AZ mode
    • provides enhanced availability and reduced administration, as the node failover is automatic (a provisioning sketch follows this section)
    • impact on the ability to read/write to the primary is limited to the time it takes for automatic failover to complete
    • no longer needs monitoring of Redis nodes and manually initiating a recovery in the event of a primary node disruption
  • During certain types of planned maintenance, or in the unlikely event of ElastiCache node failure or AZ failure,
    • it automatically detects the failure,
    • selects the read replica with the smallest asynchronous replication lag to the primary and promotes it to become the new primary node
    • it will also propagate the DNS changes so that the primary endpoint remains the same
  • If Multi-AZ is not enabled,
    • ElastiCache monitors the primary node
    • in case the node becomes unavailable or unresponsive, it will repair the node by acquiring new service resources
    • it propagates the DNS endpoint changes to redirect the node’s existing DNS name to point to the new service resources.
    • If the primary node cannot be healed, you will have the choice to promote one of the read replicas to be the new primary
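
A provisioning sketch for such a Multi-AZ replication group via boto3, with placeholder identifiers, AZ names, and node type:

```python
import boto3

ec = boto3.client("elasticache")

ec.create_replication_group(
    ReplicationGroupId="my-redis-group",  # placeholder
    ReplicationGroupDescription="primary plus two read replicas",
    Engine="redis",
    CacheNodeType="cache.m5.large",       # placeholder node type
    NumCacheClusters=3,                   # 1 primary + 2 read replicas
    PreferredCacheClusterAZs=["us-east-1a", "us-east-1b", "us-east-1c"],  # spread across AZs
    AutomaticFailoverEnabled=True,        # promote a replica automatically on failure
)
```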

Redis Backup & Restore

  • Backup and Restore allows users to create snapshots of the Redis clusters
  • Snapshots can be used for recovery, restoration, archiving purpose or warm start an ElastiCache for Redis cluster with preloaded data
  • Snapshots are created on a per-cluster basis using Redis’ native mechanism, which creates and stores an RDB file as the snapshot
  • Increased latencies might be encountered at the node for a brief period while taking a snapshot; it is therefore recommended to take snapshots from a Read Replica, minimizing the performance impact (see the sketch below)
  • Snapshots can be created either automatically (if configured) or manually
  • When an ElastiCache for Redis cluster is deleted, its automatic snapshots are removed; manual snapshots, however, are retained
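
A minimal boto3 sketch of taking a manual snapshot from a read replica node, as suggested above; the identifiers are placeholders.

```python
import boto3

ec = boto3.client("elasticache")

ec.create_snapshot(
    CacheClusterId="my-redis-group-002",        # a read replica node (placeholder)
    SnapshotName="my-redis-backup-2024-01-01",  # placeholder snapshot name
)
```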

Memcached

  • Memcached is an in-memory key-value store for small chunks of arbitrary data
  • ElastiCache for Memcached can be used to cache a variety of objects
    • from content in persistent data stores (such as RDS, DynamoDB, or self-managed databases hosted on EC2) to
    • dynamically generated web pages (with Nginx for example), or
    • transient session data that may not require a persistent backing store
  • ElastiCache for Memcached
    • can be scaled Vertically by increasing the node type size
    • can be scaled Horizontally by adding and removing nodes
    • does not support persistence of data
  • An ElastiCache for Memcached cluster can have
    • nodes spanning multiple AZs within the same region
    • a maximum of 20 nodes per cluster, with a maximum of 100 nodes per region (a soft limit that can be extended)
  • ElastiCache for Memcached supports auto discovery, which enables automatic discovery of cache nodes by clients when they are added to or removed from an ElastiCache cluster (a cross-AZ provisioning sketch follows this section)
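
A sketch of provisioning a Memcached cluster whose nodes span AZs, with placeholder identifiers and AZ names:

```python
import boto3

ec = boto3.client("elasticache")

ec.create_cache_cluster(
    CacheClusterId="my-memcached",   # placeholder
    Engine="memcached",
    CacheNodeType="cache.m5.large",  # placeholder node type
    NumCacheNodes=4,
    AZMode="cross-az",               # spread nodes across AZs in the region
    PreferredAvailabilityZones=[     # one entry per node (placeholders)
        "us-east-1a", "us-east-1b", "us-east-1a", "us-east-1b",
    ],
)
```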

ElastiCache Mitigating Failures

  • ElastiCache deployments should be planned so that failures have a minimal impact upon your application and data
  • Mitigating Failures when Running Memcached
    • Mitigating Node Failures
      • spread the cached data over more nodes
      • as Memcached does not support replication, a node failure will always result in some data loss from the cluster
      • having more nodes will reduce the proportion of cache data lost
    • Mitigating Availability Zone Failures
      • locate the nodes in as many availability zones as possible; if an AZ fails, only the data cached in that AZ is lost, not the data cached in the other AZs
  • Mitigating Failures when Running Redis
    • Mitigating Cluster Failures
      • Redis Append Only Files (AOF)
        • enable AOF so whenever data is written to the Redis cluster, a corresponding transaction record is written to a Redis AOF
        • when the Redis process restarts, ElastiCache creates a replacement cluster, provisions it, and repopulates it with data from the AOF file
        • the recovery is time consuming
        • the AOF file can get big
        • using AOF cannot protect you from all failure scenarios
      • Redis Replication Groups
        • A Redis replication group is comprised of a single primary cluster which your application can both read from and write to, and from 1 to 5 read-only replica clusters.
        • Data written to the primary cluster is also asynchronously updated on the read replica clusters
        • When a Read Replica fails, ElastiCache detects the failure, replaces the instance in the same AZ and synchronizes with the Primary Cluster
        • With Redis Multi-AZ with Automatic Failover enabled, ElastiCache detects the primary cluster failure and promotes the read replica with the least replication lag to primary
        • If Multi-AZ with Auto Failover is disabled, ElastiCache detects the primary cluster failure, creates a new primary, and syncs the new primary with one of the existing replicas
    • Mitigating Availability Zone Failures
      • locate the clusters in as many availability zones as possible

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What does Amazon ElastiCache provide?
    1. A service by this name doesn’t exist. Perhaps you mean Amazon CloudCache.
    2. A virtual server with a huge amount of memory.
    3. A managed In-memory cache service
    4. An Amazon EC2 instance with the Memcached software already pre-installed.
  2. You are developing a highly available web application using stateless web servers. Which services are suitable for storing session state data? Choose 3 answers.
    1. Elastic Load Balancing
    2. Amazon Relational Database Service (RDS)
    3. Amazon CloudWatch
    4. Amazon ElastiCache
    5. Amazon DynamoDB
    6. AWS Storage Gateway
  3. Which statement best describes ElastiCache?
    1. Reduces the latency by splitting the workload across multiple AZs
    2. A simple web services interface to create and store multiple data sets, query your data easily, and return the results
    3. Offload the read traffic from your database in order to reduce latency caused by read-heavy workload
    4. Managed service that makes it easy to set up, operate and scale a relational database in the cloud
  4. Our company is getting ready to do a major public announcement of a social media site on AWS. The website is running on EC2 instances deployed across multiple Availability Zones with a Multi-AZ RDS MySQL Extra Large DB Instance. The site performs a high number of small reads and writes per second and relies on an eventual consistency model. After comprehensive tests you discover that there is read contention on RDS MySQL. Which are the best approaches to meet these requirements? (Choose 2 answers)
    1. Deploy ElastiCache in-memory cache running in each availability zone
    2. Implement sharding to distribute load to multiple RDS MySQL instances
    3. Increase the RDS MySQL Instance size and Implement provisioned IOPS
    4. Add an RDS MySQL read replica in each availability zone
  5. You are using ElastiCache Memcached to store session state and cache database queries in your infrastructure. You notice in CloudWatch that Evictions and Get Misses are both very high. What two actions could you take to rectify this? Choose 2 answers
    1. Increase the number of nodes in your cluster
    2. Tweak the max_item_size parameter
    3. Shrink the number of nodes in your cluster
    4. Increase the size of the nodes in the cluster
  6. You have been tasked with moving an ecommerce web application from a customer’s datacenter into a VPC. The application must be fault tolerant as well as highly scalable. Moreover, the customer is adamant that service interruptions not affect the user experience. As you near launch, you discover that the application currently uses multicast to share session state between web servers. In order to handle session state within the VPC, you choose to:
    1. Store session state in Amazon ElastiCache for Redis (scalable and makes the web applications stateless)
    2. Create a mesh VPN between instances and allow multicast on it
    3. Store session state in Amazon Relational Database Service (RDS solution not highly scalable)
    4. Enable session stickiness via Elastic Load Balancing (affects user experience if the instance goes down)
  7. When you are designing to support a 24-hour flash sale, which one of the following methods best describes a strategy to lower the latency while keeping up with unusually heavy traffic?
    1. Launch enhanced networking instances in a placement group to support the heavy traffic (only improves internal communication)
    2. Apply Service Oriented Architecture (SOA) principles instead of a 3-tier architecture (just simplifies architecture)
    3. Use Elastic Beanstalk to enable blue-green deployment (only minimizes download for applications and ease of rollback)
    4. Use ElastiCache as in-memory storage on top of DynamoDB to store user sessions (scalable, faster read/writes and in memory storage)
  8. You are configuring your company’s application to use Auto Scaling and need to move user state information. Which of the following AWS services provides a shared data store with durability and low latency?
    1. AWS ElastiCache Memcached (does not provide durability as if the node is gone the data is gone)
    2. Amazon Simple Storage Service
    3. Amazon EC2 instance storage
    4. Amazon DynamoDB
  9. Your application is using an ELB in front of an Auto Scaling group of web/application servers deployed across two AZs and a Multi-AZ RDS Instance for data persistence. The database CPU is often above 80% usage and 90% of I/O operations on the database are reads. To improve performance you recently added a single-node Memcached ElastiCache Cluster to cache frequent DB query results. In the next weeks the overall workload is expected to grow by 30%. Do you need to change anything in the architecture to maintain the high availability for the application with the anticipated additional load, and why?
    1. You should deploy two Memcached ElastiCache Clusters in different AZs because the RDS Instance will not be able to handle the load if the cache node fails.
    2. If the cache node fails the automated ElastiCache node recovery feature will prevent any availability impact. (does not provide high availability, as data is lost if the node is lost)
    3. Yes you should deploy the Memcached ElastiCache Cluster with two nodes in the same AZ as the RDS DB master instance to handle the load if one cache node fails. (Single AZ affects availability, as the DB is Multi-AZ and would be overloaded if the AZ goes down)
    4. No if the cache node fails you can always get the same data from the DB without having any availability impact. (Will overload the database affecting availability)
  10. A read-only news reporting site with a combined web and application tier and a database tier that receives large and unpredictable traffic demands must be able to respond to these traffic fluctuations automatically. What AWS services should be used to meet these requirements?
    1. Stateless instances for the web and application tier synchronized using ElastiCache Memcached in an autoscaling group monitored with CloudWatch and RDS with read replicas.
    2. Stateful instances for the web and application tier in an autoscaling group monitored with CloudWatch and RDS with read replicas (Stateful instances will not allow for scaling)
    3. Stateful instances for the web and application tier in an autoscaling group monitored with CloudWatch and multi-AZ RDS (Stateful instances will not allow for scaling & multi-AZ is for high availability and not scaling)
    4. Stateless instances for the web and application tier synchronized using ElastiCache Memcached in an autoscaling group monitored with CloudWatch and multi-AZ RDS (multi-AZ is for high availability and not scaling)
  11. You have written an application that uses the Elastic Load Balancing service to spread traffic to several web servers. Your users complain that they are sometimes forced to login again in the middle of using your application, after they have already logged in. This is not behavior you have designed. What is a possible solution to prevent this happening?
    1. Use instance memory to save session state.
    2. Use instance storage to save session state.
    3. Use EBS to save session state.
    4. Use ElastiCache to save session state.
    5. Use Glacier to save session state.