AWS RDS Multi-AZ Deployment

RDS Multi-AZ Deployment Overview

  • Multi-AZ deployments provides high availability and automatic failover support for DB instances
  • Multi-AZ helps improve the durability and availability of a critical system, enhancing availability during planned system maintenance, DB instance failure and Availability Zone disruption.
  • Multi-AZ is a High Availability feature is not a scaling solution for read-only scenarios; standby replica can’t be used to serve read traffic. To service read-only traffic, use a Read Replica.
  • Multi-AZ deployments for Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon technology, while SQL Server DB instances use SQL Server Mirroring.
  • In a Multi-AZ deployment,
    • RDS automatically provisions and maintains a synchronous standby replica in a different Availability Zone.
    • Copies of data are stored in different Availability Zones for greater levels of data durability.
    • Primary DB instance is synchronously replicated across Availability Zones to a standby replica to provide
      • data redundancy,
      • eliminate I/O freezes during snapshots and backups
      • and minimize latency spikes during system backups.
    • DB instances may have increased write and commit latency compared to a Single AZ deployment, due to the synchronous data replication
    • Transaction success is returned only if the commit is successful both on the primary and the standby DB
    • There might be a change in latency if the deployment fails over to the standby replica, although AWS is engineered with low-latency network connectivity between Availability Zones.
  • When using the BYOL licensing model, a license for both the primary instance and the standby replica is required
  • For production workloads, it is recommended to use Multi-AZ deployment with Provisioned IOPS and DB instance classes (m1.large and larger), optimized for Provisioned IOPS for fast, consistent performance.
  • When Single-AZ deployment is modified to a Multi-AZ deployment (for engines other than SQL Server or Amazon Aurora)
    • RDS takes a snapshot of the primary DB instance from the deployment and restores the snapshot into another Availability Zone.
    • RDS then sets up synchronous replication between the primary DB instance and the new instance.
    • This avoids downtime when conversion from Single AZ to Multi-AZ

RDS Multi-AZ Failover Process

  • In the event of a planned or unplanned outage of the DB instance,
    • RDS automatically switches to a standby replica in another AZ, if enabled for Multi-AZ.
    • Time it takes for the failover to complete depends on the database activity and other conditions at the time the primary DB instance became unavailable.
    • Failover times are typically 60-120 secs. However, large transactions or a lengthy recovery process can increase failover time.
    • Failover mechanism automatically changes the DNS record of the DB instance to point to the standby DB instance.
    • Multi-AZ switch is seamless to the applications as there is no change in the endpoint URLs but just needs to re-establish any existing connections to the DB instance.
  • RDS handles failover automatically so that database operations can be resumed as quickly as possible without administrative intervention.
  • Primary DB instance switches over automatically to the standby replica if any of the following conditions occur:
    • An Availability Zone outage
    • Primary DB instance fails
    • DB instance’s server type is changed
    • Operating system of the DB instance is undergoing software patching
    • A manual failover of the DB instance was initiated using Reboot with failover (also referred to as Forced Failover)
  • If the Multi-AZ DB instance has failed over, can be determined by
    • DB event subscriptions can be setup to notify you via email or SMS that a failover has been initiated.
    • DB events can be viewed via the Amazon RDS console or APIs.
    • Current state of your Multi-AZ deployment can be viewed via the RDS console and APIs.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company is deploying a new two-tier web application in AWS. The company has limited staff and requires high availability, and the application requires complex queries and table joins. Which configuration provides the solution for the company’s requirements?
    1. MySQL Installed on two Amazon EC2 Instances in a single Availability Zone (does not provide High Availability out of the box)
    2. Amazon RDS for MySQL with Multi-AZ
    3. Amazon ElastiCache (Just a caching solution)
    4. Amazon DynamoDB (Not suitable for complex queries and joins)
  2. What would happen to an RDS (Relational Database Service) multi-Availability Zone deployment if the primary DB instance fails?
    1. IP of the primary DB Instance is switched to the standby DB Instance.
    2. A new DB instance is created in the standby availability zone.
    3. The canonical name record (CNAME) is changed from primary to standby.
    4. The RDS (Relational Database Service) DB instance reboots.
  3. Will my standby RDS instance be in the same Availability Zone as my primary?
    1. Only for Oracle RDS types
    2. Yes
    3. Only if configured at launch
    4. No
  4. Is creating a Read Replica of another Read Replica supported?
    1. Only in certain regions
    2. Only with MySQL based RDS
    3. Only for Oracle RDS types
    4. No
  5. A user is planning to set up the Multi-AZ feature of RDS. Which of the below mentioned conditions won’t take advantage of the Multi-AZ feature?
    1. Availability zone outage
    2. A manual failover of the DB instance using Reboot with failover option
    3. Region outage
    4. When the user changes the DB instance’s server type
  6. When you run a DB Instance as a Multi-AZ deployment, the “_____” serves database writes and reads
    1. secondary
    2. backup
    3. stand by
    4. primary
  7. When running my DB Instance as a Multi-AZ deployment, can I use the standby for read or write operations?
    1. Yes
    2. Only with MSSQL based RDS
    3. Only for Oracle RDS instances
    4. No
  8. Read Replicas require a transactional storage engine and are only supported for the _________ storage engine
    1. OracleISAM
    2. MSSQLDB
    3. InnoDB
    4. MyISAM
  9. A user is configuring the Multi-AZ feature of an RDS DB. The user came to know that this RDS DB does not use the AWS technology, but uses server mirroring to achieve replication. Which DB is the user using right now?
    1. MySQL
    2. Oracle
    3. MS SQL
    4. PostgreSQL
  10. If you have chosen Multi-AZ deployment, in the event of a planned or unplanned outage of your primary DB Instance, Amazon RDS automatically switches to the standby replica. The automatic failover mechanism simply changes the ______ record of the main DB Instance to point to the standby DB Instance.
    1. DNAME
    2. CNAME
    3. TXT
    4. MX
  11. When automatic failover occurs, Amazon RDS will emit a DB Instance event to inform you that automatic failover occurred. You can use the _____ to return information about events related to your DB Instance
    1. FetchFailure
    2. DescriveFailure
    3. DescribeEvents
    4. FetchEvents
  12. The new DB Instance that is created when you promote a Read Replica retains the backup window period.
    1. TRUE
    2. FALSE
  13. Will I be alerted when automatic failover occurs?
    1. Only if SNS configured
    2. No
    3. Yes
    4. Only if Cloudwatch configured
  14. Can I initiate a “forced failover” for my MySQL Multi-AZ DB Instance deployment?
    1. Only in certain regions
    2. Only in VPC
    3. Yes
    4. No
  15. A user is accessing RDS from an application. The user has enabled the Multi-AZ feature with the MS SQL RDS DB. During a planned outage how will AWS ensure that a switch from DB to a standby replica will not affect access to the application?
    1. RDS will have an internal IP which will redirect all requests to the new DB
    2. RDS uses DNS to switch over to standby replica for seamless transition
    3. The switch over changes Hardware so RDS does not need to worry about access
    4. RDS will have both the DBs running independently and the user has to manually switch over
  16. Which of the following is part of the failover process for a Multi-AZ Amazon Relational Database Service (RDS) instance?
    1. The failed RDS DB instance reboots.
    2. The IP of the primary DB instance is switched to the standby DB instance.
    3. The DNS record for the RDS endpoint is changed from primary to standby.
    4. A new DB instance is created in the standby availability zone.
  17. Which of these is not a reason a Multi-AZ RDS instance will failover?
    1. An Availability Zone outage
    2. A manual failover of the DB instance was initiated using Reboot with failover
    3. To autoscale to a higher instance class (Refer link)
    4. Master database corruption occurs
    5. The primary DB instance fails
  18. How does Amazon RDS multi Availability Zone model work?
    1. A second, standby database is deployed and maintained in a different availability zone from master, using synchronous replication. (Refer link)
    2. A second, standby database is deployed and maintained in a different availability zone from master using asynchronous replication.
    3. A second, standby database is deployed and maintained in a different region from master using asynchronous replication.
    4. A second, standby database is deployed and maintained in a different region from master using synchronous replication.
  19. A user is using a small MySQL RDS DB. The user is experiencing high latency due to the Multi AZ feature. Which of the below mentioned options may not help the user in this situation?
    1. Schedule the automated back up in non-working hours
    2. Use a large or higher size instance
    3. Use PIOPS
    4. Take a snapshot from standby Replica
  20. What is the charge for the data transfer incurred in replicating data between your primary and standby?
    1. No charge. It is free.
    2. Double the standard data transfer charge
    3. Same as the standard data transfer charge
    4. Half of the standard data transfer charge
  21. A user has enabled the Multi AZ feature with the MS SQL RDS database server. Which of the below mentioned statements will help the user understand the Multi AZ feature better?
    1. In a Multi AZ, AWS runs two DBs in parallel and copies the data asynchronously to the replica copy
    2. In a Multi AZ, AWS runs two DBs in parallel and copies the data synchronously to the replica copy
    3. In a Multi AZ, AWS runs just one DB but copies the data synchronously to the standby replica
    4. AWS MS SQL does not support the Multi AZ feature

AWS Route 53 Routing Policy – Certification

AWS Route 53 Routing Policy

AWS Route 53 routing policy determines how AWS would respond to the DNS queries and provides multiple Routing policy options

Simple Routing Policy

  • Simple routing policy is a simple round robin policy and can be applied when there is a single resource doing the function for the domain for e.g. web server that serves content for the website
  • AWS Route 53 responds to the DNS queries based on the values in the resource record set for e.g. ip address in an A record

Weighted Routing Policy

  • Weighted routing policy enables Route 53 to route traffic to different resources in specified proportions (weights) for e.g., 75% one server and 25% to the other during a pilot release
  • Weights can be assigned any number from 0 to 255
  • Weighted routing policy can be applied when there are multiple resources that perform the same function for e.g., webservers serving the same site
  • Weighted resource record sets let you associate multiple resources with a single DNS name
  • Common use cases include
    • load balancing
    • A/B testing and piloting new versions of software
  • To create a group of weighted resource record sets, two or more resource record sets can be created that have the same combination of DNS name and type, and each resource record set is assigned a unique identifier and a relative weight.
  • When processing a DNS query, Route 53 searches for a resource record set or a group of resource record sets that have the specified name and type.
  • Route 53 selects one from the group. Probability of any one resource record set being selected depends on its weight as a proportion of the total weight for all resource record sets in the group for e.g., suppose for www.example.com has three resource record sets with weights of 1 (20%), 1 (20%), and 3 (60%)(sum = 5). On average, Route 53 selects each of the first two resource record sets one-fifth of the time, and returns the third resource record set three-fifths of the time.

Latency-based Routing (LBR) Policy

  • Latency-based Routing Policy enables Route 53 to respond to the DNS query based on which data center gives the user the lowest network latency
  • Latency-based routing policy can be used when there are multiple resources performing the same function and Route 53 needs to be configured to respond to the DNS queries with the resources that provide the fastest response with lowest latency
  • Latency resource record set can be created for the EC2 resource in each region that hosts the application. When Route 53 receives a query for the corresponding domain, it selects the latency resource record set for the EC2 region that gives the user the lowest latency. Route 53 then responds with the value associated with that resource record set for e.g., you might have web servers for example.com in the EC2 data centers in Ireland and in Tokyo. When a user browses to example.com from Singapore, Route 53 will pick up the data center (Tokyo) which has the lowest latency from the users location
  • Latency between hosts on the Internet can change over time as a result of changes in network connectivity and routing. Latency-based routing is based on latency measurements performed over a period of time, and the measurements reflect these changes for e.g. if the latency from the user in Singapore to Ireland improves, the user can be routed to Ireland
  • Latency based routing cannot guarantee users from the same geographic will be served from the same location for any compliance reason
  • Latency resource record sets can be created using any record type that Route 53 supports except NS or SOA

Failover Routing Policy

  • Failover routing policy allows active-passive failover configuration, in which one resource takes all traffic when it’s available and the other resource takes all traffic when the first resource isn’t available.
  • Route 53 health checking agents will monitor each location/endpoint of the application to determine the availability.
  • Failover routing policy is applicable for Public hosted zones only

Geolocation Routing Policy

  • Geolocation routing policy enables Route 53 to respond to DNS queries based on the geographic location of the users i.e. location from which the DNS queries originate
  • Common use cases include
    • localization of content and presenting some or all of the website in the users language
    • restrict distribution of content to only the locations in which you have distribution rights.
    • balancing load across endpoints in a predictable, easy-to-manage way, so that each user location is consistently routed to the same endpoint.
  • Geolocation routing policy allows geographic locations to be specified by continent, country, or by state (only in US)
  • Geolocation record sets, if created, for overlapping geographic regions for e.g. continent and then for the country within the same continent, priority goes to the smallest geographic region, which allows some queries for a continent to be routed to one resource and queries for selected countries on that continent to a different resource
  • Geolocation works by mapping IP addresses to locations, which might not mapped to a exact geographic location
  • A default resource record set can be created to handle these queries and also the ones which do not have an explicit record set created
  • Route 53 returns a “no answer”response for queries from those locations, if a default resource record set if not created
  • Two geolocation resource record sets that specify the same geographic location cannot be created
  • Route 53 supports the edns-client-subnet extension of EDNS0 (EDNS0 adds several optional extensions to the DNS protocol.) to improve the accuracy of geolocation routing

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You have deployed a web application targeting a global audience across multiple AWS Regions under the domain name example.com. You decide to use Route 53 Latency-Based Routing to serve web requests to users from the region closest to the user. To provide business continuity in the event of server downtime you configure weighted record sets associated with two web servers in separate Availability Zones per region. During a DR test you notice that when you disable all web servers in one of the regions Route 53 does not automatically direct all users to the other region. What could be happening? (Choose 2 answers)
    1. Latency resource record sets cannot be used in combination with weighted resource record sets.
    2. You did not setup an http health check for one or more of the weighted resource record sets associated with the disabled web servers
    3. The value of the weight associated with the latency alias resource record set in the region with the disabled servers is higher than the weight for the other region.
    4. One of the two working web servers in the other region did not pass its HTTP health check
    5. You did not set “Evaluate Target Health” to “Yes” on the latency alias resource record set associated with example.com in the region where you disabled the servers.
  2. The compliance department within your multi-national organization requires that all data for your customers that reside in the European Union (EU) must not leave the EU and also data for customers that reside in the US must not leave the US without explicit authorization. What must you do to comply with this requirement for a web based profile management application running on EC2?
    1. Run EC2 instances in multiple AWS Availability Zones in single Region and leverage an Elastic Load Balancer with session stickiness to route traffic to the appropriate zone to create their profile (should be in 2 different regions – US and Europe)
    2. Run EC2 instances in multiple Regions and leverage Route 53’s Latency Based Routing capabilities to route traffic to the appropriate region to create their profile (Latency based routing policy would not guarantee the compliance requirement)
    3. Run EC2 instances in multiple Regions and leverage a third party data provider to determine if a user needs to be redirect to the appropriate region to create their profile
    4. Run EC2 instances in multiple AWS Availability Zones in a single Region and leverage a third party data provider to determine if a user needs to be redirect to the appropriate zone to create their profile(should be in 2 different regions – US and Europe)
  3. A US-based company is expanding their web presence into Europe. The company wants to extend their AWS infrastructure from Northern Virginia (us-east-1) into the Dublin (eu-west-1) region. Which of the following options would enable an equivalent experience for users on both continents?
    1. Use a public-facing load balancer per region to load-balance web traffic, and enable HTTP health checks.
    2. Use a public-facing load balancer per region to load-balance web traffic, and enable sticky sessions.
    3. Use Amazon Route 53, and apply a geolocation routing policy to distribute traffic across both regions
    4. Use Amazon Route 53, and apply a weighted routing policy to distribute traffic across both regions.
  4. You have been asked to propose a multi-region deployment of a web-facing application where a controlled portion of your traffic is being processed by an alternate region. Which configuration would achieve that goal?
    1. Route 53 record sets with weighted routing policy
    2. Route 53 record sets with latency based routing policy
    3. Auto Scaling with scheduled scaling actions set
    4. Elastic Load Balancing with health checks enabled
  5. Your company is moving towards tracking web page users with a small tracking image loaded on each page. Currently you are serving this image out of us-east, but are starting to get concerned about the time it takes to load the image for users on the west coast. What are the two best ways to speed up serving this image? Choose 2 answers
    1. Use Route 53’s Latency Based Routing and serve the image out of us-west-2 as well as us-east-1
    2. Serve the image out through CloudFront
    3. Serve the image out of S3 so that it isn’t being served of your web application tier
    4. Use EBS PIOPs to serve the image faster out of your EC2 instances
  6. Your API requires the ability to stay online during AWS regional failures. Your API does not store any state, it only aggregates data from other sources – you do not have a database. What is a simple but effective way to achieve this uptime goal?
    1. Use a CloudFront distribution to serve up your API. Even if the region your API is in goes down, the edge locations CloudFront uses will be fine.
    2. Use an ELB and a cross-zone ELB deployment to create redundancy across datacenters. Even if a region fails, the other AZ will stay online.
    3. Create a Route53 Weighted Round Robin record, and if one region goes down, have that region redirect to the other region.
    4. Create a Route53 Latency Based Routing Record with Failover and point it to two identical deployments of your stateless API in two different regions. Make sure both regions use Auto Scaling Groups behind ELBs. (Refer link)

References