AWS RDS Aurora

  • AWS Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.
  • Aurora is a fully managed, MySQL- and PostgreSQL-compatible relational database engine, i.e. applications developed for MySQL or PostgreSQL can switch to Aurora with little or no change.
  • Aurora MySQL delivers up to 5x the performance of MySQL without requiring any changes to most MySQL applications.
  • Aurora PostgreSQL delivers up to 3x the performance of PostgreSQL.
  • RDS manages the Aurora databases, handling time-consuming tasks such as provisioning, patching, backup, recovery, failure detection and repair.
  • Based on database usage, Aurora storage grows automatically from 10 GB up to 64 TiB in 10 GB increments, with no impact on database performance (newer Aurora engine versions raise the limit to 128 TiB).
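
The storage growth rule above lends itself to a quick capacity estimate. The following is a minimal sketch, not an AWS API: the function names are hypothetical, and it assumes "GB" means 10^9 bytes and "TiB" means 2^40 bytes, which the notes do not specify.

```python
GB = 10**9    # assumption: decimal gigabyte
TIB = 2**40   # tebibyte

def allocated_bytes(used_bytes, increment_bytes=10 * GB):
    """Round used space up to the next 10 GB increment (at least one increment)."""
    increments = max(1, -(-used_bytes // increment_bytes))  # ceiling division
    return increments * increment_bytes

def days_until_full(initial_bytes, daily_growth_bytes, max_bytes=64 * TIB):
    """Whole days of growth that still fit under the volume limit."""
    if initial_bytes > max_bytes:
        raise ValueError("initial size already exceeds the volume limit")
    return (max_bytes - initial_bytes) // daily_growth_bytes

if __name__ == "__main__":
    # A database starting at 8 TB and growing by 8 GB per day.
    print(days_until_full(8 * 10**12, 8 * GB), "days of headroom")
    print(allocated_bytes(25 * GB) // GB, "GB allocated for 25 GB of data")
```

Under these assumptions, a database that starts at 8 TB and grows by 8 GB a day stays under the 64 TiB ceiling for roughly two decades.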

Aurora DB Clusters

AWS Aurora Architecture

  • Aurora DB cluster consists of one or more DB instances and a cluster volume that manages the data for those DB instances.
  • An Aurora cluster volume is a virtual database storage volume that spans multiple AZs, with each AZ having a copy of the DB cluster data
  • Two types of DB instances make up an Aurora DB cluster:
    • Primary DB instance
      • Supports read and write operations, and performs all of the data modifications to the cluster volume.
      • Each Aurora DB cluster has one primary DB instance.
    • Aurora Replica
      • Connects to the same storage volume as the primary DB instance and supports only read operations.
      • Each Aurora DB cluster can have up to 15 Aurora Replicas in addition to the primary DB instance.
      • Provides high availability by locating Replicas in separate AZs
      • Aurora automatically fails over to an Aurora Replica in case the primary DB instance becomes unavailable.
      • Failover priority for Aurora Replicas can be specified.
      • Aurora Replicas can also offload read workloads from the primary DB instance
  • For Aurora multi-master clusters
    • all DB instances have read/write capability, with no difference between primary and replica.
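
How a failover target might be chosen from the Aurora Replicas can be sketched as follows. This is a simplified model, not Aurora's implementation: it assumes priority tiers run from 0 (promoted first) to 15, and that ties within a tier go to the larger instance. The `Replica` class and its fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    az: str
    promotion_tier: int  # assumption: 0 = highest priority, 15 = lowest
    memory_gib: int      # stand-in for instance size

def pick_failover_target(replicas):
    """Return the replica to promote, or None if the cluster has no replicas."""
    if not replicas:
        return None
    # Lowest tier number wins; within a tier, prefer the larger instance.
    return min(replicas, key=lambda r: (r.promotion_tier, -r.memory_gib))

if __name__ == "__main__":
    candidates = [
        Replica("replica-a", "us-east-1a", promotion_tier=1, memory_gib=64),
        Replica("replica-b", "us-east-1b", promotion_tier=0, memory_gib=16),
        Replica("replica-c", "us-east-1c", promotion_tier=0, memory_gib=32),
    ]
    print(pick_failover_target(candidates).name)
```

Placing the candidates in separate AZs, as in the usage example, is what lets a promotion survive the loss of the primary's AZ.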

Connection Endpoints

  • Aurora involves a cluster of DB instances instead of a single instance
  • An endpoint is an intermediate handler, identified by a host name and port, through which clients connect to the cluster
  • Aurora uses the endpoint mechanism to abstract these connections

Cluster endpoint

  • Cluster endpoint (or writer endpoint) for an Aurora DB cluster connects to the current primary DB instance for that DB cluster.
  • Cluster endpoint is the only one that can perform write operations such as DDL statements as well as read operations
  • Each Aurora DB cluster has one cluster endpoint and one primary DB instance.
  • Cluster endpoint provides failover support for read/write connections to the DB cluster. If the current primary DB instance of a DB cluster fails, Aurora automatically fails over to a new primary DB instance. During a failover, the DB cluster continues to serve connection requests to the cluster endpoint from the new primary DB instance, with minimal interruption of service.

Reader endpoint

  • Reader endpoint for an Aurora DB cluster provides load-balancing support for read-only connections to the DB cluster.
  • Use the reader endpoint for read operations, such as queries.
  • Reader endpoint reduces the overhead on the primary instance by processing the statements on the read-only Aurora Replicas.
  • Each Aurora DB cluster has one reader endpoint.
  • If the cluster contains one or more Aurora Replicas, the reader endpoint load-balances each connection request among the Aurora Replicas.
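
A common way to use the cluster and reader endpoints together is to split traffic by statement type in the application. The sketch below is illustrative only: both host names are hypothetical placeholders, and the keyword check is a deliberately naive classifier.

```python
# Hypothetical endpoint host names; real values come from the cluster's details.
WRITER_ENDPOINT = "mycluster.cluster-abc123xyz.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-abc123xyz.us-east-1.rds.amazonaws.com"

READ_ONLY_KEYWORDS = ("select", "show", "explain")

def endpoint_for(sql, in_transaction=False):
    """Pick the reader endpoint for plain reads, the cluster endpoint otherwise."""
    words = sql.split()
    first = words[0].lower() if words else ""
    # Statements inside a transaction stay on the writer for consistency.
    if not in_transaction and first in READ_ONLY_KEYWORDS:
        return READER_ENDPOINT
    return WRITER_ENDPOINT

if __name__ == "__main__":
    print(endpoint_for("SELECT id FROM orders"))
    print(endpoint_for("UPDATE orders SET paid = 1"))
```

The classifier looks only at the first keyword, so a statement such as `SELECT ... FOR UPDATE` would be misrouted. A real application would more likely let the caller declare read or write intent explicitly.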

Custom endpoint

  • Custom endpoint for an Aurora cluster represents a set of DB instances that you choose.
  • Aurora performs load balancing and chooses one of the instances in the group to handle the connection.
  • An Aurora DB cluster has no custom endpoints until one is created, and up to five custom endpoints can be created for each provisioned Aurora cluster.
  • Aurora Serverless clusters do not support custom endpoints

Instance endpoint

  • An instance endpoint connects to a specific DB instance within an Aurora cluster and provides direct control over connections to the DB cluster.
  • Each DB instance in a DB cluster has its own unique instance endpoint. So there is one instance endpoint for the current primary DB instance of the DB cluster, and there is one instance endpoint for each of the Aurora Replicas in the DB cluster.
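
The four endpoint types can usually be told apart by their host names. The sketch below assumes the commonly seen naming pattern, in which the second DNS label starts with `cluster-`, `cluster-ro-` or `cluster-custom-` for cluster-level endpoints and is a plain identifier for instance endpoints. Treat that pattern as an assumption to verify against the actual cluster; the host names shown are hypothetical.

```python
def endpoint_kind(hostname):
    """Classify an Aurora endpoint host name as cluster, reader, custom or instance."""
    labels = hostname.split(".")
    if len(labels) < 2:
        raise ValueError("not a fully qualified endpoint host name")
    second = labels[1]
    # Check the more specific prefixes before the generic 'cluster-' prefix.
    if second.startswith("cluster-ro-"):
        return "reader"
    if second.startswith("cluster-custom-"):
        return "custom"
    if second.startswith("cluster-"):
        return "cluster"
    return "instance"

if __name__ == "__main__":
    for host in (
        "mycluster.cluster-abc123xyz.us-east-1.rds.amazonaws.com",
        "mycluster.cluster-ro-abc123xyz.us-east-1.rds.amazonaws.com",
        "analytics.cluster-custom-abc123xyz.us-east-1.rds.amazonaws.com",
        "mydbinstance.abc123xyz.us-east-1.rds.amazonaws.com",
    ):
        print(endpoint_kind(host), host)
```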

High Availability and Replication

  • Aurora is designed to offer greater than 99.99% availability
  • Aurora provides data durability and reliability
    • by replicating the database volume six ways across three Availability Zones in a single region
    • backing up the data continuously to S3.
  • Aurora transparently recovers from physical storage failures; instance failover typically takes less than 30 seconds.
  • If the primary DB instance fails, Aurora automatically fails over to a new primary DB instance, by either promoting an existing Aurora Replica to a new primary DB instance or creating a new primary DB instance
  • Aurora automatically divides the database volume into 10GB segments spread across many disks. Each 10GB chunk of the database volume is replicated six ways, across three Availability Zones
  • By contrast, other RDS engines (e.g. MySQL, Oracle) keep the data in a single AZ unless Multi-AZ is enabled
  • Aurora is designed to transparently handle
    • the loss of up to two copies of data without affecting database write availability and
    • up to three copies without affecting read availability.
  • Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and repaired automatically.
  • Aurora Replicas share the same underlying volume as the primary instance. Updates made by the primary are visible to all Aurora Replicas
  • As Aurora Replicas share the same data volume as the primary instance, there is virtually no replication lag
  • Any Aurora Replica can be promoted to become primary without any data loss and therefore can be used for enhancing fault tolerance in the event of a primary DB Instance failure.
  • To increase database availability, 1 to 15 replicas can be created in any of 3 AZs, and RDS will automatically include them in failover primary selection in the event of a database outage.
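
The fault-tolerance figures above (six copies, writes survive the loss of two, reads survive the loss of three) are consistent with a quorum model that needs 4 of 6 copies for writes and 3 of 6 for reads. The sketch below encodes that reading. The quorum sizes are inferred from those figures and the function name is hypothetical.

```python
COPIES = 6        # two copies in each of three AZs
WRITE_QUORUM = 4  # inferred: writes tolerate the loss of two copies
READ_QUORUM = 3   # inferred: reads tolerate the loss of three copies

def availability(failed_copies):
    """Report whether reads and writes remain available after some copies fail."""
    if not 0 <= failed_copies <= COPIES:
        raise ValueError("failed_copies must be between 0 and 6")
    healthy = COPIES - failed_copies
    return {"write": healthy >= WRITE_QUORUM, "read": healthy >= READ_QUORUM}

if __name__ == "__main__":
    for failed in range(COPIES + 1):
        print(failed, "failed ->", availability(failed))
```

With two copies per AZ, losing a whole AZ removes two copies, so writes continue. Losing an AZ plus one more copy removes three, which leaves reads available but blocks writes until the storage is repaired.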

Security

  • Aurora uses SSL (AES-256) to secure the connection between the database instance and the application
  • Aurora allows database encryption using keys managed through AWS Key Management Service (KMS).
  • Encryption and decryption are handled seamlessly.
  • With Aurora encryption, data stored at rest in the underlying storage is encrypted, as are its automated backups, snapshots, and replicas in the same cluster.
  • Encrypting an existing unencrypted Aurora instance is not supported. Instead, create a new encrypted Aurora instance and migrate the data to it
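
Because encryption has to be chosen at creation time, it helps to make it explicit in the creation request. The sketch below only builds the parameter dictionary. The keys are parameter names of boto3's `create_db_cluster` call, while the helper name and all example values are hypothetical. The actual AWS call appears only in a comment, since it needs credentials.

```python
def encrypted_cluster_params(cluster_id, engine, kms_key_id, username, password):
    """Build keyword arguments for creating an Aurora cluster encrypted at rest."""
    if engine not in ("aurora-mysql", "aurora-postgresql"):
        raise ValueError("engine must be 'aurora-mysql' or 'aurora-postgresql'")
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": engine,
        "MasterUsername": username,
        "MasterUserPassword": password,
        "StorageEncrypted": True,  # cannot be turned on later for this cluster
        "KmsKeyId": kms_key_id,    # KMS key used for data, backups and snapshots
    }

if __name__ == "__main__":
    params = encrypted_cluster_params(
        "demo-cluster", "aurora-mysql", "alias/demo-key", "admin", "change-me"
    )
    print(sorted(params))
    # With boto3 installed and credentials configured (not executed here):
    #   import boto3
    #   boto3.client("rds").create_db_cluster(**params)
```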

Backup and Restore

  • Automated backups are always enabled on Aurora DB Instances.
  • Backups do not impact database performance.
  • Aurora also allows creation of manual snapshots
  • Aurora automatically maintains 6 copies of the data across 3 AZs and will automatically attempt to recover the database in a healthy AZ with no data loss
  • If in any case the data is unavailable within Aurora storage,
    • DB Snapshot can be restored or
    • point-in-time restore operation can be performed to a new instance. Latest restorable time for a point-in-time restore operation can be up to 5 minutes in the past.
  • Restoring a snapshot creates a new Aurora DB instance
  • Deleting an Aurora database deletes all of its automated backups (with an option to create a final snapshot), but does not remove manual snapshots.
  • Snapshots (including encrypted ones) can be shared with other AWS accounts
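
The point-in-time restore rule above can be expressed as a simple window check. This is a sketch under stated assumptions: it treats the "up to 5 minutes" lag as a fixed worst case, and it assumes a backup retention period of 1 to 35 days, a range that the notes above do not state. The function names are hypothetical.

```python
from datetime import datetime, timedelta

def restorable_window(now, retention_days, lag=timedelta(minutes=5)):
    """Return (earliest, latest) restorable times under the assumptions above."""
    if not 1 <= retention_days <= 35:  # assumption: allowed retention range
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = now - lag  # conservative: assume the full 5-minute lag
    return earliest, latest

def can_restore_to(target, now, retention_days):
    """True if the target time falls inside the restorable window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest

if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    print(can_restore_to(now - timedelta(hours=3), now, retention_days=7))
    print(can_restore_to(now - timedelta(minutes=1), now, retention_days=7))
```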

Aurora Serverless

  • Amazon Aurora Serverless is an on-demand, autoscaling configuration for the MySQL-compatible and PostgreSQL-compatible editions of Aurora.
  • An Aurora Serverless DB cluster automatically starts up, shuts down, and scales capacity up or down based on the application’s needs.
  • Aurora Serverless provides a relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.

Aurora Global Database

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day, and both the questions and the answers might soon be outdated, so research accordingly.
  • AWS exam questions do not always keep pace with AWS updates, so a question might not be updated even if the underlying feature has changed.
  • Open to further feedback, discussion and correction.
  1. A company wants to use a MySQL-compatible relational database with greater performance. Which AWS service can be used?
    1. Aurora
    2. RDS
    3. SimpleDB
    4. DynamoDB
  2. An application requires a highly available relational database with an initial storage capacity of 8 TB. The database will grow by 8 GB every day. To support expected traffic, at least eight read replicas will be required to handle database reads. Which option will meet these requirements?
    1. DynamoDB
    2. Amazon S3
    3. Amazon Aurora
    4. Amazon Redshift
  3. A company is migrating its on-premises 10 TB MySQL database to AWS. As a compliance requirement, the company wants to have the data replicated across three Availability Zones. Which Amazon RDS engine meets the above business requirement?
    1. Use Multi-AZ RDS
    2. Use RDS
    3. Use Aurora
    4. Use DynamoDB