AWS Database Services Cheat Sheet

AWS Database Services Cheat Sheet

AWS Database Services

Relational Database Service – RDS

  • provides Relational Database service
  • supports MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, and the new, MySQL-compatible Amazon Aurora DB engine
  • as it is a managed service, shell (root ssh) access is not provided
  • manages backups, software patching, automatic failure detection, and recovery
  • supports use initiated manual backups and snapshots
  • daily automated backups with database transaction logs enables Point in Time recovery up to the last five minutes of database usage
  • snapshots are user-initiated storage volume snapshot of DB instance, backing up the entire DB instance and not just individual databases that can be restored as a independent RDS instance
  • RDS Security
    • support encryption at rest using KMS as well as encryption in transit using SSL endpoints
    • supports IAM database authentication, which prevents the need to store static user credentials in the database, because authentication is managed externally using IAM.
    • supports Encryption only during creation of an RDS DB instance
    • existing unencrypted DB cannot be encrypted and you need to create a  snapshot, created a encrypted copy of the snapshot and restore as encrypted DB
    • supports Secret Manager for storing and rotating secrets
    • for encrypted database
      • logs, snapshots, backups, read replicas are all encrypted as well
      • cross region replicas and snapshots does not work across region (Note – this is possible now with latest AWS enhancement)
  • Multi-AZ deployment
    • provides high availability and automatic failover support and is NOT a scaling solution
    • maintains a synchronous standby replica in a different AZ
    • transaction success is returned only if the commit is successful both on the primary and the standby DB
    • Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon technology, while SQL Server DB instances use SQL Server Mirroring
    • snapshots and backups are taken from standby & eliminate I/O freezes
    • during automatic failover, its seamless and RDS switches to the standby instance and updates the DNS record to point to standby
    • failover can be forced with the Reboot with failover option
  • Read Replicas
    • uses the PostgreSQL, MySQL, and MariaDB DB engines’ built-in replication functionality to create a separate Read Only instance
    • updates are asynchronously copied to the Read Replica, and data might be stale
    • can help scale applications and reduce read only load
    • requires automatic backups enabled
    • replicates all databases in the source DB instance
    • for disaster recovery, can be promoted to a full fledged database
    • can be created in a different region for disaster recovery, migration and low latency across regions
    • can’t create encrypted read replicas from unencrypted DB or read replica
  • RDS does not support all the features of underlying databases, and if required the database instance can be launched on an EC2 instance
  • RDS Components
    • DB parameter groups contains engine configuration values that can be applied to one or more DB instances of the same instance type for e.g. SSL, max connections etc.
    • Default DB parameter group cannot be modified, create a custom one and attach to the DB
    • Supports static and dynamic parameters
      • changes to dynamic parameters are applied immediately (irrespective of apply immediately setting)
      • changes to static parameters are NOT applied immediately and require a manual reboot.
  • RDS Monitoring & Notification
    • integrates with CloudWatch and CloudTrail
    • CloudWatch provides metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance
    • Performance Insights is a database performance tuning and monitoring feature that helps illustrate the database’s performance and help analyze any issues that affect it
    • supports RDS Event Notification which uses the SNS to provide notification when an RDS event like creation, deletion or snapshot creation etc occurs

Aurora

  • is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases
  • is a managed services and handles time-consuming tasks such as provisioning, patching, backup, recovery, failure detection and repair
  • is a proprietary technology from AWS (not open sourced)
  • provides PostgreSQL and MySQL compatibility
  • is “AWS cloud optimized” and claims 5x performance improvement
    over MySQL on RDS, over 3x the performance of PostgreSQL on RDS
  • scales storage automatically in increments of 10GB, up to 64 TB with no impact to database performance. Storage is striped across 100s of volumes.
  • no need to provision storage in advance.
  • provides self-healing storage. Data blocks and disks are continuously scanned for errors and repaired automatically.
  • provides instantaneous failover
  • replicates each chunk of my the database volume six ways across three Availability Zones i.e. 6 copies of the data across 3 AZ
    • requires 4 copies out of 6 needed for writes
    • requires 3 copies out of 6 need for reads
  • costs more than RDS (20% more) – but is more efficient
  • Read Replicas
    • can have 15 replicas while MySQL has 5, and the replication process is faster (sub 10 ms replica lag)
    • share the same data volume as the primary instance in the same AWS Region, there is virtually no replication lag
    • supports Automated failover for master in less than 30 seconds
    • supports Cross Region Replication using either physical or logical replication.
  • Security
    • supports Encryption at rest using KMS
    • supports Encryption in flight using SSL (same process as MySQL or Postgres)
    • Automated backups, snapshots and replicas are also encrypted
    • Possibility to authenticate using IAM token (same method as RDS)
    • supports protecting the instance with security groups
    • does not support SSH access to the underlying servers
  • Aurora Serverless
    • provides automated database Client  instantiation and on-demand  autoscaling based on actual usage
    • provides a relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads
    • automatically starts up, shuts down, and scales capacity up or down based on the application’s needs. No capacity planning needed
    • Pay per second, can be more cost-effective
  • Aurora Global Database
    • allows a single Aurora database to span multiple AWS regions.
    • provides Physical replication, which uses dedicated infrastructure that leaves the databases entirely available to serve the application
    • supports 1 Primary Region (read / write)
    • replicates across up to 5 secondary (read-only) regions, replication lag is less than 1 second
    • supports up to 16 Read Replicas per secondary region
    • recommended for low-latency global reads and disaster recovery with an RTO of < 1 minute
    • failover is not automated and if the primary region becomes unavailable, a secondary region can be manually removed from an Aurora Global Database and promote it to take full reads and writes. Application needs to be updated to point to the newly promoted region.
  • Aurora Backtrack
    • Backtracking “rewinds” the DB cluster to the specified time
    • Backtracking performs in place restore and does not create a new instance. There is a minimal downtime associated with it.
  • Aurora Clone feature allows quick and cost-effective creation of Aurora Cluster duplicates
  • supports parallel or distributed query using Aurora Parallel Query, which refers to the ability to push down and distribute the computational load of a single query across thousands of CPUs in Aurora’s storage layer.

DynamoDB

  • fully managed NoSQL database service
  • synchronously replicates data across three facilities in an AWS Region, giving high availability and data durability
  • runs exclusively on SSDs to provide high I/O performance
  • provides provisioned table reads and writes
  • automatically partitions, reallocates, and re-partitions the data and provisions additional server capacity as data or throughput changes
  • creates and maintains indexes for the primary key attributes for efficient access to data in the table
  • DynamoDB Table classes currently support
    • DynamoDB Standard table class is the default and is recommended for the vast majority of workloads.
    • DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA) table class which is optimized for tables where storage is the dominant cost.
  • supports Secondary Indexes
    • allows querying attributes other than the primary key attributes without impacting performance.
    • are automatically maintained as sparse objects
  • Local secondary index vs Global secondary index
    • shares partition key + different sort key vs different partition + sort key
    • search limited to partition vs across all partition
    • unique attributes vs non-unique attributes
    • linked to the base table vs independent separate index
    • only created during the base table creation vs can be created later
    • cannot be deleted after creation vs can be deleted
    • consumes provisioned throughput capacity of the base table vs independent throughput
    • returns all attributes for item vs only projected attributes
    • Eventually or Strongly vs Only Eventually consistent reads
    • size limited to 10Gb per partition vs unlimited
  • DynamoDB Consistency
    • provides Eventually consistent (by default) or Strongly Consistent option to be specified during a read operation
    • supports Strongly consistent reads for a few operations like Query, GetItem, and BatchGetItem using the ConsistentRead parameter
  • DynamoDB Throughput Capacity
    • supports On-demand and Provisioned read/write capacity modes
    • Provisioned mode requires the number of reads and writes per second as required by the application to be specified
    • On-demand mode provides flexible billing option capable of serving thousands of requests per second without capacity planning
  • DynamoDB Auto Scaling helps dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns.
  • DynamoDB Adaptive capacity is a feature that enables DynamoDB to run imbalanced workloads indefinitely.
  • DynamoDB Global Tables provide multi-master, cross-region replication capability of DynamoDB to support data access locality and regional fault tolerance for database workloads.
  • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table
  • DynamoDB Time to Live (TTL)
    • enables a per-item timestamp to determine when an item expiry
    • expired items are deleted from the table without consuming any write throughput.
  • DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement – from milliseconds to microseconds – even at millions of requests per second.
  • DynamoDB cross-region replication
    • allows identical copies (called replicas) of a DynamoDB table (called master table) to be maintained in one or more AWS regions.
    • using DynamoDB streams which leverages Kinesis and provides time-ordered sequence of item-level changes and can help for lower RPO, lower RTO disaster recovery
  • DynamoDB Triggers (just like database triggers) are a feature that allows the execution of custom actions based on item-level updates on a table.
  • VPC Gateway Endpoints provide private access to DynamoDB from within a VPC without the need for an internet gateway or NAT gateway.

ElastiCache

  • managed web service that provides in-memory caching to deploy and run Memcached or Redis protocol-compliant cache clusters
  • ElastiCache with Redis,
    • like RDS, supports Multi-AZ, Read Replicas and Snapshots
    • Read Replicas are created across AZ within same region using Redis’s asynchronous replication technology
    • Multi-AZ differs from RDS as there is no standby, but if the primary goes down a Read Replica is promoted as primary
    • Read Replicas cannot span across regions, as RDS supports
    • cannot be scaled out and if scaled up cannot be scaled down
    • allows snapshots for backup and restore
    • AOF can be enabled for recovery scenarios, to recover the data in case the node fails or service crashes. But it does not help in case the underlying hardware fails
    • Enabling Redis Multi-AZ as a Better Approach to Fault Tolerance
  • ElastiCache with Memcached
    • can be scaled up by increasing size and scaled out by adding nodes
    • nodes can span across multiple AZs within the same region
    • cached data is spread across the nodes, and a node failure will always result in some data loss from the cluster
    • supports auto discovery
    • every node should be homogenous and of same instance type
  • ElastiCache Redis vs Memcached
    • complex data objects vs simple key value storage
    • persistent vs non persistent, pure caching
    • automatic failover with Multi-AZ vs Multi-AZ not supported
    • scaling using Read Replicas vs using multiple nodes
    • backup & restore supported vs not supported
  • can be used state management to keep the web application stateless

Redshift

  • fully managed, fast and powerful, petabyte scale data warehouse service
  • uses replication and continuous backups to enhance availability and improve data durability and can automatically recover from node and component failures
  • provides Massive Parallel Processing (MPP) by distributing & parallelizing queries across multiple physical resources
  • columnar data storage improving query performance and allowing advance compression techniques
  • only supports Single-AZ deployments and the nodes are available within the same AZ, if the AZ supports Redshift clusters
  • spot instances are NOT an option

AWS RDS Proxy

RDS Proxy

AWS RDS Proxy

  • fully managed, highly available database proxy for RDS that makes applications more secure, scalable, more resilient to database failures.
  • allows apps to pool and share DB connections established with the database
  • improves database efficiency by reducing stress on the database resources (e.g. CPU, RAM) by minimizing open connections and creation of new connections.
  • is serverless and scales automatically to accommodate your workload.
  • is highly available and deployed across multiple Availability Zones.
  • increases resiliency to database failures by automatically connecting to a standby DB instance while preserving application connections.
  • reduces RDS and Aurora failover time by up to 66%.
  • protects the database against oversubscription by providing control over the number of database connections that are created.
  • queues or throttles application connections that can’t be served immediately from the pool of connections.
  • supports RDS (MySQL, PostgreSQL, MariaDB) and Aurora
  • is fully managed and there is no need to provision or manage any additional infrastructure.
  • required no code changes for most apps, just need to point to the RDS proxy endpoint instead of the RDS endpoint
  • enforce IAM Authentication for DB, and securely store credentials in AWS Secrets Manager
  • is never publicly accessible (must be accessed from VPC)

RDS Proxy

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company migrated one of its business-critical database workloads to an Amazon Aurora Multi-AZ DB cluster. The company requires a very low RTO and needs to improve the application recovery time after database failover. Which approach meets these requirements?
    1. Set the max_connections parameter to 16,000 in the instance-level parameter group.
    2. Modify the client connection timeout to 300 seconds.
    3. Create an Amazon RDS Proxy database proxy and update client connections to point to the proxy endpoint.
    4. Enable the query cache at the instance level.
  2. A company is running a serverless application on AWS Lambda that stores data in an Amazon RDS for MySQL DB instance. Usage has steadily increased, and recently there have been numerous “too many connections” errors when the Lambda function attempts to connect to the database. The company already has configured the database to use the maximum max_connections value that is possible. What should a SysOps administrator do to resolve these errors?
    1. Create a read replica of the database. Use Amazon Route 53 to create a weighted DNS record that contains both databases.
    2. Use Amazon RDS Proxy to create a proxy. Update the connection string in the Lambda function.
    3. Increase the value in the max_connect_errors parameter in the parameter group that the database uses.
    4. Update the Lambda function’s reserved concurrency to a higher value.

References

Amazon_RDS_Proxy

AWS RDS Monitoring & Notification

AWS RDS Monitoring & Notification

  • RDS integrates with CloudWatch and provides metrics for monitoring
  • CloudWatch alarms can be created over a single metric that sends an SNS message when the alarm changes state
  • RDS also provides SNS notification whenever any RDS event occurs
  • RDS Performance Insights is a database performance tuning and monitoring feature that helps illustrate the database’s performance and help analyze any issues that affect it
  • RDS Recommendations provides automated recommendations for database resources.

 RDS CloudWatch Monitoring

  • RDS DB instance can be monitored using CloudWatch, which collects and processes raw data from RDS into readable, near real-time metrics.
  • Statistics are recorded so that you can access historical information and gain a better perspective on how the service is performing.
  • By default, RDS metric data is automatically sent to CloudWatch in 1-minute periods
  • CloudWatch RDS Metrics
    • BinLogDiskUsage – Amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.
    • CPUUtilization – Percentage of CPU utilization.
    • DatabaseConnections – Number of database connections in use.
    • DiskQueueDepth – The number of outstanding IOs (read/write requests) waiting to access the disk.
    • FreeableMemory – Amount of available random access memory.
    • FreeStorageSpace – Amount of available storage space.
    • ReplicaLag – Amount of time a Read Replica DB instance lags behind the source DB instance.
    • SwapUsage – Amount of swap space used on the DB instance.
    • ReadIOPS – Average number of disk I/O operations per second.
    • WriteIOPS – Average number of disk I/O operations per second.
    • ReadLatency – Average amount of time taken per disk I/O operation.
    • WriteLatency – Average amount of time taken per disk I/O operation.
    • ReadThroughput – Average number of bytes read from disk per second.
    • WriteThroughput – Average number of bytes written to disk per second.
    • NetworkReceiveThroughput – Incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
    • NetworkTransmitThroughput – Outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

RDS Enhanced Monitoring

  • RDS provides metrics in real-time for the operating system (OS) that the DB instance runs on.
  • By default, Enhanced Monitoring metrics are stored for 30 days in the CloudWatch Logs, which are different from typical CloudWatch metrics.

CloudWatch vs Enhanced Monitoring Metrics

  • CloudWatch gathers metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.
  • Enhanced Monitoring metrics are useful to understand how different processes or threads on a DB instance use the CPU.
  • There might be differences between the measurements because the hypervisor layer performs a small amount of work. The differences can be greater if the DB instances use smaller instance classes because then there are likely more virtual machines (VMs) that are managed by the hypervisor layer on a single physical instance.

RDS Performance Insights

  • Performance Insights is a database performance tuning and monitoring feature that helps check the database’s performance and helps analyze any issues that affect it.
  • Database load is measured using a metric called Average Active Sessions or AAS which is calculated by sampling memory to determine the state of each active database connection.
  • AAS is the total number of sessions divided by the total number of samples for a specific time period.
  • Performance Insights help visualize the database load and filter the load by waits, SQL statements, hosts, or users.

RDS CloudTrail Logs

  • CloudTrail provides a record of actions taken by a user, role, or an AWS service in RDS.
  • CloudTrail captures all API calls for RDS as events, including calls from the console and from code calls to RDS API operations.
  • CloudTrail can help determine the request that was made to RDS, the IP address from which the request was made, who made the request, when it was made, and additional details.

RDS Recommendations

  • RDS provides automated recommendations for database resources.
  • The recommendations provide best practice guidance by analyzing DB instance configuration, usage, and performance data.

RDS Event Notification

  • RDS uses the SNS to provide notification when an RDS event occurs
  • RDS groups the events into categories, which can be subscribed so that a notification is sent when an event in that category occurs.
  • Event category for a DB instance, DB cluster, DB snapshot, DB cluster snapshot, DB security group, or for a DB parameter group can be subscribed
  • Event notifications are sent to the email addresses provided during subscription creation
  • Subscriptions can be easily turned off without deleting a subscription by setting the Enabled radio button to No in the RDS console or by setting the Enabled parameter to false using the CLI or RDS API.

RDS Trusted Advisor

  • Trusted Advisor inspects the AWS environment and then makes recommendations when opportunities exist to save money, improve system availability and performance, or help close security gaps.
  • Trusted Advisor has the following RDS-related checks:
    • RDS Idle DB Instances
    • RDS Security Group Access Risk
    • RDS Backups
    • RDS Multi-AZ

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You run a web application with the following components Elastic Load Balancer (ELB), 3 Web/Application servers, 1 MySQL RDS database with read replicas, and Amazon Simple Storage Service (Amazon S3) for static content. Average response time for users is increasing slowly. What three CloudWatch RDS metrics will allow you to identify if the database is the bottleneck? Choose 3 answers
    1. The number of outstanding IOs waiting to access the disk
    2. The amount of write latency
    3. The amount of disk space occupied by binary logs on the master.
    4. The amount of time a Read Replica DB Instance lags behind the source DB Instance
    5. The average number of disk I/O operations per second.
  2. Typically, you want your application to check whether a request generated an error before you spend any time processing results. The easiest way to find out if an error occurred is to look for an __________ node in the response from the Amazon RDS API.
    1. Incorrect
    2. Error
    3. FALSE
  3. In the Amazon CloudWatch, which metric should I be checking to ensure that your DB Instance has enough free storage space?
    1. FreeStorage
    2. FreeStorageSpace
    3. FreeStorageVolume
    4. FreeDBStorageSpace
  4. A user is receiving a notification from the RDS DB whenever there is a change in the DB security group. The user does not want to receive these notifications for only a month. Thus, he does not want to delete the notification. How can the user configure this?
    1. Change the Disable button for notification to “Yes” in the RDS console
    2. Set the send mail flag to false in the DB event notification console
    3. The only option is to delete the notification from the console
    4. Change the Enable button for notification to “No” in the RDS console
  5. A sys admin is planning to subscribe to the RDS event notifications. For which of the below mentioned source categories the subscription cannot be configured?
    1. DB security group
    2. DB snapshot
    3. DB options group
    4. DB parameter group
  6. A user is planning to setup notifications on the RDS DB for a snapshot. Which of the below mentioned event categories is not supported by RDS for this snapshot source type?
    1. Backup (Refer link)
    2. Creation
    3. Deletion
    4. Restoration
  7. A system admin is planning to setup event notifications on RDS. Which of the below mentioned services will help the admin setup notifications?
    1. AWS SES
    2. AWS Cloudtrail
    3. AWS CloudWatch
    4. AWS SNS
  8. A user has setup an RDS DB with Oracle. The user wants to get notifications when someone modifies the security group of that DB. How can the user configure that?
    1. It is not possible to get the notifications on a change in the security group
    2. Configure SNS to monitor security group changes
    3. Configure event notification on the DB security group
    4. Configure the CloudWatch alarm on the DB for a change in the security group
  9. It is advised that you watch the Amazon CloudWatch “_____” metric (available via the AWS Management Console or Amazon Cloud Watch APIs) carefully and recreate the Read Replica should it fall behind due to replication errors.
    1. Write Lag
    2. Read Replica
    3. Replica Lag
    4. Single Replica

AWS RDS DB Maintenance & Upgrades

RDS DB Maintenance and Upgrades

  • Changes to a DB instance can occur when a DB instance is manually modified for e.g. DB engine version is upgraded, or when RDS performs maintenance on an instance

RDS Maintenance

  • RDS performs periodic maintenance on RDS resources, such as DB instances, and most often involves updates to the DB instance’s operating system (OS).
  • Maintenance items can either
    • be applied manually on a DB instance at one’s convenience
    • or wait for the automatic maintenance process initiated by RDS during the defined weekly maintenance window.
  • Maintenance window only determines when pending operations start but does not limit the total execution time of these operations.
  • Maintenance operations are not guaranteed to finish before the maintenance window ends and can continue beyond the specified end time.
  • Maintenance update availability can be checked both on the RDS console and by using the RDS API. And if an update is available, one can
    • Defer the maintenance items.
    • Apply the maintenance items immediately.
    • Schedule them to start during the next defined maintenance window
  • Maintenance items marked as
    • Required cannot be deferred indefinitely, if deferred AWS will send a notify the time when the update will be performed next
    • Available and can be deferred indefinitely and the update will not be applied to the DB instance.
  • Required patching is automatically scheduled only for patches that are related to security and instance reliability. Such patching occurs infrequently (typically once every few months) and seldom requires more than a fraction of your maintenance window.
  • Maintenance items require that RDS take the DB instance offline for a short time. Maintenance that requires DB instances to be offline includes scale compute operations, which generally take only a few minutes from start to finish, and required operating system or database patching.
  • Multi-AZ deployment for the DB instance reduces the impact of a maintenance event by following these steps:
    • Perform maintenance on standby.
    • Promote the standby to primary.
    • Perform maintenance on the old primary, which becomes the new standby.
  • When the database engine for the DB instance is modified in a Multi-AZ deployment, RDS upgrades both the primary and secondary DB instances at the same time. In this case, the database engine for the entire Multi-AZ deployment is shut down during the upgrade.

Operating System Updates

  • Upgrades to the operating system are most often for security issues and should be done as soon as possible.
  • OS updates on a DB instance can be applied at one’s convenience or can wait for the maintenance process initiated by RDS to apply the update during the defined maintenance window
  • DB instance is not automatically backed up when an OS update is applied and should be backup up before the update is applied

Database Engine Version Upgrade

  • DB instance engine version can be upgraded when a new DB engine version is supported by RDS.
  • Database version upgrades consist of major and minor version upgrades.
    • Major database version upgrades
      • can contain changes that are not backward-compatible
      • RDS doesn’t apply major version upgrades automatically
      • DB instance should be manually modified and thoroughly tested before applying it to the production instances.
    • Minor version upgrades
      • Each DB engine handles minor version upgrade slightly differently
        for e.g. RDS automatically apply minor version upgrades to a DB instance running PostgreSQL, but must be manually applied to a DB instance running Oracle.
  • Amazon posts an announcement to the forums announcement page and sends a customer e-mail notification before upgrading an DB instance
  • Amazon schedule the upgrades at specific times through the year, to help plan around them, because downtime is required to upgrade a DB engine version, even for Multi-AZ instances.
  • RDS takes two DB snapshots during the upgrade process.
    • First DB snapshot is of the DB instance before any upgrade changes have been made. If the upgrade fails, it can be restored from the snapshot to create a DB instance running the old version.
    • Second DB snapshot is taken when the upgrade completes. After the upgrade is complete, database engine can’t be reverted to the previous version. For returning to the previous version, restore the first DB snapshot taken to create a new DB instance.
  • If the DB instance is using read replication, all of the Read Replicas must be upgraded before upgrading the source instance.
  • If the DB instance is in a Multi-AZ deployment, both the primary and standby replicas are upgraded at the same time and would result in an outage. The time for the outage varies based on your database engine, version, and the size of your DB instance.

RDS Maintenance Window

  • Every DB instance has a weekly maintenance window defined during which any system changes are applied.
  • Maintenance window is an opportunity to control when DB instance modifications and software patching occur, in the event either are requested or required.
  • If a maintenance event is scheduled for a given week, it will be initiated during the 30-minute maintenance window as defined
  • Maintenance events mostly complete during the 30-minute maintenance window, although larger maintenance events may take more time
  • 30-minute maintenance window is selected at random from an 8-hour block of time per region. If you don’t specify a preferred maintenance window when you create the DB instance, Amazon RDS assigns a 30-minute maintenance window on a randomly selected day of the week.
  • RDS will consume some of the resources on the DB instance while maintenance is being applied, minimally affecting performance.
  • For some maintenance events, a Multi-AZ failover may be required for a maintenance update to be complete.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A user has launched an RDS MySQL DB with the Multi AZ feature. The user has scheduled the scaling of instance storage during maintenance window. What is the correct order of events during maintenance window? 1. Perform maintenance on standby 2. Promote standby to primary 3. Perform maintenance on original primary 4. Promote original master back as primary
    1. 1, 2, 3, 4
    2. 1, 2, 3
    3. 2, 3, 4, 1
  2. Can I control if and when MySQL based RDS Instance is upgraded to new supported versions?
    1. No
    2. Only in VPC
    3. Yes
  3. A user has scheduled the maintenance window of an RDS DB on Monday at 3 AM. Which of the below mentioned events may force to take the DB instance offline during the maintenance window?
    1. Enabling Read Replica
    2. Making the DB Multi AZ
    3. DB password change
    4. Security patching
  4. A user has launched an RDS postgreSQL DB with AWS. The user did not specify the maintenance window during creation. The user has configured RDS to update the DB instance type from micro to large. If the user wants to have it during the maintenance window, what will AWS do?
    1. AWS will not allow to update the DB until the maintenance window is configured
    2. AWS will select the default maintenance window if the user has not provided it
    3. AWS will ask the user to specify the maintenance window during the update
    4. It is not possible to change the DB size from micro to large with RDS
  5. Can I test my DB Instance against a new version before upgrading?
    1. No
    2. Yes
    3. Only in VPC

References

AWS RDS Storage

AWS RDS Storage

  • RDS storage uses Elastic Block Store – EBS volumes for database and log storage.
  • RDS automatically stripes across multiple EBS volumes to enhance IOPS performance, depending on the amount of storage requested

RDS Storage Types

  • RDS storage provides three storage types: General Purpose (SSD), Provisioned IOPS (input/output operations per second), and Magnetic.
  • These storage types differ in performance characteristics and price, which allows tailoring of storage performance and cost to the database needs
  • MySQL, MariaDB, PostgreSQL, and Oracle RDS DB instances can be created with up to 64TB of storage, and SQL Server RDS DB instances with up to 16TB of storage when using the Provisioned IOPS and General Purpose (SSD) storage types.
  • Existing MySQL, PostgreSQL, and Oracle RDS database instances can be scaled to these new database storage limits without any downtime.

Magnetic (Standard)

  • Magnetic storage, also called standard storage, offers cost-effective storage that is ideal for applications with light or burst I/O requirements.
  • They deliver approximately 100 IOPS on average, with burst capability of up to hundreds of IOPS, and they can range in size from 5 GB to 3 TB, depending on the DB instance engine.
  • Magnetic storage is not reserved for a single DB instance, so performance can vary greatly depending on the demands placed on shared resources by other customers.

General Purpose (SSD)

  • General purpose, SSD-backed storage, also called gp2, can provide faster access than disk-based storage.
  • They can deliver single-digit millisecond latencies, with a base performance of 3 IOPS per Gigabyte (GB) and the ability to burst to 3,000 IOPS for extended periods of time up to a maximum of 10,000 PIOPS.
  • General Purpose volumes can range in size from 5 GB to 6 TB for MySQL, MariaDB, PostgreSQL, and Oracle DB instances, and from 20 GB to 4 TB for SQL Server DB instances.
  • General Purpose is excellent for small to medium-sized databases.

Provisioned IOPS

  • Provisioned IOPS storage is designed to meet the needs of I/O-intensive workloads, particularly database workloads, that are sensitive to storage performance and consistency in random access I/O throughput.
  • Provisioned IOPS storage is a storage type that delivers fast, predictable, and consistent throughput performance.
  • For any production application that requires fast and consistent I/O performance, Amazon recommends Provisioned IOPS (input/output operations per second) storage.
  • Provisioned IOPS storage is optimized for I/O intensive, online transaction processing (OLTP) workloads that have consistent performance requirements.
  • Provisioned IOPS helps with performance tuning.
  • Dedicated IOPS rate and storage space allocation is specified, when a DB instance is created. RDS provisions that IOPS rate and storage for the lifetime of the DB instance or until it is changed.
  • RDS delivers within 10 percent of the provisioned IOPS performance 99.9 percent of the time over a given year.

Adding Storage and Changing Storage Type

  • DB instance can be modified to use additional storage and converted to a different storage type.
  • However, storage allocated for a DB instance cannot be decreased
  • MySQL, MariaDB, PostgreSQL, and Oracle DB instances can be scaled up for storage, which helps improve I/O capacity.
  • Storage capacity nor the type of storage for a SQL Server DB instance can be changed due to the extensibility limitations of striped storage attached to a Windows Server environment.
  • During the scaling process, the DB instance will be available for reads and writes, but may experience performance degradation
  • Adding storage may take several hours; the duration of the process depends on several factors such as load, storage size, storage type, amount of IOPS provisioned (if any), and number of prior scale storage operations.
  • While storage is being added, nightly backups are suspended and no other RDS operations can take place, including modify, reboot, delete, create Read Replica, and create DB Snapshot

Performance Metrics

  • Amazon RDS provides several metrics that can be used to determine how the DB instance is performing.
    • IOPS
      • the number of I/O operations completed per second.
      • it is reported as the average IOPS for a given time interval.
      • RDS reports read and write IOPS separately on one minute intervals.
      • Total IOPS is the sum of the read and write IOPS.
      • Typical values for IOPS range from zero to tens of thousands per second.
    • Latency
      • the elapsed time between the submission of an I/O request and its completion
      • it is reported as the average latency for a given time interval.
      • RDS reports read and write latency separately on one minute intervals in units of seconds.
      • Typical values for latency are in the millisecond (ms)
    • Throughput
      • the number of bytes per second transferred to or from disk
      • it is reported as the average throughput for a given time interval.
      • RDS reports read and write throughput separately on one minute intervals using units of megabytes per second (MB/s).
      • Typical values for throughput range from zero to the I/O channel’s maximum bandwidth.
    • Queue Depth
      • the number of I/O requests in the queue waiting to be serviced.
      • these are I/O requests that have been submitted by the application but have not been sent to the device because the device is busy servicing other I/O requests.
      • it is reported as the average queue depth for a given time interval.
      • RDS reports queue depth in one minute intervals. Typical values for queue depth range from zero to several hundred.
      • Time spent waiting in the queue is a component of Latency and
        Service Time (not available as a metric).

RDS Storage Facts

  • First time a DB instance is started and accesses an area of disk for the first time, the process can take longer than all subsequent accesses to the same disk area. This is known as the “first touch penalty”. Once an area of disk has incurred the first touch penalty, that area of disk does not incur the penalty again for the life of the instance, even if the DB instance is rebooted, restarted, or the DB instance class changes. Note that a DB instance created from a snapshot, a point-in-time restore, or a read replica is a new instance and does incur this first touch penalty.
  • RDS manages the DB instance and it reserves overhead space on the instance. While the amount of reserved storage varies by DB instance class and other factors, this reserved space can be as much as one or two percent of the total storage
  • Provisioned IOPS provides a way to reserve I/O capacity by specifying IOPS. Like any other system capacity attribute, maximum throughput under load will be constrained by the resource that is consumed first, which could be IOPS, channel bandwidth, CPU, memory, or database internal resources.
  • Current maximum channel bandwidth available is 4000 megabits per second (Mbps) full duplex. In terms of the read and write throughput metrics, this equates to about 210 megabytes per second (MB/s) in each direction. A perfectly balanced workload of 50% reads and 50% writes may attain a maximum combined throughput of 420 MB/s, which includes protocol overhead, so the actual data throughput may be less.
  • Provisioned IOPS works with an I/O request size of 32 KB. Provisioned IOPS consumption is a linear function of I/O request size above 32 KB. An I/O request smaller than 32 KB is handled as one I/O; for e.g. 1000 16 KB I/O requests are treated the same as 1000 32 KB requests. I/O requests larger than 32 KB consume more than one I/O request; while, a 48 KB I/O request consumes 1.5 I/O requests of storage capacity; a 64 KB I/O request consumes 2 I/O requests

Factors That Impact RDS Storage Performance

  • Several factors can affect the performance of a DB instance, such as instance configuration, I/O characteristics, and workload demand.
  • System related activities also consume I/O capacity and may reduce database instance performance while in progress:
    • DB snapshot creation
    • Nightly backups
    • Multi-AZ peer creation
    • Read replica creation
    • Scaling storage
  • System resources can constrain the throughput of a DB instance, but there can be other reasons for a bottleneck. Database could be the issue if :-
    • Channel throughput limit is not reached
    • Queue depths are consistently low
    • CPU utilization is under 80%
    • Free memory available
    • No swap activity
    • Plenty of free disk space
    • Application has dozens of threads all submitting transactions as fast as the database will take them, but there is clearly unused I/O capacity

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. When should I choose Provisioned IOPS over Standard RDS storage?
    1. If you have batch-oriented workloads
    2. If you use production online transaction processing (OLTP) workloads
    3. If you have workloads that are not sensitive to consistent performance
  2. Is decreasing the storage size of a DB Instance permitted?
    1. Depends on the RDMS used
    2. Yes
    3. No
  3. Because of the extensibility limitations of striped storage attached to Windows Server, Amazon RDS does not currently support increasing storage on a _____ DB Instance.
    1. SQL Server
    2. MySQL
    3. Oracle
  4. If I want to run a database in an Amazon instance, which is the most recommended Amazon storage option?
    1. Amazon Instance Storage
    2. Amazon EBS
    3. You can’t run a database inside an Amazon instance.
    4. Amazon S3
  5. For each DB Instance class, what is the maximum size of associated storage capacity?
    1. 1TiB
    2. 2TiB
    3. 8TiB
    4. 16TiB (The limit keeps on changing so please check the latest always)

References

AWS Relational Database Service – RDS

Relational Database Service – RDS

  • Relational Database Service – RDS is a web service that makes it easier to set up, operate, and scale a relational database in the cloud.
  • provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks such as hardware provisioning, database setup, patching, and backups.
  • features & benefits
    • CPU, memory, storage, and IOPs can be scaled independently.
    • manages backups, software patching, automatic failure detection, and recovery.
    • automated backups can be performed as needed, or manual backups can be triggered as well. Backups can be used to restore a database, and the restore process works reliably and efficiently.
    • provides Multi-AZ high availability with a primary instance and a synchronous standby secondary instance that can failover seamlessly when a problem occurs.
    • provides elasticity & scalability by enabling Read Replicas to increase read scaling.
    • supports MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server, and the new, MySQL-compatible Aurora DB engine
    • supports IAM users and permissions to control who has access to the RDS database service
    • databases can be further protected by putting them in a VPC, using SSL for data in transit and encryption for data in rest
    • However, as it is a managed service, shell (root ssh) access to DB instances is not provided, and this restricts access to certain system procedures and tables that require advanced privileges.

RDS Components

  • DB Instance
    • is a basic building block of RDS
    • is an isolated database environment in the cloud
    • each DB instance runs a DB engine. AWS currently supports MySQL, MariaDB, PostgreSQL, Oracle, and Microsoft SQL Server & Aurora DB engines
    • can be accessed from AWS command-line tools, RDS APIs, or the AWS Management RDS Console.
    • computation and memory capacity of a DB instance is determined by its DB instance class, which can be selected as per the needs
    • supports three storage types: Magnetic, General Purpose (SSD), and Provisioned IOPS (SSD), which differ in performance and price
    • each DB instance has a DB instance identifier, which is a customer-supplied name and must be unique for that customer in an AWS region. It uniquely identifies the DB instance when interacting with the RDS API and AWS CLI commands.
    • each DB instance can host multiple user-created databases or a single Oracle database with multiple schemas.
    • can be hosted in an AWS VPC environment for better control
  • Regions and Availability Zones
    • AWS resources are housed in highly available data center facilities in different areas of the world, these data centers are called regions which further contain multiple distinct locations called Availability Zones
    • Each AZ is engineered to be isolated from failures in other AZs and to provide inexpensive, low-latency network connectivity to other AZs in the same region
    • DB instances can be hosted in different AZs, an option called a Multi-AZ deployment.
      • RDS automatically provisions and maintains a synchronous standby replica of the DB instance in a different AZ.
      • Primary DB instance is synchronously replicated across AZs to the standby replica
      • Provides data redundancy, failover support, eliminates I/O freezes, and minimizes latency spikes during system backups.
  • Security Groups
    • security group controls the access to a DB instance, by allowing access to the specified IP address ranges or EC2 instances
  • DB Parameter Groups
    • A DB parameter group contains engine configuration values that can be applied to one or more DB instances of the same instance type
    • help define configuration values specific to the selected DB Engine for e.g. max_connections, force_ssl , autocommit
    • supports default parameter group, which cannot be edited.
    • supports custom parameter group, to override values
    • supports static and dynamic parameter groups
      • changes to dynamic parameters are applied immediately (irrespective of apply immediately setting)
      • changes to static parameters are NOT applied immediately and require a manual reboot.
  • DB Option Groups
    • Some DB engines offer tools or optional features that simplify managing the databases and making the best use of data.
    • RDS makes such tools available through option groups for e.g. Oracle Application Express (APEX), SQL Server Transparent Data Encryption, and MySQL Memcached support.

RDS Interfaces

  • RDS can be interacted with multiple interfaces
    • AWS RDS Management console
    • Command Line Interface
    • Programmatic Interfaces which include SDKs, libraries in different languages, and RDS API

RDS Multi-AZ & Read Replicas

  • Multi-AZ deployment
    • provides high availability, durability, and automatic failover support
    • helps improve the durability and availability of a critical system, enhancing availability during planned system maintenance, DB instance failure, and Availability Zone disruption.
    • automatically provisions and manages a synchronous standby instance in a different AZ.
    • automatically fails over in case of any issues with the primary instance
    • A Multi-AZ DB instance deployment has one standby DB instance that provides failover support but doesn’t serve read traffic.
    • A Multi-AZ DB cluster deployment has two standby DB instances that provide failover support and can also serve read traffic.
  • Read replicas
    • enable increased scalability and database availability in the case of an AZ failure.
    • allow elastic scaling beyond the capacity constraints of a single DB instance for read-heavy database workloads

RDS Security

  • DB instance can be hosted in a VPC for the greatest possible network access control.
  • IAM policies can be used to assign permissions that determine who is allowed to manage RDS resources.
  • Security groups allow control of what IP addresses or EC2 instances can connect to the databases on a DB instance.
  • RDS supports encryption in transit using SSL connections
  • RDS supports encryption at rest to secure instances and snapshots at rest.
  • Network encryption and transparent data encryption (TDE) with Oracle DB instances
  • Authentication can be implemented using Password, Kerberos, and IAM database authentication.

RDS Backups, Snapshot

  • Automated backups
    • are enabled by default for a new DB instance.
    • enables recovery of the database to any point in time, with database change logs, during the backup retention period, up to the last five minutes of database usage.
  • DB snapshots are manual, user-initiated backups that enable backup of the DB instance to a known state, and restore to that specific state at any time.

RDS Monitoring & Notification

  • RDS integrates with CloudWatch and provides metrics for monitoring
  • CloudWatch alarms can be created over a single metric that sends an SNS message when the alarm changes state
  • RDS also provides SNS notification whenever any RDS event occurs
  • RDS Performance Insights is a database performance tuning and monitoring feature that helps illustrate the database’s performance and help analyze any issues that affect it
  • RDS Recommendations provides automated recommendations for database resources.

RDS Pricing

  • Instance class
    • Pricing is based on the class (e.g., micro) of the DB instance consumed.
  • Running time
    • Usage is billed in one-second increments, with a minimum of 10 mins.
  • Storage
    • Storage capacity provisioned for the DB instance is billed per GB per month
    • If the provisioned storage capacity is scaled within the month, the bill will be pro-rated.
  • I/O requests per month
    • Total number of storage I/O requests made in a billing cycle.
  • Provisioned IOPS (per IOPS per month)
    • Provisioned IOPS rate, regardless of IOPS consumed, for RDS Provisioned IOPS (SSD) storage only.
    • Provisioned storage for EBS volumes is billed in one-second increments, with a minimum of 10 minutes.
  • Backup storage
    • Automated backups & any active database snapshots consume storage
    • Increasing backup retention period or taking additional database snapshots increases the backup storage consumed by the database.
    • RDS provides backup storage up to 100% of the provisioned database storage at no additional charge for e.g., if you have 10 GB-months of provisioned database storage, RDS provides up to 10 GB-months of backup storage at no additional charge.
    • Most databases require less raw storage for a backup than for the primary dataset, so if multiple backups are not maintained, you will never pay for backup storage.
    • Backup storage is free only for active DB instances.
  • Data transfer
    • Internet data transfer out of the DB instance.
  • Reserved Instances
    • In addition to regular RDS pricing, reserved DB instances can be purchased

Further Reading

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What does Amazon RDS stand for?
    1. Regional Data Server.
    2. Relational Database Service
    3. Regional Database Service.
  2. How many relational database engines does RDS currently support?
    1. MySQL, Postgres, MariaDB, Oracle, and Microsoft SQL Server
    2. Just two: MySQL and Oracle.
    3. Five: MySQL, PostgreSQL, MongoDB, Cassandra and SQLite.
    4. Just one: MySQL.
  3. If I modify a DB Instance or the DB parameter group associated with the instance, should I reboot the instance for the changes to take effect?
    1. No
    2. Yes
  4. What is the name of licensing model in which I can use your existing Oracle Database licenses to run Oracle deployments on Amazon RDS?
    1. Bring Your Own License
    2. Role Bases License
    3. Enterprise License
    4. License Included
  5. Will I be charged if the DB instance is idle?
    1. No
    2. Yes
    3. Only is running in GovCloud
    4. Only if running in VPC
  6. What is the minimum charge for the data transferred between Amazon RDS and Amazon EC2 Instances in the same Availability Zone?
    1. USD 0.10 per GB
    2. No charge. It is free.
    3. USD 0.02 per GB
    4. USD 0.01 per GB
  7. Does Amazon RDS allow direct host access via Telnet, Secure Shell (SSH), or Windows Remote Desktop Connection?
    1. Yes
    2. No
    3. Depends on if it is in VPC or not
  8. What are the two types of licensing options available for using Amazon RDS for Oracle?
    1. BYOL and Enterprise License
    2. BYOL and License Included
    3. Enterprise License and License Included
    4. Role based License and License Included
  9. A user plans to use RDS as a managed DB platform. Which of the below mentioned features is not supported by RDS?
    1. Automated backup
    2. Automated scaling to manage a higher load
    3. Automated failure detection and recovery
    4. Automated software patching
  10. A user is launching an AWS RDS with MySQL. Which of the below mentioned options allows the user to configure the InnoDB engine parameters?
    1. Options group
    2. Engine parameters
    3. Parameter groups
    4. DB parameters
  11. A user is planning to use the AWS RDS with MySQL. Which of the below mentioned services the user is not going to pay?
    1. Data transfer
    2. RDS CloudWatch metrics
    3. Data storage
    4. I/O requests per month

References

AWS_Relational_Database_Service_RDS

AWS RDS Best Practices

AWS RDS Best Practices

AWS recommends RDS best practices in terms of Monitoring, Performance, and security

RDS Basic Operational Guidelines

  • Monitoring
    • Memory, CPU, and storage usage should be monitored.
    • CloudWatch can be setup for notifications when usage patterns change or when the capacity of deployment is approached, so that system performance and availability can be maintained
  • Scaling
    • Scale up the DB instance when approaching storage capacity limits.
    • There should be some buffer in storage and memory to accommodate unforeseen increases in demand from the applications.
  • Backups
    • Enable Automatic Backups and set the backup window to occur during the daily low in WriteIOPS.
    • Use Multi-AZ to reduce to impact of backups on the primary DB instance.
  • On a MySQL DB instance,
    • Do not create more than 10,000 tables using Provisioned IOPS or 1000 tables using standard storage. Large numbers of tables will significantly increase database recovery time after a failover or database crash. If you need to create more tables than recommended, set the innodb_file_per_table parameter to 0.
    • Avoid tables in the database growing too large. Provisioned storage limits restrict the maximum size of a MySQL table file to 6 TB. Instead, partition the large tables so that file sizes are well under the 6 TB limit. This can also improve performance and recovery time.
  • Performance
    • If the database workload requires more I/O than provisioned, recovery after a failover or database failure will be slow.
    • To increase the I/O capacity of a DB instance,
      • Migrate to a DB instance class with High I/O capacity.
      • Convert from standard storage to Provisioned IOPS storage, and use a DB instance class that is optimized for Provisioned IOPS.
      • if using Provisioned IOPS storage, provision additional throughput capacity.
  • Multi-AZ & Failover
    • Deploy applications in all Availability Zones, so if an AZ goes down, applications in other AZs will still be available.
    • Use RDS DB events to monitor failovers.
    • Set a TTL of less than 30 seconds, if the client application is caching the DNS data of the DB instances. As the underlying IP address of a DB instance can change after a failover, caching the DNS data for an extended time can lead to connection failures if the application tries to connect to an IP address that no longer is in service.
    • Multi-AZ requires the transaction logging feature to be enabled. Do not use features like Simple recover mode, offline mode or Read-only mode which turn of transaction logging.
    • To shorten failover time
      • Ensure that sufficient Provisioned IOPS allocated for your workload. Inadequate I/O can lengthen failover times. Database recovery requires I/O.
      • Use smaller transactions. Database recovery relies on transactions, so break up large transactions into multiple smaller transactions to shorten failover time
    • Test failover for your DB instance to understand how long the process takes for your use case and to ensure that the application that accesses your DB instance can automatically connect to the new DB instance after failover.

DB Instance RAM Recommendations

  • An RDS performance best practice is to allocate enough RAM so that the working set resides almost completely in memory.
  • Value of ReadIOPS should be small and stable.
  • ReadIOPS metric can be checked, using AWS CloudWatch while the DB instance is under load, to tell if the working set is almost all in memory
  • If scaling up the DB instance class with more RAM, results in a dramatic drop in ReadIOPS, the working set was not almost completely in memory.
  • Continue to scale up until ReadIOPS no longer drops dramatically after a scaling operation, or ReadIOPS is reduced to a very small amount.

RDS Security Best Practices

  • Do not use AWS root credentials to manage RDS resources, and IAM users should be created for everyone,
  • Grant each user the minimum set of permissions required to perform his or her duties.
  • Use IAM groups to effectively manage permissions for multiple users.
  • Rotate your IAM credentials regularly.

Using Enhanced Monitoring to Identify Operating System Issues

  • RDS provides metrics in real time for the operating system (OS) that your DB instance runs on.
  • Enhanced monitoring is available for all DB instance classes except for db.t1.micro and db.m1.small.

Using Metrics to Identify Performance Issues

  • To identify performance issues caused by insufficient resources and other common bottlenecks, you can monitor the metrics available for your Amazon RDS DB instance
  • Performance metrics should be monitored on a regular basis to benchmark the average, maximum, and minimum values for a variety of time ranges. to help identify performance degradation.
  • CloudWatch alarms can be set for particular metric thresholds to be alerted when they are reached or breached
  • A DB instance has a number of different categories of metrics which includes CPU, memory, disk space, IOPS, db connections and network traffic, and how to determine acceptable values depends on the metric.
  • One of the best ways to improve DB instance performance is to tune the most commonly used and most resource-intensive queries to make them less expensive to run.

Recovery

  • MySQL
    • InnoDB is the recommended and supported storage engine for MySQL DB instances on Amazon RDS.
    • However, MyISAM performs better than InnoDB if you require intense, full-text search capability.
    • Point-In-Time Restore and snapshot restore features of Amazon RDS for MySQL require a crash-recoverable storage engine and are supported for the InnoDB storage engine only.
    • Although MySQL supports multiple storage engines with varying capabilities, not all of them are optimized for crash recovery and data durability.
    • MyISAM storage engine does not support reliable crash recovery and might prevent a Point-In-Time Restore or snapshot restore from working as intended which might result in lost or corrupt data when MySQL is restarted after a crash.
  • MariaDB
    • XtraDB is the recommended and supported storage engine for MariaDB DB instances on Amazon RDS.
    • Point-In-Time Restore and snapshot restore features of Amazon RDS for MariaDB require a crash-recoverable storage engine and are supported for the XtraDB storage engine only.
    • Although MariaDB supports multiple storage engines with varying capabilities, not all of them are optimized for crash recovery
      and data durability.
    • For e.g although Aria is a crash-safe replacement for MyISAM, it might still prevent a Point-In-Time Restore or snapshot restore from working as intended. This might result in lost or corrupt data when MariaDB is restarted after a crash.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You are running a database on an EC2 instance, with the data stored on Elastic Block Store (EBS) for persistence At times throughout the day, you are seeing large variance in the response times of the database queries Looking into the instance with the isolate command you see a lot of wait time on the disk volume that the database’s data is stored on. What two ways can you improve the performance of the database’s storage while maintaining the current persistence of the data? Choose 2 answers
    1. Move to an SSD backed instance
    2. Move the database to an EBS-Optimized Instance
    3. Use Provisioned IOPs EBS
    4. Use the ephemeral storage on an m2.4xLarge Instance Instead
  2. Amazon RDS automated backups and DB Snapshots are currently supported for only the __________ storage engine
    1. InnoDB
    2. MyISAM

References

AWS_RDS_Best_Practices

AWS Storage Options – RDS, DynamoDB & Database on EC2

AWS Storage Options Whitepaper with RDS, DynamoDB & Database on EC2 Cont.

Provides a brief summary for the Ideal Use cases, Anti-Patterns and other factors for Amazon RDS, DynamoDB & Databases on EC2 storage options

Amazon RDS

  • RDS is a web service that provides the capabilities of MySQL, Oracle, MariaDB, Postgres or Microsoft SQL Server relational database as a managed, cloud-based service
  • RDS eliminates much of the administrative overhead associated with launching, managing, and scaling your own relational database on Amazon EC2 or in another computing environment.

Ideal Usage Patterns

  • RDS is a great solution for cloud-based fully-managed relational database
  • RDS is also optimal for new applications with structured data that requires more sophisticated querying and joining capabilities than that provided by Amazon’s NoSQL database offering, DynamoDB.
  • RDS provides full compatibility with the databases supported and direct access to native database engines, code and libraries and is ideal for existing applications that rely on these databases

Anti-Patterns

  • Index and query-focused data
    • If the applications don’t require advanced features such as joins and complex transactions and is more oriented toward indexing and querying data, DynamoDB would be more appropriate for this needs
  • Numerous BLOBs
    • If the application makes heavy use of files (audio files, videos, images, etc), it is a better choice to use S3 to store the objects instead of database engines Blob feature and use RDS or DynamoDB only to save the metadata
  • Automated scalability
    • RDS provides pushbutton scaling and it only scales up and has limited scale out ability. If fully-automated scaling is needed, DynamoDB may be a better choice.
  • Complete control
    • RDS does not provide admin access and does not enable the full feature set of the database engines.
    • So if the application requires complete, OS-level control of the database server with full root or admin login privileges, a self-managed database on EC2 may be a better match.
  • Other database platforms
    • RDS, at this time, provides a MySQL, Oracle, MariaDB, PostgreSQL and SQL Server databases.
    • If any other database platform (such as IBM DB2, Informix, or Sybase) is needed, it should be deployed on a self-managed database on an EC2 instance by using a relational database AMI, or by installing database software on an EC2 instance.

Performance

  • RDS Provisioned IOPS, where the IOPS can be specified when the instance is launched and is guaranteed over the life of the instance, provides a high-performance storage option designed to deliver fast, predictable, and consistent performance for I/O intensive transactional database workload

Durability and Availability

  • RDS leverages Amazon EBS volumes as its data store
  • RDS provides database backups, for enhanced durability, which are replicated across multiple AZ’s
    • Automated backups
      • If enabled, RDS will automatically perform a full daily backup of your data during the specified backup window, and will also capture DB transaction logs
    • User initiated backups
      • User can initiate backups at time and they are not deleted unless deleted explicitly by the user
  • RDS Multi AZ’s feature enhances both the durability and the availability of the database by synchronously replicating the data between a primary RDS DB instance and a standby instance in another Availability Zone, which prevents data loss,
  • RDS provides a DNS endpoint and in case of an failure on the primary, it automatically fails over to the standby instance
  • RDS also allows Read replicas for the supported databases, which are replicated asynchronously

Cost Model

  • RDS offers a tiered pricing structure, based on the size of the database instance, the deployment type (Single-AZ/Multi-AZ), and the AWS region.
  • Pricing for RDS is based on several factors: the DB instance hours (per hour), the amount of provisioned database storage (per GB-month and per million I/O requests), additional backup storage (per GB-month), and data transfer in/out (per GB per month)

Scalability and Elasticity

  • RDS resources can be scaled elastically in several dimensions: database storage size, database storage IOPS rate, database instance compute capacity, and the number of read replicas
  • RDS supports “pushbutton scaling” of both database storage and compute resources. Additional storage can either be added immediately or during the next maintenance cycle
  • RDS for MySQL also enables you to scale out beyond the capacity of a single database deployment for read-heavy database workloads by creating one or more read replicas.
  • Multiple RDS instances can also be configured to leverage database partitioning or sharding to spread the workload over multiple DB instances, achieving even greater database scalability and elasticity.

Interfaces

  • RDS APIs and the AWS Management Console provide a management interface that allows you to create, delete, modify, and terminate RDS DB instances; to create DB snapshots; and to perform point-in-time restores
  • There is no AWS data API for Amazon RDS.
  • Once a database is created, RDS provides a DNS endpoint for the database which can be used to connect to the database.
  • Endpoint does not change over the lifetime of the instance even during the failover in case of Multi-AZ configuration

Amazon DynamoDB

  • Amazon DynamoDB is a fast, fully-managed NoSQL database service that makes it simple and cost-effective to store and retrieve any amount of data, and serve any level of request traffic.
  • DynamoDB being a managed service helps offload the administrative burden of operating and scaling a highly-available distributed database cluster.
  • DynamoDB helps meet the latency and throughput requirements of highly demanding applications by providing extremely fast and predictable performance with seamless throughput and storage scalability.
  • DynamoDB provides both eventually-consistent reads (by default), and strongly-consistent reads (optional), as well as implicit item-level transactions for item put, update, delete, conditional operations, and increment/decrement.
  • Amazon DynamoDB handles the data as below :-
    • DynamoDB stores structured data in tables, indexed by primary key, and allows low-latency read and write access to items.
    • DynamoDB supports three data types: number, string, and binary, in both scalar and multi-valued sets.
    • Tables do not have a fixed schema, so each data item can have a different number of attributes.
    • Primary key can either be a single-attribute hash key or a composite hash-range key.
    • Local secondary indexes provide additional flexibility for querying against attributes other than the primary key.

Ideal Usage Patterns

  • DynamoDB is ideal for existing or new applications that need a flexible NoSQL database with low read and write latencies, and the ability to scale storage and throughput up or down as needed without code changes or downtime.
  • Use cases require a highly available and scalable database because downtime or performance degradation has an immediate negative impact on an organization’s business. for e.g. mobile apps, gaming, digital ad serving, live voting and audience interaction for live events, sensor networks, log ingestion, access control for web-based content, metadata storage for S3 objects, e-commerce shopping carts, and web session management

Anti-Patterns

  • Structured data with Join and/or Complex Transactions
    • If the application uses structured data and required joins, complex transactions or other relationship infrastructure provided by traditional database platforms, it is better to use RDS or Database installed on an EC2 instance
  • Large Blob data
    • If the application uses large blob data for e.g. media, files, videos etc., it is better to use S3 to store the objects and use DynamoDB to store metadata for e.g. name, size, content-type etc
  • Large Objects with Low I/O rate
    • DynamoDB uses SSD drives and is optimized for workloads with a high I/O rate per GB stored. If the applications stores very large amounts of data that are infrequently accessed, S3 might be a better choice
  • Prewritten application with databases
    • For Porting an existing application using databases, RDS or database installed on the EC2 instance would be a better and seamless solution

Performance

  • SSDs and limited indexing on attributes provides high throughput and low latency and drastically reduces the cost of read and write operations.
  • Predictable performance can be achieved by defining the provisioned throughput capacity required for a given table.
  • DynamoDB handles the provisioning of resources to achieve the requested throughput rate, taking away the burden to think about instances, hardware, memory, and other factors that can affect an application’s throughput rate.
  • Provisioned throughput capacity reservations are elastic and can be increased or decreased on demand.

Durability and Availability

  • DynamoDB has built-in fault tolerance that automatically and synchronously replicates data across three AZ’s in a region for high availability and to help protect data against individual machine, or even facility failures.

Cost Model

  • DynamoDB has three pricing components: provisioned throughput capacity (per hour), indexed data storage (per GB per month), data transfer in or out (per GB per month)

Scalability and Elasticity

  • DynamoDB is both highly-scalable and elastic.
  • DynamoDB provides unlimited storage capacity, and the service automatically allocates more storage as the demand increases
  • Data is automatically partitioned and re-partitioned as needed, while the use of SSDs provides predictable low-latency response times at any scale.
  • DynamoDB is also elastic, in that you can simply “dial-up” or “dial-down” the read and write capacity of a table as your needs change.

Interfaces

  • DynamoDB provides a low-level REST API, as well as higher-level SDKs in different languages
  • APIs provide both a management and data interface for Amazon DynamoDB, that enable table management (creating, listing, deleting, and obtaining metadata) and working with attributes (getting, writing, and deleting attributes; query using an index, and full scan).

Databases on EC2

  • EC2 with EBS volumes allows hosting a self managed relational database
  • Ready to use, prebuilt AMIs are also available from leading database solutions

Ideal Usage Patterns

  • Self managed database on EC2 is an ideal scenario for users whose application requires a specific traditional relational database not supported by Amazon RDS for e.g. IBM DB2, Informix, or Sybase
  • Users or applications that require a maximum level of administrative control and configurability which is not provided by RDS

Anti-Patterns

  • Index and query-focused data
    • If the applications don’t require advanced features such as joins and complex transactions and is more oriented toward indexing and querying data, DynamoDB would be more appropriate for this needs
  • Numerous BLOBs
    • If the application makes heavy use of files (audio files, videos, images, and so on), it is a better choice to use S3 to store the objects instead of database engines Blob feature and use RDS or DynamoDB only to save the metadata
  • Automated scalability
    • Relational databases on EC2 leverages the scalability and elasticity of the underlying AWS platform, but this requires system administrators or DBAs to perform a manual or scripted task. If you need pushbutton scaling or fully-automated scaling, DynamoDB or RDS may be a better choice.
  • RDS supported database platforms
    • If the application using RDS supported database engine and all the features are available, RDS would be a better choice instead of self managed relational database on EC2

Performance

  • Performance depends on the size of the underlying EC2 instance, the number and configuration of the EBS volumes and the database itself
  • Performance can be increased by scaling up memory and compute resources by choosing a larger Amazon EC2 instance size.
  • For database storage, it is usually best to use EBS Provisioned IOPS volumes. To scale up I/O performance, the Provisioned IOPS can be increased, the number of EBS volumes changed, or use software RAID 0 (disk striping) across multiple EBS volumes, which will aggregate total IOPS and bandwidth.

Durability & Availability

  • As the database on EC2 uses EBS as storage, it has the same durability and availability provided by EBS and can be further enhanced by using EBS snapshots or by using third-party database backup utilities (such as Oracle’s RMAN) to store database backups in Amazon S3

Cost Model

  • Cost for running a database on EC2 instance is mainly determined by the size and the number of EC2 instance running, the size of the EBS volume used for database storage and any third party licensing cost for the database

Scalability & Elasticity

  • Users of traditional relational database solutions on Amazon EC2 can take advantage of the scalability and elasticity of the underlying AWS platform by creating AMI and spawning multiple instances

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Which of the following are use cases for Amazon DynamoDB? Choose 3 answers
    1. Storing BLOB data.
    2. Managing web sessions
    3. Storing JSON documents
    4. Storing metadata for Amazon S3 objects
    5. Running relational joins and complex updates.
    6. Storing large amounts of infrequently accessed data.
  2. A client application requires operating system privileges on a relational database server. What is an appropriate configuration for highly available database architecture?
    1. A standalone Amazon EC2 instance
    2. Amazon RDS in a Multi-AZ configuration
    3. Amazon EC2 instances in a replication configuration utilizing a single Availability Zone
    4. Amazon EC2 instances in a replication configuration utilizing two different Availability Zones
  3. You are developing a new mobile application and are considering storing user preferences in AWS, which would provide a more uniform cross-device experience to users using multiple mobile devices to access the application. The preference data for each user is estimated to be 50KB in size. Additionally 5 million customers are expected to use the application on a regular basis. The solution needs to be cost-effective, highly available, scalable and secure, how would you design a solution to meet the above requirements?
    1. Setup an RDS MySQL instance in 2 availability zones to store the user preference data. Deploy a public facing application on a server in front of the database to manage security and access credentials
    2. Setup a DynamoDB table with an item for each user having the necessary attributes to hold the user preferences. The mobile application will query the user preferences directly from the DynamoDB table. Utilize STS. Web Identity Federation, and DynamoDB Fine Grained Access Control to authenticate and authorize access (DynamoDB provides high availability as it synchronously replicates data across three facilities within an AWS Region and scalability as it is designed to scale its provisioned throughput up or down while still remaining available. Also suitable for storing user preference data)
    3. Setup an RDS MySQL instance with multiple read replicas in 2 availability zones to store the user preference data .The mobile application will query the user preferences from the read replicas. Leverage the MySQL user management and access privilege system to manage security and access credentials.
    4. Store the user preference data in S3 Setup a DynamoDB table with an item for each user and an item attribute pointing to the user’ S3 object. The mobile application will retrieve the S3 URL from DynamoDB and then access the S3 object directly utilize STS, Web identity Federation, and S3 ACLs to authenticate and authorize access.
  4. A customer is running an application in US-West (Northern California) region and wants to setup disaster recovery failover to the Asian Pacific (Singapore) region. The customer is interested in achieving a low Recovery Point Objective (RPO) for an Amazon RDS multi-AZ MySQL database instance. Which approach is best suited to this need?
    1. Synchronous replication
    2. Asynchronous replication
    3. Route53 health checks
    4. Copying of RDS incremental snapshots
  5. You are designing a file -sharing service. This service will have millions of files in it. Revenue for the service will come from fees based on how much storage a user is using. You also want to store metadata on each file, such as title, description and whether the object is public or private. How do you achieve all of these goals in a way that is economical and can scale to millions of users?
    1. Store all files in Amazon Simple Storage Service (53). Create a bucket for each user. Store metadata in the filename of each object, and access it with LIST commands against the S3 API.
    2. Store all files in Amazon 53. Create Amazon DynamoDB tables for the corresponding key -value pairs on the associated metadata, when objects are uploaded.
    3. Create a striped set of 4000 IOPS Elastic Load Balancing volumes to store the data. Use a database running in Amazon Relational Database Service (RDS) to store the metadata.
    4. Create a striped set of 4000 IOPS Elastic Load Balancing volumes to store the data. Create Amazon DynamoDB tables for the corresponding key-value pairs on the associated metadata, when objects are uploaded.
  6. Company ABCD has recently launched an online commerce site for bicycles on AWS. They have a “Product” DynamoDB table that stores details for each bicycle, such as, manufacturer, color, price, quantity and size to display in the online store. Due to customer demand, they want to include an image for each bicycle along with the existing details. Which approach below provides the least impact to provisioned throughput on the “Product” table?
    1. Serialize the image and store it in multiple DynamoDB tables
    2. Create an “Images” DynamoDB table to store the Image with a foreign key constraint to the “Product” table
    3. Add an image data type to the “Product” table to store the images in binary format
    4. Store the images in Amazon S3 and add an S3 URL pointer to the “Product” table item for each image