AWS RDS Monitoring & Notification

AWS RDS Monitoring & Notification

  • RDS integrates with CloudWatch and provides metrics for monitoring
  • CloudWatch alarms can be created over a single metric that sends an SNS message when the alarm changes state
  • RDS also provides SNS notification whenever any RDS event occurs
  • RDS Performance Insights is a database performance tuning and monitoring feature that helps illustrate the databaseā€™s performance and help analyze any issues that affect it
  • RDS Recommendations provides automated recommendations for database resources.

Ā RDS CloudWatch Monitoring

  • RDS DB instance can be monitored using CloudWatch, which collects and processes raw data from RDS into readable, near real-time metrics.
  • Statistics are recorded so that you can access historical information and gain a better perspective on how the service is performing.
  • By default, RDS metric data is automatically sent to CloudWatch in 1-minute periods
  • CloudWatch RDS Metrics
    • BinLogDiskUsage ā€“ Amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.
    • CPUUtilization ā€“ Percentage of CPU utilization.
    • DatabaseConnections ā€“ Number of database connections in use.
    • DiskQueueDepth ā€“ The number of outstanding IOs (read/write requests) waiting to access the disk.
    • FreeableMemory ā€“ Amount of available random access memory.
    • FreeStorageSpace ā€“ Amount of available storage space.
    • ReplicaLag ā€“ Amount of time a Read Replica DB instance lags behind the source DB instance.
    • SwapUsage ā€“ Amount of swap space used on the DB instance.
    • ReadIOPS ā€“ Average number of disk I/O operations per second.
    • WriteIOPS ā€“ Average number of disk I/O operations per second.
    • ReadLatency ā€“ Average amount of time taken per disk I/O operation.
    • WriteLatency ā€“ Average amount of time taken per disk I/O operation.
    • ReadThroughput ā€“ Average number of bytes read from disk per second.
    • WriteThroughput ā€“ Average number of bytes written to disk per second.
    • NetworkReceiveThroughput ā€“ Incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
    • NetworkTransmitThroughput ā€“ Outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.

RDS Enhanced Monitoring

  • RDS provides metrics in real-time for the operating system (OS) that the DB instance runs on.
  • By default, Enhanced Monitoring metrics are stored for 30 days in the CloudWatch Logs, which are different from typical CloudWatch metrics.

CloudWatch vs Enhanced Monitoring Metrics

  • CloudWatch gathers metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.
  • Enhanced Monitoring metrics are useful to understand how different processes or threads on a DB instance use the CPU.
  • There might be differences between the measurements because the hypervisor layer performs a small amount of work. The differences can be greater if the DB instances use smaller instance classes because then there are likely more virtual machines (VMs) that are managed by the hypervisor layer on a single physical instance.

RDS Performance Insights

  • Performance Insights is a database performance tuning and monitoring feature that helps check the databaseā€™s performance and helps analyze any issues that affect it.
  • Database load is measured using a metric called Average Active Sessions or AAS which is calculated by sampling memory to determine the state of each active database connection.
  • AAS is the total number of sessions divided by the total number of samples for a specific time period.
  • Performance Insights help visualize the database load and filter the load by waits, SQL statements, hosts, or users.

RDS CloudTrail Logs

  • CloudTrail provides a record of actions taken by a user, role, or an AWS service in RDS.
  • CloudTrail captures all API calls for RDS as events, including calls from the console and from code calls to RDS API operations.
  • CloudTrail can help determine the request that was made to RDS, the IP address from which the request was made, who made the request, when it was made, and additional details.

RDSĀ Recommendations

  • RDS provides automated recommendations for database resources.
  • The recommendations provide best practice guidance by analyzing DB instance configuration, usage, and performance data.

RDS Event Notification

  • RDS uses the SNS to provide notification when an RDS event occurs
  • RDS groups theĀ events into categories, which can be subscribed so that a notification is sent when an event in that category occurs.
  • Event category for a DB instance, DB cluster, DB snapshot, DB cluster snapshot, DB security group, or for a DB parameter group can be subscribed
  • Event notifications are sent to the email addresses provided during subscription creation
  • Subscriptions can be easily turned off without deleting a subscription by setting the Enabled radio button to No in the RDS console or by setting the Enabled parameter to false using the CLI or RDS API.

RDS Trusted Advisor

  • Trusted Advisor inspects the AWS environment and then makes recommendations when opportunities exist to save money, improve system availability and performance, or help close security gaps.
  • Trusted Advisor has the following RDS-related checks:
    • RDS Idle DB Instances
    • RDS Security Group Access Risk
    • RDS Backups
    • RDS Multi-AZ

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. You run a web application with the following components Elastic Load Balancer (ELB), 3 Web/Application servers, 1 MySQL RDS database with read replicas, and Amazon Simple Storage Service (Amazon S3) for static content. Average response time for users is increasing slowly. What three CloudWatch RDS metrics will allow you to identify if the database is the bottleneck? Choose 3 answers
    1. The number of outstanding IOs waiting to access the disk
    2. The amount of write latency
    3. The amount of disk space occupied by binary logs on the master.
    4. The amount of time a Read Replica DB Instance lags behind the source DB Instance
    5. The average number of disk I/O operations per second.
  2. Typically, you want your application to check whether a request generated an error before you spend any time processing results. The easiest way to find out if an error occurred is to look for an __________ node in the response from the Amazon RDS API.
    1. Incorrect
    2. Error
    3. FALSE
  3. In the Amazon CloudWatch, which metric should I be checking to ensure that your DB Instance has enough free storage space?
    1. FreeStorage
    2. FreeStorageSpace
    3. FreeStorageVolume
    4. FreeDBStorageSpace
  4. A user is receiving a notification from the RDS DB whenever there is a change in the DB security group. The user does not want to receive these notifications for only a month. Thus, he does not want to delete the notification. How can the user configure this?
    1. Change the Disable button for notification to ā€œYesā€ in the RDS console
    2. Set the send mail flag to false in the DB event notification console
    3. The only option is to delete the notification from the console
    4. Change the Enable button for notification to ā€œNoā€ in the RDS console
  5. A sys admin is planning to subscribe to the RDS event notifications. For which of the below mentioned source categories the subscription cannot be configured?
    1. DB security group
    2. DB snapshot
    3. DB options group
    4. DB parameter group
  6. A user is planning to setup notifications on the RDS DB for a snapshot. Which of the below mentioned event categories is not supported by RDS for this snapshot source type?
    1. Backup (Refer link)
    2. Creation
    3. Deletion
    4. Restoration
  7. A system admin is planning to setup event notifications on RDS. Which of the below mentioned services will help the admin setup notifications?
    1. AWS SES
    2. AWS Cloudtrail
    3. AWS CloudWatch
    4. AWS SNS
  8. A user has setup an RDS DB with Oracle. The user wants to get notifications when someone modifies the security group of that DB. How can the user configure that?
    1. It is not possible to get the notifications on a change in the security group
    2. Configure SNS to monitor security group changes
    3. Configure event notification on the DB security group
    4. Configure the CloudWatch alarm on the DB for a change in the security group
  9. It is advised that you watch the Amazon CloudWatch ā€œ_____ā€ metric (available via the AWS Management Console or Amazon Cloud Watch APIs) carefully and recreate the Read Replica should it fall behind due to replication errors.
    1. Write Lag
    2. Read Replica
    3. Replica Lag
    4. Single Replica

9 thoughts on ā€œAWS RDS Monitoring & Notificationā€

    1. Not everything, but few of them are very important for SysOps like Replica Lag which is usually tested.

    1. The average would not help in bottleneck.
      The other factors clearly indicate the bottlenecks, the lags and outstanding requests.

  1. Thank you for detailed explanation.
    i have one query on the same line , is there a way to get notification when users logged in as ā€œROOTā€

    1. Logging into the RDS using ROOT ? This needs to be database specific as it is outside the AWS same like ssh to EC2 instance.

  2. Hi,

    I had a question.
    When an application server is not able to connect the RDS MySQL server where can we troubleshoot what is the cause?
    VPC Flow Logs
    RDS error logs

    I remember these above two options

    Thank You.

  3. For Q1, who will the option d will have an impact on the response time? Even if there is a replication lag, still read operations to read replica will not be delayed but data as present at that point of time will be responded.

Comments are closed.