AWS EC2 Monitoring – Certification

EC2 Monitoring

Status Checks

  • Status monitoring help quickly determine whether EC2 has detected any problems that might prevent instances from running applications.
  • EC2 performs automated checks on every running EC2 instance to identify hardware and software issues.
  • Status checks are performed every minute and each returns a pass or a fail status.
  • If all checks pass, the overall status of the instance is OK.
  • If one or more checks fail, the overall status is Impaired.
  • Status checks are built into EC2, so they cannot be disabled or deleted.
  • Status checks data augments the information that EC2 already provides about the intended state of each instance (such as pending, running, stopping) as well as the utilization metrics that Amazon CloudWatch monitors (CPU utilization, network traffic, and disk activity).
  • Alarms can be created, or deleted, that are triggered based on the result of the status checks. for e.g., an alarm can be created to warn if status checks fail on a specific instance.

System Status Checks

  • monitor the AWS systems required to use your instance to ensure they are working properly.
  • detect problems with the instance that require AWS involvement to repair.
  • When a system status check fails, one can either
    • wait for AWS to fix the issue
    • or resolve it by by stopping and restarting or terminating and replacing an instance
  • System status checks failure might be cause of
    • Loss of network connectivity
    • Loss of system power
    • Software issues on the physical host
    • Hardware issues on the physical host

Instance Status Checks

  • monitor the software and network configuration of the individual instance
  • checks detect problems that requires involvement to repair.
  • When an instance status check fails, it can be resolved by either rebooting the instance or by making modifications in the operating system
  • Instance status checks failure might be cause of
    • Failed system status checks
    • Misconfigured networking or startup configuration
    • Exhausted memory
    • Corrupted file system
    • Incompatible kernel

CloudWatch Monitoring

  • CloudWatch, helps monitor EC2 instances, which collects and processes
    raw data from EC2 into readable, near real-time metrics.
  • Statistics are recorded for a period of two weeks, so that historical information can be accessed and used to gain a better perspective on how
    the application or service is performing.
  • By default Basic monitoring is enabled and EC2 metric data is sent to CloudWatch in 5-minute periods automatically
  • Detailed monitoring can be enabled on EC2 instance, which sends data to CloudWatch in 1-minute periods.
  • Aggregating Statistics Across Instances/ASG/AMI ID
    • Aggregate statistics are available for the instances that have detailed monitoring (at an additional charge) enable, which provides data in 1-minute periods
    • Instances that use basic monitoring are not included in the aggregates.
    • CloudWatch does not aggregate data across Regions. Therefore, metrics are completely separate between Regions.
    • CloudWatch returns statistics for all dimensions in the AWS/EC2 namespace, if no dimension is specified
    • The technique for retrieving all dimensions across an AWS namespace does not work for custom namespaces published to CloudWatch.
    • Statistics include Sum, Average, Minimum, Maximum, Data Samples
    • With custom namespaces, the complete set of dimensions that are associated with any given data point to retrieve statistics that include the data point must be specified
  • CloudWatch alarms
    • can be created to monitor any one of the EC2 instance’s metrics.
    • can be configured to automatically send you a notification when the metric reaches a specified threshold.
    • can automatically stop, terminate, reboot, or recover EC2 instances
    • can automatically recover an EC2 instance when the instance becomes impaired due to an underlying hardware failure a problem that requires AWS involvement to repair
    • can automatically stop or terminate the instances to save costs (EC2 instances that use an EBS volume as the root device can be stopped
      or terminated, whereas instances that use the instance store as the root device can only be terminated)
    • can use EC2ActionsAccess IAM role, which enables AWS to perform stop, terminate, or reboot actions on EC2 instances
    • If you have read/write permissions for CloudWatch but not for EC2, alarms can still be created but the stop or terminate actions won’t be performed on the EC2 instance

EC2 Metrics

  • CPUCreditUsage
    • (Only valid for T2 instances) The number of CPU credits consumed
      during the specified period.
    • This metric identifies the amount of time during which physical CPUs
      were used for processing instructions by virtual CPUs allocated to
      the instance.
    • CPU Credit metrics are available at a 5 minute frequency.
  • CPUCreditBalance
    • (Only valid for T2 instances) The number of CPU credits that an instance has accumulated.
    • This metric is used to determine how long an instance can burst beyond its baseline performance level at a given rate.
    • CPU Credit metrics are available at a 5 minute frequency.
  • CPUUtilization
    • % of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance.
  • DiskReadOps
    • Completed read operations from all instance store volumes available to the instance in a specified period of time.
  • DiskWriteOps
    • Completed write operations to all instance store volumes available to the instance in a specified period of time.
  • DiskReadBytes
    • Bytes read from all instance store volumes available to the instance.
    • This metric is used to determine the volume of the data the application reads from the hard disk of the instance.
    • This can be used to determine the speed of the application.
  • DiskWriteBytes
    • Bytes written to all instance store volumes available to the instance.
    • This metric is used to determine the volume of the data the application writes onto the hard disk of the instance.
    • This can be used to determine the speed of the application.
  • NetworkIn
    • The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an application on a single instance.
  • NetworkOut
    • The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic to an application on a single instance.
  • NetworkPacketsIn
    • The number of packets received on all network interfaces by the instance. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance.
    • This metric is available for basic monitoring only
  • NetworkPacketsOut
    • The number of packets sent out on all network interfaces by the instance. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance.
    • This metric is available for basic monitoring only.
  • StatusCheckFailed
    • Reports if either of the status checks, StatusCheckFailed_Instance and StatusCheckFailed_System, that has failed.
    • Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at 1 minute frequency
  • StatusCheckFailed_Instance
    • Reports whether the instance has passed the Amazon EC2 instance status check in the last minute.
    • Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at 1 minute frequency
  • StatusCheckFailed_System
    • Reports whether the instance has passed the EC2 system status check in the last minute.
    • Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
    • Status check metrics are available at a 1 minute frequency

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. In the basic monitoring package for EC2, Amazon CloudWatch provides the following metrics:
    1. Web server visible metrics such as number failed transaction requests
    2. Operating system visible metrics such as memory utilization
    3. Database visible metrics such as number of connections
    4. Hypervisor visible metrics such as CPU utilization
  2. Which of the following requires a custom CloudWatch metric to monitor?
    1. Memory Utilization of an EC2 instance
    2. CPU Utilization of an EC2 instance
    3. Disk usage activity of an EC2 instance
    4. Data transfer of an EC2 instance
  3. A user has configured CloudWatch monitoring on an EBS backed EC2 instance. If the user has not attached any additional device, which of the below mentioned metrics will always show a 0 value?
    1. DiskReadBytes
    2. NetworkIn
    3. NetworkOut
    4. CPUUtilization
  4. A user is running a batch process on EBS backed EC2 instances. The batch process starts a few instances to process Hadoop Map reduce jobs, which can run between 50 – 600 minutes or sometimes for more time. The user wants to configure that the instance gets terminated only when the process is completed. How can the user configure this with CloudWatch?
    1. Setup the CloudWatch action to terminate the instance when the CPU utilization is less than 5%
    2. Setup the CloudWatch with Auto Scaling to terminate all the instances
    3. Setup a job which terminates all instances after 600 minutes
    4. It is not possible to terminate instances automatically
  5. An AWS account owner has setup multiple IAM users. One IAM user only has CloudWatch access. He has setup the alarm action, which stops the EC2 instances when the CPU utilization is below the threshold limit. What will happen in this case?
    1. It is not possible to stop the instance using the CloudWatch alarm
    2. CloudWatch will stop the instance when the action is executed
    3. The user cannot set an alarm on EC2 since he does not have the permission
    4. The user can setup the action but it will not be executed if the user does not have EC2 rights
  6. A user has launched 10 instances from the same AMI ID using Auto Scaling. The user is trying to see the average CPU utilization across all instances of the last 2 weeks under the CloudWatch console. How can the user achieve this?
    1. View the Auto Scaling CPU metrics (Refer AS Instance Monitoring)
    2. Aggregate the data over the instance AMI ID (Works but needs detailed monitoring enabled)
    3. The user has to use the CloudWatchanalyser to find the average data across instances
    4. It is not possible to see the average CPU utilization of the same AMI ID since the instance ID is different

AWS CloudWatch – Certification

AWS CloudWatch

  • AWS CloudWatch monitors AWS resources and applications in real-time.
  • CloudWatch can be used to collect and track metrics, which are the variables to be measured for resources and applications.
  • CloudWatch alarms can be configured
    • to send notifications or
    • to automatically make changes to the resources based on defined rules
  • In addition to monitoring the built-in metrics that come with AWS, custom metrics can also be monitored
  • CloudWatch provides system-wide visibility into resource utilization, application performance, and operational health.
  • By default, CloudWatch stores the log data indefinitely, and the retention can be changed for each log group at any time
  • CloudWatch Alarm history is stored for only 14 days

CloudWatch Architecture

CloudWatch Architecture

  • CloudWatch collects various metrics from various resources
  • These metrics, as statistics, are available to the user through Console, CLI
  • CloudWatch allows creation of alarms with defined rules
    • to perform actions to auto scaling or stop, start, or terminate instances
    • to send notifications using SNS actions on your behalf

CloudWatch Concepts

Metrics

  • Metric is the fundamental concept in CloudWatch.
  • Uniquely defined by a name, a namespace, and one or more dimensions
  • Represents a time-ordered set of data points published to CloudWatch.
  • Each data point has a time stamp, and (optionally) a unit of measure
  • Data points can be either custom metrics or metrics from other
    services in AWS.
  • Statistics can be retrieved about those data points as an ordered set of time-series data that occur within a specified time window.
  • When the statistics are requested, the returned data stream is identified by namespace, metric name, dimension, and (optionally) the unit.
  • Metrics exist only in the region in which they are created
  • CloudWatch stores the metric data for two weeks
  • Metrics cannot be deleted, but they automatically expire in 14 days if no new data is published to them.

Namespaces

  • CloudWatch namespaces are containers for metrics.
  • Metrics in different namespaces are isolated from each other, so that metrics from different applications are not mistakenly aggregated into the same statistics.
  • AWS namespaces all follow the convention AWS/<service>, for e.g. AWS/EC2 and AWS/ELB
  • Namespace names must be fewer than 256 characters in length.
  • There is no default namespace. Each data element put into CloudWatch must specify a namespace

Dimensions

  • A dimension is a name/value pair that uniquely identifies a metric.
  • Every metric has specific characteristics that describe it, and you can think of dimensions as categories for those characteristics.
  • Dimensions helps design a structure for the statistics plan.
  • Dimensions are part of the unique identifier for a metric, whenever a unique name pair is added to one of the metrics, a new metric is created
  • Dimensions can be used to filter result sets that CloudWatch query returns
  • A metric can be assigned up to ten dimensions to a metric.

Time Stamps

  • Each metric data point must be marked with a time stamp to identify the data point on a time series
  • Time stamp can be up to two weeks in the past and up to two hours into the future.
  • If no time stamp is provided, CloudWatch creates a time stamp based on the time the data element was received.
  • All times reflect the UTC time zone when statistics are retrieved

Units

  • Units represent the statistic’s unit of measure for e.g. count, bytes, % etc

Statistics

  • Statistics are metric data aggregations over specified periods of time
  • Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the specified time period

Periods

  • Period is the length of time associated with a specific statistic.
  • Each statistic represents an aggregation of the metrics data collected for a specified period of time.
  • Although periods are expressed in seconds, the minimum granularity for a period is one minute.

Aggregation

  • CloudWatch aggregates statistics according to the period length specified in calls to GetMetricStatistics.
  • Multiple data points can be published with the same or similar time stamps. CloudWatch aggregates them by period length when the statistics about those data points are requested.
  • Aggregated statistics are only available when using detailed monitoring.
  • Instances that use basic monitoring are not included in the aggregates
  • CloudWatch does not aggregate data across regions.

Alarms

  • Alarms can automatically initiate actions on behalf of the user, based on specified parameters
  • Alarm watches a single metric over a specified time period, and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods.
  • Alarms invoke actions for sustained state changes only i.e. the state must have changed and been maintained for a specified number of periods
  • Action can be a
    • SNS notification
    • Auto Scaling policies
    • EC2 action – stop or terminate EC2 instances
  • After an alarm invokes an action due to a change in state, its subsequent behavior depends on the type of action associated with the alarm.
    • For Auto Scaling policy notifications, the alarm continues to invoke the action for every period that the alarm remains in the new state.
    • For SNS notifications, no additional actions are invoked.
  • An alarm has three possible states:
    • OK—The metric is within the defined threshold
    • ALARM—The metric is outside of the defined threshold
    • INSUFFICIENT_DATA—Alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state
  • Alarms exist only in the region in which they are created.
  • Alarm actions must reside in the same region as the alarm
  • Alarm history is available for the last 14 days.
  • Alarm can be tested by setting it to any state using the SetAlarmState API (mon-set-alarm-state command). This temporary state change lasts only until the next alarm comparison occurs.
  • Alarms can be disabled and enabled using the DisableAlarmActions and EnableAlarmActions APIs (mon-disable-alarm-actions and mon-enable-alarm-actions commands).

Regions

  • CloudWatch does not aggregate data across regions. Therefore, metrics are completely separate between regions.

Custom Metrics

  • CloudWatch allows publishing custom metrics with put-metric-data CLI command (or its Query API equivalent PutMetricData)
  • CloudWatch creates a new metric if put-metric-data is called with a new metric name,  else it associates the data with the specified existing metric
  • put-metric-data command can only publish one data point per call
  • CloudWatch stores data about a metric as a series of data points and each data point has an associated time stamp
  • Creating a new metric using the put-metric-data command, can take up to two minutes before statistics can be retrieved on the new metric using the get-metric-statistics command and can take up to fifteen minutes before the new metric appears in the list of metrics retrieved using the list-metrics command.
  • CloudWatch allows publishing
    • Single data point
      • Data points can be published with time stamps as granular as one-thousandth of a second, CloudWatch aggregates the data to a minimum granularity of one minute
      • CloudWatch records the average (sum of all items divided by number of items) of the values received for every 1-minute period, as well as number of samples, maximum value, and minimum value for the same time period
      • CloudWatch uses one-minute boundaries when aggregating data points
    • Aggregated set of data points called a statistics set
      • Data can also be aggregated before being published to CloudWatch
      • Aggregating data minimizes the number of calls reducing it to a single call per minute with the statistic set of data
      • Statistics include Sum, Average, Minimum, Maximum, Data Sample
  • If the application produces data that is more sporadic and have periods that have no associated data, either a the value zero (0) or no value at all can be published
  • However, it can be helpful to publish zero instead of no value
    • to monitor the health of your application for e.g. alarm can be configured to notify if no metrics published every 5 minutes
    • to track the total number of data points
    • to have statistics such as minimum and average to include data points with the value 0.

Supported Services

For Supported Services refer @ CloudWatch Supported Services

Accessing CloudWatch

  • CloudWatch can be accessed using
    • AWS CloudWatch console
    • CloudWatch CLI
    • AWS CLI
    • CloudWatch API
    • AWS SDKs

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company needs to monitor the read and write IOPs metrics for their AWS MySQL RDS instance and send real-time alerts to their operations team. Which AWS services can accomplish this? Choose 2 answers
    1. Amazon Simple Email Service (Cannot be integrated with CloudWatch directly)
    2. Amazon CloudWatch
    3. Amazon Simple Queue Service
    4. Amazon Route 53
    5. Amazon Simple Notification Service
  2. A customer needs to capture all client connection information from their load balancer every five minutes. The company wants to use this data for analyzing traffic patterns and troubleshooting their applications. Which of the following options meets the customer requirements?
    1. Enable AWS CloudTrail for the load balancer.
    2. Enable access logs on the load balancer. (Refer link)
    3. Install the Amazon CloudWatch Logs agent on the load balancer.
    4. Enable Amazon CloudWatch metrics on the load balancer (does not provide Client connection information)
  3. A user is running a batch process on EBS backed EC2 instances. The batch process starts a few instances to process Hadoop Map reduce jobs, which can run between 50 – 600 minutes or sometimes for more time. The user wants to configure that the instance gets terminated only when the process is completed. How can the user configure this with CloudWatch?
    1. Setup the CloudWatch action to terminate the instance when the CPU utilization is less than 5%
    2. Setup the CloudWatch with Auto Scaling to terminate all the instances
    3. Setup a job which terminates all instances after 600 minutes
    4. It is not possible to terminate instances automatically
  4. A user has two EC2 instances running in two separate regions. The user is running an internal memory management tool, which captures the data and sends it to CloudWatch in US East, using a CLI with the same namespace and metric. Which of the below mentioned options is true with respect to the above statement?
    1. The setup will not work as CloudWatch cannot receive data across regions
    2. CloudWatch will receive and aggregate the data based on the namespace and metric
    3. CloudWatch will give an error since the data will conflict due to two sources
    4. CloudWatch will take the data of the server, which sends the data first
  5. A user is sending the data to CloudWatch using the CloudWatch API. The user is sending data 90 minutes in the future. What will CloudWatch do in this case?
    1. CloudWatch will accept the data
    2. It is not possible to send data of the future
    3. It is not possible to send the data manually to CloudWatch
    4. The user cannot send data for more than 60 minutes in the future
  6. A user is having data generated randomly based on a certain event. The user wants to upload that data to CloudWatch. It may happen that event may not have data generated for some period due to randomness. Which of the below mentioned options is a recommended option for this case?
    1. For the period when there is no data, the user should not send the data at all
    2. For the period when there is no data the user should send a blank value
    3. For the period when there is no data the user should send the value as 0 (Refer User Guide)
    4. The user must upload the data to CloudWatch as having no data for some period will cause an error at CloudWatch monitoring
  7. A user has a weighing plant. The user measures the weight of some goods every 5 minutes and sends data to AWS CloudWatch for monitoring and tracking. Which of the below mentioned parameters is mandatory for the user to include in the request list?
    1. Value
    2. Namespace (refer put-metric request)
    3. Metric Name
    4. Timezone
  8. A user has a refrigerator plant. The user is measuring the temperature of the plant every 15 minutes. If the user wants to send the data to CloudWatch to view the data visually, which of the below mentioned statements is true with respect to the information given above?
    1. The user needs to use AWS CLI or API to upload the data
    2. The user can use the AWS Import Export facility to import data to CloudWatch
    3. The user will upload data from the AWS console
    4. The user cannot upload data to CloudWatch since it is not an AWS service metric
  9. A user has launched an EC2 instance. The user is planning to setup the CloudWatch alarm. Which of the below mentioned actions is not supported by the CloudWatch alarm?
    1. Notify the Auto Scaling launch config to scale up
    2. Send an SMS using SNS
    3. Notify the Auto Scaling group to scale down
    4. Stop the EC2 instance
  10. A user has a refrigerator plant. The user is measuring the temperature of the plant every 15 minutes. If the user wants to send the data to CloudWatch to view the data visually, which of the below mentioned statements is true with respect to the information given above?
    1. The user needs to use AWS CLI or API to upload the data
    2. The user can use the AWS Import Export facility to import data to CloudWatch
    3. The user will upload data from the AWS console
    4. The user cannot upload data to CloudWatch since it is not an AWS service metric
  11. A user is trying to aggregate all the CloudWatch metric data of the last 1 week. Which of the below mentioned statistics is not available for the user as a part of data aggregation?
    1. Aggregate
    2. Sum
    3. Sample data
    4. Average
  12. A user has setup a CloudWatch alarm on an EC2 action when the CPU utilization is above 75%. The alarm sends a notification to SNS on the alarm state. If the user wants to simulate the alarm action how can he achieve this?
    1. Run activities on the CPU such that its utilization reaches above 75%
    2. From the AWS console change the state to ‘Alarm’
    3. The user can set the alarm state to ‘Alarm’ using CLI
    4. Run the SNS action manually
  13. A user is publishing custom metrics to CloudWatch. Which of the below mentioned statements will help the user understand the functionality better?
    1. The user can use the CloudWatch Import tool
    2. The user should be able to see the data in the console after around 15 minutes
    3. If the user is uploading the custom data, the user must supply the namespace, timezone, and metric name as part of the command
    4. The user can view as well as upload data using the console, CLI and APIs
  14. An application that you are managing has EC2 instances and DynamoDB tables deployed to several AWS Regions. In order to monitor the performance of the application globally, you would like to see two graphs 1) Avg CPU Utilization across all EC2 instances and 2) Number of Throttled Requests for all DynamoDB tables. How can you accomplish this?
    1. Tag your resources with the application name, and select the tag name as the dimension in the CloudWatch Management console to view the respective graphs (CloudWatch metrics are regional)
    2. Use the CloudWatch CLI tools to pull the respective metrics from each regional endpoint. Aggregate the data offline & store it for graphing in CloudWatch.
    3. Add SNMP traps to each instance and DynamoDB table. Leverage a central monitoring server to capture data from each instance and table. Put the aggregate data into CloudWatch for graphing (Can’t add SNMP traps to DynamoDB as it is a managed service)
    4. Add a CloudWatch agent to each instance and attach one to each DynamoDB table. When configuring the agent set the appropriate application name & view the graphs in CloudWatch. (Can’t add agents to DynamoDB as it is a managed service)
  15. You have set up Individual AWS accounts for each project. You have been asked to make sure your AWS Infrastructure costs do not exceed the budget set per project for each month. Which of the following approaches can help ensure that you do not exceed the budget each month?
    1. Consolidate your accounts so you have a single bill for all accounts and projects (Consolidation will not help limit per account)
    2. Set up auto scaling with CloudWatch alarms using SNS to notify you when you are running too many Instances in a given account (many instances do not directly map to cost and would not give exact cost)
    3. Set up CloudWatch billing alerts for all AWS resources used by each project, with a notification occurring when the amount for each resource tagged to a particular project matches the budget allocated to the project. (as each project already has a account, no need for resource tagging)
    4. Set up CloudWatch billing alerts for all AWS resources used by each account, with email notifications when it hits 50%. 80% and 90% of its budgeted monthly spend

CloudWatch Monitoring Supported AWS Services

  • CloudWatch offers either basic or detailed monitoring for supported AWS services.
  • Basic monitoring means that a service sends data points to CloudWatch every five minutes.
  • Detailed monitoring means that a service sends data points to CloudWatch every minute.
  • If the AWS service supports both basic and detailed monitoring, the basic would be enabled by default and the detailed monitoring needs to be enabled for details metrics

AWS Services with Monitoring support

  • Auto Scaling
    • By default, basic monitoring is enabled when the launch configuration is created using the AWS Management Console and detailed monitoring is enabled when the launch configuration is created using using the AWS CLI or an API
    • Auto Scaling sends data to CloudWatch every 5 minutes by default, when created from Console.
    • For an additional charge, you can enable detailed monitoring for Auto Scaling, which sends data to CloudWatch every minute.
  • Amazon CloudFront
    • Amazon CloudFront sends data to CloudWatch every minute by default.
  • Amazon CloudSearch
    • Amazon CloudSearch sends data to CloudWatch every minute by default.
  • Amazon CloudWatch Events
    • Amazon CloudWatch Events sends data to CloudWatch every minute by default.
  • Amazon CloudWatch Logs
    • Amazon CloudWatch Logs sends data to CloudWatch every minute by default.
  • Amazon DynamoDB
    • Amazon DynamoDB sends data to CloudWatch every minute for some metrics and every 5 minutes for other metrics.
  • Amazon EC2 Container Service
    • Amazon EC2 Container Service sends data to CloudWatch every minute.
  • Amazon ElastiCache
    • Amazon ElastiCache sends data to CloudWatch every minute.
  • Amazon Elastic Block Store
    • Amazon Elastic Block Store sends data to CloudWatch every 5 minutes.
    • Provisioned IOPS SSD (io1) volumes automatically send one-minute metrics to CloudWatch.
  • Amazon Elastic Compute Cloud
    • Amazon EC2 sends data to CloudWatch every 5 minutes by default. For an additional charge, you can enable detailed monitoring for Amazon EC2, which sends data to CloudWatch every minute.
  • Elastic Load Balancing
    • Elastic Load Balancing sends data to CloudWatch every minute.
  • Amazon Elastic MapReduce
    • Amazon Elastic MapReduce sends data to CloudWatch every 5 minutes.
  • Amazon Elasticsearch Service
    • Amazon Elasticsearch Service sends data to CloudWatch every minute.
  • Amazon Kinesis Streams
    • Amazon Kinesis Streams sends data to CloudWatch every minute.
  • Amazon Kinesis Firehose
    • Amazon Kinesis Firehose sends data to CloudWatch every minute.
  • AWS Lambda
    • AWS Lambda sends data to CloudWatch every minute.
  • Amazon Machine Learning
    • Amazon Machine Learning sends data to CloudWatch every 5 minutes.
  • AWS OpsWorks
    • AWS OpsWorks sends data to CloudWatch every minute.
  • Amazon Redshift
    • Amazon Redshift sends data to CloudWatch every minute.
  • Amazon Relational Database Service
    • Amazon Relational Database Service sends data to CloudWatch every minute.
  • Amazon Route 53
    • Amazon Route 53 sends data to CloudWatch every minute.
  • Amazon Simple Notification Service
    • Amazon Simple Notification Service sends data to CloudWatch every 5 minutes.
  • Amazon Simple Queue Service
    • Amazon Simple Queue Service sends data to CloudWatch every 5 minutes.
  • Amazon Simple Storage Service
    • Amazon Simple Storage Service sends data to CloudWatch once a day.
  • Amazon Simple Workflow Service
    • Amazon Simple Workflow Service sends data to CloudWatch every 5 minutes.
  • AWS Storage Gateway
    • AWS Storage Gateway sends data to CloudWatch every 5 minutes.
  • AWS WAF
    • AWS WAF sends data to CloudWatch every minute.
  • Amazon WorkSpaces
    • Amazon WorkSpaces sends data to CloudWatch every 5 minutes.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. What is the minimum time Interval for the data that Amazon CloudWatch receives and aggregates?
    1. One second
    2. Five seconds
    3. One minute
    4. Three minutes
    5. Five minutes
  2. In the ‘Detailed’ monitoring data available for your Amazon EBS volumes, Provisioned IOPS volumes automatically send _____ minute metrics to Amazon CloudWatch.
    1. 3
    2. 1
    3. 5
    4. 2
  3. Using Amazon CloudWatch’s Free Tier, what is the frequency of metric updates, which you receive?
    1. 5 minutes
    2. 500 milliseconds.
    3. 30 seconds
    4. 1 minute
  4. What is the type of monitoring data (for Amazon EBS volumes) which is available automatically in 5-minute periods at no charge called?
    1. Basic
    2. Primary
    3. Detailed
    4. Local
  5. A user has created an Auto Scaling group using CLI. The user wants to enable CloudWatch detailed monitoring for that group. How can the user configure this?
    1. When the user sets an alarm on the Auto Scaling group, it automatically enables detail monitoring
    2. By default detailed monitoring is enabled for Auto Scaling
    3. Auto Scaling does not support detailed monitoring
    4. Enable detail monitoring from the AWS console
  6. A user is trying to understand the detailed CloudWatch monitoring concept. Which of the below mentioned services provides detailed monitoring with CloudWatch without charging the user extra?
    1. AWS Auto Scaling
    2. AWS Route 53
    3. AWS EMR
    4. AWS SNS
  7. A user is trying to understand the detailed CloudWatch monitoring concept. Which of the below mentioned services does not provide detailed monitoring with CloudWatch?
    1. AWS EMR
    2. AWS RDS
    3. AWS ELB
    4. AWS Route53
  8. A user has enabled detailed CloudWatch monitoring with the AWS Simple Notification Service. Which of the below mentioned statements helps the user understand detailed monitoring better?
    1. SNS will send data every minute after configuration
    2. There is no need to enable since SNS provides data every minute
    3. AWS CloudWatch does not support monitoring for SNS
    4. SNS cannot provide data every minute
  9. A user has configured an Auto Scaling group with ELB. The user has enabled detailed CloudWatch monitoring on Auto Scaling. Which of the below mentioned statements will help the user understand the functionality better?
    1. It is not possible to setup detailed monitoring for Auto Scaling
    2. In this case, Auto Scaling will send data every minute and will charge the user extra
    3. Detailed monitoring will send data every minute without additional charges
    4. Auto Scaling sends data every minute only and does not charge the user

References

AWS EC2 EBS Monitoring

Overview

Amazon Web Services (AWS) automatically provides data, such as Amazon CloudWatch metrics and volume status checks to help monitor EBS volumes

EBS Monitoring

CloudWatch Monitoring

  • CloudWatch metrics are statistical data that you can use to view, analyze, and set alarms on the operational behavior of the EBS volumes
  • CloudWatch provides the below by default
    • Basic – Data, in 5-minute periods at no charge, which includes data from the root devices volumes for EBS backed instances
    • Detailed – Provisioned IOPS (SSD) volumes send one-minute metrics
  • EBS Metrics
    • VolumeReadBytes & VolumeWriteBytes
      • Provides information on the I/O operations in a specified period of time, in bytes
    • VolumeReadOps & VolumeWriteOps
      • Total number (count) of I/O operations in a specified period of time
    • VolumeTotalReadTime & VolumeTotalWriteTime
      • Total number of seconds spent by all operations that completed in a specified period of time
    • VolumeIdleTime
      • Total number of seconds, in a specific period, when the volume was idle (no read and write operations)
    • VolumeQueueLength
      • Number of read and write operations, in a specific period, waiting to be completed
    • VolumeThroughputPercentage (Provisioned IOPS (SSD) volumes only)
      • Percentage of I/O operations per second (IOPS) delivered of the total IOPS provisioned
    • VolumeConsumedReadWriteOps (Provisioned IOPS (SSD) volumes only)
      • Total amount of read and write operations (normalized to 256K capacity units) consumed in a specified period of time

Volume Status Checks Monitoring

EC2 EBS Volume Status Check Monitoring

  • Volume status checks are automated tests that run every 5 minutes and return a pass or fail status.
  • Volume check status
    • Ok – all the status checks passed
    • Imparied – if the status checks failed
    • Insufficient-Data – checks are still in progress
    • Warning – the I/O performance of the volume is below expectations
  • When Amazon EBS determines the volume’s data is potentially inconsistent, it disables the I/O to the EBS volume from the attached EC2 instance to prevent any data corruption. This leads to the status check to fail and the volume status to be impaired. Amazon waits for the I/O to be enabled, giving you an opportunity to perform consistency checks
  • If the auto disabling of I/O is not needed, it can be overridden by enabling the Auto-Enabled IO flag, which would make the EBS volume auto available immediately after impaired status
  • Events would be fire for notification whenever the I/O for an EBS volume is disabled
  • I/O performance status checks, applicable only for Provisioned IOPS (SSD) volumes, compares actual volume performance with the expected volume performance and alerts if performing below expectations. This status check is performed every 1 minutes, however is collected by CloudWatch every 5 mins.
  • While initializing Provisioned IOPS (SSD) volumes that were restored from snapshots, the performance of the volume may drop below 50 percent of its expected level, which causes the volume to display a warning state in the I/O Performance status check. This is expected and can be ignored.

EC2 EBS Volume Status

Volume Events Monitoring

  • Amazon EBS generates events for volume status checks
  • Each event includes a start time that indicates the time at which the event occurred, and a duration that indicates how long I/O for the volume was disabled
  • Events description can be Awaiting Action (to enabled I/O), IO enabled, IO Auto-Enabled, or whether the status check resulted in Normal, Degraded, Severely Degraded or stalled status

Sample Exam Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A user has configured CloudWatch monitoring on an EBS backed EC2 instance. If the user has not attached any additional device, which of the below mentioned metrics will always show a 0 value?
    1. DiskReadBytes
    2. NetworkIn
    3. NetworkOut
    4. CPUUtilization

References

AWS ELB Monitoring

Following Features can be used for Monitoring Elastic Load Balancer

Cloud Watch Metrics

  • Elastic Load Balancing publishes data points to Amazon CloudWatch about your load balancers and back-end instances
  • Elastic Load Balancing reports metrics to CloudWatch only when requests are flowing through the load balancer. If there are requests flowing through the load balancer, Elastic Load Balancing measures and sends its metrics in 60-second intervals. If there are no requests flowing through the load balancer or no data for a metric, the metric is not reported.

Following metrics are available

  • HealthyHostCount, UnHealthyHostCount
    • Number of healthy and unhealthy instances registered with the load balancer.
    • Most useful statistics are average, min, and max
  • RequestCount
    • Number of requests completed or connections made during the specified interval (1 or 5 minutes).
    • Most useful statistic is sum
  • Latency
    • Time elapsed, in seconds, after the request leaves the load balancer until the headers of the response are received.
    • Most useful statistic is average
  • SurgeQueueLength
    • Total number of requests that are pending routing.
    • Load balancer queues a request if it is unable to establish a connection with a healthy instance in order to route the request.
    • Maximum size of the queue is 1,024. Additional requests are rejected when the queue is full.
    • Most useful statistic is max, because it represents the peak of queued requests.
  • SpilloverCount
    • The total number of requests that were rejected because the surge queue is full. Should ideally be 0
    • Most useful statistic is sum.
  • HTTPCode_ELB_4XX, HTTPCode_ELB_5XX
    • Client and Server error code generated by the load balancer
    • Most useful statistic is sum.
  • HTTPCode_Backend_2XX, HTTPCode_Backend_3XX, HTTPCode_Backend_4XX, HTTPCode_Backend_5XX
    • Number of HTTP response codes generated by registered instances
    • Most useful statistic is sum.

Elastic Load Balancer access logs

  • Elastic Load Balancing provides access logs that capture detailed information about all requests sent to your load balancer.
  • Each log contains information such as the time the request was received, the client’s IP address, latencies, request paths, and server responses.
  • Elastic Load Balancing captures the logs and stores them in the Amazon S3 bucket
  • Access logging is disabled by default and can be enabled without any additional charge. You are only charged for S3 storage

CloudTrail Logs

  • AWS CloudTrail can be used to capture all calls to the Elastic Load Balancing API made by or on behalf of your AWS account and either made using Elastic Load Balancing API directly, or indirectly through the AWS Management Console or AWS CLI
  • CloudTrail stores the information as log files in an Amazon S3 bucket that you specify.
  • Logs collected by CloudTrail can be used to monitor the activity of your load balancers and determine what API call was made, what source IP address was used, who made the call, when it was made, and so on

Sample Exam Questions

  1. An admin is planning to monitor the ELB. Which of the below mentioned services does not help the admin capture the monitoring information about the ELB activity
    1. ELB Access logs
    2. ELB health check
    3. CloudWatch metrics
    4. ELB API calls with CloudTrail

 

References

Elastic Load Balance developer guide