Status monitoring help quickly determine whether EC2 has detected any problems that might prevent instances from running applications.
EC2 performs automated checks on every running EC2 instance to identify hardware and software issues.
Status checks are performed every minute and each returns a pass or a fail status.
If all checks pass, the overall status of the instance is OK.
If one or more checks fail, the overall status is Impaired.
Status checks are built into EC2, so they cannot be disabled or deleted.
Status checks data augments the information that EC2 already provides about the intended state of each instance (such as pending, running, stopping) as well as the utilization metrics that Amazon CloudWatch monitors (CPU utilization, network traffic, and disk activity).
Alarms can be created, or deleted, that are triggered based on the result of the status checks. for e.g., an alarm can be created to warn if status checks fail on a specific instance.
System Status Checks
monitor the AWS systems required to use your instance to ensure they are working properly.
detect problems with the instance that require AWS involvement to repair.
When a system status check fails, one can either
wait for AWS to fix the issue
or resolve it by by stopping and restarting or terminating and replacing an instance
System status checks failure might be cause of
Loss of network connectivity
Loss of system power
Software issues on the physical host
Hardware issues on the physical host
Instance Status Checks
monitor the software and network configuration of the individual instance
checks detect problems that requires involvement to repair.
When an instance status check fails, it can be resolved by either rebooting the instance or by making modifications in the operating system
Instance status checks failure might be cause of
Failed system status checks
Misconfigured networking or startup configuration
Exhausted memory
Corrupted file system
Incompatible kernel
CloudWatch Monitoring
CloudWatch, helps monitor EC2 instances, which collects and processes
raw data from EC2 into readable, near real-time metrics.
Statistics are recorded for a period of two weeks, so that historical information can be accessed and used to gain a better perspective on how
the application or service is performing.
By default Basic monitoring is enabled and EC2 metric data is sent to CloudWatch in 5-minute periods automatically
Detailed monitoring can be enabled on EC2 instance, which sends data to CloudWatch in 1-minute periods.
Aggregating Statistics Across Instances/ASG/AMI ID
Aggregate statistics are available for the instances that have detailed monitoring (at an additional charge) enable, which provides data in 1-minute periods
Instances that use basic monitoring are not included in the aggregates.
CloudWatch does not aggregate data across Regions. Therefore, metrics are completely separate between Regions.
CloudWatch returns statistics for all dimensions in the AWS/EC2 namespace, if no dimension is specified
The technique for retrieving all dimensions across an AWS namespace does not work for custom namespaces published to CloudWatch.
Statistics include Sum, Average, Minimum, Maximum, Data Samples
With custom namespaces, the complete set of dimensions that are associated with any given data point to retrieve statistics that include the data point must be specified
CloudWatch alarms
can be created to monitor any one of the EC2 instance’s metrics.
can be configured to automatically send you a notification when the metric reaches a specified threshold.
can automatically stop, terminate, reboot, or recover EC2 instances
can automatically recover an EC2 instance when the instance becomes impaired due to an underlying hardware failure a problem that requires AWS involvement to repair
can automatically stop or terminate the instances to save costs (EC2 instances that use an EBS volume as the root device can be stopped
or terminated, whereas instances that use the instance store as the root device can only be terminated)
can use EC2ActionsAccess IAM role, which enables AWS to perform stop, terminate, or reboot actions on EC2 instances
If you have read/write permissions for CloudWatch but not for EC2, alarms can still be created but the stop or terminate actions won’t be performed on the EC2 instance
EC2 Metrics
CPUCreditUsage
(Only valid for T2 instances) The number of CPU credits consumed
during the specified period.
This metric identifies the amount of time during which physical CPUs
were used for processing instructions by virtual CPUs allocated to
the instance.
CPU Credit metrics are available at a 5 minute frequency.
CPUCreditBalance
(Only valid for T2 instances) The number of CPU credits that an instance has accumulated.
This metric is used to determine how long an instance can burst beyond its baseline performance level at a given rate.
CPU Credit metrics are available at a 5 minute frequency.
CPUUtilization
% of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance.
DiskReadOps
Completed read operations from all instance store volumes available to the instance in a specified period of time.
DiskWriteOps
Completed write operations to all instance store volumes available to the instance in a specified period of time.
DiskReadBytes
Bytes read from all instance store volumes available to the instance.
This metric is used to determine the volume of the data the application reads from the hard disk of the instance.
This can be used to determine the speed of the application.
DiskWriteBytes
Bytes written to all instance store volumes available to the instance.
This metric is used to determine the volume of the data the application writes onto the hard disk of the instance.
This can be used to determine the speed of the application.
NetworkIn
The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an application on a single instance.
NetworkOut
The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic to an application on a single instance.
NetworkPacketsIn
The number of packets received on all network interfaces by the instance. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance.
This metric is available for basic monitoring only
NetworkPacketsOut
The number of packets sent out on all network interfaces by the instance. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance.
This metric is available for basic monitoring only.
StatusCheckFailed
Reports if either of the status checks, StatusCheckFailed_Instance and StatusCheckFailed_System, that has failed.
Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
Status check metrics are available at 1 minute frequency
StatusCheckFailed_Instance
Reports whether the instance has passed the Amazon EC2 instance status check in the last minute.
Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
Status check metrics are available at 1 minute frequency
StatusCheckFailed_System
Reports whether the instance has passed the EC2 system status check in the last minute.
Values for this metric are either 0 (zero) or 1 (one.) A zero indicates that the status check passed. A one indicates a status check failure.
Status check metrics are available at a 1 minute frequency
AWS Certification Exam Practice Questions
Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
Open to further feedback, discussion and correction.
In the basic monitoring package for EC2, Amazon CloudWatch provides the following metrics:
Web server visible metrics such as number failed transaction requests
Operating system visible metrics such as memory utilization
Database visible metrics such as number of connections
Hypervisor visible metrics such as CPU utilization
Which of the following requires a custom CloudWatch metric to monitor?
Memory Utilization of an EC2 instance
CPU Utilization of an EC2 instance
Disk usage activity of an EC2 instance
Data transfer of an EC2 instance
A user has configured CloudWatch monitoring on an EBS backed EC2 instance. If the user has not attached any additional device, which of the below mentioned metrics will always show a 0 value?
DiskReadBytes
NetworkIn
NetworkOut
CPUUtilization
A user is running a batch process on EBS backed EC2 instances. The batch process starts a few instances to process Hadoop Map reduce jobs, which can run between 50 – 600 minutes or sometimes for more time. The user wants to configure that the instance gets terminated only when the process is completed. How can the user configure this with CloudWatch?
Setup the CloudWatch action to terminate the instance when the CPU utilization is less than 5%
Setup the CloudWatch with Auto Scaling to terminate all the instances
Setup a job which terminates all instances after 600 minutes
It is not possible to terminate instances automatically
An AWS account owner has setup multiple IAM users. One IAM user only has CloudWatch access. He has setup the alarm action, which stops the EC2 instances when the CPU utilization is below the threshold limit. What will happen in this case?
It is not possible to stop the instance using the CloudWatch alarm
CloudWatch will stop the instance when the action is executed
The user cannot set an alarm on EC2 since he does not have the permission
The user can setup the action but it will not be executed if the user does not have EC2 rights
A user has launched 10 instances from the same AMI ID using Auto Scaling. The user is trying to see the average CPU utilization across all instances of the last 2 weeks under the CloudWatch console. How can the user achieve this?
Aggregate the data over the instance AMI ID (Works but needs detailed monitoring enabled)
The user has to use the CloudWatchanalyser to find the average data across instances
It is not possible to see the average CPU utilization of the same AMI ID since the instance ID is different
6 thoughts on “AWS EC2 Monitoring – Certification”
Hi Jayadra
Q#6, A user has launched 10 instances from the same AMI ID using Auto Scaling. The user is trying to see the average CPU utilization across all instances of the last 2 weeks under the CloudWatch console. How can the user achieve this?
Hi, Can you please explain Q#3 … I launched an EC2 and tried seeing all the metrics i do agree with ans … but i am not sure why the count of DiskReadBytes is 0 … as FIRST i would expect it to read the Disk during boot up time ( for config if linux like mstrc etc ) and hence non zero … SECOND If there is no Disk Reading what is CPU Utilization non zero ??
DiskReadBytes is Bytes read from all instance store volumes available to the instance. So it is applicable to the Instance Store Volumes only and not for the EBS volumes.
For Q5, the user will get the error “A system administrator must provision SWF permissions for your IAM user so that the IAM user can perform this action” when trying to save the cloudwatch alarm if he doesn’t have ec2 permissions.
If you have read/write permissions for Amazon CloudWatch but not for Amazon EC2, you can still create an alarm but the stop or terminate actions won’t be performed on the instance. However, if you are later granted permission to use the associated Amazon EC2 API actions, the alarm actions you created earlier will be performed. For more information, see Permissions and Policies in the IAM User Guide.
Hi Jayadra
Q#6, A user has launched 10 instances from the same AMI ID using Auto Scaling. The user is trying to see the average CPU utilization across all instances of the last 2 weeks under the CloudWatch console. How can the user achieve this?
I believe answer is A
Pls refer this url
http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-monitoring.html#as-view-group-metrics
Thanks Santhel, the Autoscaling CPU metrics should be the right answer. As for #B, it needs detailed monitoring enabled.
Hi, Can you please explain Q#3 … I launched an EC2 and tried seeing all the metrics i do agree with ans … but i am not sure why the count of DiskReadBytes is 0 … as FIRST i would expect it to read the Disk during boot up time ( for config if linux like mstrc etc ) and hence non zero … SECOND If there is no Disk Reading what is CPU Utilization non zero ??
DiskReadBytes is Bytes read from all instance store volumes available to the instance. So it is applicable to the Instance Store Volumes only and not for the EBS volumes.
Hi Jayendra
For Q5, the user will get the error “A system administrator must provision SWF permissions for your IAM user so that the IAM user can perform this action” when trying to save the cloudwatch alarm if he doesn’t have ec2 permissions.
The answer should be C
Option D is right for Q5. Not sure why you would get a SWF permissions.
Please refer AWS documentation @ CloudWatch UsingAlarmActions