Amazon CloudWatch Logs

  • CloudWatch Logs can be used to monitor, store, and access log files from EC2 instances, CloudTrail, Route 53, Lambda, ECS, EKS, and other sources.
  • CloudWatch Logs uses the existing log data for monitoring, so no code changes are required.
  • CloudWatch Logs requires the CloudWatch unified agent (which replaces the older CloudWatch Logs agent) to be installed on EC2 instances and on-premises servers.
  • The agent makes it easy to quickly send both rotated and non-rotated log data off of a host and into the log service.
  • A VPC endpoint can be configured to keep traffic between the VPC and CloudWatch Logs from leaving the Amazon network. It doesn’t require an IGW, NAT, VPN connection, or Direct Connect connection.
  • CloudWatch Logs allows exporting log data from log groups to an S3 bucket, which can then be used for custom processing and analysis, or for loading onto other systems.
  • Log data is encrypted in transit and at rest.
  • Log data can be encrypted using AWS KMS customer managed keys (CMKs).

CloudWatch Logs Concepts

Log Events

  • A log event is a record of some activity recorded by the application or resource being monitored.
  • A log event record contains two properties: the timestamp of when the event occurred, and the raw event message.

Log Streams

  • A log stream is a sequence of log events that share the same source, e.g. log events from an Apache access log on a specific host.

Log Groups

  • Log groups define groups of log streams that share the same retention, monitoring, and access control settings, e.g. Apache access logs from each host can be grouped through log streams into a single log group.
  • Each log stream has to belong to exactly one log group.
  • There is no limit on the number of log streams that can belong to one log group.
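
The event → stream → group hierarchy can be sketched with a small in-memory model (illustrative only — these class names are ours, not the CloudWatch API):

```python
from dataclasses import dataclass, field

@dataclass
class LogEvent:
    timestamp: int   # milliseconds since epoch, when the event occurred
    message: str     # the raw event message

@dataclass
class LogStream:
    name: str                                   # e.g. one stream per host
    events: list = field(default_factory=list)  # ordered sequence of events

@dataclass
class LogGroup:
    name: str
    retention_days: int                          # retention applies at group level
    streams: dict = field(default_factory=dict)

    def stream(self, name):
        # every log stream belongs to exactly one log group
        return self.streams.setdefault(name, LogStream(name))

# Apache access logs from two hosts, grouped into a single log group.
group = LogGroup("/apache/access", retention_days=30)
group.stream("host-1").events.append(LogEvent(1700000000000, "GET /index.html 200"))
group.stream("host-2").events.append(LogEvent(1700000001000, "GET /missing 404"))
print(len(group.streams))  # 2
```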

Log Classes

  • CloudWatch Logs supports two log classes: Standard and Infrequent Access
  • Standard Log Class – full-featured option for logs that require real-time monitoring with all CloudWatch Logs features including Live Tail, Metric Filters, Subscription Filters, Logs Insights, and Contributor Insights.
  • Infrequent Access Log Class – cost-optimized log class for infrequently accessed logs with 50% lower ingestion costs. Ideal for ad-hoc querying and forensic analysis.
  • Infrequent Access class has limitations: no subscription filters, no Live Tail, no Contributor Insights, and can only be queried using CloudWatch Logs Insights.
  • Log class must be specified when creating a log group and cannot be changed later.

Metric Filters

  • Metric filters can be used to extract metric observations from ingested events and transform them to data points in a CloudWatch metric.
  • Metric filters are assigned to log groups, and all of the filters assigned to a log group are applied to their log streams.
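
What a metric filter does can be illustrated locally: scan incoming log events for a term and turn the match count into a metric data point (a simplified sketch, not the actual CloudWatch filter-pattern engine):

```python
# Simplified stand-in for a metric filter: count log events in an
# ingested batch whose message contains a literal term.
def apply_metric_filter(messages, term):
    """Return the number of log messages containing `term`."""
    return sum(1 for message in messages if term in message)

# One batch of ingested events from a log group's streams.
messages = [
    "GET /index.html 200",
    "GET /missing 404",
    "GET /favicon.ico 404",
    "POST /login 500",
]

# Each filter produces a data point CloudWatch would publish to its metric.
print(apply_metric_filter(messages, "404"))  # 2
print(apply_metric_filter(messages, "500"))  # 1
```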

Retention Settings

  • Retention settings can be used to specify how long log events are kept in CloudWatch Logs.
  • Expired log events get deleted automatically.
  • Retention settings are assigned to log groups, and the retention assigned to a log group is applied to their log streams.
  • Retention periods range from 1 day to 10 years, or indefinite retention.
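
Conceptually, a log event expires once it is older than its group's retention period; the check reduces to a date comparison (an illustrative computation, not service code — `None` here models the "never expire" setting):

```python
from datetime import datetime, timedelta, timezone

def is_expired(event_time, retention_days, now=None):
    """True if a log event is older than the group's retention setting.
    retention_days=None models indefinite retention (never expire)."""
    if retention_days is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now - event_time > timedelta(days=retention_days)

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old_event = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_expired(old_event, 30, now))    # True: older than 30 days
print(is_expired(old_event, None, now))  # False: indefinite retention
```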

Account Policies

  • Account policies help apply configurations across all log groups in an AWS account or specific log groups matching a selection criteria.
  • Supports data protection policies to automatically mask sensitive data like credit card numbers, social security numbers, and email addresses.
  • Supports subscription filter policies to centrally manage log forwarding to destinations like Kinesis, Lambda, or OpenSearch.

CloudWatch Logs Use cases

Monitor Logs from EC2 Instances in Real-time

  • can help monitor applications and systems using log data
  • can help track the number of errors, e.g. 404 or 500 status codes, or even specific literal terms such as “NullReferenceException”, occurring in the application, and send a notification when the count breaches a threshold
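
The threshold check behind such an alert can be sketched like this — compare the per-period error counts a metric filter produces against a threshold, the way a CloudWatch alarm would (a local illustration; the `periods` knob mirrors an alarm's evaluation periods):

```python
def should_alarm(error_counts, threshold, periods=1):
    """Mimic a CloudWatch alarm: trigger when the metric stays above
    the threshold for `periods` consecutive evaluation periods."""
    consecutive = 0
    for count in error_counts:
        consecutive = consecutive + 1 if count > threshold else 0
        if consecutive >= periods:
            return True
    return False

# Per-minute counts of 500 errors extracted by a metric filter.
print(should_alarm([0, 2, 7, 9], threshold=5, periods=2))  # True
print(should_alarm([0, 2, 7, 1], threshold=5, periods=2))  # False
```

In the real setup, the alarm transition would publish to an SNS topic that notifies the engineer.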

Monitor AWS CloudTrail Logged Events

  • can be used to monitor particular API activity as captured by CloudTrail, by creating alarms in CloudWatch and receiving notifications

Archive Log Data

  • can help store log data in highly durable storage, as an alternative to S3
  • log retention setting can be modified, so that any log events older than this setting are automatically deleted.

Log Route 53 DNS Queries

  • can help log information about the DNS queries that Route 53 receives.

Real-time Processing of Log Data with Subscriptions

  • Subscriptions can help get access to a real-time feed of log events from CloudWatch Logs and have it delivered to other services such as Kinesis Data Streams, Kinesis Data Firehose, or AWS Lambda for custom processing, analysis, or loading to other systems.
  • A subscription filter defines the filter pattern to use for filtering which log events get delivered to the AWS resource, as well as information about where to send matching log events to.
  • CloudWatch Logs log group can also be configured to stream data to Amazon OpenSearch Service cluster in near real-time
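
When a subscription delivers to Lambda, the payload arrives base64-encoded and gzip-compressed under `awslogs.data`; a handler decodes it as below (the payload fields shown follow the documented delivery format; the handler and sample values are ours):

```python
import base64
import gzip
import json

def handler(event, context=None):
    """Decode a CloudWatch Logs subscription delivery and return the
    messages of the contained log events."""
    payload = base64.b64decode(event["awslogs"]["data"])
    data = json.loads(gzip.decompress(payload))
    # data also carries logGroup, logStream, and subscriptionFilters
    return [e["message"] for e in data["logEvents"]]

# Build a sample delivery the way CloudWatch Logs would package it.
sample = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "/apache/access",
    "logStream": "host-1",
    "subscriptionFilters": ["errors-only"],
    "logEvents": [
        {"id": "1", "timestamp": 1700000000000, "message": "GET /missing 404"},
    ],
}
blob = base64.b64encode(gzip.compress(json.dumps(sample).encode()))
print(handler({"awslogs": {"data": blob.decode()}}))  # ['GET /missing 404']
```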

Searching and Filtering

  • CloudWatch Logs allows searching and filtering the log data by creating one or more metric filters.
  • Metric filters define the terms and patterns to look for in log data as it is sent to CloudWatch Logs.
  • CloudWatch Logs uses these metric filters to turn log data into numerical CloudWatch metrics that can be graphed or used to set an alarm.

CloudWatch Logs Advanced Features

CloudWatch Logs Insights

  • CloudWatch Logs Insights enables interactive log analytics with a purpose-built query language.
  • Supports three query languages: CloudWatch Logs Insights query syntax, OpenSearch SQL, and OpenSearch Piped Processing Language (PPL).
  • OpenSearch SQL – allows using familiar SQL syntax to query logs, including JOIN operations to correlate logs across multiple log groups.
  • OpenSearch PPL – uses pipe-delimited commands for easier composition of complex queries.
  • Supports pattern analysis to automatically identify patterns in log data and group similar log entries.
  • Supports anomaly detection command to identify unusual patterns or behaviors in logs.
  • Can query multiple log groups simultaneously and visualize results with time-series graphs.
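
For example, a typical query in the native Logs Insights syntax that surfaces the most recent exceptions looks like this (`@timestamp` and `@message` are system fields; the `/Exception/` pattern is an assumption about the application's log format):

```
fields @timestamp, @message
| filter @message like /Exception/
| sort @timestamp desc
| limit 20
```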

CloudWatch Logs Live Tail

  • Live Tail enables real-time streaming of log events as they are ingested into CloudWatch Logs.
  • Useful for troubleshooting and debugging applications in real-time without waiting for log aggregation.
  • Can filter log events in real-time using filter patterns.
  • Supports tailing multiple log groups and log streams simultaneously.
  • Available through AWS Console, AWS CLI, and AWS SDKs.
  • Charged on a per-minute basis during active sessions.

CloudWatch Logs Anomaly Detection

  • Anomaly detection uses machine learning and pattern recognition to automatically identify anomalies in log data.
  • Establishes baselines of typical log content and alerts when deviations occur.
  • Can create up to 500 log anomaly detectors per account (increased from 10).
  • Anomaly detectors continuously scan log events ingested into log groups.
  • Can configure CloudWatch alarms to notify when anomalies are detected.
  • Helps reduce time to identify and resolve operational issues.

Data Protection and Masking

  • Data protection policies automatically discover and mask sensitive data in log events.
  • Supports detection of sensitive data types including credit card numbers, social security numbers, email addresses, IP addresses, and custom patterns.
  • Can audit sensitive data by sending findings to CloudWatch Logs or S3.
  • Can mask sensitive data by replacing it with a hash or redacting it entirely.
  • Policies can be applied at log group level or account level.
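
The effect of masking can be illustrated with a local sketch: anything matching a sensitive-data pattern is replaced before the log line is displayed (the regexes below are simplified stand-ins for the managed data identifiers the service actually uses):

```python
import re

# Simplified stand-ins for two managed data identifiers.
PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def mask(message):
    """Replace any detected sensitive data with a fixed redaction marker."""
    for pattern in PATTERNS.values():
        message = pattern.sub("********", message)
    return message

print(mask("user bob@example.com paid with 4111 1111 1111 1111"))
# user ******** paid with ********
```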

AWS Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated.
  • Open to further feedback, discussion and correction.
  1. Once we have our logs in CloudWatch, we can do a number of things such as: Choose the 3 correct answers: [CDOP]
    1. Send the log data to AWS Lambda for custom processing or to load into other systems
    2. Stream the log data to Amazon Kinesis
    3. Stream the log data into Amazon OpenSearch Service in near real-time with CloudWatch Logs subscriptions.
    4. Record API calls for your AWS account and delivers log files containing API calls to your Amazon S3 bucket
  2. You have decided to set the threshold for errors on your application to a certain number and once that threshold is reached you need to alert the Senior DevOps engineer. What is the best way to do this? Choose the 3 correct answers: [CDOP]
    1. Set the threshold your application can tolerate in a CloudWatch Logs group and link a CloudWatch alarm on that threshold.
    2. Use CloudWatch Logs agent to send log data from the app to CloudWatch Logs from Amazon EC2 instances
    3. Pipe data from EC2 to the application logs using AWS Data Pipeline and CloudWatch
    4. Once a CloudWatch alarm is triggered, use SNS to notify the Senior DevOps Engineer.
  3. You are hired as the new head of operations for a SaaS company. Your CTO has asked you to make debugging any part of your entire operation simpler and as fast as possible. She complains that she has no idea what is going on in the complex, service-oriented architecture, because the developers just log to disk, and it’s very hard to find errors in logs on so many services. How can you best meet this requirement and satisfy your CTO? [CDOP]
    1. Copy all log files into AWS S3 using a cron job on each instance. Use an S3 Notification Configuration on the PutBucket event and publish events to AWS Lambda. Use the Lambda to analyze logs as soon as they come in and flag issues. (is not fast in search and introduces delay)
    2. Begin using CloudWatch Logs on every service. Stream all Log Groups into S3 objects. Use AWS EMR cluster jobs to perform adhoc MapReduce analysis and write new queries when needed. (is not fast in search and introduces delay)
    3. Copy all log files into AWS S3 using a cron job on each instance. Use an S3 Notification Configuration on the PutBucket event and publish events to AWS Kinesis. Use Apache Spark on AWS EMR to perform at-scale stream processing queries on the log chunks and flag issues. (is not fast in search and introduces delay)
    4. Begin using CloudWatch Logs on every service. Stream all Log Groups into an Amazon OpenSearch Service Domain and perform log analysis on a search cluster. (OpenSearch stack is designed specifically for real-time, ad-hoc log analysis and aggregation)
  4. You use Amazon CloudWatch as your primary monitoring system for your web application. After a recent software deployment, your users are getting Intermittent 500 Internal Server Errors when using the web application. You want to create a CloudWatch alarm, and notify an on-call engineer when these occur. How can you accomplish this using AWS services? (Choose three.) [CDOP]
    1. Deploy your web application as an AWS Elastic Beanstalk application. Use the default Elastic Beanstalk CloudWatch metrics to capture 500 Internal Server Errors. Set a CloudWatch alarm on that metric.
    2. Install a CloudWatch Logs Agent on your servers to stream web application logs to CloudWatch.
    3. Use Amazon Simple Email Service to notify an on-call engineer when a CloudWatch alarm is triggered.
    4. Create a CloudWatch Logs group and define metric filters that capture 500 Internal Server Errors. Set a CloudWatch alarm on that metric.
    5. Use Amazon Simple Notification Service to notify an on-call engineer when a CloudWatch alarm is triggered.
    6. Use AWS Data Pipeline to stream web application logs from your servers to CloudWatch.

References

AWS CloudWatch Logs User Guide