Google Cloud Monitoring
- Cloud Monitoring collects measurements of key aspects of the service and of the Google Cloud resources used.
- Cloud Monitoring provides tools to visualize and monitor this data.
- Cloud Monitoring helps gain visibility into the performance, availability, and health of the applications and infrastructure.
- Cloud Monitoring collects metrics, events, and metadata from Google Cloud, AWS, hosted uptime probes, and application instrumentation.
- Using the BindPlane service, data can be collected from over 150 common application components, on-premise systems, and hybrid cloud systems.
Cloud Monitoring Workspaces
- Cloud Monitoring uses Workspaces to organize monitoring information.
- A Workspace is a tool for monitoring resources across Google Cloud projects.
- A Workspace accesses metric data from its monitored projects, but the metric data remains in those projects.
- Every Workspace has a host project. If you delete the host project, you also delete the Workspace.
- A Workspace always monitors its Google Cloud host project
- The host project is the project used to create the Workspace. The name of the Workspace is set to the name of the host project. This isn’t configurable.
- The host project for a Workspace stores all of the configuration content for dashboards, alerting policies, uptime checks, notification channels, and group definitions that you configure.
- A Workspace can monitor multiple projects, but a Google Cloud project can be monitored by exactly one Workspace.
- Projects can be moved from one workspace to another workspace
- Two different workspaces can be merged into a single workspace
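The Workspace rules above can be sketched as a small in-memory model. This is only an illustration of the constraints (host project always monitored, one Workspace per project, move and merge), not the Cloud Monitoring API; the names `Workspace`, `Registry`, and all of their methods are hypothetical, and the merge behaviour shown (the source Workspace ends up empty) is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Toy model of a Workspace (hypothetical, not an API object)."""
    host_project: str
    monitored: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # A Workspace always monitors its host project.
        self.monitored.add(self.host_project)

    @property
    def name(self) -> str:
        # The Workspace name is fixed to the host project's name.
        return self.host_project


class Registry:
    """Tracks which single Workspace monitors each project."""

    def __init__(self) -> None:
        self.owner: dict[str, Workspace] = {}

    def create(self, host_project: str) -> Workspace:
        if host_project in self.owner:
            raise ValueError(f"{host_project} is already monitored")
        ws = Workspace(host_project)
        self.owner[host_project] = ws
        return ws

    def add_project(self, ws: Workspace, project: str) -> None:
        # A project can be monitored by exactly one Workspace.
        if project in self.owner:
            raise ValueError(f"{project} is already monitored")
        ws.monitored.add(project)
        self.owner[project] = ws

    def move_project(self, project: str, target: Workspace) -> None:
        source = self.owner[project]
        if project == source.host_project:
            # Follows from "a Workspace always monitors its host project".
            raise ValueError("cannot move a host project out of its Workspace")
        source.monitored.discard(project)
        target.monitored.add(project)
        self.owner[project] = target

    def merge(self, source: Workspace, target: Workspace) -> None:
        # Assumption: after a merge, every project of `source` is
        # monitored by `target` and `source` is left empty.
        for project in list(source.monitored):
            target.monitored.add(project)
            self.owner[project] = target
        source.monitored.clear()
```

For example, after `a = reg.create("project-a")` and `reg.add_project(a, "project-b")`, a second Workspace cannot add `project-b` until it is moved with `reg.move_project("project-b", other)`.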
Cloud Monitoring Metrics
- Metrics are a collection of measurements that help you understand how the applications and system services are performing.
- Measurements might include the latency of requests to a service, the amount of disk space available on a machine, the number of tables in the SQL database, the number of widgets sold, and so forth.
- Metric Value type includes
- For measurements consisting of a single value at a time: BOOL, a boolean; INT64, a 64-bit integer; DOUBLE, a double-precision float; STRING, a string
- For distribution measurements, the value isn’t a single value but a group of values.
- The value type for distribution measurements is DISTRIBUTION.
- Values in a distribution include the mean, count, max, and other statistics, computed for a group of values.
- Latency metrics typically capture data as distributions
- Metric Kind includes
- Gauge metric – Value is measured at a specific instant in time, e.g., CPU utilization or current temperature.
- Delta metric – Value is measured as the change since it was last recorded, e.g., metrics measuring request counts are delta metrics; each value records how many requests were received since the last data point was recorded.
- Cumulative metric – Value constantly increases over time, e.g., a metric for “sent bytes” might be cumulative; each value records the total number of bytes sent by a service at that time.
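To make the difference between delta and cumulative kinds, and the idea of a distribution value, more concrete, here is a minimal sketch in plain Python. It does not use the Cloud Monitoring client library; the function names and sample numbers are made up for illustration, and a real DISTRIBUTION value carries more than the three statistics shown.

```python
from statistics import mean


def cumulative_to_delta(cumulative: list[int]) -> list[int]:
    """Change between each pair of consecutive cumulative points."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def delta_to_cumulative(deltas: list[int], start: int = 0) -> list[int]:
    """Running total of delta points, beginning from `start`."""
    total = start
    totals = []
    for delta in deltas:
        total += delta
        totals.append(total)
    return totals


def summarize(values: list[float]) -> dict[str, float]:
    """Distribution-style summary of a group of values (e.g. latencies)."""
    return {"count": len(values), "mean": mean(values), "max": max(values)}


# Hypothetical "sent bytes" readings taken at four points in time.
sent_bytes_total = [100, 250, 250, 400]
sent_bytes_delta = cumulative_to_delta(sent_bytes_total)

# Hypothetical request latencies in milliseconds.
latency_summary = summarize([120.0, 80.0, 100.0])
```

A gauge needs no conversion: each point is simply the value observed at that instant.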
Cloud Monitoring Agent
- Google Cloud’s operations suite provides the following agents for collecting metrics on Linux and Windows VM instances.
- Ops Agent
- The primary and preferred agent for collecting telemetry from the Compute Engine instances.
- This agent combines logging and metrics into a single agent, providing YAML-based configurations for collecting the logs and metrics, and features high-throughput logging.
- Ops Agent uses Fluent Bit for logs, which supports high-throughput logging, and the OpenTelemetry Collector for metrics.
- Legacy Monitoring Agent
- The agent gathers system and application metrics from virtual machine instances and sends them to Cloud Monitoring.
- By default, the legacy monitoring agent collects disk, CPU, network, and process metrics.
- The agent can also be configured to monitor third-party applications; refer to the full list of agent metrics for what can be collected.
- The agent is a collectd-based daemon that gathers system and application metrics from VM instances and sends them to Monitoring.
Cloud Monitoring – Uptime Checks
- An uptime check is a request sent to a publicly accessible IP address on a resource to see whether it responds.
- Uptime checks can determine the availability of the following:
- URLs
- Kubernetes LoadBalancer Services
- VM instances
- App Engine services
- AWS load balancers
- The availability of a resource can be monitored by creating an alerting policy that creates an incident when the uptime check fails.
- The alerting policy can be configured to notify by email or through a different channel, and that notification can include details about the resource that failed to respond.
- The results of uptime checks can also be observed in the Monitoring uptime-check dashboards.
- For non-publicly available resources, the resource’s firewall must be configured to permit incoming traffic from the uptime-check servers.
- Uptime checks are unable to reach resources that don’t have an external IP address.
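Conceptually, an uptime check is a probe plus a rule for when failures become an incident. The sketch below shows that idea with the Python standard library only; it is not how Cloud Monitoring implements its checks, and `http_status`, `is_up`, `should_open_incident`, and the `threshold` of consecutive failures are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code.
        return err.code


def is_up(url: str, probe: Callable[[str], int] = http_status) -> bool:
    """Treat any 2xx response as 'up'; network errors count as 'down'."""
    try:
        return 200 <= probe(url) < 300
    except OSError:
        # URLError and socket timeouts are OSError subclasses.
        return False


def should_open_incident(results: list[bool], threshold: int = 2) -> bool:
    """True when the most recent `threshold` checks all failed.

    `threshold` is a hypothetical knob, not a Cloud Monitoring setting.
    """
    recent = results[-threshold:]
    return len(recent) == threshold and not any(recent)
```

The `probe` parameter exists so the logic can be exercised without network access, for example `is_up("https://example.com", probe=lambda url: 503)` evaluates to `False`.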
GCP Certification Exam Practice Questions
- Questions are collected from the Internet, and the answers are marked as per my knowledge and understanding (which might differ from yours).
- GCP services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
- GCP exam questions are not updated to keep pace with GCP updates, so even if the underlying feature has changed, the question might not be updated.
- Open to further feedback, discussion and correction.
- You need to monitor resources that are distributed over different projects in Google Cloud Platform. You want to consolidate reporting under the same Stackdriver Monitoring dashboard. What should you do?
- Use Shared VPC to connect all projects, and link Stackdriver to one of the projects.
- For each project, create a Stackdriver account. In each project, create a service account for that project and grant it the role of Stackdriver Account Editor in all other projects.
- Configure a single Stackdriver account, and link all projects to the same account.
- Configure a single Stackdriver account for one of the projects. In Stackdriver, create a Group and add the other project names as criteria for that Group.
- You are asked to set up application performance monitoring on Google Cloud projects A, B, and C as a single pane of glass. You want to monitor CPU, memory, and disk. What should you do?
- Enable API and then share charts from projects A, B, and C.
- Enable API and then give the metrics.reader role to projects A, B, and C.
- Enable API and then use default dashboards to view all projects in sequence.
- Enable API, create a workspace under project A, and then add projects B and C.