GCP Cloud SQL

  • Cloud SQL provides a cloud-based alternative to local MySQL, PostgreSQL, and SQL Server databases
  • Cloud SQL is a managed solution that handles backups, high availability and failover, data encryption, replication, monitoring, and logging.
  • Cloud SQL is ideal for lift-and-shift migrations of existing on-premises relational databases

Cloud SQL High Availability

  • The Cloud SQL HA configuration provides data redundancy and failover capability with minimal downtime when a zone or instance becomes unavailable due to a zonal outage or instance corruption
  • Cloud SQL HA configuration is also called a regional instance or cluster
  • With HA, the data continues to be available to client applications.
  • A Cloud SQL HA configuration is made up of a primary instance and a standby instance, located in a primary and a secondary zone within the configured region
  • If an HA-configured instance becomes unresponsive, Cloud SQL automatically switches to serving data from the standby instance.
  • Data is synchronously replicated to each zone’s persistent disk; all writes made to the primary instance are replicated to disks in both zones before a transaction is reported as committed.
  • In the event of an instance or zone failure, the persistent disk is attached to the standby instance, and it becomes the new primary instance.
  • After a failover, the instance that took over continues to be the primary instance, even after the original instance comes back online.
  • Once the zone or instance that experienced an outage becomes available again, the original primary instance is destroyed and recreated, and it becomes the new standby instance.
  • If a failover occurs in the future, the new primary fails over to the original instance in the original zone.
  • The Cloud SQL standby instance does not increase scalability and cannot be used for read queries
  • To see if failover has occurred, check your operation log’s failover history.

Cloud SQL Failover Process

  • Each second, the primary instance writes to a system database as a heartbeat signal.
  • Primary instance or zone fails.
  • If multiple heartbeats aren’t detected, failover is initiated. This occurs if the primary instance is unresponsive for approximately 60 seconds or the zone containing the primary instance experiences an outage.
  • Upon reconnection, the standby instance serves data from the secondary zone through the static IP address it shares with the primary instance.
  • Users are then automatically rerouted to the new primary; existing connections are dropped and clients simply reconnect to the same address (see the sketch below).
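
A failover drops existing connections, but because the standby takes over the same IP address, clients only need to reconnect and retry. A minimal client-side sketch, assuming a MySQL instance and the PyMySQL driver; the host, user, and credentials below are hypothetical placeholders.

```python
import time

import pymysql

# Hypothetical connection details - replace with your instance's IP and credentials.
DB_CONFIG = dict(host="10.1.2.3", user="app", password="secret", database="appdb")

def connect_with_retry(retries=10, delay=5):
    """Reconnect after a failover: the standby keeps the same IP, so retrying is enough."""
    for attempt in range(1, retries + 1):
        try:
            return pymysql.connect(**DB_CONFIG, connect_timeout=5)
        except pymysql.err.OperationalError as exc:
            print(f"attempt {attempt} failed ({exc}); retrying in {delay}s")
            time.sleep(delay)
    raise RuntimeError("database still unavailable after the failover window")

conn = connect_with_retry()
with conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())
conn.close()
```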

Cloud SQL Read Replica

  • Read replicas help scale reads horizontally without degrading the performance of the primary instance
  • Read replica is an exact copy of the primary instance. Data and other changes on the primary instance are updated in almost real time on the read replica.
  • Read replica can be promoted if the original instance becomes corrupted.
  • Primary instance and read replicas all reside in Cloud SQL
  • Read replicas are read-only; you cannot write to them
  • Read replicas do not provide failover capability
  • Read replicas cannot be made highly available like primary instances.
  • Cloud SQL currently supports 10 read replicas per primary instance
  • During a zonal outage, traffic to read replicas in that zone stops.
  • Once the zone becomes available again, any read replicas in the zone will resume replication from the primary instance.
  • If read replicas are in a zone that is not in an outage, they are connected to the standby instance when it becomes the primary instance.
  • GCP recommends putting read replicas in a different zone from the primary and standby instances. For example, if you have a primary instance in zone A and a standby instance in zone B, put the read replicas in zone C. This practice ensures that read replicas continue to operate even if the zone of the primary instance goes down.
  • The client application needs to be configured to send reads to the primary instance when read replicas are unavailable (see the read/write-splitting sketch after this list).
  • Cloud SQL supports Cross-region replication that lets you create a read replica in a different region from the primary instance.
  • Cloud SQL supports external read replicas, which are external MySQL instances that replicate from a Cloud SQL primary instance
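
Since read replicas are read-only and have their own IP addresses, the application decides where each query goes. A minimal read/write-splitting sketch, assuming hypothetical primary and replica endpoints and the PyMySQL driver; a production application would normally use a connection pool and fall back to the primary when replicas are unavailable.

```python
import pymysql

# Hypothetical endpoints: the primary instance and each read replica have their own IPs.
PRIMARY = dict(host="10.1.2.3", user="app", password="secret", database="appdb")
REPLICA = dict(host="10.1.2.4", user="app", password="secret", database="appdb")

def run(config, sql, params=None):
    conn = pymysql.connect(**config)
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            conn.commit()
            return cur.fetchall()
    finally:
        conn.close()

# Writes must always go to the primary instance; replicas are read-only.
run(PRIMARY, "INSERT INTO orders (item) VALUES (%s)", ("tractor",))

# Reads can be offloaded to a replica (data may lag slightly behind the primary).
print(run(REPLICA, "SELECT id, item FROM orders ORDER BY id DESC LIMIT 10"))
```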

Cloud SQL Point In Time Recovery

  • Point-in-time recovery (PITR) uses binary logs or write-ahead logs
  • PITR requires
    • Binary logging and backups enabled for the instance, with continuous binary logs since the last backup before the event you want to recover from
    • A binary log file name and the position of the event you want to recover from (that event and all events that came after it will not be reflected in the new instance)
  • Point-in-time recovery is enabled by default when you create a new Cloud SQL instance.

Cloud SQL Proxy

  • Cloud SQL Proxy provides secure access to the instances without the need for authorized networks or for configuring SSL, and offers the following advantages:
    • Secure connections: Cloud SQL Proxy automatically encrypts traffic to and from the database using TLS 1.2 with a 128-bit AES cipher; SSL certificates are used to verify client and server identities.
    • Easier connection management: Cloud SQL Proxy handles authentication, removing the need to provide static IP addresses.
  • Cloud SQL Proxy does not provide a new connectivity path; it relies on existing IP connectivity. To connect to a Cloud SQL instance using private IP, the Cloud SQL Proxy must be on a resource with access to the same VPC network as the instance.
  • Cloud SQL Proxy works by running a local client in the local environment. The application communicates with the proxy using the standard database protocol of the database (see the connection sketch after this list).
  • Cloud SQL Proxy uses a secure tunnel to communicate with its companion process running on the server.
  • While the proxy can listen on any port, it only creates outgoing connections to the Cloud SQL instance on port 3307.
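
With the proxy running locally (for example, listening on 127.0.0.1:3306 for MySQL), the application connects as if the database were local; the proxy encrypts the traffic and opens the outgoing connection to the instance on port 3307. A minimal sketch assuming the Cloud SQL Proxy is already running and using hypothetical credentials with the PyMySQL driver.

```python
import pymysql

# Assumes the Cloud SQL Proxy is already running locally, e.g.:
#   ./cloud_sql_proxy -instances=my-project:us-central1:my-instance=tcp:3306
# The application speaks the normal MySQL protocol to 127.0.0.1; the proxy handles
# TLS and authentication and connects to the instance on port 3307.
conn = pymysql.connect(
    host="127.0.0.1",   # local proxy listener, not the instance IP
    port=3306,
    user="app",         # hypothetical database user
    password="secret",
    database="appdb",
)
with conn.cursor() as cur:
    cur.execute("SELECT NOW()")
    print(cur.fetchone())
conn.close()
```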

Cloud SQL Features Comparison

GCP Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • GCP services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • GCP exam questions are not updated to keep pace with GCP updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. An application that relies on Cloud SQL to read infrequently changing data is predicted to grow dramatically. How can you increase capacity for more read-only clients?
    1. Configure high availability on the master node
    2. Establish an external replica in the customer’s data center
    3. Use backups so you can restore if there’s an outage
    4. Configure read replicas.
  2. A Company is using Cloud SQL to host critical data. They want to enable high availability in case a complete zone goes down. How should you configure the same?
    1. Create a Read replica in the same region different zone
    2. Create a Read replica in the different region different zone
    3. Create a Failover replica in the same region different zone
    4. Create a Failover replica in the different region different zone
  3. A Company is using Cloud SQL to host critical data. They want to enable point-in-time recovery (PITR) to be able to recover the instance to a specific point in time. How should you configure the same?
    1. Create a Read replica for the instance
    2. Switch to Spanner 3 node cluster
    3. Create a Failover replica for the instance
    4. Enable Binary logging and backups for the instance

 

GCP Storage Options

GCP provides various storage options and the selection can be based on

  • Relational (SQL) vs Non-Relational (NoSQL)
  • Structured vs Unstructured
  • Transactional (OLTP) vs Analytical (OLAP)
  • Fully Managed vs Requires Provisioning
  • Global vs Regional
  • Horizontal vs Vertical scaling

Cloud Datastore

  • Cloud Datastore is a highly-scalable, non-relational NoSQL database
  • fully managed with no-ops and no planned downtime and no need to provision database instances (vs Bigtable)
  • uses a distributed architecture to automatically manage scaling.
  • queries scale with the size of the result set, not the size of the data set
  • supports ACID atomic transactions – all or nothing (vs Bigtable)
  • provides high availability of reads and writes – runs in Google data centers, which use redundancy to minimize impact from points of failure.
  • provides massive scalability with high performance – uses a distributed architecture to automatically manage scaling.
  • scales from zero to terabytes with flexible storage and querying of data
  • provides SQL-like query language
  • supports strong and eventual consistency – entity lookups and ancestor queries always receive strongly consistent data; all other queries are eventually consistent.
  • supports data encryption at rest and in transit
  • provides terabytes of capacity with a maximum unit size of 1 MB per entity (vs Bigtable)
  • Consider using Cloud Datastore if you need to store semi-structured objects, or if you require support for transactions and SQL-like queries (see the sketch below).
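
A minimal sketch using the google-cloud-datastore Python client; the kind and property names are hypothetical. It shows an entity write, a strongly consistent lookup by key, an atomic transaction, and a SQL-like filtered query.

```python
from google.cloud import datastore

client = datastore.Client()  # project and credentials come from the environment

# Write a semi-structured entity (kind and properties are hypothetical).
key = client.key("Task", "task-1")
task = datastore.Entity(key=key)
task.update({"description": "ship parts", "done": False, "priority": 4})
client.put(task)

# Lookups by key are strongly consistent.
print(client.get(key))

# Atomic, all-or-nothing transaction.
with client.transaction():
    entity = client.get(key)
    entity["done"] = True
    client.put(entity)

# Filtered query (eventually consistent unless it is an ancestor query).
query = client.query(kind="Task")
query.add_filter("done", "=", False)
for e in query.fetch(limit=10):
    print(e)
```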

Cloud Bigtable

  • Bigtable is a non-relational NoSQL, analytical big data database service
  • supports large quantities (>1 TB) of semi-structured or structured data (vs Datastore)
  • supports high throughput or rapidly changing data (vs BigQuery)
  • managed, but needs provisioning of nodes and can be expensive (vs Datastore and BigQuery)
  • does not support transactions or strong relational semantics (vs Datastore)
  • does not support SQL queries (vs BigQuery and Datastore)
  • ideal for time-series or natural semantic ordering data
  • can run asynchronous batch or real-time processing on the data
  • can run machine learning algorithms on the data
  • provides petabytes of capacity with a maximum unit size of 10 MB per cell and 100 MB per row.
  • Consider using Cloud Bigtable if you need a high-performance datastore to perform analytics on a large number of structured objects (see the sketch below).
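
A minimal sketch using the google-cloud-bigtable Python client, assuming a hypothetical instance, table, and column family already exist; row keys with a time-based component suit Bigtable's time-series use case.

```python
import time

from google.cloud import bigtable

client = bigtable.Client(project="my-project")                     # hypothetical project
table = client.instance("iot-instance").table("vehicle-metrics")   # assumed to exist

# Row key with a natural, time-based ordering (good fit for time-series data).
row_key = f"vehicle#42#{int(time.time())}".encode()

row = table.direct_row(row_key)
row.set_cell("metrics", "engine_temp", b"87")   # column family "metrics" assumed to exist
row.set_cell("metrics", "fuel_level", b"63")
row.commit()

# Point read of the row that was just written.
result = table.read_row(row_key)
for cell in result.cells["metrics"][b"engine_temp"]:
    print(cell.value)
```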

Cloud Storage

  • Cloud Storage provides durable and highly available object storage.
  • fully managed and scalable, with simple administration and no capacity management required
  • supports unstructured data storage like binary or raw objects
  • provides high performance at internet scale
  • supports data encryption at rest and in transit
  • Consider using Cloud Storage, if you need to store immutable blobs larger than 10 MB, such as large images or movies. This storage service provides petabytes of capacity with a maximum unit size of 5 TB per object.
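
A minimal upload/download sketch using the google-cloud-storage Python client; the bucket and object names are hypothetical and the bucket is assumed to already exist.

```python
from google.cloud import storage

client = storage.Client()                  # project and credentials from the environment
bucket = client.bucket("my-media-bucket")  # hypothetical bucket, assumed to exist

# Upload a large immutable blob (e.g. an image or movie) as an object.
blob = bucket.blob("movies/trailer.mp4")
blob.upload_from_filename("trailer.mp4")

# Read back the object's metadata and download it.
blob.reload()
print(blob.size, blob.content_type)
blob.download_to_filename("/tmp/trailer.mp4")
```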

Cloud SQL

  • Cloud SQL provides managed, relational SQL databases
  • Offers MySQL and PostgreSQL databases as a service
  • managed, however you need to select and provision machines (vs Cloud Spanner)
  • supports automatic replication
  • supports managed backups
  • supports vertical scaling for read and write
  • supports horizontal scaling for reads only, using read replicas (vs Cloud Spanner)
  • single region only – although it now supports cross region read replicas (vs Cloud Spanner)
  • supports data encryption at rest and in transit
  • provides up to 10,230 GB, depending on machine type (vs Cloud Spanner)
  • Consider using Cloud SQL for full relational SQL support for OLTP and for lift-and-shift of MySQL and PostgreSQL databases

Cloud Spanner

  • Cloud Spanner provides fully managed, relational SQL databases with joins and secondary indexes
  • provides cross-region, global, horizontal scalability and availability
  • supports strong consistency, including strongly consistent secondary indexes
  • provides high availability through synchronous and built-in data replication.
  • provides strong global consistency
  • supports database sizes exceeding ~2 TB (vs Cloud SQL)
  • does not provide direct lift and shift for relational databases (vs Cloud SQL)
  • expensive as compared to Cloud SQL
  • Consider using Cloud Spanner for full relational SQL support with horizontal scalability spanning petabytes for OLTP (see the sketch below)
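
A minimal sketch using the google-cloud-spanner Python client; the instance, database, table, and column names are hypothetical. It shows a strongly consistent SQL read and a write inside a transaction.

```python
from google.cloud import spanner

client = spanner.Client()
database = client.instance("global-instance").database("orders-db")  # hypothetical names

# Strongly consistent SQL query (joins and secondary indexes are supported).
with database.snapshot() as snapshot:
    for row in snapshot.execute_sql("SELECT OrderId, Amount FROM Orders LIMIT 10"):
        print(row)

# Writes run inside a transaction that Spanner commits with strong consistency.
def insert_order(transaction):
    transaction.execute_update(
        "INSERT INTO Orders (OrderId, Amount) VALUES (@id, @amount)",
        params={"id": "o-1001", "amount": 250},
        param_types={"id": spanner.param_types.STRING,
                     "amount": spanner.param_types.INT64},
    )

database.run_in_transaction(insert_order)
```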

BigQuery

  • provides a fully managed, no-ops OLAP solution
  • provides high capacity, data warehousing analytics solution
  • ideal for big data exploration and processing
  • not ideal for operational or transactional databases
  • provides a standard SQL interface (see the query sketch below)
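
A minimal query sketch using the google-cloud-bigquery Python client; the query runs against a public sample dataset and the project comes from the environment.

```python
from google.cloud import bigquery

client = bigquery.Client()  # project and credentials resolved from the environment

# BigQuery exposes a standard SQL interface over the managed warehouse.
query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY total DESC
    LIMIT 5
"""
for row in client.query(query).result():   # result() waits for the job to finish
    print(row.name, row.total)
```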

GCP Storage Options Decision Tree

GCP Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • GCP services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • GCP exam questions are not updated to keep pace with GCP updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Your application is hosted across multiple regions and consists of both relational database data and static images. Your database has over 10 TB of data. You want to use a single storage repository for each data type across all regions. Which two products would you choose for this task? (Choose two)
    1. Cloud Bigtable
    2. Cloud Spanner
    3. Cloud SQL
    4. Cloud Storage
  2. You are building an application that stores relational data from users. Users across the globe will use this application. Your CTO is concerned about the scaling requirements because the size of the user base is unknown. You need to implement a database solution that can scale with your user growth with minimum configuration changes. Which storage solution should you use?
    1. Cloud SQL
    2. Cloud Spanner
    3. Cloud Firestore
    4. Cloud Datastore
  3. Your company processes high volumes of IoT data that are time-stamped. The total data volume can be several petabytes. The data needs to be written and changed at a high speed. You want to use the most performant storage option for your data. Which product should you use?
    1. Cloud Datastore
    2. Cloud Storage
    3. Cloud Bigtable
    4. BigQuery

GCP Cloud Load Balancing

  • Cloud Load Balancing distributes user traffic across multiple instances of an application, spreading the load and reducing the risk of performance issues for the application
  • Cloud Load Balancing helps serve content as close as possible to the users on a system that can respond to over one million queries per second.
  • Cloud Load Balancing is a fully distributed, software-defined managed service. It isn’t hardware-based and there is no need to manage a physical load balancing infrastructure.

Cloud Load Balancing Features

  • External load balancing
    • for internet based applications
    • uses the Premium Tier of Network Service Tiers for global load balancing; some types can also be deployed regionally with the Standard Tier
    • Types
      • External HTTP/S Load Balancing
      • SSL Proxy Load Balancing
      • TCP Proxy Load Balancing
      • External TCP/UDP Network Load Balancing
  • Internal load balancing
    • for internal clients inside of Google Cloud
    • can use Standard Tier
    • Types
      • Internal HTTP/S Load Balancing
      • Internal TCP/UDP Network Load Balancing
  • Regional load balancing
    • for single region applications.
    • supports only IPv4 termination.
    • Types
      • Internal HTTP/S Load Balancing
      • External TCP/UDP Network Load Balancing
      • Internal TCP/UDP Network Load Balancing
      • External HTTP/S Load Balancing (Standard Tier)
      • SSL Proxy Load Balancing (Standard Tier)
      • TCP Proxy Load Balancing (Standard Tier)
  • Global load balancing
    • for globally distributed applications
    • provides access by using a single anycast IP address
    • supports IPv6 termination.
    • Types
      • External HTTP/S Load Balancing (Premium Tier)
      • SSL Proxy Load Balancing (Premium Tier)
      • TCP Proxy Load Balancing (Premium Tier)

Pass-through vs Proxy-based load balancing

  • Proxy-based load balancing
    • acts as a proxy performing address and port translation and terminating the request before forwarding to the backend service
    • clients and backends interact with the load balancer
      • the original client IP address is forwarded to the backend using the X-Forwarded-For header (see the parsing sketch after these lists)
      • all proxy-based external load balancers automatically inherit DDoS protection from Google Front Ends (GFEs)
    • Google Cloud Armor can be configured for external HTTP(S) load balancers
    • Types
      • Internal HTTP/S Load Balancing
      • External HTTP/S Load Balancing
      • SSL Proxy Load Balancing
      • TCP Proxy Load Balancing
  • Pass-through load balancing
    • does not modify the request or headers and passes it unchanged to the underlying backend
    • Types
      • External TCP/UDP Network Load Balancing
      • Internal TCP/UDP Network Load Balancing
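
Because a proxy-based load balancer terminates the client connection, the backend sees the load balancer as the TCP peer and must read the original client IP from the X-Forwarded-For header. A minimal, illustrative parsing sketch (the header values below are made up); external HTTP(S) load balancers append the client IP and then the load balancer IP to the header.

```python
def client_ip_from_xff(xff_header: str) -> str:
    """Extract the original client IP from an X-Forwarded-For header.

    The external HTTP(S) load balancer appends "<client-ip>, <lb-ip>" to any
    values supplied by earlier proxies, so the client IP is the
    second-to-last entry.
    """
    parts = [p.strip() for p in xff_header.split(",") if p.strip()]
    if len(parts) >= 2:
        return parts[-2]
    return parts[0] if parts else ""

# Example with made-up addresses: client IP followed by the load balancer IP.
print(client_ip_from_xff("203.0.113.7, 130.211.3.4"))   # -> 203.0.113.7
```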

Layer 4 vs Layer 7

  • Layer 4-based load balancing
    • directs traffic based on data from network and transport layer protocols, such as IP address and TCP or UDP port
  • Layer 7-based load balancing
    • adds content-based routing decisions based on attributes, such as the HTTP header and the URI
  • Supports various traffic types including HTTP(S), TCP, UDP
  • For HTTP and HTTPS traffic, use:
    • External HTTP(S) Load Balancing
    • Internal HTTP(S) Load Balancing
  • For TCP traffic, use:
    • TCP Proxy Load Balancing
    • Network Load Balancing
    • Internal TCP/UDP Load Balancing
  • For UDP traffic, use:
    • Network Load Balancing
    • Internal TCP/UDP Load Balancing

Google Cloud Load Balancing Types

Refer blog post @ Google Cloud Load Balancing Types

Load Balancing Components

Backend services

  • A backend is a group of endpoints that receive traffic from a Google Cloud load balancer, a Traffic Director-configured Envoy proxy, or a proxyless gRPC client.
  • Google Cloud supports several types of backends:
    • Instance group containing virtual machine (VM) instances.
    • Zonal NEG
    • Serverless NEG
    • Internet NEG
    • Cloud Storage bucket
  • A backend service is either global or regional in scope.

Forwarding Rules

  • A forwarding rule and its corresponding IP address represent the frontend configuration of a Google Cloud load balancer.

Health Checks

  • Google Cloud provides health checking mechanisms that determine if backends, such as instance groups and zonal network endpoint groups (NEGs), are healthy and properly respond to traffic.
  • Google Cloud provides global and regional health check systems that connect to backends on a configurable, periodic basis.
  • Each connection attempt is called a probe, and each health check system is called a prober. Google Cloud records the success or failure of each probe
  • Google Cloud computes an overall health state for each backend in the load balancer or Traffic Director based on a configurable number of sequential successful or failed probes.
    • Backends that respond successfully for the configured number of times are considered healthy.
    • Backends that fail to respond successfully for a separately configured number of times are considered unhealthy (see the sketch below).
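
A minimal, illustrative sketch of that health-state logic (not the actual GCP prober implementation): consecutive probe results flip a backend between healthy and unhealthy based on configurable thresholds.

```python
class BackendHealth:
    """Toy model of health-state evaluation from consecutive probe results."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=3):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.successes = 0
        self.failures = 0
        self.healthy = True

    def record_probe(self, success: bool) -> bool:
        if success:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy

backend = BackendHealth()
for result in [True, False, False, False, True, True]:
    # Flips to unhealthy after 3 consecutive failures, back after 2 successes.
    print(backend.record_probe(result))
```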

IPv6 termination

  • Google Cloud supports IPv6 clients with HTTP(S) Load Balancing, SSL Proxy Load Balancing, and TCP Proxy Load Balancing.
  • Load balancer accepts IPv6 connections from the users, and then proxies those connections to the backends.

SSL Certificates

  • Google Cloud uses SSL certificates to provide privacy and security from a client to a load balancer. To achieve this, the load balancer must have an SSL certificate and the certificate’s corresponding private key.
  • Communication between the client and the load balancer remains private – illegible to any third party that doesn’t have this private key.
  • Multiple SSL certificates can be used when serving from multiple domains using the same load balancer IP address and port, with a different SSL certificate for each domain.

SSL Policies

  • SSL policies provide the ability to control the features of SSL that the SSL proxy load balancer or external HTTP(S) load balancer negotiates with clients.
  • HTTP(S) Load Balancing and SSL Proxy Load Balancing use a set of SSL features that provides good security and wide compatibility.
  • SSL policies help control the features of SSL like SSL versions and ciphers that the load balancer negotiates with clients.

URL Maps

  • URL map helps to direct requests to a destination based on defined rules
  • When a request arrives at the load balancer, the load balancer routes the request to a particular backend service or backend bucket based on configurations in a URL map.
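
A minimal, conceptual sketch of the routing decision a URL map expresses (host rules select a path matcher, path rules select a backend service or bucket); this is purely illustrative, not the actual load balancer or GCP API, and the backend names are hypothetical.

```python
# Conceptual model of a URL map: host rules pick a path matcher, and path
# rules within the matcher pick a backend service or backend bucket.
URL_MAP = {
    "default": "web-backend-service",
    "hosts": {
        "example.com": {
            "default": "web-backend-service",
            "paths": {
                "/images/": "images-backend-bucket",
                "/api/":    "api-backend-service",
            },
        },
    },
}

def route(host: str, path: str) -> str:
    matcher = URL_MAP["hosts"].get(host)
    if matcher is None:
        return URL_MAP["default"]
    for prefix, backend in matcher["paths"].items():
        if path.startswith(prefix):
            return backend
    return matcher["default"]

print(route("example.com", "/images/logo.png"))   # -> images-backend-bucket
print(route("example.com", "/index.html"))        # -> web-backend-service
```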

GCP Cloud Load Balancing Types

Google Cloud Load Balancer Comparison

Internal HTTP(S) Load Balancing

  • is a proxy-based, regional Layer 7 load balancer that enables running and scaling services behind an internal IP address.
  • distributes HTTP and HTTPS traffic to backends hosted on Compute Engine and GKE
  • is accessible only in the chosen region of the Virtual Private Cloud (VPC) network on an internal IP address.
  • enables rich traffic control capabilities based on HTTP(S) parameters.
  • is a managed service based on the open source Envoy proxy.
  • needs one proxy-only subnet in each region of a VPC network where internal HTTP(S) load balancers are used. All the internal HTTP(S) load balancers in a region and VPC network share the same proxy-only subnet because they share a pool of Envoy proxies.
  • supports path based routing
  • preserves the Host header of the original client request and also appends two IP addresses (client and load balancer) to the X-Forwarded-For header
  • supports a regional backend service, which distributes requests to healthy backends (either instance groups containing Compute Engine VMs or NEGs containing GKE containers).
  • supports a regional health check that periodically monitors the readiness of the backends. This reduces the risk that requests might be sent to backends that can’t service the request.
  • if a backend becomes unhealthy, traffic is automatically redirected to healthy backends within the same region.
  • has native support for the WebSocket protocol when using HTTP or HTTPS as the protocol to the backend
  • accepts only TLS 1.0, 1.1, 1.2, and 1.3 when terminating client SSL requests.
  • isn’t compatible with the following features:
    • Cloud CDN
    • Google Cloud Armor
    • Cloud Storage buckets
    • Google-managed SSL certificates
    • SSL policies

External HTTP(S) Load Balancing

  • is a global, proxy-based Layer 7 load balancer that enables running and scaling the services worldwide behind a single external IP address.
  • distributes HTTP and HTTPS traffic to backends hosted on Compute Engine and GKE
  • is implemented on Google Front Ends (GFEs). GFEs are distributed globally and operate together using Google’s global network and control plane.
    • In the Premium Tier, GFEs offer global load balancing
    • With Standard Tier, the load balancing is handled regionally.
  • provides cross-regional load balancing, directing traffic to the closest healthy backend that has capacity and terminating HTTP(S) traffic as close as possible to your users.
  • supports content-based load balancing using URL maps to select a backend service based on the requested host name, request path, or both.
  • supports the following backend types:
    • Instance groups
    • Zonal network endpoint groups (NEGs)
    • Serverless NEGs: One or more App Engine, Cloud Run, or Cloud Functions services
    • Internet NEGs, for endpoints that are outside of Google Cloud (also known as custom origins)
    • Buckets in Cloud Storage
  • preserves the Host header of the original client request and also appends two IP addresses (client and load balancer) to the X-Forwarded-For header
  • supports Cloud Load Balancing Autoscaler, which allows users to perform autoscaling on the instance groups in a backend service.
  • supports connection draining on backend services to ensure minimal interruption to the users when an instance that is serving traffic is terminated, removed manually, or removed by an autoscaler.
  • supports Session affinity as a best-effort attempt to send requests from a particular client to the same backend for as long as the backend is healthy and has the capacity, according to the configured balancing mode. It offers three types of session affinity:
    • NONE. Session affinity is not set for the load balancer.
    • Client IP affinity sends requests from the same client IP address to the same backend.
    • Generated cookie affinity sets a client cookie when the first request is made, and then sends requests with that cookie to the same backend.
  • if a backend becomes unhealthy, traffic is automatically redirected to healthy backends within the same region.
  • has native support for the WebSocket protocol when using HTTP or HTTPS as the protocol to the backend
  • accepts only TLS 1.0, 1.1, 1.2, and 1.3 when terminating client SSL requests.
  • does not support client certificate-based authentication, also known as mutual TLS authentication.

Internal TCP/UDP Load Balancing

  • is a managed, internal, pass-through, regional Layer 4 load balancer that enables running and scaling services behind an internal IP address.
  • distributes traffic among VM instances in the same region in a Virtual Private Cloud (VPC) network by using an internal IP address.
  • provides high-performance, pass-through Layer 4 load balancer for TCP or UDP traffic.
  • routes original connections directly from clients to the healthy backends, without any interruption. Unlike proxy load balancer, it doesn’t terminate connections from clients and then open new connections to backends.
  • provides access through VPC Network Peering, Cloud VPN or Cloud Interconnect
  • does not terminate SSL traffic and SSL traffic can be terminated by the backends instead of by the load balancer
  • supports Session affinity as a best-effort attempt for TCP traffic to send requests from a particular client to the same backend for as long as the backend is healthy and has the capacity, according to the configured balancing mode. It offers the following types of session affinity (see the hashing sketch after this list):
    • None : default setting, effectively same as Client IP, protocol, and port.
    • Client IP : Directs a particular client’s requests to the same backend VM based on a hash created from the client’s IP address and the destination IP address.
    • Client IP and protocol : Directs a particular client’s requests to the same backend VM based on a hash created from three pieces of information: the client’s IP address, the destination IP address, and the load balancer’s protocol (TCP or UDP).
    • Client IP, protocol, and port : Directs a particular client’s requests to the same backend VM based on a hash created from these five pieces of information:
      • Source IP address of the client sending the request
      • Source port of the client sending the request
      • Destination IP address
      • Destination port
      • Protocol (TCP or UDP)
  • Since the UDP protocol doesn’t support sessions, session affinity doesn’t affect UDP traffic.
  • supports health check that periodically monitors the readiness of the backends. If a backend becomes unhealthy, traffic is automatically redirected to healthy backends within the same region.
  • supports HTTP(S), HTTP/2, TCP, and SSL as health check protocols; the protocol of the health check does not have to match the protocol of the load balancer.
  • does not offer a health check that uses the UDP protocol, but can be done using TCP-based health checks
  • supports some backends to be configured as failover backends. These backends are only used when the number of healthy VMs in the primary backend instance groups has fallen below a configurable threshold.
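
A minimal, illustrative sketch of the idea behind the session affinity settings above (not Google's actual algorithm): hashing selected fields of the connection always maps the same client, or the same connection, to the same healthy backend.

```python
import hashlib

BACKENDS = ["vm-a", "vm-b", "vm-c"]   # hypothetical healthy backend VMs

def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, affinity="5-tuple"):
    """Hash the selected connection fields to choose a backend.

    affinity="client-ip" hashes only the source and destination IPs, so all
    connections from one client land on the same VM; the default 5-tuple
    spreads individual connections across backends.
    """
    if affinity == "client-ip":
        fields = (src_ip, dst_ip)
    else:  # source IP/port, destination IP/port, and protocol
        fields = (src_ip, src_port, dst_ip, dst_port, protocol)
    digest = hashlib.md5("|".join(map(str, fields)).encode()).hexdigest()
    return BACKENDS[int(digest, 16) % len(BACKENDS)]

print(pick_backend("203.0.113.7", 53211, "10.0.0.5", 443, "TCP"))
print(pick_backend("203.0.113.7", 53999, "10.0.0.5", 443, "TCP", affinity="client-ip"))
```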

External TCP/UDP Network Load Balancing

  • is a managed, external, pass-through, regional Layer 4 load balancer that distributes TCP or UDP traffic originating from the internet among virtual machine (VM) instances in the same region
  • are not proxies, but pass-through
    • Load-balanced packets are received by backend VMs with their source IP unchanged.
    • Load-balanced connections are terminated by the backend VMs.
    • Responses from the backend VMs go directly to the clients, not back through the load balancer.
  • scope of a network load balancer is regional, not global. A network load balancer cannot span multiple regions. Within a single region, the load balancer services all zones.
  • distributes connections among backend VMs contained within managed or unmanaged instance groups.
  • supports regional health check that periodically monitors the readiness of the backends. If a backend becomes unhealthy, traffic is automatically redirected to healthy backends within the same region.
  • supports HTTP(S), HTTP/2, TCP, and SSL as health check protocols; the protocol of the health check does not have to match the protocol of the load balancer.
  • does not offer a health check that uses the UDP protocol, but can be done using TCP-based health checks
  • supports a connection tracking table and a configurable consistent hashing algorithm to determine how traffic is distributed to backend VMs.
  • supports Session affinity as a best-effort attempt for TCP traffic to send requests from a particular client to the same backend for as long as the backend is healthy and has the capacity, according to the configured balancing mode. It offers the following types of session affinity:
    • None : default setting, effectively same as Client IP, protocol, and port.
    • Client IP : Directs a particular client’s requests to the same backend VM based on a hash created from the client’s IP address and the destination IP address.
    • Client IP and protocol : Directs a particular client’s requests to the same backend VM based on a hash created from three pieces of information: the client’s IP address, the destination IP address, and the load balancer’s protocol (TCP or UDP).
    • Client IP, protocol, and port : Directs a particular client’s requests to the same backend VM based on a hash created from these five pieces of information:
      • Source IP address of the client sending the request
      • Source port of the client sending the request
      • Destination IP address
      • Destination port
      • Protocol (TCP or UDP)
  • Since the UDP protocol doesn’t support sessions, session affinity doesn’t affect UDP traffic.
  • supports connection draining which allows established TCP connections to persist until the VM no longer exists. If connection draining is disabled, established TCP connections are terminated as quickly as possible.
  • does not support Network endpoint groups (NEGs) as backends

External SSL Proxy Load Balancing

  • is a reverse proxy load balancer that distributes SSL traffic coming from the internet to virtual machine (VM) instances in the VPC network.
  • with SSL traffic, user SSL (TLS) connections are terminated at the load balancing layer, and then proxied to the closest available backend instances by using either SSL (recommended) or TCP.
  • supports global load balancing service with the Premium Tier
  • supports regional load balancing service with the Standard Tier
  • is intended for non-HTTP(S) traffic. For HTTP(S) traffic, GCP recommends using HTTP(S) Load Balancing.
  • supports proxy protocol header to preserve the original source IP addresses of incoming connections to the load balancer
  • performs traffic distribution based on the balancing mode and the hashing method selected to choose a backend (session affinity).
  • supports two types of balancing mode
    • CONNECTION : the load is spread based on how many concurrent connections the backend can handle.
    • UTILIZATION: the load is spread based on the utilization of instances in an instance group.
  • supports Session Affinity and offers client IP affinity, which forwards all requests from the same client IP address to the same backend.
  • supports a single backend service resource. Changes to the backend service are not instantaneous and can take several minutes to propagate to Google Front Ends (GFEs).
  • does not support client certificate-based authentication, also known as mutual TLS authentication.

External TCP Proxy Load Balancing

  • is a reverse proxy load balancer that distributes TCP traffic coming from the internet to virtual machine (VM) instances in the VPC network
  • terminates traffic coming over a TCP connection at the load balancing layer, and then forwards to the closest available backend using TCP or SSL
  • uses a single IP address for all users worldwide and automatically routes traffic to the backends that are closest to the user
  • supports global load balancing service with the Premium Tier
  • supports regional load balancing service with the Standard Tier
  • performs traffic distribution based on the balancing mode and the hashing method selected to choose a backend (session affinity).
  • supports proxy protocol header to preserve the original source IP addresses of incoming connections to the load balancer
  • supports two types of balancing mode
    • CONNECTION : the load is spread based on how many concurrent connections the backend can handle.
    • UTILIZATION: the load is spread based on the utilization of instances in an instance group.
  • supports Session Affinity and offers client IP affinity, which forwards all requests from the same client IP address to the same backend.

GCP Cloud Load Balancing Decision Tree

GCP Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • GCP services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • GCP exam questions are not updated to keep pace with GCP updates, so even if the underlying feature has changed, the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Your development team has asked you to set up an external TCP load balancer with SSL offload. Which load balancer should you use?
    1. SSL proxy
    2. HTTP load balancer
    3. TCP proxy
    4. HTTPS load balancer
  2. You have an instance group that you want to load balance. You want the load balancer to terminate the client SSL session. The instance group is used to serve a public web application over HTTPS. You want to follow Google-recommended practices. What should you do?
    1. Configure a HTTP(S) load balancer.
    2. Configure an internal TCP load balancer.
    3. Configure an external SSL proxy load balancer.
    4. Configure an external TCP proxy load balancer.
  3. Your development team has asked you to set up load balancer with SSL termination. The website would be using HTTPS protocol. Which load balancer should you use?
    1. SSL proxy
    2. HTTP load balancer
    3. TCP proxy
    4. HTTPS load balancer
  4. You have an application that receives SSL-encrypted TCP traffic on port 443. Clients for this application are located all over the world. You want to minimize latency for the clients. Which load balancing option should you use?
    1. HTTPS Load Balancer
    2. Network Load Balancer
    3. SSL Proxy Load Balancer
    4. Internal TCP/UDP Load Balancer. Add a firewall rule allowing ingress traffic from 0.0.0.0/0 on the target instances.

 

Google Cloud – App Engine Standard vs Flexible Environment

Application Execution

  • Standard environment
    • Application instances that run in a sandbox, using the runtime environment of a supported language only.
    • Sandbox restricts what the application can do
      • only allows the app to use a limited set of binary libraries
      • app cannot write to disk
      • limits the CPU and memory options available to the application
    • Sandbox does not support
      • SSH debugging
      • Background processes
      • Background threads (limited capability)
      • Using Cloud VPN
  • Flexible environment
    • Application instances run within Docker containers on Compute Engine virtual machines (VM).
    • As the flexible environment supports Docker, it can support custom runtimes or source code written in other programming languages.
    • Allows selection of any Compute Engine machine type for instances so that the application has access to more memory and CPU.
  • Standard environment
    • the application can access services such as Datastore via the built-in google.appengine APIs.
  • Flexible environment
    • the built-in google.appengine APIs are not available.
    • GCP recommends using the Google Cloud client libraries, which make the application more portable.

Scaling

  • Standard Environment
    • Rapid scaling and scaling down to zero are possible; it can scale from zero instances up to thousands very quickly.
    • uses a custom-designed autoscaling algorithm.
  • Flexible Environment
    • must have at least one instance running for each active version and can take longer to scale up in response to traffic.
    • uses the Compute Engine Autoscaler.

Health Checks

  • Standard environment
    • does not use health checks to determine whether or not to send traffic to an instance.
  • Flexible environment
    • Instances are health-checked; the load balancer uses the results to determine whether to send traffic to an instance and whether it should be autohealed.

Traffic Migration

  • Standard environment
    • allows you to choose to route requests to the target version, either immediately or gradually.
  • Flexible environment
    • only allows immediate traffic migration
  • Standard environment
    • applications are single-zoned and all instances of the application live in a single availability zone
    • In the event of a zone failure, the application starts new instances in a different zone in the same region and the load balancer routes traffic to the new instances.
    • A latency spike can be observed due to loading requests and a Memcache flush.
  • Flexible environment
    • applications use Regional Managed Instance Groups with instances  distributed among multiple availability zones within a region.
    • In the event of a single zone failure, the load balancer stops routing traffic to that zone.
  • Standard Environment
    • Deployments are generally faster than deployments in flexible environment.
    • VM instances come up in seconds in the case of autoscaling
  • Flexible Environment
    • Instance startup time is in minutes rather than seconds compared to the standard environment
    • Deployment time is in minutes rather than seconds compared to the standard environment

Google Cloud – TerramEarth Case Study

TerramEarth manufactures heavy equipment for the mining and agricultural industries. About 80% of their business is from mining and 20% from agriculture. They currently have over 500 dealers and service centers in 100 countries. Their mission is to build products that make their customers more productive.

Key points here are 500 dealers and service centers are spread across the world and they want to make their customer more productive.

Solution Concept

There are 20 million TerramEarth vehicles in operation that collect 120 fields of data per second. Data is stored locally on the vehicle and can be accessed for analysis when a vehicle is serviced. The data is downloaded via a maintenance port. This same port can be used to adjust operational parameters, allowing the vehicles to be upgraded in the field with new computing modules.

Approximately 200,000 vehicles are connected to a cellular network, allowing TerramEarth to collect data directly. At a rate of 120 fields of data per second, with 22 hours of operation per day, TerramEarth collects a total of about 9 TB/day from these connected vehicles.

Key points here are TerramEarth has 20 million vehicles. Data is stored on the vehicle and is only available for analysis when the vehicle comes for servicing. Only 1% of the vehicles currently have the capability to stream real time data which produce 9 TB/day.

Executive Statement

Our competitive advantage has always been in our manufacturing process, with our ability to build better vehicles for lower cost than our competitors. However, new products with different approaches are constantly being developed, and I’m concerned that we lack the skills to undergo the next wave of transformations in our industry. My goals are to build our skills while addressing immediate market needs through incremental innovations.

Key point here is the company wants to improve their vehicles while building new skills and reducing cost.

Existing Technical Environment

TerramEarth’s existing architecture is composed of Linux and Windows-based systems that reside in a single U.S. west coast-based data center. These systems gzip CSV files from the field, upload them via FTP, and place the data in their data warehouse. Because this process takes time, aggregated reports are based on data that is 3 weeks old.

With this data, TerramEarth has been able to preemptively stock replacement parts and reduce unplanned downtime of their vehicles by 60%. However, because the data is stale, some customers are without their vehicles for up to 4 weeks while they wait for replacement parts.

Key point here is that the company is working with stale data and hence have increased downtime.

Application 1: Data ingest

A custom Python application reads uploaded datafiles from a single server, writes to the data warehouse

Compute:

  • Windows Server 2008 R2
    • 16 CPUs
    • 128 GB of RAM
    • 10 TB local HDD storage

Application 2: Reporting

An off the shelf application that business analysts use to run a daily report to see what equipment needs repair. Only 2 analysts of a team of 10 (5 west coast, 5 east coast) can connect to the reporting application at a time.

Compute

  • Off the shelf application. License tied to number of physical CPUs
    • Windows Server 2008 R2
    • 16 CPUs
    • 32 GB of RAM
    • 500 GB HDD

Data warehouse

  • A single PostgreSQL server
    • RedHat Linux
    • 64 CPUs
    • 128 GB of RAM
    • 4x 6TB HDD in RAID 0

Key points here are TerramEarth has its infrastructure in a single location – US West Coast. The data from vehicles is uploaded as CSV files through FTP on the data warehouse. As the data is delayed TerramEarth is only able to reduce unplanned downtime of their vehicles by 60%. Some customer vehicles do not have replacement parts for over 4 weeks.

Business Requirements

  • Decrease unplanned vehicle downtime to less than 1 week
    • The current bottleneck is mainly the collection of data for analytics. If data collection can be improved, i.e. more vehicles are moved to cellular connectivity, the data is available closer to real time and the feedback loop can be completed earlier.
    • Can be handled using Cloud Pub/Sub, Cloud IoT, and Cloud Dataflow to capture the cellular data, with analytics done using BigQuery and Cloud Machine Learning (see the ingestion sketch after this list).
  • Support the dealer network with more data on how their customers use their equipment to better position new products and services.
    • can be handled using running analytics over collected data regarding the usage and consumption
  • Have the ability to partner with different companies—especially with seed and fertilizer suppliers in the fast-growing agricultural business—to create compelling joint offerings for their customers.
    • can be handled by building APIs to expose the data externally and using Cloud Endpoints to expose the APIs.
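
A minimal sketch of the ingestion side, assuming the google-cloud-pubsub Python client and hypothetical project, topic, and field names: connected vehicles (or a gateway) publish telemetry to Pub/Sub, from where Dataflow can stream it into BigQuery.

```python
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Hypothetical project and topic names.
topic_path = publisher.topic_path("terramearth-prod", "vehicle-telemetry")

reading = {
    "vehicle_id": "TE-000042",
    "timestamp": "2021-06-01T10:15:00Z",
    "engine_temp_c": 87,
    "fuel_level_pct": 63,
}

# Messages are raw bytes; a Dataflow pipeline downstream can parse and load them into BigQuery.
future = publisher.publish(topic_path, json.dumps(reading).encode("utf-8"))
print("published message id:", future.result())
```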

Technical Requirements

  • Expand beyond a single datacenter to decrease latency to the American midwest and east coast
    • Can be handled using multi-regional Cloud Storage and other GCP managed services like Pub/Sub, Dataflow, and BigQuery
  • Create a backup strategy
    • data can be easily backed up in Cloud Storage or BigQuery
  • Increase security of data transfer from equipment to the datacenter
    • data can be transferred using HTTPS for more security
  • Improve data in the data warehouse
    • can be handled using BigQuery as the data warehouse solution instead of a single PostgreSQL which is limited in capability and scalability
  • Use customer and equipment data to anticipate customer needs
    • can be handled using running machine learning models over the data collected

Reference Cellular Upload Architecture

Batch Upload Replacement Architecture

Google Cloud – Mountkirk Games Case Study

Mountkirk Games makes online, session-based, multiplayer games for mobile platforms. They build all of their games using some server-side integration. Historically, they have used cloud providers to lease physical servers.

Due to the unexpected popularity of some of their games, they have had problems scaling their global audience, application servers, MySQL databases, and analytics tools.

Their current model is to write game statistics to files and send them through an ETL tool that loads them into a centralized MySQL database for reporting.

Solution Concept

Mountkirk Games is building a new game, which they expect to be very popular. They plan to deploy the game’s backend on Google Compute Engine so they can capture streaming metrics, run intensive analytics, and take advantage of its autoscaling server environment and integrate with a managed NoSQL database.

So the key here is that the company wants to deploy the new game to Google Compute Engine (not GAE or GKE), configured to scale (managed instance groups) to handle streaming metrics and analytics. It also needs to be integrated with a managed NoSQL database (either Cloud Datastore or Bigtable).

Executive Statement

Our last successful game did not scale well with our previous cloud provider, resulting in lower user adoption and affecting the game’s reputation. Our investors want more key performance indicators (KPIs) to evaluate the speed and stability of the game, as well as other metrics that provide deeper insight into usage patterns so we can adapt the game to target users. Additionally, our current technology stack cannot provide the scale we need, so we want to replace MySQL and move to an environment that provides autoscaling, low latency load balancing, and frees us up from managing physical servers.

So the key points here are the company wants to move to cloud and use more of scalable managed services without them needing to manage physical servers. Also they want to move from relational MySQL database to a managed database.

Business Requirements

  • Increase to a global footprint
    • Can be handled using Global HTTP load balancer with managed instance groups in each region.
    • Using multi-regional services like Cloud Storage, Cloud Datastore, Pub/Sub, BigQuery
  • Improve uptime – downtime is loss of players
    • can be handled using managed instance groups across zones and regions coupled with load balancer for High Availability
    • GCP managed services like Pub/Sub, Datastore provide a high uptime
  • Increase efficiency of the cloud resources we use
    • can be increased with scaling the resources as per the demand
    • can be increased further using Stackdriver metrics and monitoring
  • Reduce latency to all customers
    • can be reduced using Global HTTP load balancer, which would route the user to the closest region
    • using multi-regional resources would also help reduce latency

Technical Requirements

Requirements for Game Backend Platform

  1. Dynamically scale up or down based on game activity.
    • can be handled using autoscaling with managed instance groups with the ability to scale as per the demand
  2. Connect to a transactional database service to manage user profiles and game state.
    • can be handled using Cloud Datastore, which provides transactional NoSQL database service and is ideal for user profiles and game state
    • it does not need a relational database, so Datastore should work fine
  3. Store game activity in a timeseries database service for future analysis.
    • can be stored using BigQuery as it provides low-cost data storage (as compared to Bigtable) for analytics
    • another advantage of BigQuery over Bigtable in this case is that it is multi-regional, meeting the global footprint and latency requirements
  4. As the system scales, ensure that data is not lost due to processing backlogs.
    • seems relevant for capturing streaming data and can be handled using Pub/Sub and Cloud Dataflow
  5. Run hardened Linux distro.
    • can be handled using Google Compute Engine, as it is the only compute option that would support custom Linux distro

Requirements for Game Analytics Platform

  1. Dynamically scale up or down based on game activity.
    • can be handled using GCP managed services which scale as per the demand
    • can be handled using Global HTTP load balancer and autoscaling managed instance groups
  2. Process incoming data on the fly directly from the game servers.
    • can be handled using Pub/Sub for capturing data and Cloud Dataflow for processing the data on the fly, i.e. in real time
  3. Process data that arrives late because of slow mobile networks.
    • can be handled using Cloud Pub/Sub and Cloud DataFlow
  4. Allow queries to access at least 10 TB of historical data.
    • can be handled using BigQuery for storage and analytics
  5. Process files that are regularly uploaded by users’ mobile devices.
    • more relevant for batch processing, which can be handled by storing the files in Cloud Storage and processing them using Cloud Dataflow

Reference Architecture

Refer to Mobile Gaming Analysis Telemetry solution

GCP Shared VPC

  • Shared VPC allows an organization to connect resources from multiple projects to a common VPC network, so that they can communicate with each other securely and efficiently using internal IPs from that network.
  • Shared VPC requires designating a project as a host project and attaching one or more other service projects to it.
  • Shared VPC allows organization administrators to delegate administrative responsibilities, such as creating and managing instances, to Service Project Admins while maintaining centralized control over network resources like subnets, routes, and firewalls.
  • Shared VPC allows you to
    • implement a security best practice of least privilege for network administration, auditing, and access control.
    • apply and enforce consistent access control policies at the network level for multiple service projects in the organization while delegating administrative responsibilities
    • use service projects to separate budgeting or internal cost centers.

Shared VPC Concepts

GCP Shared VPC - Multiple host projects

  • Shared VPC connects projects within the same organization. Participating host and service projects cannot belong to different organizations
  • Linked projects can be in the same or different folders, but if they are in different folders the admin must have Shared VPC Admin rights to both folders
  • Each project in Shared VPC is either a host project or a service project
    • host project contains one or more Shared VPC networks. A Shared VPC Admin must first enable a project as a host project. After that, a Shared VPC Admin can attach one or more service projects to it.
    • service project is any project that has been attached to a host project by a Shared VPC Admin. This attachment allows it to participate in Shared VPC.
  • A project cannot be both a host and a service project simultaneously. Thus, a service project cannot be a host project to further service projects.
  • Multiple host projects can be created; however, each service project can only be attached to a single host project.
  • A project that does not participate in Shared VPC is called a standalone project.
  • VPC networks in the host project are called Shared VPC networks. Service project resources can use subnets in the Shared VPC network
  • Shared VPC networks can be either auto or custom mode, but legacy networks are not supported.
  • Host and service projects are connected by attachments at the project level.
  • Subnets of the Shared VPC networks in the host project are accessible by Service Project Admins
  • Organization policies and IAM permissions work together to provide different levels of access control.
  • Organization policies enable setting controls at the organization, folder, or project level.

Reference

GCP Virtual Private Cloud – Shared VPC

GCP VPC Peering

  • GCP VPC Network Peering allows internal IP address (private) connectivity across two VPC networks, regardless of whether they belong to the same project or the same organization.
  • VPC Network Peering enables VPC networks connection, so that workloads in different VPC networks can communicate internally.
  • Traffic stays within Google’s network and doesn’t traverse the public internet.
  • VPC Network Peering provides the following advantages over using external IP addresses or VPNs to connect networks:
    • Network Latency – connectivity uses only internal addresses and hence provides lower latency than connectivity that uses external addresses.
    • Network Security – service owners do not need to have their services exposed to the public Internet and deal with its associated risks.
    • Network Cost – GCP charges for egress bandwidth (outbound traffic) when networks use external IPs to communicate, even if the traffic is within the same zone. Peered networks use internal IPs to communicate, saving on those egress costs.
  • VPC Network Peering is useful in these environments:
    • SaaS (Software-as-a-Service) ecosystems in Google Cloud, which can be made available privately across different VPC networks within and across organizations.
    • Organizations that have several network administrative domains that need to communicate using internal IP addresses.

VPC Peering Properties

  • VPC Network Peering works with Compute Engine, GKE, and App Engine flexible environment.
  • Peered VPC networks remain administratively separate. Routes, firewalls, VPNs, and other traffic management tools are administered and applied separately in each of the VPC networks.
  • Each side of a peering association is set up independently. Peering will be active only when the configuration from both sides matches. Either side can choose to delete the peering association at any time.
  • VPC peers always exchange subnet routes that don’t use privately used public IP addresses. Networks must explicitly export privately used public IP subnet routes for other networks to use them and must explicitly import privately used public IP subnet routes to receive them from other networks.
  • Subnet and static routes are global. Dynamic routes can be regional or global, depending on the VPC network’s dynamic routing mode.
  • A VPC network can peer with multiple VPC networks (currently limited to 25)
  • IAM permissions for creating and deleting VPC Network Peering are included as part of the Compute Network Admin role.
  • Peering traffic (traffic flowing between peered networks) has the same latency, throughput, and availability as private traffic in the same network.
  • Billing policy for peering traffic is the same as the billing policy for private traffic in the same network.
  • An organization policy administrator can use an organization policy to constrain which VPC networks can peer with VPC networks in the organization. Peering connections to particular VPC networks or to VPC networks in a particular folder or organization can be denied. The constraint applies to new peering configurations and doesn’t affect existing connections. An existing peering connection can continue to work even if a new policy denies new connections.

VPC Peering Restrictions

  • A subnet CIDR range in one peered VPC network cannot overlap with a static route in another peered network. This rule covers both subnet routes and static routes

GCP VPC Peering - Overlapping Subnet IP ranges between two peers

  • A dynamic route can overlap with a subnet route in a peer network. For dynamic routes, the destination ranges that overlap with a subnet route from the peer network are silently dropped. Google Cloud uses the subnet route.
  • Only VPC networks are supported for VPC Network Peering. Peering is NOT supported for legacy networks.
  • Subnet route exchange can’t be disabled, and which subnet routes are exchanged cannot be selected. After peering is established, all resources in the subnet IP ranges are accessible across directly peered networks.
  • VPC Network Peering doesn’t provide granular route controls to filter out which subnet CIDR ranges are reachable across peered networks. It needs to be done using firewall rules
  • Transitive peering is NOT supported.
  • Tags or service accounts from one peered network CANNOT be used in the other peered network.
  • Compute Engine internal DNS names created in a network are NOT accessible to peered networks. Use the IP address instead.
  • By default, VPC Network Peering with GKE is supported when used with IP aliases. If you don’t use IP aliases, custom routes can be exported so that GKE containers are reachable from peered networks.

Reference

Google Cloud – VPC Peering