Google Cloud Interconnect

  • Google Cloud Interconnect provides two options for extending the on-premises network to the VPC networks in Google Cloud.
    • Dedicated Interconnect (Dedicated connection) provides a direct physical connection between the on-premises network and Google’s network
    • Partner Interconnect (Use a service provider) provides connectivity between the on-premises and VPC networks through a supported service provider.
  • Cloud Interconnect provides access to all Google Cloud products and services from the on-premises network except Google Workspace.
  • Cloud Interconnect also allows access to supported APIs and services by using Private Google Access from on-premises hosts.

Dedicated Interconnect

  • Dedicated Interconnect provides direct physical connections between the on-premises network and Google’s network.
  • Dedicated Interconnect enables the transfer of large amounts of data between networks, which can be more cost-effective than purchasing additional bandwidth over the public internet.
  • Dedicated Interconnect requires your network to physically meet Google’s network in a colocation facility with your own routing equipment
  • Dedicated Interconnect supports only dynamic (BGP) routing via Cloud Router
  • Dedicated Interconnect supports bandwidth from 10 Gbps minimum to 200 Gbps maximum (as 8 x 10-Gbps or 2 x 100-Gbps circuits)
  • Each VLAN attachment must be associated with a Cloud Router.
  • Cloud Router creates a BGP session for the VLAN attachment and its corresponding on-premises peer router.
  • Cloud Router receives the routes that the on-premises router advertises. These routes are added as custom dynamic routes in the VPC network.
  • Cloud Router also advertises routes for Google Cloud resources to the on-premises peer router.

Google Cloud Dedicated Interconnect

Dedicated Interconnect Provisioning

  • Find a colocation facility with a Google Cloud Point of Presence (PoP) that offers Dedicated Interconnect connections
  • Order an Interconnect connection so that Google can allocate the necessary resources and send a Letter of Authorization and Connecting Facility Assignment (LOA-CFA).
  • The LOA-CFA is sent via email to the NOC (technical contact) or can be downloaded from the Google Cloud console.
  • Submit the LOA-CFA to your vendor (the colocation facility provider) so that they can provision the Interconnect connections between Google’s network and your network.
  • Configure and test the connections with Google before you can use them.
  • Create VLAN attachments to allocate a VLAN on the connection.
  • Configure the on-premises router to establish a BGP session with the Cloud Router
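A minimal gcloud sketch of these provisioning steps (resource names such as my-interconnect, my-router, my-attachment, the location, and the region are placeholders):

  # Order the Interconnect connection; this triggers the LOA-CFA
  gcloud compute interconnects create my-interconnect \
      --customer-name="Example Corp" \
      --interconnect-type=DEDICATED \
      --link-type=LINK_TYPE_ETHERNET_10G_LR \
      --requested-link-count=1 \
      --location=my-colocation-facility

  # Create a VLAN attachment associated with a Cloud Router
  gcloud compute interconnects attachments dedicated create my-attachment \
      --interconnect=my-interconnect \
      --router=my-router \
      --region=us-central1

  # Add a router interface and BGP peer for the attachment
  # (the peer IP below is a placeholder; the real one comes from
  # the link-local /29 that Google allocates to the attachment)
  gcloud compute routers add-interface my-router \
      --interface-name=if-my-attachment \
      --interconnect-attachment=my-attachment \
      --region=us-central1
  gcloud compute routers add-bgp-peer my-router \
      --peer-name=bgp-on-prem \
      --interface=if-my-attachment \
      --peer-ip-address=169.254.10.2 \
      --peer-asn=65010 \
      --region=us-central1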

Dedicated Interconnect Redundancy

  • A single Dedicated Interconnect connection does not offer redundancy or high availability
  • Google recommends redundancy using 2 (99.9%) or 4 (99.99%) interconnect connections so that if one connection fails, the other connection can continue to serve traffic
  • Redundant Interconnect connection with 2 connections must be created in the same metropolitan area (city) as the existing one, but in a different edge availability domain (metro availability zone).
  • Redundant Interconnect connection with 4 connections must be created as 2 connections in each of two different metropolitan areas (cities), with each pair of connections placed in different edge availability domains (metro availability zones)
  • Dynamic routing mode for the VPC network must be global so that Cloud Router can advertise all subnets and propagate learned routes to all subnets regardless of the subnet’s region.
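Switching the VPC network to global dynamic routing is a single command (my-vpc is a placeholder network name):

  # Let Cloud Router advertise and propagate routes across all regions
  gcloud compute networks update my-vpc --bgp-routing-mode=global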

Google Cloud Dedicated Interconnect Redundancy

Partner Interconnect

  • Partner Interconnect provides connectivity between the on-premises network and the VPC network through a supported service provider
  • A Partner Interconnect connection is useful if the data center is in a physical location that can’t reach a Dedicated Interconnect colocation facility, or the data needs don’t warrant an entire 10-Gbps connection.
  • Partner Interconnect supports bandwidth from 50 Mbps minimum to 10 Gbps maximum.
  • Service providers have existing physical connections to Google’s network that they make available for their customers to use.
  • After the connectivity with a service provider is established, a Partner Interconnect connection from the service provider can be requested.
  • After the service provider provisions the connection, you can start passing traffic between your networks by using the service provider’s network.
  • Partner Interconnect provides Layer 2 and Layer 3 connectivity
    • For Layer 2 connections
      • you must configure and establish a BGP session between the Cloud Routers and on-premises routers for each created VLAN attachment
      • BGP configuration information is provided by the VLAN attachment after your service provider has configured it.
    • For Layer 3 connections
      • The service provider establishes a BGP session between the Cloud Routers and their edge routers for each VLAN attachment.
      • You don’t need to configure BGP on the on-premises router. Google and the service provider automatically set the correct configuration

Google Cloud Partner Interconnect

Partner Interconnect Provisioning

  • Connect the on-premises network to a supported service provider.
  • Create a VLAN attachment for a Partner Interconnect connection in the Google Cloud project, which generates a unique pairing key that must be used to request a connection from the service provider.
  • Activate the connection
  • Depending on the connection, either you or your service provider then establishes a Border Gateway Protocol (BGP) session.
  • Partner Interconnect provisioning does not require LOA-CFA
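A hedged gcloud sketch of this workflow (attachment, router, and region names are placeholders):

  # Create the VLAN attachment; this generates the pairing key
  gcloud compute interconnects attachments partner create my-partner-attachment \
      --region=us-central1 \
      --router=my-router \
      --edge-availability-domain=availability-domain-1

  # Retrieve the pairing key to submit to the service provider
  gcloud compute interconnects attachments describe my-partner-attachment \
      --region=us-central1 --format="value(pairingKey)"

  # Activate the attachment once the provider has provisioned it
  gcloud compute interconnects attachments partner update my-partner-attachment \
      --region=us-central1 --admin-enabled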

Partner Interconnect Redundancy

  • A single Partner Interconnect connection does not offer redundancy or high availability
  • 99.9% availability requires
    • At least two VLAN attachments in a single Google Cloud region, in separate edge availability domains (metro availability zones).
    • At least one Cloud Router, connected to both VLAN attachments.
  • 99.99% availability requires
    • At least four VLAN attachments across two metros, two per metro, placed in separate edge availability domains (metro availability zones)
    • Two Cloud Routers (one in each Google Cloud region of a VPC network).
    • Associate one Cloud Router with each pair of VLAN attachments.
  • Dynamic routing mode for the VPC network must be global so that Cloud Router can advertise all subnets and propagate learned routes to all subnets regardless of the subnet’s region.

Google Cloud Partner Interconnect Redundancy – Layer 2

Cloud Interconnect Security

  • Cloud Interconnect does not encrypt the connection between your network and Google’s network.
  • Currently, Cloud VPN can’t be used with Dedicated Interconnect.
  • For additional security, use application-level encryption or your own VPN.

Dedicated Interconnect vs Partner Interconnect

  • When choosing between Dedicated Interconnect and Partner Interconnect, consider the connection requirements, such as the connection location and capacity.
    • If you can’t physically meet Google’s network in a colocation facility to reach your VPC networks, you can use Partner Interconnect to connect to service providers that connect directly to Google.
    • If you have high bandwidth needs, Dedicated Interconnect can be a cost-effective solution.
    • If you require a lower bandwidth solution, Partner Interconnect provides capacity options starting at 50 Mbps.

GCP Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • GCP services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • GCP exam questions are not updated to keep up the pace with GCP updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. Your company has decided to build a backup replica of their on-premises user authentication PostgreSQL database on Google
    Cloud Platform. The database is 4 TB, and large updates are frequent. Replication requires private address space communication.
    Which networking approach should you use?

    1. Google Cloud Dedicated Interconnect
    2. Google Cloud VPN connected to the data center network
    3. A NAT and TLS translation gateway installed on-premises
    4. A Google Compute Engine instance with a VPN server installed connected to the data center network
  2. A company wants to connect cloud applications to an Oracle database in its data center. Requirements are a maximum of 20 Gbps
    of data and a Service Level Agreement (SLA) of 99%. Which option best suits the requirements?

    1. Implement a high-throughput Cloud VPN connection
    2. Cloud Router with VPN
    3. Dedicated Interconnect
    4. Partner Interconnect
  3. A company wants to connect cloud applications to an Oracle database in its data center. Requirements are a maximum of 9 Gbps
    of data and a Service Level Agreement (SLA) of 99%. Which option best suits the requirements?

    1. Implement a high-throughput Cloud VPN connection
    2. Cloud Router with VPN
    3. Dedicated Interconnect
    4. Partner Interconnect

Google Cloud DNS

  • Cloud DNS is a high-performance, resilient, reliable, low-latency, global Domain Name System (DNS) service that publishes the domain names to the global DNS in a cost-effective way.
  • Cloud DNS helps to publish the zones and records in DNS without the burden of managing your own DNS servers and software.
  • Cloud DNS offers both public zones and private managed DNS zones.
    • A public zone is visible to the public internet
    • A private zone is visible only from one or more specified VPC networks
    • Google Cloud also creates internal DNS names for VMs automatically, even if you do not use Cloud DNS
  • With Shared VPC, the Cloud DNS managed private zone, Cloud DNS peering zone, or Cloud DNS forwarding zone must be created in the host project
  • Google Cloud offers inbound and outbound DNS forwarding for private zones
  • Cloud DNS offers DNS forwarding zones and DNS server policies to allow lookups of DNS names between the on-premises and Google Cloud environment

DNS Server Policies

  • DNS Server Policies can specify inbound DNS forwarding, outbound DNS forwarding, or both.
  • Inbound server policy refers to a policy that permits inbound DNS forwarding, i.e., on-premises to VPC
  • Outbound server policy refers to one possible method for implementing outbound DNS forwarding, i.e., VPC to on-premises
  • It is possible for a policy to be both an inbound server policy and an outbound server policy if it implements the features of both.
  • A DNS server policy is similar to a DNS forwarding zone, except that it applies to all traffic on the network and not a single specific domain
  • DNS Outbound Policy disables internal DNS for the selected networks
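A minimal sketch of the two policy types (policy and network names are placeholders; note that a VPC network can be associated with at most one server policy, which may combine both features):

  # Inbound server policy: exposes the VPC's internal resolver to on-premises clients
  gcloud dns policies create inbound-policy \
      --description="Allow on-premises to VPC DNS forwarding" \
      --networks=my-vpc-1 \
      --enable-inbound-forwarding

  # Outbound server policy: sends all DNS queries from the VPC to alternative servers
  gcloud dns policies create outbound-policy \
      --description="Forward all VPC DNS to on-premises" \
      --networks=my-vpc-2 \
      --alternative-name-servers=192.168.1.5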

DNS Forwarding Zones

  • Cloud DNS forwarding zones help configure target name servers for specific private zones.
  • Using a forwarding zone is one way to implement outbound DNS forwarding from the VPC network.
  • A Cloud DNS forwarding zone is a special type of Cloud DNS private zone. Instead of creating records within the zone, you specify a set of forwarding targets.
  • Each forwarding target is an IP address of a DNS server, located in the VPC network, or in an on-premises network connected to the VPC network by Cloud VPN or Cloud Interconnect.
  • Cloud DNS caches responses for queries sent to Cloud DNS forwarding zones
  • DNS forwarding does not work between two Google Cloud environments (VPC networks); use DNS peering for that instead
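A hedged example of a forwarding zone that targets an on-premises name server (zone name, domain, network, and target IP are placeholders):

  # Forward lookups for corp.example.com. from the VPC to on-premises DNS
  gcloud dns managed-zones create corp-forwarding-zone \
      --description="Forward corp.example.com to on-premises DNS" \
      --dns-name=corp.example.com. \
      --visibility=private \
      --networks=my-vpc \
      --forwarding-targets=192.168.1.5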

DNS Peering

  • DNS peering allows sending requests for records that come from one zone’s namespace to another VPC network, e.g., a SaaS provider can give a SaaS customer access to DNS records it manages.
  • To provide DNS peering,
    • Cloud DNS peering zone must be created and configured to perform DNS lookups in a VPC network where the records for that zone’s namespace are available.
    • The VPC network where the DNS peering zone performs lookups is called the DNS producer network.
  • To use DNS peering,
    • A network must be authorized to use a peering zone.
    • The VPC network authorized to use the peering zone is called the DNS consumer network.
  • DNS peering and VPC Network Peering are different services. DNS peering can be used with VPC Network Peering, but VPC Network Peering is NOT required for DNS peering. VPC peering does not enable DNS peering and must be setup explicitly.
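A sketch of a peering zone created in the DNS consumer network (all names are placeholders; the producer VPC can live in another project):

  # Resolve svc.example.com. using the producer network's name resolution order
  gcloud dns managed-zones create my-peering-zone \
      --description="Peer DNS lookups into the producer network" \
      --dns-name=svc.example.com. \
      --visibility=private \
      --networks=consumer-vpc \
      --target-network=producer-vpc \
      --target-project=producer-project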

Cloud DNS Forwarding and Peering

VPC Name Resolution Order

  • Each VPC network provides DNS name resolution services to the VM instances that use it.
  • When VMs use their metadata server 169.254.169.254 as their name server, Google Cloud searches for DNS records in the following order:
    • If the VPC network has an outbound server policy, Google Cloud forwards all DNS queries to those alternative servers. The VPC name resolution order consists only of this step.
    • If the VPC network does not have an outbound server policy:
      • Google Cloud tries to find a private zone that matches as much of the requested record as possible (longest suffix matching).
        • Searching records that you created in private zones.
        • Querying the forwarding targets for forwarding zones.
        • Querying the name resolution order of another VPC network by using peering zones.
      • Searches the automatically created Compute Engine internal DNS records for the project.
      • Queries publicly available zones

DNSSEC

  • DNSSEC is a feature of DNS that authenticates responses to domain name lookups
  • DNSSEC protects the domains from spoofing and cache poisoning attacks
  • DNSSEC provides strong authentication for domain lookups, but it does not provide encryption
  • Both the registrar and registry must support DNSSEC for the Top Level Domain (TLD) used
  • For Enabling DNSSEC
    • Enable DNSSEC on the domain. DNS zone for the domain must serve special DNSSEC records for public keys (DNSKEY), signatures (RRSIG), and non-existence (NSEC, or NSEC3 and NSEC3PARAM) to authenticate the zone’s contents.
    • DS record must be added to the TLD at the registrar
    • DNS resolver that validates signatures for DNSSEC-signed domains must be used
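Enabling DNSSEC on a Cloud DNS public zone is a one-line change (the zone name is a placeholder); the DS record still has to be added at the registrar:

  # Turn on DNSSEC signing for the zone
  gcloud dns managed-zones update my-zone --dnssec-state on

  # List the zone's DNSKEY records, needed to build the DS record for the registrar
  gcloud dns dns-keys list --zone=my-zone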


Google Cloud Network Endpoint Groups – NEG

  • Network Endpoint Groups (NEG) is a configuration object that specifies a group of backend endpoints or services.
  • Network Endpoint Groups provides a logical grouping of IP addresses and ports for software services instead of entire VMs.
  • NEGs can be used as backends for External and Internal HTTP(S) load balancers, TCP/SSL Proxy load balancers, and with Traffic Director
  • Zonal NEG
    • contains one or more endpoints that can be Compute Engine VMs or services running on the VMs.
    • are zonal resources that represent collections of either IP addresses or IP address/port combinations for Google Cloud resources within a single subnet.
    • Each endpoint is specified either by an IP address or an IP:port combination.
    • If a zonal NEG is used as a backend for a backend service, all other backends in that backend service must also be zonal NEGs.
    • Zonal NEG can be used as a backend for more than one backend service
    • Backend services using zonal NEGs for backends only support balancing modes of RATE or CONNECTION. UTILIZATION is not supported
  • Internet NEG
    • contains a single endpoint that is hosted outside of Google Cloud. This endpoint is specified by hostname FQDN:port or IP:port.
    • can use an internet NEG as the backend for a backend service for a Google Cloud external HTTP(S) load balancer.
    • does not support other load balancer types.
    • ideal to serve content from an origin hosted outside of Google Cloud, and needs to be fronted by external HTTP(S) load balancer
    • allows you to
      • Use Google Edge infrastructure for terminating the user connection
      • Direct the connections to your custom origin.
      • Use Cloud CDN for your custom origin.
      • Deliver traffic to the public endpoint across Google’s private backbone, which improves reliability and can decrease latency between client and server.
  • Serverless NEG
    • points to Cloud Run, App Engine, Cloud Functions services residing in the same region as the NEG.
  • Zonal and internet NEGs define how endpoints should be reached, whether they are reachable, and where they are located.
  • Serverless NEGs don’t contain endpoints.
  • A hybrid connectivity NEG points to Traffic Director services running outside Google Cloud.
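A hedged sketch of creating each NEG type with gcloud (all names, zones, regions, and services are placeholders):

  # Zonal NEG: IP:port endpoints for Google Cloud resources in a single subnet
  gcloud compute network-endpoint-groups create my-zonal-neg \
      --zone=us-central1-a \
      --network=my-vpc \
      --subnet=my-subnet \
      --network-endpoint-type=GCE_VM_IP_PORT

  # Internet NEG: a single external endpoint behind an external HTTP(S) LB
  gcloud compute network-endpoint-groups create my-internet-neg \
      --global \
      --network-endpoint-type=INTERNET_FQDN_PORT

  # Serverless NEG: points to a Cloud Run service in the same region
  gcloud compute network-endpoint-groups create my-serverless-neg \
      --region=us-central1 \
      --network-endpoint-type=SERVERLESS \
      --cloud-run-service=my-service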

Google Cloud Network Endpoint Groups


Google Kubernetes Engine – GKE

  • Google Kubernetes Engine – GKE provides a managed environment for deploying, managing, and scaling containerized applications using Google infrastructure.

Standard vs Autopilot Cluster

  • Standard
    • Provides advanced configuration flexibility over the cluster’s underlying infrastructure.
    • Cluster configurations needed for the production workloads are determined by you
  • Autopilot
    • Provides a fully provisioned and managed cluster configuration.
    • Cluster configuration options are made for you.
    • Autopilot clusters are pre-configured with an optimized cluster configuration that is ready for production workloads.
    • GKE manages the entire underlying infrastructure of the clusters, including the control plane, nodes, and all system components.
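The two modes map to two create commands (cluster names and region are placeholders):

  # Autopilot: Google manages nodes and system components
  gcloud container clusters create-auto my-autopilot-cluster --region=us-central1

  # Standard: you choose machine types, node counts, and other infrastructure settings
  gcloud container clusters create my-standard-cluster \
      --region=us-central1 --num-nodes=1 --machine-type=e2-standard-4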

GKE - Autopilot vs Standard Clusters

Zonal vs Regional Cluster

  • Zonal clusters
    • Zonal clusters have a single control plane in a single zone.
    • Depending on the availability requirements, nodes for the zonal cluster can be distributed in a single zone or in multiple zones.
    • Single-zone clusters
      • Master -> Single Zone & Workers -> Single Zone
      • A single-zone cluster has a single control plane running in one zone
      • Control plane manages workloads on nodes running in the same zone
    • Multi-zonal clusters
      • Master -> Single Zone & Workers -> Multi-Zone
      • A multi-zonal cluster has a single replica of the control plane running in a single zone and has nodes running in multiple zones.
      • During an upgrade of the cluster or an outage of the zone where the control plane runs, workloads still run. However, the cluster, its nodes, and its workloads cannot be configured until the control plane is available.
      • Multi-zonal clusters balance availability and cost for consistent workloads.
  • Regional clusters
    • Master -> Multi Zone & Workers -> Multi-Zone
    • A regional cluster has multiple replicas of the control plane, running in multiple zones within a given region.
    • Nodes also run in each zone where a replica of the control plane runs.
    • Because a regional cluster replicates the control plane and nodes, it consumes more Compute Engine resources than a similar single-zone or multi-zonal cluster.
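A sketch of the three topologies (cluster names, zones, and region are placeholders):

  # Single-zone: control plane and nodes in one zone
  gcloud container clusters create single-zone-cluster --zone=us-central1-a

  # Multi-zonal: one control plane zone, nodes spread across zones
  gcloud container clusters create multi-zonal-cluster \
      --zone=us-central1-a \
      --node-locations=us-central1-a,us-central1-b

  # Regional: control plane replicas and nodes across the region's zones
  gcloud container clusters create regional-cluster --region=us-central1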

GKE Zonal vs Regional Cluster

Route-Based Cluster vs VPC-Native Cluster

Refer blog post @ Google Kubernetes Engine Networking

Private Cluster

  • Private clusters help isolate nodes from having inbound and outbound connectivity to the public internet by providing nodes with internal IP addresses only.
  • External clients can still reach the services exposed as a load balancer by calling the external IP address of the HTTP(S) load balancer
  • Cloud NAT or self-managed NAT gateway can provide outbound internet access for certain private nodes
  • By default, Private Google Access is enabled, which provides private nodes and their workloads with limited outbound access to Google Cloud APIs and services over Google’s private network.
  • The defined VPC network contains the cluster nodes, and a separate Google Cloud VPC network contains the cluster’s control plane.
  • The control plane’s VPC network is located in a project controlled by Google. The Control plane’s VPC network is connected to the cluster’s VPC network with VPC Network Peering. Traffic between nodes and the control plane is routed entirely using internal IP addresses.
  • Control plane for a private cluster has a private endpoint in addition to a public endpoint
  • Control plane public endpoint access level can be controlled
    • Public endpoint access disabled
      • Most secure option as it prevents all internet access to the control plane
      • The cluster can be accessed using a bastion host/jump server, or from the on-premises network if Cloud Interconnect or Cloud VPN has been configured to connect to Google Cloud.
      • Authorized networks must be configured for the private endpoint, which must be internal IP addresses
    • Public endpoint access enabled, authorized networks enabled:
      • Provides restricted access to the control plane from defined source IP addresses
    • Public endpoint access enabled, authorized networks disabled
      • Default and least restrictive option.
      • Publicly accessible from any source IP address as long as you authenticate.
  • Nodes always contact the control plane using the private endpoint.
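A hedged sketch of a private cluster with restricted public endpoint access (names and CIDRs are placeholders):

  # Nodes get internal IPs only; the control plane peers over 172.16.0.0/28;
  # the public endpoint only accepts connections from 203.0.113.0/24
  gcloud container clusters create my-private-cluster \
      --enable-ip-alias \
      --enable-private-nodes \
      --master-ipv4-cidr=172.16.0.0/28 \
      --enable-master-authorized-networks \
      --master-authorized-networks=203.0.113.0/24

  # Add --enable-private-endpoint to disable the public endpoint entirely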

Shared VPC Clusters

  • Shared VPC supports both zonal and regional clusters.
  • Shared VPC supports VPC-native clusters and must have Alias IPs enabled. Legacy networks are not supported

Node Pools

  • A node pool is a group of nodes within a cluster that all have the same configuration and are identical to one another.
  • Node pools use a NodeConfig specification.
  • Each node in the pool has the Kubernetes node label cloud.google.com/gke-nodepool, which has the node pool’s name as its value.
  • The number and type of nodes specified during cluster creation become the default node pool. Additional custom node pools of different sizes and types can be added to the cluster, e.g., local SSDs, GPUs, preemptible VMs, or different machine types
  • Node pools can be created, upgraded, or deleted individually without affecting the whole cluster (see the sketch after this list). However, a single node in a node pool cannot be configured; any configuration changes affect all nodes in the node pool.
  • You can resize node pools in a cluster by adding or removing nodes using gcloud container clusters resize CLUSTER_NAME --node-pool POOL_NAME --num-nodes NUM_NODES
  • Existing node pools can be manually upgraded or automatically upgraded.
  • For a multi-zonal or regional cluster, all of the node pools are replicated to those zones automatically. Any new node pool is automatically created or deleted in those zones.
  • GKE drains all the nodes in the node pool when a node pool is deleted
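Typical node pool operations, with placeholder cluster and pool names:

  # Add a custom node pool with a different machine type
  gcloud container node-pools create high-mem-pool \
      --cluster=my-cluster --machine-type=n2-highmem-16 --num-nodes=3

  # Resize an existing pool
  gcloud container clusters resize my-cluster \
      --node-pool=high-mem-pool --num-nodes=5

  # Delete a pool; GKE drains its nodes first
  gcloud container node-pools delete high-mem-pool --cluster=my-cluster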

Cluster Autoscaler

  • GKE’s cluster autoscaler automatically resizes the number of nodes in a given node pool, based on the demands of the workloads.
  • Cluster autoscaler is automatic by specifying the minimum and maximum size of the node pool and does not require manual intervention.
  • Cluster autoscaler increases or decreases the size of the node pool automatically, based on the resource requests (rather than actual resource utilization) of Pods running on that node pool’s nodes
    • If Pods are unschedulable because there are not enough nodes in the node pool, cluster autoscaler adds nodes, up to the maximum size of the node pool.
    • If nodes are under-utilized, and all Pods could be scheduled even with fewer nodes in the node pool, Cluster autoscaler removes nodes, down to the minimum size of the node pool. If the node cannot be drained gracefully after a timeout period (currently 10 minutes – not configurable), the node is forcibly terminated.
  • Before enabling cluster autoscaler, design the workloads to tolerate potential disruption or ensure that critical Pods are not interrupted.
  • Workloads might experience transient disruption with autoscaling, esp. with workloads running with a single replica.
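Autoscaling is enabled per node pool by setting its bounds (names and limits are placeholders):

  gcloud container clusters update my-cluster \
      --node-pool=my-pool \
      --enable-autoscaling --min-nodes=1 --max-nodes=10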

Auto-upgrading nodes

  • Node auto-upgrades help keep the nodes in the GKE cluster up-to-date with the cluster control plane (master) version when the control plane is updated on your behalf.
  • Node auto-upgrade is enabled by default when a new cluster or node pool is created with the Google Cloud console or the gcloud command.
  • Node auto-upgrades provide several benefits:
    • Lower management overhead – no need to manually track and update the nodes when the control plane is upgraded on your behalf.
    • Better security – GKE automatically ensures that security updates are applied and kept up to date.
    • Ease of use – provides a simple way to keep the nodes up to date with the latest Kubernetes features.
  • Node pools with auto-upgrades enabled are scheduled for upgrades when they meet the selection criteria. Rollouts are phased across multiple weeks to ensure cluster and fleet stability
  • During the upgrade, nodes are drained and re-created to match the current control plane version. Modifications on the boot disk of a node VM do not persist across node re-creations. To preserve modifications across node re-creation, use a DaemonSet.
  • Enabling auto-upgrades does not cause the nodes to upgrade immediately
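Auto-upgrade can be toggled per node pool (names are placeholders):

  gcloud container node-pools update my-pool \
      --cluster=my-cluster --enable-autoupgrade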

GKE Security

Google Kubernetes Engine – GKE Security

GCP Certification Exam Practice Questions

  1. Your existing application running in Google Kubernetes Engine (GKE) consists of multiple pods running on four GKE n1-standard-2 nodes. You need to deploy additional pods requiring n2-highmem-16 nodes without any downtime. What should you do?
    1. Use gcloud container clusters upgrade . Deploy the new services.
    2. Create a new Node Pool and specify machine type n2-highmem-16 . Deploy the new pods.
    3. Create a new cluster with n2-highmem-16 nodes. Redeploy the pods and delete the old cluster.
    4. Create a new cluster with both n1-standard-2 and n2-highmem-16 nodes. Redeploy the pods and delete the old cluster.


Google Kubernetes Engine Networking

IP allocation

Kubernetes uses various IP ranges to assign IP addresses to Nodes, Pods, and Services.

  • Node IP
    • Each node has an IP address assigned from the cluster’s VPC network.
    • Node IP provides connectivity from control components like kube-proxy and kubelet to the Kubernetes API server.
    • Node IP is the node’s connection to the rest of the cluster.
  • Pod CIDR or Address Range
    • Each node has a pool of IP addresses that GKE assigns to the Pods running on that node (a /24 CIDR block by default).
  • Pod Address
    • Each Pod has a single IP address assigned from the Pod CIDR range of its node.
    • Pod IP address is shared by all containers running within the Pod and connects them to other Pods running in the cluster.
  • Service Address Range
    • Each Service has an IP address, called the ClusterIP, assigned from the cluster’s VPC network.
  • For Standard clusters
    • a maximum of 110 Pods can run on a node with a /24 range, not 256 as you might expect. This provides a buffer so that Pods don’t become unschedulable due to a transient lack of IP addresses in the Pod IP range for a given node.
    • For ranges smaller than /24, roughly half as many Pods can be scheduled as IP addresses in the range.
  • Autopilot clusters can run a maximum of 32 Pods per node.

GKE Cluster Networking Types

  • In GKE, clusters can be distinguished according to the way they route traffic from one Pod to another Pod.
    • VPC-native cluster: A cluster that uses alias IP address ranges
    • Routes-based cluster: A cluster that uses custom static routes in a VPC network

VPC-Native Clusters

  • VPC-native cluster uses alias IP address ranges
  • VPC-native is the recommended network mode for new clusters
  • VPC-native clusters have several benefits:
    • Pod IP addresses are natively routable within the cluster’s VPC network and other VPC networks connected to it by VPC Network Peering.
    • Pod IP address ranges, and subnet secondary IP address ranges in general, are accessible from on-premises networks connected with Cloud VPN or Cloud Interconnect using Cloud Routers.
    • Pod IP addresses are reserved in the VPC network before the Pods are created in the cluster. This prevents conflict with other resources in the VPC network and allows you to better plan IP address allocations.
    • Pod IP address ranges do not depend on custom static routes and do not consume the system-generated and custom static routes quota. Instead, automatically generated subnet routes handle routing for VPC-native clusters.
    • Firewall rules can be created that apply to just Pod IP address ranges instead of any IP address on the cluster’s nodes.

VPC-Native Clusters IP Allocation

Google Kubernetes Engine Networking VPC-Native Cluster IP Management

  • VPC-native cluster uses three unique subnet IP address ranges
    • Subnet’s primary IP address range for all node IP addresses.
      • Node IP addresses are assigned from the primary IP address range of the subnet associated with the cluster.
      • Both node IP addresses and the size of the subnet’s secondary IP address range for Pods limit the number of nodes that a cluster can support
    • One secondary IP address range for all Pod IP addresses.
      • Pod IP addresses are taken from the cluster subnet’s secondary IP address range for Pods.
      • By default, GKE allocates a /24 alias IP range (256 addresses) to each node for the Pods running on it.
      • On each node, those 256 alias IP addresses support up to 110 Pods.
      • Pod Address Range cannot be changed. If exhausted,
        • a new cluster with a larger Pod address range must be created or
        • node pools should be recreated after decreasing the --max-pods-per-node for the node pools.
    • Another secondary IP address range for all Service (cluster IP) addresses.
      • Service (cluster IP) addresses are taken from the cluster’s subnet’s secondary IP address range for Services.
      • Service address range should be large enough to provide addresses for all the Kubernetes Services hosted in the cluster.
  • Node, Pod, and Services IP address ranges must all be unique and subnets with overlapping primary and secondary IP addresses cannot be created.
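A hedged sketch of a VPC-native cluster with explicitly sized ranges (names and CIDRs are placeholders, and the three ranges must not overlap):

  # Nodes draw from the subnet's primary range; Pods from 10.4.0.0/14;
  # Services from 10.0.32.0/20
  gcloud container clusters create my-vpc-native-cluster \
      --enable-ip-alias \
      --network=my-vpc --subnetwork=my-subnet \
      --cluster-ipv4-cidr=10.4.0.0/14 \
      --services-ipv4-cidr=10.0.32.0/20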

Routes-based Cluster

  • A routes-based cluster uses custom static routes in a VPC network, i.e., it uses Google Cloud Routes to route traffic between nodes
  • In a routes-based cluster,
    • each node is allocated a /24 range of IP addresses for Pods.
    • With a /24 range, there are 256 addresses, but the maximum number of Pods per node is 110.
    • With approximately twice as many available IP addresses as possible Pods, Kubernetes is able to mitigate IP address reuse as Pods are added to and removed from a node.
  • Routes-based cluster uses two unique subnet IP address ranges
    • Subnet’s primary IP address range for all node IP addresses.
      • Node IP addresses are taken from the primary range of the cluster subnet
      • Cluster subnet must be large enough to hold the total number of nodes in your cluster.
    • Pod address range
      • A routes-based cluster has a range of IP addresses that are used for Pods and Services
      • The last /20 (4096 addresses) of the Pod address range is used for Services, and the rest of the range is used for Pods
      • Pod address range size cannot be changed after cluster creation. So ensure that a large enough Pod address range is chosen to accommodate the cluster’s anticipated growth during cluster creation
  • Maximum number of nodes, Pods, and Services for a given GKE cluster is determined by the size of the cluster subnet and the size of the Pod address range.


Google Cloud NAT

  • Cloud NAT allows VM instances without external IP addresses and private GKE clusters to send outbound packets to the internet and receive any corresponding established inbound response packets.
  • Cloud NAT is a distributed, software-defined managed service. It’s not based on proxy VMs or appliances.
  • Cloud NAT allows outbound connections and the established inbound responses to those connections
  • Cloud NAT provides source network address translation (SNAT) for VMs without external IP addresses and destination network address translation (DNAT) for established inbound response packets.
  • Cloud NAT does not implement inbound connections from the internet. DNAT is only performed for packets that arrive as responses to outbound packets.
  • Cloud NAT works only for the VM network interface’s primary IP address and alias IP ranges, provided that the network interface doesn’t have an external IP address assigned to it; if it does, traffic is routed through the internet gateway instead.
  • Cloud NAT gateway is associated with a single VPC network, region, and Cloud Router
  • Cloud NAT provides the following benefits:
    • Security
      • Reduce the need for individual VMs to each have external IP addresses. Subject to egress firewall rules, VMs without external IP addresses can access destinations on the internet.
      • With manual NAT IP address assignment, whitelisting can be performed by the destination service to allow connections from known external IP addresses.
    • Availability
      • is a distributed, software-defined managed service.
      • Can be configured on a Cloud Router, which provides the control plane for NAT, holding specified configuration parameters
    • Scalability
      • can be configured to automatically scale the number of NAT IP addresses that it uses, and it supports VMs that belong to managed instance groups, including those with autoscaling enabled.
    • Performance
      • does not reduce the network bandwidth per VM.

Traditional NAT versus Cloud NAT

Cloud NAT Specifications

  • Cloud NAT gateway provides NAT services for packets sent from a VM’s network interface as long as that network interface doesn’t have an external IP address assigned to it
  • Cloud NAT gateway can be configured to provide NAT for the VM network interface’s primary internal IP address, alias IP ranges, or both
  • Cloud NAT gateway does not change the amount of outbound or inbound bandwidth that a VM can use, as it depends on VM’s machine type
  • Cloud NAT gateway can only apply to a single network interface of a VM.
  • Cloud NAT gateway can only use routes whose next hops are the default internet gateway
  • Cloud NAT never performs NAT for traffic sent to the select external IP addresses for Google APIs and services
  • Cloud NAT gateways are associated with subnet IP address ranges in a single region and a single VPC network.
  • Cloud NAT gateway created in one VPC network cannot provide NAT to VMs in other VPC networks connected by using VPC Network Peering, even if the VMs in peered networks are in the same region as the gateway.

GCP Certification Exam Practice Questions

  1. You decide to set up Cloud NAT. After completing the configuration, you find that one of your instances is not using the Cloud NAT for outbound NAT. What is the most likely cause of this problem?
    1. The instance has been configured with multiple interfaces.
    2. An external IP address has been configured on the instance.
    3. You have created static routes that use RFC1918 ranges.
    4. The instance is accessible by a load balancer external IP address.


Google Cloud CDN – Content Delivery Network

  • Google Cloud CDN (Content Delivery Network) caches website and application content closer to the user
  • Cloud CDN uses Google’s global edge network to serve content closer to users, which accelerates the websites and applications.
  • Cloud CDN works with external HTTP(S) Load Balancing to deliver content to the users.
  • Cloud CDN requires the Premium Network Service Tier, which provides global anycast IP addresses
  • Cloud CDN content can be sourced from various types of backends (also referred to as origin servers) :
    • Instance groups
    • Zonal network endpoint groups (NEGs)
    • Serverless NEGs: One or more App Engine, Cloud Run, or Cloud Functions services
    • Internet NEGs, for endpoints that are outside of Google Cloud (also known as custom origins)
    • Buckets in Cloud Storage
  • Cloud CDN with Google Cloud Armor enforces security policies only for requests for dynamic content, cache misses, or other requests that are destined for the origin server. Cache hits are served even if the downstream Google Cloud Armor security policy would prevent that request from reaching the origin server.
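Enabling Cloud CDN is a flag on the load balancer's backend (the backend service name is a placeholder):

  # Turn on caching for an existing global backend service
  gcloud compute backend-services update my-backend-service \
      --global --enable-cdn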

Cloud CDN Flow

Google Cloud CDN Response Flow

  • When a user requests content from an external HTTP(S) load balancer, the request arrives at a Google Front End (GFE), which is at the edge of Google’s network as close as possible to the user.
  • GFE uses Cloud CDN if the load balancer’s URL map routes traffic to a backend service or backend bucket that has Cloud CDN configured
  • Cloud CDN doesn’t perform any URL redirection. The Cloud CDN cache is located at the GFE.
  • Caching happens automatically for all cacheable content, once Cloud CDN is enabled
  • Cache Hits and Cache Misses
    • A cache is a group of servers that stores and manages content so that future requests for that content can be served faster.
    • Cached content is a copy of cacheable content from origin servers that is stored in the cache.
    • Cache Hit – if the GFE looks in the Cloud CDN cache and finds a cached response to the user’s request, it sends the cached response to the user
    • Cache Miss – the GFE determines that it can’t fulfill the request from the cache, because the content is requested for the first time, has expired, or has been evicted
  • Cache Hit Ratio
    • Cache hit ratio is the percentage of times that a requested object is served from the cache
  • Cache Egress and Cache Fill
    • Cache Egress – Data transfer from a cache to the client
    • Cache Fill – Data transfer to a cache
  • Cache Eviction
    • Cloud CDN removes or evicts content to make room for new content once the cache reaches its capacity
    • Evicted content is usually content that hasn’t been accessed recently, regardless of the content’s expiration time
  • Cache Expiration
    • Content in HTTP(S) caches can have a configurable expiration time or Time To Live (TTL)
  • Cache Invalidation
    • Cache Invalidation allows one to force an object or set of objects to be ignored by the cache
    • Invalidations don’t affect cached copies in web browser caches or caches operated by third-party internet service providers.
    • Cache invalidations are eventually consistent
    • Invalidations are rate-limited; use path patterns to invalidate groups of objects, e.g., /images/* instead of a separate invalidation for each of /images/1.jpg, /images/2.jpg, and so on
  • Cache Preloading
    • Caching is reactive in that an object is stored in a particular cache only if a request goes through that cache and if the response is cacheable.
    • Caches cannot be preloaded except by causing the individual caches to respond to requests.
  • An object stored in one cache does not automatically replicate into other caches; cache fill happens only in response to a client-initiated request.

Cloud CDN Signed URL

  • Cloud CDN signed URLs helps serve responses from Google Cloud’s globally distributed caches, even for authorized requests
  • Cloud CDN signed URLs help control access to the cached content
  • Signed URL is a URL that provides user read access to a private resource for a limited time without needing a Google Account
  • Anyone who knows the URL can access the resource until the expiration time for the URL is reached or the key used to sign the URL is rotated.
  • Cryptographic keys are created on a backend service or bucket, or both
  • Signed URL contains authorization within the request URL with selected elements of the request that are hashed and cryptographically signed by using a strongly generated random key.
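A hedged sketch of the signed URL workflow (key name, key file, backend service, and URL are placeholders):

  # Generate a strongly random 128-bit key, base64url-encoded, and attach it
  head -c 16 /dev/urandom | base64 | tr -- '+/' '-_' > sign-key.txt
  gcloud compute backend-services add-signed-url-key my-backend-service \
      --key-name=my-key --key-file=sign-key.txt

  # Sign a URL that expires in one hour
  gcloud compute sign-url "https://media.example.com/video.mp4" \
      --key-name=my-key --key-file=sign-key.txt --expires-in=1h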

Cloud CDN Best Practices

  • Cache static content
  • Use proper expiration time or TTL for time sensitive data
  • Use custom cache keys to improve cache hit ratio
    • Cloud CDN, by default, uses the entire request URL to build the cache key
    • Cache keys can be customized to include or omit any combination of protocol, host, and query string
  • Use versioning to update content instead of cache invalidation
    • Versioning content serves a different version of the same content, effectively removing it by showing users new content before the cache entry expires
    • Invalidation is eventually consistent and should be used as a last resort
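When invalidation is unavoidable, it runs against the load balancer's URL map with a path pattern (the URL map name is a placeholder):

  # Invalidate everything under /images/ in one rate-limited request
  gcloud compute url-maps invalidate-cdn-cache my-url-map --path "/images/*"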


Google Cloud Compute Engine Snapshots

  • Snapshots provide periodic backup of the persistent disks
  • Snapshots incrementally back up data from the persistent disks.
  • Snapshots are global resources, so any snapshot is accessible by any resource within the same project.
  • Snapshots can be shared across projects.
  • Storage costs for persistent disk snapshots are charged only for the total size of the snapshot.
  • Snapshots once created with the current state of the disk, can be restored as a new disk.
  • Compute Engine stores multiple copies of each snapshot across multiple locations with automatic checksums to ensure the integrity of the data.
  • Snapshots can be created from disks even while they are attached to running virtual machine (VM) instances.
  • The lifecycle of a snapshot created from a disk attached to a running VM instance is independent of the lifecycle of the VM instance.
  • Snapshots can be stored in either one Cloud Storage multi-regional location, such as asia, or one Cloud Storage regional location, such as asia-south1.
  • A multi-regional storage location provides higher availability and might reduce network costs when creating or restoring a snapshot
  • A snapshot can be used to create a new disk in any region and zone, regardless of the storage location of the snapshot.

Snapshot Creation

  • Snapshots are incremental and automatically compressed, so that they can be regularly created on a persistent disk faster and at a lower cost than regularly creating a full image of the disk.
  • Incremental snapshots work as follows:
    • The first successful snapshot of a persistent disk is a full snapshot that contains all the data on the persistent disk.
    • The second snapshot only contains any new data or modified data since the first snapshot. Data that hasn’t changed since snapshot 1 isn’t included. Instead, snapshot 2 contains references to snapshot 1 for any unchanged data.
    • Snapshot 3 contains any new or changed data since snapshot 2 but won’t contain any unchanged data from snapshot 1 or 2. Instead, snapshot 3 contains references to blocks in snapshot 1 and snapshot 2 for any unchanged data.
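A one-off snapshot and a restore look like this (disk, snapshot, and zone names are placeholders):

  # Snapshot a disk, even while attached to a running VM
  gcloud compute disks snapshot my-disk \
      --zone=us-central1-a --snapshot-names=my-snapshot

  # Restore the snapshot as a new disk in any zone
  gcloud compute disks create restored-disk \
      --zone=europe-west1-b --source-snapshot=my-snapshot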

Snapshot Deletion

  • Compute Engine uses incremental snapshots so that each snapshot contains only the data that has changed since the previous snapshot.
  • For unchanged data, snapshots reference the data in previous snapshots.
  • When a snapshot is deleted, Compute Engine immediately marks the snapshot as DELETED in the system.
    • If the snapshot has no dependent snapshots, it is deleted outright.
    • However, if the snapshot does have dependent snapshots:
      • Any data that is required for restoring other snapshots is moved into the next snapshot, increasing its size.
      • Any data that is not required for restoring other snapshots is deleted. This lowers the total size of all your snapshots.
      • The next snapshot no longer references the snapshot marked for deletion, and instead references the snapshot before it.
  • Keep in mind that deleting a snapshot does not necessarily delete all the data on the snapshot, because subsequent snapshots might require information stored in a previous snapshot.
  • To definitively delete data from the snapshots, you should delete all snapshots.

Snapshot Best Practices

  • If a snapshot is created of the persistent disk while the application is running, the snapshot might not capture pending writes that are in transit from memory to disk. So, prepare disk for consistency
    • Pause application/processes that write data, flush disk buffers
    • Unmount disk completely
    • For windows, use VSS snapshots
    • Use ext4 for Linux to reduce the risk that data is cached without actually being written to the persistent disk.
  • Take only one snapshot at a time
  • Schedule snapshot off-peak hours
  • Avoid frequent snapshots: take a snapshot of the disk at most once per hour. Disk snapshots can be created at most once every 10 minutes.
  • Use snapshot schedules as a best practice to back up your Compute Engine workloads
  • Use multiple persistent disks for large data volume. Larger amounts of data create larger snapshots, which cost more and take longer to create.
  • Run fstrim before snapshot (Linux) to clean up space, as this command removes blocks that the file system no longer needs, so that the system can create the snapshot more quickly and with a smaller size
  • Create an image from a frequently used snapshot and use the image instead of the snapshot itself
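A hedged sketch of the recommended snapshot schedule (schedule name, region, timing, and retention are placeholders):

  # Daily snapshots at 04:00 UTC, retained for 14 days
  gcloud compute resource-policies create snapshot-schedule daily-backup \
      --region=us-central1 \
      --daily-schedule \
      --start-time=04:00 \
      --max-retention-days=14

  # Attach the schedule to a disk
  gcloud compute disks add-resource-policies my-disk \
      --zone=us-central1-a --resource-policies=daily-backup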

GCP Certification Exam Practice Questions

  1. You have a workload running on Compute Engine that is critical to your business. You want to ensure that the data on the boot disk of this workload is backed up regularly. You need to be able to restore a backup as quickly as possible in case of disaster. You also want older backups to be cleaned automatically to save on cost. You want to follow Google-recommended practices. What should you do?
    1. Create a Cloud Function to create an instance template.
    2. Create a snapshot schedule for the disk using the desired interval.
    3. Create a cron job to create a new disk from the disk using gcloud.
    4. Create a Cloud Task to create an image and export it to Cloud Storage.


Google Cloud Compute Engine Storage Options

Persistent Disk

  • Persistent disks are durable network storage devices that the instances can access like physical disks in a desktop or a server.
  • Persistent disks are used as boot disks
  • Data on each persistent disk is distributed across several physical disks.
  • Compute Engine manages the physical disks and the data distribution to ensure redundancy and optimal performance.
  • Persistent disks are located independently of the VM instances and can be detached or moved to keep the data even after the instance is deleted
  • Persistent disk performance scales automatically with size, so they can be resized or additional ones added to meet the performance and storage space requirements.

Persistent Disk Types

  • Standard persistent disks (pd-standard) are backed by standard hard disk drives (HDD).
  • Balanced persistent disks (pd-balanced) are backed by solid-state drives (SSD). They are an alternative to SSD persistent disks that balance performance and cost.
  • SSD persistent disks (pd-ssd) are backed by solid-state drives (SSD).

Zonal Persistent Disks

  • Zonal persistent disks provide durable storage and replication of data within a single zone in a region.
  • Persistent disks have built-in redundancy to protect the data against equipment failure and to ensure data availability through datacenter maintenance events.
  • For additional space on the persistent disks, resize the disks and resize the single file system rather than repartitioning and formatting.
  • Compute Engine automatically encrypts the data in transit, before it travels outside of the instance to persistent disk storage space.
  • Zonal persistent disk remains encrypted either with system-defined keys or with customer-supplied keys.
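Resizing a zonal persistent disk is non-destructive; grow the file system afterwards (names, zone, size, and device are placeholders):

  gcloud compute disks resize my-disk --zone=us-central1-a --size=500GB

  # Then grow the file system inside the VM, e.g., for ext4 on a non-boot data disk
  sudo resize2fs /dev/sdb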

Regional Persistent Disks

  • Regional persistent disks provide durable storage and replication of data between two zones in the same region.
  • Regional persistent disks are also designed to work with regional managed instance groups.
  • A zonal outage can be handled by force-attaching the disk to a standby instance, even if the disk can’t be detached from the original VM
  • Regional persistent disks are designed for
    • workloads that require a lower RPO and RTO compared to using persistent disk snapshots.
    • workloads where write performance is less critical than data redundancy across multiple zones.
  • Regional persistent disks cannot be used with memory-optimized machines and compute-optimized machines.
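A sketch of a regional disk and the force-attach used during a zonal outage (all names, zones, and sizes are placeholders):

  # Disk replicated synchronously between two zones in the region
  gcloud compute disks create my-regional-disk \
      --region=us-central1 \
      --replica-zones=us-central1-a,us-central1-b \
      --size=200GB

  # During an outage, force-attach the disk to the standby VM
  gcloud compute instances attach-disk standby-vm \
      --zone=us-central1-b \
      --disk=my-regional-disk --disk-scope=regional --force-attach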

Local SSD

  • Local SSDs are physically attached to the server that hosts the VM instance.
  • Local SSDs have higher throughput and lower latency than standard persistent disks or SSD persistent disks.
  • Data stored on a local SSD persists only until the instance is stopped or deleted.
  • Local SSD disks cannot be used as boot disks
  • Local SSD disks can be attached only during instance creation, and not once the instance is created
  • Local SSDs performance gains require certain trade-offs in availability, durability, and flexibility. Because of these trade-offs, Local SSD storage isn’t automatically replicated and all data on the local SSD might be lost if the instance terminates for any reason.
  • Each local SSD is 375 GB in size, but a maximum of 24 local SSD partitions can be attached for a total of 9 TB per instance.
  • Compute Engine automatically encrypts the data when it is written to local SSD storage space. Customer-supplied encryption keys is not supported with local SSDs.
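Local SSDs are declared at instance creation time (instance name, zone, and machine type are placeholders):

  # Attach two 375-GB local SSDs over NVMe; they cannot be added after creation
  gcloud compute instances create my-instance \
      --zone=us-central1-a --machine-type=n1-standard-8 \
      --local-ssd=interface=NVME --local-ssd=interface=NVME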

Cloud Storage Buckets

  • Cloud Storage buckets are the most flexible, scalable, and durable storage option for the VM instances.
  • Cloud Storage is ideal if you don’t require the lower latency of Persistent Disks and Local SSDs, and can store the data in a Cloud Storage bucket.
  • Performance of Cloud Storage depends on the selected storage class
  • Standard storage class used in the same location as the instance gives performance that is comparable to persistent disks but with higher latency and less consistent throughput characteristics.
  • Cloud Storage buckets have built-in redundancy to protect the data against equipment failure and to ensure data availability through datacenter maintenance events
  • Cloud Storage buckets aren’t restricted to the zone where the instance is located. Multiregional Cloud Storage buckets store the data redundantly across at least two regions within a larger multiregional location.
  • A Cloud Storage bucket can be mounted on the instance as a file system (see the sketch after this list)
  • Cloud Storage allows read and write data to a bucket from multiple instances simultaneously.
  • However, Cloud Storage buckets are object stores that don’t have the same write constraints as a POSIX file system and can’t be used as boot disks. Multiple instances working on the same file can lead to overwritten data.
  • Cloud Storage supports both encryption at rest and in transit.
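One hedged way to mount a bucket as a file system is Cloud Storage FUSE (bucket name and mount point are placeholders):

  # Mount the bucket; object-store semantics still apply, so concurrent writers can clobber files
  gcsfuse my-bucket /mnt/my-bucket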

Filestore

  • Filestore provides high-performance, fully managed network-attached storage (NAS) file storage

Storage Options Comparison

Google Cloud Compute Engine Storage Options

Storage Options Performance Comparison

Google Cloud Compute Engine Storage Performance

Google Cloud Router

  • Cloud Router is a fully distributed and managed service that programs custom dynamic routes and scales with the network traffic.
  • Cloud Router works with both legacy networks and VPC networks.
  • Cloud Router isn’t a connectivity option, but a service that works over Cloud VPN or Interconnect connections to provide dynamic routing by using the Border Gateway Protocol (BGP) for the VPC networks.
  • Cloud Router isn’t supported for Direct Peering or Carrier Peering connections.
  • Cloud Router isn’t a physical device that might cause a bottleneck, and it can’t be used by itself.
  • Cloud Router is required or recommended in the following cases:
    • Required for Cloud NAT
    • Required for Cloud Interconnect and HA VPN
    • A recommended configuration option for Classic VPN.
  • Cloud Router helps dynamically exchange routes between the Google Cloud network and the on-premises network.
  • Cloud Router peers with the on-premises VPN gateway or router to provide dynamic routing and exchanges topology information through BGP.
  • Cloud Router frees you from maintaining static routes
  • Google Cloud recommends creating two Cloud Routers in each region for a Cloud Interconnect for 99.99% availability.
  • Cloud Router supports the following dynamic routing modes
    • Regional routing mode – provides visibility to resources only in the defined region.
    • Global routing mode – provides visibility to resources in all regions.
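A Cloud Router is created per VPC network and region with a private ASN (names and ASN are placeholders):

  gcloud compute routers create my-router \
      --network=my-vpc --region=us-central1 --asn=65001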

Google Cloud Router - Global Dynamic Routing
