Google Kubernetes Engine – GKE

  • Google Kubernetes Engine – GKE  provides a managed environment for deploying, managing, and scaling containerized applications using Google infrastructure.

Zonal vs Regional Cluster

  • Zonal clusters
    • Zonal clusters have a single control plane in a single zone.
    • Depending on the availability requirements, nodes for the zonal cluster can be distributed in a single zone or in multiple zones.
    • Single-zone clusters
      • Master -> Single Zone & Workers -> Single Zone
      • A single-zone cluster has a single control plane running in one zone
      • Control plane manages workloads on nodes running in the same zone
    • Multi-zonal clusters
      • Master -> Single Zone & Workers -> Multi-Zone
      • A multi-zonal cluster has a single replica of the control plane running in a single zone, and has nodes running in multiple zones.
      • During an upgrade of the cluster or an outage of the zone where the control plane runs, workloads still run. However, the cluster, its nodes, and its workloads cannot be configured until the control plane is available.
      • Multi-zonal clusters balance availability and cost for consistent workloads.
  • Regional clusters
    • Master -> Multi Zone & Workers -> Multi-Zone
    • A regional cluster has multiple replicas of the control plane, running in multiple zones within a given region.
    • Nodes also run in each zone where a replica of the control plane runs.
    • Because a regional cluster replicates the control plane and nodes, it consumes more Compute Engine resources than a similar single-zone or multi-zonal cluster.
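
As a rough illustration of the three cluster types (cluster names, zones, and node counts below are placeholders, not values from the post), they can be created with gcloud as follows:

    # Zonal (single-zone) cluster: control plane and nodes in one zone
    gcloud container clusters create zonal-cluster \
        --zone us-central1-a --num-nodes 3

    # Multi-zonal cluster: control plane in one zone, nodes spread across the listed zones
    gcloud container clusters create multi-zonal-cluster \
        --zone us-central1-a \
        --node-locations us-central1-a,us-central1-b --num-nodes 2

    # Regional cluster: control plane replicas and nodes across the region's zones
    # (--num-nodes is per zone for regional clusters)
    gcloud container clusters create regional-cluster \
        --region us-central1 --num-nodes 1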

GKE Zonal vs Regional Cluster

Route-Based Cluster vs VPC-Native Cluster

Refer to the blog post @ Google Kubernetes Engine – Networking

Private Cluster

  • Private clusters help isolate nodes from having inbound and outbound connectivity to the public internet by providing nodes with internal IP addresses only.
  • Cloud NAT or self-managed NAT gateway can provide outbound internet access for certain private nodes
  • External clients can still reach the services exposed as a load balancer by calling the external IP address of the HTTP(S) load balancer
  • By default, Private Google Access is enabled, which provides private nodes and their workloads with limited outbound access to Google Cloud APIs and services over Google’s private network.
  • Your VPC network contains the cluster nodes, and a separate Google Cloud VPC network contains the cluster’s control plane. The control plane’s VPC network is located in a project controlled by Google. The Control plane’s VPC network is connected to the cluster’s VPC network with VPC Network Peering. Traffic between nodes and the control plane is routed entirely using internal IP addresses.
  • Control plane for a private cluster has a private endpoint in addition to a public endpoint
  • Control plane public endpoint access level can be controlled
    • Public endpoint access disabled
      • Most secure option as it prevents all internet access to the control plane
      • The cluster can be accessed using a bastion host/jump server, or from the on-premises network if Cloud Interconnect or Cloud VPN has been configured to connect to Google Cloud.
      • Authorized networks must be configured for the private endpoint and can contain only internal IP addresses
    • Public endpoint access enabled, authorized networks enabled:
      • Provides restricted access to the control plane from defined source IP addresses
    • Public endpoint access enabled, authorized networks disabled
      • Default and least restrictive option.
      • Publicly accessible from any source IP address as long as you authenticate.
  • Nodes always contact the control plane using the private endpoint.
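
A minimal sketch of creating a private cluster with authorized networks (cluster name and CIDR ranges are placeholder assumptions):

    # Private nodes with a public control-plane endpoint restricted to authorized networks
    gcloud container clusters create private-cluster \
        --enable-ip-alias \
        --enable-private-nodes \
        --master-ipv4-cidr 172.16.0.0/28 \
        --enable-master-authorized-networks \
        --master-authorized-networks 203.0.113.0/29

    # Add --enable-private-endpoint to disable the public endpoint entirely;
    # authorized networks must then be internal IP addresses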

Shared VPC Clusters

  • Shared VPC supports both zonal and regional clusters.
  • Shared VPC supports VPC-native clusters only and requires Alias IPs to be enabled; legacy networks are not supported
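
A minimal sketch of creating a VPC-native cluster in a Shared VPC subnet from a service project (the host project ID, network, subnet, and secondary range names are placeholder assumptions):

    gcloud container clusters create shared-vpc-cluster \
        --enable-ip-alias \
        --network projects/HOST_PROJECT_ID/global/networks/shared-net \
        --subnetwork projects/HOST_PROJECT_ID/regions/us-central1/subnetworks/tier-1 \
        --cluster-secondary-range-name pods \
        --services-secondary-range-name services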

Node Pools

  • A node pool is a group of nodes within a cluster that all have the same configuration and are identical to one another.
  • Node pools use a NodeConfig specification.
  • Each node in the pool has a cloud.google.com/gke-nodepool Kubernetes node label, whose value is the node pool’s name.
  • The number and type of nodes specified during cluster creation become the default node pool. Additional custom node pools of different sizes and types can be added to the cluster, e.g. for local SSDs, GPUs, preemptible VMs, or different machine types.
  • Node pools can be created, upgraded, or deleted individually without affecting the whole cluster (see the sample commands after this list). However, a single node in a node pool cannot be configured on its own; any configuration changes affect all nodes in the node pool.
  • You can resize node pools in a cluster by adding or removing nodes using gcloud container clusters resize CLUSTER_NAME --node-pool POOL_NAME --num-nodes NUM_NODES
  • Existing node pools can be manually upgraded or automatically upgraded.
  • For a multi-zonal or regional cluster, all of the node pools are replicated to those zones automatically. Any new node pool is automatically created or deleted in those zones.
  • GKE drains all the nodes in the node pool when a node pool is deleted
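
For example (cluster and pool names are placeholders), adding, resizing, and deleting a node pool looks roughly like this:

    # Add a node pool with a different machine type to an existing cluster
    gcloud container node-pools create high-mem-pool \
        --cluster my-cluster --machine-type n2-highmem-16 --num-nodes 2

    # Resize the node pool later
    gcloud container clusters resize my-cluster \
        --node-pool high-mem-pool --num-nodes 4

    # Delete the node pool individually without affecting the rest of the cluster
    gcloud container node-pools delete high-mem-pool --cluster my-cluster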

Cluster Autoscaler

  • GKE’s cluster autoscaler automatically resizes the number of nodes in a given node pool, based on the demands of the workloads.
  • Cluster autoscaler works automatically once the minimum and maximum node pool size are specified, and does not require manual intervention (a sample command follows this list).
  • Cluster autoscaler increases or decreases the size of the node pool automatically, based on the resource requests (rather than actual resource utilization) of Pods running on that node pool’s nodes
    • If Pods are unschedulable because there are not enough nodes in the node pool, cluster autoscaler adds nodes, up to the maximum size of the node pool.
    • If nodes are under-utilized, and all Pods could be scheduled even with fewer nodes in the node pool, Cluster autoscaler removes nodes, down to the minimum size of the node pool. If the node cannot be drained gracefully after a timeout period (currently 10 minutes – not configurable), the node is forcibly terminated.
  • Before enabling cluster autoscaler, design the workloads to tolerate potential disruption or ensure that critical Pods are not interrupted.
  • Workloads might experience transient disruption with autoscaling, especially workloads running with a single replica
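
A minimal sketch of enabling the cluster autoscaler on an existing node pool (names and bounds are placeholder assumptions):

    gcloud container clusters update my-cluster \
        --enable-autoscaling \
        --node-pool default-pool \
        --min-nodes 1 --max-nodes 5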

Auto-upgrading nodes

  • Node auto-upgrades help keep the nodes in the GKE cluster up-to-date with the cluster control plane (master) version when the control plane is updated on your behalf.
  • Node auto-upgrade is enabled by default when a new cluster or node pool is created with the Google Cloud Console or the gcloud command.
  • Node auto-upgrades provide several benefits:
    • Lower management overhead – no need to manually track and update the nodes when the control plane is upgraded on your behalf.
    • Better security – GKE automatically ensures that security updates are applied and kept up to date.
    • Ease of use – provides a simple way to keep the nodes up to date with the latest Kubernetes features.
  • Node pools with auto-upgrades enabled are scheduled for upgrades when they meet the selection criteria. Rollouts are phased across multiple weeks to ensure cluster and fleet stability
  • During the upgrade, nodes are drained and re-created to match the current control plane version. Modifications on the boot disk of a node VM do not persist across node re-creations. To preserve modifications across node re-creation, use a DaemonSet.
  • Enabling auto-upgrades does not cause the nodes to upgrade immediately
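
If auto-upgrade was disabled, it can be turned back on per node pool, roughly as follows (cluster and pool names are placeholders):

    gcloud container node-pools update default-pool \
        --cluster my-cluster --enable-autoupgrade

    # Use --no-enable-autoupgrade to opt a node pool out of automatic upgrades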

GCP Certification Exam Practice Questions

  • Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).
  • GCP services are updated every day and both the answers and questions might be outdated soon, so research accordingly.
  • GCP exam questions are not updated to keep pace with GCP updates, so even if the underlying feature has changed, the question might not be updated.
  • Open to further feedback, discussion, and correction.
  1. Your existing application running in Google Kubernetes Engine (GKE) consists of multiple pods running on four GKE n1-standard-2 nodes. You need to deploy additional pods requiring n2-highmem-16 nodes without any downtime. What should you do?
    1. Use gcloud container clusters upgrade. Deploy the new services.
    2. Create a new Node Pool and specify machine type n2-highmem-16. Deploy the new pods.
    3. Create a new cluster with n2-highmem-16 nodes. Redeploy the pods and delete the old cluster.
    4. Create a new cluster with both n1-standard-2 and n2-highmem-16 nodes. Redeploy the pods and delete the old cluster.

References

Google Kubernetes Engine – GKE

Google Kubernetes Engine – Networking

IP allocation

Kubernetes uses various IP ranges to assign IP addresses to nodes, Pods, and Services.

  • Node IP
    • Each node has an IP address assigned from the cluster’s VPC network.
    • Node IP provides connectivity from control components like kube-proxy and the kubelet to the Kubernetes API server.
    • Node IP is the node’s connection to the rest of the cluster.
  • Pod CIDR or Address Range
    • Each node has a pool of IP addresses that GKE assigns to the Pods running on that node (a /24 CIDR block by default).
  • Pod Address
    • Each Pod has a single IP address assigned from the Pod CIDR range of its node.
    • Pod IP address is shared by all containers running within the Pod and connects them to other Pods running in the cluster.
  • Service Address Range
    • Each Service has an IP address, called the ClusterIP, assigned from the cluster’s VPC network.
  • For Standard clusters
    • a maximum of 110 Pods can run on a node with a /24 range, not 256 as you might expect. This provides a buffer so that Pods don’t become unschedulable due to a transient lack of IP addresses in the Pod IP range for a given node.
    • For ranges smaller than /24, roughly half as many Pods can be scheduled as IP addresses in the range.
  • Autopilot clusters can run a maximum of 32 Pods per node.
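
As a hedged illustration of this sizing behavior (cluster and pool names below are placeholders), lowering the maximum Pods per node makes GKE allocate a correspondingly smaller per-node Pod range:

    # Cluster default of up to 32 Pods per node (GKE then reserves roughly a /26 per node)
    gcloud container clusters create small-pods-cluster \
        --enable-ip-alias --default-max-pods-per-node 32

    # A node pool can override the cluster default
    gcloud container node-pools create low-density-pool \
        --cluster small-pods-cluster --max-pods-per-node 16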

GKE Cluster Networking Types

  • In GKE, clusters can be distinguished according to the way they route traffic from one Pod to another Pod.
  • VPC-native cluster: A cluster that uses alias IP address ranges
  • Routes-based cluster: A cluster that uses custom static routes in a VPC network

VPC-Native Clusters

  • VPC-native cluster uses alias IP address ranges
  • VPC-native is the recommended network mode for new clusters
  • VPC-native clusters have several benefits:
    • Pod IP addresses are natively routable within the cluster’s VPC network and other VPC networks connected to it by VPC Network Peering.
    • Pod IP address ranges, and subnet secondary IP address ranges in general, are accessible from on-premises networks connected with Cloud VPN or Cloud Interconnect using Cloud Routers.
    • Pod IP addresses are reserved in the VPC network before the Pods are created in the cluster. This prevents conflict with other resources in the VPC network and allows you to better plan IP address allocations.
    • Pod IP address ranges do not depend on custom static routes and do not consume the system-generated and custom static routes quota. Instead, automatically generated subnet routes handle routing for VPC-native clusters.
    • Firewall rules can be created that apply to just Pod IP address ranges instead of any IP address on the cluster’s nodes.

VPC-Native Clusters IP Allocation

GKE VPC-Native Cluster IP Management

  • VPC-Native cluster uses three unique subnet IP address ranges
    • Subnet’s primary IP address range for all node IP addresses.
      • Node IP addresses are assigned from the primary IP address range of the subnet associated with the cluster.
      • Both node IP addresses and the size of the subnet’s secondary IP address range for Pods limit the number of nodes that a cluster can support
    • One secondary IP address range for all Pod IP addresses.
      • Pod IP addresses are taken from the cluster subnet’s secondary IP address range for Pods.
      • By default, GKE allocates a /24 alias IP range (256 addresses) to each node for the Pods running on it.
      • On each node, those 256 alias IP addresses support up to 110 Pods.
      • Pod Address Range cannot be changed. If exhausted,
        • a new cluster with a larger Pod address range must be created or
        • node pools should be recreated after decreasing the --max-pods-per-node for the node pools.
    • Another secondary IP address range for all Service (cluster IP) addresses.
      • Service (cluster IP) addresses are taken from the cluster’s subnet’s secondary IP address range for Services.
      • Service address range should be large enough to provide addresses for all the Kubernetes Services hosted in the cluster.
  • Node, Pod, and Services IP address ranges must all be unique, and subnets with overlapping primary and secondary IP address ranges cannot be created
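
A minimal sketch of creating a VPC-native cluster with explicitly sized Pod and Service ranges (cluster name, subnet, and CIDRs are placeholder assumptions; --cluster-secondary-range-name / --services-secondary-range-name can be used instead to pick existing secondary ranges):

    gcloud container clusters create vpc-native-cluster \
        --enable-ip-alias \
        --subnetwork my-subnet \
        --cluster-ipv4-cidr 10.4.0.0/14 \
        --services-ipv4-cidr 10.0.32.0/20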

Routes-based Cluster

  • A routes-based cluster uses custom static routes in a VPC network, i.e. it uses Google Cloud Routes to route traffic between nodes
  • In a routes-based cluster
    • each node is allocated a /24 range of IP addresses for Pods.
    • With a /24 range, there are 256 addresses, but the maximum number of Pods per node is 110.
    • With approximately twice as many available IP addresses as possible Pods, Kubernetes is able to mitigate IP address reuse as Pods are added to and removed from a node.
  • Routes-based cluster uses two unique subnet IP address ranges
    • Subnet’s primary IP address range for all node IP addresses.
      • Node IP addresses are taken from the primary range of the cluster subnet
      • Cluster subnet must be large enough to hold the total number of nodes in your cluster.
    • Pod address range
      • A routes-based cluster has a range of IP addresses that are used for Pods and Services
      • The last /20 (4,096 addresses) of the Pod address range is used for Services and the rest of the range is used for Pods
      • The Pod address range size cannot be changed after cluster creation, so choose a Pod address range large enough to accommodate the cluster’s anticipated growth
  • Maximum number of nodes, Pods, and Services for a given GKE cluster is determined by the size of the cluster subnet and the size of the Pod address range.
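
For comparison, a routes-based cluster is created by disabling alias IPs; the combined Pod/Service range is fixed at creation time (cluster name and CIDR are placeholder assumptions):

    gcloud container clusters create routes-based-cluster \
        --no-enable-ip-alias \
        --cluster-ipv4-cidr 10.96.0.0/14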

Google Cloud Compute Services Cheat Sheet

Google Cloud Compute Services

Google Cloud - Compute Services Options

Compute Engine

  • is a virtual machine (VM) hosted on Google’s infrastructure.
  • can run the public images for Google provided Linux and Windows Server as well as custom images created or imported from existing systems
  • availability policy determines how it behaves when there is a maintenance event
    • VM instance’s maintenance behavior onHostMaintenance determines whether the instance is live migrated (MIGRATE, the default) or stopped (TERMINATE)
    • Instance’s restart behavior automaticRestart determines whether the instance automatically restarts if it crashes or gets stopped (restart is the default)
  • Live migration helps keep the VM instances running even when a host system event, such as a software or hardware update, occurs
  • Preemptible VM is an instance that can be created and run at a much lower price than normal instances; however, it can be stopped (preempted) by Compute Engine at any time
  • Shielded VM offers verifiable integrity of the Compute Engine VM instances, to confirm the instances haven’t been compromised by boot- or kernel-level malware or rootkits.
  • Instance template is a resource used to create VM instances and managed instance groups (MIGs) with identical configuration
  • Instance group is a collection of virtual machine (VM) instances that can be managed as a single entity.
    • Managed instance groups (MIGs)
      • allows app creation with multiple identical VMs.
      • workloads can be made scalable and highly available by taking advantage of automated MIG services, including: autoscaling, autohealing, regional (multiple zone) deployment, and automatic updating
      • supports rolling update feature
      • works with load balancing services to distribute traffic across all of the instances in the group.
    • Unmanaged instance groups
      • allows load balancing across a fleet of VMs that you manage yourself, which may not be identical
  • Instance templates are global resources, while instance groups are zonal or regional.
  • Machine image stores all the configuration, data, metadata and permissions from one or more disks required to create a VM instance
  • Sole-tenancy provides dedicated hosting of only the project’s VMs and provides an added layer of hardware isolation
  • deletionProtection prevents accidental VM deletion esp. for VMs running critical workloads and need to be protected
  • provides sustained use discounts, committed use discounts, free tier, etc. in pricing
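
A few illustrative commands for the concepts above (instance, template, and group names plus machine types are placeholder assumptions):

    # VM with explicit availability policy and deletion protection
    gcloud compute instances create my-vm \
        --zone us-central1-a --machine-type e2-medium \
        --maintenance-policy MIGRATE --restart-on-failure \
        --deletion-protection

    # Instance template plus a zonal managed instance group created from it
    gcloud compute instance-templates create web-template --machine-type e2-medium
    gcloud compute instance-groups managed create web-mig \
        --template web-template --size 3 --zone us-central1-a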

App Engine

  • App Engine helps build highly scalable applications on a fully managed serverless platform
  • Each Cloud project can contain only a single App Engine application
  • App Engine is regional, which means the infrastructure that runs the apps is located in a specific region, and Google manages it so that it is available redundantly across all of the zones within that region
  • App Engine application location or region cannot be changed once created
  • App Engine allows traffic management to an application version by migrating or splitting traffic.
    • Traffic Splitting (Canary) – distributes a percentage of traffic to versions of the application.
    • Traffic Migration – smoothly switches request routing
  • Supports Standard and Flexible environments
    • Standard environment
      • Application instances that run in a sandbox, using the runtime environment of a supported language only.
      • Sandbox restricts what the application can do
        • only allows the app to use a limited set of binary libraries
        • app cannot write to disk
        • limits the CPU and memory options available to the application
      • Sandbox does not support
        • SSH debugging
        • Background processes
        • Background threads (limited capability)
        • Using Cloud VPN
    • Flexible environment
      • Application instances run within Docker containers on Compute Engine virtual machines (VM).
      • As the Flexible environment supports Docker, it can support custom runtimes or source code written in other programming languages.
      • Allows selection of any Compute Engine machine type for instances so that the application has access to more memory and CPU.
  • min_idle_instances indicates the number of additional instances to be kept running and ready to serve traffic for this version.
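
A hedged sketch of traffic splitting and migration between App Engine versions (service and version names are placeholders):

    # Canary: send 10% of traffic to the new version, keep 90% on the old one
    gcloud app services set-traffic default --splits v1=0.9,v2=0.1

    # Gradually migrate all traffic to the new version
    gcloud app services set-traffic default --splits v2=1 --migrate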

GKE

  • Google Kubernetes Engine (GKE) provides a managed environment for deploying, managing, and scaling containerized applications using Google infrastructure.
  • Node Pool is a group of nodes within a cluster that all have the same configuration.
  • GKE commands
    • gcloud container clusters resize CLUSTER_NAME --node-pool POOL_NAME --num-nodes NUM_NODES scales a cluster’s node pool (--size is deprecated in favor of --num-nodes).