Kubernetes and Cloud Native Associate KCNA Exam Learning Path

I recently certified for the Kubernetes and Cloud Native Associate – KCNA exam.

  • KCNA exam focuses on a user’s foundational knowledge and skills in Kubernetes and the wider cloud native ecosystem.
  • KCNA exam is intended to prepare candidates to work with cloud-native technologies and pursue further CNCF credentials, including CKA, CKAD, and CKS.
  • KCNA validates conceptual knowledge of
    • the entire cloud native ecosystem, with a particular focus on Kubernetes.
    • Kubernetes and cloud-native technologies, including how to deploy an application using basic kubectl commands, the architecture of Kubernetes (containers, pods, nodes, clusters), the cloud-native landscape and projects (storage, networking, GitOps, service mesh), and the principles of cloud-native security.

KCNA Exam Pattern

  • KCNA exam curriculum includes these general domains and their weights on the exam:
    • Kubernetes Fundamentals – 46%
    • Container Orchestration – 22%
    • Cloud Native Architecture – 16%
    • Cloud Native Observability – 8%
    • Cloud Native Application Delivery – 8%
  • KCNA exam requires you to solve 60 questions in 90 minutes.
  • Exam questions can be attempted in any order and don’t have to be answered sequentially, so feel free to move ahead and come back to the tougher ones later.
  • Time is more than sufficient if you are well prepared. I was able to get through the exam within an hour.

KCNA Exam Preparation and Tips

  • I used the KodeKloud KCNA course for practice; it is good enough to cover what is required for the exam.

KCNA Resources

KCNA Key Topics

Kubernetes Fundamentals

Kubernetes Architecture

  • Kubernetes is a highly popular open-source container orchestration platform that can be used to automate deployment, scaling, and the management of containerized workloads.
  • Kubernetes Architecture
    • A Kubernetes cluster consists of a control plane and one or more worker machines, called nodes.
    • Both the control plane and the nodes can be physical machines, virtual machines, or instances in the cloud.
  • ETCD (key-value store)
    • Etcd is a consistent, distributed, and highly-available key-value store.
    • is stateful, persistent storage that stores all of Kubernetes cluster data (cluster state and config).
    • is the source of truth for the cluster.
    • can be part of the control plane, or, it can be configured externally.
  • Kubernetes API
    • API server exposes a REST interface to the Kubernetes cluster. It is the front end for the Kubernetes control plane.
    • All operations against Kubernetes objects are programmatically executed by communicating with the endpoints provided by it.
    • It tracks the state of all cluster components and manages the interaction between them.
    • It is designed to scale horizontally.
    • It consumes YAML/JSON manifest files.
    • It validates and processes the requests made via API.
  • Scheduling
    • The scheduler is responsible for assigning work to the various nodes. It keeps watch over the resource capacity and ensures that a worker node’s performance is within an appropriate threshold.
    • It schedules pods to worker nodes.
    • It watches the API server for newly created Pods with no assigned node and selects a healthy node for them to run on.
    • If there are no suitable nodes, the pods are put in a pending state until such a healthy node appears.
    • It watches API Server for new work tasks.
    • Factors taken into account for scheduling decisions include:
      • Individual and collective resource requirements.
      • Hardware/software/policy constraints.
      • Affinity and anti-affinity specifications.
      • Data locality.
      • Inter-workload interference.
      • Taints, tolerations, and deadlines.
  • Controller Manager
    • Controller manager is responsible for making sure that the shared state of the cluster is operating as expected.
    • It watches the desired state of the objects it manages and watches their current state through the API server.
    • It takes corrective steps to make sure that the current state is the same as the desired state.
    • It is a controller of controllers.
    • It runs controller processes. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
  • Kubelet
    • A Kubelet tracks the state of a pod to ensure that all the containers are running and healthy
    • provides a heartbeat message every few seconds to the control plane.
    • runs as an agent on each node in the cluster.
    • acts as a conduit between the API server and the node.
    • instantiates and executes Pods.
    • watches API Server for work tasks.
    • gets instructions from the control plane and reports node status back to it.
  • Kube-proxy
    • Kube proxy is a networking component that routes traffic coming into a node from the service to the correct containers.
    • is a network proxy that runs on each node in a cluster.
    • manages IP translation and routing.
    • maintains network rules on nodes. These network rules allow network communication to Pods from inside or outside of cluster.
    • relies on the CNI network plugin, which assigns each Pod a unique IP address shared by all containers in the Pod; kube-proxy itself routes Service traffic to those Pod IPs.
    • facilitates Kubernetes networking services and load-balancing across all pods in a service.
    • It deals with individual host sub-netting and ensures that the services are available to external parties.

Kubernetes Resources

  • Kubernetes Resources
    • Nodes run and manage Pods; a node is the machine (virtual or physical) that performs the given work.
    • Namespaces
      • provide a mechanism for isolating groups of resources within a single cluster.
      • Kubernetes starts with four initial namespaces:
        • default – default namespace for objects with no other namespace.
        • kube-system – namespace for objects created by the Kubernetes system.
        • kube-public – namespace is created automatically and is readable by all users (including those not authenticated).
        • kube-node-lease – namespace holds Lease objects associated with each node. Node leases allow the kubelet to send heartbeats so that the control plane can detect node failure.
      • Resource Quotas can be defined for each namespace to limit the resources consumed.
      • Resources within the namespaces can refer to each other with their service names.
    • Pods
      • is a group of containers and is the smallest unit that Kubernetes administers.
      • Containers in a pod share the same resources such as memory and storage.
    • ReplicaSet
      • ensures a stable set of replica Pods running at any given time.
      • helps guarantee the availability of a specified number of identical Pods.
    • Deployments
      • provide declarative updates for Pods and ReplicaSets.
      • describe the number of desired identical pod replicas to run and the preferred update strategy used when updating the deployment.
      • supports the Rolling Update and Recreate update strategies (a minimal Deployment and Service manifest is sketched after this list).
    • Services
      • is an abstraction over the pods, and essentially, the only interface the various application consumers interact with.
      • exposes a single machine name or IP address mapped to pods whose underlying names and numbers are unreliable.
      • supports the following types
        • ClusterIP
        • NodePort
        • LoadBalancer
    • Ingress
      • exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
    • DaemonSet
      • ensures that all (or some) Nodes run a copy of a Pod.
      • ensures pods are added to the newly created nodes and garbage collected as nodes are removed.
    • StatefulSet
      • is ideal for stateful applications using ReadWriteOnce volumes.
      • designed to deploy stateful applications and clustered applications that save data to persistent storage, such as persistent disks.
    • ConfigMaps
      • helps to store non-confidential data in key-value pairs.
      • can be consumed by pods as environment variables, command-line arguments, or configuration files in a volume.
    • Secrets
      • provides a container for sensitive data such as a password without putting the information in a Pod specification or a container image.
      • are not encrypted but only base64 encoded.
    • Jobs & CronJobs
      • A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate.
      • A CronJob creates Jobs on a repeating schedule.
    • Volumes
      • supports Persistent volumes that exist beyond the lifetime of a pod.
      • When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes.
      • PersistentVolume (PV) is a cluster scoped piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
      • PersistentVolumeClaim (PVC) is a request for storage by a user.
    • Labels and Annotations attach metadata to objects in Kubernetes.
      • Labels are identifying key/value pairs that can be attached to Kubernetes objects and are used in conjunction with selectors to identify groups of related resources.
      • Annotations are key/value pairs designed to hold non-identifying information that can be leveraged by tools and libraries.
  • Containers
    • Each worker node has a container runtime engine that is responsible for running the containers in Pods.
    • The runtime pulls images from a container image registry and starts and stops containers.
    • Kubernetes supports any container runtime that implements the Container Runtime Interface (CRI) specification; the supported runtimes are listed under Container Orchestration below.
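As a minimal sketch of how a few of these resources fit together (the name web and the nginx image are illustrative assumptions, not from the article), a Deployment fronted by a ClusterIP Service could look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical example name
spec:
  replicas: 3               # desired number of identical Pod replicas
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web            # label used by the ReplicaSet and the Service selector
    spec:
      containers:
      - name: web
        image: nginx:1.25   # illustrative image tag
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web                 # stable name and virtual IP in front of the Pods
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80

Applying this with kubectl apply -f creates a ReplicaSet with three identical Pods reachable behind a single stable Service name.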

Container Orchestration

  • Container Orchestration Fundamentals
    • Containers help manage the dependencies of an application and run much more efficiently than spinning up a lot of virtual machines.
    • While virtual machines emulate a complete machine, including the operating system and a kernel, containers share the kernel of the host machine and are only isolated processes.
    • Virtual machines come with some overhead, be it boot time, size or resource usage to run the operating system. Containers on the other hand are processes, like the browser, therefore they start a lot faster and have a smaller footprint.
  • Runtime
    • Container runtime is responsible for running containers (in Pods).
    • Kubernetes supports any container runtime that implements the Container Runtime Interface (CRI) specification.
    • To run the containers, each worker node has a container runtime engine.
    • It pulls images from a container image registry and starts and stops containers.
    • Kubernetes supports several container runtimes:
      • Docker – the standard for a long time, but support for Docker (dockershim) as the runtime for Kubernetes was deprecated and removed in Kubernetes 1.24.
      • containerd – the most popular lightweight and performant implementation, used by all major cloud providers for their Kubernetes-as-a-Service products.
      • CRI-O – created by Red Hat, with a code base closely related to podman and buildah.
      • gVisor – made by Google, provides an application kernel that sits between the containerized process and the host kernel.
      • Kata Containers – a secure runtime that provides a lightweight virtual machine but behaves like a container.
    • Security
      • 4C’s of Cloud Native security are Cloud, Clusters, Containers, and Code.
      • Containers started on a machine always share that machine’s kernel, which becomes a risk for the whole system if containers are allowed to call kernel functions, for example killing other processes or modifying the host network by creating routing rules.
      • Kubernetes provides security features
        • Authentication using Users & Certificates
          • Certificates are the recommended way
          • Service accounts can be used to provide bearer tokens to authenticate with Kubernetes API.
        • Authorization using Node, ABAC, RBAC, Webhooks
          • Role-based access control is the most secure and recommended authorization mechanism in Kubernetes.
        • Admission Controller is an interceptor to the Kubernetes API server requests prior to persistence of the object, but after the request is authenticated and authorized.
        • Security Context helps define privileges and access control settings for a Pod or Container.
        • Service Mesh like Istio and Linkerd can help implement MTLS for intra-cluster pod-to-pod communication.
        • Network Policies help specify how a pod is allowed to communicate with various network “entities” over the network.
        • Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of actions in a cluster for activities generated by users, by applications that use the Kubernetes API, and by the control plane itself.
    •  Networking
      • Container Network Interface (CNI) is a standard that can be used to write or configure network plugins and makes it very easy to swap out different plugins in various container orchestration platforms.
      • Kubernetes networking addresses four concerns:
        • Containers within a Pod use networking to communicate via loopback.
        • Cluster networking provides communication between different Pods.
        • Service API helps expose an application running in Pods to be reachable from outside your cluster.
          • Ingress provides extra functionality specifically for exposing HTTP applications, websites, and APIs.
          • Gateway API is an add-on that provides an expressive, extensible, and role-oriented family of API kinds for modeling service networking.
        • Services can also be used to publish services only for consumption inside the cluster.
    • Service Mesh
      • Service Mesh is a dedicated infrastructure layer added to the applications that allows you to transparently add capabilities without adding them to your own code.
      • Service Mesh provides capabilities like service discovery, load balancing, failure recovery, metrics, and monitoring and complex operational requirements, like A/B testing, canary deployments, rate limiting, access control, encryption, and end-to-end authentication.
      • Service mesh uses a proxy to intercept all your network traffic, allowing a broad set of application-aware features based on the configuration you set.
      • Istio is an open source service mesh that layers transparently onto existing distributed applications.
      • An Envoy proxy is deployed along with each service that you start in the cluster, or runs alongside services running on VMs.
      • Istio provides
        • Secure service-to-service communication in a cluster with TLS encryption, strong identity-based authentication and authorization
        • Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic
        • Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection
        • A pluggable policy layer and configuration API supporting access controls, rate limits and quotas
        • Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress
    •  Storage
      • Container images are read-only and consist of different layers that include everything added during the build phase ensuring that a container from an image provides the same behavior and functionality.
      • To allow writing files, a read-write layer is put on top of the container image when you start a container from an image.
      • Container on-disk files are ephemeral and lost if the container crashes.
      • Container Storage Interface (CSI) provides a uniform and standardized interface that allows attaching different storage systems no matter if it’s cloud or on-premises storage.
      • Kubernetes supports Persistent volumes that exist beyond the lifetime of a pod. When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes.
      • Persistent Volumes are supported using two API resources:
        • PersistentVolume (PV)
          • is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
          • is a cluster-level resource and not bound to a namespace
          • are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
        • PersistentVolumeClaim (PVC)
          • is a request for storage by a user.
          • is similar to a Pod.
          • Pods consume node resources and PVCs consume PV resources.
          • Pods can request specific levels of resources (CPU and Memory).
          • Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, see AccessModes).
      • Persistent Volumes can be provisioned in two ways (a minimal PV and PVC sketch follows this list):
          • Statically – the cluster administrator creates the PVs, which are then available for use by cluster users.
          • Dynamically – using StorageClasses, the cluster may try to dynamically provision a volume specifically for the PVC.
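A minimal sketch of static provisioning (a hostPath volume with illustrative names and sizes, not from the article): the administrator creates a PersistentVolume, and a user binds to it with a PersistentVolumeClaim requesting a matching size and access mode.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath:                  # for illustration only; real clusters use a CSI-backed volume type
    path: /data/pv-example
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-example
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

A Pod would then reference the claim via spec.volumes[].persistentVolumeClaim.claimName; with dynamic provisioning, a storageClassName on the PVC replaces the manually created PV.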

Cloud Native Architecture

  • Cloud Native Architecture Fundamentals
    • Cloud native architecture guides us to optimize the software for scalability, high availability, cost efficiency, reliability, security, and faster time-to-market by using a combination of cultural, technological, and architectural design patterns.
    • Cloud native architecture includes containers, service meshes, microservices, immutable infrastructure, and declarative APIs.
    • Cloud native techniques enable loosely coupled systems that are resilient, manageable, and observable.
  • Microservices
    • Microservices are small, independent applications with a clearly defined scope of functions and responsibilities.
    • Microservices help break down an application into multiple decoupled applications, that communicate with each other in a network, which are more manageable.
    • Microservices enable multiple teams to hold ownership of different functions of the application,
    • Microservices also enable functions to be operated and scaled individually.
  •  Autoscaling
    • The autoscaling pattern provides the ability to dynamically adjust resources based on the current demand, without over- or under-provisioning (a minimal HorizontalPodAutoscaler sketch follows this list).
    • Autoscaling can be performed using
      • Horizontal scaling – adds new compute resources, which can be new copies of the application, virtual machines, or physical servers.
      • Vertical scaling – adds more resources to the existing underlying hardware.
  • Serverless
    • Serverless allows you to just focus on the code while the cloud provider takes care of the underlying resources required to execute the code.
    • Most cloud providers provide this feature as Function as a Service (FaaS) like AWS Lambda, GCP Cloud Functions, etc.
    • Serverless enables on-demand provisioning and scaling of the applications with a pay-as-you-use model.
    • CloudEvents aims to standardize serverless and event-driven architectures on multiple platforms.
      • It provides a specification of how event data should be structured.
      • Events are the basis for scaling serverless workloads or triggering corresponding functions.
  • Community and Governance
    • Open source projects hosted and supported by the CNCF are categorized according to maturity and go through a sandbox and incubation stage before graduating.
    • CNCF Technical Oversight Committee – TOC
      • is responsible for defining and maintaining the technical vision, approving new projects, accepting feedback from the end-user committee, and defining common practices that should be implemented in CNCF projects.
      • does not control the projects, but encourages them to be self-governing and community owned and practices the principle of “minimal viable governance”.
    • CNCF Project Maturity Levels
      • Sandbox Stage
        • Entry point for early stage projects.
      • Incubating Stage
        • Project meeting the sandbox stage requirements plus full technical due diligence performed, including documentation, a healthy number of committers, contributions, clear versioning scheme, documented security processes, and at least one public reference implementation
      • Graduation Stage
        • Project meeting the incubation stage criteria plus committers from at least two organizations, well-defined project governance, and committer process, maintained Core Infrastructure Initiative Best Practices Badge, third party security audit, public list of project adopters, received a supermajority vote from the TOC.
  • Personas
    • SRE, Security, Cloud, DevOps, and Containers have opened up a lot of different Cloud Native roles
      • Cloud Engineer & Architect
      • DevOps Engineer
      • Security Engineer
      • DevSecOps Engineer
      • Data Engineer
      • Full-Stack Developer
      • Site Reliability Engineer (SRE)
    • Site Reliability Engineer – SRE
      • Pioneered by Google around 2003, SRE has become an important job role in many organizations.
      • SRE’s goal is to create and maintain software that is reliable and scalable.
      • To measure performance and reliability, SREs use three main metrics:
        • Service Level Objectives – SLO: Specify a target level for the reliability of your service.
        • Service Level Indicators – SLI: A carefully defined quantitative measure of some aspect of the level of service that is provided
        • Service Level Agreements – SLA: An explicit or implicit contract with your users that includes consequences of meeting (or missing) the SLOs they contain.
      • Around these metrics, SREs might define an error budget. An error budget defines the amount (or time) of errors the application can have before actions are taken, like stopping deployments to production.
  • Open Standards
    • Open Standards help provide a standardized way to build, package, run, and ship modern software.
    • Open standards covers
      • Open Container Initiative (OCI) Spec: image, runtime, and distribution specification on how to run, build, and distribute containers
      • Container Network Interface (CNI): A specification on how to implement networking for Containers.
      • Container Runtime Interface (CRI): A specification on how to implement container runtimes in container orchestration systems.
      • Container Storage Interface (CSI): A specification on how to implement storage in container orchestration systems.
      • Service Mesh Interface (SMI): A specification on how to implement Service Meshes in container orchestration systems with a focus on Kubernetes.
    • OCI provides open industry standards for container technologies and defines
      • Image-spec defines how to build and package container images.
      • Runtime-spec specifies the configuration, execution environment, and lifecycle of containers.
      • Distribution-Spec, which provides a standard for the distribution of content in general and container images in particular.
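Relating to the Autoscaling pattern above, a minimal HorizontalPodAutoscaler sketch (it assumes a Deployment named web and a metrics server running in the cluster; both are assumptions for illustration):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:                # the workload being scaled horizontally
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add replicas when average CPU exceeds 70%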

Cloud Native Observability

  • Telemetry & Observability
    • Telemetry is the process of measuring and collecting data points and then transferring them to another system.
    • Observability is the ability to understand the state of a system or application by examining its outputs, logs, and performance metrics.
    • It’s a measure of how well the internal states of a system can be inferred from knowledge of its external outputs.
    • Observability mainly consists of
      • Logs: Messages emitted by applications and systems, such as errors, warnings, or debug information.
      • Metrics: Quantitative measurements with numerical values describing service or component behavior over time
      • Traces: Records the progression of the request while passing through multiple distributed systems.
        • Trace consists of Spans, which can include information like start and finish time, name, tags, or a log message.
        • Traces can be stored and analyzed in a tracing system like Jaeger.
    • OpenTelemetry
      • is a set of APIs, SDKs, and tools that can be used to integrate telemetry (metrics, logs, and especially traces) into applications and infrastructure.
      • OpenTelemetry clients can be used to export telemetry data in a standardized format to central platforms like Jaeger.
  • Prometheus
    • Prometheus is a popular, open-source monitoring system.
    • Prometheus can collect metrics that were emitted by applications and servers as time series data
    • Prometheus data model provides four core metric types:
      • Counter: A value that increases, like a request or error count
      • Gauge: Values that increase or decrease, like memory size
      • Histogram: A sample of observations, like request duration or response size
      • Summary: Similar to a histogram; it samples observations and additionally provides configurable quantiles over a sliding time window.
    • Prometheus provides PromQL (Prometheus Query Language) to query data stored in the Time Series Database (TSDB).
    • Prometheus integrates with Grafana, which can be used to build visualization and dashboards from the collected metrics.
    • Prometheus integrates with Alertmanager to configure alerts when certain metrics reach or pass a threshold (a sample alerting rule is sketched after this list).
  • Cost Management
    • All the Cloud providers work on the Pay-as-you-use model.
    • Cost optimization can be performed by analyzing what is really needed, how long, and scaling dynamically as per the needs.
    • Some of the cost optimization techniques include
      • Right sizing the workloads and dynamically scaling as per the demand
      • Identify wasted, unused resources and have proper archival techniques
      • Using Reserved or Spot instances as per the workloads
      • Defining proper budgets and alerts
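As referenced in the Prometheus section above, a small sketch of a Prometheus alerting rule file (the metric name http_requests_total and the threshold are illustrative assumptions); once the expression holds for the configured duration, the alert fires and Alertmanager routes the notification:

groups:
- name: example-alerts                 # hypothetical rule group
  rules:
  - alert: HighErrorRate
    expr: rate(http_requests_total{status="500"}[5m]) > 0.05
    for: 10m                           # must stay true for 10 minutes before firing
    labels:
      severity: critical
    annotations:
      summary: "High rate of HTTP 500 responses"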

Cloud Native Application Delivery

  • Application Delivery Fundamentals
    • Application delivery includes the application lifecycle right from source code, versioning, building, testing, packaging, and deployments.
    • The old process included a lot of error-prone manual steps and the constant fear that something would break.
    • DevOps process includes both the developers and administrators and focuses on frequent, error-free, repeatable, rapid deployments.
    • Version control systems like Git provide a decentralized system that can be used to track changes in the source code.
  • CI/CD
    • Continuous Integration/Continuous Delivery (CI/CD) provides very fast, more frequent, and higher quality software rollouts with automated builds, tests, code quality checks, and deployments.
      • Continuous Integration focuses on building and testing the written code. High automation and usage of version control allow multiple developers and teams to work on the same code base.
      • Continuous Delivery focuses on automated deployment of the pre-built software.
    • CI/CD tools include Jenkins, Spinnaker, Gitlab, ArgoCD, etc.
    • CI/CD can be performed using two different approaches
      • Push-based
        • The pipeline is started and runs tools that make the changes in the platform. Changes can be triggered by a commit or merge request.
      • Pull-based
        • An agent watches the git repository for changes and compares the definition in the repository with the actual running state.
        • If changes are detected, the agent applies the changes to the infrastructure.
  • GitOps
    • Infrastructure as a Code with tools like Terraform provides complete automation with versioning and better controls increasing the quality and speed of providing infrastructure.
    • GitOps takes the idea of Git as the single source of truth a step further and integrates the provisioning and change process of infrastructure with version control operations.
  • GitOps frameworks that use the pull-based approach include Flux and ArgoCD (a minimal Argo CD Application manifest is sketched after this list).
    • ArgoCD is implemented as a Kubernetes controller
    • Flux is built with the GitOps Toolkit
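A minimal sketch of the pull-based approach with an Argo CD Application (the repository URL, path, and names are illustrative assumptions); Argo CD continuously compares the manifests in Git with the live cluster state and syncs any drift:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/app-manifests.git   # hypothetical Git repository
    targetRevision: HEAD
    path: k8s                                               # directory containing the manifests
  destination:
    server: https://kubernetes.default.svc                  # the cluster Argo CD runs in
    namespace: default
  syncPolicy:
    automated:
      prune: true       # delete resources that were removed from Git
      selfHeal: true    # revert manual changes that drift from Git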

KCNA General information and practices

  • The exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exams.
  • Make sure you have a valid government-issued ID card as it would be checked.
  • You are not allowed to have anything around you and no one should enter the room.
  • The exam proctor will be watching you always, so refrain from doing any other activities. Your screen is also always shared.

All the Best …

Certified Kubernetes Security Specialist CKS Learning Path

With the Certified Kubernetes Security Specialist CKS certification, I have recertified the triad of Kubernetes certifications. After learning how to use and administer Kubernetes, the last piece was understanding its security intricacies, and CKS preparation does provide a deep dive into them.

  • CKS is more of an open-book test, where you have access to the official Kubernetes documentation during the exam, but it focuses more on hands-on experience.
  • CKS focuses on securing container-based applications and Kubernetes platforms during build, deployment, and runtime.
  • Unlike AWS and GCP certifications, you would be required to solve, debug actual problems, and provision resources on a live Kubernetes cluster.
  • Even though it is an open book test, you need to know where the information is.
  • Trust me, if you are not prepared, the time is not going to be sufficient.

CKS Exam Pattern

  • CKS exam curriculum includes these general domains and their weights on the exam:
    • Cluster Setup – 10%
    • Cluster Hardening – 15%
    • System Hardening – 15%
    • Minimize Microservice Vulnerabilities – 20%
    • Supply Chain Security – 20%
    • Monitoring, Logging and Runtime Security – 20%
  • CKS exam has been upgraded and requires you to solve 15-20 questions in 2 hours. I got 16 questions.
  • CKS had already been upgraded to Kubernetes 1.28 when I took it, and it keeps being upgraded to newer Kubernetes versions.
  • You are allowed to open another browser tab which can be from kubernetes.io or other product documentation like Falco. Do not open any other windows.
  • Exam questions can be attempted in any order and don’t have to be answered sequentially, so feel free to move ahead and come back to the tougher ones later.

CKS Exam Preparation and Tips

  • I used the KodeKloud CKS course for practice; it is good enough to cover what is required for the exam.
  • Prepare yourself with the imperative commands as much as you can. This will help cut down the time required to solve half of the questions.
  • Each exam question carries a weight, so be sure to attempt the questions with higher weights before focusing on the lower ones; target the ones with higher weights and quicker solutions, like the debugging ones.
  • The CKS exam provides 6-8 different preconfigured K8s clusters. Each question refers to a different Kubernetes cluster, and the context needs to be switched. Be sure to execute the kubectl config use-context command, which is provided with every question; you just need to copy-paste it.
  • Check for the namespace mentioned in the question to find and create resources. Use the -n <namespace> flag.
  • You would be performing most of the interaction from the client node. However, pay attention to the node (master or worker) on which you need to execute the commands, and make sure you return to the base node.
  • With CKS, it is important to move to the master (control plane) node for any changes to the cluster kube-apiserver.
  • SSH to nodes and gaining root access is allowed if needed.
  • Read carefully the information called out within the questions; it provides very useful hints for addressing the question and saves time, e.g., which namespace to look into for a failed pod, or what has already been created (ConfigMaps, Secrets, network policies) so that you do not create the same again.
  • Make sure you know the imperative commands to create resources, as you won’t have much time to create and edit YAML files.
  • If you need to edit further use --dry-run=client -o yaml to get a headstart with the YAML spec file and edit the same.
  • I personally use alias kk=kubectl to avoid typing kubectl

CKS Resources

CKS Key Topics

Cluster Setup – 10%

Cluster Hardening – 15%

System Hardening – 15%

  • Practice CKS Exercises – System Hardening
  • Minimize host OS footprint (reduce attack surface)
    • Control access using SSH, disable root and password-based logins
    • Remove unwanted packages and ports
  • Minimize IAM roles
    • IAM roles usually apply to cloud providers and relate to the least-privilege access principle.
  • Minimize external access to the network
    • External access can be controlled using Network Policies through egress policies.
  • Appropriately use kernel hardening tools such as AppArmor, seccomp
    • Runtime classes such as gVisor and Kata Containers can help provide further isolation of the containers.
    • Secure Computing – Seccomp tool helps control syscalls made by containers
    • AppArmor can be configured for any application to reduce its potential host attack surface and provide a greater in-depth defense.
    • PodSecurityPolicies – PSP enables fine-grained authorization of pod creation and updates.
    • General host-hardening steps include:
      • Apply host updates
      • Install the minimal required OS footprint
      • Identify and address open ports
      • Remove unnecessary packages
      • Protect access to data with permissions
    • Exam tip: Know how to load AppArmor profiles and enable them for the pods. AppArmor is in beta and needs to be enabled using container.apparmor.security.beta.kubernetes.io/<container_name>: <profile_ref>

Minimize Microservice Vulnerabilities – 20%

  • Practice CKS Exercises – Minimize Microservice Vulnerabilities
  • Set up appropriate OS-level security domains, e.g. using PSP, OPA, security contexts.
    • Pod Security Contexts help define security for pods and containers at the pod or at the container level. Capabilities can be added at the container level only.
    • Pod Security Policies enable fine-grained authorization of pod creation and updates and are implemented as an optional admission controller.
    • Open Policy Agent helps enforce custom policies on Kubernetes objects without recompiling or reconfiguring the Kubernetes API server.
    • Admission controllers
      • can be used for validating configurations as well as mutating the configurations.
      • Mutating controllers are triggered before validating controllers.
      • Allows extension by adding custom controllers using MutatingAdmissionWebhook and ValidatingAdmissionWebhook.
    • Exam tip: Know how to configure Pod Security Context, Pod Security Policies
  • Manage Kubernetes secrets
    • Exam Tip: Know how to read secret values, create secrets and mount the same on the pods.
  • Use container runtime sandboxes in multi-tenant environments (e.g. gVisor, Kata Containers)
    • Exam tip: Know how to create a RuntimeClass and associate it with a pod using runtimeClassName (a minimal sketch follows this list).
  • Implement pod to pod encryption by use of mTLS
    • Practice manage TLS certificates in a Cluster
    • A service mesh like Istio can be used to establish mTLS for intra-cluster pod-to-pod communication.
    • Istio automatically configures workload sidecars to use mutual TLS when calling other workloads. By default, Istio configures the destination workloads using PERMISSIVE mode. When PERMISSIVE mode is enabled, a service can accept both plain text and mutual TLS traffic. In order to only allow mutual TLS traffic, the configuration needs to be changed to STRICT mode.
    • Exam tip: No questions related to mTLS appeared in the exam
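A minimal sketch for the runtime sandbox point above, assuming the nodes already have gVisor’s runsc handler configured in the container runtime (an assumption, not something the article covers):

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc               # must match the handler name configured on the nodes
---
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-pod
spec:
  runtimeClassName: gvisor   # run this Pod's containers in the gVisor sandbox
  containers:
  - name: app
    image: nginx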

Supply Chain Security – 20%

  • Practice CKS Exercises – Supply Chain Security
  • Minimize base image footprint
    • Remove unnecessary tools: shells, package managers, and editors like vi.
    • Use slim/minimal images with required packages only. Do not include unnecessary software like build tools and utilities, troubleshooting, and debug binaries.
    • Build the smallest image possible – To reduce the size of the image, install only what is strictly needed
    • Use distroless, Alpine, or relevant base images for the app.
    • Use official images from verified sources only.
  • Secure your supply chain: whitelist allowed registries, sign and validate images
  • Use static analysis of user workloads (e.g.Kubernetes resources, Docker files)
    • Tools like Kubesec can be used to perform a static security risk analysis of the configuration files.
  • Scan images for known vulnerabilities
    • Aqua Security Trivy & Anchore can be used for scanning vulnerabilities in the container images.
    • Exam Tip: Know how to use the Trivy tool to scan images for vulnerabilities. Also, remember to use the --severity flag, e.g. --severity=CRITICAL, to filter for a specific category.

Monitoring, Logging and Runtime Security – 20%

  • Practice CKS Exercises – Monitoring, Logging, and Runtime Security
  • Perform behavioral analytics of syscall process and file activities at the host and container level to detect malicious activities
  • Detect threats within a physical infrastructure, apps, networks, data, users, and workloads
  • Detect all phases of attack regardless of where it occurs and how it spreads
  • Perform deep analytical investigation and identification of bad actors within the environment
    • Tools like strace and Aqua Security Tracee can be used to check the syscalls. However, with a large number of processes it would be tough to track and monitor them all, and these tools do not provide alerting.
    • Tools like Falco & Sysdig provide deep, process-level visibility into dynamic, distributed production environments and can be used to define rules to track, monitor, and alert on activities when a certain rule is violated.
    • Exam Tip: Know how to use Falco, define new rules, enable logging. Make use of the falco_rules.local.yaml file for overrides. (I did not get questions for Falco in my exam).
  • Ensure immutability of containers at runtime
    • Immutability prevents any changes from being made to the container or to the underlying host through the container.
    • It is recommended to create new images and perform a rolling deployment instead of modifying the existing running containers.
    • Launch the container in read-only mode using the --read-only flag of docker run, or by using the readOnlyRootFilesystem option in Kubernetes.
    • PodSecurityContext and PodSecurityPolicy can be used to define and enforce container immutability
      • ReadOnlyRootFilesystem – Requires that containers must run with a read-only root filesystem (i.e. no writable layer).
      • Privileged – determines if any container in a pod can enable privileged mode. This allows the container nearly all the same access as processes running on the host.
    • Task @ Configure Pod Container Security Context
    • Exam Tip: Know how to define a PodSecurityPolicy to enforce rules. Remember, Cluster Roles and Role Binding needs to be configured to provide access to the PSP to make it work.
  • Use Audit Logs to monitor access
    • Kubernetes auditing is handled by the kube-apiserver which requires defining an audit policy file.
    • Auditing captures the stages as RequestReceived -> (Authn and Authz) -> ResponseStarted (only for long-running requests such as watch) -> ResponseComplete (for success) OR Panic (for failures)
    • Exam Tip: Know how to configure audit policies and enable audit on the kube-apiserver. Make sure the kube-apiserver is up and running.
    • Task @ Kubernetes Auditing

CKS Articles

CKS General information and practices

  • The exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exams.
  • Make sure you have a valid government-issued ID card as it would be checked.
  • You are not allowed to have anything around you and no one should enter the room.
  • The exam proctor will be watching you always, so refrain from doing any other activities. Your screen is also always shared.
  • Copy + Paste works fine.
  • You will have an online notepad in the corner of the screen for taking notes. I hardly used it, but it can be useful to type and modify text instead of using the vi editor.

All the Best …

Kubernetes Security

Kubernetes Security

  • Security in general is not something that can be achieved only at the container layer. It is a continuous process that needs to be applied at all layers, all the time.
  • 4C’s of Cloud Native security are Cloud, Clusters, Containers, and Code.
  • Containers started on a machine always share that machine’s kernel, which becomes a risk for the whole system if containers are allowed to call kernel functions, for example killing other processes or modifying the host network by creating routing rules.

Authentication

Users

  • Kubernetes does not have an API object for normal users; they are created and managed outside the cluster.
  • Static credentials can be passed to the kube-apiserver as --basic-auth-file (static user + password) or --token-auth-file (static user + token).
  • Both approaches are deprecated and not recommended.

X509 Client Certificates

  • Kubernetes requires PKI certificates for authentication over TLS.
  • Kubernetes requires PKI for the following operations:
    • Client certificates for the kubelet to authenticate to the API server
    • Server certificate for the API server endpoint
    • Client certificates for administrators of the cluster to authenticate to the API server
    • Client certificates for the API server to talk to the kubelet
    • Client certificate for the API server to talk to etcd
    • Client certificate/kubeconfig for the controller manager to talk to the API server
    • Client certificate/kubeconfig for the scheduler to talk to the API server
    • Client and server certificates for the front-proxy
  • Client certificates can be signed in two ways so that they can be used to authenticate with the Kubernetes API.
    1. Internally signing the certificate using the Kubernetes API.
      1. It involves the creation of a certificate signing request (CSR) by a client.
      2. Administrators can approve or deny the CSR.
      3. Once approved, the administrator can extract and provide a signed certificate to the requesting client or user.
      4. This method cannot be scaled for large organizations as it requires manual intervention.
    2. Use enterprise PKI, which can sign the client-submitted CSR.
      1. The signing authority can send signed certificates back to clients.
      2. This approach requires the private key to be managed by an external solution.
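A minimal sketch of the internal signing flow for a hypothetical user jane who has generated a private key and CSR with openssl (the request value below is a placeholder, not real data):

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: jane-csr
spec:
  request: <base64-encoded PKCS#10 CSR>        # placeholder; e.g. output of base64 -w 0 jane.csr
  signerName: kubernetes.io/kube-apiserver-client
  expirationSeconds: 86400                     # request a one-day certificate
  usages:
  - client auth

An administrator would then run kubectl certificate approve jane-csr and hand the signed certificate from .status.certificate back to the user.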

Refer Authentication Exercises

Service Accounts

  • Kubernetes service accounts can be used to provide bearer tokens to authenticate with Kubernetes API.
  • Bearer tokens can be verified using a webhook, which involves API configuration with option --authentication-token-webhook-config-file, which includes the details of the remote webhook service.
  • Kubernetes internally uses Bootstrap and Node authentication tokens to initialize the cluster.
  • Each namespace has a default service account created.
  • Each service account used to get a Secret object that stores its bearer token; since Kubernetes 1.24, tokens are instead issued on demand through the TokenRequest API.
  • Existing service account for a pod cannot be modified, the pod needs to be recreated.
  • The service account can be associated with the pod using the serviceAccountName field in the pod specification and the service account secret is auto-mounted on the pod.
  • The automountServiceAccountToken field can be used to prevent the service account token from being auto-mounted.
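A minimal sketch tying these points together (names are illustrative): a dedicated ServiceAccount referenced from a Pod, with automatic token mounting disabled:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  serviceAccountName: app-sa            # use the dedicated service account instead of default
  automountServiceAccountToken: false   # do not auto-mount the API token into the Pod
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 1h"]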

Practice Service Account Exercises

Authorization

Node

  • Node authorization is used by Kubernetes internally and enables read, write, and auth-related operations by kubelet.
  • In order to successfully make a request, kubelet must use a credential that identifies it as being in the system:nodes group.
  • Node authorization can be enabled using the --authorization-mode=Node option in Kubernetes API Server configurations.

ABAC

  • Kubernetes defines attribute-based access control (ABAC) as “an access control paradigm whereby access rights are granted to users through the use of policies which combine attributes together.”
  • ABAC can be enabled by providing a .json file to --authorization-policy-file and --authorization-mode=ABAC options in Kubernetes API Server configurations.
  • The .json file needs to be present before Kubernetes API can be invoked.
  • Any changes in the ABAC policy file require a Kube API Server restart and hence the ABAC approach is not preferred.

AlwaysDeny/AlwaysAllow

  • AlwaysDeny or AlwaysAllow authorization mode is usually used in development environments where all requests to the Kubernetes API need to be allowed or denied.
  • AlwaysDeny or AlwaysAllow mode can be enabled using the option --authorization-mode=AlwaysDeny/AlwaysAllow while configuring Kubernetes API.
  • This mode is considered insecure and hence is not recommended in production environments.

RBAC

  • Role-based access control is the most secure and recommended authorization mechanism in Kubernetes.
  • It is an approach to restrict system access based on the roles of users within the cluster.
  • It allows organizations to enforce the principle of least privileges.
  • Kubernetes RBAC follows a declarative nature with clear permissions (operations), API objects (resources), and subjects (users, groups, or service accounts) declared in authorization requests.
  • RBAC authorization can be enabled using the --authorization-mode=RBAC option in Kubernetes API Server configurations.
  • RBAC can be configured using
    • Role or ClusterRole – is made up of rules combining verbs and resources, which provide a capability (verb) on a resource
    • RoleBinding or ClusterRoleBinding – helps assign privileges to the user, group, or service account.
  • Role vs ClusterRole AND RoleBinding vs ClusterRoleBinding
    • ClusterRole is a global object whereas Role is a namespace object.
    • Roles and RoleBindings are namespaced resources; ClusterRoles and ClusterRoleBindings are cluster-scoped.
    • ClusterRoleBindings (global resource) cannot be used with Roles, which is a namespaced resource.
    • RoleBindings (namespaced resource) cannot be used with ClusterRoles, which are global resources.
    • Only ClusterRoles can be aggregated.

RBAC Role Binding
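A minimal sketch of a namespaced Role that grants read access to Pods, bound to a hypothetical user jane (all names are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]                     # "" refers to the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane                          # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io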

Practice RBAC Exercises

Admission Controllers

  • Admission Controller is an interceptor to the Kubernetes API server requests prior to persistence of the object, but after the request is authenticated and authorized.
  • Admission controllers limit requests to create, delete, modify or connect to (proxy). They do not support read requests.
  • Admission controllers may be “validating”, “mutating”, or both.
  • Mutating controllers may modify the objects they admit; validating controllers may not.
  • Mutating controllers are executed before the validating controllers.
  • If any of the controllers in either phase reject the request, the entire request is rejected immediately and an error is returned to the end-user.
  • Admission Controllers provide fine-grained control over what can be performed on the cluster, that cannot be handled using Authentication or Authorization.

Kubernetes Admission Controllers

  • Admission controllers can only be enabled and configured by the cluster administrator using the --enable-admission-plugins and --admission-control-config-file flags.
  • Few of the admission controllers are as below
    • PodSecurityPolicy acts on the creation and modification of the pod and determines if it should be admitted based on the requested security context and the available Pod Security Policies.
    • ImagePolicyWebhook to decide if an image should be admitted.
    • MutatingAdmissionWebhook to modify a request.
    • ValidatingAdmissionWebhook to decide whether the request should be allowed to run at all.

Practice Admission Controller Exercises

Pod Security Policies

  • Pod Security Policies enable fine-grained authorization of pod creation and updates and is implemented as an optional admission controller.
  • A Pod Security Policy is a cluster-level resource that controls security-sensitive aspects of the pod specification.
  • PodSecurityPolicy is disabled by default. Once enabled using --enable-admission-plugins, it applies itself to all the pod creation requests.
  • PodSecurityPolicies enforced without authorizing any policies will prevent any pods from being created in the cluster. The requesting user or target pod’s service account must be authorized to use the policy, by allowing the use verb on the policy.
  • PodSecurityPolicy acts both as validating and mutating admission controller. PodSecurityPolicy objects define a set of conditions that a pod must run with in order to be accepted into the system, as well as defaults for the related fields.
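A minimal PodSecurityPolicy sketch (illustrative only; note that PSP was removed in Kubernetes 1.25 in favor of Pod Security Admission, so treat this as a reference for older clusters):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example
spec:
  privileged: false                   # disallow privileged containers
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:                            # only these volume types may be used
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim

Remember that a Role or ClusterRole granting the use verb on this policy, plus a binding for the user or service account, is required before any Pods are admitted.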

Practice Pod Security Policies Exercises

Pod Security Context

  • Security Context helps define privileges and access control settings for a Pod or Container that includes
    • Discretionary Access Control: Permission to access an object, like a file, is based on user ID (UID) and group ID (GID)
    • Security-Enhanced Linux (SELinux): Objects are assigned security labels.
    • Running as privileged or unprivileged.
    • Linux Capabilities: Give a process some privileges, but not all the privileges of the root user.
    • AppArmor: Use program profiles to restrict the capabilities of individual programs.
    • Seccomp: Filter a process’s system calls.
    • AllowPrivilegeEscalation: Controls whether a process can gain more privileges than its parent process. AllowPrivilegeEscalation is true always when the container is: 1) run as Privileged OR 2) has CAP_SYS_ADMIN.
    • readOnlyRootFilesystem: Mounts the container’s root filesystem as read-only.
  • PodSecurityContext holds pod-level security attributes and common container settings.
  • Fields present in container.securityContext take precedence over the field values of PodSecurityContext.
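A minimal sketch showing pod-level and container-level security contexts together (names and values are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:                 # pod-level settings apply to all containers
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 1h"]
    securityContext:               # container-level settings override the pod-level ones
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:                # capabilities can only be set at the container level
        add: ["NET_ADMIN"]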

Practice Pod Security Context Exercises

MTLS or Two Way Authentication

  • Service Mesh like Istio and Linkerd can help implement MTLS for intra-cluster pod-to-pod communication.
  • Istio deploys a side-car container that handles the encryption and decryption transparently.
  • Istio supports both permissive and strict modes
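A minimal sketch of enforcing strict mutual TLS with Istio’s PeerAuthentication resource (the namespace foo is illustrative; applying it to the Istio root namespace would make it mesh-wide):

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: foo        # hypothetical namespace
spec:
  mtls:
    mode: STRICT        # only mutual TLS is accepted; PERMISSIVE would also allow plain text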

Network Policies

  • By default, pods are non-isolated; they accept traffic from any source.
  • NetworkPolicies help specify how a pod is allowed to communicate with various network “entities” over the network.
  • NetworkPolicies can be used to control traffic to/from Pods, Namespaces or specific IP addresses
  • Pod- or namespace-based NetworkPolicy uses a selector to specify what traffic is allowed to and from the Pod(s) that match the selector.
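A minimal NetworkPolicy sketch (labels and port are illustrative) that isolates backend Pods so only frontend Pods in the same namespace can reach them on TCP 8080:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend          # the Pods this policy applies to
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend     # only Pods with this label may connect
    ports:
    - protocol: TCP
      port: 8080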

Practice Network Policies Exercises

Kubernetes Auditing

  • Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of actions in a cluster for activities generated by users, by applications that use the Kubernetes API, and by the control plane itself.
  • Audit records begin their lifecycle inside the kube-apiserver component.
  • Each request on each stage of its execution generates an audit event, which is then pre-processed according to a certain policy and written to a backend.
  • Audit policy determines what’s recorded and the backends persist the records.
  • Backend implementations include logs files and webhooks.
  • Each request can be recorded with an associated stage as below
    • RequestReceived – generated as soon as the audit handler receives the request, and before it is delegated down the handler chain.
    • ResponseStarted – generated once the response headers are sent, but before the response body is sent. This stage is only generated for long-running requests (e.g. watch).
    • ResponseComplete – generated once the response body has been completed and no more bytes will be sent.
    • Panic – generated when a panic or a failure occurs.

Kubernetes Audit Policy
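A minimal audit policy sketch (the specific rules are illustrative); the kube-apiserver reads this file via the --audit-policy-file flag:

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
- "RequestReceived"              # do not generate events for the RequestReceived stage
rules:
- level: RequestResponse         # log full request and response bodies for Pods
  resources:
  - group: ""
    resources: ["pods"]
- level: Metadata                # log only metadata for Secrets and ConfigMaps
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
- level: None                    # drop everything else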

Kubernetes kube-apiserver.yaml file with audit configuration
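A sketch of the audit-related parts of the kube-apiserver static Pod manifest (the file paths are typical kubeadm locations and are assumptions; all other flags, probes, and fields of the real manifest are omitted here):

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: registry.k8s.io/kube-apiserver:v1.28.0    # version tag is illustrative
    command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    # all the existing kube-apiserver flags stay as they were
    volumeMounts:
    - name: audit-policy
      mountPath: /etc/kubernetes/audit-policy.yaml
      readOnly: true
    - name: audit-logs
      mountPath: /var/log/kubernetes/audit/
  volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
  - name: audit-logs
    hostPath:
      path: /var/log/kubernetes/audit/
      type: DirectoryOrCreate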

Practice Kubernetes Auditing Exercises

Seccomp – Secure Computing

  • Seccomp stands for secure computing mode and has been a feature of the Linux kernel since version 2.6.12.
  • Seccomp can be used to sandbox the privileges of a process, restricting the calls it is able to make from user space into the kernel.
  • Kubernetes lets you automatically apply seccomp profiles loaded onto a Node to the Pods and containers.

Seccomp profile

Seccomp profile attached to the pod
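A minimal sketch of attaching a node-local seccomp profile to a Pod; it assumes a profile file profiles/audit.json already exists under the kubelet’s seccomp directory (typically /var/lib/kubelet/seccomp, an assumption rather than something the article states):

apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo
spec:
  securityContext:
    seccompProfile:
      type: Localhost                        # use a profile stored on the node
      localhostProfile: profiles/audit.json  # path relative to the kubelet seccomp directory
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 1h"]

Setting type: RuntimeDefault instead applies the container runtime’s default profile without needing a file on the node.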

Practice Seccomp Exercises

AppArmor

  • AppArmor is a Linux kernel security module that supplements the standard Linux user and group-based permissions to confine programs to a limited set of resources.
  • AppArmor can be configured for any application to reduce its potential attack surface and provide a greater in-depth defense.
  • AppArmor is configured through profiles tuned to allow the access needed by a specific program or container, such as Linux capabilities, network access, file permissions, etc.
  • Each profile can be run in either enforcing mode, which blocks access to disallowed resources or complain mode, which only reports violations.
  • AppArmor helps to run a more secure deployment by restricting what containers are allowed to do, and/or providing better auditing through system logs.
  • Use aa-status to check the AppArmor status and whether profiles are loaded
  • Use apparmor_parser -q <<profile file>> to load profiles
  • AppArmor is in beta and needs annotations to enable it using container.apparmor.security.beta.kubernetes.io/<container_name>: <profile_ref>

AppArmor profile

AppArmor usage
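A minimal sketch of the beta annotation in use, assuming a profile named k8s-apparmor-example-deny-write has already been loaded on each node with apparmor_parser (the profile name follows the Kubernetes documentation example and is illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
  annotations:
    container.apparmor.security.beta.kubernetes.io/hello: localhost/k8s-apparmor-example-deny-write
spec:
  containers:
  - name: hello
    image: busybox
    command: ["sh", "-c", "echo 'Hello AppArmor!' && sleep 1h"]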

Practice App Armor Exercises

Kubesec

  • Kubesec can be used to perform a static security risk analysis of the configuration files.

Sample configuration file
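A small Pod manifest of the kind you might scan (image and names are illustrative); running kubesec scan pod.yaml against it returns a JSON report with a score and remediation advice:

apiVersion: v1
kind: Pod
metadata:
  name: kubesec-demo
spec:
  containers:
  - name: kubesec-demo
    image: gcr.io/google-samples/node-hello:1.0   # illustrative sample image
    securityContext:
      readOnlyRootFilesystem: true    # settings like these raise the Kubesec score
      runAsNonRoot: true
      runAsUser: 10001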

Kubesec Report

Practice Kubesec Exercises

Trivy (or Clair or Anchore)

  • Trivy is a simple and comprehensive scanner for vulnerabilities in container images, file systems, and Git repositories, as well as for configuration issues.
  • Trivy detects vulnerabilities of OS packages (Alpine, RHEL, CentOS, etc.) and language-specific packages (Bundler, Composer, npm, yarn, etc.).
  • Trivy scans Infrastructure as Code (IaC) files such as Terraform, Dockerfile, and Kubernetes, to detect potential configuration issues that expose your deployments to the risk of attack.
  • Use trivy image <<image_name>> to scan images
  • Use --severity flag to filter the vulnerabilities as per the category.

Practice Trivy Exercises

Falco

Falco Architecture

  1. Falco can be installed as a package on the nodes OR as a DaemonSet on the Kubernetes cluster
  2. Falco is driven through configuration files (defaults to /etc/falco/falco.yaml) which include
    1. Rules (a sample custom rule is sketched after this list)
      1. Name and description
      2. Condition to trigger the rule
      3. Priority (emergency, alert, critical, error, warning, notice, informational, debug)
      4. Output data for the event
      5. Multiple rule files can be specified, with the last one taking precedence in case the same rule is defined in multiple files
    2. Log attributes for Falco, i.e. level and format
    3. Output file and format, i.e. JSON or text
    4. Alerts output destination, which includes stdout, file, HTTP, etc.
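As referenced in the Rules item above, a minimal custom rule sketch that could be placed in /etc/falco/falco_rules.local.yaml (the condition and output fields are simplified and illustrative):

- rule: Shell spawned in a container
  desc: Detect a shell started inside a container
  condition: container.id != host and proc.name in (bash, sh)
  output: "Shell opened (user=%user.name container=%container.name command=%proc.cmdline)"
  priority: WARNING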

Practice Falco Exercises

Reduce Attack Surface

  • Follow the principle of least privilege and limit access
  • Limit node access
    • keep nodes private
    • disable login using the root account (PermitRootLogin no) and use privilege escalation with sudo
    • disable password-based authentication (PasswordAuthentication no) and use SSH keys
  • Remove any unwanted packages
  • Block or close unwanted ports
  • Keep the base image light and limited to the bare minimum required
  • Identify and fix any open ports

Certified Kubernetes Administrator CKA Learning Path

I recently recertified the Certified Kubernetes Administrator CKA certification with 91%. After knowing how to use Kubernetes, it was really interesting and intriguing to learn Kubernetes internals and how the overall system works.

  • CKA is more of an open-book test, where you have access to the official Kubernetes documentation during the exam, but it focuses more on hands-on experience.
  • CKA focuses on “The skills required to be a successful Kubernetes Administrator “. It tests the candidate’s ability to do basic installation as well as configuring and managing production-grade Kubernetes clusters.
  • Unlike AWS and GCP certifications, you are required to solve and debug actual problems and provision resources on a live Kubernetes cluster.
  • Even though it is an open-book test, you need to know where the information is.
  • Trust me, if you are not prepared, the time is not going to be sufficient.

CKA Exam Pattern

  • CKA exam curriculum includes these general domains and their weights on the exam:
    • Cluster Architecture, Installation & Configuration – 25%
    • Workloads & Scheduling – 15%
    • Services & Networking – 20%
    • Storage  – 10%
    • Troubleshooting – 30%
  • CKA earlier required you to solve 24 questions in 3 hours.
  • The CKA exam has since been upgraded and now requires you to solve 15-20 questions in 2 hours. I got 17 questions.
  • CKA had already been upgraded to Kubernetes v1.28 when I took it, and it keeps being upgraded with newer Kubernetes versions.
  • You are allowed to open another browser tab which can be from kubernetes.io or other product documentation like Falco. Do not open any other windows.
  • Exam questions can be attempted in any order and don’t have to be sequential. So be sure to move ahead and come back later.

CKA Exam Preparation and Tips

  • I used the courses from KodeKloud CKA for practicing and it would be good enough to cover what is required for the exam.
  • Prepare yourself with the imperative commands as much as you can. This will help cut down the time required to solve half of the questions. I was not stretched for time on CKA and had ample time to review.
  • Each exam question carries a weight, so be sure to attempt the questions with higher weights before focusing on the lower ones. Target the ones with higher weights and quicker solutions, like the debugging ones.
  • The CKA exam provides 6-8 different preconfigured K8s clusters. Each question refers to a different Kubernetes cluster, and the context needs to be switched. Be sure to execute the kubectl config use-context command, which is provided with every question and just needs to be copy-pasted.
  • Check for the namespace mentioned in the question to find and create resources. Use the -n <namespace> flag.
  • You would be performing most of the interaction from the client node. However, pay attention to the node (master or worker) on which you need to execute the commands, and make sure you return to the base node.
  • With CKA, it is important to move to the master node for any changes to control plane components like the kube-apiserver.
  • SSH to nodes and gaining root access is allowed if needed.
  • Read carefully the information highlighted within the questions. It provides very useful hints for addressing the question and saves time, e.g., namespaces to look into for a failed pod, or what has already been created (ConfigMaps, Secrets, Network Policies) so that you do not create the same again.
  • Make sure you know the imperative commands to create resources, as you won’t have much time to create and edit YAML files.
  • If you need to edit further, use --dry-run=client -o yaml to get a head start on the YAML spec file and edit it, as sketched after this list.
  • I personally use alias kk=kubectl to avoid typing kubectl.
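For example (names and values are purely illustrative), a Deployment spec can be scaffolded imperatively and then tweaked before applying; the generated YAML looks roughly like:

```yaml
# Scaffolded with:
#   kubectl create deployment web --image=nginx --replicas=3 --dry-run=client -o yaml > web.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - image: nginx
          name: nginx
```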

CKA Learning Path

CKA Key Topics

Cluster Architecture, Installation & Configuration – 25%

Workloads & Scheduling – 15%

Services & Networking – 20%

Storage – 10%

Troubleshooting – 30%

  • Practice CKA Exercises – Troubleshooting
  • Evaluate cluster and node logging
  • Understand how to monitor applications
  • Manage container stdout & stderr logs
  • Troubleshoot application failure
  • Troubleshoot cluster component failure
    • Practice Debug cluster for troubleshooting control plane failure and worker node failure.
      • Understand the control plane architecture.
      • Focus on the kube-apiserver and the static pod configs from which the control plane pods are deployed.
      • Check whether the pods in kube-system are all running. Use the docker ps -a command on the node to inspect the reason for exited containers.
      • Check the kubelet service if the worker node is shown as NotReady.
  • Troubleshoot networking

Scheduling

Security

CKA General information and practices

  • The exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exams.
  • Make sure you have a valid government-issued ID card as it would be checked.
  • You are not allowed to have anything around you and no one should enter the room.
  • The exam proctor will be watching you always, so refrain from doing any other activities. Your screen is also always shared.
  • Copy + Paste works fine.
  • You will have an online notepad in the right corner for taking notes. I hardly used it, but it can be useful to type and modify text instead of using the VI editor.

All the Best …

 

Certified Kubernetes Application Developer CKAD Learning Path


After working on Kubernetes for quite some time, it was time to recertify my Certified Kubernetes Application Developer certification, and I am glad to have cleared it with a score of 89 with minimal preparation.

  • CKAD is more of an open-book test, where you have access to the official Kubernetes documentation during the exam, but it focuses more on hands-on experience.
  • CKAD focuses on “Using a Kubernetes cluster once already provisioned“. It tests the candidate’s ability to design, build, configure, and expose cloud native applications for Kubernetes.
  • Unlike AWS and GCP certifications, you are required to solve and debug actual problems and provision resources on a live Kubernetes cluster.
  • Even though it is an open-book test, you need to know where the information is.
  • Trust me, if you are not prepared, the time is not going to be sufficient.

CKAD Exam Pattern

  • CKAD exam curriculum includes these general domains and their weights on the exam:
    • Application Design and Build – 20%
    • Application Environment, Configuration and Security – 25%
    • Application Deployment – 20%
    • Services & Networking – 20%
    • Application observability and maintenance – 15%
  • CKAD requires you to solve 16 questions in 2 hours.
  • CKAD had already been upgraded to Kubernetes v1.28 when I took it, and it keeps being upgraded with newer Kubernetes versions.
  • You are allowed to open another browser tab that can be from kubernetes.io or other product documentation like Falco. Do not open any other windows.
  • Exam questions can be attempted in any order and don’t have to be sequential. So be sure to flag them and move ahead and come back later.

CKAD Exam Preparation and Tips

  • I used the courses from KodeKloud CKAD for practicing and it would be good enough to cover what is required for the exam.
  • Prepare yourself with the imperative commands as much as you can. This will help cut down the time required to solve half of the questions. I was not stretched for time on CKAD and had ample time to review.
  • Each exam question carries a weight, so be sure to attempt the questions with higher weights before focusing on the lower ones. Target the ones with higher weights and quicker solutions, like the debugging ones.
  • The CKAD exam provides 6-8 different preconfigured K8s clusters. Each question refers to a different Kubernetes cluster, and the context needs to be switched. Be sure to execute the kubectl config use-context command, which is provided with every question and just needs to be copy-pasted.
  • Check for the namespace mentioned in the question to find and create resources. Use the -n <namespace> flag.
  • You would be performing most of the interaction from the client node. However, pay attention to the node (master or worker) on which you need to execute the commands, and make sure you return to the base node.
  • SSH to nodes and gaining root access is allowed if needed.
  • Read carefully the information highlighted within the questions. It provides very useful hints for addressing the question and saves time, e.g., namespaces to look into for a failed pod, or what has already been created (ConfigMaps, Secrets, Network Policies) so that you do not create the same again.
  • Make sure you know the imperative commands to create resources, as you won’t have much time to create and edit YAML files.
  • If you need to edit further, use --dry-run=client -o yaml to get a head start on the YAML spec file and edit it.
  • I personally use alias kk=kubectl to avoid typing kubectl.

CKAD Resources

CKAD Key Topics

Application Design and Build – 20%

Application Environment, Configuration and Security – 25%

Application Deployment – 20%

Services & Networking – 20%

Application observability and maintenance – 15%

CKAD General information and practices

  • The exam can be taken online from anywhere.
  • Make sure you have prepared your workspace well before the exams.
  • Make sure you have a valid government-issued ID card as it would be checked.
  • You are not allowed to have anything around you and no one should enter the room.
  • The exam proctor will always watch you, so refrain from doing other activities. Your screen is also always shared.
  • Copy + Paste works fine.
  • You will have an online notepad in the right corner for taking notes. I hardly used it, but it can be useful to type and modify text instead of using the VI editor if you are not comfortable with it.

All the Best …

Kubernetes Resources


Namespaces

  • Namespaces provide a mechanism for isolating groups of resources within a single cluster.
  • Namespace-based scoping is applicable only for namespaced objects (e.g. Deployments, Services, etc) and not for cluster-wide objects (e.g. StorageClass, Nodes, PersistentVolumes, etc).
  • Names of resources need to be unique within a namespace, but not across namespaces.
  • Kubernetes starts with four initial namespaces:
    • default – default namespace for objects with no other namespace.
    • kube-system – namespace for objects created by the Kubernetes system.
    • kube-public – namespace is created automatically and is readable by all users (including those not authenticated).
    • kube-node-lease – namespace holds Lease objects associated with each node. Node leases allow the kubelet to send heartbeats so that the control plane can detect node failure.
  • Resource Quotas can be defined for each namespace to limit the resources consumed.
  • Resources within the same namespace can refer to each other simply by their Service names.
  • Resources across namespaces can be reached using the fully qualified DNS name <<service_name>>.<<namespace_name>>.svc.cluster.local (see the sketch below).
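A minimal sketch of a namespace with an optional ResourceQuota (names and limits are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
# Optional quota limiting the resources consumed within the namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 8Gi
```

A Service named web in team-a would then be reachable from other namespaces as web.team-a.svc.cluster.local.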

Practice Namespace Exercises

Pods

  • A Kubernetes pod is a group of containers and is the smallest unit that Kubernetes administers.
  • Each pod has a single IP address that is shared by every container within the pod.
  • Pods are always co-located and co-scheduled and run in a shared context.
  • Containers in a pod share the same resources such as memory and storage.
  • Shared context allows the individual Linux containers inside a pod to be treated collectively as a single application as if all the containerized processes were running together on the same host in more traditional workloads.
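A minimal Pod manifest for reference (image and names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
    - name: nginx
      image: nginx:1.25
      ports:
        - containerPort: 80
```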

Practice Pod Exercises

ReplicaSet

  • A ReplicaSet maintains a stable set of replica Pods running at any given time. It helps guarantee the availability of a specified number of identical Pods.
  • ReplicaSet includes the pod definition template, a selector to match the pods, and a number of replicas.
  • ReplicaSet then fulfills its purpose by creating and deleting Pods as needed to reach the desired replica number using the Pod template.
  • It is recommended to use Deployments instead of directly using ReplicaSets, as they help manage ReplicaSets and provide declarative updates to Pods.

Practice ReplicaSet Exercises

Deployment

  • Deployment provides declarative updates for Pods and ReplicaSets.
  • Deployments describe the number of desired identical pod replicas to run and the preferred update strategy used when updating the deployment.
  • A Deployment runs multiple replicas of your application and automatically replaces any instances that fail or become unresponsive.
  • Deployments represent a set of multiple, identical Pods with no unique identities.
  • Deployments are well-suited for stateless applications that use ReadOnlyMany or ReadWriteMany volumes mounted on multiple replicas but are not well-suited for workloads that use ReadWriteOnce volumes. Use StatefulSets instead.
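A minimal sketch of a Deployment with an explicit rolling-update strategy (all names and numbers are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra Pod during an update
      maxUnavailable: 1    # at most one Pod unavailable during an update
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
```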

Deploy Container Resources

Practice Deployment Exercises

Services

  • Service is an abstraction over the pods, and essentially, the only interface the various application consumers interact with.
  • The lifetime of an individual pod cannot be relied upon; everything from their IP addresses to their very existence is prone to change.
  • Kubernetes doesn’t treat its pods as unique, long-running instances; if a pod encounters an issue and dies, it’s Kubernetes’ job to replace it so that the application doesn’t experience any downtime.
  • As pods are replaced, their internal names and IPs might change.
  • A service exposes a single machine name or IP address mapped to pods whose underlying names and numbers are unreliable.
  • A service ensures that, to the outside network, everything appears to be unchanged.
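A minimal ClusterIP Service selecting the Pods labelled app: web (ports and names are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web          # traffic is routed to Pods carrying this label
  ports:
    - port: 80        # port exposed by the Service
      targetPort: 80  # port on the Pod/container
```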

Practice Services Exercises

Ingress


  • Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
  • Traffic routing is controlled by rules defined on the Ingress resource.
  • An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL/TLS and offer name-based virtual hosting
  • An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
  • An Ingress with no rules sends all traffic to a single default backend.
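A minimal sketch of an Ingress routing a host to a backend Service (the host and Service name are illustrative, and an Ingress controller must be installed for it to take effect):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
    - host: web.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```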

Practice Ingress Exercises

DaemonSet

  • A DaemonSet ensures that all (or some) Nodes run a copy of a Pod.
  • DaemonSet ensures pods are added to the newly created nodes and garbage collected as nodes are removed.
  • Some typical uses of a DaemonSet are:
    • running a cluster storage daemon on every node
    • running a logs collection daemon on every node
    • running a node monitoring daemon on every node
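A minimal sketch of a DaemonSet running a (hypothetical) log-collection agent on every node:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
        - name: agent
          image: fluent/fluent-bit:2.2   # illustrative log-collection image
```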

Refer DaemonSet Exercises

StatefulSet

StatefulSet Architecture

  • StatefulSet is ideal for stateful applications using ReadWriteOnce volumes.
  • StatefulSets are designed to deploy stateful applications and clustered applications that save data to persistent storage, such as persistent disks.
  • StatefulSets represent a set of Pods with unique, persistent identities and stable hostnames that Kubernetes maintains regardless of where they are scheduled.
  • State information and other resilient data for any given StatefulSet Pod are maintained in persistent disk storage associated with the StatefulSet.
  • StatefulSets use an ordinal index for the identity and ordering of their Pods. By default, StatefulSet Pods are deployed in sequential order and are terminated in reverse ordinal order.
  • StatefulSets are suitable for deploying Kafka, MySQL, Redis, ZooKeeper, and other applications needing unique, persistent identities and stable hostnames.
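A minimal sketch of a StatefulSet with per-Pod persistent storage (names and sizes are illustrative; a matching headless Service named db is assumed to exist):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db              # headless Service providing the stable hostnames
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: redis
          image: redis:7
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:        # one PVC per Pod (data-db-0, data-db-1, ...)
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```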

ConfigMaps

  • ConfigMap helps to store non-confidential data in key-value pairs.
  • Pods can consume ConfigMaps as environment variables, command-line arguments, or as configuration files in a volume.
  • ConfigMap helps decouple environment-specific configuration from the container images so that the applications are easily portable.
  • ConfigMap does not provide secrecy or encryption. If the data you want to store are confidential, use a Secret rather than a ConfigMap, or use additional (third party) tools to keep your data private.
  • A ConfigMap is not designed to hold large chunks of data and cannot exceed 1 MiB.
  • ConfigMap can be configured on a container inside a Pod as
    • Inside a container command and args
    • Environment variables for a container
    • Add a file in read-only volume, for the application to read
    • Write code to run inside the Pod that uses the Kubernetes API to read a ConfigMap
  • ConfigMap can be configured to be immutable, as it helps
    • protect from accidental (or unwanted) updates that could cause application outages
    • improve performance of the cluster by significantly reducing the load on kube-apiserver, by closing watches for ConfigMaps marked as immutable.
  • Once a ConfigMap is marked as immutable, it is not possible to revert this change nor to mutate the contents of the data or the binaryData field. The ConfigMap needs to be deleted and recreated.
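A minimal sketch showing two consumption styles, an environment variable and a mounted file (keys and values are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_MODE: production
  app.properties: |
    greeting=hello
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo $APP_MODE; cat /etc/config/app.properties; sleep 3600"]
      env:
        - name: APP_MODE
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_MODE
      volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: app-config
```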

Practice ConfigMaps Exercises

Secrets

  • Secret provides a container for sensitive data such as a password without putting the information in a Pod specification or in a container image.
  • Secrets are similar to ConfigMaps but are specifically intended to hold confidential data.
  • Secrets are not really encrypted but only base64 encoded.
  • Secrets are, by default, stored unencrypted in the API server’s underlying data store (etcd). Anyone with API access can retrieve or modify a Secret, and so can anyone with access to etcd. Additionally, anyone who is authorized to create a Pod in a namespace can use that access to read any Secret in that namespace; this includes indirect access such as the ability to create a Deployment.
  • To safeguard secrets, take at least the following steps:
    • Enable Encryption at Rest for Secrets.
    • Enable or configure RBAC rules that restrict reading data in Secrets.
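A minimal sketch of a Secret consumed as environment variables (values are illustrative; stringData is a convenience field that the API server base64-encodes into data):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: admin
  password: change-me
---
apiVersion: v1
kind: Pod
metadata:
  name: db-client
spec:
  containers:
    - name: client
      image: busybox
      command: ["sh", "-c", "echo connecting as $DB_USER; sleep 3600"]
      env:
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```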

Practice Secrets Exercises

Jobs & Cron Jobs

  • Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate.
  • As pods successfully complete, the Job tracks the successful completions.
  • When a specified number of successful completions is reached, the task (ie, Job) is complete.
  • Deleting a Job will clean up the Pods it created. Suspending a Job will delete its active Pods until the Job is resumed again.
  • A Job can run multiple Pods in parallel using the parallelism field.
  • A CronJob creates Jobs on a repeating schedule.
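A minimal sketch of a Job with parallelism and a CronJob (schedules, images, and commands are illustrative):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  completions: 3        # total successful Pod completions required
  parallelism: 2        # Pods allowed to run at the same time
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: pi
          image: perl:5.34
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(100)"]
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"   # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox
              command: ["sh", "-c", "date; echo generating report"]
```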

Practice Jobs Exercises

Volumes

Kubernetes Volumes

  • Container on-disk files are ephemeral and lost if the container crashes.
  • Kubernetes supports Persistent volumes that exist beyond the lifetime of a pod. When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes.
  • Persistent Volumes are supported using the API resources
    • PersistentVolume (PV)
      • is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
      • is a cluster-level resource and not bound to a namespace
      • are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
    • PersistentVolumeClaim (PVC)
      • is a request for storage by a user.
      • is similar to a Pod.
      • Pods consume node resources and PVCs consume PV resources.
      • Pods can request specific levels of resources (CPU and Memory).
      • Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, see AccessModes).
  • Persistent Volumes can be provisioned
    • Statically – where the cluster administrator creates the PVs, which are available for use by cluster users
    • Dynamically – using StorageClasses, where the cluster may try to dynamically provision a volume especially for the PVC.
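A minimal sketch of static provisioning: a PV created by an administrator, a PVC requesting it, and a Pod mounting the claim (paths and sizes are illustrative; the hostPath type is only suitable for single-node testing, and a default StorageClass may bind the claim dynamically instead):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/pv-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-data
```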

Practice Volumes Exercises

Labels & Annotations

  • Labels and Annotations attach metadata to objects in Kubernetes.
  • Labels
    • are key/value pairs that can be attached to Kubernetes objects such as Pods and ReplicaSets.
    • can be arbitrary and are useful for attaching identifying information to Kubernetes objects.
    • provide the foundation for grouping objects and can be used to organize and to select subsets of objects.
    • are used in conjunction with selectors to identify groups of related resources.
  • Annotations
    • provide a storage mechanism that resembles labels
    • are key/value pairs designed to hold non-identifying information that can be leveraged by tools and libraries.
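A small sketch of both kinds of metadata on a Pod (keys and values are illustrative); the labels can then be matched by selectors, e.g. kubectl get pods -l app=web,tier=frontend:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:                       # identifying metadata, matched by selectors
    app: web
    tier: frontend
  annotations:                  # non-identifying metadata for tools and libraries
    example.com/build: "2024-05-01"
spec:
  containers:
    - name: nginx
      image: nginx
```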

Practice Labels & Annotations Exercises

Nodes

  • A Kubernetes node manages and runs pods; it’s the machine (whether virtualized or physical) that performs the given work.
  • Just as pods collect individual containers that operate together, a node collects entire pods that function together.
  • When you’re operating at scale, you want to be able to hand work over to a node whose pods are free to take it.

Practice Nodes Exercises

Kubernetes Architecture


  • A Kubernetes cluster consists of at least one main (control) plane, and one or more worker machines, called nodes.
  • Both the control planes and node instances can be physical devices, virtual machines, or instances in the cloud.
  • In managed Kubernetes environments like AWS EKS, GCP GKE, and Azure AKS, the control plane is managed by the cloud provider.

Kubernetes Architecture

Control Plane

  • The control plane is also known as a master node or head node.
  • The control plane manages the worker nodes and the Pods in the cluster.
  • In production environments, the control plane usually runs across multiple computers and a cluster usually runs multiple nodes, providing fault-tolerance and high availability.
  • It is not recommended to run user workloads on the master node.
  • The Control plane’s components make global decisions about the cluster, as well as detect and respond to cluster events.
  • The control plane receives input from a CLI or UI via an API.

API Server (kube-apiserver)

  • API server exposes a REST interface to the Kubernetes cluster. It is the front end for the Kubernetes control plane.
  • All operations against pods, services, and so forth, are executed programmatically by communicating with the endpoints provided by it.
  • It tracks the state of all cluster components and manages the interaction between them.
  • It is designed to scale horizontally.
  • It consumes YAML/JSON manifest files.
  • It validates and processes the requests made via API.

etcd (key-value store)

  • Etcd is a consistent, distributed, and highly-available key-value store.
  • is stateful, persistent storage that stores all of Kubernetes cluster data (cluster state and config).
  • is the source of truth for the cluster.
  • can be part of the control plane, or, it can be configured externally.
  • ETCD benefits include
    • Fully replicated: Every node in an etcd cluster has access to the full data store.
    • Highly available: etcd is designed to have no single point of failure and gracefully tolerate hardware failures and network partitions.
    • Reliably consistent: Every data ‘read’ returns the latest data ‘write’ across the cluster.
    • Fast: etcd has been benchmarked at 10,000 writes per second.
    • Secure: etcd supports automatic Transport Layer Security (TLS) and optional secure socket layer (SSL) client certificate authentication.
    • Simple: Any application, from simple web apps to highly complex container orchestration engines such as Kubernetes, can read or write data to etcd using standard HTTP/JSON tools.

Scheduler (kube-scheduler)

  • The scheduler is responsible for assigning work to the various nodes. It keeps watch over the resource capacity and ensures that a worker node’s performance is within an appropriate threshold.
  • It schedules pods to worker nodes.
  • It watches api-server for newly created Pods with no assigned node, and selects a healthy node for them to run on.
  • If there are no suitable nodes, the pods are put in a pending state until such a healthy node appears.
  • It watches API Server for new work tasks.
  • Factors taken into account for scheduling decisions include:
    • Individual and collective resource requirements.
    • Hardware/software/policy constraints.
    • Affinity and anti-affinity specifications.
    • Data locality.
    • Inter-workload interference.
    • Deadlines and taints.

Controller Manager (kube-controller-manager)

  • Controller manager is responsible for making sure that the shared state of the cluster is operating as expected.
  • It watches the desired state of the objects it manages and watches their current state through the API server.
  • It takes corrective steps to make sure that the current state is the same as the desired state.
  • It is a controller of controllers.
  • It runs controller processes. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
  • Some types of controllers are:
    • Node controller: Responsible for noticing and responding when nodes go down.
    • Job controller: Watches for Job objects that represent one-off tasks, then creates Pods to run those tasks to completion.
    • Endpoints controller: Populates the Endpoints object (that is, joins Services & Pods).
    • Service Account & Token controllers: Create default accounts and API access tokens for new namespaces.

Cloud Controller Manager

  • The cloud controller manager integrates with the underlying cloud technologies in your cluster when the cluster is running in a cloud environment.
  • The cloud-controller-manager only runs controllers that are specific to your cloud provider.
  • The cloud controller manager lets you link your cluster into your cloud provider’s API and separates out the components that interact with that cloud platform from components that only interact with your cluster.
  • The following controllers can have cloud provider dependencies:
    • Node controller: For checking the cloud provider to determine if a node has been deleted in the cloud after it stops responding.
    • Route controller: For setting up routes in the underlying cloud infrastructure.
    • Service controller: For creating, updating, and deleting cloud provider load balancers.

Data Plane Worker Node(s)

  • The data plane is known as the worker node or compute node.
  • A virtual or physical machine that contains the services necessary to run containerized applications.
  • A Kubernetes cluster needs at least one worker node, but normally has many.
  • The worker node(s) host the Pods that are the components of the application workload.
  • Pods are scheduled and orchestrated to run on nodes.
  • Cluster can be scaled up and down by adding and removing nodes.
  • Node components run on every node, maintaining running pods and providing the Kubernetes runtime environment.

kubelet

  • A Kubelet tracks the state of a pod to ensure that all the containers are running and healthy.
  • provides a heartbeat message every few seconds to the control plane.
  • runs as an agent on each node in the cluster.
  • acts as a conduit between the API server and the node.
  • instantiates and executes Pods.
  • watches API Server for work tasks.
  • gets instructions from the control plane (master) and reports back to it.

kube-proxy

  • Kube proxy is a networking component that routes traffic coming into a node from the service to the correct containers.
  • is a network proxy that runs on each node in a cluster.
  • manages IP translation and routing.
  • maintains network rules on nodes. These network rules allow network communication to Pods from inside or outside of the cluster.
  • builds on the pod network model, in which each Pod gets a unique IP address that is shared by all containers in the Pod.
  • facilitates Kubernetes networking services and load-balancing across all pods in a service.
  • It deals with individual host sub-netting and ensures that the services are available to external parties.

Container runtime

  • Container runtime is responsible for running containers (in Pods).
  • Kubernetes supports any implementation of the Kubernetes Container Runtime Interface CRI specifications
  • To run the containers, each worker node has a container runtime engine.
  • It pulls images from a container image registry and starts and stops containers.
  • Kubernetes supports several container runtimes, such as containerd and CRI-O.

 

Kubernetes Overview


  • Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation.
  • Kubernetes originates from Greek, meaning helmsman or pilot.
  • Kubernetes provides an orchestration framework to run distributed systems resiliently. It takes care of scaling and failover for the application, provides deployment patterns, and more.

Container Deployment Model

Deployment evolution

  • Containers are similar to VMs, but they have relaxed isolation properties to share the Operating System (OS) among the applications.
  • Containers are lightweight and have their own filesystem, share of CPU, memory, process space, and more.
  • Containers are decoupled from the underlying infrastructure, they are portable across clouds and OS distributions.
  • Containers provide the following benefits
    • Agile application creation and deployment
    • Continuous development, integration, and deployment
    • Dev and Ops separation of concerns
    • Observability
    • Environmental consistency across development, testing, and production
    • Cloud and OS distribution portability
    • Application-centric management
    • Loosely coupled, distributed, elastic, liberated micro-services
    • Resource isolation & utilization

Kubernetes Features

  • Service discovery and load balancing
    • Kubernetes can expose a container using the DNS name or using its own IP address.
    • If traffic to a container is high, Kubernetes is able to load balance and distribute the network traffic so that the deployment is stable.
  • Storage orchestration
    • Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers, and more.
  • Automated rollouts and rollbacks
    • Kubernetes can change the actual state of the deployed containers to the desired state at a controlled rate ensuring zero downtime.
  • Automatic bin packing
    • Kubernetes can fit containers onto the available nodes to make the best use of the resources as per the specified container specification.
  • Self-healing & High Availability
    • Kubernetes restarts containers that fail, replaces containers, kills containers that don’t respond to the user-defined health check, and doesn’t advertise them to clients until they are ready to serve.
  • Scalability
    • Kubernetes can help scale the application as per the load.
  • Secret and configuration management
    • Kubernetes helps store and manage sensitive information, such as passwords, OAuth tokens, and SSH keys.
    • Secrets and application configuration can be deployed without rebuilding the container images, and without exposing secrets in the stack configuration.

Kubernetes Architecture

Refer to detailed blog post @ Kubernetes Architecture

Kubernetes Architecture

Master components

  • Master components provide the cluster’s control plane.
  • Master components make global decisions about the cluster (for example, scheduling), and they detect and respond to cluster events (for example, starting a replacement pod when a deployment’s replicas field is unsatisfied).
  • Master components include
    • Kube-API server – Exposes the API.
    • Etcd – key-value store for all cluster data. (Can be run on the same server as a master node or on a dedicated cluster.)
    • Kube-scheduler – Schedules new pods on worker nodes.
    • Kube-controller-manager – Runs the controllers.
    • Cloud-controller-manager – Talks to cloud providers.

Node components

  • Node components run on every node, maintaining running pods and providing the Kubernetes runtime environment.
    • Kubelet – Agent that ensures containers in a pod are running.
    • Kube-proxy – Keeps network rules and performs forwarding.
    • Container runtime – Runs containers.

Kubernetes Components

Refer to blog post @ Kubernetes Components

Kubernetes Security

Refer to blog post @ Kubernetes Security