Google Cloud – Professional Cloud Security Engineer Certification learning path

Continuing on the Google Cloud journey, I have just cleared the Professional Cloud Security Engineer certification. The Google Cloud – Professional Cloud Security Engineer exam focuses on almost all of the Google Cloud security services, and covers the storage, compute, and networking services from their security aspects only.

Google Cloud – Professional Cloud Security Engineer Certification Summary

  • Has 50 questions to be answered in 2 hours.
  • Covers a wide range of Google Cloud services mainly focusing on security and network services
  • As mentioned for all the exams, hands-on is a MUST. If you have not worked on GCP before, make sure you do lots of labs, else you would be absolutely clueless about some of the questions and commands.
  • I did the Coursera and A Cloud Guru courses, which are really vast, but hands-on or practical knowledge is a MUST.

Google Cloud – Professional Cloud Security Engineer Certification Resources

Google Cloud – Professional Cloud Security Engineer Certification Topics

Security Services

  • Google Cloud – Security Services Cheat Sheet
  • Cloud Key Management Service – KMS
    • Cloud KMS provides a centralized, scalable, fast cloud key management service to manage encryption keys
    • KMS Key is a named object containing one or more key versions, along with metadata for the key.
    • KMS key rings group keys with related permissions, allowing you to grant, revoke, or modify permissions to those keys at the key ring level without needing to act on each key individually (see the configuration sketch after this list, which also shows a Cloud Armor policy).
  • Cloud Armor
    • Cloud Armor protects the applications from multiple types of threats, including DDoS attacks and application attacks like XSS and SQLi
    • works with the external HTTP(S) load balancer to automatically block network protocol and volumetric DDoS attacks such as protocol floods (SYN, TCP, HTTP, and ICMP) and amplification attacks (NTP, UDP, DNS)
    • with GKE needs to be configured with GKE Ingress
    • can be used to blacklist IPs
    • supports preview mode to understand patterns without blocking the users
  • Cloud Identity-Aware Proxy
    • Identity-Aware Proxy IAP allows managing access to HTTP-based apps both on Google Cloud and outside of Google Cloud.
    • IAP uses Google identities and IAM, and can also leverage external identity providers like OAuth with Facebook or Microsoft, SAML, etc.
    • Signed headers using JWT provide secondary security in case someone bypasses IAP.
  • Cloud Data Loss Prevention – DLP
    • Cloud Data Loss Prevention – DLP is a fully managed service designed to help discover, classify, and protect the most sensitive data.
    • provides two key features
      • Classification is the process of inspecting the data to know what data you have, how sensitive it is, and the likelihood of a match.
      • De-identification is the process of removing, masking, redacting, or replacing sensitive information in the data.
    • supports text, image, and storage classification with scans on data stored in Cloud Storage, Datastore, and BigQuery
    • supports scanning of binary, text, image, Microsoft Word, PDF, and Apache Avro files
  • Web Security Scanner
    • Web Security Scanner identifies security vulnerabilities in the App Engine, GKE, and Compute Engine web applications.
    • scans provide information about application vulnerability findings, like OWASP issues, cross-site scripting (XSS), Flash injection, outdated libraries, clear-text passwords, or use of mixed content
  • Security Command Center – SCC
    • is a Security and risk management platform that helps generate curated insights and provides a unique view of incoming threats and attacks to the assets
    • displays possible security risks, called findings, that are associated with each asset.
  • Forseti Security
    • is an open-source security toolkit for GCP whose findings can be fed into third-party security information and event management (SIEM) applications
    • keeps track of the environment with inventory snapshots of GCP resources on a recurring cadence
  • Access Context Manager
    • Access Context Manager allows organization administrators to define fine-grained, attribute-based access control for projects and resources
    • Access Context Manager helps reduce the size of the privileged network and move to a model where endpoints do not carry ambient authority based on the network.
    • Access Context Manager helps prevent data exfiltration with proper access levels and security perimeter rules
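
As a rough illustration of the KMS and Cloud Armor points above, here is a minimal Terraform sketch (all names, the project, and the IP range are assumed placeholders, not values from the exam or this post):

    # Assumed project/region; adjust to your environment.
    provider "google" {
      project = "my-security-project"
      region  = "us-central1"
    }

    # A key ring groups keys; IAM permissions can be granted at the key ring level.
    resource "google_kms_key_ring" "keyring" {
      name     = "app-keyring"
      location = "us-central1"
    }

    # A key with automatic rotation every 90 days.
    resource "google_kms_crypto_key" "key" {
      name            = "app-key"
      key_ring        = google_kms_key_ring.keyring.id
      rotation_period = "7776000s"
    }

    # Cloud Armor policy: deny a blacklisted range, in preview mode to observe impact first.
    resource "google_compute_security_policy" "policy" {
      name = "edge-policy"

      rule {
        action   = "deny(403)"
        priority = 1000
        preview  = true                  # log matches without actually blocking users
        match {
          versioned_expr = "SRC_IPS_V1"
          config {
            src_ip_ranges = ["198.51.100.0/24"]
          }
        }
      }

      rule {                             # default rule required by the policy
        action   = "allow"
        priority = 2147483647
        match {
          versioned_expr = "SRC_IPS_V1"
          config {
            src_ip_ranges = ["*"]
          }
        }
      }
    }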

Compliance

  • FIPS 140-2 Validated
    • FIPS 140-2 Validated certification was established to aid in the protection of digitally stored unclassified, yet sensitive, information.
    • Google Cloud uses a FIPS 140-2 validated encryption module called BoringCrypto in the production environment. This means that both data in transit to the customer and between data centers, and data at rest are encrypted using FIPS 140-2 validated encryption.
    • BoringCrypto module that achieved FIPS 140-2 validation is part of the BoringSSL library.
    • BoringSSL library as a whole is not FIPS 140-2 validated
  • PCI/DSS Compliance
    • PCI/DSS compliance is a shared responsibility model
    • Egress rules cannot be controlled for App Engine, Cloud Functions, and Cloud Storage. Google recommends using Compute Engine and GKE to ensure that all egress traffic is authorized.
    • Antivirus software and File Integrity monitoring must be used on all systems commonly affected by malware to protect systems from current and evolving malicious software threats including containers
    • For payment processing, security can be improved and compliance proved by isolating each of these environments into its own VPC network, reducing the scope of systems subject to PCI audit standards

Networking Services

  • Refer Google Cloud Security Services Cheat Sheet
  • Virtual Private Cloud
    • Understand Virtual Private Cloud (VPC), subnets, and host applications within them
    • Firewall rules control the traffic to and from instances. HINT: rules with lower integers indicate higher priorities. Firewall rules can be applied to specific tags (see the sketch after this list).
    • Know implied firewall rules which deny all ingress and allow all egress
    • Understand the difference between using Service Account vs Network Tags for filtering in Firewall rules. HINT: Use SA over tags as it provides access control while tags can be easily inferred.
    • VPC Peering allows internal or private IP address connectivity across two VPC networks regardless of whether they belong to the same project or the same organization. HINT: VPC Peering uses private IPs and does not support transitive peering
    • Shared VPC allows an organization to connect resources from multiple projects to a common VPC network so that they can communicate with each other securely and efficiently using internal IPs from that network
    • Private Access options for services allow instances with only internal IP addresses to communicate with Google APIs and services.
    • Private Google Access allows VMs to connect to the set of external IP addresses used by Google APIs and services by enabling Private Google Access on the subnet used by the VM’s network interface.
    • VPC Flow Logs records a sample of network flows sent from and received by VM instances, including instances used as GKE nodes.
    • Firewall Rules Logging enables auditing, verifying, and analyzing the effects of the firewall rules
  • Hybrid Connectivity
    • Understand Hybrid Connectivity options in terms of security.
    • Cloud VPN provides secure connectivity from the on-premises data center to the GCP network through the public internet. Cloud VPN does not provide internal or private IP connectivity
    • Cloud Interconnect provides direct connectivity from the on-premises data center to the GCP network
  • Cloud NAT
    • Cloud NAT allows VM instances without external IP addresses and private GKE clusters to send outbound packets to the internet and receive any corresponding established inbound response packets.
    • Requests would not be routed through Cloud NAT if they have an external IP address
  • Cloud DNS
    • Understand Cloud DNS and its features 
    • supports DNSSEC, a feature of DNS, that authenticates responses to domain name lookups and protects the domains from spoofing and cache poisoning attacks
  • Cloud Load Balancing
    • Google Cloud Load Balancing provides scaling, high availability, and traffic management for your internet-facing and private applications.
    • Understand the Google Load Balancing options and their use cases, especially which are global vs regional, external vs internal, and whether they support SSL offloading
      • Network Load Balancer – regional, external, pass through and supports TCP/UDP
      • Internal TCP/UDP Load Balancer – regional, internal, pass through and supports TCP/UDP
      • HTTP/S Load Balancer – global (regional with Standard Tier), external, proxy-based and supports HTTP/S with SSL offload
      • Internal HTTP/S Load Balancer – regional, internal, proxy-based and supports HTTP/S
      • SSL Proxy Load Balancer – regional/global, external, proxy, supports SSL with SSL offload capability
      • TCP Proxy Load Balancer – regional/global, external, proxy, supports TCP without SSL offload capability
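
A minimal Terraform sketch of the VPC points above (names and CIDRs are assumed placeholders; a configured google provider is assumed): a subnet with Private Google Access enabled and a tag-scoped firewall rule with an explicit priority.

    resource "google_compute_network" "vpc" {
      name                    = "demo-vpc"
      auto_create_subnetworks = false
    }

    resource "google_compute_subnetwork" "private" {
      name                     = "private-subnet"
      network                  = google_compute_network.vpc.id
      region                   = "us-central1"
      ip_cidr_range            = "10.10.0.0/24"
      private_ip_google_access = true   # reach Google APIs without external IPs
    }

    # Lower priority number = higher priority; applied only to instances tagged "web".
    resource "google_compute_firewall" "allow_https" {
      name          = "allow-https"
      network       = google_compute_network.vpc.id
      priority      = 1000
      direction     = "INGRESS"
      source_ranges = ["0.0.0.0/0"]
      target_tags   = ["web"]

      allow {
        protocol = "tcp"
        ports    = ["443"]
      }
    }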

Identity Services

  • Resource Manager
    • Understand the Resource Manager hierarchy: Organization -> Folders -> Projects -> Resources
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.
  • Identity and Access Management
    • Identity and Access Management – IAM provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
    • A service account is a special kind of account used by an application or a virtual machine (VM) instance, not a person.
    • Service Account, if accidentally deleted, can be recovered if the time gap is less than 30 days and a service account by the same name wasn’t created
    • Understand IAM Best Practices
      • Use groups for users requiring the same responsibilities
      • Use service accounts for server-to-server interactions (see the sketch after this list).
      • Use Organization Policy Service to get centralized and programmatic control over the organization’s cloud resources.
    • Domain-wide delegation of authority grants third-party and internal applications access to the users’ data, e.g. Google Drive.
  • Cloud Identity
    • Cloud Identity provides IDaaS (Identity as a Service) with single sign-on functionality and federation with external identity providers like Active Directory.
    • Cloud Identity supports federating with Active Directory using GCDS to implement the synchronization
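
A minimal Terraform sketch of the service account best practice above (the account id, project, and role are assumed placeholders): a dedicated service account granted a single least-privilege role.

    resource "google_service_account" "app" {
      account_id   = "app-runtime"
      display_name = "Application runtime service account"
    }

    # Grant only the role the workload actually needs, at the project level.
    resource "google_project_iam_member" "app_logging" {
      project = "my-security-project"
      role    = "roles/logging.logWriter"
      member  = "serviceAccount:${google_service_account.app.email}"
    }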

Compute Services

  • Compute services like Google Compute Engine and Google Kubernetes Engine are covered lightly, mostly from the security aspects
  • Google Compute Engine
    • Google Compute Engine is the best IaaS option for compute and provides fine-grained control
    • Managing access using OS Login or project and instance metadata
    • Compute Engine is recommended to be used with Service Account with the least privilege to provide access to Google services and the information can be queried from instance metadata.
  • Google Kubernetes Engine
    • Google Kubernetes Engine enables running containers on Google Cloud
    • Understand Best Practices for Building Containers
      • Package a single app per container
      • Properly handle PID 1, signal handling, and zombie processes
      • Optimize for the Docker build cache
      • Remove unnecessary tools
      • Build the smallest image possible
      • Scan images for vulnerabilities
      • Restrict the use of public images
      • Use managed base images

Storage Services

  • Cloud Storage
    • Cloud Storage is cost-effective object storage for unstructured data and provides an option for long term data retention
    • Understand Cloud Storage Security features
      • Understand various Data Encryption techniques including Envelope Encryption, CMEK, and CSEK. HINT: CSEK works with Cloud Storage and Persistent Disks only. CSEK manages KEK and not DEK.
      • Cloud Storage default encryption uses AES256
      • Understand Signed URL to give temporary access and the users do not need to be GCP users
      • Understand access control and permissions – IAM (Uniform) vs ACLs (fine-grained control)
      • Bucket Lock feature allows configuring a data retention policy for a bucket that governs how long objects in the bucket must be retained. The feature also allows locking the data retention policy, permanently preventing the policy from being reduced or removed (see the sketch after this list).
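
A minimal Terraform sketch of the Cloud Storage security features above (the bucket name and retention period are assumed placeholders; the CMEK reference assumes the KMS key from the earlier sketch):

    resource "google_storage_bucket" "archive" {
      name                        = "example-archive-bucket"
      location                    = "US"
      uniform_bucket_level_access = true     # IAM-only access, no per-object ACLs

      encryption {
        # CMEK instead of Google-managed keys; the Cloud Storage service agent
        # needs roles/cloudkms.cryptoKeyEncrypterDecrypter on this key.
        default_kms_key_name = google_kms_crypto_key.key.id
      }

      retention_policy {
        retention_period = 2592000   # 30 days, in seconds
        is_locked        = false     # set to true to permanently lock (Bucket Lock)
      }
    }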

Monitoring

  • Google Cloud Monitoring or Stackdriver
    • provides everything from monitoring, alerting, error reporting, metrics, diagnostics, debugging, and tracing.
  • Google Cloud Logging or Stackdriver logging
    • Audit logs are provided through Cloud logging using Admin Activity and Data Access Audit logs
    • VPC Flow logs and Firewall Rules logs help monitor traffic to and from Compute Engine instances.
    • Log sinks can export data to external providers via Cloud Pub/Sub (see the sketch after this list)
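
A minimal Terraform sketch of exporting logs to Pub/Sub for an external SIEM (the topic, sink name, and filter are assumed placeholders):

    resource "google_pubsub_topic" "log_export" {
      name = "audit-log-export"
    }

    resource "google_logging_project_sink" "to_pubsub" {
      name                   = "audit-to-pubsub"
      destination            = "pubsub.googleapis.com/${google_pubsub_topic.log_export.id}"
      filter                 = "logName:\"cloudaudit.googleapis.com\""
      unique_writer_identity = true   # creates a dedicated service account for the sink
    }

    # The sink's writer identity still needs publish rights on the topic.
    resource "google_pubsub_topic_iam_member" "sink_writer" {
      topic  = google_pubsub_topic.log_export.name
      role   = "roles/pubsub.publisher"
      member = google_logging_project_sink.to_pubsub.writer_identity
    }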

All the Best !!

HashiCorp Certified Terraform Associate Learning Path

If you are working in a multi-cloud environment and focusing on automation, you would surely have been using Terraform or considered it at some point of time. I have been using Terraform for over two years now for provisioning infrastructure on AWS, GCP and AliCloud, right through development to production, and it has been a wonderful DevOps journey. It was good to validate the Terraform skills through the Terraform Associate certification.

The Terraform Associate certification is for Cloud Engineers specializing in operations, IT, or development who know the basic concepts and skills associated with open source HashiCorp Terraform.

HashiCorp Certified Terraform Associate Exam Summary

  • HashiCorp Certified Terraform Associate exam focuses on Terraform as an Infrastructure as Code tool
  • HashiCorp Certified Terraform Associate exam has 57 questions with a time limit of 60 minutes
  • Exam has a multi answer, multiple choice, fill in the blanks and True/False type of questions
  • Questions and answer options are pretty short, and if you have experience with Terraform they are pretty easy, and the time is more than sufficient.

HashiCorp Certified Terraform Associate Exam Topic Summary

Refer Terraform Cheat Sheet for details

Understand Infrastructure as Code (IaC) concepts

  • Explain what IaC is
    • Infrastructure is described using a high-level configuration syntax
    • IaC allows Infrastructure to be versioned and treated as you would any other code.
    • Infrastructure can be shared and re-used.
  • Describe advantages of IaC patterns
    • makes Infrastructure more reliable
    • makes Infrastructure more manageable
    • makes Infrastructure more automated and less error prone

Understand Terraform’s purpose (vs other IaC)

  • Explain multi-cloud and provider-agnostic benefits
    • using a multi-cloud setup increases fault tolerance and reduces dependency on a single cloud provider
    • Terraform provides a cloud-agnostic framework and allows a single configuration to be used to manage multiple providers, and to even handle cross-cloud dependencies.
    • Terraform simplifies management and orchestration, helping operators build large-scale multi-cloud infrastructures.
  • Explain the benefits of state
    • State is a necessary requirement for Terraform to function.
    • Terraform requires some sort of database to map Terraform config to the real world.
    • Terraform uses its own state structure for mapping configuration to resources in the real world
    • Terraform state helps
      • track metadata such as resource dependencies.
      • improves performance as it stores a cache of the attribute values for all resources in the state
      • aids syncing when using in team with multiple users

Understand Terraform basics

  • Handle Terraform and provider installation and versioning
    • Providers provide an abstraction above the upstream API and are responsible for understanding API interactions and exposing resources.
    • Terraform configurations must declare which providers they require, so that Terraform can install and use them
    • Provider requirements are declared in a required_providers block (see the sketch after this list).
  • Describe plugin based architecture
    • Terraform relies on plugins called “providers” to interact with remote systems.
  • Demonstrate using multiple providers
    • supports multiple provider instances using alias, e.g. multiple aws providers with different regions
  • Describe how Terraform finds and fetches providers
    • Terraform finds and installs providers when initializing a working directory. It can automatically download providers from a Terraform registry, or load them from a local mirror or cache.
    • Each Terraform module must declare which providers it requires, so that Terraform can install and use them.
  • Explain when to use and not use provisioners and when to use local-exec or remote-exec
    • Terraform provides local-exec and remote-exec to execute tasks not provided by Terraform
      • local-exec executes code on the machine running Terraform
      • remote-exec executes on the provisioned resource and supports SSH and WinRM
    • Provisioners should only be used as a last resort.
    • are defined within the resource block.
    • supports creation-time and destroy-time provisioners
      • if a creation-time provisioner fails, the resource is marked as tainted by default (it will be re-created on the next apply)
      • this behavior can be overridden by setting on_failure to continue, which means ignore the failure and continue
      • if a destroy-time provisioner fails, the resources are not removed
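
A minimal Terraform sketch tying the basics above together (provider versions, regions, and the echo command are assumed placeholders): a required_providers block, an aliased second provider instance, and a last-resort local-exec provisioner.

    terraform {
      required_version = ">= 0.13"
      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = "~> 3.0"
        }
      }
    }

    provider "aws" {
      region = "us-east-1"
    }

    # Aliased provider instance for a second region.
    provider "aws" {
      alias  = "eu"
      region = "eu-west-1"
    }

    resource "aws_vpc" "secondary" {
      provider   = aws.eu              # use the aliased provider
      cidr_block = "10.20.0.0/16"

      # local-exec runs on the machine running Terraform; provisioners are a last resort.
      provisioner "local-exec" {
        command    = "echo VPC ${self.id} created"
        on_failure = continue          # ignore a failure instead of tainting the resource
      }
    }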

Use the Terraform CLI (outside of core workflow)

  • Given a scenario: choose when to use terraform fmt to format code
    • terraform fmt formats the configuration files into a canonical style; it usually aligns the spaces and the = signs
  • Given a scenario: choose when to use terraform taint to taint Terraform resources
    • terraform taint marks a Terraform-managed resource as tainted, forcing it to be destroyed and recreated on the next apply.
    • will not modify infrastructure, but does modify the state file in order to mark a resource as tainted.
    • Infrastructure and state are changed in next apply.
    • can be used to taint a resource within a module
  • Given a scenario: choose when to use terraform import to import existing infrastructure into your Terraform state
    • terraform import helps import already-existing external resources, not managed by Terraform, into Terraform state and allow it to manage those resources
    • Terraform is not able to auto-generate configurations for those imported resources, for now, and requires you to first write the resource definition in Terraform and then import this resource
  • Given a scenario: choose when to use terraform workspace to create workspaces
    • Terraform workspaces help manage multiple distinct sets of infrastructure resources or environments with the same code (see the sketch after this list).
    • state files for each workspace are stored in the directory terraform.tfstate.d
    • terraform workspace new dev creates a new workspace with name dev and switches to it as well
    • does not provide strong separation as it uses the same backend
  • Given a scenario: choose when to use terraform state to view Terraform state
    • state helps keep track of the infrastructure Terraform manages
    • stored locally in the terraform.tfstate
    • recommended not to edit the state manually
    • Use terraform state command
      • mv – to move/rename modules
      • rm – to remove a resource from the state without destroying it (Terraform stops managing the resource)
      • pull – to observe current remote state
      • list & show – to inspect resources in the state (useful when writing/debugging modules)
  • Given a scenario: choose when to enable verbose logging and what the outcome/value is
    • debugging can be controlled using TF_LOG, which can be configured for different levels TRACE, DEBUG, INFO, WARN or ERROR, with TRACE being the most verbose.
    • the log path can be controlled with TF_LOG_PATH; TF_LOG also needs to be specified for it to take effect.
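
A minimal sketch of using the current workspace name inside a configuration (the values are assumed placeholders); terraform.workspace lets one configuration vary per environment:

    locals {
      env           = terraform.workspace                         # e.g. "default", "dev", "prod"
      instance_type = local.env == "prod" ? "m5.large" : "t3.micro"
    }

    output "selected_instance_type" {
      value = local.instance_type
    }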

Interact with Terraform modules

  • Contrast module source options
    • Terraform Module Registry allows you to browse, filter and search for modules
  • Interact with module inputs and outputs
    • Input variables serve as parameters for a Terraform module, allowing aspects of the module to be customized without altering the module’s own source code, and allowing modules to be shared between different configurations.
    • Resources defined in a module are encapsulated, so the calling module cannot access their attributes directly.
    • A child module can declare output values to selectively export certain values, which the calling module can access as module.<module_name>.<output_name>
  • Describe variable scope within modules/child modules
    • Modules are called from within other modules using module blocks
    • All modules require a source argument, which is a meta-argument defined by Terraform
    • To call a module means to include the contents of that module in the configuration with specific values for its input variables (see the sketch after this list).
  • Discover modules from the public Terraform Module Registry
    • Terraform Module Registry allows you to browse, filter and search for modules
  • Defining module version
    • must be on GitHub and must be a public repo, if using public registry.
    • must be named terraform-<PROVIDER>-<NAME>, where <NAME> reflects the type of infrastructure the module manages and <PROVIDER> is the main provider where it creates that infrastructure. for e.g. terraform-google-vault or terraform-aws-ec2-instance.
    • must maintain x.y.z tags for releases to identify module versions, optionally prefixed with a v, for example v1.0.4 and 0.9.2. Tags that don’t look like version numbers are ignored.
    • must maintain a Standard module structure, which allows the registry to inspect the module and generate documentation, track resource usage, parse submodules and examples, and more.
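
A minimal sketch of calling a registry module with a version constraint and reading one of its outputs (the module and its inputs are just an example, not a requirement of the exam):

    module "vpc" {
      source  = "terraform-aws-modules/vpc/aws"
      version = "~> 3.0"

      name = "demo"
      cidr = "10.0.0.0/16"
    }

    # Child module outputs are reached as module.<module_name>.<output_name>.
    output "vpc_id" {
      value = module.vpc.vpc_id
    }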

Navigate Terraform workflow

  • Describe Terraform workflow ( Write -> Plan -> Create )
    • Core Terraform workflow has three steps:
      • Write – Author infrastructure as code.
      • Plan – Preview changes before applying.
      • Apply – Provision reproducible infrastructure.
  • Initialize a Terraform working directory terraform init
    • initializes a working directory containing Terraform configuration files.
    • performs backend initialization, modules and plugins installation.
    • plugins are downloaded into the .terraform/plugins sub-directory of the present working directory
    • does not delete the existing configuration or state
  • Validate a Terraform configuration terraform validate
    • validates the configuration files in a directory, referring only to the configuration and not accessing any remote services such as remote state, provider APIs, etc.
    • verifies whether a configuration is syntactically valid and internally consistent, regardless of any provided variables or existing state.
    • useful for general verification of reusable modules, including the correctness of attribute names and value types.
  • Generate and review an execution plan for Terraform terraform plan
    • terraform plan creates an execution plan as it traverses each vertex of the dependency graph and requests each provider using parallelism
    • calculates the difference between the last-known state and the current state and presents this difference as the output of the terraform plan operation to the user in their terminal
    • does not modify the infrastructure or state.
    • allows a user to see which actions Terraform will perform prior to making any changes to reach the desired state
    • performs refresh for each resource and might hit rate limiting issues as it calls provider APIs
    • all resources refresh can be disabled or avoided using
      • -refresh=false or
      • -target=xxxx or
      • breaking resources into different directories.
  • Execute changes to infrastructure with Terraform terraform apply
    • will always ask for confirmation before executing unless passed the -auto-approve flag.
    • if a resource successfully creates but fails during provisioning, Terraform will error and mark the resource as “tainted”. Terraform does not roll back the changes
  • Destroy Terraform managed infrastructure terraform destroy
    • will always ask for confirmation before executing unless passed the -auto-approve flag.

Implement and maintain state

  • Describe default local backend
    • A “backend” in Terraform determines how state is loaded and how an operation such as apply is executed. This abstraction enables non-local file state storage, remote execution, etc.
    • is responsible for storing state and providing an API for optional state locking
    • needs to be initialized
    • helps
      • collaboration and working as a team, with the state maintained remotely and state locking
      • can provide enhanced security for sensitive data
      • support remote operations
    • local (default) backend stores state in a local JSON file on disk
  • Outline state locking
    • happens for all operations that could write state, if supported by backend for e.g. S3 with DynamoDB, Consul etc.
    • prevents others from acquiring the lock & potentially corrupting the state
    • use force-unlock command to manually unlock the state if unlocking failed
    • backends which support state locking are
      • azurerm
      • Hashicorp consul
      • Tencent Cloud Object Storage (COS)
      • etcdv3
      • Google Cloud Storage GCS
      • HTTP endpoints
      • Kubernetes Secret with locking done using a Lease resource
      • AliCloud Object Storage OSS with locking via TableStore
      • PostgreSQL
      • AWS S3 with locking via DynamoDB
      • Terraform Enterprise
    • Backends which do not support state locking are
      • artifactory
      • etcd
  • Handle backend authentication methods
    • every remote backend supports a different authentication mechanism, which can be configured with the backend configuration
  • Describe remote state storage mechanisms and supported standard backends
    • remote backend stores state remotely like S3, OSS, GCS, Consul and support features like remote operation, state locking, encryption, versioning etc.
    • GitHub is not a supported backend type.
  • Describe effect of Terraform refresh on state
    • terraform refresh is used to reconcile the state Terraform knows about (via its state file) with the real-world infrastructure.
    • can be used to detect any drift from the last-known state, and to update the state file.
    • does not modify infrastructure but does modify the state file.
  • Describe backend block in configuration and best practices for partial configurations
    • Backend configuration doesn’t support interpolations.
    • supports partial configuration with remaining configuration arguments provided as part of the initialization process
    • if switching the backend for the first time, Terraform provides a migration option to move the existing state (see the sketch after this list)
  • Understand secret management in state files
    • terraform state command is used for advanced state management
    • Terraform has no mechanism to redact or protect secrets that are returned via data sources, so secrets read via this provider will be persisted into the Terraform state, into any plan files, and in some cases in the console output produced while planning and applying.
    • secrets can be protected by using Vault, and by using remote backends with encryption and proper access control
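
A minimal sketch of a remote backend with partial configuration (bucket and table names are assumed placeholders). Interpolation is not allowed here, so the omitted bucket can be supplied at init time, e.g. terraform init -backend-config="bucket=my-tf-state":

    terraform {
      backend "s3" {
        # "bucket" intentionally omitted -- provided via -backend-config at init time
        key            = "prod/terraform.tfstate"
        region         = "us-east-1"
        encrypt        = true
        dynamodb_table = "terraform-locks"   # state locking via DynamoDB
      }
    }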

Read, generate, and modify configuration

  • Demonstrate use of variables and outputs
    • Variables
      • serve as parameters for a Terraform module and act like function arguments
      • count is a reserved word and cannot be used as a variable name
    • Output
      • are like function return values.
      • can be marked sensitive, which prevents showing the value in the list of outputs; however, it is still stored in the state as plain text (see the sketch after this list)
  • Describe secure secret injection best practice
  • Understand the use of collection and structural types
    • supports primitive data types of
      • string, number and bool
      • Terraform automatically converts number and bool values to string values when needed
    • supports complex data types of
      • list – sequence of values identified by consecutive whole numbers starting with zero.
      • map – collection of values where each is identified by a string label
      • set – collection of unique values that do not have any secondary identifiers or ordering.
    • supports structural data types of
      • object – a collection of named attributes with their own type
      • tuple – a sequence of elements identified by consecutive whole numbers starting with zero, where each element has its own type.
  • Create and differentiate resource and data configuration
    • Resources describe one or more infrastructure objects, such as virtual networks, instances, or higher-level components such as DNS records.
    • Data sources allow data to be fetched or computed for use elsewhere in Terraform configuration. Use of data sources allows a Terraform configuration to make use of information defined outside of Terraform, or defined by another separate Terraform configuration.
  • Use resource addressing and resource parameters to connect resources together
  • Use Terraform built-in functions to write configuration
    • lookup retrieves the value of a single element from a map, given its key. If the given key does not exist, the given default value is returned instead. lookup(map, key, default)
    • zipmap constructs a map from a list of keys and a corresponding list of values. A map is denoted by { } whereas a list is denoted by [ ], e.g. zipmap(["a", "b"], [1, 2]) results in {"a" = 1, "b" = 2}
  • Configure resource using a dynamic block
    • dynamic acts much like a for expression, but produces nested blocks instead of a complex typed value. It iterates over a given complex value, and generates a nested block for each element of that complex value.
    • Overuse of dynamic block is not recommended as it makes the code hard to understand and debug
  • Describe built-in dependency management (order of execution based)
    • Terraform analyses any expressions within a resource block to find references to other objects and treats those references as implicit ordering requirements when creating, updating, or destroying resources.
    • Explicit dependencies can be defined using the depends_on attribute, for dependencies between resources that are not visible to Terraform
  • support comments using #, // and /* */
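
A minimal sketch covering variables, a sensitive output, a dynamic block, an explicit depends_on, and the lookup/zipmap functions (resource names, ports, and the AMI id are assumed placeholders):

    variable "ingress_ports" {
      type        = list(number)
      default     = [80, 443]
      description = "Ports opened by the dynamic block below"
    }

    variable "db_password" {
      type = string
    }

    # dynamic block: generates one ingress block per element of var.ingress_ports.
    resource "aws_security_group" "web" {
      name = "web-sg"

      dynamic "ingress" {
        for_each = var.ingress_ports
        content {
          from_port   = ingress.value
          to_port     = ingress.value
          protocol    = "tcp"
          cidr_blocks = ["0.0.0.0/0"]
        }
      }
    }

    # Explicit dependency that Terraform cannot infer from references.
    resource "aws_instance" "app" {
      ami           = "ami-0123456789abcdef0"   # placeholder AMI id
      instance_type = "t3.micro"
      depends_on    = [aws_security_group.web]
    }

    # Marked sensitive: hidden in CLI output, but still plain text in the state.
    output "db_password" {
      value     = var.db_password
      sensitive = true
    }

    # Built-in functions: lookup() with a default, zipmap() to build a map.
    output "lookup_example" {
      value = lookup({ a = "x" }, "b", "fallback")   # => "fallback"
    }

    output "zipmap_example" {
      value = zipmap(["a", "b"], [1, 2])             # => { a = 1, b = 2 }
    }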

Understand Terraform Cloud and Enterprise capabilities

  • Describe the benefits of Sentinel, registry, and workspaces
    • Terraform Cloud provides private module registry for storing modules private to be used within the organization
  • Differentiate OSS and TFE workspaces
  • Summarize features of Terraform Cloud
    • Terraform Enterprise currently supports running under the following operating systems for a Clustered deployment:
      • Ubuntu 16.04.3 – 16.04.5 / 18.04
      • Red Hat Enterprise Linux 7.4 through 7.7
      • CentOS 7.4 – 7.7
      • Amazon Linux
      • Oracle Linux
      • Clusters currently don’t support other Linux variants.
    • Terraform Enterprise install that is provisioned on a network that does not have Internet access is generally known as an air-gapped install.

HashiCorp Certified Terraform Associate Exam Resources

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Learning Path

Finally All Done for AWS (for now) …

Continuing on my AWS journey with the last AWS certification, I took another step by clearing the AWS Certified Alexa Skill Builder – Specialty (AXS-C01) certification. It is amazing to know and learn how voice-first experiences are making an impact and changing how we think about technology and use cases.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) exam basically validates your ability to build, test, publish and certify Alexa skills.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Summary

  • AWS Certified Alexa Skill Builder – Specialty exam focuses only on Alexa and how to build skills.
  • AWS Certified Alexa Skill Builder – Specialty exam has 65 questions with a time limit of 170 minutes
  • Compared to the other professional and specialty exams, the questions and answers are not long, similar to the associate exams. So if you are prepared well, you should not need the full 170 minutes.
  • As the exam was online from home, there was no access to paper and pen, but the trick remains the same: read the question, draw a rough architecture, and focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check where they differ, and that would help you reach the right answer or at least have a 50% chance of getting it right.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Topic Summary

Refer AWS Alexa Cheat Sheet

Domain 1: Voice-First Design Practices and Capabilities

1.1 Describe how users interact with skills

1.2 Map features and capabilities to use cases

  • Alexa supports display cards to display text (Simple card) and text with image (Standard card)
  • Alexa Skills Kit supports APIs
    • Alexa Settings APIs allow developers to retrieve customer preferences for the settings like time zone, distance measuring unit, and temperature measurement unit 
    • Device services – a skill can request the customer’s permission to access their address information, which is static data filled in by the customer and includes the country/region, postal code, and full address
    • Customer Profile services – a skill can request the customer’s permission to access their contact information, which includes name, email address, and phone number
    • With Location services, a skill can ask a user’s permission to obtain the real-time location of their Alexa-enabled device, specifically at the time of the user’s request to Alexa, so that the skill can provide enhanced services.
  • Alexa Skill Kit APIs need apiAccessToken and deviceId to access the ASK APIs
  • Progressive Response API allows you to keep the user engaged while the skill prepares a full response to the user’s request.
  • Personalization can be provided using userId and state persistence

Domain 2: Skill Design

2.1 Design and develop an interaction model

  • Alexa interaction model includes the skill, invocation name, utterances, slots, and intents
  • A skill is ‘an app for Alexa’, however they are not downloadable but just need to be enabled.
  • Wakeword – Amazon offers a choice of wake words like ‘Alexa’, ‘Amazon’, ‘Echo’, or ‘Computer’, with the default being ‘Alexa’.
  • Launch phrases include “run,” “start,” “play,” “resume,” “use,” “launch,” “ask,” “open,” “tell,” “load,” “begin,” and “enable.”
  • Connecting words include “to,” “from,” “in,” “using,” “with,” “about,” “for,” “that,” “by,” “if,” “and,” “whether.”
  • Invocation name
    • is the word or phrase used to trigger the skill for custom skills and the invocation name should adhere to the requirements
    • must not infringe upon the intellectual property rights of an entity or person
    • must be composed of two or more words.
    • One-word invocation names are allowed only for brand/intellectual property.
    • must not include names of people or places
    • if two-word invocation names, one of the words cannot be a definite article (“the”), indefinite article (“a”, “an”) or preposition (“for”, “to”, “of,” “about,” “up,” “by,” “at,” “off,” “with”).
    • must not contain any of the Alexa skill launch phrases, connecting words and wake words
    • must contain only lower-case alphabetic characters, spaces between words, and possessive apostrophes
    • must spell characters like numbers for e.g., twenty one
    • can have periods in the invocation names containing acronyms or abbreviations that are pronounced as a series of individual letters, for e.g. NASA as n. a. s. a.
    • cannot spell out phonemes for e.g., a skill titled “AWS Facts” would need “AWS” represented as “a. w. s. ” and NOT “ay double u ess.”
    • must not create confusion with existing Alexa features.
    • must be written in each supported language
  • An intent is what a user is trying to accomplish.
    • Amazon provides standard built-in intents which can be extended
    • Intents need to have a unique utterance
  • Utterances are the specific phrases that people will use when making a request to Alexa.
  • A slot is a variable that relates to an intent allowing Alexa to understand information about the request
    • Amazon provides standard built-in slots which can be extended
  • Entity resolution improves the way Alexa matches possible slot values in a user’s utterance with the slots defined in your interaction model

2.2 Design a multi-turn conversation

  • Alexa Dialog management model identifies the prompts and utterances to collect, validate, and confirm the slot values and intents.
  • Alexa supports
    • Auto Delegation where Alexa completes all of the dialog steps based on the dialog model.
    • Manual delegation using Dialog.Delegate where Alexa sends the skill an IntentRequest for each turn of the conversation and provides more flexibility.
  • AMAZON.FallbackIntent will not be triggered in the middle of a dialog

2.3 Use built-in intents and slots

  • Standard built-in intents cannot include any slots. If slots are needed, create a custom intent and write your own sample utterances.
  • Alexa recommends using and extending standard built-in intents like AMAZON.HelpIntent and AMAZON.YesIntent with additional utterances as per the skill requirements
  • Alexa provides AMAZON.FallbackIntent for handling any unmatched utterances and can be used to improve the interaction model accuracy.
  • Alexa provides slots, which help capture variables, and can be either an Amazon predefined slot such as dates, numbers, durations, time, etc. or a custom one specific to the skill
  • Predefined slots can be extended to add additional values

2.4 Handle unexpected conversational requests or responses

  • Alexa provides AMAZON.FallbackIntent for handling any unmatched utterances and can be used to improve the interaction model accuracy.
  • Alexa also provides Intent History, which provides a consolidated view with aggregated, anonymized frequent utterances and the resolved intents. These can be used to map the utterances to the correct intents.

2.5 Design multi-modal skills using one or more service interfaces (for example, audio, video, and gadgets)

  • Alexa-enabled devices with a screen handle Page and Scroll intents; do not handle Next and Previous.
  • Alexa skill with AudioPlayer interface
    • must handle AMAZON.ResumeIntent and AMAZON.PauseIntent
    • PlaybackController events to track AudioPlayer status changes initiated from the device buttons

Domain 3: Skill Architecture

3.1 Identify AWS services for extending Alexa skill functionality (Amazon CloudFront, Amazon S3, Amazon CloudWatch, and Amazon DynamoDB)

  • Focus on the standard skill architecture using Lambda for the backend, DynamoDB for persistence, S3 for serving static assets, and CloudWatch for monitoring and logs.
  • Lambda provides serverless handling for the Alexa requests, but remember the following limits
    • default concurrency soft limit of 1000 can be increased by raising a support request
    • default timeout of 3 secs, which should be increased to at least 7 secs to be in line with the Alexa timeout of 8 secs
    • default memory of 128 MB; increase it to improve performance
  • S3 performance can be improved by exposing it through CloudFront esp. for images, audio and video files

3.2 Use AWS Lambda to build Alexa skills

  • Lambda integrates with CloudWatch to provide logs and should be the first thing to check in case of any issues or errors.
  • Alexa allows any HTTPS endpoint to act as a backend, but it needs to meet the following requirements
    • must be accessible over the internet.
    • must accept HTTP requests on port 443.
    • must support HTTP over SSL/TLS, using an Amazon-trusted certificate.

3.3 Follow AWS and Alexa security and privacy best practices

  • Alexa requires the backend to verify that incoming requests come from Alexa using Skill ID verification
  • Child-directed skills cannot use personal and location information
  • Skills cannot be used to capture health information
  • Alexa Skills Kit uses the OAuth 2.0 authentication framework for Account linking, which defines a means by which the service can allow Alexa, with the user’s permission, to access information from the account that the user has set up with you.
  • Alexa smart home skills must have an OAuth authorization code grant implementation, while custom skills can have an authorization code grant or implicit grant implementation.

Domain 4: Skill Development

4.1 Implement in-skill purchasing and Amazon Pay for Alexa Skills

  • In-skill purchasing enables selling premium content such as game features and interactive stories in skills with a custom interaction model.
  • In-skill purchasing is handled by Alexa when the skill sends an Upsell directive. As the skill session ends when an Upsell directive is sent, be sure to save any relevant user data in a persistent data store so that the skill can continue where the user left off after the purchase flow is completed and the endpoint is back in control of the user experience.
  • Skill can handle the Connections.Response request that indicates the result of a purchase flow and resume the skill

4.2 Use Speech Synthesis Markup Language (SSML) for expression and MP3 audio

  • SSML is a markup language that provides a standard way to mark up text for the generation of synthetic speech.
  • Alexa supports a subset of SSML tags including
    • say-as to interpret text as telephone, date, time etc.
    • phoneme provides a phonemic/phonetic pronunciation
    • prosody modifies the volume, pitch, and rate of the tagged speech.
    • audio allows playing an MP3 file while rendering a response
      • must be in valid MP3 file (MPEG version 2) format
      • must be hosted at an Internet-accessible HTTPS endpoint.
      • For speech response, the audio file cannot be longer than 240 seconds.
        • combined total time for all audio files in the outputSpeech property of the response cannot be more than 240 seconds.
        • combined total time for all audio files in the reprompt property of the response cannot be more than 90 seconds.
      • bit rate must be 48 kbps.
      • sample rate must be 22050Hz, 24000Hz, or 16000Hz.

4.3 Implement state management

  • Alexa Skill state persistence can be handled using session attributes during the session and externally using services like DynamoDB, RDS across sessions.

4.4 Implement Alexa service interfaces (audio player, video player, and screens)

4.5 Parse Alexa JSON requests and provide responses

  • All requests include the session (optional), context, and request objects at the top level.
    •  session object provides additional context associated with the request.
      • session attributes can be used to store data
      • user containing the userId to uniquely identify a user and the accessToken to access other services.
      • system object provides apiAccessToken and device object provides deviceId to access ASK APIs
      • application provides the applicationId
      • device object provides supportedInterfaces to list each interface that the device supports
    • request object that provides the details of the user’s request.
  • Response includes
    • outputSpeech contains the speech to render to the user.
    • reprompt contains the outputSpeech to use if a re-prompt is necessary.
    • shouldEndSession provides a boolean value that indicates what should happen after Alexa speaks the response.

Domain 5: Test, Validate, and Troubleshoot

5.1 Debug and troubleshoot using Amazon CloudWatch or other tools

  • Lambda integrates with CloudWatch for metrics and logs, which can be checked for any errors and metrics.

5.2 Use the Alexa developer testing tools

  • Utterance profiles – test utterances to know what intent they resolve to 
  • Alexa Skill simulator
    • provides an ability to Interact with Alexa with either your voice or text, without an actual device.
    • maintains the skill session, so the interaction model and dialog flow can be tested.
    • supports multiple languages testing by selecting locale
    • has limitations in testing audio, video, Alexa settings and Device API
  • Manual Json
    • enter a JSON request directly and see the JSON response returned by the skill
    • does not maintain the skill session and is similar to testing a JSON request in the Lambda console.
  • Voice & Tone – enter plain text or SSML and hear how Alexa speaks the text in a selected language
  • Alexa device – test with an Alexa-enabled device.
  • Alexa app – test the skill with the Alexa app for Android/iOS
  • Lambda Test console – to test Lambda functions

5.3 Perform beta testing

  • Skill beta testing tool can be used to test the Alexa skill in beta before releasing it to production
  • Beta testing allows testing changes to an existing skill, while still keeping the currently live version of the skill available for the general public.
  • Members can be invited using their Alexa email address. Alexa device used by the beta tester must be associated with the email address in the tester’s invitation.

5.4 Troubleshoot errors in the interaction model

Domain 6: Publishing, Operations, and Lifecycle Management

6.1 Describe the skill publishing process

  • Alexa skill needs to go through certification process before the Skill is live and made available to the users
  • Alexa creates an in development version of the skill, once the skill becomes live
  • Alexa Skill live version cannot be edited, and it is recommended to edit the in development skill, test and then re-certify for publishing.
  • Backend changes like changes in Lambda functions or response output from the function, however, can be made on live version and do not require re-certification. However, it is recommended to use Lambda versioning or alias to do such changes.
  • Alexa for Business allows skill to be made private and available to select users within the company

6.2 Add and remove users in the developer console

  • Alexa Skill Developer console access can be shared across multiple users for collaboration
  • Administrator and Analyst roles will also have access to the Earnings and Payments sections.
  • Administrator and Marketer roles will also have access to edit the content associated with apps (i.e. Descriptions, Images & Multimedia) and IAPs
  • Administrator and Developer roles will have access to create, modify and delete Alexa skills using ASK CLI and SMAPI.
  • Administrator, Analyst and Marketer roles have access to sales report

6.3 Perform analysis of skill analytics in the developer console

  • Intent History – View aggregated, anonymized frequent utterances and the resolved intents. You cannot track the user intent history as they are anonymized.
  • Actions – Unique customers per action, total actions, and total utterances per action.
  • Customers – Total number of unique customers who accessed the skill.
  • Intents – Unique customers per intent, total utterances per intent, total intents, and failed intents.
  • Interaction Path – Paths users take when interacting with the skill.
  • Plays – Total number of times that a user played the skill content.
  • Retention (live skills only) – Usage of the skill over time by groups of customers or cohorts. View the number or percentage of customers who returned to your skill over a 12-week period.
  • Sessions – Total sessions, successful session types (sessions that didn’t end due to an error), and average sessions per customer. Includes a breakdown of successful, failed, and no-response sessions as a percentage of total sessions.
  • Utterances – Metrics for utterances depend on the skill category.

6.4 Differentiate among the statuses/versions of skills (for example, In Development, In Certification, and Live)

  • In Development – skill available for development, testing
  • In Review – A certification review is in progress and the skill cannot be edited
  • Certified – Skill passed certification review, and is not yet available to users
  • Live – skill has been published and is available to users. You cannot edit the configuration for live skills
  • Hidden – skill was previously published, but has since been hidden. Existing users can access the skill. New users cannot discover the skill.
  • Removed – skill was previously published, but has since been removed. Users cannot enable or use the skill.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Resources

AWS Certified Solutions Architect – Associate SAA-C02 Exam Learning Path

AWS Solutions Architect – Associate SAA-C02 exam is the latest AWS exam that has replaced the previous SAA-C01 certification exam. It basically validates the ability to effectively demonstrate knowledge of how to architect and deploy secure and robust applications on AWS technologies

  • Define a solution using architectural design principles based on customer requirements.
  • Provide implementation guidance based on best practices to the organization throughout the life cycle of the project.

Refer AWS_Solution_Architect_-_Associate_SAA-C02_Exam_Blue_Print

AWS Solutions Architect – Associate SAA-C02 Exam Summary

  • SAA-C02 exam consists of 65 questions in 130 minutes, and the time is more than sufficient if you are well prepared.
  • SAA-C02 Exam covers the architecture aspects in depth, so you must be able to visualize the architecture, even draw it out in the exam, just to understand how it would work and how different services relate.
  • AWS has updated the exam concepts from the focus being on individual services to more on building scalable, highly available, cost-effective, performant, and resilient architectures.
  • If you had been preparing for the SAA-C01 –
    • SAA-C02 is pretty much similar to SAA-C01 except that the operationally excellent architectures domain has been dropped
    • Most of the services and concepts covered by SAA-C01 remain the same, with a few new additions like Aurora Serverless, AWS Global Accelerator, FSx for Windows, and FSx for Lustre
  • AWS exams are available online, and I took the online one. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS online exam for the first time, try to join at least 30 minutes before the actual time.

AWS Solutions Architect – Associate SAA-C02 Exam Resources

AWS Solutions Architect – Associate SAA-C02 Exam Topics

Make sure you go through all the topics and focus on hints in italics

Networking

  • Be sure to create a VPC from scratch. This is mandatory (a Terraform sketch of the basic layout follows this list).
    • Create a VPC and understand what a CIDR is and the addressing patterns
    • Create public and private subnets, configure proper routes, security groups, NACLs. (hint: Subnets are public or private depending on whether they can route traffic directly through Internet gateway)
    • Create Bastion for communication with instances
    • Create NAT Gateway or Instances for instances in private subnets to interact with internet
    • Create two tier architecture with application in public and database in private subnets
    • Create three tier architecture with web servers in public, application and database servers in private. (hint: focus on security group configuration with least privilege)
    • Make sure to understand how the communication happens between Internet, Public subnets, Private subnets, NAT, Bastion etc.
  • Understand difference between Security Groups and NACLs (hint: Security Groups are Stateful vs NACLs are stateless. Also only NACLs provide an ability to deny or block IPs)
  • Understand VPC endpoints and which services they can help reach (hint: VPC Endpoints route traffic internally without the Internet)
    • VPC Gateway Endpoints supports S3 and DynamoDB.
    • VPC Interface Endpoints OR Private Links supports others
  • Understand difference between NAT Gateway and NAT Instance (hint: NAT Gateway is AWS managed and is scalable and highly available)
  • Understand how NAT high availability can be achieved (hint: provision NAT in each AZ and route traffic from subnets within that AZ through that NAT Gateway)
  • Understand VPN and Direct Connect for on-premises to AWS connectivity
    • VPN provides quick connectivity, cost-effective, secure channel, however routes through internet and does not provide consistent throughput
    • Direct Connect provides consistent dedicated throughput without Internet, however requires time to setup and is not cost-effective
  • Understand Data Migration techniques
    • Choose Snowball vs Snowmobile vs Direct Connect vs VPN depending on the bandwidth available, data transfer needed, time available, encryption requirement, one-time or continuous requirement
    • Snowball, SnowMobile are for one-time data, cost-effective, quick and ideal for huge data transfer
    • Direct Connect, VPN are ideal for continuous or frequent data transfers
  • Understand CloudFront as CDN and the static and dynamic caching it provides, what can be its origin (hint: CloudFront can point to on-premises sources and its usecases with S3 to reduce load and cost)
  • Understand Route 53 for routing
    • Understand Route 53 health checks and failover routing
    • Understand  Route 53 Routing Policies it provides and their use cases mainly for high availability (hint: focus on weighted, latency, geolocation, failover routing)
  • Be sure to cover ELB concepts in deep.
    • SAA-C02 focuses on ALB and NLB and does not cover CLB
    • Understand differences between  CLB vs ALB vs NLB
      • ALB is layer 7 while NLB is layer 4
      • ALB provides content based, host based, path based routing
      • ALB provides dynamic port mapping which allows multiple copies of the same task to be hosted on the same ECS node
      • NLB provides low latency and ability to scale
      • NLB provides static IP address
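
A rough Terraform sketch of the basic VPC layout described above (CIDRs and names are assumed placeholders): a public and a private subnet, with the public subnet made public by its route to an Internet Gateway.

    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"
    }

    resource "aws_internet_gateway" "igw" {
      vpc_id = aws_vpc.main.id
    }

    resource "aws_subnet" "public" {
      vpc_id                  = aws_vpc.main.id
      cidr_block              = "10.0.1.0/24"
      map_public_ip_on_launch = true
    }

    resource "aws_subnet" "private" {
      vpc_id     = aws_vpc.main.id
      cidr_block = "10.0.2.0/24"
    }

    # A subnet is "public" because its route table sends 0.0.0.0/0 to the Internet Gateway.
    resource "aws_route_table" "public" {
      vpc_id = aws_vpc.main.id

      route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.igw.id
      }
    }

    resource "aws_route_table_association" "public" {
      subnet_id      = aws_subnet.public.id
      route_table_id = aws_route_table.public.id
    }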

Security

  • Understand IAM as a whole
    • Focus on IAM role (hint: can be used for EC2 application access and Cross-account access)
    • Understand IAM identity providers and federation and use cases
    • Understand MFA and how would implement two factor authentication for an application
    • Understand IAM Policies (hint: expect a couple of questions with policies defined and you need to select the correct statements)
  • Understand encryption services
  • AWS WAF integrates with CloudFront to provide protection against Cross-site scripting (XSS) attacks. It also provides IP blocking and geo-protection.
  • AWS Shield integrates with CloudFront to provide protection against DDoS.
  • Refer Disaster Recovery whitepaper, be sure you know the different recovery types with impact on RTO/RPO.

Storage

  • Understand various storage options S3, EBS, Instance store, EFS, Glacier, FSx and what are the use cases and anti patterns for each
  • Instance Store
    • Understand Instance Store (hint: it is physically attached  to the EC2 instance and provides the lowest latency and highest IOPS)
  • Elastic Block Storage – EBS
    • Understand various EBS volume types and their use cases in terms of IOPS and throughput. SSD for IOPS and HDD for throughput
    • Understand Burst performance and I/O credits to handle occasional peaks
    • Understand EBS Snapshots (hint: backups are automated, snapshots are manual)
  • Simple Storage Service – S3
    • Cover S3 in depth
    • Understand S3 storage classes with lifecycle policies
      • Understand the difference between S3 Standard vs S3 Standard-IA vs S3 One Zone-IA in terms of cost and durability
    • Understand S3 Data Protection (hint: S3 Client side encryption encrypts data before storing it in S3)
    • Understand S3 features including
      • S3 provides a cost effective static website hosting
      • S3 versioning provides protection against accidental overwrites and deletions
      • S3 Pre-Signed URLs for both upload and download provide access without needing AWS credentials (see the sample command after this list)
      • S3 CORS allows cross domain calls
      • S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket.
    • Understand Glacier as an archival storage with various retrieval patterns
      • Glacier Expedited retrieval now allows object retrieval within minutes
  • Understand Storage gateway and its different types.
    • Cached Volume Gateway provides access to frequently accessed data, while using AWS as the actual storage
    • Stored Volume gateway uses AWS as a backup, while the data is being stored on-premises as well
    • File Gateway supports NFS and SMB protocols
  • Understand FSx, which makes it easy and cost-effective to launch and run popular file systems such as Windows File Server and Lustre.
  • Understand the difference between EBS vs S3 vs EFS
    • EFS provides a shared volume across multiple EC2 instances, while an EBS volume can be attached to a single EC2 instance within the same AZ.
  • Understand the difference between EBS vs Instance Store
  • Would recommend referring to the Storage Options whitepaper; although a bit dated, 90% of it still holds true
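
A couple of quick hands-on commands for the S3 features above (bucket and object names are placeholders): enable versioning on a bucket, and generate a time-limited pre-signed download URL that needs no AWS credentials on the caller's side.

    aws s3api put-bucket-versioning --bucket my-example-bucket --versioning-configuration Status=Enabled
    aws s3 presign s3://my-example-bucket/report.pdf --expires-in 3600    # URL valid for 1 hour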

Compute

  • Understand Elastic Cloud Compute – EC2
  • Understand Auto Scaling and ELB, and how they work together to provide a Highly Available and Scalable solution (hint: span both ELB and Auto Scaling across Multi-AZs to provide High Availability)
  • Understand EC2 Instance Purchase Types – Reserved, Scheduled Reserved, On-demand and Spot and their use cases
    • Choose Reserved Instances for continuous persistent load
    • Choose Scheduled Reserved Instances for loads with a fixed schedule and time interval
    • Choose Spot instances for fault-tolerant and spiky loads
    • Reserved instances provide cost benefits for long-term requirements over On-demand instances
    • Spot instances provide cost benefits for temporary, fault-tolerant, spiky loads
  • Understand EC2 Placement Groups (hint: Cluster placement groups provide low latency and high throughput communication, while Spread placement groups provide high availability – see the sample commands after this list)
  • Understand Lambda and serverless architecture, its features and use cases. (hint: Lambda integrated with API Gateway to provide a serverless, highly scalable, cost-effective architecture)
  • Understand ECS with its ability to deploy containers and microservices architecture.
    • ECS role for tasks can be provided through taskRoleArn
    • ALB provides dynamic port mapping to allow multiple instances of the same task on the same node
  • Know Elastic Beanstalk at a high level, what it provides and its ability to get an application running quickly.
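
For the placement group hint above, a quick sketch of creating both flavors (group names are arbitrary):

    aws ec2 create-placement-group --group-name low-latency-pg --strategy cluster    # low latency, high throughput within an AZ
    aws ec2 create-placement-group --group-name spread-pg --strategy spread          # spreads instances across distinct hardware for HA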

Databases

  • Understand relational and NoSQL data storage options which include RDS, DynamoDB, Aurora and their use cases
  • RDS
    • Understand RDS features – Read Replicas vs Multi-AZ
      • Read Replicas for scalability, Multi-AZ for High Availability
      • Multi-AZ deployments are within a region only
      • Read Replicas can span across regions and can be used for disaster recovery (see the sample command after this list)
    • Understand Automated Backups, underlying volume types
  • Aurora
    • Understand Aurora
      • provides multiple read replicas and replicates 6 copies of data across AZs
    • Understand Aurora Serverless provides a highly scalable cost-effective database solution
  • DynamoDB
    • Understand DynamoDB with its low latency performance, key-value store (hint: DynamoDB is not a relational database)
    • DynamoDB DAX provides caching for DynamoDB
    • Understand DynamoDB provisioned throughput for Reads/Writes (it is covered more in the Developer exam though)
  • Know ElastiCache use cases, mainly for caching performance
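
For the Read Replica point above, a minimal sketch (instance identifiers are placeholders); for a cross-region replica, run the command in the target region and pass the source DB instance ARN instead:

    aws rds create-db-instance-read-replica --db-instance-identifier mydb-replica --source-db-instance-identifier mydb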

Integration Tools

  • Understand SQS as message queuing service and SNS as pub/sub notification service
  • Understand SQS features like visibility, long poll vs short poll
  • Focus on SQS as a decoupling service
  • Understand SQS Standard vs SQS FIFO differences (hint: FIFO provides exactly-once delivery but lower throughput)
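
A quick sketch of creating a FIFO queue (queue name is arbitrary but must end in .fifo); content-based deduplication is one way to supply the deduplication that exactly-once processing relies on:

    aws sqs create-queue --queue-name orders.fifo --attributes FifoQueue=true,ContentBasedDeduplication=true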

Analytics

  • Know Redshift as a petabyte-scale data warehouse used for business intelligence
  • Know Kinesis for real time data capture and analytics
  • At least know what AWS Glue does, so you can eliminate the answer choices that involve it

Management Tools

  • Understand CloudWatch monitoring to provide operational transparency
  • Know which EC2 metrics CloudWatch can track. Remember, it cannot track memory and disk space/swap utilization out of the box
  • Understand CloudWatch is extendable with custom metrics (see the sample command after this list)
  • Understand CloudTrail for Audit
  • Have a basic understanding of CloudFormation, OpsWorks
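
For the custom metrics point above, a minimal sketch of publishing a memory metric from an instance (namespace, value, and instance ID are placeholders); this is how memory/disk metrics are typically pushed, since CloudWatch does not collect them natively:

    aws cloudwatch put-metric-data --namespace "Custom/System" --metric-name MemoryUtilization --value 63.5 --unit Percent --dimensions InstanceId=i-0abc1234def567890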

AWS Whitepapers & Cheat sheets

AWS Solutions Architect – Associate Exam Domains

Domain 1: Design Resilient Architectures

  1. Design a multi-tier architecture solution
  2. Design highly available and/or fault-tolerant architectures
  3. Design decoupling mechanisms using AWS services
  4. Choose appropriate resilient storage

Domain 2: Define High-Performing Architectures

  1. Identify elastic and scalable compute solutions for a workload
  2. Select high-performing and scalable storage solutions for a workload
  3. Select high-performing networking solutions for a workload
  4. Choose high-performing database solutions for a workload

Domain 3: Specify Secure Applications and Architectures

  1. Design secure access to AWS resources
  2. Design secure application tiers
  3. Select appropriate data security options

Domain 4: Design Cost-Optimized Architectures

  1. Determine how to design cost-optimized storage.
  2. Determine how to design cost-optimized compute.

AWS Certified Machine Learning -Specialty (MLS-C01) Exam Learning Path

Finally, cleared the AWS Certified Machine Learning – Specialty (MLS-C01). It took me around four months to prepare for the exam. This was my fourth Specialty certification and in terms of the difficulty level of all of them, this is the toughest, partly because I am not a machine learning expert and learned everything from basics for this certification. Machine Learning is a vast specialization in itself and with AWS services, there is lots to cover and know for the exam. This is the only exam, where the majority of the focus is on the concepts outside of AWS i.e. pure machine learning. It also includes AWS Machine Learning and Big Data services.

AWS Certified Machine Learning – Specialty (MLS-C01) exam basically validates

  •  Select and justify the appropriate ML approach for a given business problem.
  • Identify appropriate AWS services to implement ML solutions.
  • Design and implement scalable, cost-optimized, reliable, and secure ML solutions.

Refer AWS Certified Machine Learning – Specialty Exam Guide for details

AWS Certified Machine Learning – Specialty Domains

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Summary

  • AWS Certified Machine Learning – Specialty exam, as its name suggests, covers a lot of Machine Learning concepts. It really digs deep into Machine Learning concepts, most of which are not related to AWS.
  • AWS Certified Machine Learning – Specialty exam covers the end-to-end Machine Learning lifecycle: data collection, transformation, and pre-processing to make it usable and efficient for Machine Learning, training, validation, and implementation.
  • As always, one of the key tactics I followed when solving any AWS Certification exam is to read the question and use paper and pencil to draw a rough architecture and focus on the areas that need attention. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check where they differ and that would help you reach the right answer or at least have a 50% chance of getting it right.

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Resources

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Topics

  • Machine Learning
    • Make sure you know and cover all the concepts in depth, as 60% of the exam is focused on generic Machine Learning concepts not related to AWS services.
    • Know about complete generic Machine Learning lifecycle
    • Exploratory Data Analysis
      • Feature selection and Engineering
        • remove features which are not related to training
        • remove features which have the same values, very low correlation, very little variance, or a lot of missing values
        • Apply techniques like Principal Component Analysis (PCA) for dimensionality reduction i.e reduce the number of features.
        • Apply techniques such as One-hot encoding and label encoding to help convert strings to numeric values, which are easier to process.
        • Apply Normalization i.e. scaling values between 0 and 1 to handle data with large variance.
        • Apply feature engineering for feature reduction e.g. using a single combined height/weight feature instead of both features
      • Handle Missing data
        • remove the feature or rows with missing data
        • impute using Mean/Median values – valid only for numeric values and not categorical features; also does not factor in correlation between features
        • impute using k-NN, Multivariate Imputation by Chained Equation (MICE), Deep Learning – more accurate and factors in correlation between features
      • Handle unbalanced data
        • Source more data
        • Oversample minority or Undersample majority
        • Data augmentation using techniques like SMOTE
    • Modeling
      • Know about Algorithms – Supervised, Unsupervised and Reinforcement and which algorithm is best suitable based on the available data either labelled or unlabelled.
        • Supervised learning trains on labelled data e.g. Linear Regression, Logistic Regression, Decision Trees, Random Forests
        • Unsupervised learning trains on unlabelled data for e.g. PCA, SVD, K-means
        • Reinforcement learning trained based on actions and rewards for e.g. Q-Learning
      • Hyperparameters
        • are parameters exposed by machine learning algorithms that control how the underlying algorithm operates and their values affect the quality of the trained models
        • some of the common hyperparameters are learning rate, batch size, epoch (hint: if the learning rate is too large, the minimum might be overshot and the loss would oscillate; if the learning rate is too small, it requires too many steps, which takes longer and is less efficient)
    • Evaluation
      • Know difference in evaluating model accuracy
        • Use Area Under the (Receiver Operating Characteristic) Curve (AUC) for Binary classification
        • Use root mean square error (RMSE) metric for regression
      • Understand the Confusion matrix (a small worked example follows this list)
        • A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative is an outcome where the model correctly predicts the negative class.
        • A false positive is an outcome where the model incorrectly predicts the positive class. A false negative is an outcome where the model incorrectly predicts the negative class.
        • Recall or Sensitivity or TPR (True Positive Rate): number of items correctly identified as positive out of total true positives – TP/(TP+FN) (hint: use this for cases like fraud detection, where the cost of marking non-fraud as fraud is lower than marking fraud as non-fraud)
        • Specificity or TNR (True Negative Rate): number of items correctly identified as negative out of total negatives – TN/(TN+FP) (hint: use this for cases like videos for kids, where the cost of dropping a few valid videos is lower than showing a few bad ones)
      • Handle Overfitting problems
        • Simplify the model, by reducing the number of layers
        • Early Stopping – form of regularization while training a model with an iterative method, such as gradient descent
        • Data Augmentation
        • Regularization – technique to reduce the complexity of the model
        • Dropout is a regularization technique that prevents overfitting
        • Never train on test data
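
As a quick worked example for the confusion matrix metrics (hypothetical counts): with TP=80, FN=20, TN=890, FP=10, Recall = TP/(TP+FN) = 80/100 = 0.80, Specificity = TN/(TN+FP) = 890/900 ≈ 0.99, and Precision = TP/(TP+FP) = 80/90 ≈ 0.89.
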
  • AWS Machine Learning
    • SageMaker
      • Know SageMaker in depth
      • supports both File mode and Pipe mode
        • File mode loads all of the data from S3 to the training instance volumes VS Pipe mode streams data directly from S3
        • File mode needs disk space to store both the final model artifacts and the full training dataset. VS Pipe mode which helps reduce the required size for EBS volumes
      • Using RecordIO format allows algorithms to take advantage of Pipe mode when training the algorithms that support it. 
      • supports Model tracking capability to manage up to thousands of machine learning model experiments
      • supports Canary deployment using ProductionVariant and deploying multiple variants of a model to the same SageMaker HTTPS endpoint (see the sample commands at the end of this section).
      • supports automatic scaling for production variants. Automatic scaling dynamically adjusts the number of instances provisioned for a production variant in response to changes in your workload
      • provides pre-built Docker images for its built-in algorithms and the supported deep learning frameworks used for training & inference
      • SageMaker Automatic Model Tuning
        • is the process of finding a set of hyperparameters for an algorithm that can yield an optimal model.
        • Best practices
          • limit the search to a smaller number, as the difficulty of a hyperparameter tuning job depends primarily on the number of hyperparameters that Amazon SageMaker has to search
          • DO NOT specify a very large range to cover every possible value for a hyperparameter as it affects the success of hyperparameter optimization.
          • log-scaled hyperparameter can be converted to improve hyperparameter optimization.
          • running one training job at a time achieves the best results with the least amount of compute time.
          • Design distributed training jobs so that they report the objective metric that you want.
        • SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge.
      • know how to take advantage of multiple GPUs (hint: increase learning rate and batch size w.r.t to the increase in GPUs)
      • Algorithms –
        • BlazingText provides Word2vec and text classification algorithms
        • DeepAR provides supervised learning algorithm for forecasting scalar (one-dimensional) time series (hint: train for new products based on existing products sales data)
        • Factorization Machines provide supervised classification and regression tasks, and help capture interactions between features within high dimensional sparse datasets economically
        • Image classification algorithm is a supervised learning algorithm that supports multi-label classification
        • IP Insights is an unsupervised learning algorithm that learns the usage patterns for IPv4 addresses
        • K-means is an unsupervised learning algorithm for clustering as it attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups.
        • k-nearest neighbors (k-NN) algorithm is an index-based algorithm. It uses a non-parametric method for classification or regression
        • Latent Dirichlet Allocation (LDA) algorithm is an unsupervised learning algorithm that attempts to describe a set of observations as a mixture of distinct categories. Used to identify number of topics shared by documents within a text corpus
        • Linear models are supervised learning algorithms used for solving either classification or regression problems. 
          • For regression (predictor_type=’regressor’), the score is the prediction produced by the model.
          • For classification (predictor_type=’binary_classifier’ or predictor_type=’multiclass_classifier’), the model returns a score and a predicted label.
        • Neural Topic Model (NTM) Algorithm is an unsupervised learning algorithm that is used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution
        • Object Detection algorithm detects and classifies objects in images using a single deep neural network
        • Principal Component Analysis (PCA) is an unsupervised machine learning algorithm that attempts to reduce the dimensionality (number of features) (hint: dimensionality reduction)
        • Random Cut Forest (RCF) is an unsupervised algorithm for detecting anomalous data point (hint: anomaly detection)
        • Sequence to Sequence is a supervised learning algorithm where the input is a sequence of tokens (for example, text, audio) and the output generated is another sequence of tokens. (hint: text summarization is the key use case)
    • SageMaker Ground Truth 
      • provides automated data labeling using machine learning
      • helps build highly accurate training datasets for machine learning quickly using Amazon Mechanical Turk
      • provides annotation consolidation to help improve the accuracy of the data object’s labels. It combines the results of multiple worker’s annotation tasks into one high-fidelity label.
      • automated data labeling uses machine learning to label portions of the data automatically without having to send them to human workers
    • Comprehend
      • natural language processing (NLP) service to find insights and relationships in text.
      • identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic.
    • Lex
      • provides conversational interfaces using voice and text helpful in building voice and text chatbots
    • Polly
      • text into speech
      • supports Speech Synthesis Markup Language (SSML) tags like prosody so users can adjust the speech rate, pitch or volume.
      • supports pronunciation lexicons to customize the pronunciation of words
    • Rekognition
      • analyze image and video
      • helps identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content.
    • Translate – provides natural and fluent language translation
    • Transcribe – provides speech-to-text capability
    • Elastic Inference helps attach low-cost GPU-powered acceleration to EC2 and SageMaker instances or ECS tasks to reduce the cost of running deep learning inference by up to 75%.
  • Analytics
    • Make sure you know and understand data engineering concepts mainly in terms of data capture, data migration, data transformation and data storage
    • Kinesis
      • Understand Kinesis Data Streams and Kinesis Data Firehose in depth
      • Kinesis Data Analytics can process and analyze streaming data using standard SQL and integrates with Data Streams and Firehose
      • Know Kinesis Data Streams vs Kinesis Firehose
        • Know Kinesis Data Streams is open ended on both producer and consumer. It supports KCL and works with Spark.
        • Know Kinesis Firehose is open ended for producer only. Data is stored in S3, Redshift and ElasticSearch.
        • Kinesis Firehose works in batches with a minimum 60-second interval.
        • Kinesis Data Firehose supports data transformation and record format conversion using Lambda function (hint: can be used for transforming csv or JSON into parquet)
    • Know ElasticSearch is a search service which supports indexing, full text search, faceting etc.
    • Know Data Pipeline for data transfer
    • Know Glue as fully managed ETL service
      • helps setup, orchestrate, and monitor complex data flows.
      • AWS Glue Data Catalog
        • is a central repository to store structural and operational metadata for all the data assets.
      • AWS Glue crawler
        • connects to a data store, progresses through a prioritized list of classifiers to extract the schema of the data and other statistics, and then populates the Glue Data Catalog with this metadata
  • Security, Identity & Compliance
    • Security is covered very lightly (hint: SageMaker can read data from a KMS-encrypted S3 bucket; make sure the KMS key policy includes the role attached to SageMaker)
  • Management & Governance Tools
    • Understand AWS CloudWatch for Logs and Metrics. (hint: SageMaker is integrated with Cloudwatch and logs and metrics are all stored in it)
  • Storage
    • Understand Data Storage Options – know the patterns for S3 vs RDS vs DynamoDB vs Redshift (hint: S3 is, by default, the data storage option for Big Data, so look for it in the answer)
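
For the SageMaker canary deployment point above, a rough sketch (endpoint, config, and model names are placeholders) that sends roughly 90% of traffic to the current model and 10% to the canary via ProductionVariant weights:

    aws sagemaker create-endpoint-config --endpoint-config-name my-config \
      --production-variants VariantName=current,ModelName=model-v1,InitialInstanceCount=2,InstanceType=ml.m5.large,InitialVariantWeight=0.9 \
                            VariantName=canary,ModelName=model-v2,InitialInstanceCount=1,InstanceType=ml.m5.large,InitialVariantWeight=0.1
    aws sagemaker create-endpoint --endpoint-name my-endpoint --endpoint-config-name my-config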

Whitepapers and articles

AWS Certified Big Data -Speciality (BDS-C00) Exam Learning Path

Clearing the AWS Certified Big Data – Speciality (BDS-C00) was a great feeling. This was my third Speciality certification and in terms of the difficulty level (compared to the Networking and Security Speciality exams), I would rate it between Networking (being the toughest) and Security (being the simpler one).

Big Data in itself is a very vast topic and with AWS services, there is lots to cover and know for the exam. If you have worked on Big Data technologies including a bit of Visualization and Machine learning, it would be a great asset to pass this exam.

AWS Certified Big Data – Speciality (BDS-C00) exam basically validates

  • Implement core AWS Big Data services according to basic architectural best practices
  • Design and maintain Big Data
  • Leverage tools to automate Data Analysis

Refer AWS Certified Big Data – Speciality Exam Guide for details

AWS Certified Big Data – Speciality Domains

AWS Certified Big Data – Speciality (BDS-C00) Exam Summary

  • AWS Certified Big Data – Speciality exam, as its name suggests, covers a lot of Big Data concepts right from data transfer and collection techniques, storage, pre and post processing, analytics, visualization with the added concepts for data security at each layer.
  • One of the key tactics I followed when solving any AWS Certification exam is to read the question and use paper and pencil to draw a rough architecture and focus on the areas that need attention. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check where they differ and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Whitepapers and articles
    • Analytics
      • Make sure you know and cover all the services in depth, as 80% of the exam is focused on these topics
      • Elastic Map Reduce
        • Understand EMR in depth
        • Understand EMRFS (hint: Use Consistent view to make sure S3 objects referred by different applications are in sync)
        • Know EMR Best Practices (hint: start with many small nodes instead of a few large nodes)
        • Know Hive can be externally hosted using RDS, Aurora and AWS Glue Data Catalog
        • Also know the different technologies
          • Presto is a fast SQL query engine designed for interactive analytic queries over large datasets from multiple sources
          • D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS
          • Spark is a distributed processing framework and programming model that helps do machine learning, stream processing, or graph analytics using Amazon EMR clusters
          • Zeppelin/Jupyter as a notebook for interactive data exploration and provides open-source web application that can be used to create and share documents that contain live code, equations, visualizations, and narrative text
          • Phoenix is used for OLTP and operational analytics, allowing you to use standard SQL queries and JDBC APIs to work with an Apache HBase backing store
      • Kinesis
        • Understand Kinesis Data Streams and Kinesis Data Firehose in depth
        • Know Kinesis Data Streams vs Kinesis Firehose
          • Know Kinesis Data Streams is open ended on both producer and consumer. It supports KCL and works with Spark.
          • Know Kinesis Firehose is open ended for producer only. Data is stored in S3, Redshift and ElasticSearch.
          • Kinesis Firehose works in batches with a minimum 60-second interval.
        • Understand Kinesis Encryption (hint: use server side encryption or encrypt in producer for data streams)
        • Know the difference between KPL vs SDK (hint: PutRecords is synchronous, while KPL supports batching)
        • Kinesis Best Practices (hint: increase performance by increasing the number of shards – see the sample command after this list)
      • Know ElasticSearch is a search service which supports indexing, full text search, faceting etc.
      • Redshift
        • Understand Redshift in depth
        • Understand Redshift Advance topics like Workload Management, Distribution Style, Sort key
        • Know Redshift Best Practices w.r.t selection of Distribution style, Sort key, COPY command which allows parallelism
        • Know Redshift views to control access to data.
      • Amazon Machine Learning
      • Know Data Pipeline for data transfer
      • QuickSight
      • Know Glue as the ETL tool
    • Security, Identity & Compliance
    • Management & Governance Tools
      • Understand AWS CloudWatch for Logs and Metrics. Also, CloudWatch Events provides more real-time alerts as compared to CloudTrail
    • Storage
    • Compute
      • Know EC2 access to services using IAM Role and Lambda using Execution role.
      • Lambda, esp. how to improve performance using batching, breaking up functions, etc.
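
For the Kinesis shard hint above, a minimal sketch of scaling a stream's throughput by resharding (stream name is a placeholder):

    aws kinesis update-shard-count --stream-name clickstream --target-shard-count 4 --scaling-type UNIFORM_SCALING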

AWS Certified Big Data – Speciality (BDS-C00) Exam Resources

AWS Cloud Migration

AWS Cloud Migration

Some of the key drivers for moving to the cloud are

  • Operational Costs – Key components of operational costs are unit price of infrastructure, the ability to match supply and demand, finding a pathway to optionality, employing an elastic cost base, and transparency
  • Workforce Productivity – getting up and ready in seconds and various service availability.
  • Cost Avoidance – eliminating the need for hardware refresh programs and constant maintenance programs
  • Operational Resilience – increases resilience and thereby reduces organization’s risk profile
  • Business Agility – react to market conditions more quickly 

Cloud Stages of Adoption

PROJECT

  • In the project phase, execute projects to get familiar with and experience benefits from the cloud.

FOUNDATION

  • After experiencing the benefits of cloud, build the foundation to scale the cloud adoption.
  • This includes creating a landing zone (a pre-configured, secure, multi-account AWS environment), Cloud Center of Excellence (CCoE), operations model, as well as assuring security and compliance readiness.

MIGRATION

  • Migrate existing applications including mission-critical applications or entire data centers to the cloud as you scale your adoption across a growing portion of the IT portfolio. 

REINVENTION

  • Now that the operations are in the cloud, focus on reinvention by taking advantage of the flexibility and capabilities of AWS to transform business by speeding time to market and increasing the attention on innovation.

Migration Process

Phase 1: Migration Preparation and Business Planning

  • Determine the right objectives and begin to get an idea of the types of benefits you will see.
  • Starts with some foundational experience and developing a preliminary business case for a migration, which requires taking objectives into account, along with the age and architecture of the existing applications, and their constraints.

Phase 2: Portfolio Discovery and Planning

  • Understand the IT portfolio, the dependencies between applications, and begin to consider what types of migration strategies needed to meet the business case objectives.
  • With the portfolio discovery and migration approach, you are in a good position to build a full business case.

Phase 3 & Phase 4: Designing, Migrating, and Validating Application

  • Move focus from the portfolio level to the individual application level and design, migrate, and validate each application.
  • Each application is designed, migrated, and validated according to one of the six common application strategies (“The 6 R’s”).
  • Once you have some foundational experience from migrating a few apps and a plan in place that the organization can get behind – it’s time to accelerate the migration and achieve scale.
  • AWS provides migration services that help with moving applications and data from on-premises to AWS – AWS Server Migration Service (SMS) and AWS Database Migration Service (DMS)

Phase 5: Operate

  • Once applications are migrated, iterate on the new foundation, turn off old systems, and constantly iterate toward a modern operating model.
  • Operating model becomes an evergreen set of people, process, and technology that constantly improves as you migrate more applications.

Application Migration Strategies

Migration strategies depend upon what is in your environment and what is suitable for the portfolio, taking into account the business and technical requirements.

Below are the six common migration strategies employed, which build upon “The 5 R’s” that Gartner outlined in 2011.

1. Rehost (“lift and shift”)

  • Moving your application as is to the Cloud.
  • helps to quickly implement the migration and scale to meet a business case
  • provides a better opportunity to re-architect the applications once they are already running in the cloud, with the organization having already developed cloud skills and the application with its data already migrated and handling traffic.
  • Rehosting can be automated with tools such as AWS Server Migration Service, or can be done manually

2. Replatform (“lift, tinker and shift”)

  • Moving your application to the Cloud with optimizations, without any major changes.
  • Replatform helps achieve some tangible benefit without changing the core architecture of the application. For e.g., using RDS for database or Elastic Beanstalk for applications.

3. Repurchase (“drop and shop”)

  • Dropping the application and Moving to a complete new Solution
  • More of a Buy in a Build vs Buy model; might be expensive in the short term but provides a faster time to market.
  • Move to a different product, which likely means the organization is willing to change the existing licensing model

4. Refactor / Re-architect

  • Moving the application to Cloud, with major changes.
  • More of a Build in a Build vs Buy model, and would take time.
  • driven by a strong business need to add features, scale, or performance with agility and improvement in business continuity that would otherwise be difficult to achieve in the application’s existing environment.

5. Retire

  • Decommission the applications, not needed anymore.
  • Identifying IT assets that are no longer useful and can be turned off will help boost your business case and direct your attention towards maintaining the resources that are widely used.

6. Retain

  • Keep the applications as is in the current environment
  • Retain portions of the IT portfolio that have tight dependencies, are difficult to migrate, or are not a priority or ready for migration

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company is planning the migration of several lab environments used for software testing. An assortment of custom tooling is used to manage the test runs for each lab. The labs use immutable infrastructure for the software test runs, and the results are stored in a highly available SQL database cluster. Although completely rewriting the custom tooling is out of scope for the migration project, the company would like to optimize workloads during the migration. Which application migration strategy meets this requirement?
    1. Re-host
    2. Re-platform
    3. Re-factor/re-architect
    4. Retire

References

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Learning Path

NOTE – Refer to DOP-C02 Learning Path

AWS Certified DevOps Engineer – Professional (DOP-C01) exam is the upgraded pattern of the DevOps Engineer – Professional exam which was released last year (2018). I recently attempted the latest pattern and AWS has done quite a good job of improving it further, as compared to the old one, to include more DevOps related questions and services.

AWS Certified DevOps Engineer – Professional (DOP-C01) exam basically validates

  • Implement and manage continuous delivery systems and methodologies on AWS
  • Implement and automate security controls, governance processes, and compliance validation
  • Define and deploy monitoring, metrics, and logging systems on AWS
  • Implement systems that are highly available, scalable, and self-healing on the AWS platform
  • Design, manage, and maintain tools to automate operational processes

Refer to AWS Certified DevOps Engineer – Professional Exam Guide

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Summary

  • AWS Certified DevOps Engineer – Professional exam was for a total of 170 minutes but it had 75 questions (I was always assuming it to be 65) and I just managed to complete the exam with 20 mins remaining. So be sure you are prepared and manage your time well. As always, mark the questions for review and move on and come back to them after you are done with all.
  • One of the key tactics I followed when solving the DevOps Engineer questions was to read the question and use paper and pencil to draw a rough architecture and focus on the areas that need attention. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check where they differ and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • AWS Certified DevOps Engineer – Professional exam covers a lot of concepts and services related to Automation, Deployments, Disaster Recovery, HA, Monitoring, Logging and Troubleshooting. It also covers security and compliance related topics.
  • Be sure to cover the following topics
    • Whitepapers are the key to understand Deployments and DR
    • Management Tools
      • DevOps Professional exam cannot be cleared without knowledge of these topics
      • Deep dive into CloudFormation, Elastic Beanstalk and OpsWorks
      • Very important to understand CloudFormation vs Elastic Beanstalk vs OpsWorks
      • CloudFormation
        • Have in-depth understand of CloudFormation concepts
        • Know how to indicate completion of events using CloudFormation helper scripts (see the sample command at the end of this section).
        • Understand CloudFormation deployment strategies esp. rolling and replacing update with AutoScaling and update of launch configuration
        • Understand CloudFormation policies esp. Update and Deletion policies (hint : retain resources on stack deletion)
        • Understand CloudFormation Best Practices esp. Nested Stacks and logical grouping
        • Understand CloudFormation template anatomy – parameters, outputs, mappings
        • Understand CloudFormation Custom resource and its use cases (hint : you can use Custom resource to retrieve AMI IDs or interact with external services)
      • Elastic Beanstalk
      • OpsWorks
        • Understand OpsWorks overall – stacks, layers, recipes
        • Understand OpsWorks Lifecycle events esp. the Configure event and how it can be used.
        • Understand OpsWorks Deployment Strategies
        • Know OpsWorks auto-healing and how to be notified for it.
      • Development Tools
        • Unlike the previous DevOps Engineer – Professional exam, the latest pattern has a heavy focus on the Developer tools and be sure to deep dive into them
        • Understand CodePipeline, CodeCommit, CodeDeploy, CodeBuild and their use cases
        • CodePipeline
          • Understand how to build Pipelines and integration with other Code* services
          • Understand CodePipeline pipeline structure (Hint: run actions in parallel by giving them the same runOrder)
          • Understand how to configure notifications on events and failures
          • Know CodePipeline supports Manual Approval
        • CodeCommit
          • How to handle deployments for code. (Hint : Same repository and branches for projects and environments)
          • Know CodeCommit IAM policies
        • CodeDeploy
    • Monitoring & Governance tools
      • Very important to understand AWS CloudWatch vs AWS CloudTrail vs AWS Config
      • Very important to understand Trusted Advisor vs Systems Manager vs Amazon Inspector
      • Know Personal Health Dashboard & Service Health Dashboard
      • CloudWatch
      • CloudTrail
        • Understand how to maintain CloudTrail logs integrity
      • Understand AWS Config and its use cases (hint : Config maintains history and can be used to revert the config)
      • Know Personal Health Dashboard (hint : it tracks events on your AWS resources)
      • Understand AWS Trusted Advisor and what it provides (hint : low utilization resources)
      • Systems Manager
        • Systems Manager is also covered heavily in the exams so be sure you know
        • Understand AWS Systems Manager and its various services like parameter store, patch manager
    • Networking & Content Delivery
      • Networking is covered very lightly. Usually the questions are targeted towards troubleshooting of access or permissions.
      • Know VPC
      • Route 53
    • Security, Identity & Compliance
    • Storage
      • Exam does not cover Storage services in deep
      • Focus on Simple Storage Service (S3)
        • Understand S3 Permissions (Hint – the authenticated-users ACL grants access to all AWS authenticated users, not just users of your account; know how to control access)
        • Know S3 disaster recovery across region. (hint : cross region replication)
        • Know CloudFront for caching to improve performance
      • Elastic Block Store
        • Focus mainly on EBS Backup using snapshots for HA and Disaster recovery
    • Database
    • Compute
      • Know EC2
        • Understand ENI for HA, user data, pre-baked AMIs for faster instance start times
        • Amazon Linux 2 Image (hint: it allows replication of Amazon Linux behavior on-premises)
        • Snapshot and sharing
      • Auto Scaling
        • Auto Scaling Lifecycle events
        • Blue/green deployments with Auto Scaling – With new launch configurations, new auto scaling groups or CloudFormation update policies.
      • Understand Lambda
      • ECS
        • Know Monitoring and deployments with image update
    • Integration Tools
      • Know how CloudWatch integration with SNS and Lambda can help in notification (Topics are not required to be in detail)
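
For the CloudFormation helper scripts point above, a rough sketch of the signal step, assuming the helper scripts are present on the AMI and that the stack and resource names are placeholders; it is typically run at the end of the instance UserData so the resource's CreationPolicy is marked complete only after bootstrapping succeeds:

    /opt/aws/bin/cfn-signal -e $? --stack my-stack --resource WebServerGroup --region us-east-1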

AWS Certified DevOps Engineer – Professional (DOP-C01) Exam Resources

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Learning Path

I recently cleared the AWS Certified Advanced Networking – Speciality (ANS-C00), which was my first, en route my path to the AWS Speciality certifications. Frankly, I feel the time I gave for preparation was still not enough, but I just about managed to get through. So a word of caution: this exam is in line with, or tougher than, the Professional exams, especially because the Networking concepts it covers are not something you can easily get your hands dirty with.

AWS Certified Advanced Networking – Speciality (ANS-C00) exam focuses on the AWS Networking concepts. It basically validates

  • Design, develop, and deploy cloud-based solutions using AWS
  • Implement core AWS services according to basic architecture best practices
  • Design and maintain network architecture for all AWS services
  • Leverage tools to automate AWS networking tasks

Refer to AWS Certified Advanced Networking – Speciality Exam Guide

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Resources

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Summary

  • AWS Certified Advanced Networking – Speciality exam covers a lot of Networking concepts like VPC, VPN, Direct Connect, Route 53, ALB, NLB.
  • One of the key tactics I followed when solving the questions was to read the question and use paper and pencil to draw a rough architecture and focus on the areas that need attention. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check where they differ and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Networking & Content Delivery
      • You should know everything in Networking.
      • Understand VPC in depth
      • Virtual Private Network to establish connectivity between on-premises data center and AWS VPC
      • Direct Connect to establish connectivity between on-premises data center and AWS VPC and Public Services
        • Make sure you understand Direct Connect in detail, without this you cannot clear the exam
        • Understand Direct Connect connections – Dedicated and Hosted connections
        • Understand how to create a Direct Connect connection (hint: LOA-CFA provides the details for partner to connect to AWS Direct Connect location)
        • Understand virtual interfaces options – Private Virtual Interface for VPC resources and Public Virtual Interface for Public resources
        • Understand setup Private and Public VIF
        • Understand Route Propagation, propagation priority, BGP connectivity
        • Understand High Availability options based on cost and time i.e. Second Direct Connect connection OR VPN connection
        • Understand Direct Connect Gateway – it provides a way to connect to multiple VPCs from on-premises data center using the same Direct Connect connection
      • Route 53
        • Understand Route 53 and its Routing Policies and their use cases. Focus on Weighted and Latency routing policies
        • Understand Route 53 Split View DNS to have the same DNS to access a site externally and internally
      • Understand CloudFront and use cases
      • Load Balancer
        • Understand ELB, ALB and NLB 
        • Understand the differences between ELB, ALB and NLB esp. ALB provides Content, Host and Path based Routing while NLB provides the ability to have a static IP address
        • Know how to design VPC CIDR block with NLB (Hint – minimum number of IPs required are 8)
        • Know how to pass original Client IP to the backend instances (Hint – X-Forwarded-for and Proxy Protocol)
      • Know WorkSpaces requirements and setup
    • Security
      • Know AWS GuardDuty as managed threat detection service
      • Know AWS Shield esp. the Shield Advanced option and the features it provides
      • Know WAF as Web Traffic Firewall – (Hint – WAF can be attached to your CloudFront, Application Load Balancer, API Gateway to dynamically detect and prevent attacks)

Google Cloud – Associate Cloud Engineer Certification learning path

Google Cloud – Associate Cloud Engineer certification exam is basically for one who works day-in day-out with the Google Cloud services. It targets a Cloud Engineer who deploys applications, monitors operations, and manages enterprise solutions. The exam makes sure it covers the gamut of services and concepts. The exam is not that tough, though, and the 2 hours available are quite plenty if you are well prepared.

Google Cloud – Associate Cloud Engineer Certification Summary

  • Has 50 questions to be answered in 2 hours.
  • Covers a wide range of Google Cloud services and what they actually do. It focuses heavily on IAM, Compute, and Storage, with a little bit of Network but hardly any data services.
  • Hands-on is a must. Covers Cloud SDK, CLI commands, and Console operations that you would use for day-to-day work. If you have not worked on GCP before, make sure you do lots of labs else you would be absolutely clueless about some of the questions and commands
  • Once again, be sure that NO single Online Course or Practice test is going to cover everything. I did the ACloud Guru – LA course which covered maybe 60-70%, but hands-on or practical knowledge is a MUST

Google Cloud – Associate Cloud Engineer Certification Topics

General Services

  • Cloud Billing
    • understand how Cloud Billing works. Monthly vs Threshold and which has priority
    • Budgets can be set to alert for projects
    • how to change a billing account for a project and what roles you need. Hint – Project Owner and Billing Administrator for the billing account
    • Cloud Billing can be exported to BigQuery and Cloud Storage
  • Resource Manager
    • Understand Resource Manager the hierarchy Organization -> Folders -> Projects -> Resources
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.
  • Cloud SDK
    • understand gcloud commands esp. when dealing with
      • configurations i.e. gcloud config
        • activate profiles – gcloud config configurations activate
        • GKE setting default cluster i.e. gcloud config set container/cluster CLUSTER_NAME
        • set project gcloud config set project mygcp-demo
        • set region gcloud config set compute/region us-west1
        • set zone gcloud config set compute/zone us-west1-a
      • Get project list and ids gcloud projects list
      • Auth i.e gcloud auth
        • Auth login using user gcloud auth login
        • Auth login using service account gcloud auth activate-service-account --key-file=sa_key.json
      • deployment manager i.e. gcloud deployment-manager
      • VPC firewalls i.e. gcloud compute firewall-rules

Network Services

  • Virtual Private Cloud
    • Understand Virtual Private Cloud (VPC), subnets and hosting applications within them. Hint – a VPC is global and spans regions, while subnets are regional
    • Understand how Firewall rules work and how they are configured. Hint – Focus on Network Tags. Also, there are 2 implied firewall rules – default ingress deny and default egress allow
    • Understand VPC Peering and Shared VPC
    • Understand the concept internal and external IPs and difference between static and ephemeral IPs
    • Primary IP range of an existing subnet can be expanded by modifying its subnet mask, setting the prefix length to a smaller number (see the sample command after this list).
  • Cloud Load Balancing
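
A minimal sketch of expanding a subnet's primary range (subnet name and region are placeholders); note the prefix length can only be made smaller, i.e. the range can only grow:

    gcloud compute networks subnets expand-ip-range my-subnet --region=us-central1 --prefix-length=20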

Identity Services

  • Identity and Access Management – IAM 
    • Identity and Access Management – IAM provides administrators the ability to manage cloud resources centrally by controlling who can take what action on specific resources.
    • Understand how IAM works and how rules apply esp. the hierarchy from Organization -> Folder -> Project -> Resources
    • Understand the difference between Primitive, Pre-defined and Custom roles and their use cases
    • IAM Policy inheritance is transitive and resources inherit the policies of all of their parent resources.
    • Effective policy for a resource is the union of the policy set on that resource and the policies inherited from higher up in the hierarchy.
    • Basically  Permissions -> Roles -> (IAM Policy) -> Members
    • Need to know and understand the roles for at least the following services
      • Cloud Storage – Admin vs Creator vs Viewer
      • Compute Engine – Admin vs Instance Admin
      • Spanner – Viewer vs Database User
      • BigQuery – User vs JobUser
    • Know how to copy roles to different projects or organizations. Hint – gcloud iam roles copy (full command shown after this list)
    • Know how to use service accounts with applications
  • Cloud Identity
    • Cloud Identity provides IDaaS (Identity as a Service) and provides single sign-on functionality and federation with external identity providers like Active Directory.
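
A quick sketch of copying a custom role between projects (project and role IDs are placeholders):

    gcloud iam roles copy --source "projects/source-project/roles/customViewer" --destination customViewer --dest-project target-project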

Compute Services

  • Make sure you know all the compute services Google Compute Engine, Google App Engine and Google Kubernetes Engine, they are heavily covered in the exam.
  • Google Compute Engine
    • Google Compute Engine is the best IaaS option for compute and provides fine grained control
    • Know how to create a Compute Engine instance, connect to it using Cloud shell or ssh keys
    • Difference between backups and images and how to create instances from the same.
    • Instance templates with managed instance groups. An instance template cannot be edited; create a new one and attach it.
    • Difference between managed vs unmanaged instance groups and auto-healing feature
    • Preemptible VMs and their use cases. HINT – can be terminated any time and supports max 24 hours.
    • Upgrade an instance without downtime using Live Migration
    • Managing access using OS Login or project and instance metadata
    • Prevent accidental deletion using deletion protection flag
    • In case of any issues or errors, how to debug the same
  • Google App Engine
    • Google App Engine is mainly the best option for PaaS, given the platforms it supports and the features it provides.
    • Deploy an application with App Engine and understand how versioning and rolling deployments can be done
    • Understand autoscaling, traffic splitting, and traffic migration.
    • Know App Engine is a regional resource and understand the steps to migrate or deploy application to different region and project.
    • Know the difference between App Engine Flexible vs Standard
  • Google Kubernetes Engine
    • Google Container Engine is now officially Google Kubernetes Engine and the questions refer to the same
    • Google Kubernetes Engine, powered by the open source container scheduler Kubernetes, enables you to run containers on Google Cloud Platform.
    • Kubernetes Engine takes care of provisioning and maintaining the underlying virtual machine cluster, scaling your application, and operational logistics such as logging, monitoring, and cluster health management.
    • Be sure to Create a Kubernetes Cluster and configure it to host an application
    • Understand how to make the cluster auto repairable and upgradable. Hint – Node auto-upgrades and auto-repairing feature
    • Very important to understand where to use gcloud commands (to create a cluster) and kubectl commands (manage the cluster components)
    • Very important to understand how to increase the cluster size and enable autoscaling for the cluster (see the sample commands after this list)
    • Know how to manage secrets like database passwords
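
For the cluster sizing, autoscaling, and secrets points above, a minimal sketch (cluster name, zone, and secret values are placeholders):

    gcloud container clusters resize my-cluster --num-nodes=5 --zone=us-central1-a
    gcloud container clusters update my-cluster --enable-autoscaling --min-nodes=1 --max-nodes=6 --zone=us-central1-a
    kubectl create secret generic db-credentials --from-literal=password=changeme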

Storage Services

  • Understand each storage service options and their use cases.
  • Cloud Storage
    • Cloud Storage is cost-effective object storage for unstructured data.
    • very important to know the different storage classes and their use cases esp. Regional and Multi-Regional (frequent access), Nearline (monthly access) and Coldline (yearly access)
    • Understand life cycle management. HINT – Changes are in accordance with the object creation date
    • Understand Signed URLs to give temporary access; the users do not need to be GCP users (see the sample command after this list)
    • Understand access control and permissions – IAM vs ACLs (fine grained control)
    • Understand best practices esp. uploading and downloading the data. HINT using parallel composite uploads
  • Relational Databases
    • Cloud SQL
      • Cloud SQL is a fully-managed service that provides MySQL, PostgreSQL and MS SQL Server
      • limited to 64TB (earlier 10TB) of storage and is a regional service.
      • Difference between Failover and Read replicas. Failover provides High Availability and almost zero downtime while Read replicas provide scalability. Cross region Read Replicas are supported
      • Perform Point-In-Time recovery. Hint – requires binary logging and backups
    • Cloud Spanner
      • is a fully managed, mission-critical relational database service.
      • provides a scalable online transaction processing (OLTP) database with high availability and strong consistency at global scale.
      • globally distributed and can scale and handle more than 10TB.
      • not a direct replacement and would need migration
    • There are no direct options for Microsoft SQL Server or Oracle yet.
  • Data Warehousing
    • BigQuery
      • provides scalable, fully managed enterprise data warehouse (EDW) with SQL and fast ad-hoc queries.
      • Remember it is most suitable for historical analysis.
      • know how to perform a preview or dry run. Hint – price is determined by bytes read, not bytes returned.
      • supports federated tables or external tables that can support Cloud Storage, BigTable, Google Drive and Cloud SQL.
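
A couple of hands-on commands for the points above (bucket, key file, project, and table names are placeholders): generate a time-limited signed URL for an object, and dry-run a BigQuery query to see the bytes it would read before paying for it:

    gsutil signurl -d 10m service-account-key.json gs://my-bucket/backup.tar.gz
    bq query --dry_run --use_legacy_sql=false 'SELECT name FROM `my-project.my_dataset.my_table` LIMIT 10'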

Data Services

  • Although there were only a couple of reference of big data services in the exam, it is important to know (DO NOT DEEP DIVE) the Big Data stack (esp. IoT gateway, Pub/Sub, Bigtable vs BigQuery) to understand which service fits the different layers of ingest, store, process, analytics, use
    • Cloud Storage as the medium to store data as data lake
    • Cloud Pub/Sub as the messaging service to capture real time data esp. IoT
    • Cloud Pub/Sub is designed to provide reliable, many-to-many, asynchronous messaging between applications esp. real time IoT data capture
    • Cloud Dataflow to process, transform, transfer data and the key service to integrate store and analytics.
    • Cloud BigQuery for storage and analytics. Remember BigQuery provides the same cost-effective option for storage as Cloud Storage
    • Cloud Dataprep to clean and prepare data. Hint – it can be used for anomaly detection.
    • Cloud Dataproc to handle existing Hadoop/Spark jobs. Hint – use it to replace existing Hadoop infra.
    • Cloud Datalab is an interactive tool for exploration, transformation, analysis and visualization of your data on Google Cloud Platform

Monitoring

  • Google Cloud Monitoring or Stackdriver
    • provides everything from monitoring, alert, error reporting, metrics, diagnostics, debugging, trace.
    • remember audits are mainly checking Stackdriver

DevOps services

  • Deployment Manager 
  • Google Marketplace (Cloud Launcher)
    • provides a way to launch common software packages e.g. Jenkins or WordPress and stacks on Google Compute Engine with just a few clicks like a prepackaged solution.
    • It can help minimize deployment time and can be used without any knowledge about the product

Google Cloud – Associate Cloud Engineer Certification Resources