HashiCorp Certified Terraform Associate Learning Path

November 30, 2020 ~ Last updated on : August 8, 2023 ~ jayendrapatil ~ 8 Comments

If you are working in a multi-cloud environment and focusing on automation, you have surely used Terraform or considered it at some point. I have been using Terraform for over two years now for provisioning infrastructure on AWS, GCP and AliCloud, right through development to production, and it has been a wonderful DevOps journey. It was good to validate those Terraform skills through the Terraform Associate certification.

The Terraform Associate certification is for Cloud Engineers specializing in operations, IT, or development who know the basic concepts and skills associated with open source HashiCorp Terraform.

HashiCorp Certified Terraform Associate Exam Summary

HashiCorp Certified Terraform Associate exam focuses on Terraform as an Infrastructure as Code tool

HashiCorp Certified Terraform Associate exam has 57 questions with a time limit of 60 minutes
Exam has multi-answer, multiple-choice, fill-in-the-blank and True/False types of questions
Questions and answer options are pretty short; if you have experience with Terraform they are pretty easy, and the time is more than sufficient.

HashiCorp Certified Terraform Associate Exam Topic Summary

Refer to the Terraform Cheat Sheet for details

Understand Infrastructure as Code (IaC) concepts

Explain what IaC is
- Infrastructure is described using a high-level configuration syntax
- IaC allows Infrastructure to be versioned and treated as you would any other code.
- Infrastructure can be shared and re-used.
Describe advantages of IaC patterns
- makes Infrastructure more reliable
- makes Infrastructure more manageable
- makes Infrastructure more automated and less error prone

Understand Terraform’s purpose (vs other IaC)

Explain multi-cloud and provider-agnostic benefits
- using multi-cloud setup increases fault tolerance and reduces dependency on a single Cloud
- Terraform provides a cloud-agnostic framework and allows a single configuration to be used to manage multiple providers, and to even handle cross-cloud dependencies.
- Terraform simplifies management and orchestration, helping operators build large-scale multi-cloud infrastructures.
Explain the benefits of state
- State is a necessary requirement for Terraform to function.
- Terraform requires some sort of database to map Terraform config to the real world.
- Terraform uses its own state structure for mapping configuration to resources in the real world
- Terraform state helps
  - track metadata such as resource dependencies.
  - improve performance, as it stores a cache of the attribute values for all resources in the state
  - sync changes when used in a team with multiple users

Understand Terraform basics

Handle Terraform and provider installation and versioning
- Providers provide an abstraction above the upstream API and are responsible for understanding API interactions and exposing resources.
- Terraform configurations must declare which providers they require, so that Terraform can install and use them
- Provider requirements are declared in a required_providers block.
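As a minimal sketch of the required_providers block described above (the version constraints shown are illustrative assumptions, not recommendations):

```hcl
terraform {
  # Constraint on the Terraform CLI itself; the value is an assumption.
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws" # registry address of the provider
      version = "~> 5.0"        # illustrative version constraint
    }
  }
}
```

Running terraform init in the directory holding this block is what installs the declared provider.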
Describe plugin based architecture
- Terraform relies on plugins called “providers” to interact with remote systems.

Demonstrate using multiple providers
- supports multiple provider instances using alias, for e.g. multiple aws providers with different regions
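A short sketch of two aws provider instances in different regions; the alias name, regions and bucket name are hypothetical values chosen for illustration:

```hcl
# Default (unaliased) provider instance.
provider "aws" {
  region = "us-east-1"
}

# Second instance of the same provider, selected through its alias.
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

# Resources use the default instance unless the provider meta-argument is set.
resource "aws_s3_bucket" "replica" {
  provider = aws.west
  bucket   = "example-replica-bucket" # hypothetical bucket name
}
```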
Describe how Terraform finds and fetches providers
- Terraform finds and installs providers when initializing a working directory. It can automatically download providers from a Terraform registry, or load them from a local mirror or cache.
- Each Terraform module must declare which providers it requires, so that Terraform can install and use them.
Explain when to use and not use provisioners and when to use local-exec or remote-exec
- Terraform provides local-exec and remote-exec to execute tasks not provided by Terraform
  - local exec executes code on the machine running terraform
  - remote exec executes on the resource provisioned and supports ssh and winrm
- Provisioners should only be used as a last resort.
- are defined within the resource block.
- support types – Create and Destroy
  - if a creation-time provisioner fails, the resource is marked as tainted by default (it will be re-created on the next apply)
  - behavior can be overridden by setting the on_failure to continue, which means ignore and continue
  - for destroy, if it fails – resources are not removed
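The provisioner behaviour above can be sketched as follows. This is an illustrative example only: the AMI id is a placeholder and the commands are assumptions.

```hcl
resource "aws_instance" "web" {
  ami           = "ami-abc123" # placeholder AMI id
  instance_type = "t3.micro"

  # Creation-time provisioner (the default); runs on the machine running terraform.
  provisioner "local-exec" {
    command    = "echo ${self.private_ip} >> private_ips.txt"
    on_failure = continue # ignore a failure instead of tainting the resource
  }

  # Destroy-time provisioner; runs before the resource is destroyed.
  provisioner "local-exec" {
    when    = destroy
    command = "echo 'instance ${self.id} is being destroyed'"
  }
}
```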

Use the Terraform CLI (outside of core workflow)

Given a scenario: choose when to use terraform fmt to format code
- terraform fmt rewrites configuration files into a standard canonical format and style. It mainly fixes indentation and aligns the = signs.
Given a scenario: choose when to use terraform taint to taint Terraform resources
- terraform taint marks a Terraform-managed resource as tainted, forcing it to be destroyed and recreated on the next apply.
- will not modify infrastructure, but does modify the state file in order to mark a resource as tainted.
- Infrastructure and state are changed in next apply.
- can be used to taint a resource within a module
Given a scenario: choose when to use terraform import to import existing infrastructure into your Terraform state
- terraform import helps import already-existing external resources, not managed by Terraform, into Terraform state and allow it to manage those resources
- the terraform import command does not generate configuration for the imported resources; you must first write the resource definition in Terraform and then import the resource
Given a scenario: choose when to use terraform workspace to create workspaces
- Terraform workspace helps manage multiple distinct sets of infrastructure resources or environments with the same code.
- state files for each workspace are stored in the directory terraform.tfstate.d
- terraform workspace new dev creates a new workspace with name dev and switches to it as well
- does not provide strong separation as it uses the same backend
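One way the same code can serve several workspaces is to branch on the built-in terraform.workspace value. The sketch below assumes hypothetical workspaces named dev and prod and a placeholder AMI id; the CLI steps are shown as comments.

```hcl
# CLI steps:
#   terraform workspace new dev      # creates "dev" and switches to it
#   terraform workspace select dev   # switches to an existing workspace
#   terraform workspace list         # the active workspace is marked with *

locals {
  # terraform.workspace holds the name of the currently selected workspace.
  instance_type = terraform.workspace == "prod" ? "t3.large" : "t3.micro"
}

resource "aws_instance" "app" {
  ami           = "ami-abc123" # placeholder AMI id
  instance_type = local.instance_type

  tags = {
    Name = "app-${terraform.workspace}"
  }
}
```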

Given a scenario: choose when to use terraform state to view Terraform state
- state helps keep track of the infrastructure Terraform manages
- stored locally in the terraform.tfstate
- recommended not to edit the state manually
- Use terraform state command
  - mv – to move/rename modules
  - rm – to safely remove resource from the state. (destroy/retain like)
  - pull – to observe current remote state
  - list & show – to write/debug modules

Given a scenario: choose when to enable verbose logging and what the outcome/value is
- debugging can be controlled using TF_LOG, which can be set to one of the levels TRACE, DEBUG, INFO, WARN or ERROR, with TRACE being the most verbose.
- the log file path can be set using TF_LOG_PATH; TF_LOG must also be set for any logging to be enabled.

Interact with Terraform modules

Contrast module source options
- Terraform Module Registry allows you to browse, filter and search for modules
Interact with module inputs and outputs
- Input variables serve as parameters for a Terraform module, allowing aspects of the module to be customized without altering the module’s own source code, and allowing modules to be shared between different configurations.
- Resources defined in a module are encapsulated, so the calling module cannot access their attributes directly.
- A child module can declare output values to selectively export certain values to be accessed by the calling module as module.module_name.output_value
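A minimal sketch of module inputs and outputs. The module path, file layout and names are hypothetical, and the files are shown together in one block for brevity.

```hcl
# modules/network/variables.tf (hypothetical child module): input variable
variable "cidr_block" {
  type = string
}

# modules/network/main.tf: resource encapsulated inside the module
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

# modules/network/outputs.tf: value selectively exported to the caller
output "vpc_id" {
  value = aws_vpc.this.id
}

# main.tf (calling module): passes the input and reads the output
module "network" {
  source     = "./modules/network"
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "app" {
  vpc_id     = module.network.vpc_id # module.<MODULE NAME>.<OUTPUT NAME>
  cidr_block = "10.0.1.0/24"
}
```

The calling module cannot reference aws_vpc.this directly; only the declared vpc_id output is visible to it.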

Describe variable scope within modules/child modules
- Modules are called from within other modules using module blocks
- All modules require a source argument, which is a meta-argument defined by Terraform
- To call a module means to include the contents of that module into the configuration with specific values for its input variables.
Discover modules from the public Terraform Module Registry
- Terraform Module Registry allows you to browse, filter and search for modules

Defining module version
- must be on GitHub and must be a public repo, if using public registry.
- must be named terraform-<PROVIDER>-<NAME>, where <NAME> reflects the type of infrastructure the module manages and <PROVIDER> is the main provider where it creates that infrastructure. for e.g. terraform-google-vault or terraform-aws-ec2-instance.
- must maintain x.y.z tags for releases to identify module versions. Tag names must be semantic versions and can optionally be prefixed with a v, for example v1.0.4 and 0.9.2. Tags that don’t look like version numbers are ignored.
- must maintain a Standard module structure, which allows the registry to inspect the module and generate documentation, track resource usage, parse submodules and examples, and more.
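On the consuming side, a registry module is pinned with the version argument. In this sketch the namespace example-org is hypothetical and the constraint is an illustrative assumption; a repository named terraform-google-vault would be addressed as vault/google under its namespace.

```hcl
module "vault" {
  # Registry address format: <NAMESPACE>/<NAME>/<PROVIDER>
  source = "example-org/vault/google" # hypothetical namespace

  # Accepts any 1.x release tag (for example 1.0.4 or 1.2.0), but not 2.0.0.
  version = "~> 1.0"
}
```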

Navigate Terraform workflow

Describe Terraform workflow ( Write -> Plan -> Create )
- Core Terraform workflow has three steps:
  - Write – Author infrastructure as code.
  - Plan – Preview changes before applying.
  - Apply – Provision reproducible infrastructure.
Initialize a Terraform working directory terraform init
- initializes a working directory containing Terraform configuration files.
- performs backend initialization, modules and plugins installation.
- plugins are downloaded into a sub-directory of the present working directory, at the path .terraform/plugins (.terraform/providers from Terraform 0.13 onwards)
- does not delete the existing configuration or state

Validate a Terraform configuration terraform validate
- validates the configuration files in a directory, referring only to the configuration and not accessing any remote services such as remote state, provider APIs, etc.
- verifies whether a configuration is syntactically valid and internally consistent, regardless of any provided variables or existing state.
- useful for general verification of reusable modules, including the correctness of attribute names and value types.
Generate and review an execution plan for Terraform terraform plan
- terraform plan creates an execution plan as it traverses each vertex and requests each provider using parallelism
- calculates the difference between the last-known state and the current state and presents this difference as the output of the terraform plan operation to user in their terminal
- does not modify the infrastructure or state.
- allows a user to see which actions Terraform will perform prior to making any changes to reach the desired state
- performs refresh for each resource and might hit rate limiting issues as it calls provider APIs
- all resources refresh can be disabled or avoided using
  - -refresh=false or
  - -target=xxxx or
  - break resources into different directories.
Execute changes to infrastructure with Terraform terraform apply
- will always ask for confirmation before executing unless passed the -auto-approve flag.
- if a resource successfully creates but fails during provisioning, Terraform will error and mark the resource as “tainted”. Terraform does not roll back the changes
Destroy Terraform managed infrastructure terraform destroy
- will always ask for confirmation before executing unless passed the -auto-approve flag.

Implement and maintain state

Describe default local backend
- A “backend” in Terraform determines how state is loaded and how an operation such as apply is executed. This abstraction enables non-local file state storage, remote execution, etc.
- is responsible for storing state and providing an API for optional state locking
- needs to be initialized
- helps
  - collaboration and working as a team, with the state maintained remotely and state locking
  - can provide enhanced security for sensitive data
  - support remote operations
- local (default) backend stores state in a local JSON file on disk
Outline state locking
- happens for all operations that could write state, if supported by backend for e.g. S3 with DynamoDB, Consul etc.
- prevents others from acquiring the lock & potentially corrupting the state
- use force-unlock command to manually unlock the state if unlocking failed
- backends which support state locking are
  - azurerm
  - Hashicorp consul
  - Tencent Cloud Object Storage (COS)
  - etcdv3
  - Google Cloud Storage GCS
  - HTTP endpoints
  - Kubernetes Secret with locking done using a Lease resource
  - AliCloud Object Storage OSS with locking via TableStore
  - PostgreSQL
  - AWS S3 with locking via DynamoDB
  - Terraform Enterprise
- Backends which do not support state locking are
  - artifactory
  - etcd

Handle backend authentication methods
- each remote backend supports different authentication mechanisms, which can be configured as part of the backend configuration
Describe remote state storage mechanisms and supported standard backends
- a remote backend stores state remotely, for e.g. in S3, OSS, GCS or Consul, and can support features like remote operations, state locking, encryption, versioning etc.
- github is not a supported backend type.
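As an illustration of a remote backend with locking, the sketch below stores state in S3 and locks via DynamoDB. The bucket, key and table names are hypothetical.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"   # hypothetical bucket name
    key            = "network/terraform.tfstate" # path of the state object
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks"   # hypothetical table used for state locking
    encrypt        = true                        # server-side encryption at rest
  }
}
```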
Describe effect of Terraform refresh on state
- terraform refresh is used to reconcile the state Terraform knows about (via its state file) with the real-world infrastructure.
- can be used to detect any drift from the last-known state, and to update the state file.
- does not modify infrastructure but does modify the state file.
Describe backend block in configuration and best practices for partial configurations
- Backend configuration doesn’t support interpolations.
- supports partial configuration with remaining configuration arguments provided as part of the initialization process
- if switching the backend for the first time, Terraform provides a migration option for the existing state
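A partial configuration might look like the sketch below, where some arguments are left out of the backend block and supplied during terraform init. The bucket name and the backend.hcl file name are hypothetical.

```hcl
terraform {
  backend "s3" {
    # Only the key is fixed in code; no variables or other interpolations are allowed here.
    key = "network/terraform.tfstate"

    # bucket and region are intentionally omitted and supplied at init time:
    #   terraform init -backend-config="bucket=example-terraform-state" -backend-config="region=us-east-1"
    # or from a file holding the same settings:
    #   terraform init -backend-config=backend.hcl
  }
}
```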
Understand secret management in state files
- terraform state command is used for advanced state management
- Terraform has no mechanism to redact or protect secrets that are returned via data sources, so secrets read via this provider will be persisted into the Terraform state, into any plan files, and in some cases in the console output produced while planning and applying.
- state can be protected by using Vault and remote backends with encryption and proper access control

Read, generate, and modify configuration

Demonstrate use of variables and outputs
- Variables
  - serve as parameters for a Terraform module and
  - act like function arguments
  - count is a reserved word and cannot be used as variable name
- Output
  - are like function return values.
  - can be marked sensitive which prevents showing its value in the list of outputs. However, they are stored in the state as plain text.
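A small sketch of variables acting as module arguments and outputs acting as return values; the variable names and default are illustrative assumptions.

```hcl
# Input variable with a default; callers may override it.
variable "instance_type" {
  type    = string
  default = "t3.micro"
}

# Input variable without a default; a value must be supplied.
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan and apply output
}

output "instance_type" {
  value = var.instance_type
}

# Hidden in the CLI output list, but still written to the state in plain text.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```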
Describe secure secret injection best practice
Understand the use of collection and structural types
- supports primitive data types of
  - string, number and bool
  - automatically convert number and bool values to string values
- supports complex data types of
  - list – sequence of values identified by consecutive whole numbers starting with zero.
  - map – collection of values where each is identified by a string label
  - set – collection of unique values that do not have any secondary identifiers or ordering.
- supports structural data types of
  - object – a collection of named attributes with their own type
  - tuple – a sequence of elements identified by consecutive whole numbers starting with zero, where each element has its own type.
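The type constraints listed above can be sketched as variable declarations; all names and default values here are made up for illustration.

```hcl
variable "availability_zones" {
  type    = list(string) # ordered, indexed from zero
  default = ["us-east-1a", "us-east-1b"]
}

variable "tags" {
  type    = map(string) # values identified by string keys
  default = { env = "dev" }
}

variable "ports" {
  type    = set(number) # unique values, no ordering
  default = [80, 443]
}

variable "server" {
  type    = object({ name = string, cpus = number }) # named attributes, each with its own type
  default = { name = "web", cpus = 2 }
}

variable "pair" {
  type    = tuple([string, number, bool]) # fixed positions, each with its own type
  default = ["a", 1, true]
}
```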
Create and differentiate resource and data configuration
- Resources describe one or more infrastructure objects, such as virtual networks, instances, or higher-level components such as DNS records.
- Data sources allow data to be fetched or computed for use elsewhere in Terraform configuration. Use of data sources allows a Terraform configuration to make use of information defined outside of Terraform, or defined by another separate Terraform configuration.
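To contrast the two block types, the sketch below reads an existing AMI through a data source and creates an instance through a resource. The name filter and owner are assumptions.

```hcl
# Data source: only reads information defined outside this configuration.
data "aws_ami" "centos" {
  most_recent = true
  owners      = ["self"] # assumption: the AMI lives in the current account

  filter {
    name   = "name"
    values = ["centos-*"] # hypothetical naming pattern
  }
}

# Resource: an infrastructure object that Terraform creates and manages.
resource "aws_instance" "web" {
  ami           = data.aws_ami.centos.id
  instance_type = "t3.micro"
}
```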
Use resource addressing and resource parameters to connect resources together
Use Terraform built-in functions to write configuration
- lookup retrieves the value of a single element from a map, given its key. If the given key does not exist, the given default value is returned instead. lookup(map, key, default)
- zipmap constructs a map from a list of keys and a corresponding list of values. A map is denoted by { } whereas a list is denoted by [ ] for e.g. zipmap(["a", "b"], [1, 2]) results in {"a" = 1, "b" = 2}
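Both functions can be tried in a locals block or in terraform console. The map contents below are invented for illustration.

```hcl
locals {
  instance_types = {
    dev  = "t3.micro"
    prod = "t3.large"
  }

  # Key exists, so the stored value is returned: "t3.large"
  prod_type = lookup(local.instance_types, "prod", "t3.small")

  # Key is missing, so the default is returned: "t3.small"
  staging_type = lookup(local.instance_types, "staging", "t3.small")

  # Pairs keys with values by position: {"a" = 1, "b" = 2}
  letters = zipmap(["a", "b"], [1, 2])
}
```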
Configure resource using a dynamic block
- dynamic acts much like a for expression, but produces nested blocks instead of a complex typed value. It iterates over a given complex value, and generates a nested block for each element of that complex value.
- Overuse of dynamic block is not recommended as it makes the code hard to understand and debug
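A minimal sketch of a dynamic block that generates one nested ingress block per port; the variable and security group names are hypothetical.

```hcl
variable "ingress_ports" {
  type    = list(number)
  default = [80, 443]
}

resource "aws_security_group" "web" {
  name = "web-sg" # hypothetical name

  # Produces one nested "ingress" block for each element of var.ingress_ports.
  dynamic "ingress" {
    for_each = var.ingress_ports

    content {
      from_port   = ingress.value # current element of the iteration
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
}
```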
Describe built-in dependency management (order of execution based)
- Terraform analyses any expressions within a resource block to find references to other objects and treats those references as implicit ordering requirements when creating, updating, or destroying resources.
- Explicit dependencies can be defined using the depends_on meta-argument for dependencies between resources that are not visible to Terraform
Terraform supports comments using #, // and /* */
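The sketch below shows an implicit and an explicit dependency, using each of the three comment styles once. Resource names, the bucket name and the AMI id are placeholders.

```hcl
# Single-line comment: the instance referenced by the Elastic IP below.
resource "aws_instance" "web" {
  ami           = "ami-abc123" # placeholder AMI id
  instance_type = "t3.micro"
}

// Implicit dependency: referencing aws_instance.web.id orders the EIP after the instance.
resource "aws_eip" "ip" {
  instance = aws_instance.web.id
}

resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-bucket" # hypothetical bucket name
}

/* Explicit dependency: no attribute of the bucket is referenced here,
   so the ordering has to be declared with depends_on. */
resource "aws_instance" "worker" {
  ami           = "ami-abc123"
  instance_type = "t3.micro"
  depends_on    = [aws_s3_bucket.assets]
}
```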

Understand Terraform Cloud and Enterprise capabilities

Describe the benefits of Sentinel, registry, and workspaces
- Terraform Cloud provides a private module registry for storing modules privately for use within the organization
Differentiate OSS and TFE workspaces
Summarize features of Terraform Cloud
- Terraform Enterprise currently supports running under the following operating systems for a Clustered deployment:
  - Ubuntu 16.04.3 – 16.04.5 / 18.04
  - Red Hat Enterprise Linux 7.4 through 7.7
  - CentOS 7.4 – 7.7
  - Amazon Linux
  - Oracle Linux
  - Clusters currently don’t support other Linux variants.
- Terraform Enterprise install that is provisioned on a network that does not have Internet access is generally known as an air-gapped install.

HashiCorp Certified Terraform Associate Exam Resources

Courses
- KodeKloud – Terraform Associate Certification
Practice tests
- Braincert – HashiCorp Certified Terraform Associate Practice Exams
- Whizlabs – Terraform Certification Exam Questions

Terraform Cheat Sheet

November 30, 2020 ~ Last updated on : November 30, 2020 ~ jayendrapatil ~ 8 Comments

An open source, declarative provisioning tool based on the Infrastructure as Code paradigm

designed on immutable infrastructure principles
Written in Golang and uses own syntax – HCL (Hashicorp Configuration Language), but also supports JSON

Helps to evolve the infrastructure, safely and predictably
Applies Graph Theory to IaC and provides Automation, Versioning and Reusability
Terraform is a multipurpose composition tool:
○ Composes multiple tiers (SaaS/PaaS/IaaS)
○ A plugin-based architecture model

Terraform is not cloud agnostic in the sense that a configuration written for one provider can be reused unchanged on another. It embraces all major Cloud Providers and provides a common language to orchestrate the infrastructure resources
Terraform is not a configuration management tool; other tools like Chef and Ansible exist in the market for that.

Terraform Architecture

Terraform Providers (Plugins)

provide an abstraction above the upstream API and are responsible for understanding API interactions and exposing resources.
Invoke only upstream APIs for the basic CRUD operations
Providers are unaware of anything related to configuration loading, graph theory, etc.

supports multiple provider instances using alias, for e.g. multiple aws providers with different regions
can be integrated with any API using providers framework
Most providers configure a specific infrastructure platform (either cloud or self-hosted).

can also offer local utilities for tasks like generating random numbers for unique resource names.

Terraform Provisioners

run code locally or remotely on resource creation
- local exec executes code on the machine running terraform
- remote exec
  - runs on the provisioned resource
  - supports ssh and winrm
- requires inline list of commands
should be used as a last resort
are defined within the resource block.

support types – Create and Destroy
- if a creation-time provisioner fails, the resource is marked as tainted by default (it will be re-created on the next apply)
- behavior can be overridden by setting the on_failure to continue, which means ignore and continue
- for destroy, if it fails – resources are not removed

Terraform Workspaces

helps manage multiple distinct sets of infrastructure resources or environments with the same code.
you just need to create the needed workspaces and use them, instead of creating a directory for each environment to manage

state files for each workspace are stored in the directory terraform.tfstate.d
terraform workspace new dev creates a new workspace and switches to it as well
terraform workspace select dev helps select workspace

terraform workspace list lists the workspaces and shows the current active one with *
does not provide strong separation as it uses the same backend

Terraform Workflow

init

initializes a working directory containing Terraform configuration files.
performs
- backend initialization, i.e. the storage for the Terraform state file.
- modules installation, downloaded from terraform registry to local path
- provider plugin installation; the plugins are downloaded into a sub-directory of the present working directory, at the path .terraform/plugins (.terraform/providers from Terraform 0.13 onwards)

supports -upgrade to update all previously installed plugins to the newest version that complies with the configuration’s version constraints
is safe to run multiple times, to bring the working directory up to date with changes in the configuration
does not delete the existing configuration or state

validate

validates syntactically for format and correctness.
is used to validate/check the syntax of the Terraform files.

verifies whether a configuration is syntactically valid and internally consistent, regardless of any provided variables or existing state.
A syntax check is done on all the terraform files in the directory, and will display an error if any of the files doesn’t validate.

plan

creates an execution plan
traverses each vertex and requests each provider using parallelism
calculates the difference between the last-known state and the current state and presents this difference as the output of the terraform plan operation to user in their terminal

does not modify the infrastructure or state.
allows a user to see which actions Terraform will perform prior to making any changes to reach the desired state
will scan all *.tf files in the directory and create the plan

will perform refresh for each resource and might hit rate limiting issues as it calls provider APIs
all resources refresh can be disabled or avoided using
- -refresh=false or
- -target=xxxx or
- break resources into different directories.
supports -out to save the plan

apply

apply changes to reach the desired state.
scans the current directory for the configuration and applies the changes appropriately.

can be provided with an explicit plan file, saved using -out from terraform plan
If no explicit plan file is given on the command line, terraform apply will create a new plan automatically and prompt for approval to apply it
will modify the infrastructure and the state.

if a resource successfully creates but fails during provisioning,
- Terraform will error and mark the resource as “tainted”.
- A resource that is tainted has been physically created, but can’t be considered safe to use since provisioning failed.
- Terraform also does not automatically roll back and destroy the resource during the apply when the failure happens, because that would go against the execution plan: the execution plan would’ve said a resource will be created, but does not say it will ever be deleted.
does not import any resource.
supports -auto-approve to apply the changes without asking for a confirmation

supports -target to apply a specific resource or module

refresh

used to reconcile the state Terraform knows about (via its state file) with the real-world infrastructure

does not modify infrastructure, but does modify the state file

destroy

destroy the infrastructure and all resources

modifies both state and infrastructure
terraform destroy -target can be used to destroy targeted resources
terraform plan -destroy allows creation of destroy plan

import

helps import already-existing external resources, not managed by Terraform, into Terraform state and allow it to manage those resources
the terraform import command does not generate configuration for the imported resources; you must first write the resource definition in Terraform and then import the resource

taint

marks a Terraform-managed resource as tainted, forcing it to be destroyed and recreated on the next apply.
will not modify infrastructure, but does modify the state file in order to mark a resource as tainted. Infrastructure and state are changed in next apply.

can be used to taint a resource within a module

fmt

formats the code into a standard canonical format and style

console

command provides an interactive console for evaluating expressions.

Terraform Modules

enables code reuse

supports versioning to maintain compatibility
stores code remotely
enables easier testing

enables encapsulation with all the separate resources under one configuration block
modules can be nested inside other modules, allowing you to quickly spin up whole separate environments.
can be referred using source attribute

supports Local and Remote modules
- Local modules are stored alongside the Terraform configuration (in a separate directory, outside of each environment but in the same repository) with source path ./ or ../
- Remote modules are stored externally in a separate repository, and supports versioning

supports the following module sources
- Local paths
- Terraform Registry
- GitHub
- Bitbucket
- Generic Git, Mercurial repositories
- HTTP URLs
- S3 buckets
- GCS buckets

Module requirements
- must be on GitHub and must be a public repo, if using public registry.
- must be named terraform-<PROVIDER>-<NAME>, where <NAME> reflects the type of infrastructure the module manages and <PROVIDER> is the main provider where it creates that infrastructure. for e.g. terraform-google-vault or terraform-aws-ec2-instance.
- must maintain x.y.z tags for releases to identify module versions. Release tag names must be a semantic version, which can optionally be prefixed with a v for example, v1.0.4 and 0.9.2. Tags that don’t look like version numbers are ignored.
- must maintain a Standard module structure, which allows the registry to inspect the module and generate documentation, track resource usage, parse submodules and examples, and more.

Terraform Read and write configuration

Resources
- resource is the most important element in the Terraform language; it describes one or more infrastructure objects, such as compute instances etc
- resource type and local name together serve as an identifier for a given resource and must be unique within a module for e.g. aws_instance.local_name

Data Sources
- data allow data to be fetched or computed for use elsewhere in Terraform configuration
- allows a Terraform configuration to make use of information defined outside of Terraform, or defined by another separate Terraform configuration

Variables
- variable serve as parameters for a Terraform module and act like function arguments
- allows aspects of the module to be customized without altering the module’s own source code, and allowing modules to be shared between different configurations
- can be defined through multiple ways
  - command line for e.g. -var="image_id=ami-abc123"
  - variable definition files .tfvars or .tfvars.json. By default, terraform automatically loads
    - Files named exactly terraform.tfvars or terraform.tfvars.json.
    - Any files with names ending in .auto.tfvars or .auto.tfvars.json
    - file can also be passed with -var-file
  - environment variables can be used to set variables using the format TF_VAR_name
- Terraform loads variables in the following order, with later sources taking precedence over earlier ones:
  - Environment variables
  - terraform.tfvars file, if present.
  - terraform.tfvars.json file, if present.
  - Any *.auto.tfvars or *.auto.tfvars.json files, processed in lexical order of their filenames.
  - Any -var and -var-file options on the command line, in the order they are provided.
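To make the loading order concrete, the sketch below declares one variable and lists, as comments, the ways its value could be overridden. All values and the prod.auto.tfvars file name are made up.

```hcl
# variables.tf
variable "image_id" {
  type    = string
  default = "ami-default" # used only when no other source sets a value
}

# Sources that override the default, from lowest to highest precedence:
#   1. export TF_VAR_image_id=ami-from-env              (environment variable)
#   2. terraform.tfvars:   image_id = "ami-from-tfvars" (loaded automatically)
#   3. prod.auto.tfvars:   image_id = "ami-from-auto"   (loaded automatically, lexical order)
#   4. terraform apply -var="image_id=ami-abc123"       (command line wins)
```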
Local Values
- locals assigns a name to an expression, allowing it to be used multiple times within a module without repeating it.
- are like a function’s temporary local variables.
- helps to avoid repeating the same values or expressions multiple times in a configuration.
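A brief sketch of local values that avoid repeating an expression; the variable names, defaults and bucket are hypothetical.

```hcl
variable "project" {
  type    = string
  default = "shop"
}

variable "environment" {
  type    = string
  default = "dev"
}

locals {
  # Computed once and reused wherever a name or tag set is needed.
  name_prefix = "${var.project}-${var.environment}"

  common_tags = {
    Project     = var.project
    Environment = var.environment
  }
}

resource "aws_s3_bucket" "logs" {
  bucket = "${local.name_prefix}-logs"
  tags   = local.common_tags
}
```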
Output
- are like function return values.
- output can be marked as containing sensitive material using the optional sensitive argument, which prevents Terraform from showing its value in the list of outputs. However, they are still stored in the state as plain text.
- In a parent module, outputs of child modules are available in expressions as module.<MODULE NAME>.<OUTPUT NAME>.

Named Values
- is an expression that references the associated value for e.g. aws_instance.local_name, data.aws_ami.centos, var.instance_type etc.
- support Local named values for e.g count.index
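The named values mentioned above can be seen together in a short sketch; the AMI id and variable default are placeholders.

```hcl
variable "instance_type" {
  type    = string
  default = "t3.micro"
}

resource "aws_instance" "web" {
  count         = 2
  ami           = "ami-abc123"      # placeholder AMI id
  instance_type = var.instance_type # named value referring to an input variable

  tags = {
    # count.index is a local named value, available only inside this block.
    Name = "web-${count.index}"
  }
}

output "first_instance_id" {
  # Named value referring to one instance of the resource above.
  value = aws_instance.web[0].id
}
```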

Dependencies
- implicit dependencies are identified automatically, as Terraform infers when one resource depends on another by studying the resource attributes used in interpolation expressions for e.g. aws_eip on resource aws_instance
- explicit dependencies can be defined using depends_on for dependencies between resources that are not visible to Terraform

Data Types
- supports primitive data types of
  - string, number and bool
  - Terraform language will automatically convert number and bool values to string values when needed
- supports complex data types of
  - list – a sequence of values identified by consecutive whole numbers starting with zero.
  - map – a collection of values where each is identified by a string label.
  - set – a collection of unique values that do not have any secondary identifiers or ordering.
- supports structural data types of
  - object – a collection of named attributes that each have their own type
  - tuple – a sequence of elements identified by consecutive whole numbers starting with zero, where each element has its own type.
Built-in Functions
- includes a number of built-in functions that can be called from within expressions to transform and combine values for e.g. min, max, file, concat, element, index, lookup etc.
- does not support user-defined functions
Dynamic Blocks
- acts much like a for expression, but produces nested blocks instead of a complex typed value. It iterates over a given complex value, and generates a nested block for each element of that complex value.
Terraform Comments
- supports three different syntaxes for comments:
  - #
  - //
  - /* and */

Terraform Backends

determines how state is loaded and how an operation such as apply is executed
are responsible for storing state and providing an API for optional state locking
needs to be initialized
if switching the backend for the first time, Terraform provides a migration option for the existing state
helps
- collaboration and working as a team, with the state maintained remotely and state locking
- can provide enhanced security for sensitive data
- support remote operations
supports local vs remote backends
- local (default) backend stores state in a local JSON file on disk
- a remote backend stores state remotely, for e.g. in S3, OSS, GCS or Consul, and can support features like remote operations, state locking, encryption, versioning etc.
supports partial configuration with remaining configuration arguments provided as part of the initialization process
Backend configuration doesn’t support interpolations.
GitHub is not a supported backend type in Terraform.

Terraform State Management

state helps keep track of the infrastructure Terraform manages
stored locally in the terraform.tfstate
recommended not to edit the state manually
Use terraform state command
- mv – to move/rename modules
- rm – to safely remove resource from the state. (destroy/retain like)
- pull – to observe current remote state
- list & show – to write/debug modules

State Locking

happens for all operations that could write state, if supported by backend
prevents others from acquiring the lock & potentially corrupting the state
backends which support state locking are
- azurerm
- Hashicorp consul
- Tencent Cloud Object Storage (COS)
- etcdv3
- Google Cloud Storage GCS
- HTTP endpoints
- Kubernetes Secret with locking done using a Lease resource
- AliCloud Object Storage OSS with locking via TableStore
- PostgreSQL
- AWS S3 with locking via DynamoDB
- Terraform Enterprise
Backends which do not support state locking are
- artifactory
- etcd
can be disabled for most commands with the -lock=false flag
use force-unlock command to manually unlock the state if unlocking failed

State Security

can contain sensitive data, depending on the resources in use for e.g passwords and keys
using local state, data is stored in plain-text JSON files
using remote state, state is held in memory when used by Terraform. It may be encrypted at rest, if supported by backend for e.g. S3, OSS

Terraform Logging

debugging can be controlled using TF_LOG, which can be set to one of the levels TRACE, DEBUG, INFO, WARN or ERROR, with TRACE being the most verbose.
log file path can be controlled using TF_LOG_PATH. TF_LOG must also be set, otherwise no logs are written.
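
As a minimal sketch of the two variables above, the following Python helper prepares an environment for running Terraform with logging enabled. The function name `logging_env` is hypothetical; only the `TF_LOG` and `TF_LOG_PATH` variable names and the level names come from the notes above.

```python
import os

VALID_LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR"}


def logging_env(level, log_path=None, base_env=None):
    """Return a copy of the environment with Terraform logging variables set."""
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"unsupported TF_LOG level: {level}")
    env = dict(os.environ if base_env is None else base_env)
    env["TF_LOG"] = level
    if log_path is not None:
        # TF_LOG_PATH only has an effect when TF_LOG is set as well.
        env["TF_LOG_PATH"] = log_path
    return env


print(logging_env("trace", "terraform.log", base_env={}))
```

The returned dictionary can be passed as the `env` argument of `subprocess.run(["terraform", "plan"], env=...)`.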

Terraform Cloud and Terraform Enterprise

Terraform Cloud provides Cloud Infrastructure Automation as a Service. It is offered as a multi-tenant SaaS platform and is designed to suit the needs of smaller teams and organizations. Its smaller plans default to one run at a time, which prevents users from executing multiple runs concurrently.
Terraform Enterprise is a private install for organizations who prefer to self-manage. It is designed to suit the needs of organizations with specific requirements for security, compliance and custom operations.
Terraform Cloud provides features
- Remote Terraform Execution – supports Remote Operations for Remote Terraform execution which helps provide consistency and visibility for critical provisioning operations.
- Workspaces – organizes infrastructure with workspaces instead of directories. Each workspace contains everything necessary to manage a given collection of infrastructure, and Terraform uses that content whenever it executes in the context of that workspace.
- Remote State Management – acts as a remote backend for the Terraform state. State storage is tied to workspaces, which helps keep state associated with the configuration that created it.
- Version Control Integration – is designed to work directly with the version control system (VCS) provider.
- Private Module Registry – provides a private and central library of versioned & validated modules to be used within the organization
- Team based Permission System – can define groups of users that match the organization’s real-world teams and assign them only the permissions they need
- Sentinel Policies – embeds the Sentinel policy-as-code framework, which lets you define and enforce granular policies for how the organization provisions infrastructure. Helps eliminate provisioned resources that don’t follow security, compliance, or operational policies.
- Cost Estimation – can display an estimate of its total cost, as well as any change in cost caused by the proposed updates
- Security – encrypts state at rest and protects it with TLS in transit.
Terraform Enterprise features
- includes all the Terraform Cloud features with
- Audit – supports detailed audit logging and tracks the identity of the user requesting state and maintains a history of state changes.
- SSO/SAML – SAML for SSO provides the ability to govern user access to your applications.
Terraform Enterprise currently supports running under the following operating systems for a Clustered deployment:
- Ubuntu 16.04.3 – 16.04.5 / 18.04
- Red Hat Enterprise Linux 7.4 through 7.7
- CentOS 7.4 – 7.7
- Amazon Linux
- Oracle Linux
- Clusters currently don’t support other Linux variants.
Terraform Cloud currently supports following VCS Provider
- GitHub.com
- GitHub.com (OAuth)
- GitHub Enterprise
- GitLab.com
- GitLab EE and CE
- Bitbucket Cloud
- Bitbucket Server
- Azure DevOps Server
- Azure DevOps Services
A Terraform Enterprise install that is provisioned on a network that does not have Internet access is generally known as an air-gapped install. These types of installs require you to pull updates, providers, etc. from external sources vs. being able to download them directly.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Learning Path


October 23, 2020 ~ Last updated on : January 11, 2022 ~ jayendrapatil ~ 6 Comments

Finally All Down for AWS (for now) …

Continuing on my AWS journey with the last AWS certification, I took another step by clearing the AWS Certified Alexa Skill Builder – Specialty (AXS-C01) certification. It is amazing to know and learn how Voice first experiences are making an impact and changing how we think about technology and use cases.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) exam basically validates your ability to build, test, publish and certify Alexa skills.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Summary

AWS Certified Alexa Skill Builder – Specialty exam focuses only on Alexa and how to build skills.
AWS Certified Alexa Skill Builder – Specialty exam has 65 questions with a time limit of 170 minutes
Compared to the other professional and specialty exams, the questions and answers are not long and are similar to those in the associate exams. So if you are well prepared, you should not need the full 170 minutes.

As the exam was taken online from home, there was no access to paper and pen, but the trick remains the same: read the question, sketch a rough architecture, and focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Compare the remaining 2 answers to spot where they differ; that should help you reach the right answer or at least give you a 50% chance of getting it right.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Topic Summary

Refer AWS Alexa Cheat Sheet

Domain 1: Voice-First Design Practices and Capabilities

1.1 Describe how users interact with skills

1.2 Map features and capabilities to use cases

Alexa supports display cards to display text (Simple card) and text with image (Standard card)

Alexa Skills Kit supports the following APIs
- Alexa Settings APIs allow developers to retrieve customer preferences for the settings like time zone, distance measuring unit, and temperature measurement unit
- Device services – a skill can request the customer’s permission to their address information, which is a static data filled by customer and includes the country/region, postal code and full address
- Customer Profile services – a skill can request the customer’s permission to their contact information, which includes name, email address and phone number
- With Location services, a skill can ask a user’s permission to obtain the real-time location of their Alexa-enabled device, specifically at the time of the user’s request to Alexa, so that the skill can provide enhanced services.
Alexa Skill Kit APIs need apiAccessToken and deviceId to access the ASK APIs

Progressive Response API allows you to keep the user engaged while the skill prepares a full response to the user’s request.
Personalization can be provided using userId and state persistence

Domain 2: Skill Design

2.1 Design and develop an interaction model

Alexa interaction model includes skill, Invocation name, utterances, slots, Intents
A skill is ‘an app for Alexa’, however they are not downloadable but just need to be enabled.

Wakeword – Amazon offers a choice of wakewords like ‘Alexa’, ‘Amazon’, ‘Echo’, or ‘Computer’, with the default being ‘Alexa’.
Launch phrases include “run,” “start,” “play,” “resume,” “use,” “launch,” “ask,” “open,” “tell,” “load,” “begin,” and “enable.”
Connecting words include “to,” “from,” “in,” “using,” “with,” “about,” “for,” “that,” “by,” “if,” “and,” “whether.”

Invocation name
- is the word or phrase used to trigger the skill for custom skills and the invocation name should adhere to the requirements
- must not infringe upon the intellectual property rights of an entity or person
- must be composed of two or more words.
- One-word invocation names are allowed only for brand/intellectual property.
- must not include names of people or places
- if two-word invocation names, one of the words cannot be a definite article (“the”), indefinite article (“a”, “an”) or preposition (“for”, “to”, “of,” “about,” “up,” “by,” “at,” “off,” “with”).
- must not contain any of the Alexa skill launch phrases, connecting words, wake words, or the words “skill” or “app”
- must contain only lower-case alphabetic characters, spaces between words, and possessive apostrophes
- must spell characters like numbers for e.g., twenty one
- can have periods in the invocation names containing acronyms or abbreviations that are pronounced as a series of individual letters, for e.g. NASA as n. a. s. a.
- cannot spell out phonemes for e.g., a skill titled “AWS Facts” would need “AWS” represented as “a. w. s. ” and NOT “ay double u ess.”
- must not create confusion with existing Alexa features.
- must be written in each supported language
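
Several of the invocation name rules above are mechanical enough to check in code. The Python sketch below covers only a subset (allowed characters, word count, articles and prepositions in two-word names, and banned words); the function name `invocation_name_problems` is hypothetical, and rules such as intellectual property, brand exceptions or names of people and places cannot be verified this way.

```python
import re

LAUNCH_PHRASES = {"run", "start", "play", "resume", "use", "launch",
                  "ask", "open", "tell", "load", "begin", "enable"}
CONNECTING_WORDS = {"to", "from", "in", "using", "with", "about",
                    "for", "that", "by", "if", "and", "whether"}
RESERVED_WORDS = {"alexa", "amazon", "echo", "computer", "skill", "app"}
ARTICLES_AND_PREPOSITIONS = {"the", "a", "an", "for", "to", "of", "about",
                             "up", "by", "at", "off", "with"}


def invocation_name_problems(name):
    """Return a list of rule violations found in a candidate invocation name."""
    problems = []
    if not re.fullmatch(r"[a-z'. ]+", name):
        problems.append("use only lower-case letters, spaces, apostrophes and periods")
    # Periods are stripped so that spelled-out acronyms count as plain words.
    words = [word.strip(".") for word in name.lower().split()]
    if len(words) < 2:
        problems.append("one-word names are allowed only for brands")
    if len(words) == 2 and ARTICLES_AND_PREPOSITIONS.intersection(words):
        problems.append("a two-word name cannot contain an article or preposition")
    banned = LAUNCH_PHRASES | CONNECTING_WORDS | RESERVED_WORDS
    for word in words:
        if word in banned:
            problems.append(f"'{word}' is a launch phrase, connecting word or reserved word")
    return problems


for candidate in ["daily horoscopes", "ask weather", "Trivia"]:
    print(candidate, "->", invocation_name_problems(candidate))
```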
An intent is what a user is trying to accomplish.
- Amazon provides standard built-in intents which can be extended
- Intents need to have a unique utterance
Utterances are the specific phrases that people will use when making a request to Alexa.

A slot is a variable that relates to an intent allowing Alexa to understand information about the request
- Amazon provides standard built-in slots which can be extended
Entity resolution improves the way Alexa matches possible slot values in a user’s utterance with the slots defined in your interaction model

2.2 Design a multi-turn conversation

Alexa Dialog management model identifies the prompts and utterances to collect, validate, and confirm the slot values and intents.
Alexa supports
- Auto Delegation where Alexa completes all of the dialog steps based on the dialog model.
- Manual delegation using Dialog.Delegate where Alexa sends the skill an IntentRequest for each turn of the conversation and provides more flexibility.
AMAZON.FallbackIntent will not be triggered in the middle of a dialog
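
Manual delegation can be sketched in Python as follows, assuming the skill has a dialog model so that each IntentRequest carries a `dialogState` field. The handler name `handle_intent_request` and the spoken summary are hypothetical; the sketch returns a `Dialog.Delegate` directive until the dialog is reported as completed.

```python
def handle_intent_request(request):
    """Delegate the dialog to Alexa until every required slot is collected."""
    if request.get("dialogState") != "COMPLETED":
        # Hand the next turn back to Alexa and keep the session open.
        return {
            "version": "1.0",
            "response": {
                "directives": [{"type": "Dialog.Delegate"}],
                "shouldEndSession": False,
            },
        }
    slots = request["intent"].get("slots") or {}
    summary = ", ".join(
        f"{name}={slot.get('value')}" for name, slot in sorted(slots.items())
    )
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": f"Collected slots: {summary}"},
            "shouldEndSession": True,
        },
    }
```

Because the skill sees every turn, it could also inspect or override slot values before delegating, which is the extra flexibility manual delegation offers over auto delegation.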

2.3 Use built-in intents and slots

Standard built-in intents cannot include any slots. If slots are needed, create a custom intent and write your own sample utterances.
Alexa recommends using and extending standard built-in intents like AMAZON.HelpIntent and AMAZON.YesIntent with additional utterances as per the skill requirements

Alexa provides AMAZON.FallbackIntent for handling any unmatched utterances, and it can be used to improve the interaction model accuracy.
Alexa provides slots, which help capture variables and can be either an Amazon predefined slot such as dates, numbers, durations, time, etc. or a custom one specific to the skill

Predefined slots can be extended to add additional values

2.4 Handle unexpected conversational requests or responses

Alexa provides AMAZON.FallbackIntent for handling any unmatched utterances, and it can be used to improve the interaction model accuracy.

Alexa also provides Intent History, which gives a consolidated view of aggregated, anonymized frequent utterances and the resolved intents. These can be used to map the utterances to the correct intents

2.5 Design multi-modal skills using one or more service interfaces (for example, audio, video, and gadgets)

Alexa-enabled devices with a screen handle the Page and Scroll intents, but do not handle Next and Previous.

Alexa skill with AudioPlayer interface
- must handle AMAZON.ResumeIntent and AMAZON.PauseIntent
- PlaybackController events to track AudioPlayer status changes initiated from the device buttons

Domain 3: Skill Architecture

3.1 Identify AWS services for extending Alexa skill functionality (Amazon CloudFront, Amazon S3, Amazon CloudWatch, and Amazon DynamoDB)

Focus on standard skill architecture using Lambda for backend, DynamoDB for persistence, S3 for serving static assets, and CloudWatch for monitoring and logs.
Lambda provides serverless handling for the Alexa requests, but remember the following limits
- default concurrency soft limit of 1000 can be increased by raising a support request
- default timeout of 3 secs, which should be increased to at least 7 secs to be in line with the Alexa timeout of 8 secs
- default memory of 128 MB, increase to improve performance

S3 performance can be improved by exposing it through CloudFront esp. for images, audio and video files

3.2 Use AWS Lambda to build Alexa skills

Lambda integrates with CloudWatch to provide logs and should be the first thing to check in case of any issues or errors.

Alexa allows any http endpoint to act as a backend, but needs to meet following requirements
- must be accessible over the internet.
- must accept HTTP requests on port 443.
- must support HTTP over SSL/TLS, using an Amazon-trusted certificate.

3.3 Follow AWS and Alexa security and privacy best practices

Alexa requires the backend to verify that incoming requests come from Alexa using Skill ID verification
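
A minimal Python sketch of Skill ID verification, assuming the request JSON carries the skill ID as `context.System.application.applicationId`. The function name `is_request_for_skill` and the skill ID value are hypothetical placeholders.

```python
def is_request_for_skill(event, expected_skill_id):
    """Return True only when the request was addressed to the expected skill."""
    application = event.get("context", {}).get("System", {}).get("application", {})
    return application.get("applicationId") == expected_skill_id


# Placeholder skill ID for illustration only.
EXPECTED_SKILL_ID = "amzn1.ask.skill.example"
sample_event = {"context": {"System": {"application": {"applicationId": EXPECTED_SKILL_ID}}}}
print(is_request_for_skill(sample_event, EXPECTED_SKILL_ID))
```

A backend would typically reject the request early when this check fails, before any intent handling takes place.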

Child-directed skills cannot use personal and location information
Skills cannot be used to capture health information
Alexa Skills Kit uses the OAuth 2.0 authentication framework for Account linking, which defines a means by which the service can allow Alexa, with the user’s permission, to access information from the account that the user has set up with you.

Alexa smart home skills must have an OAuth authorization code grant implementation, while custom skills can have either an authorization code grant or an implicit grant implementation.

Domain 4: Skill Development

4.1 Implement in-skill purchasing and Amazon Pay for Alexa Skills

In-skill purchasing enables selling premium content such as game features and interactive stories in skills with a custom interaction model.

In-skill purchasing is handled by Alexa when the skill sends an Upsell directive. As the skill session ends when an Upsell directive is sent, be sure to save any relevant user data in a persistent data store so that the skill can continue where the user left off after the purchase flow is completed and the endpoint is back in control of the user experience.
Skill can handle the Connections.Response request that indicates the result of a purchase flow and resume the skill
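
The purchase flow above can be sketched in Python as a pair of helpers: one builds the Upsell directive, the other inspects the Connections.Response request that comes back. The field names (`Connections.SendRequest`, `InSkillProduct`, `upsellMessage`, `purchaseResult`) are written from memory of the ASK in-skill purchasing interface and should be checked against the official reference; the function names and the product ID are hypothetical.

```python
def upsell_directive(product_id, message, token):
    """Build a directive that hands the purchase conversation over to Alexa."""
    return {
        "type": "Connections.SendRequest",
        "name": "Upsell",
        "payload": {
            "InSkillProduct": {"productId": product_id},
            "upsellMessage": message,
        },
        # The token is echoed back, so it can carry a resume marker.
        "token": token,
    }


def purchase_accepted(request):
    """Return True when a Connections.Response reports an accepted purchase."""
    return (
        request.get("type") == "Connections.Response"
        and request.get("payload", {}).get("purchaseResult") == "ACCEPTED"
    )


print(upsell_directive("example-product-id", "Want the expansion pack?", "resume-level-3"))
```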

4.2 Use Speech Synthesis Markup Language (SSML) for expression and MP3 audio

SSML is a markup language that provides a standard way to mark up text for the generation of synthetic speech.
Alexa supports a subset of SSML tags including
- say-as to interpret text as telephone, date, time etc.
- phoneme provides a phonemic/phonetic pronunciation
- prosody modifies the volume, pitch, and rate of the tagged speech.
- audio allows playing an MP3 file while rendering a response
  - must be in valid MP3 file (MPEG version 2) format
  - must be hosted at an Internet-accessible HTTPS endpoint.
  - For speech response, the audio file cannot be longer than 240 seconds.
    - combined total time for all audio files in the outputSpeech property of the response cannot be more than 240 seconds.
    - combined total time for all audio files in the reprompt property of the response cannot be more than 90 seconds.
  - bit rate must be 48 kbps.
  - sample rate must be 22050Hz, 24000Hz, or 16000Hz.
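
The audio limits listed above can be collected into a small checker. This Python sketch assumes the clip metadata (duration, bit rate, sample rate) is already known and supplied as dictionaries with the hypothetical keys `seconds`, `bit_rate_kbps` and `sample_rate_hz`; it does not read MP3 files itself.

```python
ALLOWED_SAMPLE_RATES_HZ = {16000, 22050, 24000}
REQUIRED_BIT_RATE_KBPS = 48
MAX_CLIP_SECONDS = 240


def audio_problems(clips, max_total_seconds):
    """Check clips against the limits; use 240 for outputSpeech and 90 for reprompt."""
    problems = []
    total_seconds = 0
    for index, clip in enumerate(clips):
        total_seconds += clip["seconds"]
        if clip["seconds"] > MAX_CLIP_SECONDS:
            problems.append(f"clip {index}: longer than {MAX_CLIP_SECONDS} seconds")
        if clip["bit_rate_kbps"] != REQUIRED_BIT_RATE_KBPS:
            problems.append(f"clip {index}: bit rate must be {REQUIRED_BIT_RATE_KBPS} kbps")
        if clip["sample_rate_hz"] not in ALLOWED_SAMPLE_RATES_HZ:
            problems.append(f"clip {index}: unsupported sample rate")
    if total_seconds > max_total_seconds:
        problems.append(f"combined length exceeds {max_total_seconds} seconds")
    return problems


clip = {"seconds": 60, "bit_rate_kbps": 48, "sample_rate_hz": 24000}
print(audio_problems([clip, clip], max_total_seconds=90))
```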

4.3 Implement state management

Alexa Skill state persistence can be handled using session attributes during the session and externally using services like DynamoDB, RDS across sessions.
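
Within a session, state can travel back and forth in the request and response JSON. The Python sketch below assumes incoming attributes arrive under `session.attributes` and are returned under the top-level `sessionAttributes` key of the response; the function name `count_turn` and the `turns` attribute are hypothetical. Persistence across sessions would need an external store such as DynamoDB, which is not shown here.

```python
def count_turn(event):
    """Increment a per-session turn counter and echo it back to Alexa."""
    session = event.get("session") or {}
    attributes = dict(session.get("attributes") or {})
    attributes["turns"] = attributes.get("turns", 0) + 1
    return {
        "version": "1.0",
        # Alexa sends these attributes back with the next request in the session.
        "sessionAttributes": attributes,
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": f"This is turn {attributes['turns']}.",
            },
            "shouldEndSession": False,
        },
    }


print(count_turn({"session": {"attributes": {"turns": 2}}}))
```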

4.4 Implement Alexa service interfaces (audio player, video player, and screens)

4.5 Parse Alexa JSON requests and provide responses

All requests include the session (optional), context, and request objects at the top level.
- session object provides additional context associated with the request.
  - session attributes can be used to store data
  - user object contains userId to uniquely identify a user and accessToken to access other services.
- context object provides the skill with information about the current state of the Alexa service and device at the time the request is sent to the service.
  - system object provides apiAccessToken and device object provides deviceId to access ASK APIs
  - application object provides applicationId
  - device object provides supportedInterfaces to list each interface that the device supports
  - user object contains userId to uniquely identify a user and accessToken to access other services.
- A request object that provides the details of the user’s request.
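
To make the request layout above concrete, here is a hedged Python sketch that pulls the commonly needed fields out of an incoming request. It assumes the paths `context.System.apiAccessToken`, `context.System.device.deviceId`, `context.System.user.userId` and `session.attributes`; the function name `extract_request_details` and the sample values are hypothetical.

```python
def extract_request_details(event):
    """Collect the fields a skill usually needs from an Alexa request."""
    system = event.get("context", {}).get("System", {})
    device = system.get("device", {})
    session = event.get("session") or {}  # session is optional
    return {
        "request_type": event.get("request", {}).get("type"),
        "api_access_token": system.get("apiAccessToken"),
        "device_id": device.get("deviceId"),
        "user_id": system.get("user", {}).get("userId"),
        "supported_interfaces": sorted(device.get("supportedInterfaces", {})),
        "session_attributes": session.get("attributes") or {},
    }


sample_event = {
    "context": {
        "System": {
            "apiAccessToken": "example-token",
            "device": {"deviceId": "example-device", "supportedInterfaces": {"AudioPlayer": {}}},
            "user": {"userId": "example-user"},
        }
    },
    "request": {"type": "LaunchRequest"},
}
print(extract_request_details(sample_event))
```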

Response includes
- outputSpeech contains the speech to render to the user.
- reprompt contains the outputSpeech to use if a re-prompt is necessary.
- shouldEndSession provides a boolean value that indicates what should happen after Alexa speaks the response.
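
The response fields above fit together roughly as in this Python sketch, which builds a plain-text response with an optional reprompt. The helper name `build_response` is hypothetical; the `version`, `response`, `outputSpeech`, `reprompt` and `shouldEndSession` keys follow the response format described above.

```python
import json


def build_response(speech, reprompt=None, end_session=True):
    """Build a minimal Alexa response body with plain-text speech."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        "shouldEndSession": end_session,
    }
    if reprompt is not None:
        # The reprompt is spoken if the user does not answer in time.
        response["reprompt"] = {"outputSpeech": {"type": "PlainText", "text": reprompt}}
    return {"version": "1.0", "response": response}


print(json.dumps(build_response("Which city?", "Please name a city.", end_session=False), indent=2))
```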

Domain 5: Test, Validate, and Troubleshoot

5.1 Debug and troubleshoot using Amazon CloudWatch or other tools

Lambda integrates with CloudWatch for metrics and logs, which can be checked for any errors.

5.2 Use the Alexa developer testing tools

Utterance profiles – test utterances to know what intent they resolve to
Alexa Skill simulator
- provides an ability to Interact with Alexa with either your voice or text, without an actual device.
- maintains the skill session, so the interaction model and dialog flow can be tested.
- supports multiple languages testing by selecting locale
- has limitations in testing audio, video, Alexa settings and Device API
Manual JSON
- enter a JSON request directly and see the skill returned JSON response
- does not maintain the skill session and is similar to testing a JSON request in the Lambda console.
Voice & Tone – enter plain text or SSML and hear how Alexa speaks the text in a selected language
Alexa device – test with an Alexa-enabled device.

Alexa app – test the skill with the Alexa app for Android/iOS
Lambda Test console – to test Lambda functions

5.3 Perform beta testing

Skill beta testing tool can be used to test the Alexa skill in beta before releasing it to production
Beta testing allows testing changes to an existing skill, while still keeping the currently live version of the skill available for the general public.
Members can be invited using their Alexa email address. Alexa device used by the beta tester must be associated with the email address in the tester’s invitation.

5.4 Troubleshoot errors in the interaction model

Domain 6: Publishing, Operations, and Lifecycle Management

6.1 Describe the skill publishing process

Alexa skill needs to go through certification process before the Skill is live and made available to the users

Alexa creates an in development version of the skill, once the skill becomes live
Alexa Skill live version cannot be edited, and it is recommended to edit the in development skill, test and then re-certify for publishing.
Backend changes like changes in Lambda functions or response output from the function, however, can be made on live version and do not require re-certification. However, it is recommended to use Lambda versioning or alias to do such changes.

Alexa for Business allows skill to be made private and available to select users within the company

6.2 Add and remove users in the developer console

Alexa Skill Developer console access can be shared across multiple users for collaboration

Administrator and Analyst roles will also have access to the Earnings and Payments sections.
Administrator and Marketer roles will also have access to edit the content associated with apps (i.e. Descriptions, Images & Multimedia) and IAPs
Administrator and Developer roles will have access to create, modify and delete Alexa skills using ASK CLI and SMAPI.

Administrator, Analyst and Marketer roles have access to sales report

6.3 Perform analysis of skill analytics in the developer console

Intent History – View aggregated, anonymized frequent utterances and the resolved intents. You cannot track the user intent history as they are anonymized.
Actions – Unique customers per action, total actions, and total utterances per action.
Customers – Total number of unique customers who accessed the skill.
Intents – Unique customers per intent, total utterances per intent, total intents, and failed intents.
Interaction Path – Paths users take when interacting with the skill.
Plays – Total number of times that a user played the skill content.
Retention (live skills only) – Usage of the skill over time by groups of customers or cohorts. View the number or percentage of customers who returned to your skill over a 12-week period.
Sessions – Total sessions, successful session types (sessions that didn’t end due to an error), average sessions per customer. Includes a breakdown of successful, failed, and no-response sessions as a percentage of total sessions.
Utterances – Metrics for utterances depend on the skill category.

6.4 Differentiate among the statuses/versions of skills (for example, In Development, In Certification, and Live)

In Development – skill available for development, testing
In Review – A certification review is in progress and the skill cannot be edited
Certified – Skill passed certification review, and is not yet available to users
Live – skill has been published and is available to users. You cannot edit the configuration for live skills
Hidden – skill was previously published, but has since been hidden. Existing users can access the skill. New users cannot discover the skill.
Removed – skill was previously published, but has since been removed. Users cannot enable or use the skill.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Resources

Online Courses
- ACloud Guru – AWS Certified Alexa Skill Builder – Specialty 2020 quite a comprehensive course.
Practice tests
- Braincert – AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Practice Exams

AWS Alexa – Cheat Sheet

October 23, 2020 ~ Last updated on : October 23, 2020 ~ jayendrapatil

AWS Alexa

Alexa is Amazon’s cloud-based voice service and the brain behind tens of millions of devices including the Echo family of devices, FireTV, Fire Tablet, and third-party devices with Alexa built-in.

Alexa Design Patterns

Adaptability – understands and processes what user says. Let users speak in their own words.
Personalization – remembers user interaction. Individualize entire interaction.

Availability – keeping all options open. Collapse your menus; make all options top-level.
Relatability – Having a conversation with actual person. Talk with them, not at them

Alexa Interaction Model

Wakeword

An Echo device is always listening but in a dormant state. It wakes up when it hears a specific word or phrase called the wakeword.
Amazon offers a choice of wakewords like ‘Alexa’, ‘Amazon’, ‘Echo’, or ‘Computer’, with the default being ‘Alexa’.
These words are reserved and cannot be changed beyond these four options by users or by developers.

Note: a ‘wakeword’ wakes the assistant, but does not trigger your specific skill; that would be an invocation (we’ll get to this later).

Skills

A skill is ‘an app for Alexa’, however they are not downloadable but just need to be enabled.
A skill can be enabled, either within the Alexa App or by asking Alexa to enable it

Alexa supports three types of skill:
- Custom Skills – most common type of skill, and gives the most control over the user experience. This type of skill lets you develop just about anything you can imagine.
- Smart Home Skills – specifically for controlling smart home appliances. Provides less control over the user experience, but is simpler to develop.
- Flash Briefing Skills – specifically for compatibility with Alexa’s native ‘Flash Briefing’ ability. This type of skill also gives you reduced experience control, but again is simpler to develop.

Invocation Name

An ‘invocation name’ is the word or phrase used to trigger the skill.
Invocation name is only required for custom skill.

Invocation name cannot be changed after the skill goes live
Invocation name Requirements
- must not infringe upon the intellectual property rights of an entity or person
- must be composed of two or more words. One-word invocation names are not allowed, unless it’s unique to the brand/intellectual property.
- must not include names of people or places
- if two-word invocation names, one of the words cannot be a definite article (“the”), indefinite article (“a”, “an”) or preposition (“for”, “to”, “of,” “about,” “up,” “by,” “at,” “off,” “with”).
- must not contain any of the Alexa skill launch phrases and connecting words. Launch phrases include “run,” “start,” “play,” “resume,” “use,” “launch,” “ask,” “open,” “tell,” “load,” “begin,” and “enable.” Connecting words include “to,” “from,” “in,” “using,” “with,” “about,” “for,” “that,” “by,” “if,” “and,” “whether.”
- must not contain the wake words “Alexa,” “Amazon,” “Echo,” or the words “skill” or “app”.
- must contain only lower-case alphabetic characters, spaces between words, and possessive apostrophes
- must spell characters like numbers for e.g., twenty one
- can have periods in the invocation names containing acronyms or abbreviations that are pronounced as a series of individual letters, for e.g. NASA as n. a. s. a.
- cannot spell out phonemes for e.g., a skill titled “AWS Facts” would need “AWS” represented as “a. w. s. ” and NOT “ay double u ess.”
- must not create confusion with existing Alexa features.
- must be written in each supported language
- should be distinctive to ensure users can enable the skill. Invocation names that are too generic may be rejected during the skill certification process, or result in lower discoverability.

Intent

An intent is what a user is trying to accomplish.
defines an action that fulfills the user’s request
Intent name requirements
- can only contain case-insensitive alphabetical characters and underscores
- cannot include numbers
- cannot include special characters
- cannot include spaces
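
The intent name requirements above reduce to a single pattern. This Python sketch applies them to custom intent names only; the function name `is_valid_intent_name` is hypothetical.

```python
import re

# Letters (any case) and underscores only: no digits, spaces or special characters.
INTENT_NAME_PATTERN = re.compile(r"[A-Za-z_]+")


def is_valid_intent_name(name):
    """Return True when a custom intent name uses only letters and underscores."""
    return INTENT_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["PlanMyTrip", "plan_my_trip", "Plan Trip", "Trip2Go"]:
    print(candidate, is_valid_intent_name(candidate))
```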

Utterance

Utterances are the specific phrases that people will use when making a request to Alexa.

Slot

A slot is a variable that relates to an intent, allowing Alexa to understand information about the request, for e.g. country to travel to, date to travel from, city to travel to etc.

A slot can be either an Amazon predefined slot such as dates, numbers, durations, time, etc. or a custom one specific to the skill.
Custom values can be added to a subset of the built-in list slot types. Extending a built-in slot type only applies to the specific skill and those changes do not apply to any other skills

Alexa Skill Architecture

Alexa Voice Service (AVS)

Cloud-based service that allows device makers to integrate an ever-increasing set of Alexa features and functions into a connected product
AVS maps the user request to a skill and sends the skill the request in a structured format.

Alexa Skill Kit (ASK)

Alexa Skills Kit lets you teach Alexa new skills

provides APIs, tools, documentation and code samples

Progressive Responses

Progressive responses allow you to keep the user engaged while the skill prepares a full response to the user’s request.
Progressive responses can also reduce the user’s perception of latency in the skill’s response.

A progressive response is interstitial SSML content (including text-to-speech and short audio) that Alexa plays while waiting for the full skill response
Progressive response can be used to
- Send text-to-speech confirmations that your skill has received the request and is processing an answer.
- Play short soundmarks associated with your skill.
- Provide other engaging content to the users while waiting on the full response.
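
A progressive response is sent as a separate HTTP call while the full response is still being prepared. The Python sketch below only assembles that call and does not send it. It assumes the Progressive Response API accepts a POST to `/v1/directives` on the `apiEndpoint` given in the request, authorised with the request's `apiAccessToken`, with a `VoicePlayer.Speak` directive in the body; these details are recalled from the ASK documentation and should be verified there. The function name `progressive_response_call` is hypothetical.

```python
def progressive_response_call(event, speech):
    """Return the URL, headers and JSON body for a progressive response."""
    system = event["context"]["System"]
    url = system["apiEndpoint"].rstrip("/") + "/v1/directives"
    headers = {
        "Authorization": "Bearer " + system["apiAccessToken"],
        "Content-Type": "application/json",
    }
    body = {
        # The requestId ties the interim speech to the request being processed.
        "header": {"requestId": event["request"]["requestId"]},
        "directive": {"type": "VoicePlayer.Speak", "speech": speech},
    }
    return url, headers, body
```

The three values could then be passed to any HTTP client before the skill goes on to compute and return its full response.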

Speech Synthesis Markup Language (SSML)

Alexa Skills Kit supports Speech Synthesis Markup Language (SSML) to control how Alexa interprets the speech from the text in the response

SSML is a markup language that provides a standard way to mark up text for the generation of synthetic speech.
Alexa Skills Kit supports a subset of the tags defined in the SSML specification
- amazon:effect only supports whispered
- audio allows playing an MP3 file while rendering a response
  - must be hosted at an Internet-accessible HTTPS endpoint. Self-signed certificates cannot be used.
  - must not contain any customer-specific or other sensitive information
  - must be a valid MP3 file (MPEG version 2).
  - cannot be longer than 240 seconds.
  - bit rate must be 48 kbps.
  - sample rate must be 22050Hz, 24000Hz, or 16000Hz.
- phoneme provides a phonemic/phonetic pronunciation for the contained text
- prosody modifies the volume, pitch, and rate of the tagged speech.
- say-as describes how the text should be interpreted, using the interpret-as attribute, for e.g. date, time, telephone, digits etc.
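
The tags above can be composed with a few string helpers. This Python sketch escapes the text, wraps it in `say-as`, `prosody` or `audio` elements, and encloses everything in a `speak` root; the helper names are hypothetical and the audio URL is a placeholder.

```python
from xml.sax.saxutils import escape, quoteattr


def say_as(text, interpret_as):
    """Wrap text in a say-as tag, for example interpret_as='digits'."""
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(text)}</say-as>"


def prosody(text, rate="medium"):
    """Wrap text in a prosody tag that changes the speaking rate."""
    return f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"


def audio(src):
    """Reference an MP3 file, which must be served over HTTPS."""
    if not src.startswith("https://"):
        raise ValueError("audio must be hosted at an HTTPS endpoint")
    return f"<audio src={quoteattr(src)}/>"


def speak(*parts):
    """Join the parts and wrap them in the speak root element."""
    return "<speak>" + " ".join(parts) + "</speak>"


print(speak(prosody("Your code is", rate="slow"), say_as("12345", "digits")))
```

The resulting string would go into an `outputSpeech` object of type `SSML` instead of plain text.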

Personalization

Skill personalization enables skill to differentiate an individual user who has a voice profile.
Skill personalization can be provided using userId or personId

Alexa Settings APIs allow developers to retrieve customer preferences for the settings like time zone, distance measuring unit, and temperature measurement unit
With Device services, a skill can request the customer’s permission to their address information, which is a static data filled by customer and includes the country/region, postal code and full address
With Customer Profile services, a skill can request the customer’s permission to their contact information, which includes name, email address and phone number

With Location services, a skill can ask a user’s permission to obtain the real-time location of their Alexa-enabled device, specifically at the time of the user’s request to Alexa, so that the skill can provide enhanced services.
Requirements
- must include a link to the Privacy Policy that applies to the skill,
- if the skill is child-directed, it cannot use personalization.
- if the skill uses information protected by HIPAA (Health Insurance Portability and Accountability Act), it cannot use personalization.
- Do not use personalization to handle sensitive user information.
- Personalization is not authentication.

Service Endpoint

Requests can be processed using AWS Lambda or any Webservice hosted on cloud or on premises.
Requirements for custom service endpoints
- must be accessible over the internet.
- must accept HTTP requests on port 443.
- must support HTTP over SSL/TLS, using an Amazon-trusted certificate.
- must verify that incoming requests come from Alexa.
- must adhere to the Alexa Skills Kit interface.

AWS Other components

Service endpoints can be implemented using AWS Lambda
- Lambda has a default 3 seconds timeout and a max of 15 mins
- Lambda has a default memory of 128 MB
- Lambda has a concurrency soft limit of 1000, which can be increased by raising an AWS support ticket.

CloudWatch can be used for monitoring and logs
Lambda logs are stored in CloudWatch
DynamoDB can be used for state persistence

Alexa Skill Lifecycle

Development

ASK Command Line Interface (CLI)
- provides command line interface to test the skill with ASK CLI commands such as invoke-skill and simulate-skill.
Skill Management API – SMAPI
- provides RESTful HTTP endpoints for testing
- helps manage skills programmatically
Alexa Developer Console
- provides a user interface for skill development
Access can be shared across multiple users for collaboration
- First user associated with an Alexa developer account is considered the owner and will retain full rights to administer the developer account.
- Additional users can be invited to have access to the developer account and will have the rights associated with the role(s) assigned to the user.
- All roles will grant users the full access to create, modify and delete Alexa skills using the developer console.
- Administrator: This role grants complete access to all sections of the developer account, including reporting and payment information. Most importantly, any account administrator has the ability to manage user permissions, including inviting or removing users from the account.
- Developer: Outside of an Administrator, this is the only role that gives users the ability to submit and adjust application files.
- Marketer: Outside of an Administrator, this is the only role that gives users the ability to edit the content associated with apps (i.e. Descriptions, Images & Multimedia) and IAPs. Like the Analyst, this role also gives access to sales reports.
- Analyst: Outside of an Administrator, this is the only role that gives users the ability to view earnings reports. Like the Marketer, this role also gives users access to sales reports.

Build

Use Build to set up the skill, configure the interaction model, and specify the endpoints for your service.

Test

Use Test to test the skill with either text or voice.
Utterance profiles
- Use utterance profiles to test the custom interaction model.
- enter utterances to see how they resolve to the intents and slots before you write the code for your service.
Alexa Skill simulator
- Use the simulator provided on the Test page in the developer console.
- provides an ability to Interact with Alexa with either your voice or text, without an actual device.
- maintains the skill session with the skill just as a device would, so the interaction model and dialog flow can be tested.
- can send any cards that the skill returns to the Alexa app the same way a device would.
- supports multiple languages testing by selecting the wanted language to test from the drop-down list.
Manual JSON
- enter a JSON request directly and see the JSON response returned by the skill
- does not maintain the skill session and is similar to testing a JSON request in the Lambda console.
Voice & Tone
- enter plain text or SSML and hear how Alexa speaks the text in a selected language
Alexa device
- Test with an Alexa-enabled device.

Alexa app
- Test the skill with the Alexa app for Android/iOS

Distribution

Use Distribution to preview how the skill will appear in the skill store.

Distribution allows the user to provide more information about the skill before it is published, which includes
- information about the skill, description, icons, keywords, categories etc.
- privacy policy URL if skill requires account linking or collects user information
- skill availability, whether it is public, for business organizations, or in beta test
- country, region and locale information
Skill beta testing tool
- used to test the Alexa skill in beta before releasing it to production
- test changes to an existing skill, while still keeping the currently live version of the skill available for the general public.
- members can be invited using their Alexa email address. Alexa device used by the beta tester must be associated with the email address in the tester’s invitation.
- can help increase your chances of skill success.

Certification & Publish

Use Certification to validate the skill, run pre-certification tests, and then submit the skill for certification.
Alexa skill must pass the certification process, when submitted to the Alexa skill store, before it’s published live to Amazon customers.

Before submitting the new skill for certification, proper quality assurance testing and if required beta testing must be done to ensure customers have a good experience.
Skill can be validated, and functional tests (a set of pre-certification tests) can be executed on the skill, which provide immediate feedback for common certification failures.
Certification may fail for reasons like
- Child directed skills cannot sell any products and cannot collect any personal information
- Cannot collect any information related to health

Status	Description	Stage
In Development	The skill is available to you and any potential beta testers that you have added to skill beta testing. If you have enabled your skill for testing, a user can invoke your skill on any devices registered to your developer account, or on any devices registered to your beta testers’ accounts.	`development`
In Review	A certification review is in progress. During this time, you cannot edit the skill configuration.	`development`
Certified	The skill has passed certification review, and is not yet available to users. To make the skill available to users, publish the skill. If you have not published the skill and want to start a new certification review, you must first withdraw the certified version of the skill.	`certified`
Live	The skill has been published and is available to users. You cannot edit the configuration for live skills. To start development on an updated version, make your changes on the development version instead.	`live`
Hidden	The skill was previously published, but has since been hidden. Users who enabled the skill before it was hidden can continue to use the skill. The skill is no longer available when users search or browse the Alexa Skills Store.	`live`
Removed	The skill was previously published, but has since been removed. Users cannot enable or use the skill.	`live`

Analytics

Use Analytics to review metrics for the skill such as number of utterances, customers, and intents invoked.

Intent History – View aggregated, anonymized frequent utterances and the resolved intents.
Available Skill Metrics
- Actions – Unique customers per action, total actions, and total utterances per action.
- Customers – Total number of unique customers who accessed the skill.
- Intents – Unique customers per intent, total utterances per intent, total intents, and failed intents.
- Interaction Path – Paths users take when interacting with the skill.
- Plays – Total number of times that a user played the skill content.
- Retention (live skills only) – Usage of the skill over time by groups of customers or cohorts. View the number or percentage of customers who returned to your skill over a 12-week period.
- Sessions – Total sessions, successful session types (sessions that didn’t end due to an error), average sessions per customer. Includes a breakdown of successful, failed, and no-response sessions as a percentage of total sessions.
- Utterances – Metrics for utterances depend on the skill category.

Edit and Recertify

Once a skill is published to users, it is considered live.
A development version is automatically created as a copy of the live version and has the same information as the original live version
Live skill configuration cannot be edited
For updates it is recommended to update the development version, test, apply for re-certification and publish it.
Once the new version is published, it becomes live and replaces the previous live version.

Alexa Account Linking

Account linking enables the skill to connect the skill user’s Amazon identity with an identity in a different system, e.g. Uber, Twitter, etc.
Alexa Skills Kit uses the OAuth 2.0 authentication framework for Account linking, which defines a means by which the service can allow Alexa, with the user’s permission, to access information from the account that the user has set up with you.
“Link accounts” means “to get the user’s permission to obtain an access token” so that the skill can use the access token in API calls to the server that contains the user data
Grants are ways for a client application (in this case, an Alexa skill) to authorize the user and obtain an access token that it can use to authenticate a request to the resource server.
OAuth 2.0 defines a number of grant types.
- Authorization code grant type
  - works as a 2-step process
    - Alexa gets an authorization code from the authorization server,
    - Alexa exchanges it for an access token, and then passes the access token in requests to your skill.
  - is recommended grant type for security and usability reasons.
- Implicit grant type
  - authorization server returns the access token once the user logs in
  - is less secure
  - Only custom skills can use the implicit grant type.
- Authorization code grant is applicable for the vast majority of cases and the implicit grant is for limited use.
Account linking is not supported for all skill types, e.g. flash briefing skills
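
As a rough illustration of how a skill might use the access token obtained through account linking, here is a minimal sketch. It assumes the request has already been parsed into a dict and that the token, when present, is at `context.System.user.accessToken`; the function names and the spoken texts are hypothetical.

```python
def get_access_token(envelope):
    """Return the linked-account access token, or None if the user has not linked."""
    user = envelope.get("context", {}).get("System", {}).get("user", {})
    return user.get("accessToken")


def handle_linked_request(envelope):
    """Ask the user to link their account when no access token is present."""
    token = get_access_token(envelope)
    if token is None:
        # A LinkAccount card lets the user start account linking from the Alexa app.
        return {
            "outputSpeech": {"type": "PlainText",
                             "text": "Please link your account using the Alexa app."},
            "card": {"type": "LinkAccount"},
            "shouldEndSession": True,
        }
    # With a token, the skill would call the resource server on the user's behalf.
    return {
        "outputSpeech": {"type": "PlainText", "text": "Your account is linked."},
        "shouldEndSession": True,
    }
```

The token itself should only be sent to the resource server that issued it; the sketch deliberately stops before any outbound API call.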

Alexa In-Skill Purchasing

In-skill purchasing enables selling premium content such as game features and interactive stories in skills with a custom interaction model.
Customers pay for products using the payment options associated with their Amazon account.
Alexa In-Skill purchasing is handled by Alexa and the skill session ends when the purchase flow starts
Amazon handles the voice interaction model and all the mechanics of the purchase, as well as obtaining the product description and list price from the product’s schema. The message and price are automatically adjusted for Prime customers.
When the purchase completes the skill will be re-launched, and a purchase result is supplied to the skill.
Because the skill session ends when an Upsell directive is sent, be sure to save any relevant user data in a persistent data store so that the skill can continue where the user left off after the purchase flow is completed and the endpoint is back in control of the user experience.
Skill can handle the Connections.Response request that indicates the result of a purchase flow and resume the skill
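
The resume step can be sketched roughly as follows. This is an illustrative sketch only: it assumes the purchase outcome arrives in `payload.purchaseResult` of the Connections.Response request, and `saved_progress` stands in for whatever was loaded from the persistent store (the `level` field is hypothetical).

```python
def handle_connections_response(request, saved_progress):
    """Resume the skill after the purchase flow hands control back."""
    if request.get("type") != "Connections.Response":
        raise ValueError("not a Connections.Response request")

    result = request.get("payload", {}).get("purchaseResult")
    level = saved_progress.get("level", 1)  # restored from the persistent store

    if result in ("ACCEPTED", "ALREADY_PURCHASED"):
        speech = f"Premium content unlocked. Resuming at level {level}."
    elif result == "DECLINED":
        speech = f"No problem. Resuming at level {level}."
    else:
        speech = "Something went wrong with the purchase. Please try again later."

    return {
        "outputSpeech": {"type": "PlainText", "text": speech},
        "shouldEndSession": False,
    }
```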

Alexa In-built Intents

Standard built-in intents cannot include any slots. If slots are needed, create a custom intent and write your own sample utterances.
AWS recommends extending built-in intents with additional skill-specific utterances as they provide better coverage than the sample utterances written manually.
AMAZON.CancelIntent
- this should just exit the skill.
- can be mapped to return to the skill, if need be, instead of exiting
AMAZON.StopIntent
- must be implemented by the skill
- shouldEndSession must be true or null in the response
AMAZON.HelpIntent
- provides help about how to use the skill.
- can be extended by adding custom sample utterances
AMAZON.FallbackIntent
- provides a fallback for user utterances that do not match any of the skill’s defined intents
- is considered when the user’s spoken input cannot be matched with confidence to any of the other intents in the skill
- is designed as an out-of-domain model that can pick up user input that does not fit into your skill’s intended design.
- can help the skill handle many utterances that may not confidently map to the skill’s defined sample utterances and intents.
- is not normally triggered if the dialog is delegated to Alexa
AMAZON.PauseIntent and AMAZON.ResumeIntent
- must be implemented, if the skill streams audio using the AudioPlayer interface.

Alexa Card Types

A Simple card displays plain text and can be provided with a text for the card title and content.
A Standard card also displays plain text, but can include an image and can be provided with a text for the title and content, and the URL for the image to display.
A LinkAccount card is a special card type only used with account linking. This card lets users start the account linking process.
An AskForPermissionsConsent card is sent to the Alexa app when a skill requires the customer to grant specific permissions.
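
The four card types can be sketched as plain dictionaries placed in the `card` field of a response. This is a minimal sketch; the helper names are hypothetical, and the permission scope in the usage note is only an example value.

```python
def simple_card(title, content):
    """Plain-text card with a title and content."""
    return {"type": "Simple", "title": title, "content": content}


def standard_card(title, text, small_image_url, large_image_url):
    """Plain-text card that can also display an image."""
    return {
        "type": "Standard",
        "title": title,
        "text": text,
        "image": {"smallImageUrl": small_image_url, "largeImageUrl": large_image_url},
    }


def link_account_card():
    """Card that lets the user start the account linking process."""
    return {"type": "LinkAccount"}


def ask_for_permissions_card(permissions):
    """Card asking the customer to grant the listed permissions."""
    return {"type": "AskForPermissionsConsent", "permissions": list(permissions)}
```

For example, `ask_for_permissions_card(["alexa::devices:all:geolocation:read"])` would request consent for location services.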

Alexa AudioPlayer Interface

Requires AMAZON.PauseIntent and AMAZON.ResumeIntent to be implemented
PlaybackController events are used to track AudioPlayer status changes initiated from the device buttons

Alexa Dialog Management

Alexa Dialog management model identifies the prompts and utterances to collect, validate, and confirm the slot values and intents.
When the dialog is delegated to Alexa, Alexa determines the next step in the conversation and uses prompts to ask the user for the information.
Two ways to delegate the dialog
- Enable auto delegation, either for the entire skill or for specific intents.
  - Alexa completes all of the dialog steps based on the dialog model.
  - Alexa sends the skill a single IntentRequest when the dialog is complete.
- Delegate manually with the Dialog.Delegate directive.
  - Alexa sends the skill an IntentRequest for each turn of the conversation
  - Skill returns the Dialog.Delegate directive for incomplete dialog, indicating Alexa to check the dialog model for the next step and use a prompt to ask the user for more information as needed.
  - Once all the steps are complete, the skill receives the final IntentRequest with dialogState set to COMPLETED.
  - provides flexibility as the skill can make run-time decisions such as defaulting values.
  - can be used in combination with other Dialog directives to take complete control over the dialog
- Dialog management requires shouldEndSession to be set to false
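
The manual delegation flow above can be sketched as a small handler. This is a minimal sketch, assuming the incoming request object has already been parsed into a dict; the function name and the `city` slot are hypothetical.

```python
def handle_intent_request(request):
    """Return the response body for one turn of a manually delegated dialog."""
    # Until the dialog is COMPLETED, hand the next step back to Alexa.
    if request.get("dialogState") != "COMPLETED":
        return {
            "directives": [{"type": "Dialog.Delegate"}],
            "shouldEndSession": False,
        }

    # Final IntentRequest: all required slots have been collected.
    slots = request.get("intent", {}).get("slots", {})
    city = slots.get("city", {}).get("value", "your destination")
    return {
        "outputSpeech": {"type": "PlainText", "text": f"Booking a trip to {city}."},
        "shouldEndSession": True,
    }
```

Because the skill sees every turn, it could also fill in a default slot value before delegating, which is the flexibility mentioned above.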

Alexa Request and Response

Request includes the session (optional), context, and request objects at the top level.
- session object provides additional context associated with the request.
  - session attributes can be used to store data
  - user object contains userId to uniquely identify a user and accessToken to access other services.
- context object provides the skill with information about the current state of the Alexa service and device at the time the request is sent to the service.
  - system object provides apiAccessToken and device object provides deviceId to access ASK APIs
  - application object provides applicationId
  - device object provides supportedInterfaces to list each interface that the device supports
  - user object contains userId to uniquely identify a user and accessToken to access other services.
- A request object that provides the details of the user’s request.
Response includes
- outputSpeech contains the speech to render to the user.
- reprompt contains the outputSpeech to use if a re-prompt is necessary.
- shouldEndSession provides a boolean value that indicates what should happen after Alexa speaks the response.
  - true – the session ends.
  - false – Alexa opens the microphone for a few seconds to listen for the user’s response and reprompt, if included, to give the user a second chance to respond.
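
A response with these fields can be assembled as follows. This is a minimal sketch of the JSON envelope; the helper name `build_response` and its parameters are hypothetical.

```python
import json


def build_response(speech, reprompt=None, end_session=True, session_attributes=None):
    """Build a response envelope with outputSpeech, optional reprompt and shouldEndSession."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        "shouldEndSession": end_session,
    }
    if reprompt is not None:
        # Spoken only if the user does not answer while the microphone is open.
        response["reprompt"] = {"outputSpeech": {"type": "PlainText", "text": reprompt}}
    return {
        "version": "1.0",
        "sessionAttributes": session_attributes or {},
        "response": response,
    }


if __name__ == "__main__":
    print(json.dumps(build_response("Which city?", "Sorry, which city?", end_session=False), indent=2))
```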

Alexa Best Practices

Alexa Skill state persistence can be handled using session attributes during the session and externally using services like DynamoDB, S3 (for hosted skills), and RDS across sessions.
Verify that incoming requests come from Alexa using Skill ID verification to ensure request came from the intended skill.
- prevents a malicious developer from configuring a skill with your endpoint and then using that skill to send requests to your service.
- To do this validation, every request sent by Alexa includes a unique skill ID. Skill ID in the request can be checked against the actual skill ID to ensure that the request was intended for your service.
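
The skill ID check can be sketched as below. This is an illustrative sketch, assuming the request body has been parsed into a dict; `EXPECTED_SKILL_ID` is a placeholder value and the function name is hypothetical.

```python
# Placeholder: replace with the skill ID shown for your skill.
EXPECTED_SKILL_ID = "amzn1.ask.skill.00000000-0000-0000-0000-000000000000"


def is_request_for_this_skill(envelope, expected_skill_id=EXPECTED_SKILL_ID):
    """Return True only if the request carries the expected skill (application) ID."""
    app_id = (
        envelope.get("context", {}).get("System", {}).get("application", {}).get("applicationId")
        or envelope.get("session", {}).get("application", {}).get("applicationId")
    )
    return app_id == expected_skill_id
```

A service would typically reject the request (for example, with an HTTP 400 response) when this returns False.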

AWS DynamoDB Best Practices

August 27, 2020 ~ Last updated on : August 23, 2023 ~ jayendrapatil

Primary Key Design

Primary key uniquely identifies each item in a DynamoDB table and can be simple (a partition key only) or composite (a partition key combined with a sort key).

Partition key portion of a table’s primary key determines the logical partitions in which a table’s data is stored, which in turn affects the underlying physical partitions.
Partition key should have many unique values.

Distribute reads / writes uniformly across partitions to avoid hot partitions
Store hot and cold data in separate tables
Consider all possible query patterns to eliminate the use of scans and filters.

Choose a sort key depending on the application’s needs.
Avoid hot keys and hot partitions – a partition key design that doesn’t distribute I/O requests evenly can create “hot” partitions that result in throttling and use the provisioned I/O capacity inefficiently.
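
One common way to spread a hot partition key is write sharding, i.e. appending a calculated suffix to the key. The following is a minimal sketch of that idea, not a prescribed design; `SHARD_COUNT`, the `#` separator, and the function names are assumptions made for illustration.

```python
import hashlib

SHARD_COUNT = 10  # assumed number of shards per logical key


def sharded_partition_key(base_key, item_id, shard_count=SHARD_COUNT):
    """Derive a deterministic shard suffix from the item ID and append it to the key."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """List every shard key that must be queried to read the full logical key."""
    return [f"{base_key}#{n}" for n in range(shard_count)]
```

The trade-off is on the read side: reading everything for one logical key means querying each key returned by `all_shard_keys` and merging the results.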

Secondary Indexes

Use indexes based on the application’s query patterns.

Local Secondary Indexes – LSIs
- Use primary key or LSIs when strong consistency is desired
- Watch for expanding item collections (10 GB size limit!)

Global Secondary Indexes – GSIs
- Use GSIs for finer control over throughput or when your application needs to query using a different partition key.
- Can be used for eventually consistent read replicas – set up a global secondary index that has the same key schema as the parent table, with some or all of the non-key attributes projected into it.

Project fewer attributes – As secondary indexes consume storage and provisioned throughput, keep the index size as small as possible by projecting only required attributes as it would provide greater performance
Keep the number of indexes to a minimum – don’t create secondary indexes on attributes that aren’t queried often. Indexes that are seldom used contribute to increased storage and I/O costs without improving application performance.
Sparse indexes – DynamoDB indexes are sparse: DynamoDB writes a corresponding index entry only if the index sort key value is present in the item. If the sort key doesn’t appear in a table item, the index will not contain that item.
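
The sparse behaviour can be illustrated with a small in-memory simulation. This is only a model of which items would get an index entry, not DynamoDB code; the attribute name `open_since` is hypothetical.

```python
def sparse_index_entries(items, index_sort_key):
    """Return only the items that carry the index sort key attribute."""
    return [item for item in items if index_sort_key in item]


orders = [
    {"order_id": "1", "status": "shipped"},
    {"order_id": "2", "status": "open", "open_since": "2023-08-01"},
    {"order_id": "3", "status": "open", "open_since": "2023-08-20"},
]

# Only open orders have the attribute, so only they appear in the index.
open_orders = sparse_index_entries(orders, "open_since")
```

Removing the `open_since` attribute when an order ships would drop it from the index, which keeps the index small and cheap to query.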

Large Items and Attributes

DynamoDB currently limits the size of each item stored in a table to 400 KB, which includes the binary length of both attribute names and values.
Use shorter (yet intuitive!) attribute names
Keep item size small.

Use compression (GZIP or LZO).
Split large attributes across multiple items.
Store metadata in DynamoDB and large BLOBs or attributes in S3.
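
Compressing a large text attribute before storing it can be sketched with the standard library. This is a minimal sketch; the helper names are hypothetical, and `fits_in_item` is a simplified size check that considers a single attribute only.

```python
import gzip

MAX_ITEM_BYTES = 400 * 1024  # item size limit mentioned above


def compress_attribute(text):
    """GZIP-compress a text value so it can be stored as a binary attribute."""
    return gzip.compress(text.encode("utf-8"))


def decompress_attribute(blob):
    """Restore the original text from a compressed binary attribute."""
    return gzip.decompress(blob).decode("utf-8")


def fits_in_item(attribute_name, blob, limit=MAX_ITEM_BYTES):
    """Simplified check: attribute name plus value length against the item limit."""
    return len(attribute_name.encode("utf-8")) + len(blob) <= limit
```

If the compressed value still does not fit, the remaining options listed above apply: split it across items or move it to S3 and keep only metadata in the table.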

Querying and Scanning Data

Avoid scans and filters – Scan operations are less efficient than other operations in DynamoDB. A Scan operation always scans the entire table or secondary index. It then filters out values to provide the result, essentially adding the extra step of removing data from the result set.
Use eventual consistency for reads.

Time Series Data

Use a table per day, week, month, etc for storing time series data – create one table per period, provisioned with the required read and write capacity and the required indexes.

Before the end of each period, prebuild the table for the next period. Just as the current period ends, direct event traffic to the new table. Assign names to the tables that specify the periods they have recorded.
As soon as a table is no longer being written to, reduce its provisioned write capacity to a lower value (for example, 1 WCU), and provision whatever read capacity is appropriate. Reduce the provisioned read capacity of earlier tables as they age.
Archive or drop the tables whose contents are rarely or never needed.

Dropping tables is the fastest, simplest and cost-effective method if all the items are to be deleted from the table, without spending time in scanning and deleting each item.
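
Naming one table per period can be sketched as follows, here with monthly tables. This is an illustrative sketch; the `events` prefix, the naming pattern, and the function names are assumptions.

```python
from datetime import date, timedelta


def table_name_for(day, prefix="events"):
    """Name of the monthly table that holds data for the given day."""
    return f"{prefix}_{day:%Y_%m}"


def next_period_table(day, prefix="events"):
    """Name of the table to prebuild before the current month ends."""
    # Jump past the end of the month, then snap back to the first of that month.
    first_of_next = (day.replace(day=1) + timedelta(days=32)).replace(day=1)
    return table_name_for(first_of_next, prefix)
```

A scheduled job could call `next_period_table` near the end of each month to create the next table ahead of time.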

Other Best Practices

Burst Capacity retains a portion of unused capacity (up to 5 minutes’ worth) for later bursts of throughput to handle usage spikes.
Adaptive capacity helps run imbalanced workloads indefinitely. It minimizes throttling due to throughput exceptions and reduces cost by enabling you to provision only the needed throughput capacity.

Deletion protection can keep the tables from being accidentally deleted.

Reference

DynamoDB_Best_Practices

AWS Content Delivery – Cheat Sheet

July 11, 2020 ~ Last updated on : October 6, 2022 ~ jayendrapatil

CloudFront

provides low latency and high data transfer speeds for distribution of static, dynamic web or streaming content to web users

delivers the content through a worldwide network of data centers called Edge Locations
keeps persistent connections with the origin servers so that the files can be fetched from the origin servers as quickly as possible.

dramatically reduces the number of network hops that users’ requests must pass through
supports multiple origin server options, like AWS hosted service for e.g. S3, EC2, ELB or an on premise server, which stores the original, definitive version of the objects
single distribution can have multiple origins and Path pattern in a cache behavior determines which requests are routed to the origin
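
How path patterns select an origin can be approximated with a short sketch. It assumes cache behaviors are evaluated in order with a default `*` behavior last, and uses `fnmatchcase` as a stand-in for the wildcard matching; the patterns and origin names are hypothetical.

```python
from fnmatch import fnmatchcase

# Ordered list of (path pattern, origin); the default "*" behavior comes last.
BEHAVIORS = [
    ("/images/*.jpg", "s3-images-origin"),
    ("/api/*", "elb-api-origin"),
    ("*", "default-origin"),
]


def origin_for(path, behaviors=BEHAVIORS):
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):
            return origin
    return None
```

Note that `fnmatchcase` also understands `[...]` character sets, so this is an approximation of the matching rules rather than an exact reproduction.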

supports Web Download distribution and RTMP Streaming distribution
- Web distribution supports static, dynamic web content, on demand using progressive download & HLS and live streaming video content
- RTMP supports streaming of media files using Adobe Media Server and the Adobe Real-Time Messaging Protocol (RTMP) ONLY

supports HTTPS using either
- dedicated IP address, which is expensive as dedicated IP address is assigned to each CloudFront edge location
- Server Name Indication (SNI), which is free but supported by modern browsers only with the domain name available in the request header

For E2E HTTPS connection,
- Viewers -> CloudFront needs either self signed certificate, or certificate issued by CA or ACM
- CloudFront -> Origin needs certificate issued by ACM for ELB and by CA for other origins

Security
- Origin Access Identity (OAI) can be used to restrict the content from S3 origin to be accessible from CloudFront only
- supports Geo restriction (Geo-Blocking) to whitelist or blacklist countries that can access the content
- Signed URLs
  - for RTMP distribution as signed cookies aren’t supported
  - to restrict access to individual files, for e.g., an installation download for your application.
  - users using a client, for e.g. a custom HTTP client, that doesn’t support cookies
- Signed Cookies
  - provide access to multiple restricted files, for e.g., video part files in HLS format or all of the files in the subscribers’ area of a website.
  - don’t want to change the current URLs
- integrates with AWS WAF, a web application firewall that helps protect web applications from attacks by allowing rules configured based on IP addresses, HTTP headers, and custom URI strings
supports GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE to get object & object headers, add, update, and delete objects
- only caches responses to GET and HEAD requests and, optionally, OPTIONS requests
- does not cache responses to PUT, POST, PATCH, DELETE request methods and these requests are proxied back to the origin
object removal from cache
- would be removed upon expiry (TTL) from the cache, by default 24 hrs
- can be invalidated explicitly, which has a cost associated; however, users might continue to see the old version until it expires from other caches (e.g. browser caches)
- objects can be invalidated only for Web distribution
- use versioning (change the object name) to serve a different version
supports adding or modifying custom headers before the request is sent to origin which can be used to
- validate if user is accessing the content from CDN
- identify the CDN from which the request was forwarded, in case of multiple CloudFront distributions
- for viewers not supporting CORS to return the Access-Control-Allow-Origin header for every request
supports Partial GET requests using range header to download object in smaller units improving the efficiency of partial downloads and recovery from partially failed transfers
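
Splitting a download into partial GET requests comes down to computing Range headers. A minimal sketch, with a hypothetical helper name, that only builds the headers and leaves the HTTP calls out:

```python
def range_headers(total_size, chunk_size):
    """Return one Range header per chunk; byte ranges are inclusive."""
    headers = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1
        headers.append({"Range": f"bytes={start}-{end}"})
    return headers
```

If one chunk fails, only that byte range needs to be requested again, which is where the recovery benefit comes from.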

supports compression to compress and serve compressed files when viewer requests include Accept-Encoding: gzip in the request header
supports different price classes: one that includes all regions, one that includes only the least expensive regions, and one that excludes only the most expensive regions
supports access logs which contain detailed information about every user request for both web and RTMP distribution

AWS IoT Core

July 8, 2020 ~ Last updated on : January 15, 2021 ~ jayendrapatil

AWS IoT Core is a managed cloud platform that lets connected devices easily and securely interact with cloud applications and other devices.

AWS IoT Core can support billions of devices and trillions of messages, and can process and route those messages to AWS endpoints and to other devices reliably and securely.
AWS IoT Core allows the applications to keep track of and communicate with all the devices, all the time, even when they aren’t connected.

AWS IoT Core offers
- Connectivity between devices and the AWS cloud.
  - AWS IoT Core allows communication with connected devices securely, with low latency and with low overhead.
  - Communication can scale to as many devices as needed.
  - AWS IoT Core supports standard communication protocols (HTTP, MQTT, and WebSockets are supported currently).
  - Communication is secured using TLS.
- Processing data sent from connected devices.
  - AWS IoT Core can continuously ingest, filter, transform, and route the data streamed from connected devices.
  - Actions can be taken based on the data and route it for further processing and analytics.
- Application interaction with connected devices.
  - AWS IoT Core accelerates IoT application development.
  - It serves as an easy to use interface for applications running in the cloud and on mobile devices to access data sent from connected devices, and send data and commands back to the devices.

How AWS IoT Core Works

Connected devices, such as sensors, actuators, embedded devices, smart appliances, and wearable devices, connect to AWS IoT Core over HTTPS, WebSockets, or secure MQTT.
Communication with AWS IoT Core is secure.
- HTTPS and WebSockets requests sent to AWS IoT Core are authenticated using AWS IAM or AWS Cognito, both of which support the AWS SigV4 authentication.
- HTTPS requests can also be authenticated using X.509 certificates.
- MQTT messages to AWS IoT Core are authenticated using X.509 certificates.
- AWS IoT Core allows using AWS IoT Core generated certificates, as well as those signed by your preferred Certificate Authority (CA).
AWS IoT Core also offers fine-grained authorization to isolate and secure communication among authenticated clients.

Device Gateway

Device Gateway forms the backbone of communication between connected devices and the cloud capabilities such as the Rules Engine, Device Shadow, and other AWS and 3rd-party services.

Device Gateway allows secure, low-latency, low-overhead, bi-directional communication between connected devices, cloud and mobile application
Device Gateway supports the pub/sub messaging pattern, which involves clients publishing messages on logical communication channels called ‘topics’ and clients subscribing to topics to receive messages
Device gateway enables communication between publishers and subscribers

Device Gateway scales automatically as per the demand, without any operational overhead
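
The pub/sub pattern itself can be illustrated with a tiny in-memory broker. This is a conceptual sketch only and not how the Device Gateway is implemented; the class name and topic names are hypothetical, and topics are matched exactly.

```python
from collections import defaultdict


class TinyBroker:
    """Routes each published message to the callbacks subscribed to its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback that receives (topic, message) for this topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message and return how many subscribers received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)
```

Publishers and subscribers never reference each other directly; they only share the topic name, which is what lets either side be added or removed independently.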

Rules Engine

Rules Engine enables continuous processing of data sent by connected devices.
Rules can be configured to filter and transform the data using an intuitive, SQL-like syntax.

Rules can be configured to route the data to other AWS services such as DynamoDB, Kinesis, Lambda, SNS, SQS, CloudWatch, Elasticsearch Service with built-in Kibana integration, as well as to non-AWS services, via Lambda for further processing, storage, or analytics.
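
To give a feel for the filter-and-transform step, here is a rule statement written in the SQL-like syntax next to a plain Python equivalent. Both are illustrative sketches; the topic name, attribute names, and threshold are hypothetical.

```python
# Example rule statement (held in a string for reference only).
RULE_SQL = "SELECT deviceId, temperature FROM 'sensors/telemetry' WHERE temperature > 30"


def apply_rule(message):
    """Python equivalent of RULE_SQL for one decoded JSON message."""
    temperature = message.get("temperature")
    # WHERE clause: drop messages that do not match the filter.
    if temperature is None or temperature <= 30:
        return None
    # SELECT clause: keep only the chosen attributes.
    return {"deviceId": message.get("deviceId"), "temperature": temperature}
```

In the service, the output of a matching rule would then be handed to the configured action, for example a write to DynamoDB.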

Registry

Registry allows registering devices and keeping track of devices connected to AWS IoT Core, or devices that may connect in the future.

Device Shadow

Device Shadow enables cloud and mobile applications to query data sent from devices and send commands to devices, using a simple REST API, while letting AWS IoT Core handle the underlying communication with the devices.

Device Shadow accelerates application development by providing
- a uniform interface to devices, even when they use one of the several IoT communication and security protocols with which the applications may not be compatible.
- an always available interface to devices even when the connected devices are constrained by intermittent connectivity, limited bandwidth, limited computing ability or limited power.

Device and its Device Shadow Lifecycle

A device (such as a light bulb) is registered in the Registry.
Connected device is programmed to publish a set of its property values or ‘state’ (“I am ON and my color is RED”) to the AWS IoT Core service.
Device Shadow also stores the last reported state in AWS IoT Core.

An application (such as a mobile app controlling the light bulb) uses a RESTful API to query AWS IoT Core for the last reported state of the light bulb, without the complexity of communicating directly with the light bulb
When a user wants to change the state (such as turning the light bulb from ON to OFF), the application uses a RESTful API to request an update, i.e. sets a ‘desired’ state for the device in AWS IoT Core. AWS IoT Core takes care of synchronizing the desired state to the device.
Application gets notified when the connected device updates its state to the desired state.
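
The gap between the desired and the reported state can be sketched as a simple comparison. This is a minimal sketch, assuming a shadow document shaped as `{"state": {"desired": {...}, "reported": {...}}}` with flat attributes; the function name is hypothetical.

```python
def shadow_delta(shadow):
    """Return the desired attributes that the device has not yet reported."""
    state = shadow.get("state", {})
    desired = state.get("desired", {})
    reported = state.get("reported", {})
    return {
        key: value
        for key, value in desired.items()
        if key not in reported or reported[key] != value
    }


bulb = {
    "state": {
        "desired": {"power": "OFF", "color": "RED"},
        "reported": {"power": "ON", "color": "RED"},
    }
}
```

Here `shadow_delta(bulb)` contains only the `power` attribute, since the color already matches; once the device reports `power` as OFF, the delta becomes empty.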

AWS Certification Exam Practice Questions

Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).

AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.

AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated

Open to further feedback, discussion and correction.

You need to filter and transform incoming messages coming from a smart sensor you have connected with AWS. Once messages are received, you need to store them as time series data in DynamoDB. Which AWS service can you use?
1. IoT Device Shadow Service (maintains device state)
2. Redshift
3. Kinesis (While Kinesis could technically be used as an intermediary between different sources, it isn’t a great way to get data into DynamoDB from an IoT device.)
4. IoT Rules Engine

AWS Certified Solutions Architect – Associate SAA-C02 Exam Learning Path

July 1, 2020 ~ Last updated on : October 4, 2023 ~ jayendrapatil ~ 37 Comments

AWS Solutions Architect – Associate SAA-C02 exam is the latest AWS exam that has replaced the previous SAA-C01 certification exam. It basically validates the ability to effectively demonstrate knowledge of how to architect and deploy secure and robust applications on AWS technologies

Define a solution using architectural design principles based on customer requirements.
Provide implementation guidance based on best practices to the organization throughout the life cycle of the project.

Refer AWS_Solution_Architect_-_Associate_SAA-C02_Exam_Blue_Print

AWS Solutions Architect – Associate SAA-C02 Exam Summary

SAA-C02 exam consists of 65 questions in 130 minutes, and the time is more than sufficient if you are well prepared.
SAA-C02 Exam covers the architecture aspects in depth, so you must be able to visualize the architecture, even draw it out in the exam just to understand how it would work and how different services relate.

AWS has updated the exam concepts, shifting the focus from individual services to building scalable, highly available, cost-effective, performant, and resilient architectures.
If you had been preparing for the SAA-C01 –
- SAA-C02 is pretty much similar to SAA-C01 except the operationally excellent architectures domain has been dropped
- Most of the services and concepts covered by the SAA-C01 are the same, with a few new additions like Aurora Serverless, AWS Global Accelerator, FSx for Windows, FSx for Lustre
AWS exams are available online, and I took the online one. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
Also, if you are taking the AWS Online exam for the first time, try to join at least 30 minutes before the actual time.

AWS Solutions Architect – Associate SAA-C02 Exam Resources

Online Courses
- Coursera Exam Prep: AWS Certified Solutions Architect – Associate
- DolfinEd – AWS Certified Solutions Architect Associate 2021 – SAA-C02 (E-Study & Lab Guides Included) [Best Seller][Highest Rated]
- DolfinEd – AWS Certified Solutions Architect Associate 2021 (On-line, Instructor-Led – Private Group Bootcamp)
- Stephane Maarek – Ultimate AWS Certified Solutions Architect Associate 2020 [Highest Rated]
- A Cloud Guru – AWS Certified Solutions Architect – Associate 2020
- Linux Academy – AWS Certified Solutions Architect – Associate 2020
- Zeal Vora – AWS Certified Solutions Architect – Associate 2020 course
Practice tests
- Braincert AWS Solutions Architect – Associate SAA-C02 Practice Exams, which are updated for SAA-C02
- Stephane Maarek – AWS Certified Solutions Architect Associate Practice Exams
Sign up with AWS for the Free Tier account, which lets you try a lot of the services for free within certain limits that are more than enough to get things going. Be sure to decommission services that go beyond the free limits to prevent any surprises 🙂

Also, use QwikLabs for introductory courses which are free
Read the FAQs at least for the important topics, as they cover important points and are good for quick review

AWS Solutions Architect – Associate SAA-C02 Exam Topics

Make sure you go through all the topics and focus on the hints given in parentheses

Networking

Be sure to create VPC from scratch. This is mandatory.
- Create VPC and understand what a CIDR is and the addressing patterns
- Create public and private subnets, configure proper routes, security groups, NACLs. (hint: Subnets are public or private depending on whether they can route traffic directly through an Internet gateway)
- Create Bastion for communication with instances
- Create NAT Gateway or NAT Instances so that instances in private subnets can reach the Internet
- Create two tier architecture with application in public and database in private subnets
- Create three tier architecture with web servers in public, application and database servers in private. (hint: focus on security group configuration with least privilege)
- Make sure to understand how the communication happens between Internet, Public subnets, Private subnets, NAT, Bastion etc.
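As a rough aid for the CIDR and subnet items above, here is a minimal sketch that uses Python's standard `ipaddress` module to carve a VPC range into subnets. The `10.0.0.0/16` range, the `/24` subnet size, and the public/private labels are illustrative assumptions, not values from the exam guide. The five reserved addresses per subnet follow AWS's documented behaviour.

```python
import ipaddress

# Hypothetical VPC range chosen for illustration only.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def plan_subnets(vpc, new_prefix, count):
    """Return the first `count` subnets of size /new_prefix inside the VPC range."""
    subnets = []
    for subnet in vpc.subnets(new_prefix=new_prefix):
        if len(subnets) == count:
            break
        subnets.append(subnet)
    return subnets


def usable_hosts(subnet):
    """Addresses left for instances after the AWS-reserved ones are removed."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


subnets = plan_subnets(VPC_CIDR, new_prefix=24, count=4)
labels = ["public-a", "public-b", "private-a", "private-b"]  # assumed naming
for label, subnet in zip(labels, subnets):
    print(f"{label}: {subnet} ({usable_hosts(subnet)} usable addresses)")
```

Whether a subnet is actually public or private is decided by its route table (a route to an Internet gateway or not), not by the CIDR itself.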
Understand difference between Security Groups and NACLs (hint: Security Groups are stateful while NACLs are stateless. Also, only NACLs provide the ability to explicitly deny or block IPs)
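The stateful vs stateless hint can be pictured with a toy model. This is only a sketch: real Security Groups and NACLs also evaluate protocols, CIDR ranges, and rule numbers, and the function names and port numbers here are made up for illustration.

```python
# Toy model only: names and values below are hypothetical.

def security_group_allows_reply(inbound_allowed_ports, request_port):
    """Stateful: if the inbound request was allowed, the reply is allowed
    automatically, with no outbound rule needed."""
    return request_port in inbound_allowed_ports


def nacl_allows_reply(inbound_allowed_ports, outbound_allowed_ports,
                      request_port, reply_port):
    """Stateless: request and reply are checked independently, so the
    reply's (ephemeral) port needs its own outbound rule."""
    return (request_port in inbound_allowed_ports
            and reply_port in outbound_allowed_ports)


HTTPS = 443
EPHEMERAL_REPLY_PORT = 50000  # example client port in the ephemeral range

sg_result = security_group_allows_reply({HTTPS}, HTTPS)
nacl_without_outbound = nacl_allows_reply({HTTPS}, set(), HTTPS, EPHEMERAL_REPLY_PORT)
nacl_with_outbound = nacl_allows_reply(
    {HTTPS}, set(range(1024, 65536)), HTTPS, EPHEMERAL_REPLY_PORT
)

print(sg_result, nacl_without_outbound, nacl_with_outbound)
```

In the NACL case the reply only gets through once an outbound rule covers the ephemeral port range, which is the practical meaning of "stateless".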

Understand VPC endpoints and which services they help interact with (hint: VPC Endpoints route traffic internally without going over the Internet)
- VPC Gateway Endpoints support S3 and DynamoDB.
- VPC Interface Endpoints (PrivateLink) support the other services

Understand difference between NAT Gateway and NAT Instance (hint: NAT Gateway is AWS managed and is scalable and highly available)
Understand how NAT high availability can be achieved (hint: provision NAT in each AZ and route traffic from subnets within that AZ through that NAT Gateway)
Understand VPN and Direct Connect for on-premises to AWS connectivity
- VPN provides quick, cost-effective connectivity over a secure channel; however, it routes through the Internet and does not provide consistent throughput
- Direct Connect provides consistent, dedicated throughput without going over the Internet; however, it takes time to set up and is not as cost-effective
Understand Data Migration techniques
- Choose Snowball vs Snowmobile vs Direct Connect vs VPN depending on the bandwidth available, data transfer needed, time available, encryption requirement, one-time or continuous requirement
- Snowball and Snowmobile are cost-effective, quick, and ideal for one-time transfers of huge amounts of data
- Direct Connect, VPN are ideal for continuous or frequent data transfers
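A quick back-of-the-envelope calculation helps when choosing between a network transfer and a physical device. The sketch below assumes decimal terabytes and an average link utilization of 80%, and the example figures are made up. It only estimates the time over the wire and ignores setup and shipping time.

```python
SECONDS_PER_DAY = 86400


def transfer_days(data_terabytes, link_mbps, utilization=0.8):
    """Days needed to move `data_terabytes` (decimal TB) over a link of
    `link_mbps` megabits per second at the given average utilization."""
    bits = data_terabytes * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return bits / effective_bits_per_second / SECONDS_PER_DAY


# Hypothetical scenario: 100 TB over a 1 Gbps link and over a 100 Mbps link.
print(f"1 Gbps:   {transfer_days(100, 1000):.1f} days")
print(f"100 Mbps: {transfer_days(100, 100):.1f} days")
```

If the result runs into weeks or months and the transfer is a one-off, that usually points towards Snowball or Snowmobile.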

Understand CloudFront as a CDN, the static and dynamic caching it provides, and what its origins can be (hint: CloudFront can point to on-premises sources; also know its use cases with S3 to reduce load and cost)
Understand Route 53 for routing
- Understand Route 53 health checks and failover routing
- Understand the Routing Policies Route 53 provides and their use cases, mainly for high availability (hint: focus on weighted, latency, geolocation, failover routing)
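For weighted routing, each record's share of traffic is its weight divided by the sum of all weights in the group. A minimal sketch of that arithmetic, with hypothetical record names and weights (it does not model the special case where every weight is zero):

```python
def weighted_traffic_share(weights):
    """Map each record name to its expected share of DNS responses."""
    total = sum(weights.values())
    if total == 0:
        # The all-zero case is deliberately not modelled in this sketch.
        raise ValueError("at least one record needs a non-zero weight")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical blue/green rollout sending roughly 10% of traffic to "green".
shares = weighted_traffic_share({"blue": 90, "green": 10})
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```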
Be sure to cover ELB concepts in depth.
- SAA-C02 focuses on ALB and NLB and does not cover CLB
- Understand differences between CLB vs ALB vs NLB
  - ALB is layer 7 while NLB is layer 4
  - ALB provides content based, host based, path based routing
  - ALB provides dynamic port mapping, which allows multiple instances of the same task to be hosted on the same ECS node
  - NLB provides low latency and ability to scale
  - NLB provides static IP address

Security

Understand IAM as a whole
- Focus on IAM role (hint: can be used for EC2 application access and Cross-account access)
- Understand IAM identity providers and federation and use cases
- Understand MFA and how you would implement two-factor authentication for an application
- Understand IAM Policies (hint: expect a couple of questions with policies defined and you need to select correct statements)
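For the policy questions, the key rule to remember is that an explicit Deny always wins, and anything not explicitly allowed is denied by default. The following is a deliberately simplified sketch of that rule, not the real IAM evaluation engine: it ignores conditions, principals, and other policy types, and it matches case-sensitively. The bucket name and policy are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if `value` matches any pattern; `*` acts as a wildcard."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(policy, action, resource):
    """Explicit Deny beats Allow; no matching Allow means implicit deny."""
    allowed = False
    for statement in policy["Statement"]:
        if _matches(statement["Action"], action) and _matches(statement["Resource"], resource):
            if statement["Effect"] == "Deny":
                return False
            allowed = True
    return allowed


# Hypothetical policy: allow all S3 actions on a bucket, but deny deletes.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Deny", "Action": "s3:DeleteObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
}

obj = "arn:aws:s3:::example-bucket/report.csv"
print(is_allowed(policy, "s3:GetObject", obj))
print(is_allowed(policy, "s3:DeleteObject", obj))
print(is_allowed(policy, "ec2:RunInstances", "*"))
```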
Understand encryption services
- KMS for key management and envelope encryption
- Focus on S3 encryption with SSE-S3, SSE-C, SSE-KMS
- Know that SQS now provides SSE support

AWS WAF integrates with CloudFront to provide protection against Cross-site scripting (XSS) attacks. It also provides IP blocking and geo-protection.
AWS Shield integrates with CloudFront to provide protection against DDoS.
Refer to the Disaster Recovery whitepaper, and be sure you know the different recovery types and their impact on RTO/RPO.

Storage

Understand various storage options S3, EBS, Instance store, EFS, Glacier, FSx and the use cases and anti-patterns for each
Instance Store
- Understand Instance Store (hint: it is physically attached to the EC2 instance and provides the lowest latency and highest IOPS)

Elastic Block Storage – EBS
- Understand various EBS volume types and their use cases in terms of IOPS and throughput. SSD for IOPS and HDD for throughput
- Understand Burst performance and I/O credits to handle occasional peaks
- Understand EBS Snapshots (hint: backups are automated, snapshots are manual)
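To make the burst performance and I/O credits item concrete, here is a small worked calculation for gp2 volumes. The constants are the figures AWS has published for gp2 as far as I know (3 IOPS per GiB baseline with a floor of 100 and a cap of 16,000, bursts to 3,000 IOPS, and an initial bucket of 5.4 million I/O credits), so check the current documentation before relying on them.

```python
GP2_IOPS_PER_GIB = 3
GP2_MIN_BASELINE_IOPS = 100
GP2_MAX_IOPS = 16_000
GP2_BURST_IOPS = 3_000
GP2_INITIAL_CREDITS = 5_400_000


def gp2_baseline_iops(size_gib):
    """Baseline IOPS grows with volume size, within the floor and the cap."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_BASELINE_IOPS, GP2_IOPS_PER_GIB * size_gib))


def gp2_burst_seconds(size_gib):
    """Seconds a full credit bucket sustains the burst rate, or None when the
    baseline already meets or exceeds the burst rate."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return None
    # Credits drain at (burst - baseline) per second while bursting.
    return GP2_INITIAL_CREDITS / (GP2_BURST_IOPS - baseline)


for size in (20, 100, 1000):
    print(size, "GiB:", gp2_baseline_iops(size), "IOPS baseline,",
          gp2_burst_seconds(size), "s of burst")
```

A 100 GiB volume, for example, has a 300 IOPS baseline and can burst at 3,000 IOPS for 2,000 seconds from a full bucket.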
Simple Storage Service – S3
- Cover S3 in depth
- Understand S3 storage classes with lifecycle policies
  - Understand the difference between S3 Standard vs S3 Standard-IA vs S3 One Zone-IA in terms of cost and durability
- Understand S3 Data Protection (hint: S3 Client side encryption encrypts data before storing it in S3)
- Understand S3 features including
  - S3 provides a cost effective static website hosting
  - S3 versioning provides protection against accidental overwrites and deletions
  - S3 Pre-Signed URLs for both upload and download provides access without needing AWS credentials
  - S3 CORS allows cross domain calls
  - S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket.
- Understand Glacier as an archival storage with various retrieval patterns
- Glacier Expedited retrieval now allows object retrieval within minutes
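Storage classes and lifecycle policies are easier to remember with an example rule in front of you. The dictionary below follows the shape of an S3 lifecycle configuration as I understand it; the rule ID, the `logs/` prefix, and the day counts are illustrative assumptions, and the `describe` helper is just a hypothetical convenience for printing the timeline.

```python
import json

# Hypothetical rule: tier objects down as they age, then expire them.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-old-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}


def describe(configuration):
    """Flatten the rules into a readable timeline of what happens when."""
    timeline = []
    for rule in configuration["Rules"]:
        for step in sorted(rule.get("Transitions", []), key=lambda s: s["Days"]):
            timeline.append(f"day {step['Days']}: move to {step['StorageClass']}")
        if "Expiration" in rule:
            timeline.append(f"day {rule['Expiration']['Days']}: delete")
    return timeline


print(json.dumps(lifecycle_configuration, indent=2))
print("\n".join(describe(lifecycle_configuration)))
```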
Understand Storage gateway and its different types.
- Cached Volume Gateway provides access to frequently accessed data, while using AWS as the actual storage
- Stored Volume gateway uses AWS as a backup, while the data is being stored on-premises as well
- File Gateway supports SMB protocol

Understand FSx, which makes it easy and cost-effective to launch and run popular file systems.
- FSx provides two file systems to choose from: Amazon FSx for Windows File Server for business applications and Amazon FSx for Lustre for high-performance workloads.
Understand the difference between EBS vs S3 vs EFS
- EFS provides a shared volume across multiple EC2 instances, while an EBS volume can be attached to a single instance within the same AZ.
Understand the difference between EBS vs Instance Store
Would recommend referring to the Storage Options whitepaper; although a bit dated, 90% of it still holds true

Compute

Understand Elastic Compute Cloud – EC2
Understand Auto Scaling and ELB, and how they work together to provide a highly available and scalable solution. (hint: Span both ELB and Auto Scaling across multiple AZs to provide High Availability)
Understand EC2 Instance Purchase Types – Reserved, Scheduled Reserved, On-demand and Spot and their use cases
- Choose Reserved Instances for continuous persistent load
- Choose Scheduled Reserved Instances for load with a fixed schedule and time interval
- Choose Spot instances for fault-tolerant and spiky loads
- Reserved instances provide cost benefits over On-demand instances for long-term requirements
- Spot instances provide cost benefits for temporary, fault-tolerant, spiky load
Understand EC2 Placement Groups (hint: Cluster placement groups provide low latency and high throughput communication, while Spread placement group provides high availability)

Understand Lambda and serverless architecture, its features and use cases. (hint: Lambda integrated with API Gateway to provide a serverless, highly scalable, cost-effective architecture)
Understand ECS with its ability to deploy containers and micro services architecture.
- ECS role for tasks can be provided through taskRoleArn
- ALB provides dynamic port mapping to allow multiple same tasks on the same node
Know Elastic Beanstalk at a high level, what it provides and its ability to get an application running quickly.

Databases

Understand relational and NoSQL data storage options which include RDS, DynamoDB, Aurora and their use cases

RDS
- Understand RDS features – Read Replicas vs Multi-AZ
  - Read Replicas for scalability, Multi-AZ for High Availability
  - Multi-AZ deployments stay within a single region
  - Read Replicas can span across regions and can be used for disaster recovery
- Understand Automated Backups, underlying volume types

Aurora
- Understand Aurora
  - provides multiple read replicas and replicates 6 copies of data across AZs
- Understand Aurora Serverless provides a highly scalable cost-effective database solution
DynamoDB
- Understand DynamoDB with its low latency performance, key-value store (hint: DynamoDB is not a relational database)
- DynamoDB DAX provides caching for DynamoDB
- Understand DynamoDB provisioned throughput for Read/Writes (It is covered more in the Developer exam though.)
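Provisioned throughput is mostly arithmetic. As a sketch, assuming the standard definitions (one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads; one write capacity unit covers one write per second of an item up to 1 KB), the required capacity can be estimated as follows. The workload numbers are made up.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: item size is rounded up to the next 4 KB per read;
    eventually consistent reads cost half as much."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = math.ceil(total / 2)
    return total


def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: item size is rounded up to the next 1 KB per write."""
    return math.ceil(item_size_kb) * writes_per_second


# Hypothetical workload: 6 KB items read 10 times/s, 2.5 KB items written 5 times/s.
print(read_capacity_units(6, 10))
print(read_capacity_units(6, 10, strongly_consistent=False))
print(write_capacity_units(2.5, 5))
```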
Know ElastiCache use cases, mainly for caching performance

Integration Tools

Understand SQS as message queuing service and SNS as pub/sub notification service
Understand SQS features like visibility, long poll vs short poll
Focus on SQS as a decoupling service

Understand SQS Standard vs SQS FIFO difference (hint: FIFO provides exactly-once delivery but lower throughput)
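The visibility timeout mentioned above is what makes SQS work as a decoupling service: a received message is hidden from other consumers for a while, and it reappears if the consumer never deletes it. The class below is a toy in-memory stand-in written for illustration, not the SQS API; the class name, message body, and the explicit `now` clock are assumptions.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time it becomes visible again]
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0]
        return self._next_id

    def receive(self, now):
        """Return (id, body) of the first visible message and hide it."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        self._messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("resize-image-42")

first = queue.receive(now=0)         # consumer A receives the message
hidden = queue.receive(now=10)       # consumer B sees nothing, it is still hidden
redelivered = queue.receive(now=31)  # A never deleted it, so it is visible again
queue.delete(redelivered[0])         # B finishes the work and deletes it
gone = queue.receive(now=100)

print(first, hidden, redelivered, gone)
```

This is also why consumers should be idempotent on a standard queue: a slow consumer can cause the same message to be processed twice.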

Analytics

Know Redshift as a data warehouse for business intelligence workloads
Know Kinesis for real time data capture and analytics

At least know what AWS Glue does, so you can eliminate the answer

Management Tools

Understand CloudWatch monitoring to provide operational transparency
Know which EC2 metrics it can track. Remember, it cannot track memory and disk space/swap utilization by default

Understand CloudWatch is extendable with custom metrics
Understand CloudTrail for Audit
Have a basic understanding of CloudFormation, OpsWorks

AWS Whitepapers & Cheat sheets

AWS Solutions Architect – Associate Exam Domains

Domain 1: Design Resilient Architectures

Design a multi-tier architecture solution
Design highly available and/or fault-tolerant architectures
Design decoupling mechanisms using AWS services

Choose appropriate resilient storage

Domain 2: Define High-Performing Architectures

Identify elastic and scalable compute solutions for a workload
Select high-performing and scalable storage solutions for a workload

Select high-performing networking solutions for a workload
Choose high-performing database solutions for a workload

Domain 3: Specify Secure Applications and Architectures

Design secure access to AWS resources
Design secure application tiers
Select appropriate data security options

Domain 4: Design Cost-Optimized Architectures

Determine how to design cost-optimized storage.
Determine how to design cost-optimized compute.

AWS FSx for Lustre

July 1, 2020 ~ Last updated on : July 13, 2022 ~ jayendrapatil

FSx for Lustre is a fully managed service that makes it easy and cost-effective to launch and run Lustre, the world’s most popular high-performance file system.

Lustre is an open-source file system designed for applications that require fast storage, where the storage needs to keep up with the compute.
FSx for Lustre handles the traditional complexity of setting up and managing high-performance Lustre file systems.

is POSIX-compliant and can be used with existing Linux-based applications without having to make any changes.
provides a native file system interface and works as any file system does with the Linux operating system.
provides read-after-write consistency and supports file locking.

is compatible with the most popular Linux-based AMIs, including Amazon Linux, Amazon Linux 2, Red Hat Enterprise Linux (RHEL), CentOS, SUSE Linux and Ubuntu.
is accessible from compute workloads running on EC2 instances and containers running on EKS.
can be accessed from a Linux instance, by installing the open-source Lustre client and mounting the file system using standard Linux commands.

is ideal for use cases where speed matters, such as machine learning, high-performance computing (HPC), video processing, financial modelling, genome sequencing, and electronic design automation (EDA)

FSx for Lustre Deployment Options

Scratch file systems

designed for temporary storage and short-term processing of data.
provide high burst throughput of up to six times the baseline throughput of 200 MBps per TiB of storage capacity.

data is not replicated and does not persist if a file server fails.
ideal for cost-optimized storage for short-term, processing-heavy workloads.
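Using only the figures quoted above (a baseline of 200 MBps per TiB and bursts of up to six times that), the throughput of a scratch file system scales linearly with its storage capacity. A minimal sketch of the arithmetic, with a hypothetical 12 TiB capacity:

```python
BASELINE_MBPS_PER_TIB = 200  # baseline figure quoted in the text above
BURST_MULTIPLIER = 6         # "up to six times the baseline"


def scratch_throughput_mbps(capacity_tib):
    """Return (baseline, maximum burst) throughput in MBps for a capacity."""
    baseline = BASELINE_MBPS_PER_TIB * capacity_tib
    return baseline, baseline * BURST_MULTIPLIER


# Hypothetical capacity, chosen only to show the scaling.
baseline, burst = scratch_throughput_mbps(12)
print(f"baseline {baseline} MBps, burst up to {burst} MBps")
```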

Persistent file systems

designed for long-term storage and workloads.

is highly available, and data is automatically replicated within the AZ that is associated with the file system.
data volumes attached to the file servers are replicated independently from the file servers to which they are attached.
if a file server becomes unavailable, it is replaced automatically within minutes of failure.

continuously monitored for hardware failures, and automatically replaces infrastructure components in the event of a failure.
ideal for workloads that run for extended periods or indefinitely, and that might be sensitive to disruptions in availability.

FSx for Lustre - Scratch vs Persistent

FSx for Lustre with S3

FSx for Lustre also integrates seamlessly with S3, making it easy to process cloud data sets with the Lustre high-performance file system.
FSx for Lustre file system transparently presents S3 objects as files and allows writing changed data back to S3.
FSx for Lustre file system can be linked with a specified S3 bucket, making the data in the S3 accessible to the file system.

S3 objects’ names and prefixes will be visible as files and directories
S3 objects are lazy-loaded by default.
- FSx automatically loads the corresponding objects from S3 only when first accessed by the applications.
- Subsequent reads of these files are served directly out of the file system with low, consistent latencies.
- FSx for Lustre file system can optionally be batch hydrated.
FSx for Lustre uses parallel data transfer techniques to transfer data from S3 at up to hundreds of GB/s.

Files from the file system can be exported back to the S3 bucket
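The lazy-loading behaviour described above can be pictured with a small toy model: file names are visible immediately, contents are fetched from S3 only on first access, and later reads come from the file system. This is an illustration of the idea only, not how FSx is implemented; the class, its methods, and the sample keys are all hypothetical, and a plain dictionary stands in for the S3 bucket.

```python
class LazyLoadedFileSystem:
    """Toy model of a file system linked to an S3 bucket."""

    def __init__(self, bucket):
        self._bucket = bucket  # dict standing in for the linked S3 bucket
        self._loaded = {}      # contents already pulled into the file system
        self.s3_fetches = 0

    def listdir(self):
        """Names are visible up front, even before any data is loaded."""
        return sorted(self._bucket)

    def read(self, key):
        """Fetch from S3 on first access only; later reads are served locally."""
        if key not in self._loaded:
            self._loaded[key] = self._bucket[key]
            self.s3_fetches += 1
        return self._loaded[key]

    def hydrate_all(self):
        """Optional batch hydration: load everything ahead of time."""
        for key in self._bucket:
            self.read(key)


fs = LazyLoadedFileSystem({"data/a.csv": b"1,2,3", "data/b.csv": b"4,5,6"})
names = fs.listdir()
fs.read("data/a.csv")
fs.read("data/a.csv")  # second read does not go back to S3
fetches_after_reads = fs.s3_fetches
fs.hydrate_all()
fetches_after_hydration = fs.s3_fetches

print(names, fetches_after_reads, fetches_after_hydration)
```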

FSx for Lustre Security

FSx for Lustre provides encryption at rest for the file system and the backups, by default, using KMS.
FSx encrypts data-in-transit when accessed from supported EC2 instances only

FSx for Lustre Scalability

FSx for Lustre file systems scale to hundreds of GB/s of throughput and millions of IOPS.
FSx for Lustre also supports concurrent access to the same file or directory from thousands of compute instances.
FSx for Lustre provides consistent, sub-millisecond latencies for file operations.

FSx for Lustre Availability and Durability

On a scratch file system, file servers are not replaced if they fail and data is not replicated.
On a persistent file system, if a file server becomes unavailable it is replaced automatically and within minutes.
FSx for Lustre provides a parallel file system, where data is stored across multiple network file servers to maximize performance and reduce bottlenecks, and each server has multiple disks.

FSx takes daily automatic incremental backups of the file systems, and allows manual backups at any point.
Backups are highly durable and file-system-consistent

AWS Certification Exam Practice Questions

Questions are collected from the Internet and the answers are marked as per my knowledge and understanding (which might differ from yours).

AWS services are updated every day and both the answers and questions might be outdated soon, so research accordingly.

AWS exam questions are not updated to keep pace with AWS updates, so even if the underlying feature has changed the question might not be updated

Open to further feedback, discussion and correction.

A solutions architect is designing storage for a high performance computing (HPC) environment based on Amazon Linux. The workload stores and processes a large amount of engineering drawings that require shared storage and heavy computing. Which storage option would be the optimal solution?
1. Amazon Elastic File System (Amazon EFS)
2. Amazon FSx for Lustre
3. Amazon EC2 instance store
4. Amazon EBS Provisioned IOPS SSD (io1)
A company is planning to deploy a High Performance Computing (HPC) cluster in its VPC that requires a scalable, high performance file system. The storage service must be optimized for efficient workload processing, and the data must be accessible via a fast and scalable file system interface. It should also work natively with Amazon S3 that enables you to easily process your S3 data with a high-performance POSIX interface. Which of the following is the MOST suitable service that you should use for this scenario?
1. Amazon Elastic File System (Amazon EFS)
2. Amazon FSx for Lustre
3. Amazon Elastic Block Store
4. Amazon EBS Provisioned IOPS SSD (io1)

References

Amazon_FSx_for_Lustre

AWS FSx for Windows

June 29, 2020 ~ Last updated on : July 13, 2022 ~ jayendrapatil

Amazon FSx for Windows File Server provides fully managed, highly reliable, and scalable file storage that is accessible over the industry-standard Server Message Block (SMB) protocol.

FSx for Windows is built on Windows Server, delivering a wide range of administrative features such as user quotas, end-user file restore, ACLs, and Microsoft Active Directory (AD) integration.
FSx for Windows provides high levels of throughput and IOPS and consistent sub-millisecond latencies.

FSx for Windows offers single-AZ and multi-AZ deployment options, fully managed backups, and encryption of data at rest and in transit.
FSx for Windows File Server backups are file-system-consistent, highly durable, and incremental.
Amazon FSx is accessible from Windows, Linux, and MacOS compute instances and devices.

Amazon FSx provides concurrent access to the file system to thousands of compute instances and devices
Amazon FSx can connect the file system to EC2, VMware Cloud on AWS, Amazon WorkSpaces, and Amazon AppStream 2.0 instances.
Integrated with CloudWatch to monitor storage capacity and file system activity

Integrated with CloudTrail to monitor all Amazon FSx API calls
Amazon FSx was designed for use cases that require Windows shared file storage, like CRM, ERP, custom or .NET applications, home directories, data analytics, media, and entertainment workflows, web serving and content management, software build environments, and Microsoft SQL Server.
FSx file system is accessible from the on-premises environment using an AWS Direct Connect or AWS VPN connection.

FSx is accessible from multiple VPCs, AWS accounts, and AWS Regions using VPC Peering connections or AWS Transit Gateway.
FSx provides consistent sub-millisecond latencies with SSD storage and single-digit millisecond latencies with HDD storage
FSx supports Microsoft’s Distributed File System (DFS) to organize shares into a single folder structure up to hundreds of PB in size

FSx for Windows Security

FSx works with Microsoft Active Directory (AD) to integrate with existing Windows environments, which can either be an AWS Managed Microsoft AD or self-managed Microsoft AD
FSx provides standard Windows permissions (full support for Windows access control lists, ACLs) for files and folders.
FSx for Windows File Server supports encryption at rest for the file system and backups using KMS managed keys

FSx encrypts data-in-transit using SMB Kerberos session keys when accessing the file system from clients that support SMB 3.0.
FSx supports file-level or folder-level restores to previous versions by supporting Windows shadow copies, which are point in time snapshots of the file system.
FSx supports Windows shadow copies to enable the end-users to easily undo file changes and compare file versions by restoring files to previous versions, and backups to support the backup retention and compliance needs.

FSx complies with ISO, PCI-DSS, and SOC certifications, and is HIPAA eligible.

FSx for Windows Availability and durability

FSx for Windows automatically replicates the data within an Availability Zone (AZ) to protect it from component failure.
FSx continuously monitors for hardware failures and automatically replaces infrastructure components in the event of a failure.

FSx supports Multi-AZ deployment
- automatically provisions and maintains a standby file server in a different Availability Zone.
- any changes written to disk in the file system are synchronously replicated across AZs to standby.
- helps enhance availability during planned system maintenance.
- helps protect the data against instance failure and AZ disruption.
- In the event of planned file system maintenance or unplanned service disruption, FSx automatically fails over to the secondary file server, allowing data accessibility without manual intervention.

Multi-AZ file systems automatically failover from the preferred file server to the standby file server if
- An Availability Zone outage occurs.
- Preferred file server becomes unavailable.
- Preferred file server undergoes planned maintenance.
FSx supports automatic backups of the file systems, which incrementally store only the changes after the most recent backup.
FSx stores backups in S3.

AWS Certification Exam Practice Questions

A data processing facility wants to move a group of Microsoft Windows servers to the AWS Cloud. These servers require access to a shared file system that can integrate with the facility’s existing Active Directory (AD) infrastructure for file and folder permissions. The solution needs to provide seamless support for shared files with AWS and on-premises servers and allow the environment to be highly available. The chosen solution should provide added security by supporting encryption at rest and in transit. The solution should also be cost-effective to implement and manage. Which storage solution would meet these requirements?
1. An AWS Storage Gateway file gateway joined to the existing AD domain
2. An Amazon FSx for Windows File Server file system joined to the existing AD domain
3. An Amazon Elastic File System (Amazon EFS) file system joined to an AWS managed AD domain
4. An Amazon S3 bucket mounted on Amazon EC2 instances in multiple Availability Zones running Windows Server and joined to an AWS managed AD domain.

References

Amazon_FSx_For_Windows