AWS Certified Machine Learning – Specialty (MLS-C01) Exam Learning Path

  • Finally re-certified the updated AWS Certified Machine Learning – Specialty (MLS-C01) certification exam after 3 months of preparation.
  • In terms of difficulty level of all the professional and specialty certifications, I find this to be the toughest, partly because I am still diving deep into machine learning and relearned everything from the basics for this certification.
  • Machine Learning is a vast specialization in itself, and with AWS services there is a lot to cover and know for the exam. This is the only exam where the majority of the focus is on concepts outside of AWS, i.e., pure machine learning. It also includes AWS Machine Learning and Data Engineering services.

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Content

  • AWS Certified Machine Learning – Specialty (MLS-C01) exam validates the ability to
    • Select and justify the appropriate ML approach for a given business problem.
    • Identify appropriate AWS services to implement ML solutions.
    • Design and implement scalable, cost-optimized, reliable, and secure ML solutions.

Refer to the AWS Certified Machine Learning – Specialty Exam Guide for details

AWS Certified Machine Learning – Specialty Domains

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answer options contain a lot of prose and require a lot of reading, so be sure you are prepared and manage your time well.
  • MLS-C01 exam has 65 questions to be solved in 170 minutes, which gives you roughly 2.5 minutes per question.
  • MLS-C01 exam includes two types of questions: multiple-choice and multiple-response.
  • MLS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Specialty exams currently cost $300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all the others.
  • As always, having a rough architecture or mental picture of the setup helps you focus on the areas you need to improve. Trust me, you will be able to eliminate two answers for sure; then compare the remaining two to see where they differ, which will help you reach the right answer or at least give you a 50% chance of getting it right.
  • AWS exams can be taken either at a testing center or online. I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS online exam for the first time, try to join at least 30 minutes before the actual time, as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Resources

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Topics

  • AWS Certified Machine Learning – Specialty exam covers a lot of Machine Learning concepts. It digs deep into Machine Learning concepts, most of which are not related to AWS.
  • AWS Certified Machine Learning – Specialty exam covers the end-to-end Machine Learning lifecycle, right from data collection, transformation, and pre-processing to make the data usable and efficient for Machine Learning, through training, validation, and deployment.

Machine Learning Concepts

  • Exploratory Data Analysis
    • Feature selection and Engineering
      • remove features that are not related to training
      • remove features that have the same values, very low correlation, very little variance, or a lot of missing values
      • Apply techniques like Principal Component Analysis (PCA) for dimensionality reduction i.e. reduce the number of features.
      • Apply techniques such as one-hot encoding and label encoding to convert strings to numeric values, which are easier to process.
      • Apply Normalization, i.e., scaling values between 0 and 1, to handle features with large variance.
      • Apply feature engineering for feature reduction, e.g., using a single combined height/weight feature instead of both features (a minimal preprocessing sketch follows this list).
    • Handle Missing data
      • remove the feature or rows with missing data
      • impute using Mean/Median values – valid only for numeric values and not categorical features; it also does not factor in correlation between features
      • impute using k-NN, Multivariate Imputation by Chained Equations (MICE), or Deep Learning – more accurate and helps factor in correlation between features
    • Handle unbalanced data
      • Source more data
      • Oversample minority or Undersample majority
      • Data augmentation using techniques like Synthetic Minority Oversampling Technique (SMOTE).
  • Modeling
    • Know the algorithm types – Supervised, Unsupervised, and Reinforcement learning – and which algorithm is best suited based on whether the available data is labelled or unlabelled.
      • Supervised learning trains on labeled data, e.g., Linear Regression, Logistic Regression, Decision Trees, Random Forests
      • Unsupervised learning trains on unlabelled data, e.g., PCA, SVD, K-means
      • Reinforcement learning trains based on actions and rewards, e.g., Q-Learning
    • Hyperparameters
      • are parameters exposed by machine learning algorithms that control how the underlying algorithm operates, and their values affect the quality of the trained models
      • some of the common hyperparameters are learning rate, batch size, and epochs (hint: if the learning rate is too large, the minimum might be overshot and the loss would oscillate; if the learning rate is too small, training requires too many steps, takes longer, and is less efficient)
  • Evaluation
    • Know the differences in evaluating model accuracy
      • Use Area Under the (Receiver Operating Characteristic) Curve (AUC) for Binary classification
      • Use root mean square error (RMSE) metric for regression
    • Understand Confusion matrix
      • A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative is an outcome where the model correctly predicts the negative class.
      • A false positive is an outcome where the model incorrectly predicts the positive class. A false negative is an outcome where the model incorrectly predicts the negative class.
      • Recall or Sensitivity or TPR (True Positive Rate): the number of items correctly identified as positive out of the total true positives – TP/(TP+FN) (hint: use this for cases like fraud detection, where the cost of marking non-frauds as frauds is lower than marking frauds as non-frauds)
      • Specificity or TNR (True Negative Rate): the number of items correctly identified as negative out of the total negatives – TN/(TN+FP) (hint: use this for cases like videos for kids, where the cost of dropping a few valid videos is lower than showing a few bad ones; a metrics sketch follows this list)
    • Handle Overfitting problems
      • Simplify the model by reducing the number of layers
      • Early Stopping – form of regularization while training a model with an iterative method, such as gradient descent
      • Data Augmentation
      • Regularization – technique to reduce the complexity of the model
      • Dropout is a regularization technique that prevents overfitting
      • Never train on test data
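
A minimal scikit-learn sketch of the pre-processing steps above – imputation, one-hot encoding, and normalization. The column names and tiny dataset are hypothetical:

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.impute import SimpleImputer
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

    # Hypothetical dataset with a numeric and a categorical feature
    df = pd.DataFrame({
        "age": [25, None, 47, 31],           # numeric, with a missing value
        "country": ["IN", "US", None, "US"], # categorical, with a missing value
    })

    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # mean/median imputation – numeric only
        ("scale", MinMaxScaler()),                     # normalization to the 0-1 range
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),  # one-hot encoding
    ])

    preprocess = ColumnTransformer([
        ("num", numeric, ["age"]),
        ("cat", categorical, ["country"]),
    ])
    print(preprocess.fit_transform(df))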
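
And a small sketch, with made-up confusion-matrix counts, showing how the metrics above are computed:

    # Hypothetical confusion-matrix counts for a binary classifier
    tp, fn, fp, tn = 80, 20, 10, 90

    recall = tp / (tp + fn)        # TPR / Sensitivity: TP/(TP+FN)
    specificity = tn / (tn + fp)   # TNR: TN/(TN+FP)
    precision = tp / (tp + fp)     # TP/(TP+FP)
    accuracy = (tp + tn) / (tp + tn + fp + fn)

    print(f"Recall={recall:.2f}, Specificity={specificity:.2f}, "
          f"Precision={precision:.2f}, Accuracy={accuracy:.2f}")
    # Recall=0.80, Specificity=0.90, Precision=0.89, Accuracy=0.85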

Machine Learning Services

  • SageMaker
    • supports File mode, Pipe mode, and Fast File mode
      • File mode loads all of the data from S3 to the training instance volumes vs. Pipe mode, which streams data directly from S3
      • File mode needs disk space to store both the final model artifacts and the full training dataset vs. Pipe mode, which helps reduce the required size of EBS volumes
      • Fast File mode combines the ease of use of the existing File mode with the performance of Pipe mode
    • Using the RecordIO format allows algorithms to take advantage of Pipe mode when training algorithms that support it (see the training sketch after this list).
    • supports Model tracking capability to manage up to thousands of machine learning model experiments
    • supports automatic scaling for production variants. Automatic scaling dynamically adjusts the number of instances provisioned for a production variant in response to changes in your workload
    • provides pre-built Docker images for its built-in algorithms and the supported deep learning frameworks used for training & inference
    • SageMaker Automatic Model Tuning
      • is the process of finding a set of hyperparameters for an algorithm that can yield an optimal model.
      • Best practices
        • limit the search to a smaller number of hyperparameters, as the difficulty of a hyperparameter tuning job depends primarily on the number of hyperparameters that Amazon SageMaker has to search
        • DO NOT specify a very large range to cover every possible value for a hyperparameter, as it affects the success of hyperparameter optimization
        • log-scaled hyperparameter ranges can be used to improve hyperparameter optimization
        • running one training job at a time achieves the best results with the least amount of compute time
        • design distributed training jobs so that they report the objective metric that you want (a tuning sketch follows this list)
    • know how to take advantage of multiple GPUs (hint: increase the learning rate and batch size in proportion to the increase in GPUs)
    • Elastic Inference (now replaced by AWS Inferentia) helps attach low-cost GPU-powered acceleration to EC2 instances, SageMaker instances, or ECS tasks to reduce the cost of running deep learning inference.
    • SageMaker Inference options
      • Real-time inference is ideal for online inferences that have low latency or high throughput requirements.
      • Serverless Inference is ideal for intermittent or unpredictable traffic patterns as it manages all of the underlying infrastructure with no need to manage instances or scaling policies.
      • Batch Transform is suitable for offline processing when large amounts of data are available upfront and you don’t need a persistent endpoint.
      • Asynchronous Inference is ideal when you want to queue requests and have large payloads with long processing times.
    • SageMaker Model deployment allows deploying multiple variants of a model to the same SageMaker endpoint to test new models without impacting the user experience
      • Production Variants
        • supports A/B or Canary testing where you can allocate a portion of the inference requests to each variant.
        • helps compare production variants’ performance relative to each other (a variant-weighting sketch follows this list).
      • Shadow Variants
        • replicates a portion of the inference requests that go to the production variant to the shadow variant.
        • logs the responses of the shadow variant for comparison and not returned to the caller.
        • helps test the performance of the shadow variant without exposing the caller to the response produced by the shadow variant.
    • SageMaker Managed Spot Training helps use Spot instances to save cost, and with the Checkpointing feature can save the state of ML models during training so that interrupted jobs can resume (included in the training sketch after this list)
    • SageMaker Feature Store
      • helps to create, share, and manage features for ML development.
      • is a centralized store for features and associated metadata so features can be easily discovered and reused.
    • SageMaker Debugger provides tools to debug training jobs and resolve problems such as overfitting, saturated activation functions, and vanishing gradients to improve the model’s performance.
    • SageMaker Model Monitor monitors the quality of SageMaker machine learning models in production and can help set alerts that notify when there are deviations in the model quality.
    • SageMaker Automatic Model Tuning helps find a set of hyperparameters for an algorithm that can yield an optimal model.
    • SageMaker Data Wrangler
      • reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes.
      • simplifies the process of data preparation (including data selection, cleansing, exploration, visualization, and processing at scale) and feature engineering.
    • SageMaker Experiments is a capability of SageMaker that lets you create, manage, analyze, and compare machine learning experiments.
    • SageMaker Clarify helps improve the ML models by detecting potential bias and helping to explain the predictions that the models make.
    • SageMaker Model Governance is a framework that gives systematic visibility into ML model development, validation, and usage.
    • SageMaker Autopilot is an automated machine learning (AutoML) feature set that automates the end-to-end process of building, training, tuning, and deploying machine learning models.
    • SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge.
    • SageMaker API and SageMaker Runtime support VPC interface endpoints powered by AWS PrivateLink that helps connect VPC directly to the SageMaker API or SageMaker Runtime using AWS PrivateLink without using an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
    • Algorithms –
      • BlazingText provides Word2vec and text classification algorithms
      • DeepAR provides a supervised learning algorithm for forecasting scalar (one-dimensional) time series (hint: can forecast for new products based on existing products’ sales data).
      • Factorization Machines provide a supervised learning algorithm for classification and regression tasks and help capture interactions between features within high-dimensional sparse datasets economically.
      • Image classification algorithm is a supervised learning algorithm that supports multi-label classification.
      • IP Insights is an unsupervised learning algorithm that learns the usage patterns for IPv4 addresses.
      • K-means is an unsupervised learning algorithm for clustering as it attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups.
      • k-nearest neighbors (k-NN) algorithm is an index-based algorithm. It uses a non-parametric method for classification or regression.
      • Latent Dirichlet Allocation (LDA) algorithm is an unsupervised learning algorithm that attempts to describe a set of observations as a mixture of distinct categories; used to identify the number of topics shared by documents within a text corpus
      • Neural Topic Model (NTM) Algorithm is an unsupervised learning algorithm that is used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution
      • Linear models are supervised learning algorithms used for solving either classification or regression problems. 
        • For regression (predictor_type=’regressor’), the score is the prediction produced by the model.
        • For classification (predictor_type=’binary_classifier’ or predictor_type=’multiclass_classifier’), the model returns a score along with the predicted label.
      • Object Detection algorithm detects and classifies objects in images using a single deep neural network
      • Principal Component Analysis (PCA) is an unsupervised machine learning algorithm that attempts to reduce the dimensionality (number of features) (hint: dimensionality reduction)
      • Random Cut Forest (RCF) is an unsupervised algorithm for detecting anomalous data points (hint: anomaly detection)
      • Sequence to Sequence is a supervised learning algorithm where the input is a sequence of tokens (for example, text, audio) and the output generated is another sequence of tokens. (hint: text summarization is the key use case)
  • SageMaker Ground Truth
    • provides automated data labeling using machine learning
    • helps build highly accurate training datasets for machine learning quickly using Amazon Mechanical Turk
    • provides annotation consolidation to help improve the accuracy of the data objects’ labels. It combines the results of multiple workers’ annotation tasks into one high-fidelity label.
    • automated data labeling uses machine learning to label portions of the data automatically without having to send them to human workers
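
A minimal training sketch using the SageMaker Python SDK tying the points above together – Pipe input mode with RecordIO data, Managed Spot Training, and checkpointing. The image URI, role, and bucket paths are hypothetical:

    import sagemaker
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput

    estimator = Estimator(
        image_uri="<algorithm-or-framework-image-uri>",       # hypothetical image URI
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # hypothetical role
        instance_count=1,
        instance_type="ml.m5.xlarge",
        input_mode="Pipe",                # stream from S3 instead of downloading (File mode)
        use_spot_instances=True,          # Managed Spot Training to save cost
        max_run=3600,                     # max training time in seconds
        max_wait=7200,                    # must be >= max_run when using spot
        checkpoint_s3_uri="s3://my-bucket/checkpoints/",  # resume after spot interruptions
        sagemaker_session=sagemaker.Session(),
    )

    # RecordIO-wrapped protobuf lets built-in algorithms take full advantage of Pipe mode
    estimator.fit({"train": TrainingInput("s3://my-bucket/train/",
                                          content_type="application/x-recordio-protobuf")})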
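
A hedged sketch of SageMaker Automatic Model Tuning following the best practices above – few hyperparameters, bounded log-scaled ranges, one job at a time, and an explicit objective metric. The metric and hyperparameter names are illustrative:

    from sagemaker.inputs import TrainingInput
    from sagemaker.tuner import (ContinuousParameter, HyperparameterTuner,
                                 IntegerParameter)

    tuner = HyperparameterTuner(
        estimator=estimator,                     # the Estimator from the previous sketch
        objective_metric_name="validation:auc",  # metric the training job must report
        objective_type="Maximize",
        hyperparameter_ranges={
            # few hyperparameters with reasonable bounds; log scaling suits learning rates
            "learning_rate": ContinuousParameter(1e-4, 1e-1, scaling_type="Logarithmic"),
            "mini_batch_size": IntegerParameter(32, 512),
        },
        max_jobs=20,
        max_parallel_jobs=1,  # one job at a time gives the best results for the least compute
    )
    tuner.fit({"train": TrainingInput("s3://my-bucket/train/",
                                      content_type="application/x-recordio-protobuf")})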
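
And a variant-weighting sketch with boto3 – two model variants behind one endpoint with a 90/10 traffic split for A/B or canary testing. The model and endpoint names are hypothetical:

    import boto3

    sm = boto3.client("sagemaker")

    # Traffic is split across variants in proportion to InitialVariantWeight
    sm.create_endpoint_config(
        EndpointConfigName="my-endpoint-config",
        ProductionVariants=[
            {
                "VariantName": "current-model",
                "ModelName": "my-model-v1",       # hypothetical model
                "InstanceType": "ml.m5.large",
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 0.9,
            },
            {
                "VariantName": "candidate-model",
                "ModelName": "my-model-v2",
                "InstanceType": "ml.m5.large",
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 0.1,      # canary: ~10% of inference requests
            },
        ],
    )
    sm.create_endpoint(EndpointName="my-endpoint",
                       EndpointConfigName="my-endpoint-config")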

Machine Learning & AI Managed Services

  • Comprehend
    • natural language processing (NLP) service to find insights and relationships in text.
    • identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic.
  • Lex
    • provides conversational interfaces using voice and text helpful in building voice and text chatbots
  • Polly
    • converts text into speech
    • supports Speech Synthesis Markup Language (SSML) tags like prosody so users can adjust the speech rate, pitch or volume.
    • supports pronunciation lexicons to customize the pronunciation of words
  • Rekognition – analyze images and video
    • helps identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content.
  • Translate – natural and fluent language translation
  • Transcribe – automatic speech recognition (ASR) speech-to-text
  • Kendra – an intelligent search service that uses NLP and advanced ML algorithms to return specific answers to search questions from your data.
  • Panorama brings computer vision to the on-premises camera network.
  • Augmented AI (Amazon A2I) is an ML service that makes it easy to build the workflows required for human review.
  • Forecast – highly accurate forecasts.

Analytics

  • Make sure you know and understand data engineering concepts mainly in terms of data capture, migration, transformation, and storage.
  • Kinesis
    • Understand Kinesis Data Streams and Kinesis Data Firehose in depth
    • Kinesis Data Analytics can process and analyze streaming data using standard SQL and integrates with Data Streams and Firehose
    • Know Kinesis Data Streams vs Kinesis Firehose
      • Know Kinesis Data Streams is open-ended on both the producer and consumer sides. It supports the KCL and works with Spark.
      • Know Kinesis Firehose is open-ended on the producer side only. Data is stored in S3, Redshift, and OpenSearch (Elasticsearch).
      • Kinesis Firehose works in batches with a minimum 60-second interval.
      • Kinesis Data Firehose supports data transformation and record format conversion using a Lambda function (hint: can be used for transforming CSV or JSON into Parquet; a Lambda transform sketch follows this list)
    • Kinesis Video Streams provides a fully managed service to ingest, index, store, and stream live video. HLS can be used to view a Kinesis video stream, either for live playback or to view archived video.
  • OpenSearch (ElasticSearch) is a search service that supports indexing, full-text search, faceting, etc.
  • Data Pipeline helps define data-driven flows to automate and schedule regular data movement and data processing activities in AWS
  • Glue is a fully managed ETL (extract, transform, and load) service that automates the time-consuming steps of data preparation for analytics
    • helps setup, orchestrate, and monitor complex data flows.
    • Glue Data Catalog is a central repository to store structural and operational metadata for all the data assets.
    • Glue crawler connects to a data store, extracts the schema of the data, and then populates the Glue Data Catalog with this metadata
    • Glue DataBrew is a visual data preparation tool that enables users to clean and normalize data without writing any code.
  • DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between storage systems and services.
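
A minimal sketch of the Firehose transformation Lambda mentioned above, assuming the standard Kinesis Data Firehose transformation event format – records arrive base64-encoded and must be returned with the same recordId and a result status. The payload field names are hypothetical:

    import base64
    import json

    def lambda_handler(event, context):
        output = []
        for record in event["records"]:
            # Firehose delivers each record's payload base64-encoded
            payload = json.loads(base64.b64decode(record["data"]))

            # Hypothetical transformation: keep only selected fields
            transformed = {"user_id": payload.get("user_id"),
                           "amount": payload.get("amount")}

            output.append({
                "recordId": record["recordId"],  # must echo the original recordId
                "result": "Ok",                  # Ok | Dropped | ProcessingFailed
                "data": base64.b64encode(
                    (json.dumps(transformed) + "\n").encode()).decode(),
            })
        return {"records": output}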

Security, Identity & Compliance

  • Security is covered very lightly. (hint: SageMaker can read data from KMS-encrypted S3. Make sure the KMS key policies include the role attached to SageMaker.)

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics. (hint: SageMaker is integrated with CloudWatch, and all logs and metrics are stored in it)

Storage

  • Understand Data Storage Options – Know the patterns for S3 vs. RDS vs. DynamoDB vs. Redshift. (hint: S3 is, by default, the data storage option for Big Data storage; look for it in the answer.)

Whitepapers and articles

AWS Certified Database – Specialty (DBS-C01) Exam Learning Path

I recently revalidated my AWS Certified Database – Specialty (DBS-C01) certification just before it expired. The format and domains are pretty much the same as the previous exam; however, it has been enhanced to cover a lot of new services.

AWS Certified Database – Specialty (DBS-C01) Exam Content

AWS Certified Database – Specialty (DBS-C01) exam validates your understanding of databases, including the concepts of design, migration, deployment, access, maintenance, automation, monitoring, security, and troubleshooting, and covers the following tasks:

  • Understand and differentiate the key features of AWS database services.
  • Analyze needs and requirements to design and recommend appropriate database solutions using AWS services

Refer to AWS Database – Specialty Exam Guide

DBS-C01 Domains

AWS Certified Database – Specialty (DBS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answer options contain a lot of prose and require a lot of reading, so be sure you are prepared and manage your time well.
  • DBS-C01 exam has 65 questions to be solved in 170 minutes, which gives you roughly 2.5 minutes per question.
  • DBS-C01 exam includes two types of questions: multiple-choice and multiple-response.
  • DBS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Specialty exams currently cost $300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all the others.
  • As always, having a rough architecture or mental picture of the setup helps you focus on the areas you need to improve. Trust me, you will be able to eliminate two answers for sure; then compare the remaining two to see where they differ, which will help you reach the right answer or at least give you a 50% chance of getting it right.
  • AWS exams can be taken either at a testing center or online. I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS online exam for the first time, try to join at least 30 minutes before the actual time, as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Database – Specialty (DBS-C01) Exam Resources

AWS Certified Database – Specialty (DBS-C01) Exam Topics

  • AWS Certified Database – Specialty exam focuses completely on AWS database services, from relational, non-relational, graph, and caching to data warehousing. It also covers the deployment, automation, migration, security, monitoring, and troubleshooting aspects of them.

Database Services

  • Make sure you know and cover all the services in-depth, as 80% of the exam is focused on topics like Aurora, RDS, and DynamoDB
  • DynamoDB
    • is a fully managed NoSQL database service providing single-digit millisecond latency.
    • DynamoDB throughput supports On-demand and Provisioned capacity modes.
      • On-demand mode
        • provides a flexible billing option capable of serving thousands of requests per second without capacity planning
        • does not support reserved capacity
      • Provisioned mode
        • requires you to specify the number of reads and writes per second as required by the application
        • Understand the provisioned capacity calculations (a worked example follows this list)
    • DynamoDB Auto Scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf, in response to actual traffic patterns.
    • Know DynamoDB Burst capacity, Adaptive capacity
    • DynamoDB Consistency mode determines the manner and timing in which the successful write or update of a data item is reflected in a subsequent read operation of that same item.
      • supports eventual and strongly consistent reads.
      • Eventual requires less throughput but might return stale data, whereas, Strongly consistent reads require higher throughput but would always return correct data.
    • DynamoDB secondary indexes provide efficient access to data with attributes other than the primary key.
      • LSI uses the same partition key but a different sort key, whereas, GSI is a separate table with a different partition key and/or sort key.
      • GSI can cause primary table throttling if under-provisioned.
      • Make sure you understand the difference between the Local Secondary Index and the Global Secondary Index
    • DynamoDB Global Tables is a new multi-master, cross-region replication capability of DynamoDB to support data access locality and regional fault tolerance for database workloads.
    • DynamoDB Time to Live – TTL enables a per-item timestamp to determine when an item is no longer needed. (hint: know TTL can expire the data and this can be captured by using DynamoDB Streams)
    • DynamoDB cross-region replication allows identical copies (called replicas) of a DynamoDB table (called master table) to be maintained in one or more AWS regions.
    • DynamoDB Streams provides a time-ordered sequence of item-level changes made to data in a table.
    • DynamoDB Triggers (just like database triggers) is a feature that allows the execution of custom actions based on item-level updates on a table.
    • DynamoDB Accelerator – DAX is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement even at millions of requests per second.
      • DAX does not support fine-grained access control like DynamoDB.
    • DynamoDB Backups support point-in-time recovery (PITR)
      • AWS Backup can be used to backup and restore, and it supports cross-region snapshot copy as well.
    • VPC Gateway Endpoints provide private access to DynamoDB from within a VPC without the need for an internet gateway or NAT gateway
    • Understand DynamoDB Best practices (hint: selection of keys to avoid hot partitions and creation of LSI and GSI)
  • Aurora
    • is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases.
    • provides MySQL and PostgreSQL compatibility
    • Aurora Disaster Recovery & High Availability can be achieved using Read Replicas with very minimal downtime.
      • Aurora promotes read replicas as per the priority tier (tier 0 is the highest) and, if the tiers match, the largest size.
    • Aurora Global Database provides cross-region read replicas for low-latency reads. Remember it is not multi-master and would not provide low-latency writes across regions, unlike DynamoDB Global Tables.
    • Aurora Connection endpoints support
      • Cluster for primary read/write
      • Reader for read replicas
      • Custom for a specific group of instances
      • Instance for specific single instance – Not recommended
    • Aurora Fast Failover techniques
      • set TCP keepalives low
      • set Java DNS caching timeouts low
      • set the timeout variables used in the JDBC connection string low
      • use the provided read and write Aurora endpoints
      • use cluster cache management for Aurora PostgreSQL. Cluster cache management ensures that application performance is maintained if there’s a failover.
    • Aurora Serverless is an on-demand, autoscaling configuration for the MySQL-compatible and PostgreSQL-compatible editions of Aurora.
    • Aurora Backtrack feature helps rewind the DB cluster to the specified time. It is not a replacement for backups.
    • Aurora server auditing events for different activities cover log-ins, DML, permission changes (DCL), schema changes (DDL), etc.
    • Aurora Cluster cache management feature helps achieve faster failover for Aurora PostgreSQL
    • Aurora Clone feature allows you to create quick and cost-effective clones
    • Aurora supports fault injection queries to simulate various failures like node down, primary failover, etc.
    • RDS PostgreSQL and MySQL can be migrated to Aurora, by creating an Aurora Read Replica from the instance. Once the replica lag is zero, switch the DNS with no data loss
    • Aurora Database Activity Streams help stream audit logs to external services like Kinesis
    • supports stored procedures calling Lambda functions
  • Relational Database Service (RDS)
    • provides a relational database in the cloud with multiple database options.
    • RDS Snapshots, Backups, and Restore
      • restoring a DB from a snapshot does not retain the parameter group and security group
      • automated snapshots cannot be shared; make a manual copy of the snapshot before sharing it
    • RDS Read Replicas
      • allow elastic scaling beyond the capacity constraints of a single DB instance for read-heavy database workloads.
      • increased scalability and database availability in the case of an AZ failure.
      • supports cross-region replicas.
    • RDS Multi-AZ provides high availability and automatic failover support for DB instances.
    • Understand the differences between RDS Multi-AZ vs Read Replicas
      • Multi-AZ failover can be simulated using the Reboot with failover option
      • Read Replicas require automated backups to be enabled
    • Understand DB components, esp. DB parameter groups and DB option groups
      • Dynamic parameters are applied immediately
      • Static parameters need a manual reboot
      • The default parameter group cannot be modified; create a custom parameter group and associate it with the RDS instance
      • Know that max connections also depends on the DB instance size
    • RDS Custom automates database administration tasks and operations, while making it possible for you as a database administrator to access and customize the database environment and operating system.
    • RDS Performance Insights is a database performance tuning and monitoring feature that helps you quickly assess the load on the database, and determine when and where to take action. 
    • RDS Security
      • RDS supports security groups to control who can access RDS instances
      • RDS supports data at rest encryption and SSL for data in transit encryption
      • RDS supports IAM database authentication with temporary credentials.
      • An existing RDS instance cannot be encrypted directly: create a snapshot -> copy it with encryption enabled -> restore the encrypted copy as the new DB (a boto3 sketch follows this list)
      • RDS PostgreSQL requires rds.force_ssl=1 and sslmode=ca/verify-full to enable SSL encryption
      • Know RDS Encrypted Database limitations
    • Understand RDS Monitoring and Notification
      • Know RDS supports notification events through SNS for events like database creation, deletion, snapshot creation, etc.
      • CloudWatch gathers metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.
      • Enhanced Monitoring metrics are useful to understand how different processes or threads on a DB instance use the CPU.
      • RDS Performance Insights is a database performance tuning and monitoring feature that helps illustrate the database’s performance and help analyze any issues that affect it
    • An RDS instance cannot be stopped if it has read replicas
  • ElastiCache
    • is a managed web service that helps deploy and run Memcached or Redis protocol-compliant cache clusters in the cloud easily.
    • Understand the differences between Redis vs. Memcached
  • Neptune
    • is a fully managed database service built for the cloud that makes it easier to build and run graph applications. Neptune provides built-in security, continuous backups, serverless compute, and integrations with other AWS services.  
    • provides Neptune loader to quickly import data from S3
    • supports VPC endpoints
  • Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service.
  • Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log.
  • Redshift
    • is a fast, fully managed, petabyte-scale data warehouse service. It is not covered in depth.
    • Know Redshift Best Practices w.r.t. the selection of Distribution style, Sort key, and importing/exporting data
      • a single COPY command allows parallelism and performs better than multiple concurrent COPY commands
      • COPY command can use manifest files to load data
      • COPY command handles encrypted data
    • Know Redshift cross-region encrypted snapshot copy (a boto3 sketch follows this list)
      • Create a new KMS key in the destination region
      • Use CreateSnapshotCopyGrant to allow Amazon Redshift to use the KMS key from the destination region.
      • In the source region, enable cross-region replication and specify the name of the copy grant created.
    • Know Redshift supports audit logging, which covers authentication attempts, connections, and disconnections, usually for compliance reasons.
  • Data Migration Service (DMS)
    • DMS helps in the migration of homogeneous and heterogeneous databases
    • DMS with Full load plus Change Data Capture (CDC) migration capability can be used to migrate databases with zero downtime and no data loss.
    • DMS with SCT (Schema Conversion Tool) can be used to migrate heterogeneous databases.
    • Premigration Assessment evaluates specified components of a database migration task to help identify any problems that might prevent a migration task from running as expected.
    • Multiserver assessment report evaluates multiple servers based on input that you provide for each schema definition that you want to assess.
    • DMS provides support for data validation to ensure that your data was migrated accurately from the source to the target.
    • DMS supports LOB migration as a 2-step process. It can do a full or limited LOB migration
      • In full LOB mode, AWS DMS migrates all LOBs from source to target regardless of size. Full LOB mode can be quite slow.
      • In limited LOB mode, a maximum LOB size can be set that AWS DMS should accept. Doing so allows AWS DMS to pre-allocate memory and load the LOB data in bulk. LOBs that exceed the maximum LOB size are truncated and a warning is issued to the log file. In limited LOB mode, you get significant performance gains over full LOB mode.
      • Recommended to use limited LOB mode whenever possible.
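
A worked example of the DynamoDB provisioned capacity calculations mentioned above, using the standard sizing rules (1 RCU = one strongly consistent read/sec of an item up to 4 KB, halved for eventually consistent reads; 1 WCU = one write/sec of an item up to 1 KB). The item sizes and rates are made up:

    import math

    def rcu(item_size_kb, reads_per_sec, strongly_consistent=True):
        # 1 RCU = one strongly consistent read/sec for up to 4 KB;
        # eventually consistent reads need half the RCUs
        units = math.ceil(item_size_kb / 4) * reads_per_sec
        return units if strongly_consistent else units / 2

    def wcu(item_size_kb, writes_per_sec):
        # 1 WCU = one write/sec for up to 1 KB
        return math.ceil(item_size_kb / 1) * writes_per_sec

    print(rcu(6, 10))         # ceil(6/4) = 2 RCUs per read -> 20 RCUs
    print(rcu(6, 10, False))  # eventually consistent -> 10 RCUs
    print(wcu(1.5, 5))        # ceil(1.5/1) = 2 WCUs per write -> 10 WCUs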
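
A hedged boto3 sketch of the encrypt-an-existing-RDS-instance flow referenced above (snapshot -> encrypted copy -> restore); the instance identifiers and KMS key alias are hypothetical, and each step should complete before the next runs:

    import boto3

    rds = boto3.client("rds")

    # 1. Snapshot the existing unencrypted instance
    rds.create_db_snapshot(DBInstanceIdentifier="mydb",
                           DBSnapshotIdentifier="mydb-snap")

    # 2. Copy the snapshot with encryption enabled (requires a KMS key)
    rds.copy_db_snapshot(SourceDBSnapshotIdentifier="mydb-snap",
                         TargetDBSnapshotIdentifier="mydb-snap-encrypted",
                         KmsKeyId="alias/my-rds-key")  # hypothetical KMS key alias

    # 3. Restore a new, encrypted instance from the encrypted snapshot
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier="mydb-encrypted",
        DBSnapshotIdentifier="mydb-snap-encrypted")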
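
And a boto3 sketch of the Redshift cross-region encrypted snapshot copy steps above – the copy grant is created against the KMS key in the destination region, and cross-region copy is then enabled from the source region. Regions, names, and the key alias are hypothetical:

    import boto3

    # In the destination region: allow Redshift to use the KMS key there
    dest = boto3.client("redshift", region_name="us-west-2")
    dest.create_snapshot_copy_grant(SnapshotCopyGrantName="my-copy-grant",
                                    KmsKeyId="alias/my-dest-key")

    # In the source region: enable cross-region copy, referencing the grant
    src = boto3.client("redshift", region_name="us-east-1")
    src.enable_snapshot_copy(ClusterIdentifier="my-cluster",
                             DestinationRegion="us-west-2",
                             SnapshotCopyGrantName="my-copy-grant")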

Security, Identity & Compliance

  • Identity and Access Management (IAM)
  • Key Management Services
    • is a managed encryption service that allows the creation and control of encryption keys to enable data encryption.
    • provides data at rest encryption for the databases.
  • AWS Secrets Manager
    • protects secrets needed to access applications, services, etc.
    • enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle
    • supports automatic rotation of credentials for RDS, DocumentDB, etc.
  • Secrets Manager vs. Systems Manager Parameter Store
    • Secrets Manager supports automatic rotation while SSM Parameter Store does not
    • Parameter Store is cost-effective as compared to Secrets Manager.
  • Trusted Advisor provides an RDS Idle DB Instances check

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics.
    • EventBridge (CloudWatch Events) provides real-time alerts
    • CloudWatch can be used to store RDS logs with a custom retention period, which is indefinite by default.
    • CloudWatch Application Insights support .Net and SQL Server monitoring
  • Know CloudFormation for provisioning, in terms of
    • Stack drift detection – to understand the difference between the stack’s expected configuration and the actual environment, including any manual changes
    • Change Sets – allow you to verify the changes before they are propagated
    • Parameters – allow you to configure variables or environment-specific values
    • Stack policies – define the update actions that can be performed on designated resources
    • Deletion policy for RDS – allows you to configure whether the resources are retained, snapshotted, or deleted when the stack is destroyed
    • supports Secrets Manager for DB credentials generation, storage, and easy rotation
    • supports Systems Manager Parameter Store for environment-specific parameters

Whitepapers and articles

On the Exam Day

  • Make sure you are relaxed and get a good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure your desk is clear, with no wristwatches or external monitors, keep your phone away, and ensure nobody can enter the room.

Finally, All the Best 🙂

AWS Certified Data Analytics – Specialty (DAS-C01) Exam Learning Path

  • Recertified the AWS Certified Data Analytics – Specialty (DAS-C01), which tends to cover a lot of big data topics focused on AWS services.
  • Data Analytics – Specialty (DAS-C01) has replaced the previous Big Data – Specialty (BDS-C01).

AWS Certified Data Analytics – Specialty (DAS-C01) exam basically validates

  • Define AWS data analytics services and understand how they integrate with each other.
  • Explain how AWS data analytics services fit in the data lifecycle of collection, storage, processing, and visualization.

Refer to the AWS Certified Data Analytics – Specialty Exam Guide for details

AWS Certified Data Analytics - Specialty DAS-C01 Domains

AWS Certified Data Analytics – Specialty (DAS-C01) Exam Resources

AWS Certified Data Analytics – Specialty (DAS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answer options contain a lot of prose and require a lot of reading, so be sure you are prepared and manage your time well.
  • DAS-C01 exam has 65 questions to be solved in 170 minutes, which gives you roughly 2.5 minutes per question.
  • DAS-C01 exam includes two types of questions: multiple-choice and multiple-response.
  • DAS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Specialty exams currently cost $300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all the others.
  • As always, having a rough architecture or mental picture of the setup helps you focus on the areas you need to improve. Trust me, you will be able to eliminate two answers for sure; then compare the remaining two to see where they differ, which will help you reach the right answer or at least give you a 50% chance of getting it right.
  • AWS exams can be taken either at a testing center or online. I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS online exam for the first time, try to join at least 30 minutes before the actual time, as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Data Analytics – Specialty (DAS-C01) Exam Topics

  • AWS Certified Data Analytics – Specialty exam, as its name suggests, covers a lot of Big Data concepts, right from data collection, ingestion, transfer, storage, pre- and post-processing, analytics, and visualization, with the added concepts for data security at each layer.

Analytics

  • Make sure you know and cover all the services in-depth, as 80% of the exam is focused on topics like Glue, Kinesis, and Redshift.
  • AWS Analytics Services Cheat Sheet
  • Glue
    • DAS-C01 covers Glue in great detail.
    • AWS Glue is a fully managed, ETL service that automates the time-consuming steps of data preparation for analytics.
    • supports server-side encryption for data at rest and SSL for data in motion.
    • Glue ETL engine helps Extract, Transform, and Load data and can automatically generate Scala or Python code.
    • Glue Data Catalog is a central repository and persistent metadata store to store structural and operational metadata for all the data assets. It works with Apache Hive as its metastore.
    • Glue Crawlers scan various data stores to automatically infer schemas and partition structures to populate the Data Catalog with corresponding table definitions and statistics.
    • Glue Job Bookmark tracks data that has already been processed during a previous run of an ETL job by persisting state information from the job run.
    • Glue Streaming ETL enables performing ETL operations on streaming data using continuously-running jobs.
    • Glue provides a flexible scheduler that handles dependency resolution, job monitoring, and retries.
    • Glue Studio offers a graphical interface for authoring AWS Glue jobs to process data allowing you to define the flow of the data sources, transformations, and targets in the visual interface and generating Apache Spark code on your behalf.
    • Glue Data Quality helps reduce manual data quality efforts by automatically measuring and monitoring the quality of data in data lakes and pipelines.
    • Glue DataBrew helps prepare, visualize, clean, and normalize data directly from the data lake, data warehouses, and databases, including S3, Redshift, Aurora, and RDS.
  • Redshift
    • Redshift is also covered in depth.
    • Cover Redshift Advanced topics
      • Redshift Distribution Style determines how data is distributed across compute nodes and helps minimize the impact of the redistribution step by locating the data where it needs to be before the query is executed.
      • Redshift Enhanced VPC routing forces all COPY and UNLOAD traffic between the cluster and the data repositories through the VPC.
      • Workload management (WLM) enables users to flexibly manage priorities within workloads so that short, fast-running queries won’t get stuck in queues behind long-running queries.
      • Redshift Spectrum helps query and retrieve structured and semistructured data from files in S3 without having to load the data into Redshift tables.
      • Federated Query feature allows querying and analyzing data across operational databases, data warehouses, and data lakes.
      • Short query acceleration (SQA) prioritizes selected short-running queries ahead of longer-running queries.
      • Redshift Serverless is a serverless option of Redshift that makes it more efficient to run and scale analytics in seconds without the need to set up and manage data warehouse infrastructure.
    • Redshift Best Practices w.r.t. the selection of Distribution style, Sort key, and importing/exporting data
      • a single COPY command allows parallelism and performs better than multiple concurrent COPY commands (a COPY sketch follows this list)
      • COPY command can use manifest files to load data
      • COPY command handles encrypted data
    • Redshift Resizing cluster options (elastic resize did not support node type changes before, but does now)
    • Redshift supports encryption at rest and in transit
    • Redshift supports encrypting an unencrypted cluster using KMS. However, you can’t enable hardware security module (HSM) encryption by modifying the cluster. Instead, create a new, HSM-encrypted cluster and migrate your data to the new cluster.
    • Know Redshift views to control access to data.
  • Elastic MapReduce (EMR)
    • Understand EMRFS
      • Use Consistent view to make sure S3 objects referred to by different applications are in sync, although it is no longer needed now that S3 is strongly consistent.
    • Know EMR Best Practices (hint: start with many small nodes instead of few large nodes)
    • Know EMR Encryption options
      • supports SSE-S3, SSE-KMS, CSE-KMS, and CSE-Custom encryption for EMRFS
      • supports LUKS encryption for local disks
      • supports TLS for data in transit encryption
      • supports EBS encryption
    • Hive metastore can be externally hosted using RDS, Aurora, and AWS Glue Data Catalog
    • Know also different technologies
      • Presto is a fast SQL query engine designed for interactive analytic queries over large datasets from multiple sources
      • Spark is a distributed processing framework and programming model that helps do machine learning, stream processing, or graph analytics using Amazon EMR clusters
      • Zeppelin/Jupyter provide notebooks for interactive data exploration – open-source web applications that can be used to create and share documents that contain live code, equations, visualizations, and narrative text
      • Phoenix is used for OLTP and operational analytics, allowing you to use standard SQL queries and JDBC APIs to work with an Apache HBase backing store
  • Kinesis
    • Understand Kinesis Data Streams and Kinesis Data Firehose in depth
    • Know Kinesis Data Streams vs Kinesis Firehose
      • Know Kinesis Data Streams is open-ended on both the producer and consumer sides. It supports the KCL and works with Spark.
      • Know Kinesis Firehose is open-ended on the producer side only. Data is stored in S3, Redshift, and OpenSearch.
      • Kinesis Firehose works in batches with a minimum 60-second interval, i.e., near-real time.
      • Kinesis Firehose supports out-of-the-box transformation and custom transformation using Lambda
    • Kinesis supports encryption at rest using server-side encryption
    • Kinesis Producer Library supports batching
    • Kinesis Data Analytics
      • helps transform and analyze streaming data in real time using Apache Flink.
      • supports anomaly detection using the Random Cut Forest (RCF) ML algorithm
      • supports reference data stored in S3.
  • OpenSearch
    • OpenSearch is a search service that supports indexing, full-text search, faceting, etc.
    • OpenSearch can be used for analysis and supports visualization using OpenSearch Dashboards which can be real-time.
    • OpenSearch Service Storage tiers support Hot, UltraWarm, and Cold and the data can be transitioned using Index State management.
  • QuickSight
    • Know Visual Types (hint: esp. word clouds, plotting line, bar, and story-based visualizations)
    • Know Supported Data Sources
    • QuickSight provides IP addresses that need to be whitelisted for QuickSight to access the data store.
    • QuickSight provides direct integration with Microsoft AD
    • QuickSight supports Row level security using dataset rules to control access to data at row granularity based on permissions associated with the user interacting with the data.
    • QuickSight supports ML insights as well
    • QuickSight supports users defined via IAM or email signup.
  • Athena
    • is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats.
    • provides a simplified, flexible way to analyze data in an S3 data lake and 30 data sources, including on-premises data sources or other cloud systems using SQL or Python without loading the data.
    • integrates with QuickSight for visualizing the data or creating dashboards.
    • uses a managed Glue Data Catalog to store information and schemas about the databases and tables that you create for the data stored in S3
    • Workgroups can be used to separate users, teams, applications, or workloads, to set limits on the amount of data each query or the entire workgroup can process, and to track costs.
    • Athena best practices recommend partitioning the data, partition projection, and using columnar file formats like ORC or Parquet, as they support compression and are splittable.
  • Know Data Pipeline for data transfer
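
A small sketch of the COPY best practice above – a single COPY command, using a manifest file and a columnar format, loaded in parallel across slices. The table, bucket, role, and cluster are hypothetical; the statement is submitted here via the Redshift Data API:

    import boto3

    # One COPY parallelizes the load from all files in the manifest;
    # it performs better than multiple concurrent COPY commands.
    copy_sql = """
    COPY sales
    FROM 's3://my-bucket/sales/manifest.json'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    MANIFEST
    FORMAT AS PARQUET;
    """

    client = boto3.client("redshift-data")
    client.execute_statement(ClusterIdentifier="my-cluster",
                             Database="dev",
                             DbUser="awsuser",
                             Sql=copy_sql)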

Security, Identity & Compliance

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics.
  • CloudWatch Subscription Filters can be used to route log data to Kinesis Data Streams, Kinesis Data Firehose, and Lambda (a sketch follows below).
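
A hedged boto3 sketch of a subscription filter routing a log group’s events to a Firehose delivery stream; the log group, filter pattern, stream, and role are hypothetical:

    import boto3

    logs = boto3.client("logs")
    logs.put_subscription_filter(
        logGroupName="/aws/app/my-app",  # hypothetical log group
        filterName="to-firehose",
        filterPattern="ERROR",           # forward only matching events
        destinationArn="arn:aws:firehose:us-east-1:123456789012:deliverystream/my-stream",
        roleArn="arn:aws:iam::123456789012:role/CWLtoFirehoseRole",
    )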

Whitepapers and articles

On the Exam Day

  • Make sure you are relaxed and get a good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure your desk is clear, with no wristwatches or external monitors, keep your phone away, and ensure nobody can enter the room.

Finally, All the Best 🙂

AWS Certified Security – Specialty (SCS-C01) Exam Learning Path

I recently re-certified the AWS Certified Security – Specialty (SCS-C01) after first clearing it in 2019. The format and domains are pretty much the same; however, the exam has been enhanced to cover all the latest services.

AWS Certified Security – Specialty (SCS-C01) Exam Content

  • The AWS Certified Security – Specialty (SCS-C01) exam focuses on the AWS Security and Compliance concepts. It basically validates
    • An understanding of specialized data classifications and AWS data protection mechanisms.
    • An understanding of data-encryption methods and AWS mechanisms to implement them.
    • An understanding of secure Internet protocols and AWS mechanisms to implement them.
    • A working knowledge of AWS security services and features of services to provide a secure production environment.
    • Competency gained from two or more years of production deployment experience using AWS security services and features.
    • The ability to make tradeoff decisions with regard to cost, security, and deployment complexity given a set of application requirements.
    • An understanding of security operations and risks.

Refer to the AWS Certified Security – Specialty Exam Guide

AWS Certified Security – Specialty (SCS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answer options contain a lot of prose and require a lot of reading, so be sure you are prepared and manage your time well.
  • SCS-C01 exam has 65 questions to be solved in 170 minutes, which gives you roughly 2.5 minutes per question.
  • SCS-C01 exam includes two types of questions: multiple-choice and multiple-response.
  • SCS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Specialty exams currently cost $300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all the others.
  • As always, having a rough architecture or mental picture of the setup helps you focus on the areas you need to improve. Trust me, you will be able to eliminate two answers for sure; then compare the remaining two to see where they differ, which will help you reach the right answer or at least give you a 50% chance of getting it right.
  • AWS exams can be taken either at a testing center or online. I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS online exam for the first time, try to join at least 30 minutes before the actual time, as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Security – Specialty (SCS-C01) Exam Resources

AWS Certified Security – Specialty (SCS-C01) Exam Topics

  • AWS Certified Security – Specialty (SCS-C01) exam focuses a lot on Security & Compliance concepts involving Data Encryption at rest or in transit, Data protection, Auditing, Compliance and regulatory requirements, and automated remediation.

Security, Identity & Compliance

  • Identity and Access Management (IAM)
    • IAM Roles to grant services and users temporary access to AWS services.
      • IAM Role can be used to give cross-account access and usually involves creating a role within the trusting account with a trust and permission policy, and granting the user in the trusted account permissions to assume the trusting account role (a cross-account sketch follows this list).
    • Identity Providers & Federation to grant external user identity (SAML or Open ID compatible IdPs) permissions to AWS resources without having to be created within the AWS account.
    • IAM Policies help define who has access & what actions can they perform.
  • Deep dive into Key Management Service (KMS). There would be quite a few questions on this.
    • is a managed encryption service that allows the creation and control of encryption keys to enable data encryption. 
    • uses Envelope Encryption, where a master key encrypts the data key, which in turn is used to encrypt the data (an envelope-encryption sketch follows this list).
    • Understand how KMS works
    • Understand IAM Policies, Key Policies, Grants to grant access.
      • Key policies are the primary way to control access to KMS keys. Unless the key policy explicitly allows it, you cannot use IAM policies to allow access to a KMS key.
    • KMS keys are regional; however, KMS supports multi-region keys, which are KMS keys in different AWS Regions that can be used interchangeably – as though you had the same key in multiple Regions.
    • KMS Multi-region keys
      • are AWS KMS keys in different AWS Regions that can be used interchangeably – as though having the same key in multiple Regions.
      • are not global and each multi-region key needs to be replicated and managed independently.
    • Understand the difference between CMK with generated and imported key material esp. in rotating keys
    • KMS usage with VPC Endpoint which ensures the communication between the VPC and KMS is conducted entirely within the AWS network.
    • KMS kms:ViaService condition key limits the use of a KMS key to requests from specified AWS services
  • CloudHSM
    • is a cloud-based hardware security module (HSM) that enables you to easily generate and use your own encryption keys on the AWS Cloud
  • AWS Certificate Manager (ACM)
    • helps provision, manage, and deploy public and private SSL/TLS certificates for use with AWS services
    • to use an ACM certificate with CloudFront, the certificate must be requested or imported in the US East (N. Virginia) region.
    • is regional, and you need to request certificates in each region where they are used and associate them individually.
    • does not support EC2 instances, and private keys cannot be exported.
  • AWS Secrets Manager
    • protects secrets needed to access applications, services, etc.
    • enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle
    • supports automatic rotation of credentials for RDS, DocumentDB, etc.
  • Secrets Manager vs Systems Manager Parameter Store
    • Secrets Manager supports automatic rotation while SSM Parameter Store does not
    • Parameter Store is cost-effective as compared to Secrets Manager.
  • AWS GuardDuty
    • is a threat detection service that continuously monitors the AWS accounts and workloads for malicious activity and delivers detailed security findings for visibility and remediation.
    • supports CloudTrail S3 data events and management event logs, DNS logs, EKS audit logs, and VPC flow logs.
  • AWS Inspector
    • is an automated security assessment service that helps improve the security and compliance of applications deployed on AWS.
  • Amazon Macie
    • is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in S3.
  • AWS Artifact is a central resource for compliance-related information that provides on-demand access to AWS’ security and compliance reports and select online agreements
  • AWS Shield & Shield Advanced
    • for DDoS protection and integrates with Route 53, CloudFront, ALB, and Global Accelerator.
  • AWS WAF
    • protects from common attack techniques like SQL injection and XSS; conditions can be based on IP addresses, HTTP headers, HTTP body, and URI strings.
    • integrates with CloudFront, ALB, and API Gateway.
    • supports Web ACLs and can block traffic based on IPs, Rate limits, and specific countries as well
    • allows IP match set rule to allow/deny specific IP addresses and rate-based rule to limit the number of requests.
    • logs can be sent to the CloudWatch Logs log group, an S3 bucket, or Kinesis Data Firehose.
  • AWS Security Hub is a cloud security posture management service that performs security best practice checks, aggregates alerts, and enables automated remediation.
  • AWS Network Firewall is a stateful, fully managed, network firewall and intrusion detection and prevention service (IDS/IPS) for VPCs.
  • AWS Resource Access Manager helps you securely share your resources across AWS accounts, within your organization or organizational units (OUs), and with IAM roles and users for supported resource types.
  • AWS Signer is a fully managed code-signing service to ensure the trust and integrity of your code.
  • AWS Audit Manager to map your compliance requirements to AWS usage data with prebuilt and custom frameworks and automated evidence collection.
  • AWS Cognito esp. User Pools
  • Firewall Manager helps centrally configure and manage firewall rules across the accounts and applications in AWS Organizations which includes a variety of protections, including WAF, Shield Advanced, VPC security groups, Network Firewall, and Route 53 Resolver DNS Firewall.

Networking & Content Delivery

  • Virtual Private Cloud – VPC
    • Security Groups, NACLs
      • NACLs are stateless, Security groups are stateful
      • NACLs at subnet level, Security groups at the instance level
      • NACLs need to open ephemeral ports for response traffic.
    • VPC Gateway Endpoints to provide access to S3 and DynamoDB (see the endpoint sketch after this list)
    • VPC Interface Endpoints or PrivateLink provide access to a variety of services like SQS, Kinesis, or Private APIs exposed through NLB.
    • VPC Peering
      • to enable communication between VPCs within the same or different regions.
      • Route tables need to be configured on either VPC for them to be able to communicate.
      • does not allow cross-region security group reference.
    • VPC Flow Logs help capture information about the IP traffic going to and from network interfaces in the VPC
    • NAT Gateway provides managed NAT service that provides better availability, higher bandwidth, and requires less administrative effort.
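
A minimal sketch of creating an S3 gateway endpoint with boto3 (the VPC, route table, and region identifiers are illustrative):

```python
import boto3

ec2 = boto3.client("ec2")

# A gateway endpoint routes S3 traffic over the AWS network by adding
# entries to the specified route tables -- no internet gateway or NAT needed.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0abc1234",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0abc1234"],
)
```
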
  • Virtual Private Network – VPN & Direct Connect to establish secure, low-latency connectivity between an on-premises data center and a VPC.
    • IPSec VPN over Direct Connect to provide secure connectivity.
  • CloudFront 
    • integrates with S3 to improve latency and performance.
    • provides multiple security features
    • supports encryption at rest and end-to-end encryption
      • Viewer Protocol Policy and Origin Protocol Policy to enforce HTTPS – can be configured to require that viewers use HTTPS to request the files so that connections are encrypted when CloudFront communicates with viewers.
      • Integrates with ACM and requires certs to be in the us-east-1 region
      • The underlying origin can use certificates from ACM or ones issued by a third party.
    • CloudFront Origin Shield
      • helps improve the cache hit ratio and reduce the load on the origin.
      • requests from other regional caches would hit the Origin shield rather than the Origin.
      • should be placed at the regional cache layer and not in the edge cache
      • should be deployed in the region closest to the origin server
    • CloudFront provides Encryption at Rest
      • uses SSDs which are encrypted for edge location points of presence (POPs), and encrypted EBS volumes for Regional Edge Caches (RECs).
      • Function code and configuration are always stored in an encrypted format on the encrypted SSDs on the edge location POPs, and in other storage locations used by CloudFront.
    • Restricting access to content
  • Route 53
    • is a highly available and scalable DNS web service.
    • Resolver Query logging
      • logs the queries that originate in specified VPCs, on-premises resources that use inbound resolver or ones using outbound resolver as well as the responses to those DNS queries.
      • can be logged to CloudWatch logs, S3, and Kinesis Data Firehose
    • Route 53 DNSSEC secures DNS traffic, and helps protect a domain from DNS spoofing man-in-the-middle attacks. 
  • Elastic Load Balancer
    • End to End encryption
      • can be done with NLB using a TCP listener as a pass-through and terminating SSL on the EC2 instances
      • can be done with ALB with SSL termination and using HTTPS between the ALB and EC2 instances
  • Gateway Load Balancer – GWLB
    • helps deploy, scale, and manage virtual appliances, such as firewalls, IDS/IPS systems, and deep packet inspection systems.

Management & Governance Tools

  • CloudWatch
  • CloudTrail for audit and governance
    • CloudTrail can be enabled for all regions at one go and supports log file integrity validation
    • With Organizations, the trail can be configured to log CloudTrail from all accounts to a central account.
  • AWS Config
    • AWS Config rules can be used to alert on any changes, and Config can be used to check the history of changes. AWS Config can also help check compliance with approved AMIs.
    • allows you to remediate noncompliant resources using AWS Systems Manager Automation documents.
    • AWS Config -> EventBridge -> Lambda/SNS
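
A minimal sketch of the Lambda side of the Config -> EventBridge -> Lambda/SNS pattern above, reacting to a Config Rules Compliance Change event (the topic ARN is illustrative):

```python
import boto3

sns = boto3.client("sns")

def lambda_handler(event, context):
    # EventBridge delivers Config compliance changes with the resource and
    # the new evaluation result in the event detail.
    detail = event["detail"]
    resource_id = detail["resourceId"]
    compliance = detail["newEvaluationResult"]["complianceType"]

    if compliance == "NON_COMPLIANT":
        sns.publish(
            TopicArn="arn:aws:sns:us-east-1:123456789012:config-alerts",  # illustrative
            Message=f"{resource_id} is NON_COMPLIANT",
        )
```
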
  • CloudTrail vs Config
    • CloudTrail provides the WHO and Config provides the WHAT.
  • Systems Manager
    • Parameter Store provides secure, scalable, centralized, hierarchical storage for configuration data and secrets management. It does not support secrets rotation; use Secrets Manager instead.
    • Systems Manager Patch Manager helps select and deploy the operating system and software patches automatically across large groups of EC2 or on-premises instances
    • Systems Manager Run Command provides safe, secure remote management of your instances at scale without logging into the servers, replacing the need for bastion hosts, SSH, or remote PowerShell
    • Session Manager provides secure and auditable instance management without the need to open inbound ports, maintain bastion hosts, or manage SSH keys.
  • AWS Organizations
    • is an account management service that enables consolidating multiple AWS accounts into an organization that can be managed centrally.
    • can configure Organization Trail to centrally log all CloudTrail logs.
    • Service Control Policies 
      • acts as guardrails and specify the services and actions that users and roles can use in the accounts that the SCP affects.
      • are similar to IAM permission policies except that they don’t grant any permissions.
  • AWS Trusted Advisor
    • inspects the AWS environment to make recommendations for system performance, saving money, availability, and closing security gaps
  • CloudFormation
    • DeletionPolicy to delete, retain, or snapshot (back up) resources like RDS instances and EBS volumes
    • Stack policy can prevent stack resources from being unintentionally updated or deleted during a stack update. Stack Policy only applies for Stack updates and not stack deletion.
  • Control Tower
    • to setup, govern, and secure a multi-account environment
    • strongly recommended guardrails cover EBS encryption

Storage & Databases

  • Simple Storage Service – S3
    • Understand S3 Security in detail
    • S3 Encryption supports both data at rest and data in transit encryption.
      • Data in transit encryption can be provided by enabling communication via SSL or using client-side encryption
      • Data at rest encryption can be provided using Server Side or Client Side encryption
      • Enforce S3 encryption at rest using default bucket encryption or bucket policies
      • Enforce S3 encryption in transit using the aws:SecureTransport condition in a bucket policy (see the bucket-policy sketch after this list)
    • S3 permissions can be handled using IAM policies, bucket policies, and ACLs.
    • S3 Object Lock helps to store objects using a WORM model and can help prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
    • S3 Block Public Access provides controls across an entire AWS Account or at the individual S3 bucket level to ensure that objects never have public access, now and in the future.
    • S3 Access Points simplify data access for any AWS service or customer application that stores data in S3.
    • S3 Versioning with MFA Delete can be enabled on a bucket to ensure that data in the bucket cannot be accidentally overwritten or deleted.
    • S3 Access Analyzer monitors the access policies, ensuring that the policies provide only the intended access to your S3 resources.
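
A minimal sketch of enforcing encryption in transit with the aws:SecureTransport condition (the bucket name is illustrative):

```python
import json
import boto3

# Deny any S3 request that does not arrive over TLS.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}
boto3.client("s3").put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))
```
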
  • Glacier Vault Lock
  • EBS Encryption
  • Relational Database Services – RDS
    • is a web service that makes it easier to set up, operate, and scale a relational database in the cloud.
    • supports the same encryption at rest methods as EBS
    • does not support enabling encryption after creation. Need to create a snapshot, copy the snapshot to an encrypted snapshot and restore it as an encrypted DB.

Compute

Integration Tools

  • Know how CloudWatch integration with SNS and Lambda can help in notification (Topics are not required to be in detail)

Whitepapers and articles

On the Exam Day

  • Make sure you are relaxed and get a good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure you have your desk clear, no hand-watches, or external monitors, keep your phones away, and nobody can enter the room.

Finally, All the Best 🙂

AWS Certified Advanced Networking – Specialty ANS-C01 Exam Learning Path

AWS Certified Advanced Networking - Specialty Certificate

AWS Certified Advanced Networking – Specialty ANS-C01 Exam Learning Path

I recently certified/recertified for the AWS Certified Advanced Networking – Specialty (ANS-C01). Frankly, Networking is something that I am still diving deep into, and I just about managed to get through. So a word of caution: this exam is on par with or tougher than the professional exams, especially because some of the networking concepts covered are not something you can get your hands dirty with easily.

AWS Certified Advanced Networking – Specialty ANS-C01 Exam Content

  • AWS Certified Advanced Networking – Specialty (ANS-C01) exam focuses on the AWS Networking concepts. It basically validates
    • Design and develop hybrid and cloud-based networking solutions by using AWS
    • Implement core AWS networking services according to AWS best practices
    • Operate and maintain hybrid and cloud-based network architecture for all AWS services
    • Use tools to deploy and automate hybrid and cloud-based AWS networking tasks
    • Implement secure AWS networks using AWS native networking constructs and services

Refer to AWS Certified Advanced Networking – Specialty Exam Guide for details

                              AWS Certified Advanced Networking – Specialty ANS-C01 Exam Domains

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Resources

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answers options have a lot of prose and a lot of reading that needs to be done, so be sure you are prepared and manage your time well.
  • ANS-C01 exam has 65 questions to be solved in 170 minutes which gives you roughly 2 1/2 minutes to attempt each question. The 65 questions consist of 50 scored and 15 unscored questions.
  • ANS-C01 exam includes two types of questions, multiple-choice and multiple-response.
  • ANS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Each question mainly touches multiple AWS services.
  • Specialty exams currently cost $ 300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all.
  • As always, having a rough architecture or mental picture of the setup helps focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • AWS exams can be taken either remotely or online, I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Topics

  • AWS Certified Networking – Specialty (ANS-C01) exam focuses a lot on Networking concepts involving Hybrid Connectivity with Direct Connect, VPN, Transit Gateway, Direct Connect Gateway, and a bit of VPC, Route 53, ALB, NLB & CloudFront.

Networking & Content Delivery

  • Virtual Private Cloud – VPC
    • Understand VPC, Subnets
    • AWS allows extending the VPC by adding secondary CIDR blocks.
    • Understand Security Groups, NACLs
    • VPC Flow Logs
      • help capture information about the IP traffic going to and from network interfaces in the VPC and can help in monitoring the traffic or troubleshooting any connectivity issues
      • NACLs are stateless and how it is reflected in VPC Flow Logs
        • An ACCEPT followed by a REJECT means inbound traffic was accepted by Security Groups and NACLs, but the response was rejected by the NACL outbound (NACLs being stateless).
        • A single REJECT means inbound traffic was rejected by either Security Groups or NACLs.
      • Use pkt-dstaddr instead of dstaddr to track the destination address as dstaddr refers to the primary ENI address always and not the secondary addresses.
      • Pattern: VPC Flow Logs -> CloudWatch Logs -> (Subscription) -> Kinesis Data Firehose -> S3/OpenSearch (see the flow logs sketch after this list).
    • DHCP Option Sets esp. how to resolve DNS from both on-premises data center and AWS.
    • VPC Peering
      • helps point-to-point connectivity between 2 VPCs which can be in the same or different regions and accounts.
      • know VPC Peering Limitations esp. it does not allow overlapping CIDRs and transitive routing.
    • Placement Groups determine how the instances are placed on the underlying hardware
    • VRF – Virtual Routing & Forwarding can be used to route traffic to the same customer gateway from multiple VPCs, that can be overlapping.
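
A minimal sketch of enabling VPC Flow Logs to CloudWatch Logs with boto3 (all identifiers are illustrative); the log group can then be subscribed to Kinesis Data Firehose per the pattern above:

```python
import boto3

ec2 = boto3.client("ec2")

# Capture ACCEPT and REJECT records for the whole VPC into CloudWatch Logs.
ec2.create_flow_logs(
    ResourceIds=["vpc-0abc1234"],
    ResourceType="VPC",
    TrafficType="ALL",
    LogGroupName="/vpc/flow-logs",
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",
)
```
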
  • VPC Endpoints
    • VPC Gateway Endpoints for connectivity with S3 & DynamoDB i.e. VPC -> VPC Gateway Endpoints -> S3/DynamoDB.
    • VPC Interface Endpoints or Private Links for other AWS services and custom hosted services i.e. VPC -> VPC Interface Endpoint OR Private Link -> S3/Kinesis/SQS/CloudWatch/Any custom endpoint.
    • S3 gateway endpoints cannot be accessed through VPC Peering, VPN, or Direct Connect. Need HTTP proxy to route traffic.
    • S3 Private Link can be accessed through VPC Peering, VPN, or Direct Connect. Need to use an endpoint-specific DNS name.
    • VPC endpoint policies can be configured to control which S3 buckets can be accessed, and the S3 bucket policy can be used to control which VPC (includes all VPC endpoints) or specific VPC endpoint can access it (see the policy sketch after this list).
    • Private Link Patterns
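
A minimal sketch of the bucket-policy side, restricting a bucket to a single VPC endpoint via the aws:SourceVpce condition (bucket and endpoint IDs are illustrative):

```python
# Deny all S3 access unless the request arrives through the named endpoint.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
        "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-0ab12cd34"}},
    }],
}
```
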
  • VPC Network Access Analyzer
    • helps identify unintended network access to the resources on AWS.
  • Transit Gateway
    • helps consolidate the AWS VPC routing configuration for a region with a hub-and-spoke architecture.
    • Appliance Mode ensures that network flows are symmetrically routed to the same AZ and network appliance
    • Transit Gateway Connect attachment can be used to connect SD-WAN to AWS Cloud. This supports GRE.
    • Transit Gateways are regional and Peering can connect Transit Gateways across regions.
    • Transit Gateway Network Manager includes events and metrics to monitor the quality of the global network, both in AWS and on-premises.
  • VPC Routing Priority
  • NAT Gateways
    • for HA, Scalable, Outgoing traffic. Does not support Security Groups or ICMP pings.
    • times out the connection if it is idle for 350 seconds or more. To prevent the connection from being dropped, initiate more traffic over the connection or enable TCP keepalive on the instance with a value of less than 350 seconds (see the keepalive sketch after this list).
    • supports Private NAT Gateways for internal communication.
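
A minimal sketch of enabling TCP keepalive below the 350-second NAT Gateway idle timeout (the keepalive socket options shown are Linux-specific):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
# Start probing after 300s idle -- below the 350s NAT Gateway timeout.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 300)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)  # probe interval
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # probes before giving up
s.connect(("example.com", 443))
```
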
  • Virtual Private Network
    • to establish connectivity between the on-premises data center and AWS VPC
  • Direct Connect
    • to establish connectivity between the on-premises data center and AWS VPC and Public Services
    • Direct Connect connections – Dedicated and Hosted connections
    • Understand how to create a Direct Connect connection
      • LOA-CFA provides the details for partners to connect to the AWS Direct Connect location
    • Virtual interfaces options – Private Virtual Interface for VPC resources and Public Virtual Interface for Public Resources
      • Private VIF is for resources within a VPC
      • Public VIF is for AWS public resources
      • Private VIF has a limit of 100 routes and Public VIF of 1000 routes. Summarize the routes if you need to configure more.
    • Understand setup Private and Public VIF
    • Understand High Availability options based on cost and time i.e. Second Direct Connect connection OR VPN connection
    • Direct Connect Gateway
      • provides a way to connect to multiple VPCs from an on-premises data center using the same Direct Connect connection.
      • can connect to VGW or TGW.
    • Understand Active/Passive Direct Connect 
    • supports MACsec which delivers native, near line-rate, point-to-point encryption ensuring that data communications between AWS and the data center, office, or colocation facility remain protected.
    • Understand Route Propagation, propagation priority, BGP connectivity
      • BGP prefers the shortest AS PATH to get to the destination. Traffic from the VPC to on-premises uses the primary router. This is because the secondary router advertises a longer AS-PATH.
      • AS PATH prepending doesn’t work when the Direct Connect connections are in different AWS Regions than the VPC.
      • AS PATH works from AWS to on-premises and Local Pref from on-premises to AWS
      • Use Local Preference BGP community tags to configure Active/Passive when the connections are from different regions. A higher tag has a higher preference: 7224:7300 (high) > 7224:7200 (medium) > 7224:7100 (low).
      • NO_EXPORT works only for Public VIFs
      • 7224:9100, 7224:9200, and 7224:9300 apply only to public prefixes. Usually used to restrict traffic to regions. Can help control if routes should propagate to the local Region only, all Regions within a continent, or all public Regions.
        • 7224:9100 — Local AWS Region
        • 7224:9200 — All AWS Regions within a continent (e.g., North America–wide; Asia Pacific; or Europe, the Middle East, and Africa)
        • 7224:9300 — Global (all public AWS Regions)
      • 7224:8100 — Routes that originate from the same AWS Region in which the AWS Direct Connect point of presence is associated.
      • 7224:8200 — Routes that originate from the same continent with which the AWS Direct Connect point of presence is associated.
      • No-tag — Global (all public AWS Regions).
  • Route 53
    • provides a highly available and scalable DNS web service.
    • Routing Policies and their use cases. Focus on Weighted, Latency, and Failover routing policies.
    • supports Alias resource record sets, which enables routing of queries to a CloudFront distribution, Elastic Beanstalk, ELB, an S3 bucket configured as a static website, or another Route 53 resource record set.
    • CNAME does not support zone apex or root records. 
    • Route 53 DNSSEC
      • secures DNS traffic, and helps protect a domain from DNS spoofing man-in-the-middle attacks. 
      • Requirements
        • Asymmetric Customer Managed Keys
        • us-east-1 with the ECC_NIST_P256 key spec (see the key creation sketch after this list)
    • Route 53 Resolver DNS Firewall
      • protection for outbound DNS requests from the VPCs and can monitor and control the domains that the applications can query.
      • allows you to define allow and deny list.
      • can be used to prevent DNS exfiltration.
      • supports FirewallFailOpen configuration which determines how Route 53 Resolver handles queries during failures.
        • disabled, favors security over availability and blocks queries that it is unable to evaluate properly.
        • enabled, favors availability over security and allows queries to proceed if it is unable to properly evaluate them.
    • Route 53 Resolver (Hybrid DNS)
      • Inbound Endpoint for On-premises -> AWS
      • Outbound Endpoint for AWS -> On-premises
    • Route 53 DNS Query Logging
      • Can be logged to CloudWatch logs, S3, and Kinesis Data Firehose
    • Route 53 Resolver rules take precedence over privately hosted zones.
    • Route 53 Split View DNS helps to have the same DNS to access a site externally and internally
    • Know the Domain Migration process
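
A minimal sketch of creating the KMS key that DNSSEC signing requires, per the requirements above (the description is illustrative):

```python
import boto3

# The key must be an asymmetric signing key in us-east-1 with ECC_NIST_P256.
kms = boto3.client("kms", region_name="us-east-1")
key = kms.create_key(
    KeySpec="ECC_NIST_P256",
    KeyUsage="SIGN_VERIFY",
    Description="Route 53 DNSSEC key-signing key",  # illustrative
)
print(key["KeyMetadata"]["Arn"])
```
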
  • CloudFront
    • provides a fully managed, fast CDN service that speeds up the distribution of static, dynamic web, or streaming content to end-users.
    • supports geo-restriction, WAF & AWS Shield for protection.
    • provides CloudFront Functions (run at edge locations) & Lambda@Edge (run at regional edge caches) to execute scripts closer to the user.
    • supports encryption at rest and end-to-end encryption
    • CloudFront Origin Shield
      • helps improve the cache hit ratio and reduce the load on the origin.
      • requests from other regional caches would hit the Origin shield rather than the Origin.
      • should be placed at the regional cache and not in the edge cache
      • should be deployed to the region closer to the origin server
  • Global Accelerator
    • provides 2 static IPs
    • does not support client IP address preservation for NLB and Elastic IP address endpoints.
    • does not support IPv6 addresses
    • know CloudFront vs Global Accelerator
  • Understand ELB, ALB and NLB
    • Differences between ALB and NLB
    • ALB provides Content, Host, and Path-based Routing while NLB provides the ability to have a static IP address
    • Maintain original Client IP to the backend instances using X-Forwarded-for and Proxy Protocol
    • ALB/NLB do not support TLS renegotiation or mutual TLS authentication (mTLS). For implementing mTLS, use NLB with TCP listener on port 443 and terminate on the instances.
    • NLB
      • also provides local zonal endpoints to keep the traffic within AZ
      • can front Private Link endpoints and provide static IPs.
    • ALB supports Forward Secrecy, through Security Policies, that provide additional safeguards against the eavesdropping of encrypted data, through the use of a unique random session key.
    • Supports sticky session feature (session affinity) to enable the LB to bind a user’s session to a specific target. This ensures that all requests from the user during the session are sent to the same target. Sticky Sessions is configured on the target groups.
  • Gateway Load Balancer – GWLB
    • helps deploy, scale, and manage virtual appliances, such as firewalls, IDS/IPS systems, and deep packet inspection systems.
  • Athena integrates with S3 only and not with CloudWatch logs.
  • Transit VPC
    • helps connect multiple, geographically dispersed VPCs and remote networks to create a global network transit center.
    • Use Transit Gateway instead now.
  • Know CloudHub and its use case

Security

  • AWS GuardDuty
    • managed threat detection service
    • provides Malware protection
  • AWS Shield
    • managed DDoS protection service
    • AWS Shield Advanced provides 24×7 access to the AWS Shield Response Team (SRT), protection against DDoS-related spike, and DDoS cost protection to safeguard against scaling charges.
  • WAF as Web Traffic Firewall
    • helps protect web applications from attacks by allowing rules configuration that allow, block, or monitor (count) web requests based on defined conditions.
    • integrates with CloudFront, ALB, API Gateway to dynamically detect and prevent attacks
  • Network Firewall
  • AWS Inspector
    • is a vulnerability management service that continuously scans the AWS workloads for vulnerabilities

Monitoring & Management Tools

  • Understand AWS CloudFormation esp. in terms of Network creation.
    • Custom resources can be used to handle activities not supported by AWS
    • While configuring VPN connections, use the DependsOn attribute to declare a dependency on the VPC-gateway attachment, as VPN gateway route propagation depends on a VPC-gateway attachment when you have a VPN gateway.
  • AWS Config
    • fully managed service that provides AWS resource inventory, configuration history, and configuration change notifications to enable security, compliance, and governance.
    • can be used to monitor resource changes e.g. Security Groups and invoke Systems Manager Automation scripts for remediation.
  • CloudTrail for audit and governance

Integration Tools

Networking Architecture Patterns

AWS Certified Advanced Networking – Specialty (ANS-C01) Exam Day

  • Make sure you are relaxed and get a good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure you have your desk clear, no hand-watches, or external monitors, keep your phones away, and nobody can enter the room.

Finally, All the Best 🙂

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Learning Path

AWS Certified Alexa Skill Builder - Specialty Certificate

Finally All Down for AWS (for now) …

Continuing on my AWS journey with the last AWS certification, I took another step by clearing the AWS Certified Alexa Skill Builder – Specialty (AXS-C01) certification. It is amazing to know and learn how Voice first experiences are making an impact and changing how we think about technology and use cases.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) exam basically validates your ability to build, test, publish and certify Alexa skills.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Summary

  • AWS Certified Alexa Skill Builder – Specialty exam focuses only on Alexa and how to build skills.
  • AWS Certified Alexa Skill Builder – Specialty exam has 65 questions with a time limit of 170 minutes
  • Compared to the other professional and specialty exams, the question and answers are not long and similar to associate exams. So if you are prepared well, it should not need the 170 minutes.
  • As the exam was online from home, there was no access to paper and pen, but the trick remains the same: read the question, draw a rough architecture, and focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach the right answer or at least have a 50% chance of getting it right.


AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Topic Summary

Refer AWS Alexa Cheat Sheet

Domain 1: Voice-First Design Practices and Capabilities

1.1 Describe how users interact with skills

1.2 Map features and capabilities to use cases

  • Alexa supports display cards to display text (Simple card) and text with image (Standard card)
  • Alexa Skills Kit supports APIs
    • Alexa Settings APIs allow developers to retrieve customer preferences for the settings like time zone, distance measuring unit, and temperature measurement unit 
    • Device services – a skill can request the customer’s permission to access their address information, which is static data filled in by the customer and includes the country/region, postal code, and full address
    • Customer Profile services – a skill can request the customer’s permission to access their contact information, which includes name, email address, and phone number
    • With Location services, a skill can ask a user’s permission to obtain the real-time location of their Alexa-enabled device, specifically at the time of the user’s request to Alexa, so that the skill can provide enhanced services.
  • Alexa Skill Kit APIs need apiAccessToken and deviceId to access the ASK APIs
  • Progressive Response API allows you to keep the user engaged while the skill prepares a full response to the user’s request.
  • Personalization can be provided using userId and state persistence

Domain 2: Skill Design

2.1 Design and develop an interaction model

  • Alexa interaction model includes skill, Invocation name, utterances, slots, Intents
  • A skill is ‘an app for Alexa’; however, skills are not downloadable but just need to be enabled.
  • Wakeword – Amazon offers a choice of wake words like ‘Alexa’, ‘Amazon’, ‘Echo’, or ‘Computer’, with the default being ‘Alexa’.
  • Launch phrases include “run,” “start,” “play,” “resume,” “use,” “launch,” “ask,” “open,” “tell,” “load,” “begin,” and “enable.”
  • Connecting words include “to,” “from,” “in,” “using,” “with,” “about,” “for,” “that,” “by,” “if,” “and,” “whether.”
  • Invocation name
    • is the word or phrase used to trigger the skill for custom skills and the invocation name should adhere to the requirements
    • must not infringe upon the intellectual property rights of an entity or person
    • must be a compound of two or more words.
    • One-word invocation names are allowed only for brand/intellectual property.
    • must not include names of people or places
    • if two-word invocation names, one of the words cannot be a definite article (“the”), indefinite article (“a”, “an”) or preposition (“for”, “to”, “of,” “about,” “up,” “by,” “at,” “off,” “with”).
    • must not contain any of the Alexa skill launch phrases, connecting words and wake words
    • must contain only lower-case alphabetic characters, spaces between words, and possessive apostrophes
    • must spell out numbers, for e.g., twenty one instead of 21
    • can have periods in the invocation names containing acronyms or abbreviations that are pronounced as a series of individual letters, for e.g. NASA as n. a. s. a.
    • cannot spell out phonemes for e.g., a skill titled “AWS Facts” would need “AWS” represented as “a. w. s. ” and NOT “ay double u ess.”
    • must not create confusion with existing Alexa features.
    • must be written in each supported language
  • An intent is what a user is trying to accomplish.
    • Amazon provides standard built-in intents which can be extended
    • Intents need to have a unique utterance
  • Utterances are the specific phrases that people will use when making a request to Alexa.
  • A slot is a variable that relates to an intent allowing Alexa to understand information about the request
    • Amazon provides standard built-in slots which can be extended
  • Entity resolution improves the way Alexa matches possible slot values in a user’s utterance with the slots defined in your interaction model

2.2 Design a multi-turn conversation

  • Alexa Dialog management model identifies the prompts and utterances to collect, validate, and confirm the slot values and intents.
  • Alexa supports
    • Auto Delegation where Alexa completes all of the dialog steps based on the dialog model.
    • Manual delegation using Dialog.Delegate where Alexa sends the skill an IntentRequest for each turn of the conversation and provides more flexibility.
  • AMAZON.FallbackIntent will not be triggered in the middle of a dialog

2.3 Use built-in intents and slots

  • Standard built-in intents cannot include any slots. If slots are needed, create a custom intent and write your own sample utterances.
  • Alexa recommends using and extending standard built-in intents like AMAZON.HelpIntent and AMAZON.YesIntent with additional utterances as per the skill requirements
  • Alexa provides AMAZON.FallbackIntent for handling any unmatched utterances and can be used to improve the interaction model accuracy.
  • Alexa provides slots which help capture variables and can either be Amazon predefined slots such as dates, numbers, durations, time, etc. or custom ones specific to the skill
  • Predefined slots can be extended to add additional values

2.4 Handle unexpected conversational requests or responses

  • Alexa provides AMAZON.FallbackIntent for handling any unmatched utterances and can be used to improve the interaction model accuracy.
  • Alexa also provides Intent History, which provides a consolidated view with aggregated, anonymized frequent utterances and the resolved intents. These can be used to map the utterances to the correct intents.

2.5 Design multi-modal skills using one or more service interfaces (for example, audio, video, and gadgets)

  • Alexa-enabled devices with a screen handle Page and Scroll intents; do not handle Next and Previous.
  • Alexa skill with AudioPlayer interface
    • must handle AMAZON.ResumeIntent and AMAZON.PauseIntent
    • PlaybackController events to track AudioPlayer status changes initiated from the device buttons

Domain 3: Skill Architecture

3.1 Identify AWS services for extending Alexa skill functionality (Amazon CloudFront, Amazon S3, Amazon CloudWatch, and Amazon DynamoDB)

  • Focus on standard skill architecture using Lambda for the backend, DynamoDB for persistence, S3 for serving static assets, and CloudWatch for monitoring and logs.
  • Lambda provide serverless handling for the Alexa requests, but remember the following limits
    • default concurrency soft limit of 1000 can be increased by raising a support request
    • default timeout of 3 secs, which should be increased to at least 7 secs to be in line with the Alexa timeout of 8 secs
    • default memory of 128 MB, increase to improve performance (see the configuration sketch after this list)
  • S3 performance can be improved by exposing it through CloudFront esp. for images, audio and video files
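
A minimal sketch of raising those defaults with boto3 (the function name is illustrative):

```python
import boto3

# Stay within Alexa's 8-second limit while leaving headroom for cold starts.
boto3.client("lambda").update_function_configuration(
    FunctionName="alexa-skill-backend",  # illustrative
    Timeout=7,
    MemorySize=256,
)
```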

3.2 Use AWS Lambda to build Alexa skills

  • Lambda integrates with CloudWatch to provide logs and should be the first thing to check in case of any issues or errors.
  • Alexa allows any HTTPS endpoint to act as a backend, but it needs to meet the following requirements
    • must be accessible over the internet.
    • must accept HTTP requests on port 443.
    • must support HTTP over SSL/TLS, using an Amazon-trusted certificate.

3.3 Follow AWS and Alexa security and privacy best practices

  • Alexa requires the backend to verify that incoming requests come from Alexa using Skill ID verification
  • Child-directed skills cannot use personal and location information
  • Skills cannot be used to capture health information
  • Alexa Skills Kit uses the OAuth 2.0 authentication framework for Account linking, which defines a means by which the service can allow Alexa, with the user’s permission, to access information from the account that the user has set up with you.
  • Alexa smart home skills must have an OAuth authorization code grant implementation, while custom skills can have an authorization code grant or implicit grant implementation.

Domain 4: Skill Development

4.1 Implement in-skill purchasing and Amazon Pay for Alexa Skills

  • In-skill purchasing enables selling premium content such as game features and interactive stories in skills with a custom interaction model.
  • In-skill purchasing is handled by Alexa when the skill sends an Upsell directive. As the skill session ends when an Upsell directive is sent, be sure to save any relevant user data in a persistent data store so that the skill can continue where the user left off after the purchase flow is completed and the endpoint is back in control of the user experience.
  • Skill can handle the Connections.Response request that indicates the result of a purchase flow and resume the skill

4.2 Use Speech Synthesis Markup Language (SSML) for expression and MP3 audio

  • SSML is a markup language that provides a standard way to mark up text for the generation of synthetic speech.
  • Alexa supports a subset of SSML tags including
    • say-as to interpret text as telephone, date, time etc.
    • phoneme provides a phonemic/phonetic pronunciation
    • prosody modifies the volume, pitch, and rate of the tagged speech.
    • audio allows playing an MP3 file while rendering a response
      • must be in valid MP3 file (MPEG version 2) format
      • must be hosted at an Internet-accessible HTTPS endpoint.
      • For speech response, the audio file cannot be longer than 240 seconds.
        • combined total time for all audio files in the outputSpeech property of the response cannot be more than 240 seconds.
        • combined total time for all audio files in the reprompt property of the response cannot be more than 90 seconds.
      • bit rate must be 48 kbps.
      • sample rate must be 22050Hz, 24000Hz, or 16000Hz.
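
A minimal sketch of returning SSML instead of plain text, using a couple of the tags above:

```python
# outputSpeech of type SSML replaces the PlainText variant in the response.
ssml = (
    "<speak>"
    "Your code is <say-as interpret-as='digits'>1234</say-as>. "
    "<prosody rate='slow'>Please write it down.</prosody>"
    "</speak>"
)
response = {"outputSpeech": {"type": "SSML", "ssml": ssml}}
```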

4.3 Implement state management

  • Alexa Skill state persistence can be handled using session attributes during the session and externally using services like DynamoDB, RDS across sessions.
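
A minimal sketch of cross-session persistence in DynamoDB keyed by userId (the table name is illustrative):

```python
import boto3

table = boto3.resource("dynamodb").Table("skill_state")  # illustrative table name

def save_state(user_id, attributes):
    # Persist the session attributes so the next session can resume.
    table.put_item(Item={"userId": user_id, "attributes": attributes})

def load_state(user_id):
    item = table.get_item(Key={"userId": user_id}).get("Item")
    return item["attributes"] if item else {}
```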

4.4 Implement Alexa service interfaces (audio player, video player, and screens)

4.5 Parse Alexa JSON requests and provide responses

  • All requests include the session (optional), context, and request objects at the top level.
    • session object provides additional context associated with the request.
      • session attributes can be used to store data
      • user object contains userId to uniquely identify a user and accessToken to access other services.
    • context object provides the current state of the Alexa service and the device.
      • system object provides apiAccessToken and the device object provides deviceId to access the ASK APIs
      • application provides the applicationId
      • device object provides supportedInterfaces to list each interface that the device supports
    • request object provides the details of the user’s request.
  • Response includes
    • outputSpeech contains the speech to render to the user.
    • reprompt contains the outputSpeech to use if a re-prompt is necessary.
    • shouldEndSession provides a boolean value that indicates what should happen after Alexa speaks the response.
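
A minimal sketch of a raw Lambda handler (no ASK SDK) that parses the request object and builds a response in the shape described above:

```python
def lambda_handler(event, context):
    request_type = event["request"]["type"]

    if request_type == "LaunchRequest":
        speech = "Welcome to the demo skill."  # illustrative skill dialog
    elif request_type == "IntentRequest":
        speech = f"You invoked {event['request']['intent']['name']}."
    else:
        speech = "Goodbye."

    return {
        "version": "1.0",
        "sessionAttributes": event.get("session", {}).get("attributes", {}),
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": request_type != "LaunchRequest",
        },
    }
```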

Domain 5: Test, Validate, and Troubleshoot

5.1 Debug and troubleshoot using Amazon CloudWatch or other tools

  • Lambda integrates with CloudWatch for metrics and logs, which should be the first thing to check in case of any issues or errors.

5.2 Use the Alexa developer testing tools

  • Utterance profiles – test utterances to know what intent they resolve to 
  • Alexa Skill simulator
    • provides an ability to Interact with Alexa with either your voice or text, without an actual device.
    • maintains the skill session, so the interaction model and dialog flow can be tested.
    • supports multiple languages testing by selecting locale
    • has limitations in testing audio, video, Alexa settings and Device API
  • Manual Json
    • enter a JSON request directly and see the skill returned JSON response
    • does not maintain the skill session and is similar to testing a JSON request in the Lambda console.
  • Voice & Tone – enter plain text or SSML and hear how Alexa speaks the text in a selected language
  • Alexa device – test with an Alexa-enabled device.
  • Alexa app – test the skill with the Alexa app for Android/iOS
  • Lambda Test console – to test Lambda functions

5.3 Perform beta testing

  • Skill beta testing tool can be used to test the Alexa skill in beta before releasing it to production
  • Beta testing allows testing changes to an existing skill, while still keeping the currently live version of the skill available to the general public.
  • Members can be invited using their Alexa email address. Alexa device used by the beta tester must be associated with the email address in the tester’s invitation.

5.4 Troubleshoot errors in the interaction model

Domain 6: Publishing, Operations, and Lifecycle Management

6.1 Describe the skill publishing process

  • Alexa skill needs to go through certification process before the Skill is live and made available to the users
  • Once a skill becomes live, Alexa creates an “in development” version of the skill
  • Alexa Skill live version cannot be edited, and it is recommended to edit the in-development version, test, and then re-certify for publishing.
  • Backend changes like changes in Lambda functions or response output from the function, however, can be made on the live version and do not require re-certification. However, it is recommended to use Lambda versioning or aliases to make such changes.
  • Alexa for Business allows skill to be made private and available to select users within the company

6.2 Add and remove users in the developer console

  • Alexa Skill Developer console access can be shared across multiple users for collaboration
  • Administrator and Analyst roles will also have access to the Earnings and Payments sections.
  • Administrator and Marketer roles will also have access to edit the content associated with apps (i.e. Descriptions, Images & Multimedia) and IAPs
  • Administrator and Developer roles will have access to create, modify and delete Alexa skills using ASK CLI and SMAPI.
  • Administrator, Analyst and Marketer roles have access to sales report

6.3 Perform analysis of skill analytics in the developer console

  • Intent History – View aggregated, anonymized frequent utterances and the resolved intents. You cannot track the user intent history as they are anonymized.
  • Actions – Unique customers per action, total actions, and total utterances per action.
  • Customers – Total number of unique customers who accessed the skill.
  • Intents – Unique customers per intent, total utterances per intent, total intents, and failed intents.
  • Interaction Path – Paths users take when interacting with the skill.
  • Plays – Total number of times that a user played the skill content.
  • Retention (live skills only) – Usage of the skill over time by groups of customers or cohorts. View the number or percentage of customers who returned to your skill over a 12-week period.
  • Sessions – Total sessions, successful session types (sessions that didn’t end due to an error), and average sessions per customer. Includes a breakdown of successful, failed, and no-response sessions as a percentage of total sessions.
  • Utterances – Metrics for utterances depend on the skill category.

6.4 Differentiate among the statuses/versions of skills (for example, In Development, In Certification, and Live)

  • In Development – skill available for development, testing
  • In Review – A certification review is in progress and the skill cannot be edited
  • Certified – Skill passed certification review, and is not yet available to users
  • Live – skill has been published and is available to users. You cannot edit the configuration for live skills
  • Hidden – skill was previously published, but has since been hidden. Existing users can access the skill. New users cannot discover the skill.
  • Removed – skill was previously published, but has since been removed. Users cannot enable or use the skill.

AWS Certified Alexa Skill Builder – Specialty (AXS-C01) Exam Resources

AWS Certified Big Data -Speciality (BDS-C00) Exam Learning Path

Clearing the AWS Certified Big Data – Speciality (BDS-C00) was a great feeling. This was my third Speciality certification and in terms of the difficulty level (compared to the Network and Security Speciality exams), I would rate it between Network (the toughest) and Security (the simpler one).

Big Data in itself is a very vast topic and with AWS services, there is a lot to cover and know for the exam. If you have worked on Big Data technologies, including a bit of visualization and machine learning, it would be a great asset to pass this exam.

AWS Certified Big Data – Speciality (BDS-C00) exam basically validates

  • Implement core AWS Big Data services according to basic architectural best practices
  • Design and maintain Big Data
  • Leverage tools to automate Data Analysis

Refer AWS Certified Big Data – Speciality Exam Guide for details

                              AWS Certified Big Data – Speciality Domains

AWS Certified Big Data – Speciality (BDS-C00) Exam Summary

  • AWS Certified Big Data – Speciality exam, as its name suggests, covers a lot of Big Data concepts right from data transfer and collection techniques, storage, pre and post processing, analytics, visualization with the added concepts for data security at each layer.
  • One of the key tactics I followed when solving any AWS certification exam is to read the question and use paper and pencil to draw a rough architecture and focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Whitepapers and articles
    • Analytics
      • Make sure you know and cover all the services in depth, as 80% of the exam is focused on these topics
      • Elastic Map Reduce
        • Understand EMR in depth
        • Understand EMRFS (hint: Use Consistent view to make sure S3 objects referred by different applications are in sync)
        • Know EMR Best Practices (hint: start with many small nodes instead of a few large nodes)
        • Know Hive can be externally hosted using RDS, Aurora and AWS Glue Data Catalog
        • Know also different technologies
          • Presto is a fast SQL query engine designed for interactive analytic queries over large datasets from multiple sources
          • D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS
          • Spark is a distributed processing framework and programming model that helps do machine learning, stream processing, or graph analytics using Amazon EMR clusters
          • Zeppelin/Jupyter as a notebook for interactive data exploration and provides open-source web application that can be used to create and share documents that contain live code, equations, visualizations, and narrative text
          • Phoenix is used for OLTP and operational analytics, allowing you to use standard SQL queries and JDBC APIs to work with an Apache HBase backing store
      • Kinesis
        • Understand Kinesis Data Streams and Kinesis Data Firehose in depth
        • Know Kinesis Data Streams vs Kinesis Firehose
          • Know Kinesis Data Streams is open-ended on both the producer and consumer side. It supports KCL and works with Spark.
          • Know Kinesis Firehose is open-ended for the producer only. Data is stored in S3, Redshift, and ElasticSearch.
          • Kinesis Firehose works in batches with a minimum 60-second interval.
        • Understand Kinesis Encryption (hint: use server side encryption or encrypt in producer for data streams)
        • Know the difference between KPL vs SDK (hint: PutRecords is synchronous, while KPL supports batching; see the sketch after this list)
        • Kinesis Best Practices (hint: increase performance by increasing the number of shards)
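
A minimal sketch of the plain SDK path, a synchronous per-record put as opposed to KPL batching (the stream name is illustrative):

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# Synchronous put; the partition key determines the target shard, so more
# shards (and well-distributed keys) increase throughput.
kinesis.put_record(
    StreamName="clickstream",  # illustrative
    Data=json.dumps({"event": "click"}).encode("utf-8"),
    PartitionKey="user-42",
)
```
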
      • Know ElasticSearch is a search service which supports indexing, full text search, faceting etc.
      • Redshift
        • Understand Redshift in depth
        • Understand Redshift Advance topics like Workload Management, Distribution Style, Sort key
        • Know Redshift Best Practices w.r.t selection of Distribution style, Sort key, COPY command which allows parallelism
        • Know Redshift views to control access to data.
      • Amazon Machine Learning
      • Know Data Pipeline for data transfer
      • QuickSight
      • Know Glue as the ETL tool
    • Security, Identity & Compliance
    • Management & Governance Tools
      • Understand AWS CloudWatch for Logs and Metrics. Also, CloudWatch Events provides more real-time alerts as compared to CloudTrail
    • Storage
    • Compute
      • Know EC2 access to services using IAM Role and Lambda using Execution role.
      • Lambda esp. how to improve performance via batching, breaking up functions, etc.

AWS Certified Big Data – Speciality (BDS-C00) Exam Resources

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Learning Path

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Learning Path

I recently cleared the AWS Certified Advanced Networking – Speciality (ANS-C00), which was my first, en route my path to the AWS Speciality certifications. Frankly, I feel the time I gave for preparation was still not enough, but I just about managed to get through. So a word of caution: this exam is on par with or tougher than the professional exams, especially because the networking concepts it covers are not something you can get your hands dirty with easily.

AWS Certified Advanced Networking – Speciality (ANS-C00) exam focuses on the AWS Networking concepts. It basically validates

  • Design, develop, and deploy cloud-based solutions using AWS
  • Implement core AWS services according to basic architecture best practices
  • Design and maintain network architecture for all AWS services
  • Leverage tools to automate AWS networking tasks

Refer to AWS Certified Advanced Networking – Speciality Exam Guide

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Resources

AWS Certified Advanced Networking – Speciality (ANS-C00) Exam Summary

  • AWS Certified Advanced Networking – Speciality exam covers a lot of Networking concepts like VPC, VPN, Direct Connect, Route 53, ALB, NLB.
  • One of the key tactics I followed when solving the questions was to read the question and use paper and pencil to draw a rough architecture and focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • Be sure to cover the following topics
    • Networking & Content Delivery
      • You should know everything in Networking.
      • Understand VPC in depth
      • Virtual Private Network to establish connectivity between on-premises data center and AWS VPC
      • Direct Connect to establish connectivity between on-premises data center and AWS VPC and Public Services
        • Make sure you understand Direct Connect in detail, without this you cannot clear the exam
        • Understand Direct Connect connections – Dedicated and Hosted connections
        • Understand how to create a Direct Connect connection (hint: LOA-CFA provides the details for the partner to connect to the AWS Direct Connect location)
        • Understand virtual interfaces options – Private Virtual Interface for VPC resources and Public Virtual Interface for Public resources
        • Understand setup Private and Public VIF
        • Understand Route Propagation, propagation priority, BGP connectivity
        • Understand High Availability options based on cost and time i.e. Second Direct Connect connection OR VPN connection
        • Understand Direct Connect Gateway – it provides a way to connect to multiple VPCs from on-premises data center using the same Direct Connect connection
      • Route 53
        • Understand Route 53 and Routing Policies and their use cases Focus on Weighted, Latency routing policies
        • Understand Route 53 Split View DNS to have the same DNS to access a site externally and internally
      • Understand CloudFront and use cases
      • Load Balancer
        • Understand ELB, ALB and NLB 
        • Understand the difference between ELB, ALB and NLB esp. ALB provides Content, Host and Path based Routing while NLB provides the ability to have a static IP address
        • Know how to design VPC CIDR block with NLB (Hint – minimum number of IPs required are 8)
        • Know how to pass original Client IP to the backend instances (Hint – X-Forwarded-for and Proxy Protocol)
      • Know WorkSpaces requirements and setup
    • Security
      • Know AWS GuardDuty as managed threat detection service
      • Know AWS Shield esp. the Shield Advanced option and the features it provides
      • Know WAF as Web Traffic Firewall – (Hint – WAF can be attached to your CloudFront, Application Load Balancer, API Gateway to dynamically detect and prevent attacks)