AWS Certified Machine Learning Engineer – Associate (MLA-C01) Exam Learning Path

AWS Certified Machine Learning Engineer - Associate (MLA-C01) Certificate

AWS Certified Machine Learning Engineer – Associate (MLA-C01) Exam Learning Path

🔄 2026 Update: The MLA-C01 exam guide and question pool were refreshed in Q1 2026 to reflect changes in AWS services, including the addition of Amazon Bedrock AgentCore. The AWS Certified Machine Learning – Specialty (MLS-C01) was retired on March 31, 2026, making MLA-C01 the primary AWS certification for ML practitioners.

  • Certified for the last pending AWS Certified Machine Learning Engineer – Associate (MLA-C01) certification, which was newly introduced on October 8, 2024, following its beta period.
  • Machine Learning Engineer – Associate exam validates knowledge to build, operationalize, deploy, and maintain machine learning (ML) solutions and pipelines by using the AWS Cloud.
  • Exam also validates a candidate’s ability to complete the following tasks:
    • Ingest, transform, validate, and prepare data for ML modeling.
    • Select general modeling approaches, train models, tune hyperparameters, analyze model performance, and manage model versions.
    • Choose deployment infrastructure and endpoints, provision compute resources, and configure auto scaling based on requirements.
    • Set up continuous integration and continuous delivery (CI/CD) pipelines to automate orchestration of ML workflows.
    • Monitor models, data, and infrastructure to detect issues.
    • Secure ML systems and resources through access controls, compliance features, and best practices.

Refer AWS Certified Machine Learning Engineer – Associate (MLA-C01) Exam Guide

AWS Certified Machine Learning Engineer – Associate (MLA-C01) Exam Summary

  • MLA-C01 exam consists of 65 questions (50 scored and 15 unscored) in 130 minutes, and the time is more than sufficient if you are well-prepared.
  • MLA-C01 exam covers four domains:
    • Domain 1: Data Preparation for Machine Learning (28%)
    • Domain 2: ML Model Development (26%)
    • Domain 3: Deployment and Orchestration of ML Workflows (22%)
    • Domain 4: ML Solution Monitoring, Maintenance, and Security (24%)
  • In addition to the usual types of multiple-choice and multiple-response questions, the MLA-C01 exams have introduced the following new types
    • Ordering: Has a list of 3-5 responses which you need to select and place in the correct order to complete a specified task.
    • Matching: Has a list of responses to match with a list of 3-7 prompts. You must match all the pairs correctly to receive credit for the question.
    • Case study: A case study presents a single scenario with multiple questions. Each question is evaluated independently, and credit is given for each correct answer.
  • MLA-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 720.
  • Associate exams currently cost $150 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • AWS exams can be taken either remotely or online, I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Machine Learning Engineer – Associate (MLA-C01) Exam Resources

AWS Certified Machine Learning Engineer – Associate (MLA-C01) Exam Topics

  • AWS Certified Machine Learning Engineer – Associate exam covers a lot of Machine Learning concepts in addition to the AWS ML Services.
  • AWS Certified Machine Learning exam covers the Machine Learning lifecycle, data collection, transformation, making it usable and efficient for Machine Learning, pre-processing data for Machine Learning, training and validation, and implementation.
  • With the Q1 2026 refresh, the exam now includes increased coverage of Amazon Bedrock (Knowledge Bases, Agents, Guardrails, AgentCore) and Generative AI workflows.

Machine Learning Concepts

  • Exploratory Data Analysis
    • Feature selection and Engineering
      • remove features that are not related to training
      • remove features that have the same values, very low correlation, very little variance, or a lot of missing values
      • Apply techniques like Principal Component Analysis (PCA) for dimensionality reduction i.e. reduce the number of features.
      • Apply techniques such as One-hot encoding and label encoding to help convert strings to numeric values, which are easier to process.
      • Apply Normalization i.e. values between 0 and 1 to handle data with large variance.
      • Apply feature engineering for feature reduction e.g. using a single height/weight feature instead of both features.
    • Handle Missing data
      • remove the feature or rows with missing data
      • impute using Mean/Median values – valid only for Numeric values and not categorical features also does not factor correlation between features
      • impute using k-NN, Multivariate Imputation by Chained Equation (MICE), Deep Learning – more accurate and helps factors correlation between features
    • Handle unbalanced data
      • Source more data
      • Oversample minority or Undersample majority
      • Data augmentation using techniques like Synthetic Minority Oversampling Technique (SMOTE).
  • Modeling
    • Know about Algorithms – Supervised, Unsupervised and Reinforcement and which algorithm is best suitable based on the available data either labelled or unlabelled.
      • Supervised learning trains on labeled data e.g. Linear regression. Logistic regression, Decision trees, Random Forests
      • Unsupervised learning trains on unlabelled data e.g. PCA, SVD, K-means
      • Reinforcement learning trained based on actions and rewards e.g. Q-Learning
    • Hyperparameters
      • are parameters exposed by machine learning algorithms that control how the underlying algorithm operates and their values affect the quality of the trained models
      • some of the common hyperparameters are learning rate, batch, epoch (hint: If the learning rate is too large, the minimum slope might be missed and the graph would oscillate If the learning rate is too small, it requires too many steps which would take the process longer and is less efficient)
  • Evaluation
    • Know difference in evaluating model accuracy
      • Use Area Under the (Receiver Operating Characteristic) Curve (AUC) for Binary classification
      • Use root mean square error (RMSE) metric for regression
    • Understand Confusion matrix
      • A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative is an outcome where the model correctly predicts the negative class.
      • A false positive is an outcome where the model incorrectly predicts the positive class. A false negative is an outcome where the model incorrectly predicts the negative class.
      • Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of total true positives- TP/(TP+FN) (hint: use this for cases like fraud detection, cost of marking non fraud as frauds is lower than marking fraud as non-frauds)
      • Specificity or TNR (True Negative Rate): Number of items correctly identified as negative out of total negatives- TN/(TN+FP) (hint: use this for cases like videos for kids, the cost of dropping few valid videos is lower than showing few bad ones)
    • Handle Overfitting problems
      • Simplify the model, by reducing the number of layers
      • Early Stopping – form of regularization while training a model with an iterative method, such as gradient descent
      • Data Augmentation
      • Regularization – technique to reduce the complexity of the model
      • Dropout is a regularization technique that prevents overfitting
      • Never train on test data

Machine Learning Services

SageMaker AI (formerly SageMaker)

Note: At re:Invent 2024, AWS rebranded Amazon SageMaker to Amazon SageMaker AI as part of the next-generation SageMaker platform that unifies data, analytics, and AI.

  • supports both File mode, Pipe mode, and Fast File mode
    • File mode loads all of the data from S3 to the training instance volumes VS Pipe mode streams data directly from S3
    • File mode needs disk space to store both the final model artifacts and the full training dataset. VS Pipe mode which helps reduce the required size for EBS volumes.
    • Fast File mode combines the ease of use of the existing File Mode with the performance of Pipe Mode.
  • Using RecordIO format allows algorithms to take advantage of Pipe mode when training the algorithms that support it.
  • supports Model tracking capability to manage up to thousands of machine learning model experiments
  • supports automatic scaling for production variants. Automatic scaling dynamically adjusts the number of instances provisioned for a production variant in response to changes in your workload
  • provides pre-built Docker images for its built-in algorithms and the supported deep learning frameworks used for training & inference
  • SageMaker Automatic Model Tuning
    • is the process of finding a set of hyperparameters for an algorithm that can yield an optimal model.
    • Best practices
      • limit the search to a smaller number as the difficulty of a hyperparameter tuning job depends primarily on the number of hyperparameters that Amazon SageMaker has to search
      • DO NOT specify a very large range to cover every possible value for a hyperparameter as it affects the success of hyperparameter optimization.
      • log-scaled hyperparameter can be converted to improve hyperparameter optimization.
      • running one training job at a time achieves the best results with the least amount of compute time.
      • Design distributed training jobs so that you get they report the objective metric that you want.
  • know how to take advantage of multiple GPUs (hint: increase learning rate and batch size w.r.t to the increase in GPUs)
  • Elastic Inference (deprecated April 2023, replaced by AWS Inferentia) — previously helped attach low-cost GPU-powered acceleration to EC2 and SageMaker instances for deep learning inference. Use AWS Inferentia (Inf1/Inf2 instances) or AWS Trainium for cost-effective ML acceleration.
  • SageMaker AI Inference options.
    • Real-time inference is ideal for online inferences that have low latency or high throughput requirements.
    • Serverless Inference is ideal for intermittent or unpredictable traffic patterns as it manages all of the underlying infrastructure with no need to manage instances or scaling policies.
    • Batch Transform is suitable for offline processing when large amounts of data are available upfront and you don’t need a persistent endpoint.
    • Asynchronous Inference is ideal when you want to queue requests and have large payloads with long processing times.
  • SageMaker AI Model deployment allows deploying multiple variants of a model to the same SageMaker endpoint to test new models without impacting the user experience
    • Production Variants
      • supports A/B or Canary testing where you can allocate a portion of the inference requests to each variant.
      • helps compare production variants’ performance relative to each other.
    • Shadow Variants
      • replicates a portion of the inference requests that go to the production variant to the shadow variant.
      • logs the responses of the shadow variant for comparison and not returned to the caller.
      • helps test the performance of the shadow variant without exposing the caller to the response produced by the shadow variant.
  • SageMaker Managed Spot training can help use spot instances to save cost and with Checkpointing feature can save the state of ML models during training
  • SageMaker Feature Store
    • helps to create, share, and manage features for ML development.
    • is a centralized store for features and associated metadata so features can be easily discovered and reused.
    • now supports Apache Iceberg table format, streaming ingestion, scalable batch ingestion, and fine-grained access control through AWS Lake Formation (2025 update).
  • SageMaker Debugger provides tools to debug training jobs and resolve problems such as overfitting, saturated activation functions, and vanishing gradients to improve the model’s performance.
  • SageMaker Model Monitor monitors the quality of SageMaker machine learning models in production and can help set alerts that notify when there are deviations in the model quality.
  • SageMaker Automatic Model Tuning helps find a set of hyperparameters for an algorithm that can yield an optimal model.
  • SageMaker Data Wrangler
    • reduces the time it takes to aggregate and prepare tabular and image data for ML from weeks to minutes.
    • Note: Data Wrangler has been integrated into Amazon SageMaker Canvas. The new Data Wrangler experience in SageMaker Canvas includes a natural language interface in addition to the visual interface for data exploration and transformation.
  • SageMaker Experiments is a capability of SageMaker that lets you create, manage, analyze, and compare machine learning experiments.
  • SageMaker Clarify helps improve the ML models by detecting potential bias and helping to explain the predictions that the models make.
  • SageMaker Model Governance is a framework that gives systematic visibility into ML model development, validation, and usage.
  • SageMaker Model Cards
    • helps document critical details about the ML models in a single place for streamlined governance and reporting.
    • helps capture key information about the models throughout their lifecycle and implement responsible AI practices.
  • SageMaker Autopilot
    • is an automated machine learning (AutoML) feature set that automates the end-to-end process of building, training, tuning, and deploying machine learning models.
    • Note: Autopilot UI has been migrated to Amazon SageMaker Canvas. Use SageMaker Canvas for no-code/low-code AutoML capabilities.
  • SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge.
  • SageMaker API and SageMaker Runtime support VPC interface endpoints powered by AWS PrivateLink that helps connect VPC directly to the SageMaker API or SageMaker Runtime using AWS PrivateLink without using an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
  • SageMaker managed warm pools retain and reuse provisioned infrastructure after the training job completion to reduce latency for repetitive workloads.
  • SageMaker supports Elastic File System (EFS) and FSx for Lustre file systems as data sources for training machine learning models.
  • SageMaker MLOps
    • ML Lineage Tracking creates and stores tracking information about the steps of a ML workflow from data preparation to model deployment that can help reproduce the workflow steps, track model and dataset lineage, and establish model governance and audit standards.
    • Model Registry provides a model catalog, helps manage model versions, associate metadata, manage model approval status, deploy models to production and share models with other users.
  • SageMaker AI Managed MLflow (New – 2024)
    • provides fully managed MLflow tracking servers for experiment tracking, model packaging, and model registry.
    • helps track multiple training runs as experiments, compare runs with visualizations, evaluate models, and register the best models.
    • models registered in MLflow are automatically registered to SageMaker Model Registry with an associated SageMaker Model Card.
  • SageMaker HyperPod (New – 2023/2024)
    • purpose-built infrastructure for training foundation models at scale, reducing training time by up to 40%.
    • efficiently distributes and parallelizes training workloads across hundreds or thousands of AI accelerators.
    • continuously checks for hardware problems, resolves them automatically, and ensures workloads recover without manual intervention.
    • supports Flexible Training Plans (re:Invent 2024) to meet training timelines and budgets.
    • supports checkpointless training with 80-93% reduction in recovery time.
  • SageMaker Unified Studio (New – re:Invent 2024)
    • a unified interface combining data preparation, ML model development, generative AI, and governance.
    • integrates SageMaker AI, Amazon Bedrock, analytics, and data governance into a single platform.
    • supports one-click onboarding with existing IAM roles and permissions.
  • SageMaker Canvas
    • a visual, no-code/low-code ML service that enables building, evaluating, and deploying production-ready models without writing code.
    • now integrates Data Wrangler and Autopilot capabilities.
    • supports petabyte-scale data preparation and time series forecasting (replacing Amazon Forecast).
    • supports fine-tuning foundation models via Amazon Bedrock integration.

SageMaker Ground Truth

  • provides automated data labeling using machine learning
  • helps build highly accurate training datasets for machine learning quickly using Amazon Mechanical Turk
  • provides annotation consolidation to help improve the accuracy of the data object’s labels. It combines the results of multiple worker’s annotation tasks into one high-fidelity label.
  • automated data labeling uses machine learning to label portions of the data automatically without having to send them to human workers

Amazon Bedrock (New – Critical for 2026 Exam)

  • Amazon Bedrock is a fully managed service providing access to foundation models (FMs) from Amazon and third-party providers through a unified API.
    • supports models from Amazon (Nova, Titan), Anthropic (Claude), Meta (Llama), Mistral, Cohere, and others.
    • provides serverless experience — no infrastructure to manage.
  • Bedrock Knowledge Bases
    • enables RAG (Retrieval-Augmented Generation) by grounding FM responses in enterprise data.
    • automatically chunks documents, creates embeddings, and stores them in a vector database.
    • supports vector stores including Amazon OpenSearch Serverless, Amazon Aurora, Pinecone, and Redis Enterprise.
    • Managed Knowledge Base (GA June 2026) — fully managed RAG service without managing vector databases or data pipelines.
    • integrates with Amazon Kendra GenAI Index for enhanced semantic retrieval.
  • Bedrock Agents
    • orchestrate multi-step generative AI workflows by connecting FMs to APIs and data sources.
    • automatically break down tasks, create orchestration plans, and execute actions.
    • support action groups (Lambda functions) and knowledge base integration.
  • Bedrock AgentCore (New – 2025/2026)
    • a dedicated runtime for production-grade AI agents with enterprise-grade security.
    • enables rapid deployment and scaling of AI agents.
    • now included in MLA-C01 exam scope per Q1 2026 refresh.
  • Bedrock Guardrails
    • configurable safeguards to filter harmful content, block sensitive data (PII), and ensure compliance.
    • enforces deterministic controls independent of model’s reasoning quality.
    • supports content filters, denied topics, word filters, sensitive information filters, and contextual grounding checks.
    • can be associated with agents, knowledge bases, and direct model invocations.
  • Bedrock Model Evaluation
    • evaluate, compare, and select the best FM for a specific use case.
    • supports automatic evaluation (built-in metrics) and human evaluation.
  • Bedrock Model Customization
    • supports continued pre-training and fine-tuning of FMs with proprietary data.
    • supports Reinforcement Fine-Tuning (RFT) for Amazon Nova and open-source models.
    • Custom Model Import allows bringing SageMaker-trained models into Bedrock for serverless inference.
  • Bedrock Flows
    • visual builder for creating generative AI workflows connecting prompts, models, knowledge bases, and agents.

Machine Learning & AI Managed Services

  • Comprehend
    • natural language processing (NLP) service to find insights and relationships in text.
    • identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic.
  • Rekognition – analyze images and video to identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content.
  • Transcribe – automatic speech recognition (ASR) speech-to-text
  • Kendra – an intelligent search service that uses NLP and advanced ML algorithms to return specific answers to search questions from your data.
    • Kendra GenAI Index (New – re:Invent 2024) — delivers highest accuracy for RAG and intelligent search using latest information retrieval technologies and semantic models.
    • integrates with Amazon Q Business and Amazon Bedrock Knowledge Bases.
  • Augmented AI (Amazon A2I) is an ML service that makes it easy to build the workflows required for human review.
  • Amazon Q Developer (formerly CodeWhisperer) — AI-powered coding assistant for building, deploying, and operating applications on AWS.

Generative AI

  • MLA-C01 covers Generative AI concepts with increased emphasis post-Q1 2026 refresh, including practical AWS Bedrock integration.
  • Foundation Models:
    • Large, pre-trained models built on diverse data that can be fine-tuned for specific tasks like text, image, and speech generation. for e.g. GPT, BERT, DALL·E, Amazon Nova, Claude, Llama.
  • Large Language Models (LLMs):
    • A subset of foundation models designed to understand and generate human-like text. Capable of answering questions, summarizing, translating, and more.
    • LLM Components
      • Tokens: Basic units of text (words, subwords, or characters) that LLMs process.
      • Vectors: Numerical representations of tokens in high-dimensional space, enabling the model to perform mathematical operations on text. Each token is converted into a vector for processing in the neural network.
      • Embeddings: Pre-trained numerical vector representations of tokens that capture their semantic meaning.
      • Attention Mechanism: Allows models to weigh the importance of different tokens in a sequence relative to each other (e.g., self-attention in Transformers).
  • Prompt Engineering:
    • Crafting effective input instructions to guide generative AI toward desired outputs. Key for improving performance without fine-tuning the model.
    • Techniques include zero-shot, few-shot, chain-of-thought (CoT), and ReAct prompting.
  • Retrieval-Augmented Generation (RAG):
    • Combines LLMs with external knowledge bases to retrieve accurate and up-to-date information during text generation.
    • Reduces hallucinations by grounding responses in verified enterprise data.
    • AWS Implementation: Amazon Bedrock Knowledge Bases + vector stores.
  • Fine-Tuning:
    • Adjusting pre-trained models using domain-specific data to optimize performance for specific applications.
    • AWS options: Amazon Bedrock fine-tuning, SageMaker AI fine-tuning (SFT, DPO, RLVR, RLAIF).
  • Responsible AI Features:
    • Incorporates fairness, transparency, and bias mitigation techniques to ensure ethical AI outputs.
    • Amazon Bedrock Guardrails provides managed responsible AI controls.
  • Multi-Modal Capabilities:
    • Models that process and generate outputs across multiple data types, such as text, images, and audio.
    • Amazon Nova models support text, image, and video generation.
  • Controls
    • Temperature: Adjusts randomness in the output; lower values (e.g., 0.2) produce focused and deterministic results, while higher values (e.g., 1.0+) generate creative and diverse outputs.
    • Top P (Nucleus Sampling): Determines the probability threshold for token selection — with Top P = 0.9, the model considers only the smallest set of tokens whose cumulative probability is 90%.
    • Top K: Limits the token selection to the top K most probable tokens — with Top K = 10, the model randomly chooses tokens only from the 10 most likely options.
    • Token Length (Max Tokens): Sets the maximum number of tokens the model can generate in a response.

Analytics

  • Kinesis
  • Glue is a fully managed, ETL (extract, transform, and load) service that automates the time-consuming steps of data preparation for analytics
    • helps setup, orchestrate, and monitor complex data flows.
    • Glue Data Catalog is a central repository to store structural and operational metadata for all the data assets.
    • Glue crawler connects to a data store, extracts the schema of the data, and then populates the Glue Data Catalog with this metadata
    • Glue DataBrew is a visual data preparation tool that enables users to clean and normalize data without writing any code.

Security, Identity & Compliance

  • SageMaker can read data from KMS-encrypted S3. Make sure, the KMS key policies include the role attached with SageMaker
  • Amazon Bedrock supports AWS PrivateLink for private connectivity, encryption at rest and in transit, and IAM-based access control.

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics. (hint: SageMaker is integrated with CloudWatch and logs and metrics are all stored in it)

Deprecated Services to Be Aware Of

  • Amazon Elastic Inference — deprecated April 2023. Use AWS Inferentia (Inf2 instances) or AWS Trainium (Trn1 instances) for cost-effective ML inference/training acceleration.
  • SageMaker Edge Manager — discontinued April 26, 2024. Use AWS IoT Greengrass V2 with ONNX format for edge ML deployments.
  • Amazon Forecast — closed to new customers July 29, 2024. Use SageMaker Canvas for time series forecasting.
  • SageMaker Studio Classic — replaced by the updated SageMaker Studio experience and SageMaker Unified Studio.
  • AWS Certified Machine Learning – Specialty (MLS-C01) — retired March 31, 2026. MLA-C01 is now the primary ML certification.

Whitepapers and articles

Related AWS AI/ML Certifications

  • AWS Certified AI Practitioner (AIF-C01) — foundational-level certification for understanding AI/ML concepts and AWS AI services.
  • AWS Certified Generative AI Developer – Professional (AIP-C01) — professional-level certification for building production-ready generative AI solutions using Amazon Bedrock. Launched in late 2025 (GA April 2026).
  • AWS Certified Machine Learning – Specialty (MLS-C01) — retired March 31, 2026. Holders retain active certification for 3 years from date earned.

On the Exam Day

  • Make sure you are relaxed and get some good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the take if you are late by more than 30 minutes.
    • Make sure you have your desk clear, no hand-watches, or external monitors, keep your phones away, and nobody can enter the room.

Finally, All the Best 🙂

AWS Certified AI Practitioner AIF-C01 Exam Learning Path

AWS Certified AI Practitioner

AWS Certified AI Practitioner AIF-C01 Exam Learning Path

  • Started the AI journey by clearing the AWS Certified AI Practitioner AIF-C01 exam with a perfect score.
  • AWS Certified AI Practitioner AIF-C01 exam is the latest AWS exam released on October 8, 2024, following its beta period.
  • AI Practitioner exam validates knowledge of AI/ML, generative AI technologies, and associated AWS services and tools, independent of a specific job role.
  • The AIF-C01 exam has been refreshed in 2026 to reflect changes in AWS services, including the addition of Amazon Bedrock AgentCore.
  • Exam also validates a candidate’s ability to complete the following tasks:
    • Understand AI, ML, and generative AI concepts, methods, and strategies in general and on AWS.
    • Understand the appropriate use of AI/ML and generative AI technologies to ask relevant questions within the candidate’s organization.
    • Determine the correct types of AI/ML technologies to apply to specific use cases.
    • Use AI, ML, and generative AI technologies responsibly

🎯 AWS AI/ML Certification Path Update (2026)

  • AWS Certified Machine Learning – Specialty retired on March 31, 2026. Certification holders retain active status through their original expiration date.
  • AWS Certified Generative AI Developer – Professional (AIP-C01) is now generally available (beta ended March 31, 2026). It validates advanced skills in building production-ready AI solutions using Bedrock, RAG architectures, and agentic AI.
  • AWS Agentic AI Demonstrated microcredential — a free, hands-on credential for implementing AI solutions in a provisioned AWS environment.
  • Current AI/ML certification path:
    • Foundational: AWS Certified AI Practitioner (AIF-C01)
    • Associate: AWS Certified Machine Learning Engineer – Associate (MLA-C01)
    • Professional: AWS Certified Generative AI Developer – Professional (AIP-C01)

Refer AWS Certified AI Practitioner AIF-C01 Exam Guide

AWS Certified AI Practitioner AIF-C01 Exam Summary

  • AIF-C01 exam consists of 65 questions (50 scored and 15 unscored) in 90 minutes, and the time is more than sufficient if you are well-prepared.
  • In addition to the usual types of multiple-choice and multiple-response questions, the AIF exams have introduced the following new types
    • Ordering: Has a list of 3-5 responses which you need to select and place in the correct order to complete a specified task.
    • Matching: Has a list of responses to match with a list of 3-7 prompts. You must match all the pairs correctly to receive credit for the question.
    • Case study: A case study presents a single scenario with multiple questions. Each question is evaluated independently, and credit is given for each correct answer.
  • AIF-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 700.
  • Associate exams currently cost $ 100 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • AWS exams can be taken either remotely or online, I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS Certified AI Practitioner AIF-C01 Exam Resources

AWS Certified AI Practitioner AIF-C01 Exam Topics

AIF-C01 Exam covers the AI and ML aspects in terms of AI & ML fundamentals, ML lifecycle, Generative AI, AI use cases and applications and building secure, responsible AI.

Machine Learning Concepts

  • Exploratory Data Analysis
    • Feature selection and Engineering
      • remove features that are not related to training
      • remove features that have the same values, very low correlation, very little variance, or a lot of missing values
      • Apply techniques like Principal Component Analysis (PCA) for dimensionality reduction i.e. reduce the number of features.
      • Apply techniques such as One-hot encoding and label encoding to help convert strings to numeric values, which are easier to process.
      • Apply Normalization i.e. values between 0 and 1 to handle data with large variance.
      • Apply feature engineering for feature reduction e.g. using a single height/weight feature instead of both features.
    • Handle Missing data
      • remove the feature or rows with missing data
      • impute using Mean/Median values – valid only for Numeric values and not categorical features also does not factor correlation between features
      • impute using k-NN, Multivariate Imputation by Chained Equation (MICE), Deep Learning – more accurate and helps factors correlation between features
    • Handle unbalanced data
      • Source more data
      • Oversample minority or Undersample majority
      • Data augmentation using techniques like Synthetic Minority Oversampling Technique (SMOTE).
  • Modeling
    • Transfer learning (TL) is a machine learning (ML) technique where a model pre-trained on one task is fine-tuned for a new, related task.
    • Know about Algorithms – Supervised, Unsupervised and Reinforcement and which algorithm is best suitable based on the available data either labelled or unlabelled.
      • Supervised learning trains on labeled data e.g. Linear regression. Logistic regression, Decision trees, Random Forests
      • Unsupervised learning trains on unlabelled data e.g. PCA, SVD, K-means
      • Reinforcement learning trained based on actions and rewards e.g. Q-Learning
    • Hyperparameters
      • are parameters exposed by machine learning algorithms that control how the underlying algorithm operates and their values affect the quality of the trained models
      • some of the common hyperparameters are learning rate, batch, epoch (hint: If the learning rate is too large, the minimum slope might be missed and the graph would oscillate If the learning rate is too small, it requires too many steps which would take the process longer and is less efficient)
  • Evaluation
    • Know difference in evaluating model accuracy
      • Use Area Under the (Receiver Operating Characteristic) Curve (AUC) for Binary classification
      • Use root mean square error (RMSE) metric for regression
    • Understand Confusion matrix
      • A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative is an outcome where the model correctly predicts the negative class.
      • A false positive is an outcome where the model incorrectly predicts the positive class. A false negative is an outcome where the model incorrectly predicts the negative class.
      • Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of total true positives- TP/(TP+FN) (hint: use this for cases like fraud detection, cost of marking non fraud as frauds is lower than marking fraud as non-frauds)
      • Specificity or TNR (True Negative Rate): Number of items correctly identified as negative out of total negatives- TN/(TN+FP) (hint: use this for cases like videos for kids, the cost of dropping few valid videos is lower than showing few bad ones)
    • Training Problems
      • Overfitting occurs when the machine learning model gives accurate predictions for training data but not for new data.
      • Underfitting occurs when the model cannot determine a meaningful relationship between the input and output data. You get underfit models if they have not trained for the appropriate length of time on a large number of data points.
      • Underfit models experience high bias—they give inaccurate results for both the training data and test set. On the other hand, overfit models experience high variance—they give accurate results for the training set but not for the test set. More model training results in less bias but variance can increase. Data scientists aim to find the sweet spot between underfitting and overfitting when fitting a model. A well-fitted model can quickly establish the dominant trend for seen and unseen data sets.
    • Handle Overfitting problems
      • Simplify the model, by reducing the number of layers
      • Early Stopping – form of regularization while training a model with an iterative method, such as gradient descent
      • Data Augmentation
      • Regularization – technique to reduce the complexity of the model
      • Dropout is a regularization technique that prevents overfitting
      • Never train on test data

Generative AI

  • Foundation Models:
    • Large, pre-trained models built on diverse data that can be fine-tuned for specific tasks like text, image, and speech generation. for e.g. GPT, BERT, and DALL·E.
  • Large Language Models (LLMs):
    • A subset of foundation models designed to understand and generate human-like text. Capable of answering questions, summarizing, translating, and more.
    • LLM Components
      • Tokens:
        • Basic units of text (words, subwords, or characters) that LLMs process.
      • Vectors
        • Numerical representations of tokens in high-dimensional space, enabling the model to perform mathematical operations on text.
        • Each token is converted into a vector for processing in the neural network.
      • Embeddings:
        • Pre-trained numerical vector representations of tokens that capture their semantic meaning.
  • Prompt Engineering:
    • Crafting effective input instructions to guide generative AI toward desired outputs. Key for improving performance without fine-tuning the model.
    • Techniques
      • Zero-Shot Prompting: Instructs the model to perform a task without providing examples.
      • Few-Shot Prompting: Provides a few examples of the task in the prompt to guide the model’s output.
      • Chain-of-Thought Prompting: Encourages the model to explain its reasoning step-by-step before giving the final answer.
      • Instruction Prompting: Provides explicit instructions to guide the model’s behavior.
      • Contextual Prompting: Includes additional context or background information in the prompt for better responses.
      • Iterative Refinement: Refines the prompt in multiple iterations based on model responses to improve accuracy.
      • Role-based Prompting: Assigns a role to the model to influence its tone or expertise.
  • Agentic AI:
    • AI systems that can autonomously plan, reason, and execute multi-step tasks with minimal human intervention.
    • Agents use tools, make decisions, and adapt based on intermediate results.
    • Key patterns include ReAct (Reasoning + Acting), tool use, and multi-agent orchestration.
    • AWS provides Amazon Bedrock AgentCore for building and deploying production-grade agents.
  • Retrieval-Augmented Generation (RAG):
    • Combines LLMs with external knowledge bases to retrieve accurate and up-to-date information during text generation. Useful for chatbots and domain-specific tasks.
  • Fine-Tuning:
    • Adjusting pre-trained models using domain-specific data to optimize performance for specific applications.
    • Reinforcement Fine-Tuning (RFT): Uses reward signals to align model outputs with desired behaviors, now supported on Amazon Bedrock for open-weight models.
  • Responsible AI Features:
    • Incorporates fairness, transparency, and bias mitigation techniques to ensure ethical AI outputs.
  • Multi-Modal Capabilities:
    • Models that process and generate outputs across multiple data types, such as text, images, and audio.
  • Vector database
    • provides the ability to store and retrieve vectors as high-dimensional points.
    • add additional capabilities for efficient and fast lookup of nearest-neighbors in the N-dimensional space.
    • Amazon natively supports vector search through OpenSearch, Aurora PostgreSQL with pgvector and Partner solutions like Pinecone, Weaviate, and Milvus.
  • Controls
    • Temperature: Adjusts randomness in the output; lower values (e.g., 0.2) produce focused and deterministic results, while higher values (e.g., 1.0 or above) generate creative and diverse outputs.
    • Top P (Nucleus Sampling): Determines the probability threshold for token selection. With Top P = 0.9, the model considers only the smallest set of tokens whose cumulative probability is 90%.
    • Top K: Limits the token selection to the top K most probable tokens. With Top K = 10, the model randomly chooses from the 10 most likely options.
    • Token Length (Max Tokens): Sets the maximum number of tokens the model can generate in a response.
  • Model Evaluation Metrics:
    • Techniques like BLEU, ROUGE, perplexity, and embeddings measure generative AI performance across different use cases.
    • ROUGE (Recall-Oriented Understudy for Gisting Evaluation): Commonly used for text summarization; compares overlap between generated and reference text.
    • BERTScore: Evaluates text generation by comparing contextual embeddings, capturing semantic similarity beyond n-gram overlap.
    • Perplexity: Used for language models to evaluate prediction quality. Lower perplexity indicates a better model.
    • BLEU (Bilingual Evaluation Understudy): Evaluates machine translation by comparing generated text against reference translations.
  • Limitations
    • Security: can be exploited to create malicious content, phishing attacks, or deepfakes.
    • Cost: Training and deploying large models require substantial computational resources.
    • Explainability: Decision-making process is often a “black box,” making models hard to interpret.
    • Hallucination: Models may confidently generate false or nonsensical outputs that appear accurate.
    • Toxicity: Without proper safeguards, AI can produce harmful, biased, or offensive content.
    • Creativity: AI-generated content often lacks true originality and may rely on existing patterns.
    • Data Dependency: Quality of outputs depends heavily on training data quality and diversity.
    • Regulation: Legal and ethical concerns surrounding misuse and intellectual property.
    • Latency: Real-time applications may experience delays due to high computational demands.

AI Services

Amazon Bedrock

  • is a fully managed service that offers a choice of industry leading foundation models (FMs) along with a broad set of capabilities needed to build generative AI applications, simplifying development with security, privacy, and responsible AI without the need to manage underlying infrastructure.
  • supports foundation models from Amazon (Nova), Anthropic (Claude), OpenAI (GPT-5.5, GPT-5.4), Meta (Llama), Mistral AI, Cohere, Stability AI, and others.
  • supports custom fine-tuning of FMs using tagged data or by using continued pre-train feature to customize the model using non-tagged data.
  • supports Reinforcement Fine-Tuning (RFT) for open-weight models using OpenAI-compatible APIs (February 2026).
  • supports Retrieval Augmented Generation (RAG) to enhance model responses with real-time, context-specific data retrieval from external knowledge bases.
  • Knowledge Bases
    • Integrate custom datasets to tailor models for specific use cases and improve accuracy.
    • provides access to additional data that helps the model generate more relevant, context-specific, and accurate responses without continually retraining the FM.
    • Managed Knowledge Base (GA June 2026) — a fully managed RAG service that abstracts storage, retrieval, embeddings, re-ranking, and FM selection into a single managed primitive. Includes six native data source connectors (S3, SharePoint, Confluence, Google Drive, OneDrive, Web Crawler), Smart Parsing for automatic multi-format data preparation, and an Agentic Retriever for complex multi-step queries.
    • Supports multimodal retrieval across text, images, audio, and video content.
  • Agents
    • are fully managed capabilities that can help build and deploy intelligent agents to automate workflows and enhance user interactions.
    • can complete complex tasks for a wide range of use cases and deliver up-to-date answers based on proprietary knowledge sources.
  • Amazon Bedrock AgentCore (GA June 2026)
    • Enterprise-grade infrastructure and operations layer for deploying and managing AI agents at scale.
    • Provides a managed harness — orchestration loop, tool execution, context window management, state persistence, failure recovery, and session isolation — all with just two API calls (CreateHarness and InvokeHarness).
    • Works with any framework: LangGraph, LlamaIndex, CrewAI, Strands Agents, and more.
    • Includes AgentCore Gateway for connecting agents to tools, other agents, and models.
    • Supports optimization capabilities that turn production traces into continuous improvement.
    • Integrates with Bedrock Guardrails for real-time evaluation of agent actions and tool calls.
    • Note: AgentCore is now included in the refreshed AIF-C01 exam content.
  • Guardrails
    • help implement safeguards for generative AI applications based on use cases and responsible AI policies.
    • Provides six safeguard types:
      • Content Filters — filter undesirable and harmful content
      • Denied Topics — block conversations on specified topics
      • Word Filters — block specific words and phrases
      • Sensitive Information Filters (PII Redaction) — redacts PII using predefined types or custom regex patterns, masking with placeholders (e.g., {NAME}, {EMAIL})
      • Contextual Grounding Checks — validates responses are grounded in provided context
      • Automated Reasoning Checks (GA August 2025) — uses formal verification methods (mathematical logic) to validate AI outputs against rules and constraints. Provides provably correct, auditable assessment for every request. Delivers up to 99% verification accuracy for hallucination prevention.
    • Prompt Attack Detection — detects prompt injection and jailbreak attempts.
    • InvokeGuardrailChecks API (June 2026) — a new resourceless API that lets you apply individual safeguards at any point in agentic AI applications without creating guardrail resources. Operates in detect-only mode and returns numeric scores.
    • Supports cross-account safeguards with centralized control via AWS Organizations.
  • Model Evaluation
    • Test and evaluate foundation models for performance and accuracy (GA April 2024).
    • Automatic evaluation with predefined metrics (accuracy, robustness, toxicity).
    • Human evaluation workflows for subjective quality assessment.
    • LLM-as-a-judge capability for scalable, human-like evaluation.
    • RAG evaluation capabilities.
    • Compare multiple foundation models side by side.
  • Pricing modes
    • On-Demand Throughput Mode: Automatically scales based on request traffic. Ideal for variable workloads.
    • Provisioned Throughput Mode: Pre-allocate capacity for consistent high-volume workloads. Required for customized fine-tuned models.
  • Responsible AI Support: Tools and guidance to monitor, mitigate, and reduce biases while ensuring fairness and ethical AI use.
  • Security
    • S3 allows storing and managing data securely with fine-grained access controls and encryption.
    • VPC PrivateLink allows operating Bedrock entirely within the VPC, ensuring secure communication without an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
  • Scalability and Cost Efficiency: Automatically scales to meet workload demands with a pay-as-you-go pricing model.
  • Model Invocation Logging
    • helps collect invocation logs, model input data, and model output data for all invocations.
    • includes full request data, response data, and metadata associated with all calls.
    • supported destinations include CloudWatch Logs and S3.
  • Redesigned Console (June 2026) — browse the full model catalog (Claude, GPT, open-weight models), compare side by side on capabilities, modality, context window, and quotas in a single view.

Amazon SageMaker

  • SageMaker Unified Studio (GA March 2025)
    • Next-generation unified development environment for data engineers, data scientists, ML developers, and analysts.
    • Breaks down silos by providing a single experience to discover data and AI assets, build analytics and AI artifacts, and collaborate in projects.
    • Includes serverless notebooks with a built-in AI agent and one-click onboarding.
    • Supports notebook scheduling, parameterization, and orchestration directly from the interface.
  • supports Model tracking capability to manage up to thousands of machine learning model experiments
  • supports automatic scaling for production variants dynamically adjusting instances in response to workload changes
  • provides pre-built Docker images for its built-in algorithms and the supported deep learning frameworks used for training & inference
  • SageMaker Inference options:
    • Real-time inference is ideal for online inferences with low latency or high throughput requirements.
    • Serverless Inference is ideal for intermittent or unpredictable traffic patterns.
    • Batch Transform is suitable for offline processing when large amounts of data are available upfront.
    • Asynchronous Inference is ideal for large payloads with long processing times.
  • SageMaker Model deployment allows deploying multiple variants of a model to the same endpoint to test new models.
  • SageMaker Managed Spot training can use spot instances to save cost; Checkpointing saves the state of ML models during training.
  • SageMaker Feature Store — centralized store for features and associated metadata for easy discovery and reuse.
  • SageMaker Debugger provides tools to debug training jobs and resolve problems such as overfitting, saturated activation functions, and vanishing gradients.
  • SageMaker Model Monitor monitors model quality in production and alerts when there are deviations.
  • SageMaker Automatic Model Tuning helps find optimal hyperparameters for an algorithm.
  • SageMaker Data Wrangler reduces time to aggregate and prepare tabular and image data for ML from weeks to minutes.
  • SageMaker Clarify helps detect potential bias and explain model predictions using SHAP analysis.
  • SageMaker Model Governance provides systematic visibility into ML model development, validation, and usage.
  • SageMaker Model Cards — document critical details about ML models for streamlined governance and reporting.
  • SageMaker Autopilot automates the end-to-end process of building, training, tuning, and deploying ML models.
  • SageMaker Neo enables ML models to train once and run anywhere in the cloud and at the edge.
  • SageMaker JumpStart — pre-trained foundation models hub with one-click deployment, fine-tuning, and support for both proprietary and open-source models.
  • SageMaker supports VPC interface endpoints powered by AWS PrivateLink for secure private connectivity.

SageMaker Ground Truth

  • provides automated data labeling using machine learning
  • helps build highly accurate training datasets quickly using Amazon Mechanical Turk
  • provides annotation consolidation to improve the accuracy of data labels by combining multiple workers’ results.
  • automated data labeling uses machine learning to label portions of the data automatically without sending them to human workers

AI Managed Services

  • Amazon Q Business
    • is a fully managed, generative-AI powered assistant that can answer questions, provide summaries, generate content, and complete tasks based on enterprise data.
    • connects to enterprise data sources with existing security and access controls.
  • AWS PartyRock
    • Amazon Bedrock Playground for learning generative AI.
    • No-code app building interface for hands-on experimentation with foundation models.
    • Build apps in minutes with natural language prompts.
    • Free to use for learning and prompt engineering practice.
  • Comprehend — natural language processing (NLP) service to find insights and relationships in text. Identifies language, extracts key phrases, people, brands, events; understands sentiment; organizes text by topic.
  • Lex — provides conversational interfaces using voice and text for building chatbots.
  • Polly — text-to-speech; supports SSML tags and pronunciation lexicons.
  • Rekognition — analyze images and video; identifies objects, people, text, scenes, activities, and inappropriate content.
  • Translate — natural and fluent language translation.
  • Transcribe — automatic speech recognition (ASR) speech-to-text.
  • Kendra — intelligent search service using NLP and ML to return specific answers from your data.
  • Panorama — brings computer vision to on-premises camera networks.
  • Augmented AI (Amazon A2I) — builds workflows for human review of ML predictions.
  • Forecast — highly accurate time-series forecasts.

Security, Identity & Compliance

  • AWS Artifact is a self-service portal for on-demand access to AWS compliance documentation and agreements.
  • SageMaker can read data from KMS-encrypted S3. Make sure KMS key policies include the role attached to SageMaker.
  • AWS Identity and Access Management (IAM) helps securely control access to AWS resources.
  • Amazon Inspector — vulnerability management service that continuously scans workloads for software vulnerabilities and unintended network exposure.

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics. (hint: SageMaker & Bedrock are integrated with CloudWatch for logs and metrics)
  • CloudTrail records API events, the user who made the call, and the time of the call for monitoring and logging.

Whitepapers and articles

On the Exam Day

  • Make sure you are relaxed and get some good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure you have your desk clear, no hand-watches, or external monitors, keep your phones away, and nobody can enter the room.

Practice Questions

  1. A company needs to deploy AI agents that can autonomously plan multi-step workflows, execute tools, and recover from failures at production scale. Which AWS service provides the infrastructure layer for this?
    1. Amazon Bedrock Agents
    2. Amazon Bedrock AgentCore
    3. Amazon SageMaker Unified Studio
    4. AWS Step Functions

    Answer: B — Amazon Bedrock AgentCore provides enterprise-grade infrastructure for deploying and managing AI agents at scale, including orchestration, state persistence, and failure recovery.

  2. A financial services company needs mathematically verifiable accuracy for AI-generated compliance reports to prevent hallucinations. Which Amazon Bedrock Guardrails capability should they use?
    1. Content Filters
    2. Contextual Grounding Checks
    3. Automated Reasoning Checks
    4. Sensitive Information Filters

    Answer: C — Automated Reasoning Checks use formal verification methods (mathematical logic) to validate AI outputs, delivering provably correct and auditable assessments with up to 99% verification accuracy.

  3. An organization wants to apply Bedrock Guardrails safety checks at individual steps in their agentic AI workflow without creating dedicated guardrail resources. Which API should they use?
    1. ApplyGuardrail
    2. InvokeGuardrailChecks
    3. CreateGuardrail
    4. InvokeModel

    Answer: B — InvokeGuardrailChecks is a resourceless API (June 2026) that lets you apply individual safeguards at any point in agentic AI applications without creating guardrail resources.

  4. A development team wants to build a RAG application with enterprise data from SharePoint, Confluence, and S3 without managing vector storage or retrieval infrastructure. Which service should they use?
    1. Amazon OpenSearch with custom embeddings
    2. Amazon Bedrock Knowledge Bases with custom vector store
    3. Amazon Bedrock Managed Knowledge Base
    4. Amazon Kendra

    Answer: C — Amazon Bedrock Managed Knowledge Base (GA June 2026) is a fully managed RAG service with native data connectors, managed vector storage, Smart Parsing, and an Agentic Retriever.

  5. Which of the following certifications has AWS retired as of March 31, 2026?
    1. AWS Certified AI Practitioner
    2. AWS Certified Machine Learning Engineer – Associate
    3. AWS Certified Machine Learning – Specialty
    4. AWS Certified Generative AI Developer – Professional

    Answer: C — AWS Certified Machine Learning – Specialty was retired on March 31, 2026. It has been replaced by the expanded AI/ML certification portfolio including the Generative AI Developer – Professional.

  6. A beginner wants to learn generative AI through hands-on experimentation without writing code or managing infrastructure. Which AWS service should they use?
    1. SageMaker Unified Studio
    2. SageMaker Canvas
    3. AWS PartyRock
    4. Bedrock Console Playground

    Answer: C — AWS PartyRock is a free, no-code platform for learning generative AI through hands-on experimentation with foundation models and prompt engineering.

Finally, All the Best 🙂

AWS Certified Data Engineer – Associate DEA-C01 Exam Learning Path

AWS Certified Data Engineer - Associate DEA-C01

AWS Certified Data Engineer – Associate DEA-C01 Exam Learning Path

  • Just cleared the AWS Certified Data Engineer – Associate DEA-C01 exam with a score of 930/1000.
  • AWS Certified Data Engineer – Associate DEA-C01 exam is the latest AWS exam released on 12th March 2024.

AWS Certified Data Engineer - Associate DEA-C01

📋 Exam Guide Updated – Version 1.1 (December 2025)

AWS released Version 1.1 of the DEA-C01 exam guide in December 2025 with significant updates:

  • New skills added: LLM integration for data processing, open table formats (Apache Iceberg), vector index types (HNSW, IVF), vectorization concepts (Amazon Bedrock knowledge base), SageMaker Unified Studio governance
  • New in-scope services: Amazon Aurora, Amazon Q, Amazon Bedrock, Amazon Kendra, AWS Data Exchange, Amazon S3 Tables
  • Services removed from scope: AWS Cloud9, AWS CodeCommit, AWS Schema Conversion Tool (AWS SCT)
  • Service deprecations removed from out-of-scope: Amazon Honeycode, Amazon WorkDocs, Amazon Timestream, Amazon CodeWhisperer

Refer DEA-C01 Exam Guide Revisions for full details.

AWS Certified Data Engineer – Associate DEA-C01 Exam Content

  • Data Engineer exam validates skills and knowledge in core data-related AWS services, ability to ingest and transform data, orchestrate data pipelines while applying programming concepts, design data models, manage data life cycles, and ensure data quality.
  • Exam also validates a candidate’s ability to complete the following tasks:
    • Ingest and transform data, and orchestrate data pipelines while applying programming concepts.
    • Choose an optimal data store, design data models, catalog data schemas, and manage data lifecycles.
    • Operationalize, maintain, and monitor data pipelines. Analyze data and ensure data quality.
    • Implement appropriate authentication, authorization, data encryption, privacy, and governance. Enable logging
  • (New in v1.1) Integrate Large Language Models (LLM) for data processing.
  • (New in v1.1) Manage open table formats (e.g., Apache Iceberg).
  • (New in v1.1) Apply storage services including vector index types (HNSW, IVF) and services like Amazon Aurora PostgreSQL and Amazon MemoryDB.
  • (New in v1.1) Use infrastructure as code (IaC) for repeatable resource deployment (AWS CloudFormation, AWS CDK).

Refer AWS Certified Data Engineer – Associate DEA-C01 Exam Guide

AWS Certified Data Engineer – Associate DEA-C01 Exam Summary

  • DEA-C01 exam consists of 65 questions in 130 minutes, and the time is more than sufficient if you are well-prepared.
  • DEA-C01 exam includes two types of questions, multiple-choice and multiple-response.
  • DEA-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 720.
  • Associate exams currently cost $ 150 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • AWS exams can be taken either remotely or online, I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS Certified Data Engineer – Associate DEA-C01 Exam Resources

AWS Certified Data Engineer – Associate DEA-C01 Exam Topics

  • DEA-C01 Exam covers the data engineering aspects in terms of data ingestion, transformation, orchestration, designing data models, managing data life cycles, and ensuring data quality.
  • (Updated v1.1) Exam now also covers LLM integration, open table formats (Iceberg), vector databases, and generative AI services like Amazon Bedrock and Amazon Q.

Analytics

  • Ensure you know and cover all the services in-depth, as 80% of the exam focuses on topics like Glue, Athena, Kinesis, and Redshift.
  • AWS Analytics Services Cheat Sheet
  • Glue
    • DEA-C01 covers Glue in great detail.
    • AWS Glue is a fully managed, ETL service that automates the time-consuming steps of data preparation for analytics.
    • supports server-side encryption for data at rest and SSL for data in motion.
    • Glue ETL engine to Extract, Transform, and Load data that can automatically generate Scala or Python code.
    • Glue Data Catalog is a central repository and persistent metadata store to store structural and operational metadata for all the data assets. It works with Apache Hive as its metastore.
    • (New – June 2026) Glue Data Catalog now supports business context and semantic search, enabling data discovery by semantic meaning with glossary terms and custom metadata fields.
    • Glue Crawlers scan various data stores to automatically infer schemas and partition structures to populate the Data Catalog with corresponding table definitions and statistics.
    • Glue Job Bookmark tracks data that has already been processed during a previous run of an ETL job by persisting state information from the job run.
    • Glue Streaming ETL enables performing ETL operations on streaming data using continuously running jobs.
    • Glue provides a flexible scheduler that handles dependency resolution, job monitoring, and retries.
    • Glue Studio offers a graphical interface for authoring AWS Glue jobs to process data allowing you to define the flow of the data sources, transformations, and targets in the visual interface and generating Apache Spark code on your behalf.
    • Glue Data Quality helps reduce manual data quality efforts by automatically measuring and monitoring the quality of data in data lakes and pipelines. (Updated) Now supports rule labeling for organizing and analyzing data quality results by category, team, or domain.
    • Glue DataBrew helps prepare, visualize, clean, and normalize data directly from the data lake, data warehouses, and databases, including S3, Redshift, Aurora, and RDS.
    • Glue Flex execution option helps to reduce the costs of pre-production, test, and non-urgent data integration workloads by up to 34% and is ideal for customer workloads that don’t require fast jobs start times.
    • Glue FindMatches transform helps identify duplicate or matching records in the dataset, even when the records do not have a common unique identifier and no fields match exactly.
    • (New) AWS Glue 5.1 (GA Nov 2025) introduces support for Apache Iceberg format version 3.0 (deletion vectors, row lineage), Iceberg Materialized Views, and Spark-native fine-grained access control with AWS Lake Formation for data writes.
    • (New) Glue Interactive Sessions now support Spark Connect for interactive workloads, enabling step-by-step debugging and incremental PySpark development.
    • (Deprecation Note) AWS Glue for Ray will no longer be open to new customers starting April 30, 2026. For similar capabilities, explore Amazon EKS.
  • Kinesis
    • Understand Kinesis Data Streams and Amazon Data Firehose (formerly Kinesis Data Firehose) in-depth.
    • Know Kinesis Data Streams vs Amazon Data Firehose
      • Know Kinesis Data Streams is open-ended for both producer and consumer. It supports KCL and works with Spark.
      • Know Amazon Data Firehose is open-ended for producers only. Data is stored in S3, Redshift, OpenSearch, Splunk, Snowflake, and other 3rd-party analytics services.
      • Amazon Data Firehose works in batches with minimum 60secs intervals and in near-real time.
      • Amazon Data Firehose supports out-of-the-box transformation and custom transformation using Lambda
    • (Rename – Feb 2024) Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. The service functionality remains the same.
    • Kinesis supports encryption at rest using server-side encryption
    • Kinesis supports Interface VPC endpoint to keep traffic between the VPC and Kinesis Data Streams from leaving the Amazon network and doesn’t require an internet gateway, NAT device, VPN connection, or Direct Connect connection.
    • Kinesis Producer Library supports batching
    • Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics)
      • (Rename – Aug 2023) Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink.
      • helps transform and analyze streaming data in real time using Apache Flink.
      • supports anomaly detection using Random Cut Forest ML
      • supports reference data stored in S3.
      • (EOL – Jan 2026) Kinesis Data Analytics for SQL applications reached end of support on January 27, 2026. Migrate to Amazon Managed Service for Apache Flink for real-time stream processing workloads.
  • Redshift
    • Redshift is also covered in depth.
    • Redshift Advanced include
      • Redshift Distribution Style determines how data is distributed across compute nodes and helps minimize the impact of the redistribution step by locating the data where it needs to be before the query is executed.
      • Redshift Enhanced VPC routing forces all COPY and UNLOAD traffic between the cluster and the data repositories through the VPC.
      • Workload management (WLM) enables users to flexibly manage priorities within workloads so that short, fast-running queries won’t get stuck in queues behind long-running queries.
      • Redshift Spectrum
        • helps query structured and semistructured data from files in S3 without having to load the data into Redshift tables.
        • cannot access data from Glacier.
      • Federated Query feature allows querying and analyzing data across operational databases, data warehouses, and data lakes.
      • Short query acceleration (SQA) prioritizes selected short-running queries ahead of longer-running queries.
      • Concurrency Scaling helps support thousands of concurrent users and concurrent queries, with consistently fast query performance.
      • Redshift Serverless is a serverless option of Redshift that makes it more efficient to run and scale analytics in seconds without the need to set up and manage data warehouse infrastructure. (Updated) AI-driven scaling and optimization is now the default for all new Serverless workgroups, using ML to predict compute needs and automatically adjust resources.
      • Streaming ingestion provides low-latency, high-speed ingestion of stream data from Kinesis Data Streams and Managed Streaming for Apache Kafka into a Redshift provisioned or Redshift Serverless materialized view.
      • Redshift data sharing can securely share access to live data across Redshift clusters, workgroups, AWS accounts, and AWS Regions without manually moving or copying the data.
      • Redshift Data API provides a secure HTTP endpoint and integration with AWS SDKs to help access Redshift data with web services–based applications, including AWS Lambda, SageMaker notebooks, and AWS Cloud9.
      • (New) Amazon Redshift RG is a new Graviton-powered instance family that delivers up to 2.4x faster performance than RA3 at 30% lower price per vCPU.
      • (New) Redshift now supports autonomics for multi-cluster environments, extending ATO, ATS, Auto Vacuum, and Auto Analyze across consumer clusters.
    • Redshift Best Practices w.r.t selection of Distribution style, Sort key, importing/exporting data
      • COPY command which allows parallelism, and performs better than multiple COPY commands
      • COPY command can use manifest files to load data
      • COPY command handles encrypted data
    • Redshift Resizing cluster options (elastic resize did not support node type changes before, but does now)
    • Redshift supports encryption at rest and in transit
    • Redshift supports encrypting an unencrypted cluster using KMS. However, you can’t enable hardware security module (HSM) encryption by modifying the cluster. Instead, create a new, HSM-encrypted cluster and migrate your data to the new cluster.
    • Know Redshift views to control access to data.
  • Athena
    • is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats.
    • provides a simplified, flexible way to analyze data in an S3 data lake and 30 data sources, including on-premises data sources or other cloud systems using SQL or Python without loading the data.
    • integrates with Amazon Quick (formerly QuickSight) for visualizing the data or creating dashboards.
    • uses a managed Glue Data Catalog to store information and schemas about the databases and tables for the data stored in S3.
    • Workgroups can be used to separate users, teams, applications, or workloads, to set limits on the amount of data each query or the entire workgroup can process, and to track costs.
    • Athena best practices
      • Data partitioning,
      • Partition projection, and
      • Columnar file formats like ORC or Parquet as they support compression and are splittable.
    • (New) Athena for Apache Spark is available in SageMaker notebooks with live Spark UI debugging, Lake Formation table-level access controls, and Spark Connect support.
    • (New) Athena now offers managed connectors for 12 data sources (DynamoDB, PostgreSQL, MySQL, Snowflake) without deploying or maintaining connector resources.
    • (Updated v1.1) Exam now tests SQL skills in both Redshift and Athena for querying data and creating views.
  • Elastic Map Reduce
    • Understand EMRFS
      • Use Consistent view to make sure S3 objects referred by different applications are in sync. Although, it is not needed now.
    • Know EMR Best Practices (hint: start with many small nodes instead of few large nodes)
    • Know EMR Encryption options
      • supports SSE-S3, SS3-KMS, CSE-KMS, and CSE-Custom encryption for EMRFS
      • supports LUKS encryption for local disks
      • supports TLS for data in transit encryption
      • supports EBS encryption
    • Hive metastore can be externally hosted using RDS, Aurora, and AWS Glue Data Catalog
    • (New) Amazon EMR Serverless eliminates local storage provisioning for Apache Spark workloads, reducing data processing costs by up to 20% and preventing job failures from disk capacity constraints.
    • (New) EMR Serverless supports interactive sessions with Spark Connect for development from SageMaker Unified Studio notebooks and IDEs.
    • (New) Apache Spark upgrade agent for EMR uses AI to analyze code, identify required changes, and perform automated transformations for version upgrades.
  • OpenSearch
    • OpenSearch is a search service that supports indexing, full-text search, faceting, etc.
    • OpenSearch can be used for analysis and supports visualization using OpenSearch Dashboards which can be real-time.
    • OpenSearch Service Storage tiers support Hot, UltraWarm, and Cold and the data can be transitioned using Index State management.
  • Amazon Quick (formerly QuickSight)
    • (Rename – Oct 2025) Amazon QuickSight has been rebranded to Amazon Quick (also referred to as Amazon Quick Suite), expanding from a standalone BI service to a comprehensive analytics and AI platform.
    • Know Supported Data Sources
    • Amazon Quick provides IP addresses that need to be whitelisted to access the data store.
    • Amazon Quick provides direct integration with Microsoft AD
    • Amazon Quick supports row-level security using dataset rules to control access to data at row granularity based on permissions associated with the user interacting with the data.
    • Amazon Quick supports ML insights as well
    • Amazon Quick supports users defined via IAM or email signup.
    • (New) Amazon Q generative SQL integration helps generate SQL for Redshift or Athena queries using natural language.
  • AWS Lake Formation
    • is an integrated data lake service that helps to discover, ingest, clean, catalog, transform, and secure data and make it available for analysis.
    • automatically manages access to the registered data in S3 through services including AWS Glue, Athena, Redshift, Amazon Quick, and EMR
    • provides central access control for the data, including table-and-column-level access controls, and encryption for data at rest.
    • (New) Lake Formation now supports Spark-native fine-grained access control for data writes in AWS Glue 5.1.
    • (New) Lake Formation integrates with Amazon S3 Tables for cross-account data mesh architectures without copying data or managing cross-account S3 bucket policies.
  • Simple Storage Service – S3 as a storage service
    • S3 storage classes with lifecycle policies based on usage to provide cost-effective storage solutions.
    • S3 Event Notifications integrates with SNS and Lambda for real-time data processing
    • (New – Dec 2024) Amazon S3 Tables provide fully managed Apache Iceberg tables optimized for analytics workloads with up to 3x faster query throughput and 10x higher transactions per second compared to self-managed tables. Includes automatic compaction, snapshot management, and unreferenced file cleanup.
  • Data Pipeline
    • ⚠️ Closed to New Customers (July 2024): AWS closed new customer access to AWS Data Pipeline effective July 25, 2024. Existing customers can continue to use the service. No new features or region expansions are planned. Consider migrating to AWS Step Functions, Amazon MWAA, or AWS Glue workflows.
  • Step Functions help build distributed applications, automate processes, orchestrate microservices, and create data and ML pipelines.
    • Provides native integrations with over 200 AWS services and external third-party APIs.
    • (New) Step Functions added 28 new service integrations including Amazon Bedrock AgentCore and Amazon S3 Vectors.
  • (New) Amazon Managed Workflows for Apache Airflow (MWAA)
    • A managed service to run Apache Airflow for workflow orchestration at scale without managing infrastructure.
    • Now supports Apache Airflow 3.x with redesigned UI, API-based task execution, and scheduler-based backfills.
    • (New – Nov 2025) Amazon MWAA Serverless eliminates operational overhead with true serverless scaling and cost optimization.
    • Know when to use MWAA vs Step Functions: MWAA is ideal for complex data pipelines with many dependencies; Step Functions is better for event-driven serverless workflows.
  • AppFlow is a fully managed integration service to securely exchange data between software-as-a-service (SaaS) applications, such as Salesforce, and AWS services, such as Simple Storage Service (S3) and Redshift.

New In-Scope Services (Added in Exam Guide v1.1)

  • Amazon Bedrock
    • A fully managed service for building generative AI applications with foundation models.
    • (New in v1.1) Exam now tests ability to integrate LLMs for data processing and vectorization concepts using Bedrock knowledge bases.
    • Understand how to use Bedrock for data enrichment, text extraction, and unstructured data processing in pipelines.
  • Amazon Q
    • AI-powered assistant that generates SQL, provides data insights, and helps with data integration tasks.
    • Amazon Q generative SQL helps speed up deriving insights from Redshift and Glue Data Catalog data.
  • Amazon S3 Tables
    • Fully managed Apache Iceberg tables in S3, optimized for analytics workloads.
    • Automatic table maintenance: compaction, snapshot management, unreferenced file cleanup.
    • Integrates with Glue Data Catalog, Redshift, EMR, Athena, and SageMaker.
    • Delivers up to 3x faster query performance and 10x higher TPS vs self-managed Iceberg tables.
  • Amazon Aurora
    • Now in scope for the exam, particularly for vector indexing (HNSW with Aurora PostgreSQL) and as a data source for analytics pipelines.
  • Amazon Kendra
    • Intelligent search service powered by ML, useful for searching across data catalogs and documentation.
  • AWS Data Exchange
    • Service to find, subscribe to, and use third-party data in the cloud for analytics.
  • Amazon SageMaker Unified Studio
    • (New – GA March 2025) A single development environment bringing together data engineering, analytics, and ML workflows.
    • Combines functionality from Athena, EMR, Glue, Redshift, MWAA, and SageMaker Studio.
    • (v1.1) Exam tests use of domain, domain units, and projects for SageMaker Unified Studio governance.
    • Amazon SageMaker Catalog enables business data catalog creation and management with data lineage tracking.
  • Open Table Formats (Apache Iceberg)
    • (New in v1.1) Exam now requires understanding of managing open table formats like Apache Iceberg.
    • Key concepts: table versioning, time travel, schema evolution, partition evolution, compaction.
    • AWS services supporting Iceberg: S3 Tables, Glue, EMR, Athena, Redshift.
  • Vector Databases & Indexes
    • (New in v1.1) Understand vector index types like HNSW (Hierarchical Navigable Small World) and IVF (Inverted File Index).
    • Know services supporting vector storage: Aurora PostgreSQL (pgvector), Amazon MemoryDB, Amazon Bedrock Knowledge Bases.
    • Understand vectorization concepts and embedding generation for AI/ML pipelines.

Security, Identity & Compliance

  • Identity and Access Management (IAM)
    • Understand IAM Roles
    • (Updated v1.1) Understand authorization methods: role-based, tag-based, and attribute-based access control.
    • (Updated v1.1) Construct custom policies that meet the principle of least privilege.
  • Key Management Service (KMS) provides key management for encryption at rest.
  • AWS Secrets Manager
    • helps protect secrets needed to access applications, services, and IT resources.
  • Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in S3.
  • (New v1.1) Understand data sovereignty requirements and how to maintain them.
  • (New v1.1) Enable encryption in transit or before transit for data.

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics.
  • CloudWatch Logs Subscription Filters can be used to route data to Kinesis Data Streams, Amazon Data Firehose, and Lambda.
  • (Updated v1.1) AWS Config for viewing configuration changes that have occurred in an account.

On the Exam Day

  • Make sure you are relaxed and get some good night’s sleep. The exam is not tough if you are well-prepared.
  • If you are taking the AWS Online exam
    • Try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.
    • The online verification process does take some time and usually, there are glitches.
    • Remember, you would not be allowed to take the exam if you are late by more than 30 minutes.
    • Make sure you have your desk clear, no hand-watches, or external monitors, keep your phones away, and nobody can enter the room.

Finally, All the Best 🙂

AWS Redshift Advanced

AWS Redshift Advanced

  • Redshift Distribution Style determines how data is distributed across compute nodes and helps minimize the impact of the redistribution step by locating the data where it needs to be before the query is executed.
  • Redshift enhanced VPC routing forces all COPY and UNLOAD traffic between the cluster and the data repositories through the VPC.
  • Redshift workload management (WLM) enables users to flexibly manage priorities within workloads so that short, fast-running queries won’t get stuck in queues behind long-running queries.
  • Redshift Spectrum helps query and retrieve structured and semistructured data from files in S3 without having to load the data into Redshift tables.
  • Redshift Federated Query feature allows querying and analyzing data across operational databases, data warehouses, and data lakes.
  • Zero-ETL Integrations facilitate point-to-point data movement from operational databases to Redshift without the need to build and manage data pipelines.
  • Redshift Data Sharing enables live, transactionally consistent data sharing across Redshift clusters without copying data.
  • Redshift Serverless automatically provisions and scales data warehouse capacity without managing infrastructure.

Distribution Styles

  • Table distribution style determines how data is distributed across compute nodes and helps minimize the impact of the redistribution step by locating the data where it needs to be before the query is executed.
  • Redshift supports four distribution styles; AUTO, EVEN, KEY, or ALL.

KEY distribution

  • A single column acts as a distribution key (DISTKEY) and helps place matching values on the same node slice.
  • As a rule of thumb, choose a column that:
    • Is uniformly distributed – Otherwise skew data will cause unbalances in the volume of data that will be stored in each compute node leading to undesired situations where some slices will process bigger amounts of data than others and causing bottlenecks.
    • acts as a JOIN column – for tables related to dimensions tables (star-schema), it is better to choose as DISTKEY the field that acts as the JOIN field with the larger dimension table, so that matching values from the common columns are physically stored together, reducing the amount of data that needs to be broadcasted through the network.

EVEN distribution

  • distributes the rows across the slices in a round-robin fashion, regardless of the values in any particular column
  • Choose EVEN distribution
    • when the table does not participate in joins
    • when there is not a clear choice between KEY and ALL distribution.

ALL distribution

  • Whole table is replicated in every compute node.
  • ensures that every row is collocated for every join that the table participates in.
  • ideal for relatively slow-moving tables, tables that are not updated frequently or extensively.
  • Small dimension tables DO NOT benefit significantly from ALL distribution, because the cost of redistribution is low.

AUTO distribution

  • Redshift assigns an optimal distribution style based on the size of the table data for e.g. apply ALL distribution for a small table and as it grows changes it to Even distribution
  • Amazon Redshift applies AUTO distribution, by default.
  • Redshift’s automatic table optimization (ATO) continuously monitors query patterns and can automatically adjust distribution keys and sort keys for optimal performance.

Sort Key

  • Sort keys define the order in which the data will be stored.
  • Sorting enables efficient handling of range-restricted predicates.
  • Only one sort key per table can be defined, but it can be composed of one or more columns.
  • Redshift stores columnar data in 1 MB disk blocks. The min and max values for each block are stored as part of the metadata. If the query uses a range-restricted predicate, the query processor can use the min and max values to rapidly skip over large numbers of blocks during table scans
  • The are two kinds of sort keys in Redshift: Compound and Interleaved.

Compound Keys

  • A compound key is made up of all of the columns listed in the sort key definition, in the order, they are listed.
  • A compound sort key is more efficient when query predicates use a prefix, or query’s filter applies conditions, such as filters and joins, which is a subset of the sort key columns in order.
  • Compound sort keys might speed up joins, GROUP BY and ORDER BY operations, and window functions that use PARTITION BY and ORDER BY.

Interleaved Sort Keys

  • An interleaved sort key gives equal weight to each column in the sort key, so query predicates can use any subset of the columns that make up the sort key, in any order.
  • An interleaved sort key is more efficient when multiple queries use different columns for filters.
  • Don’t use an interleaved sort key on columns with monotonically increasing attributes, such as identity columns, dates, or timestamps.
  • Use cases involve performing ad-hoc multi-dimensional analytics, which often requires pivoting, filtering, and grouping data using different columns as query dimensions.
  • Note: AWS recommends using compound sort keys for most workloads. Interleaved sort keys require more maintenance (VACUUM REINDEX) and have higher overhead.

Constraints

  • Redshift does not support Indexes.
  • Redshift supports UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints, however, they are only for informational purposes.
  • Redshift does not perform integrity checks for these constraints and is used by the query planner, as hints, in order to optimize executions.
  • Redshift does enforce NOT NULL column constraints.

Redshift Enhanced VPC Routing

  • Redshift enhanced VPC routing forces all COPY and UNLOAD traffic between the cluster and the data repositories through the VPC.
  • Without enhanced VPC routing, Redshift would route traffic through the internet, including traffic to other services within the AWS network.
  • Enhanced VPC routing is now supported for zero-ETL integration warehouses (as of September 2024), enabling secure data replication within the VPC.

Redshift Workload Management

  • Redshift workload management (WLM) enables users to flexibly manage priorities within workloads so that short, fast-running queries won’t get stuck in queues behind long-running queries.
  • Redshift provides query queues, in order to manage concurrency and resource planning. Each queue can be configured with the following parameters:
    • Slots: number of concurrent queries that can be executed in this queue.
    • Working memory: percentage of memory assigned to this queue.
    • Max. Execution Time: the amount of time a query is allowed to run before it is terminated.
  • Queries can be routed to different queues using Query Groups and User Groups.
  • As a rule of thumb, it is considered a best practice to have separate queues for long running resource-intensive queries and fast queries that don’t require big amounts of memory and CPU.
  • By default, Redshift configures one queue with a concurrency level of five, which enables up to five queries to run concurrently, plus one predefined Superuser queue, with a concurrency level of one.
  • A maximum of eight queues can be defined, with each queue configured with a maximum concurrency level of 50. The maximum total concurrency level for all user-defined queues (not including the Superuser queue) is 50.
  • Redshift WLM supports two modes – Manual and Automatic
    • Automatic WLM supports queue priorities.
    • Automatic WLM is the recommended mode and uses ML to dynamically allocate resources.
  • Query Monitoring Rules (QMR) define metrics-based performance boundaries for WLM queues and specify actions when a query exceeds those boundaries.
    • Up to 25 rules per queue, with a limit of 25 rules across all queues.
    • Each rule includes up to three conditions (predicates) and one action (log, cancel, hop, or change priority).
    • Queue-based QMR is now supported in Redshift Serverless (2026), enabling granular workload control.

Redshift Concurrency Scaling

  • Concurrency Scaling helps support thousands of concurrent users and concurrent queries, with consistently fast query performance.
  • With Concurrency scaling, Redshift automatically adds additional cluster capacity to process an increase in both read and write queries.
  • Queries run on the main cluster or a concurrency-scaling cluster returns the most current data.
  • Queries sent to the concurrency-scaling cluster can be managed by configuring WLM queues.
  • Concurrency scaling now supports more types of write queries (INSERT, CREATE TABLE AS, UPDATE, DELETE), expanding beyond read-only scaling.

Redshift Short Query Acceleration – SQA

  • Short query acceleration (SQA) prioritizes selected short-running queries ahead of longer-running queries.
  • SQA runs short-running queries in a dedicated space, so that SQA queries aren’t forced to wait in queues behind longer queries.
  • SQA only prioritizes queries that are short-running and are in a user-defined queue.

Redshift Loading Data

  • A COPY command is the most efficient way to load a table.
    • COPY command is able to read from multiple data files or multiple data streams simultaneously.
    • Redshift allocates the workload to the cluster nodes and performs the load operations in parallel, including sorting the rows and distributing data across node slices.
    • COPY command supports loading data from S3, EMR, DynamoDB, and remote hosts such as EC2 instances using SSH.
    • COPY supports decryption and can decrypt the data as it performs the load if the data is encrypted
    • COPY can then speed up the load process by uncompressing the files as they are read if the data is compressed.
    • COPY command can be used with COMPUPDATE set to ON to analyze and apply compression automatically based on sample data.
    • Optimizing storage for narrow tables (multiple rows few columns) by using Single COPY command instead of multiple COPY commands, as it would not work well due to hidden fields and compression issues.
  • Auto Copy
    • Auto-copy (GA October 2024) provides the ability to automate copy statements by tracking S3 folders and ingesting new files without customer intervention.
    • Without Auto-copy, a copy statement immediately starts the file ingestion process for existing files.
    • Auto-copy extends the existing copy command and provides the ability to
      • Automate file ingestion process by monitoring specified S3 paths for new files
      • Re-use copy configurations, reducing the need to create and run new copy statements for repetitive ingestion tasks and
      • Keep track of loaded files to avoid data duplication.
  • Streaming Ingestion
    • Redshift supports streaming ingestion from Amazon Kinesis Data Streams, Amazon MSK, Confluent Managed Cloud, and self-managed Apache Kafka clusters.
    • Streaming ingestion uses materialized views to ingest data from streams directly into Redshift tables for near real-time analytics.
    • Supports cascading refresh of nested materialized views on streaming sources (2025).
  • INSERT command
    • Clients can connect to Amazon Redshift using ODBC or JDBC and issue ‘insert’ SQL commands to insert the data.
    • INSERT command is much less efficient than using COPY as they are routed through the single leader node.

Redshift Resizing Cluster

  • Elastic resize
    • Use elastic resize to change the node type, number of nodes, or both.
    • If only the number of nodes is changed, then queries are temporarily paused and connections are held open if possible.
    • During the resize operation, the cluster is read-only.
    • Elastic resize takes 10–15 minutes.
  • Classic resize
    • Use classic resize to change the node type, number of nodes, or both.
    • During the resize operation, data is copied to a new cluster and the source cluster is read-only
    • Classic resize takes 2 hours – 2 days or longer, depending on the data’s size
  • Snapshot and restore with classic resize
    • To keep the cluster available during a classic resize, create a snapshot, make a copy of an existing cluster, then resize the new cluster.

Redshift Spectrum

  • Redshift Spectrum helps query and retrieve structured and semistructured data from files in S3 without having to load the data into Redshift tables.
  • Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Much of the processing occurs in the Redshift Spectrum layer, and most of the data remains in S3.
  • Multiple clusters can concurrently query the same dataset in S3 without the need to make copies of the data for each cluster.
  • Redshift Spectrum resides on dedicated Redshift servers that are independent of the existing cluster.
  • Redshift Spectrum pushes many compute-intensive tasks, such as predicate filtering and aggregation, down to the Redshift Spectrum layer.
  • Redshift Spectrum also scales automatically, based on the demands of the queries, and can potentially use thousands of instances to take advantage of massively parallel processing.
  • Supports external data catalog using Glue, Athena, or Hive metastore
  • Supports querying Apache Iceberg tables and S3 Tables (purpose-built tabular storage with Iceberg support, launched at re:Invent 2024).
  • Iceberg query performance has improved up to 3x year-over-year through optimizations including Glue Data Catalog statistics, dynamic partition elimination, and parallel manifest file processing.
  • Supports incremental refresh for materialized views on data lake tables, eliminating the need for full recomputation when new data arrives.
  • Redshift cluster and the S3 bucket must be in the same AWS Region.
  • Redshift Spectrum external tables are read-only. You can’t COPY or INSERT to an external table.

Redshift Federated Query

  • Redshift Federated Query feature allows querying and analyzing data across operational databases, warehouses, and lakes.
  • Redshift Federated Query allows integrating queries on live data in RDS for PostgreSQL, Aurora PostgreSQL, RDS for MySQL, and Aurora MySQL with queries across Redshift and S3.
  • Supports both PostgreSQL and MySQL engines for federated access.

Zero-ETL Integrations

  • Zero-ETL integrations facilitate point-to-point data movement from operational databases to Redshift without the need to build and manage custom data pipelines.
  • Provides near real-time analytics on transactional data within seconds of it being written to the source.
  • Supported Sources:
    • Amazon Aurora MySQL-Compatible Edition (first zero-ETL source)
    • Amazon Aurora PostgreSQL-Compatible Edition (GA October 2024)
    • Amazon RDS for MySQL (GA September 2024)
    • Amazon DynamoDB (GA October 2024)
    • Self-managed databases (MySQL, PostgreSQL) via CDC replication
    • Enterprise Applications (re:Invent 2024): Salesforce, Zendesk, ServiceNow, SAP, Facebook Ads, Instagram Ads, Pardot, and Zoho CRM
  • Key Features:
    • Data filtering to selectively extract tables and schemas using regular expressions
    • Support for incremental and auto-refresh materialized views on replicated data
    • Configurable change data capture (CDC) refresh rates
    • Cross-account integrations within the same region
    • Supports both Redshift Serverless workgroups and provisioned clusters using RA3 instance types
    • Compatible with enhanced VPC routing and Multi-AZ deployments

Redshift Data Sharing

  • Redshift Data Sharing allows securely sharing live, transactionally consistent data between Redshift clusters without physically copying or moving data.
  • Supports cross-account and cross-Region data sharing.
  • For cross-account data sharing, both the producer and consumer cluster must be encrypted.
  • Producer clusters create datashares; consumer clusters associate with them via a two-way handshake for cross-account sharing.
  • Multi-data warehouse writes through data sharing (GA November 2024) allows writing to shared Redshift databases from multiple data warehouses, enabling distributed ETL workloads.
  • Supports data sharing with data lake tables, enabling unified access across warehouses and data lakes.
  • Works with both RA3 provisioned clusters and Redshift Serverless.
  • Integrated with AWS Lake Formation for fine-grained access control on shared data.

Redshift Serverless

  • Redshift Serverless automatically provisions and scales data warehouse capacity to deliver fast performance without managing infrastructure.
  • Pay only for compute capacity when the data warehouse is active, measured in Redshift Processing Units (RPUs).
  • Each RPU provides 16 GB of memory; base capacity ranges from 4 RPUs to 1024 RPUs.
  • Starting capacity as low as 4 RPUs ($1.50/hour), making it cost-effective for smaller workloads.
  • AI-driven scaling and optimization (GA October 2024, default for new workgroups April 2026):
    • Automatically learns workload patterns and adjusts compute resources based on query complexity, data volume, and scan size.
    • Offers a price-performance slider with five profiles from “Optimized for Cost” to “Optimized for Performance.”
    • Deploys automatic optimizations including materialized views and table design optimization.
    • Up to 10x price-performance improvement for variable workloads.
  • Serverless Reservations (April 2025): Commit to specific RPUs for a one-year term with 20% (no-upfront) or 24% (all-upfront) discount off on-demand rates.
  • Supports all Redshift features including data sharing, streaming ingestion, federated queries, and zero-ETL integrations.

Redshift Multi-AZ Deployments

  • Multi-AZ deployments (GA November 2023 for RA3 clusters) support running the data warehouse across multiple Availability Zones simultaneously.
  • Provides high availability by continuing operations during unforeseen failure scenarios in a single AZ.
  • Available for RA3 provisioned clusters in most commercial regions and GovCloud (US).
  • Compatible with zero-ETL integrations for highly available near real-time analytics.

Redshift Node Types

  • RG Instances (GA May 2026) – Latest generation powered by AWS Graviton processors
    • Up to 2.2x faster for data warehouse workloads and 2.4x faster for data lake workloads compared to RA3.
    • 30% lower price per vCPU compared to RA3 instances.
    • Includes a custom-built vectorized data lake query engine that processes Apache Iceberg and Parquet data on cluster nodes.
    • Available in rg.xlarge and rg.4xlarge node types.
    • Recommended upgrade path from RA3 instances.
  • RA3 Instances – Managed storage with separate compute and storage scaling
    • Available in ra3.xlplus, ra3.4xlarge, ra3.16xlarge, and ra3.large sizes.
    • Managed storage automatically tiers data between high-performance SSD and S3.
    • RA3.large (GA October 2024) offers a cost-effective migration path from DC2.large.
  • DC2 Instances – Dense compute with local SSD storage (legacy, migration to RA3/RG recommended)

Redshift Generative AI Integration

  • Amazon Q generative SQL (GA September 2024) in Redshift Query Editor allows users to express queries in natural language and receive SQL code recommendations.
  • Amazon Bedrock Integration (October 2024) enables invoking large language models (LLMs) from SQL commands for tasks like text generation, summarization, sentiment analysis, and language translation.
  • Amazon Bedrock Knowledge Bases supports natural language querying to retrieve structured data from Redshift warehouses, automatically translating questions into SQL.

Redshift SageMaker Lakehouse

  • Amazon SageMaker Lakehouse (re:Invent 2024) unifies data across S3 data lakes and Redshift warehouses.
  • Provides access via Apache Iceberg open standards for use with any Iceberg-compatible engine.
  • Existing Redshift data warehouses can be published to SageMaker Lakehouse, opening warehouse data with Iceberg REST API.
  • Supports creating new data lake tables using Redshift Managed Storage (RMS) as native storage.
  • Offers integrated access controls and fine-grained permissions through Lake Formation across all engines.

Redshift Behavior Changes and Deprecations

  • Python UDFs End of Support (June 30, 2026)
    • Creation of new Python UDFs blocked since October 30, 2025.
    • Existing Python UDFs will stop functioning after June 30, 2026.
    • Migration: Use Lambda UDFs which provide better integration, flexibility, scalability, and security.
  • ODBC 1.x Driver End of Support: September 30, 2026. Migrate to ODBC 2.x driver.
  • Minimum TLS Version: TLS 1.2 minimum required starting January 31, 2026.
  • Materialized View Auto-REFRESH Behavior Change: After February 27, 2026, auto-refresh respects workload priorities.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A Redshift data warehouse has different user teams that need to query the same table with very different query types. These user teams are experiencing poor performance. Which action improves performance for the user teams in this situation?
    1. Create custom table views.
    2. Add interleaved sort keys per team.
    3. Maintain team-specific copies of the table.
    4. Add support for workload management queue hopping.
  2. A company needs to replicate data from their Aurora PostgreSQL database to Redshift for near real-time analytics without building custom ETL pipelines. Which approach requires the LEAST operational overhead?
    1. Set up AWS Glue jobs to periodically extract and load data
    2. Use Amazon Kinesis Data Firehose to stream changes
    3. Configure Aurora PostgreSQL zero-ETL integration with Redshift
    4. Create Lambda functions triggered by DynamoDB Streams
  3. An organization uses Redshift Serverless and wants to optimize for cost during off-peak hours while maintaining performance during peak business hours. Which feature best addresses this requirement?
    1. Manual RPU scaling with CloudWatch alarms
    2. Concurrency Scaling with WLM queue configuration
    3. AI-driven scaling and optimization with cost-optimized profile
    4. Scheduled pause and resume of the Serverless workgroup
  4. A company wants to share live data from their Redshift cluster with a partner organization’s Redshift cluster in a different AWS account without copying data. What is the recommended approach?
    1. Use Redshift Spectrum with cross-account S3 access
    2. Set up AWS Data Exchange for data delivery
    3. Configure cross-account Redshift Data Sharing
    4. Use AWS Glue ETL to replicate data to the partner account
  5. A team wants to perform sentiment analysis on customer feedback stored in Redshift without moving data to a separate ML service. Which Redshift feature enables this?
    1. Redshift ML with SageMaker Autopilot
    2. Export to S3 and use Comprehend
    3. Amazon Redshift integration with Amazon Bedrock using SQL commands
    4. Redshift federated query to an NLP endpoint
  6. A company is migrating from RA3 instances and wants better price-performance for both data warehouse and data lake workloads. Which instance type should they consider?
    1. DC2.8xlarge for compute-intensive workloads
    2. RA3.16xlarge with AQUA enabled
    3. RG instances powered by AWS Graviton
    4. Redshift Serverless with 1024 RPU base capacity

Amazon Athena

Athena

Amazon Athena

  • Amazon Athena is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats.
  • provides a simplified, flexible way to analyze petabytes of data in an S3 data lake and 30+ data sources, including on-premises data sources or other cloud systems using SQL or Python without loading the data.
  • is built on open-source Trino and Presto engines and Apache Spark frameworks, with no provisioning or configuration effort required.
  • supports Athena for Apache Spark, enabling serverless Spark applications for advanced analytics and machine learning workloads.
  • features Athena SQL v3 engine with improved performance, enhanced SQL capabilities, and better cost optimization.
  • offers Provisioned Capacity mode for predictable query performance and consistent workloads alongside the traditional on-demand pricing.
  • is highly available and runs queries using compute resources across multiple facilities, automatically routing queries appropriately if a particular facility is unreachable
  • can process unstructured, semi-structured, and structured datasets.
  • integrates with QuickSight for visualizing the data or creating dashboards.
  • supports various standard data formats, including CSV, TSV, JSON, ORC, Avro, and Parquet.
  • supports modern table formats including Apache Iceberg, Delta Lake, and Apache Hudi for ACID transactions and time travel queries.
  • supports compressed data in Snappy, Zlib, LZO, and GZIP formats. You can improve performance and reduce costs by compressing, partitioning, and using columnar formats.
  • enables cross-region querying to analyze data stored across multiple AWS regions from a single query interface.
  • can handle complex analysis, including large joins, window functions, and arrays
  • uses a managed Glue Data Catalog to store information and schemas about the databases and tables that you create for the data stored in S3
  • uses schema-on-read technology, which means that the table definitions are applied to the data in S3 when queries are being applied. There’s no data loading or transformation required. Table definitions and schema can be deleted without impacting the underlying data stored in S3.
  • supports fine-grained access control with AWS Lake Formation which allows for centrally managing permissions and access control for data catalog resources in the S3 data lake.
  • integrates with Amazon DataZone for comprehensive data governance, cataloging, and discovery across the organization.
  • supports AWS Clean Rooms integration for privacy-preserving collaborative analytics without sharing raw data.
Athena
Source: Amazon

Athena Workgroups

  • Athena workgroups can be used to separate users, teams, applications, or workloads, to set limits on amount of data each query or the entire workgroup can process, and to track costs.
  • Resource-level identity-based policies can be used to control access to a specific workgroup.
  • Workgroups help view query-related metrics in CloudWatch, control costs by configuring limits on the amount of data scanned, create thresholds, and trigger actions, such as SNS, when these thresholds are breached.
  • Workgroups now support query result reuse and caching to reduce costs and improve performance for repeated queries.
  • Enhanced cost controls with per-query data scanning limits and automatic query termination for runaway queries.
  • Workgroup-level encryption settings and fine-grained access controls for improved security governance.
  • Workgroups can now be configured with Provisioned Capacity for consistent performance and predictable costs.
  • Workgroups integrate with IAM, CloudWatch, Simple Notification Service, and AWS Cost and Usage Reports as follows:
    • IAM identity-based policies with resource-level permissions control who can run queries in a workgroup.
    • Athena publishes the workgroup query metrics to CloudWatch if you enable query metrics.
    • SNS topics can be created that issue alarms to specified workgroup users when data usage controls for queries in a workgroup exceed the established thresholds.
    • Workgroup tag can be configured as a cost allocation tag in the Billing and Cost Management console and the costs associated with running queries in that workgroup appear in the Cost and Usage Reports with that cost allocation tag.

Athena Best Practices

  • Partition the data
    • which helps keep the related data together based on column values such as date, country, and region.
    • Athena supports Hive partitioning and advanced partition projection with custom expressions.
    • Use dynamic partition pruning for improved query performance with complex partition schemes.
    • Consider partition evolution strategies when using modern table formats like Iceberg.
    • Pick partition keys that will support the queries
    • Partition projection is an Athena feature that stores partition information not in the Glue Data Catalog but as rules in the properties of the table in AWS Glue.
  • Compression
    • Compressing the data can speed up queries significantly, as long as the files are either of an optimal size or the files are splittable.
    • Smaller data sizes reduce the data scanned from S3, resulting in lower costs of running queries and reduced network traffic.
  • Optimize file sizes
    • Queries run more efficiently when data scanning can be parallelized and when blocks of data can be read sequentially.
  • Modern file formats and optimization
    • Columnar storage formats like ORC and Parquet remain optimal for analytical workloads.
    • Apache Iceberg tables provide ACID transactions, schema evolution, and time travel capabilities.
    • Delta Lake integration enables reliable data lakes with ACID guarantees.
    • Use Z-ordering and data clustering techniques for improved query performance.
    • A splittable file can be read in parallel by the execution engine in Athena, whereas an unsplittable file can’t be read in parallel.
  • Query optimization and performance
    • Leverage query result caching and reuse for frequently executed queries.
    • Use EXPLAIN and ANALYZE statements to understand query execution plans.
    • Implement query performance monitoring with CloudWatch Insights.
    • Consider Provisioned Capacity for consistent performance requirements.
    • Optimize queries by using appropriate WHERE clauses and avoiding SELECT * statements.

Security and Governance

  • Enhanced Lake Formation Integration: Row-level and cell-level security controls for fine-grained data access.
  • Data Masking and Anonymization: Built-in functions for protecting sensitive data during queries.
  • Cross-Account Access: Secure data sharing across AWS accounts with resource-based policies.
  • Audit and Compliance: Comprehensive query logging and data lineage tracking through AWS CloudTrail and DataZone.
  • Encryption Enhancements: Support for customer-managed KMS keys and field-level encryption.
  • Identity-Based Access Control: Integration with AWS IAM for fine-grained permissions and role-based access.
  • VPC Endpoints: Private connectivity to Athena without internet gateway requirements.

Advanced Use Cases and Patterns

  • Machine Learning Integration: Query results can be directly used with Amazon SageMaker for ML model training and inference.
  • Real-time Analytics: Near real-time querying of streaming data from Kinesis Data Firehose with minimal latency.
  • Federated Queries: Query data across multiple sources including RDS, Redshift, and on-premises databases using Athena Federated Query.
  • Data Mesh Architecture: Athena serves as a query engine for decentralized data architectures with domain-specific data products.
  • Serverless ETL Pipelines: Combine Athena with AWS Step Functions and Lambda for fully serverless data processing workflows.
  • Cost Optimization Patterns: Implement intelligent tiering and lifecycle policies based on query patterns and data access frequency.
  • Multi-Account Analytics: Centralized analytics across multiple AWS accounts using cross-account access patterns.
  • Hybrid Cloud Analytics: Query on-premises data alongside cloud data using federated query capabilities.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A SysOps administrator is storing access logs in Amazon S3 and wants to use standard SQL to query data and generate a report without having to manage infrastructure. Which AWS service will allow the SysOps administrator to accomplish this task?
    1. Amazon Inspector
    2. Amazon CloudWatch
    3. Amazon Athena
    4. Amazon RDS
  2. A Solutions Architect must design a storage solution for incoming billing reports in CSV format. The data does not need to be scanned frequently and is discarded after 30 days. Which service will be MOST cost-effective in meeting these requirements?
    1. Import the logs into an RDS MySQL instance
    2. Use AWS Data pipeline to import the logs into a DynamoDB table
    3. Write the files to an S3 bucket and use Amazon Athena to query the data
    4. Import the logs to an Amazon Redshift cluster
  3. A data engineering team needs to implement ACID transactions and time travel queries on their data lake. They want to maintain compatibility with existing Athena queries while adding these capabilities. Which solution should they choose?
    1. Migrate to Amazon Redshift Spectrum
    2. Use Amazon EMR with Apache Hive
    3. Implement Apache Iceberg tables with Athena
    4. Use AWS Glue with Delta Lake format
  4. An organization wants to share analytical insights with external partners without exposing raw data. They need to perform collaborative analytics while maintaining data privacy. Which AWS service integration with Athena would be most appropriate?
    1. AWS Lake Formation with external account access
    2. AWS Clean Rooms with Athena integration
    3. Amazon QuickSight with embedded dashboards
    4. AWS DataSync with cross-account replication
  5. A company wants to optimize costs for their Athena workloads that have predictable query patterns and consistent performance requirements. Which Athena feature should they implement?
    1. Athena Federated Query
    2. Athena for Apache Spark
    3. Athena Provisioned Capacity
    4. Athena Query Result Reuse

References

AWS Certified Machine Learning -Specialty (MLS-C01) Exam Learning Path

AWS Machine Learning - Specialty Certification

⚠️ AWS Certified Machine Learning – Specialty (MLS-C01) RETIRED

The MLS-C01 exam was retired on March 31, 2026. The last day to take this exam has passed. Certification holders will still have an active certification for 3 years from the date it was earned.

This content is maintained for historical reference and foundational ML knowledge.

Replacement Certifications:

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Learning Path

  • Finally Re-certified the updated AWS Certified Machine Learning – Specialty (MLS-C01) certification exam after 3 months of preparation.
  • In terms of the difficulty level of all professional and specialty certifications, I find this to be the toughest, partly because I am still diving deep into machine learning and relearned everything from basics for this certification.
  • Machine Learning is a vast specialization in itself and with AWS services, there is a lot to cover and know for the exam. This is the only exam, where the majority of the focus is on concepts outside of AWS i.e. pure machine learning. It also includes AWS Machine Learning and Data Engineering services.

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Content

  • AWS Certified Machine Learning – Specialty (MLS-C01) exam validates
    • Select and justify the appropriate ML approach for a given business problem.
    • Identify appropriate AWS services to implement ML solutions.
    • Design and implement scalable, cost-optimized, reliable, and secure ML solutions.

Refer AWS Certified Machine Learning – Specialty Exam Guide for details

AWS Certified Machine Learning – Specialty Domains

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Summary

  • Specialty exams are tough, lengthy, and tiresome. Most of the questions and answers options have a lot of prose and a lot of reading that needs to be done, so be sure you are prepared and manage your time well.
  • MLS-C01 exam has 65 questions to be solved in 170 minutes which gives you roughly 2 1/2 minutes to attempt each question.
  • MLS-C01 exam includes two types of questions, multiple-choice and multiple-response.
  • MLS-C01 has a scaled score between 100 and 1,000. The scaled score needed to pass the exam is 750.
  • Specialty exams currently cost $ 300 + tax.
  • You can get an additional 30 minutes if English is your second language by requesting Exam Accommodations. It might not be needed for Associate exams but is helpful for Professional and Specialty ones.
  • As always, mark the questions for review, move on, and come back to them after you are done with all.
  • As always, having a rough architecture or mental picture of the setup helps focus on the areas that you need to improve. Trust me, you will be able to eliminate 2 answers for sure and then need to focus on only the other two. Read the other 2 answers to check the difference area and that would help you reach the right answer or at least have a 50% chance of getting it right.
  • AWS exams can be taken either remotely or online, I prefer to take them online as it provides a lot of flexibility. Just make sure you have a proper place to take the exam with no disturbance and nothing around you.
  • Also, if you are taking the AWS Online exam for the first time try to join at least 30 minutes before the actual time as I have had issues with both PSI and Pearson with long wait times.

AWS ML/AI Certification Replacements (2024-2026)

AWS restructured its AI/ML certification portfolio in 2024-2026. The MLS-C01 was retired and replaced by three new certifications:

CertificationLevelCodeFocus
ML Engineer – AssociateAssociateMLA-C01Build, deploy, operationalize ML pipelines on AWS
AI PractitionerFoundationalAIF-C01AI/ML/GenAI concepts, responsible AI, AWS AI services
Generative AI Developer – ProfessionalProfessionalAIP-C01Production GenAI apps with Amazon Bedrock, RAG, agents

Note: The ML knowledge from MLS-C01 remains highly relevant for MLA-C01 and AIP-C01 exams. The core ML concepts (algorithms, evaluation, feature engineering) covered below are foundational for all three new certifications.

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Resources

AWS Certified Machine Learning – Specialty (MLS-C01) Exam Topics

  • AWS Certified Machine Learning – Specialty exam covers a lot of Machine Learning concepts. It digs deep into Machine learning concepts, most of which are not related to AWS.
  • AWS Certified Machine Learning – Speciality exam covers the E2E Machine Learning lifecycle, right from data collection, transformation, making it usable and efficient for Machine Learning, pre-processing data for Machine Learning, training and validation, and implementation.

Machine Learning Concepts

  • Exploratory Data Analysis
    • Feature selection and Engineering
      • remove features that are not related to training
      • remove features that have the same values, very low correlation, very little variance, or a lot of missing values
      • Apply techniques like Principal Component Analysis (PCA) for dimensionality reduction i.e. reduce the number of features.
      • Apply techniques such as One-hot encoding and label encoding to help convert strings to numeric values, which are easier to process.
      • Apply Normalization i.e. values between 0 and 1 to handle data with large variance.
      • Apply feature engineering for feature reduction e.g. using a single height/weight feature instead of both features.
    • Handle Missing data
      • remove the feature or rows with missing data
      • impute using Mean/Median values – valid only for Numeric values and not categorical features also does not factor correlation between features
      • impute using k-NN, Multivariate Imputation by Chained Equation (MICE), Deep Learning – more accurate and helps factors correlation between features
    • Handle unbalanced data
      • Source more data
      • Oversample minority or Undersample majority
      • Data augmentation using techniques like Synthetic Minority Oversampling Technique (SMOTE).
  • Modeling
    • Know about Algorithms – Supervised, Unsupervised and Reinforcement and which algorithm is best suitable based on the available data either labelled or unlabelled.
      • Supervised learning trains on labeled data e.g. Linear regression. Logistic regression, Decision trees, Random Forests
      • Unsupervised learning trains on unlabelled data e.g. PCA, SVD, K-means
      • Reinforcement learning trained based on actions and rewards e.g. Q-Learning
    • Hyperparameters
      • are parameters exposed by machine learning algorithms that control how the underlying algorithm operates and their values affect the quality of the trained models
      • some of the common hyperparameters are learning rate, batch, epoch (hint: If the learning rate is too large, the minimum slope might be missed and the graph would oscillate. If the learning rate is too small, it requires too many steps which would take the process longer and is less efficient)
  • Evaluation
    • Know difference in evaluating model accuracy
      • Use Area Under the (Receiver Operating Characteristic) Curve (AUC) for Binary classification
      • Use root mean square error (RMSE) metric for regression
    • Understand Confusion matrix
      • A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative is an outcome where the model correctly predicts the negative class.
      • A false positive is an outcome where the model incorrectly predicts the positive class. A false negative is an outcome where the model incorrectly predicts the negative class.
      • Recall or Sensitivity or TPR (True Positive Rate): Number of items correctly identified as positive out of total true positives- TP/(TP+FN) (hint: use this for cases like fraud detection, cost of marking non fraud as frauds is lower than marking fraud as non-frauds)
      • Specificity or TNR (True Negative Rate): Number of items correctly identified as negative out of total negatives- TN/(TN+FP) (hint: use this for cases like videos for kids, the cost of dropping few valid videos is lower than showing few bad ones)
    • Handle Overfitting problems
      • Simplify the model, by reducing the number of layers
      • Early Stopping – form of regularization while training a model with an iterative method, such as gradient descent
      • Data Augmentation
      • Regularization – technique to reduce the complexity of the model
      • Dropout is a regularization technique that prevents overfitting
      • Never train on test data

Machine Learning Services

  • Amazon SageMaker AI (renamed from Amazon SageMaker in Dec 2024)
    • Note: At re:Invent 2024, AWS introduced the next generation of Amazon SageMaker — a unified platform for data, analytics, and AI. The original SageMaker was renamed to Amazon SageMaker AI, which retains focus on building, training, and deploying ML models. The new Amazon SageMaker Unified Studio (GA March 2025) provides a single workspace for data engineering, analytics, ML, and GenAI development.
    • supports both File mode, Pipe mode, and Fast File mode
      • File mode loads all of the data from S3 to the training instance volumes VS Pipe mode streams data directly from S3
      • File mode needs disk space to store both the final model artifacts and the full training dataset. VS Pipe mode which helps reduce the required size for EBS volumes.
      • Fast File mode combines the ease of use of the existing File Mode with the performance of Pipe Mode.
    • Using RecordIO format allows algorithms to take advantage of Pipe mode when training the algorithms that support it.
    • supports Model tracking capability to manage up to thousands of machine learning model experiments
    • supports automatic scaling for production variants. Automatic scaling dynamically adjusts the number of instances provisioned for a production variant in response to changes in your workload
    • provides pre-built Docker images for its built-in algorithms and the supported deep learning frameworks used for training & inference
    • SageMaker Automatic Model Tuning
      • is the process of finding a set of hyperparameters for an algorithm that can yield an optimal model.
      • Best practices
        • limit the search to a smaller number as the difficulty of a hyperparameter tuning job depends primarily on the number of hyperparameters that Amazon SageMaker has to search
        • DO NOT specify a very large range to cover every possible value for a hyperparameter as it affects the success of hyperparameter optimization.
        • log-scaled hyperparameter can be converted to improve hyperparameter optimization.
        • running one training job at a time achieves the best results with the least amount of compute time.
        • Design distributed training jobs so that you get they report the objective metric that you want.
    • know how to take advantage of multiple GPUs (hint: increase learning rate and batch size w.r.t to the increase in GPUs)
    • AWS Inferentia and Inferentia2 chips provide high-performance ML inference acceleration. (Note: Elastic Inference was deprecated in April 2023; use Inferentia-based instances or SageMaker inference components instead.)
    • SageMaker AI Inference options:
      • Real-time inference is ideal for online inferences that have low latency or high throughput requirements.
      • Serverless Inference is ideal for intermittent or unpredictable traffic patterns as it manages all of the underlying infrastructure with no need to manage instances or scaling policies.
      • Batch Transform is suitable for offline processing when large amounts of data are available upfront and you don’t need a persistent endpoint.
      • Asynchronous Inference is ideal when you want to queue requests and have large payloads with long processing times.
    • SageMaker Model deployment allows deploying multiple variants of a model to the same SageMaker endpoint to test new models without impacting the user experience
      • Production Variants
        • supports A/B or Canary testing where you can allocate a portion of the inference requests to each variant.
        • helps compare production variants’ performance relative to each other.
      • Shadow Variants
        • replicates a portion of the inference requests that go to the production variant to the shadow variant.
        • logs the responses of the shadow variant for comparison and not returned to the caller.
        • helps test the performance of the shadow variant without exposing the caller to the response produced by the shadow variant.
    • SageMaker Managed Spot training can help use spot instances to save cost and with Checkpointing feature can save the state of ML models during training
    • SageMaker Feature Store – helps to create, share, and manage features for ML development. A centralized store for features and associated metadata so features can be easily discovered and reused.
    • SageMaker Debugger provides tools to debug training jobs and resolve problems such as overfitting, saturated activation functions, and vanishing gradients.
    • SageMaker Model Monitor monitors the quality of machine learning models in production and can help set alerts for deviations in model quality.
    • SageMaker Data Wrangler reduces the time to aggregate and prepare tabular and image data for ML from weeks to minutes.
    • SageMaker Experiments lets you create, manage, analyze, and compare machine learning experiments.
    • SageMaker Clarify helps detect potential bias and explain model predictions.
    • SageMaker Model Governance provides systematic visibility into ML model development, validation, and usage.
    • SageMaker Autopilot is an automated machine learning (AutoML) feature that automates the end-to-end process of building, training, tuning, and deploying ML models.
    • SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge.
    • SageMaker HyperPod (launched 2023) provides resilient infrastructure for large-scale foundation model training with automatic fault recovery, reducing training costs by up to 40%.
    • SageMaker JumpStart is an ML hub for evaluating, comparing, and deploying foundation models (FMs) from leading providers including pre-trained models, solution templates, and example notebooks.
    • SageMaker Canvas provides a no-code visual interface for generating accurate ML predictions. It now also supports time-series forecasting (replacing Amazon Forecast for new customers).
    • SageMaker API and SageMaker Runtime support VPC interface endpoints powered by AWS PrivateLink.
    • Algorithms:
      • Blazing text provides Word2vec and text classification algorithms
      • DeepAR provides supervised learning algorithm for forecasting scalar (one-dimensional) time series (hint: train for new products based on existing products sales data).
      • Factorization machines provide supervised classification and regression tasks, helps capture interactions between features within high dimensional sparse datasets economically.
      • Image classification algorithm is a supervised learning algorithm that supports multi-label classification.
      • IP Insights is an unsupervised learning algorithm that learns the usage patterns for IPv4 addresses.
      • K-means is an unsupervised learning algorithm for clustering.
      • k-nearest neighbors (k-NN) algorithm is an index-based algorithm for classification or regression.
      • Latent Dirichlet Allocation (LDA) is an unsupervised algorithm to identify topics shared by documents within a text corpus.
      • Neural Topic Model (NTM) is an unsupervised algorithm to organize a corpus of documents into topics based on statistical distribution.
      • Linear models are supervised learning algorithms for classification or regression problems.
      • Object Detection algorithm detects and classifies objects in images using a single deep neural network.
      • Principal Component Analysis (PCA) is an unsupervised algorithm for dimensionality reduction.
      • Random Cut Forest (RCF) is an unsupervised algorithm for anomaly detection.
      • Sequence to Sequence is a supervised algorithm where input is a sequence of tokens and output is another sequence (hint: text summarization).
      • XGBoost is an optimized distributed gradient boosting algorithm for classification and regression.
  • SageMaker Ground Truth
    • provides automated data labeling using machine learning
    • helps build highly accurate training datasets quickly using Amazon Mechanical Turk
    • provides annotation consolidation to improve accuracy of data object labels
    • automated data labeling uses ML to label portions of data automatically without human workers

Machine Learning & AI Managed Services

  • Amazon Bedrock (launched 2023, major updates 2024-2026)
    • Fully managed service offering access to hundreds of foundation models (FMs) from leading AI companies (Anthropic, Meta, Mistral, Amazon, etc.) via a unified API.
    • Key capabilities: model evaluation, guardrails for responsible AI, knowledge bases (RAG), agents for task automation, fine-tuning, and model customization.
    • Bedrock AgentCore (2026) provides production-grade runtime for AI agents with orchestration and scaling.
    • Critical for the new AIP-C01 (Generative AI Developer) certification.
  • Amazon Q Business (GA April 2024)
    • Generative AI-powered assistant for finding information, gaining insight, and taking action using enterprise data.
    • Connects to 40+ enterprise data sources while respecting existing access controls.
    • Evolved from and supersedes many Amazon Kendra use cases for intelligent search.
  • Amazon Kendra – intelligent search service using NLP and ML algorithms. Still available but increasingly complemented/replaced by Amazon Q Business for enterprise search use cases.
  • Comprehend – natural language processing (NLP) service to find insights and relationships in text. Identifies language, extracts key phrases, entities, sentiment analysis, and topic modeling.
  • Lex – provides conversational interfaces using voice and text for building chatbots.
  • Polly – text-to-speech. Supports SSML tags (prosody) for adjusting speech rate, pitch, or volume. Supports pronunciation lexicons.
  • Rekognition – analyze images and video. Identifies objects, people, text, scenes, and activities.
  • Translate – natural and fluent language translation.
  • Transcribe – automatic speech recognition (ASR), speech-to-text.
  • Textract – extracts text, handwriting, and data from scanned documents using ML.
  • Augmented AI (Amazon A2I) – ML service for building human review workflows.
  • Amazon Forecast⚠️ No longer available to new customers (closed July 29, 2024). Existing customers can continue using the service. AWS recommends migrating to Amazon SageMaker Canvas for time-series forecasting.
  • AWS Panorama⚠️ End of Support: May 31, 2026. AWS Panorama brought computer vision to on-premises camera networks. After EOL, the service and appliances will no longer function. Consider alternatives like Amazon Rekognition, SageMaker Edge, or partner solutions.

Analytics

  • Make sure you know and understand data engineering concepts mainly in terms of data capture, migration, transformation, and storage.
  • Kinesis
    • Understand Kinesis Data Streams and Amazon Data Firehose (renamed from Kinesis Data Firehose in Feb 2024) in depth
    • Amazon Managed Service for Apache Flink (renamed from Kinesis Data Analytics in Aug 2023) can process and analyze streaming data using Apache Flink and integrates with Data Streams and Data Firehose. Note: Kinesis Data Analytics for SQL was discontinued January 27, 2026.
    • Know Kinesis Data Streams vs Data Firehose
      • Know Kinesis Data Streams is open ended on both producer and consumer. It supports KCL and works with Spark.
      • Know Data Firehose is open ended for producer only. Data is stored in S3, Redshift, and OpenSearch.
      • Data Firehose works in batches with minimum 60secs interval.
      • Amazon Data Firehose supports data transformation and record format conversion using Lambda function (hint: can be used for transforming csv or JSON into parquet)
    • Kinesis Video Streams provides a fully managed service to ingest, index, store, and stream live video.
  • OpenSearch (formerly ElasticSearch) is a search service that supports indexing, full-text search, faceting, etc.
  • AWS Data Pipeline⚠️ Closed to new customers (July 25, 2024). Existing customers can continue to use the service. AWS recommends migrating to AWS Glue, Step Functions, or Amazon MWAA (Managed Apache Airflow).
  • AWS Glue is a fully managed ETL (extract, transform, and load) service
    • helps setup, orchestrate, and monitor complex data flows.
    • Glue Data Catalog is a central repository to store structural and operational metadata for all data assets.
    • Glue Crawler connects to a data store, extracts the schema, and populates the Glue Data Catalog with metadata.
    • Glue DataBrew is a visual data preparation tool that enables users to clean and normalize data without writing any code.
  • DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between storage systems and services.

Security, Identity & Compliance

  • Security is covered very lightly. (hint: SageMaker AI can read data from KMS-encrypted S3. Make sure the KMS key policies include the role attached with SageMaker)

Management & Governance Tools

  • Understand AWS CloudWatch for Logs and Metrics. (hint: SageMaker AI is integrated with CloudWatch and logs and metrics are all stored in it)

Storage

  • Understand Data Storage Options – Know patterns for S3 vs RDS vs DynamoDB vs Redshift. (hint: S3 is, by default, the data storage option or Big Data storage, and look for it in the answer.)

Whitepapers and articles

AWS SageMaker Built-in Algorithms Summary

SageMaker Built-in Algorothms

SageMaker AI Built-in Algorithms

📌 Naming Update (December 2024): On December 3, 2024, Amazon SageMaker was renamed to Amazon SageMaker AI. The “SageMaker” brand now refers to the next-generation unified platform for data, analytics, and AI. All built-in algorithms remain available under SageMaker AI.

  • SageMaker AI provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and ML practitioners get started on training and deploying ML models quickly.
  • SageMaker AI also provides SageMaker JumpStart with pre-trained foundation models (including LLMs like LLaMA, BLOOM, Falcon) for generative AI tasks such as text generation, summarization, and question answering.

SageMaker AI Built-in Algorithms

Tabular Data – Classification & Regression

AutoGluon-Tabular

  • is an open-source AutoML framework that succeeds by ensembling models and stacking them in multiple layers.
  • automatically performs data processing, model selection, and hyperparameter tuning.
  • used for both classification and regression tasks on tabular data.
  • supports CPU and GPU (single instance only) training.

CatBoost

  • is an implementation of the gradient-boosted trees algorithm that introduces ordered boosting and an innovative algorithm for processing categorical features.
  • used for both classification and regression tasks.
  • handles categorical features natively without requiring manual encoding.
  • supports CPU (single instance only) training.

LightGBM

  • is an implementation of the gradient-boosted trees algorithm that adds two novel techniques for improved efficiency and scalability.
  • uses Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB).
  • used for both classification and regression tasks.
  • supports CPU (single instance only) training.

TabTransformer

  • is a novel deep tabular data modeling architecture built on self-attention-based Transformers.
  • converts categorical features into contextual embeddings using Transformer layers.
  • used for both classification and regression tasks.
  • supports CPU and GPU (single instance only) training.

XGBoost (eXtreme Gradient Boosting)

  • is a popular and efficient open-source implementation of the gradient boosted trees algorithm.
  • Gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler, weaker models.
  • supports both classification and regression tasks.
  • supports distributed training across multiple instances.

Linear Learner

  • are supervised learning algorithms used for solving either classification or regression problems.
  • learns a linear function for regression or a linear threshold function for classification.
  • supports distributed training.

K-nearest neighbors (k-NN) algorithm

  • is an index-based algorithm.
  • uses a non-parametric method for classification or regression.
  • For classification problems, the algorithm queries the k points that are closest to the sample point and returns the most frequently used label of their class as the predicted label.
  • For regression problems, the algorithm queries the k closest points to the sample point and returns the average of their feature values as the predicted value.

Factorization Machine

  • is a general-purpose supervised learning algorithm used for both classification and regression tasks.
  • extension of a linear model designed to capture interactions between features within high dimensional sparse datasets economically, such as click prediction and item recommendation.

Text-based

BlazingText algorithm

  • provides highly optimized implementations of the Word2vec and text classification algorithms.
  • Word2vec algorithm
    • useful for many downstream natural language processing (NLP) tasks, such as sentiment analysis, named entity recognition, machine translation, etc.
    • maps words to high-quality distributed vectors, whose representation is called word embeddings
    • word embeddings capture the semantic relationships between words.
  • Text classification
    • is an important task for applications performing web searches, information retrieval, ranking, and document classification
  • provides the Skip-gram and continuous bag-of-words (CBOW) training architectures

Text Classification – TensorFlow

  • is a supervised learning algorithm that supports transfer learning with many pretrained models from the TensorFlow Hub.
  • uses deep learning networks such as BERT which are highly accurate for text classification.
  • takes text as input and outputs probability for each of the class labels.
  • useful for sentiment analysis, spam detection, and document categorization.

Sequence to Sequence – seq2seq

  • is a supervised learning algorithm where the input is a sequence of tokens (for example, text, audio), and the output generated is another sequence of tokens.
  • key uses cases are machine translation (input a sentence from one language and predict what that sentence would be in another language), text summarization (input a longer string of words and predict a shorter string of words that is a summary), speech-to-text (audio clips converted into output sentences in tokens)

Forecasting

DeepAR

  • is a supervised learning algorithm for forecasting scalar (one-dimensional) time series using recurrent neural networks (RNN).
  • use the trained model to generate forecasts for new time series that are similar to the ones it has been trained on.
  • supports learning complex patterns from multiple related time series simultaneously.

Clustering

K-means algorithm

  • is an unsupervised learning algorithm for clustering
  • attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups

Topic Modelling

Latent Dirichlet Allocation (LDA)

  • is an unsupervised learning algorithm that attempts to describe a set of observations as a mixture of distinct categories.
  • used to discover a user-specified number of topics shared by documents within a text corpus.

Neural Topic Model (NTM)

  • is an unsupervised learning algorithm that is used to organize a corpus of documents into topics that contain word groupings based on their statistical distribution
  • Topic modeling can be used to classify or summarize documents based on the topics detected or to retrieve information or recommend content based on topic similarities.

Feature Reduction

Object2Vec

  • is a general-purpose neural embedding algorithm that is highly customizable
  • can learn low-dimensional dense embeddings of high-dimensional objects.
  • useful for duplicate detection, finding similar items, and relationship prediction.

Principal Component Analysis – PCA

  • is an unsupervised ML algorithm that attempts to reduce the dimensionality (number of features) within a dataset while still retaining as much information as possible.
  • projects data points onto the first few principal components (eigenvectors of the data’s covariance matrix).

Anomaly Detection

Random Cut Forest (RCF)

  • is an unsupervised algorithm for detecting anomalous data points within a data set.
  • detects data points that diverge from otherwise well-structured or patterned data.

IP Insights

  • is an unsupervised learning algorithm that learns the usage patterns for IPv4 addresses.
  • designed to capture associations between IPv4 addresses and various entities, such as user IDs or account numbers
  • useful for detecting suspicious login attempts from anomalous IP addresses.

Computer Vision – CV

Image Classification – MXNet

  • a supervised learning algorithm that supports multi-label classification
  • takes an image as input and outputs one or more labels
  • uses a convolutional neural network (ResNet) that can be trained from scratch or trained using transfer learning when a large number of training images are not available.
  • recommended input format is Apache MXNet RecordIO. Also supports raw images in .jpg or .png format.

Image Classification – TensorFlow

  • is a supervised learning algorithm that supports transfer learning with many pretrained models from the TensorFlow Hub.
  • uses deep learning networks such as MobileNet, ResNet, Inception, and EfficientNet for image classification.
  • takes an image as input and outputs probability for each of the class labels.
  • supports fine-tuning pretrained models for specific image classification tasks.

Object Detection – MXNet

  • detects and classifies objects in images using a single deep neural network.
  • is a supervised learning algorithm that takes images as input and identifies all instances of objects within the image scene.

Object Detection – TensorFlow

  • is a supervised learning algorithm that supports transfer learning with many pretrained models from the TensorFlow Model Garden.
  • takes an image as input and predicts bounding boxes and object labels.
  • uses deep learning networks such as MobileNet, ResNet, Inception, and EfficientNet for object detection.

Semantic Segmentation

  • provides a fine-grained, pixel-level approach to developing computer vision applications.
  • tags every pixel in an image with a class label from a predefined set of classes and is critical to an increasing number of CV applications, such as self-driving vehicles, medical imaging diagnostics, and robot sensing.
  • also provides information about the shapes of the objects contained in the image. The segmentation output is represented as a grayscale image, called a segmentation mask.

SageMaker JumpStart – Pre-trained Models

  • SageMaker JumpStart provides pre-trained foundation models, pre-built solution templates, and example notebooks for popular ML problem types.
  • Foundation models include large language models (LLMs) such as LLaMA, Falcon, BLOOM, FLAN-T5, Mistral, and GPT-J for generative AI tasks.
  • Supports 15+ problem types including:
    • Text Generation, Text Summarization, Question Answering
    • Text Embedding, Named Entity Recognition
    • Image Classification, Object Detection, Instance Segmentation
    • Tabular Classification, Tabular Regression
    • Machine Translation, Sentence Pair Classification
  • Models can be fine-tuned on custom datasets and deployed directly from SageMaker Studio.

SageMaker Autopilot (AutoML)

  • SageMaker Autopilot automatically explores different solutions to find the best model for your data.
  • Analyzes data, selects algorithms, preprocesses data, trains models, and performs hyperparameter optimization.
  • Supports classification, regression, and time-series forecasting problem types.
  • Available as a no-code/low-code option through SageMaker Canvas for business analysts.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. An Analytics team is leading an organization and wants to use anomaly detection to identify potential risks. What Amazon SageMaker AI machine learning algorithms are best suited for identifying anomalies?
    1. Semantic segmentation
    2. K-nearest neighbors
    3. Latent Dirichlet Allocation (LDA)
    4. Random Cut Forest (RCF)
  2. A ML specialist team works for a marketing consulting firm wants to
    apply different marketing strategies per segment of their customer base. Online retailer purchase history from the last 5 years is available, it has been decided to segment the customers based on their purchase history. Which type of machine learning algorithm would give you segmentation based on purchase history in the most expeditious manner?

    1. K-Nearest Neighbors (KNN)
    2. K-Means
    3. Semantic Segmentation
    4. Neural Topic Model (NTM)
  3. A ML specialist team is looking to improve the quality of searches for their library of documents that are uploaded in PDF, Rich Text Format, or ASCII text. It is looking to use machine learning to automate the identification of key topics for each of the documents. What machine learning resources are best suited for this problem? (Select TWO)
    1. BlazingText algorithm
    2. Latent Dirichlet Allocation (LDA) algorithm
    3. Topic Finder (TF) algorithm
    4. Neural Topic Model (NTM) algorithm
  4. A manufacturing company has a large set of labeled historical sales data. The company would like to predict how many units of a particular part should be produced each quarter. Which machine learning approach should be used to solve this problem?
    1. BlazingText algorithm
    2. Random Cut Forest (RCF)
    3. Principal component analysis (PCA)
    4. Linear regression
  5. An agency collects census information with responses for approximately 500 questions from each citizen. Which algorithm would help reduce the number of features?
    1. Factorization machines (FM) algorithm
    2. Latent Dirichlet Allocation (LDA) algorithm
    3. Principal component analysis (PCA) algorithm
    4. Random Cut Forest (RCF) algorithm
  6. A store wants to understand some characteristics of visitors to the store. The store has security video recordings from the past several years. The store wants to group visitors by hair style and hair color. Which solution will meet these requirements with the LEAST amount of effort?
    1. Object detection algorithm
    2. Latent Dirichlet Allocation (LDA) algorithm
    3. Random Cut Forest (RCF) algorithm
    4. Semantic segmentation algorithm
  7. A data scientist needs to build a model that can automatically classify product reviews as positive or negative. The dataset contains millions of labeled reviews. Which SageMaker AI built-in algorithm is MOST suitable for this text classification task with transfer learning?
    1. Sequence-to-Sequence (seq2seq)
    2. BlazingText in Word2Vec mode
    3. Text Classification – TensorFlow
    4. Neural Topic Model (NTM)
  8. A company wants to predict customer churn using a tabular dataset with both numerical and categorical features. The team wants an AutoML approach that automatically ensembles multiple models. Which SageMaker AI built-in algorithm should they use?
    1. XGBoost
    2. Linear Learner
    3. AutoGluon-Tabular
    4. Factorization Machines
  9. A team needs to detect objects in images and draw bounding boxes around them. They want to leverage pretrained models and use transfer learning. Which SageMaker AI algorithm should they choose?
    1. Image Classification – MXNet
    2. Semantic Segmentation
    3. Image Classification – TensorFlow
    4. Object Detection – TensorFlow
  10. A company has tabular data with many categorical features and wants a gradient-boosted trees algorithm that handles categorical features natively without manual encoding. Which algorithm is BEST suited?
    1. XGBoost
    2. LightGBM
    3. CatBoost
    4. Linear Learner

References

AWS AI & Machine Learning Services Cheat Sheet

AWS Machine Learning Services

AWS Machine Learning Services

AWS Machine Learning Services

Amazon Bedrock

  • is a fully managed service providing access to high-performing foundation models (FMs) from leading AI companies (GA September 2023).
  • offers foundation models from AI21 Labs, Amazon (Nova), Anthropic, Cohere, Meta, Mistral AI, OpenAI, and Stability AI through a unified API.
  • enables building and scaling generative AI applications without managing infrastructure.
  • supports model customization including fine-tuning and reinforcement fine-tuning (RFT) with your own data while maintaining data privacy and security.
  • provides serverless experience with pay-per-use pricing.
  • includes capabilities for text generation, chat, image generation, video generation, and embeddings.
  • supports Retrieval Augmented Generation (RAG) with Knowledge Bases and the new Managed Knowledge Base (2026) that abstracts storage, retrieval, embeddings, and re-ranking into a single managed primitive.
  • provides Bedrock Agents for multi-step task automation.
  • includes Amazon Bedrock Guardrails for configurable safety controls including content filtering, topic classification, sensitive information protection, and hallucination detection across both text and images with up to 88% harmful content blocking accuracy.
  • supports OpenAI-compatible API endpoints (2026) including Responses API and Chat Completions API for simplified migration and integration.
  • ensures data is not used to train base models and remains within your AWS environment.
  • includes Amazon Bedrock AgentCore (2026) — a platform to build, connect, deploy, and optimize AI agents with managed harness, observability, guardrails integration, and continuous optimization capabilities.

Amazon Nova Foundation Models

  • is Amazon’s family of proprietary foundation models available exclusively through Amazon Bedrock (launched December 2024 at re:Invent).
  • includes Amazon Nova Micro — a text-only model optimized for speed and lowest cost, ideal for summarization, translation, and classification (128K context).
  • includes Amazon Nova Lite — a low-cost multimodal model processing text, images, and video for tasks like document analysis and visual Q&A.
  • includes Amazon Nova Pro — a balanced multimodal model offering strong accuracy, speed, and cost for a wide range of tasks.
  • includes Amazon Nova Premier — the most capable model for complex reasoning, agentic workflows, and model distillation.
  • includes Amazon Nova Canvas — an image generation model.
  • includes Amazon Nova Reel — a video generation model.
  • includes Amazon Nova Sonic — a speech-to-speech model.
  • Amazon Nova 2 models (Nova 2 Lite and Nova 2 Pro) announced in December 2025 with improved capabilities.
  • all Nova models are among the fastest and most cost-effective in their respective intelligence classes, optimized for RAG and agentic applications.

Amazon Q Developer (formerly CodeWhisperer) → Transitioning to Kiro

  • is a generative AI-powered coding assistant for software developers (rebranded from CodeWhisperer in April 2024).
  • provides real-time code suggestions, completions, and generation based on comments and existing code.
  • supports multiple programming languages including Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell, SQL, and more.
  • integrates with popular IDEs including VS Code, IntelliJ IDEA, PyCharm, WebStorm, and AWS Cloud9.
  • performs security scanning to identify and suggest fixes for vulnerabilities.
  • provides code explanations and documentation generation.
  • assists with debugging, upgrading applications, and troubleshooting.
  • tracks open-source code references and license information.
  • offers free tier for individual developers and paid tier for professional use.
⚠️ Transition Notice (May 2026): Amazon Q Developer IDE plugins and paid subscriptions will reach end-of-support on April 30, 2027. New signups blocked as of May 15, 2026. The successor is Kiro — AWS’s next-generation agentic development environment (IDE and CLI) built on Code OSS and powered by Amazon Bedrock. Kiro includes agentic coding, inline chat, terminal integration, and MCP support. Users have a 12-month transition window.

Amazon Quick (formerly Amazon Q Business)

  • is a generative AI-powered assistant for enterprise use, rebranded from Amazon Q Business to Amazon Quick in April 2026.
  • is described as “the next evolution of Amazon Q Business” — an AI assistant for work that connects to apps, learns workflows, and takes action.
  • answers questions, provides summaries, generates content, and completes tasks based on enterprise data.
  • connects to 40+ enterprise data sources including S3, SharePoint, Salesforce, ServiceNow, Jira, and more.
  • respects existing access controls and permissions from connected data sources.
  • provides conversational interface for employees to access company information.
  • available as a desktop app (Windows and Mac) with Microsoft 365 extensions (Outlook, Word, Teams).
  • offers Free and Plus pricing plans.
  • supports autonomous agents for handling recurring tasks continuously.
  • supports Amazon Q Apps for creating AI-powered applications from conversations.
  • ensures enterprise data privacy and security with data isolation.

Amazon SageMaker AI (formerly Amazon SageMaker)

  • Naming Update (December 2024): On December 3, 2024, Amazon SageMaker was renamed to Amazon SageMaker AI. The “SageMaker” brand now refers to the next-generation unified platform for data, analytics, and AI.
  • Build, train, and deploy machine learning models at scale.
  • fully-managed service that enables data scientists and developers to quickly and easily build, train & deploy machine learning models.
  • enables developers and scientists to build machine learning models for use in intelligent, predictive apps.
  • is designed for high availability with no maintenance windows or scheduled downtimes.
  • allows users to select the number and type of instance used for the hosted notebook, training & model hosting.
  • can be deployed as endpoint interfaces and batch.
  • supports Canary deployment using ProductionVariant and deploying multiple variants of a model to the same SageMaker HTTPS endpoint.
  • supports Jupyter notebooks.
  • Users can persist their notebook files on the attached ML storage volume.
  • Users can modify the notebook instance and select a larger profile through the SageMaker console, after saving their files and data on the attached ML storage volume.
  • includes built-in algorithms for linear regression, logistic regression, k-means clustering, principal component analysis, factorization machines, neural topic modeling, latent dirichlet allocation, gradient boosted trees, seq2seq, time series forecasting, word2vec & image classification
  • algorithms work best when using the optimized protobuf recordIO format for the training data, which allows Pipe mode that streams data directly from S3 and helps faster start times and reduce space requirements
  • provides built-in algorithms, pre-built container images, or extend a pre-built container image and even build your custom container image.
  • supports users custom training algorithms provided through a Docker image adhering to the documented specification.
  • also provides optimized MXNet, Tensorflow, Chainer & PyTorch containers
  • ensures that ML model artifacts and other system artifacts are encrypted in transit and at rest.
  • requests to the API and console are made over a secure (SSL) connection.
  • stores code in ML storage volumes, secured by security groups and optionally encrypted at rest.
  • SageMaker Neo is a capability that enables machine learning models to train once and run anywhere in the cloud and at the edge.

Amazon SageMaker Unified Studio

  • is a unified web-based development environment announced at re:Invent 2024 and GA in March 2025.
  • is part of the next generation of Amazon SageMaker — the center for all data, analytics, and AI.
  • breaks down silos in data and tools, giving data engineers, data scientists, data analysts, and ML developers a single development experience.
  • brings together functionality from Amazon EMR, AWS Glue, Amazon Redshift, Amazon Bedrock, and SageMaker AI Studio.
  • enables discovering data and AI assets from across the organization, then collaborating in projects to securely build and share analytics and AI artifacts.
  • includes SageMaker Lakehouse — unifies data across data lakes, data warehouses, operational databases, and enterprise applications with Apache Iceberg compatibility.
  • includes SageMaker Data and AI Governance for integrated access controls and data governance.
  • offers choice of IDEs including JupyterLab, Code Editor (based on VS Code OSS), and RStudio.
  • Note: The previous “SageMaker Studio” experience was renamed to “SageMaker Studio Classic” (November 2023) and is now part of SageMaker AI.

Amazon SageMaker Canvas

  • is a no-code machine learning service for business analysts (launched November 2021).
  • enables building accurate ML models without writing code or requiring ML expertise.
  • provides visual, point-and-click interface for data preparation and model building.
  • supports tabular, image, and text data for predictions.
  • connects to 50+ data sources including S3, Redshift, Snowflake, and SaaS applications.
  • offers ready-to-use ML models and custom model building capabilities.
  • includes generative AI capabilities (October 2023) for text generation, summarization, and content creation.
  • provides automated feature engineering, algorithm selection, and hyperparameter tuning.
  • enables one-click model deployment and batch predictions.
  • supports collaboration between business analysts and data scientists.
  • is the recommended migration path for Amazon Forecast customers for time-series forecasting.

Amazon SageMaker Clarify

  • provides bias detection and model explainability capabilities (launched December 2020).
  • identifies biases in training data and ML models across different groups (age, gender, income, etc.).
  • detects potential bias during data preparation, after model training, and in deployed models.
  • generates detailed reports quantifying different types of possible bias.
  • provides feature importance graphs to explain model predictions.
  • integrates with SageMaker Data Wrangler for bias detection during data preparation.
  • supports continuous monitoring of deployed models for bias drift.
  • helps meet regulatory requirements and ethical AI standards.
  • produces reports for internal presentations and compliance documentation.

Amazon SageMaker HyperPod

  • is purpose-built infrastructure for distributed training at scale (GA November 2023).
  • reduces time to train foundation models by up to 40% with optimized infrastructure.
  • supports GPU-based and AWS Trainium-based instances for cost-effective training.
  • provides automated cluster health monitoring and node replacement.
  • enables training for weeks or months with automated resiliency.
  • automatically saves checkpoints and resumes training from last checkpoint on failure.
  • efficiently distributes models and data across thousands of compute resources.
  • includes preconfigured distributed training libraries for popular frameworks.
  • provides recipes for accelerating foundation model training and fine-tuning.
  • offers flexible training plans to meet timelines and budgets.

Amazon Textract

  • Textract provides OCR and helps add document text detection and analysis to the applications.
  • includes simple, easy-to-use API operations that can analyze image files and PDF files.
  • extracts text, handwriting, tables, and forms from scanned documents.
  • supports Queries for extracting specific information from documents using natural language questions.
  • provides Lending API for automated mortgage document processing.

Amazon Comprehend

  • Comprehend is a managed natural language processing (NLP) service to find insights and relationships in text.
  • identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic.
  • can analyze a collection of documents and other text files (such as social media posts) and automatically organize them by relevant terms or topics.
  • supports custom entity recognition and custom classification for domain-specific NLP.
  • provides Comprehend Medical for extracting medical information such as conditions, medications, dosages, and their relationships.
⚠️ Note (April 2026): Amazon Comprehend topic modeling, event detection, and prompt safety classification features are no longer available to new customers as of April 30, 2026. Existing customers can continue to use these features.

Amazon Lex

  • is a service for building conversational interfaces using voice and text.
  • provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text, and natural language understanding (NLU) to recognize the intent of the text, to enable building applications with highly engaging user experiences and lifelike conversational interactions.
  • common use cases of Lex include: Application/Transactional bot, Informational bot, Enterprise Productivity bot, and Device Control bot.
  • leverages Lambda for Intent fulfillment, Cognito for user authentication & Polly for text-to-speech.
  • scales to customers’ needs and does not impose bandwidth constraints.
  • is a completely managed service so users don’t have to manage the scaling of resources or maintenance of code.
  • uses deep learning to improve over time.
  • supports Generative AI features powered by Amazon Bedrock LLMs including:
    • AMAZON.QnAIntent — handles FAQ-style questions using knowledge bases without configuring individual intents.
    • Assisted NLU (2025) — uses LLMs to improve intent classification and slot resolution accuracy while staying within configured intents.
    • Descriptive Bot Builder — generates bot configurations from natural language descriptions.

Amazon Polly

  • text into speech
  • uses advanced deep-learning technologies to synthesize speech that sounds like a human voice.
  • provides dozens of lifelike voices across 60+ languages.
  • supports multiple voice engines:
    • Standard — concatenative synthesis voices.
    • Neural — higher-quality neural TTS voices.
    • Long-Form — optimized for long content like articles and books.
    • Generative (2024-2025) — the most natural-sounding voices using generative AI, with new voices continually added.
  • supports Lexicons to customize pronunciation of specific words & phrases.
  • supports Speech Synthesis Markup Language (SSML) tags like prosody so users can adjust the speech rate, pitch, pauses, or volume.
  • supports bidirectional streaming API for real-time applications.

Amazon Rekognition

  • analyzes image and video
  • identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content.
  • provides highly accurate facial analysis and facial search capabilities that can be used to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases.
  • helps identify potentially unsafe or inappropriate content across both image and video assets and provides detailed labels that help accurately control what you want to allow based on your needs.
  • provides Rekognition Custom Labels (launched December 2019) – an AutoML feature to build custom ML models for detecting specific objects and scenes unique to business needs.
  • Custom Labels requires as few as 10 sample images per label to train custom models.
  • Custom Labels automatically selects optimal ML algorithms and trains models without requiring ML expertise.
  • enables identifying business-specific items like machine parts, product defects, or brand logos.

Amazon Forecast

⚠️ SERVICE CLOSED TO NEW CUSTOMERS (July 29, 2024)
Amazon Forecast is no longer available to new customers. Existing customers can continue using the service. Migration: Use Amazon SageMaker Canvas for time-series forecasting with a no-code interface.
  • Amazon Forecast is a fully managed time-series forecasting service that uses statistical and machine learning algorithms to deliver highly accurate time-series forecasts and is built for business metrics analysis.
  • automatically tracks the accuracy of the model over time as new data is imported.
  • provides six built-in algorithms which include ARIMA, Prophet, NPTS, ETS, CNN-QR, and DeepAR+.
  • integrates with AutoML to choose the optimal model for the datasets.

Amazon SageMaker Ground Truth

  • helps build highly accurate training datasets for machine learning quickly.
  • offers easy access to labelers through Amazon Mechanical Turk and provides them with built-in workflows and interfaces for common labeling tasks.
  • allows using your own labelers or use vendors recommended by Amazon through AWS Marketplace.
  • helps lower labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently.
  • provides annotation consolidation to help improve the accuracy of the data object’s labels.

Amazon Translate

  • provides natural and fluent language translation
  • a neural machine translation service that delivers fast, high-quality, and affordable language translation.
  • Neural machine translation is a form of language translation automation that uses deep learning models to deliver more accurate and natural-sounding translation than traditional statistical and rule-based translation algorithms.
  • allows content localization – such as websites and applications – for international users, and to easily translate large volumes of text efficiently.

Amazon Transcribe

  • provides speech-to-text capability
  • uses a deep learning process called automatic speech recognition (ASR) to convert speech to text quickly and accurately.
  • can be used to transcribe customer service calls, automate closed captioning and subtitling, and generate metadata for media assets to create a fully searchable archive.
  • adds punctuation and formatting so that the output closely matches the quality of manual transcription at a fraction of the time and expense.
  • process audio in batch or near real-time.
  • supports automatic language identification.
  • supports custom vocabulary to generate more accurate transcriptions for domain-specific words and phrases like product names, technical terminology, or names of individuals.
  • supports specifying a list of words to remove from transcripts.
  • provides Transcribe Call Analytics (launched August 2021) for extracting insights from customer conversations.
  • Call Analytics generates turn-by-turn transcripts with speaker identification and sentiment analysis.
  • supports real-time Call Analytics (November 2022) for live conversation insights and agent assistance.
  • provides Transcribe Medical for healthcare and medical transcription with HIPAA eligibility.

Amazon Kendra

  • is an intelligent search service that uses NLP and advanced ML algorithms to return specific answers to search questions from your data.
  • uses its semantic and contextual understanding capabilities to decide whether a document is relevant to a search query.
  • returns specific answers to questions, giving users an experience that’s close to interacting with a human expert.
  • provides a unified search experience by connecting multiple data repositories to an index and ingesting and crawling documents.
  • can use the document metadata to create a feature-rich and customized search experience for the users, helping them efficiently find the right answers to their queries.
  • can be used as a retriever for Amazon Quick (formerly Amazon Q Business) to power enterprise search with generative AI.

Augmented AI (Amazon A2I)

  • Augmented AI (Amazon A2I) is an ML service that makes it easy to build the workflows required for human review.
  • brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems or managing large numbers of human reviewers, whether it runs on AWS or not.
  • integrates with Amazon Textract for document processing and Amazon Rekognition for content moderation.
  • supports private review teams, Amazon Mechanical Turk, and AWS Marketplace vendors.

Amazon Personalize

  • Personalize is a fully managed machine learning service that uses data to generate item recommendations.
  • can also generate user segments based on the users’ affinity for certain items or item metadata.
  • generates recommendations primarily based on item interaction data that comes from the users interacting with items in the catalog.
  • includes API operations for real-time personalization, and batch operations for bulk recommendations and user segments.

Amazon Panorama

⚠️ SERVICE END OF SUPPORT — May 31, 2026
AWS will end support for AWS Panorama on May 31, 2026. After this date, you will no longer be able to access the AWS Panorama console or resources, and Panorama devices will become non-functional. Consider migrating to Amazon SageMaker AI with edge deployment or third-party edge CV solutions.
  • brings computer vision to the on-premises camera network.
  • AWS Panorama Appliance or another compatible device can be installed in the data center and registered with AWS Panorama to deploy computer vision applications from the cloud.
  • AWS Panorama Appliance
    • is a compact edge appliance that uses a powerful system-on-module (SOM) that is optimized for ML workloads.
    • can run multiple computer vision models against multiple video streams in parallel and output the results in real-time.
    • is designed for use in commercial and industrial settings and is rated for dust and liquid protection.
  • works with the existing real-time streaming protocol (RTSP) network cameras.

Amazon Fraud Detector

  • Fraud Detector is a fully managed service to identify potentially fraudulent online activities such as online payment fraud and fake account creation.
  • takes care of all the heavy lifting such as data validation and enrichment, feature engineering, algorithm selection, hyperparameter tuning, and model deployment.

AWS IoT Greengrass ML Inference

  • IoT Greengrass helps perform machine learning inference locally on devices, using models that are created, trained, and optimized in the cloud.
  • provides flexibility to use machine learning models trained in SageMaker or to bring your pre-trained model stored in S3.
  • helps get inference results with very low latency to ensure the IoT applications can respond quickly to local events.

Amazon Elastic Inference

⚠️ SERVICE DEPRECATED (April 2023)
Amazon Elastic Inference is no longer available to new customers. Alternatives: Use AWS Inferentia instances (Inf1/Inf2) for better price-performance on inference workloads, or use SageMaker AI real-time inference endpoints with appropriate instance types.
  • helped attach low-cost GPU-powered acceleration to EC2 and SageMaker instances or ECS tasks to reduce the cost of running deep learning inference by up to 75%.
  • supported TensorFlow, Apache MXNet, and ONNX models.

AWS Certification Exam Practice Questions

  • Questions are collected from Internet and the answers are marked as per my knowledge and understanding (which might differ with yours).
  • AWS services are updated everyday and both the answers and questions might be outdated soon, so research accordingly.
  • AWS exam questions are not updated to keep up the pace with AWS updates, so even if the underlying feature has changed the question might not be updated
  • Open to further feedback, discussion and correction.
  1. A company has built a deep learning model and now wants to deploy it using the SageMaker Hosting Services. For inference, they want a cost-effective option that guarantees low latency but still comes at a fraction of the cost of using a GPU instance for your endpoint. As a machine learning Specialist, what feature should be used?
    1. Inference Pipeline
    2. Elastic Inference [Note: Elastic Inference is deprecated. Current recommendation is AWS Inferentia (Inf2) instances for cost-effective inference.]
    3. SageMaker Ground Truth
    4. SageMaker Neo
  2. A machine learning specialist works for an online retail company that sells health products. The company allows users to enter reviews of the products they buy from the website. The company wants to make sure the reviews do not contain any offensive or unsafe content, such as obscenities or threatening language. Which Amazon SageMaker algorithm or service will allow scanning user’s review text in the simplest way?
    1. BlazingText
    2. Transcribe
    3. Semantic Segmentation
    4. Comprehend
  3. A company develops a tool whose coverage includes blogs, news sites, forums, videos, reviews, images, and social networks such as Twitter and Facebook. Users can search data by using Text and Image Search, and use charting, categorization, sentiment analysis, and other features to provide further information and analysis. They want to provide Image and text analysis capabilities to the applications which include identifying objects, people, text, scenes, and activities, and also provide highly accurate facial analysis and facial recognition. What service can provide this capability?
    1. Amazon Comprehend
    2. Amazon Rekognition
    3. Amazon Polly
    4. Amazon SageMaker
  4. A company wants to build generative AI applications using foundation models without managing infrastructure. Which service should they use?
    1. Amazon SageMaker
    2. Amazon Comprehend
    3. Amazon Bedrock
    4. Amazon Lex
  5. A development team wants an AI assistant that provides real-time code suggestions and security scanning in their IDE. Which service should they use?
    1. Amazon CodeGuru
    2. Amazon Q Developer (transitioning to Kiro)
    3. AWS Cloud9
    4. Amazon SageMaker
  6. A business analyst with no ML experience wants to build accurate ML models using a visual interface. Which service should they use?
    1. Amazon SageMaker Studio
    2. Amazon SageMaker Canvas
    3. Amazon Forecast
    4. Amazon Personalize
  7. A company needs to detect bias in their ML models and explain predictions for regulatory compliance. Which service should they use?
    1. Amazon SageMaker Ground Truth
    2. Amazon Inspector
    3. Amazon SageMaker Clarify
    4. AWS Audit Manager
  8. A company wants to train large foundation models for weeks with automated resiliency and checkpoint management. Which service should they use?
    1. Amazon SageMaker Training Jobs
    2. Amazon SageMaker HyperPod
    3. AWS Batch
    4. Amazon EC2 with GPU instances
  9. A contact center wants real-time insights from customer calls including sentiment analysis and agent assistance. Which service should they use?
    1. Amazon Transcribe
    2. Amazon Transcribe Call Analytics
    3. Amazon Comprehend
    4. Amazon Connect
  10. A company wants to build custom image recognition models to identify specific machine parts with minimal training data. Which service should they use?
    1. Amazon Rekognition (standard)
    2. Amazon Rekognition Custom Labels
    3. Amazon SageMaker
    4. Amazon Textract
  11. A company wants to deploy AI agents that can perform multi-step workflows, access enterprise tools, and maintain state across conversations in production. Which service should they use?
    1. Amazon Lex
    2. Amazon SageMaker AI
    3. Amazon Bedrock AgentCore
    4. AWS Step Functions
  12. A company needs Amazon’s own foundation models that offer industry-leading price-performance for text, image, and video generation tasks. Which model family should they use?
    1. Amazon Titan
    2. Amazon Nova
    3. Amazon Comprehend
    4. Amazon SageMaker JumpStart
  13. An enterprise wants a unified platform for data engineering, analytics, ML development, and generative AI that breaks down tool silos. Which service should they use?
    1. Amazon SageMaker AI
    2. Amazon EMR
    3. Amazon SageMaker Unified Studio
    4. AWS Glue
  14. A company wants to implement safety guardrails for their generative AI application to filter harmful content, block prompt injections, and protect sensitive information. Which service should they use?
    1. AWS WAF
    2. Amazon Macie
    3. Amazon Bedrock Guardrails
    4. AWS Shield

References

AWS SageMaker – ML Model Training & Deployment

SageMaker Overview

AWS SageMaker AI

📝 Naming Update (December 2024): On December 3, 2024, Amazon SageMaker was renamed to Amazon SageMaker AI. The “SageMaker” brand now refers to the next-generation unified platform for data, analytics, and AI. SageMaker AI remains available as a standalone service for building, training, and deploying ML models at scale, and is also integrated within the broader next-generation SageMaker platform.

Amazon SageMaker AI is a fully managed machine learning service that enables data scientists and developers to build, train, and deploy machine learning models quickly and efficiently. This comprehensive platform simplifies the entire machine learning workflow while providing the flexibility to use your preferred tools and frameworks.

  • SageMaker AI removes the heavy lifting from each step of the machine learning process to make it easier to develop high-quality models.
  • It is designed for high availability with no maintenance windows or scheduled downtimes.
  • APIs run in Amazon’s proven, high-availability data centers, with service stack replication configured across three facilities in each AWS region to provide fault tolerance in the event of a server failure or AZ outage.
  • SageMaker AI provides a full end-to-end workflow, but users can continue to use their existing tools with SageMaker AI.
  • It supports Jupyter notebooks through SageMaker Studio and SageMaker Notebook Instances.
  • Users can select the number and type of instance used for hosted notebooks, training jobs, and model hosting to optimize for performance and cost.

SageMaker Overview - A diagram showing the complete machine learning workflow from data preparation to model deployment

🎓 Build AI Skills with Google
Learn practical AI skills and earn a Google Certificate. No experience required – learn at your own pace.
Start the Google AI Essentials Learning Path →

Next-Generation Amazon SageMaker Platform (Dec 2024)

At AWS re:Invent 2024, AWS unveiled the next generation of Amazon SageMaker — a unified platform for data, analytics, and AI. This represents a major evolution from the original ML-focused service into a comprehensive data and AI platform.

Platform Components

  • Amazon SageMaker Unified Studio: A single integrated development environment for data engineering, SQL analytics, ML model development, and generative AI application development. It brings together tools previously spread across multiple services (EMR, Glue, Redshift, Athena, SageMaker AI, Bedrock) into one workspace. Generally available as of March 2025.
  • Amazon SageMaker Lakehouse: An open data architecture built on Apache Iceberg that unifies data across Amazon S3 data lakes (including S3 Tables), Amazon Redshift data warehouses, and third-party/federated data sources, enabling analytics and AI on a single copy of data.
  • Amazon SageMaker Catalog: Enables secure discovery, governance, and collaboration for data and AI assets across structured/unstructured data, AI models, dashboards, and applications. Supports metadata forms, business glossaries, lineage tracking, and fine-grained access control.
  • Amazon SageMaker AI: The original SageMaker capabilities for building, training, and deploying ML models (formerly just “Amazon SageMaker”). Includes HyperPod, JumpStart, MLOps, and all inference/training features.

Key Differentiators

  • Unified access to data and tools with governance built in.
  • Serverless notebooks combining SQL, Python, Spark, and natural language prompts.
  • Integrated with Amazon Q Developer for AI-assisted development.
  • Apache Iceberg-based open lakehouse architecture.
  • Cross-account and cross-service collaboration capabilities.

What’s New in SageMaker AI (2024-2026)

AWS has significantly enhanced SageMaker AI with major new features and capabilities:

  • Serverless Model Customization (2025): Automatically provisions compute resources for fine-tuning, supporting SFT, DPO, RLVR, and RLAIF techniques with integrated MLflow experiment tracking. Supports Amazon Nova, DeepSeek, Llama, Qwen models.
  • Bidirectional Streaming (2025): Enables real-time, multi-modal inference with persistent connections where data flows simultaneously in both directions — powering voice agents, live transcription, and continuous conversations.
  • Enhanced Observability (2025): Granular instance-level and container-level metrics for CPU, memory, GPU utilization, and invocation performance with configurable publishing frequencies.
  • Inference Component Rolling Updates (2025): Deploy model updates in configurable batches with CloudWatch alarm-based automatic rollbacks, eliminating need for duplicate infrastructure.
  • Flexible Training Plans: Reserve accelerated compute capacity (P4d, P5, P5e, P5en, Trn1, Trn2) up to 8 weeks in advance with instant start times. Now also supports inference endpoints.
  • Container Caching (2026): Stores container images and model artifacts on running instances to reduce cold start latency for inference component scaling operations.
  • SageMaker HyperPod Enhancements: EKS integration, flexible instance groups, continuous scaling, custom AMIs, CMK integration, and Karpenter auto-scaling support.
  • Agent-Guided Model Customization (2025): AI-guided workflows that accelerate model fine-tuning by automating technique selection and experiment management.
  • SageMaker Canvas: Now includes Data Wrangler capabilities with a natural language interface for data preparation, in addition to the visual no-code ML model building.
  • SageMaker Model Cards: Documentation and tracking of model information throughout the ML lifecycle.
  • SageMaker Role Manager: Simplified granting of least-privilege permissions for ML workloads.

Deprecated and Discontinued SageMaker Features

⚠️ Deprecated Services and Features

Feature Status EOL Date Replacement
SageMaker Edge Manager End of Life April 26, 2024 SageMaker Neo + IoT Greengrass
SageMaker Studio Classic End of Maintenance December 31, 2024 SageMaker Studio (new experience)
Amazon Elastic Inference No longer available April 15, 2023 AWS Inferentia / Inferentia2
SageMaker Training Compiler No new releases N/A (existing DLCs still usable) Neuron SDK / framework-native optimizations
Data Wrangler (standalone) Merged into Canvas N/A SageMaker Canvas Data Wrangler
JupyterLab 1 & 3 (Notebook Instances) End of Support June 30, 2025 JupyterLab 4

SageMaker AI Machine Learning Workflow

SageMaker AI supports the complete machine learning workflow, from data preparation to model deployment and monitoring. Each stage is designed to be flexible and interoperable with the others.

SageMaker Machine Learning Workflow showing the progression from data preparation through model deployment

Data Preparation and Feature Engineering

Before training a model, data must be explored, cleaned, and transformed. SageMaker AI provides several tools to streamline this process:

  • SageMaker Data Wrangler (via Canvas): Now integrated into SageMaker Canvas, Data Wrangler provides both a visual interface and a natural language interface to aggregate and prepare data, with automated data quality assessment and intelligent transformation recommendations.
  • SageMaker Feature Store: Provides a centralized repository for storing, sharing, and managing features with real-time feature computation capabilities.
  • SageMaker Processing: Enables running data processing workloads at scale with support for custom frameworks.

Data Preparation Best Practices

  • Fetch the data: Import data from various sources including Amazon S3, Amazon Redshift, Amazon Athena, and more.
  • Clean the data: Handle missing values, outliers, and inconsistencies with automated data quality assessment.
  • Transform the data: Convert data into formats suitable for machine learning algorithms with intelligent transformation recommendations.

Model Training

SageMaker AI provides flexible options for training machine learning models, from using built-in algorithms to bringing your own code:

  • Training the model: Select an algorithm and compute resources appropriate for your data and problem.
  • Evaluating the model: Determine whether the accuracy and other metrics meet your requirements using enhanced evaluation tools.

Training Data Format Options

SageMaker AI supports multiple data storage locations and input modes for training:

  • Storage options include Amazon S3, Amazon EFS, and Amazon FSx for Lustre.
  • Input modes include:
    • File mode: Downloads all data to the training instance before starting. Best for smaller datasets that fit in memory.
    • Pipe mode: Streams data directly from S3, enabling faster start times and reduced storage requirements.
    • Fast File mode: Combines the ease of File mode with the performance benefits of Pipe mode.
    • Streaming mode: Supports continuous data streaming for online learning scenarios with real-time data.
  • SageMaker AI natively supports all common data formats including CSV, JSON, Parquet, Arrow, and specialized formats for multi-modal data.

Model Building

SageMaker AI offers multiple approaches to building machine learning models:

  • Built-in Algorithms: Optimized algorithms for various ML tasks, including specialized algorithms for multi-modal learning and time-series forecasting.
  • Custom Training: Support for all major ML frameworks including TensorFlow, PyTorch, and emerging frameworks.
  • Foundation Models: Access to pre-trained foundation models via SageMaker JumpStart with simplified fine-tuning workflows and parameter-efficient training methods (LoRA, QLoRA).
  • AutoML: SageMaker AutoPilot with support for multi-modal data, time-series forecasting, and automated model selection.
  • Serverless Model Customization: Automatically provisions compute for fine-tuning with SFT, DPO, RLVR, and RLAIF techniques.

Model Deployment and Inference

SageMaker AI provides multiple options for deploying models to production environments, each optimized for different use cases:

  • Model deployment helps deploy ML code to make predictions, also known as Inference.
  • SageMaker AI supports auto-scaling for hosted models to dynamically adjust the number of instances based on workload.
  • Inference Components provide an abstraction layer for models with dedicated resource allocation and per-model scaling policies, enabling intelligent model packing for cost optimization.
  • Multi-model endpoints provide a cost-effective solution for deploying large numbers of models using shared resources.
  • High availability and reliability are achieved by deploying multiple instances across multiple Availability Zones.

Inference Options Comparison

Inference Type Best For Payload Size Processing Time Key Features
Real-time Low-latency, high-throughput requirements Up to 6 MB Up to 60 seconds Persistent REST API endpoint, instance type of your choice
Serverless Intermittent or unpredictable traffic Up to 4 MB Up to 60 seconds No instance management, pay-per-use pricing
Batch Transform Offline processing of large datasets GB-scale Days No persistent endpoint, good for preprocessing
Asynchronous Large payloads, long processing times Up to 1 GB Up to one hour Request queuing, scale to zero when idle
Bidirectional Streaming Real-time multi-modal (voice, live transcription) Streaming Continuous Persistent WebSocket connection, simultaneous send/receive
Edge IoT and edge device deployment Device-dependent Device-dependent On-device inference, intermittent connectivity support

SageMaker AI also supports Inference Pipelines, which allow you to chain multiple models and preprocessing/postprocessing steps in a sequence of containers.

Testing Model Variants

SageMaker AI supports testing multiple models or model versions behind the same endpoint using variants:

  • Production Variants: Enable A/B or canary testing by allocating portions of traffic to different model versions.
  • Shadow Variants: Test new models by sending them copies of production traffic without exposing their responses to users.
  • Rolling Updates: Deploy model updates in configurable batches with integrated CloudWatch alarm monitoring and automatic rollbacks if issues are detected.

SageMaker AI Training Optimization

SageMaker AI provides several features to optimize the training process for cost, performance, and reliability:

  • SageMaker Managed Spot Training: Uses EC2 Spot instances to reduce training costs by up to 90% compared to On-Demand instances.
  • SageMaker Checkpoints: Saves model state during training to resume from the last checkpoint if interrupted.
  • SageMaker Distributed Training: Optimizes training across multiple GPUs and instances for faster model convergence.
  • SageMaker Inference Recommender: Helps select the optimal instance type and configuration for deploying models based on performance and cost requirements.
  • Flexible Training Plans: Reserve accelerated compute capacity (P4d, P5, P5e, P5en, Trn1, Trn2 instances) up to 8 weeks in advance with instant start times (as soon as 30 minutes). Now also supports inference endpoints for GPU capacity reservation.
  • SageMaker HyperPod Recipes: Pre-built fine-tuning workflows for popular models (DeepSeek-R1, Llama, etc.) that simplify model customization complexity.
Note: SageMaker Training Compiler is no longer receiving new releases or versions. Existing AWS Deep Learning Containers (DLCs) with Training Compiler remain usable. For newer optimizations, consider using the Neuron SDK for Trainium/Inferentia or framework-native compilation tools.

SageMaker AI Security and Governance

SageMaker AI provides comprehensive security features and governance tools to help you meet compliance requirements and maintain control over your ML workflows:

  • ML model artifacts and other system artifacts are encrypted in transit and at rest.
  • Support for encrypted S3 buckets and KMS keys for notebooks, training jobs, and endpoints.
  • Secure API and console access over SSL connections.
  • VPC interface endpoints powered by AWS PrivateLink for secure access without internet exposure, now with comprehensive regional support and IPv6 compatibility.
  • SageMaker Role Manager: Simplifies creating least-privilege IAM roles for ML workflows.
  • SageMaker Model Cards: Documents model information, intended uses, risk ratings, and evaluation results.
  • SageMaker Catalog (Next-Gen): Provides unified governance across data and AI assets with fine-grained access control, metadata management, lineage tracking, and data quality monitoring.

Network Isolation

SageMaker Network Isolation provides additional security by:

  • Preventing containers from making outbound network calls, even to other AWS services.
  • Not exposing AWS credentials to the container runtime environment.
  • Limiting network traffic to peers of each training container in multi-instance training jobs.
  • Isolating S3 operations from the training or inference container.

SageMaker AI Development Environment

SageMaker Studio

SageMaker Studio is a comprehensive integrated development environment (IDE) for machine learning:

  • Provides a unified interface for all ML development tasks.
  • Supports collaborative development allowing team members to share notebooks and projects.
  • Integrates MLOps capabilities for CI/CD pipelines and automated workflows.
  • Offers intelligent code assistance and optimization suggestions via Amazon Q Developer.
  • Enables cross-account collaboration while maintaining governance.
Note: SageMaker Studio Classic reached end of maintenance on December 31, 2024. No new Studio Classic applications can be created. Existing workloads should be migrated to the new Studio experience. For domains created after June 1, 2026, Amazon EFS is not created by default through quick setup.

SageMaker Unified Studio (Next-Gen Platform)

SageMaker Unified Studio is the development environment for the next-generation SageMaker platform (GA March 2025):

  • Single environment for data engineering, SQL analytics, ML development, and generative AI.
  • Serverless notebooks combining SQL queries, Python code, Apache Spark processing, and natural language prompts.
  • Backed by Amazon Athena for Apache Spark, scaling from interactive exploration to petabyte-scale jobs.
  • Integrated with Amazon Bedrock for generative AI application development with guardrails.
  • Project-based collaboration with fine-grained access control via SageMaker Catalog.
  • Accelerated by Amazon Q Developer for AI-assisted development.

SageMaker Canvas

SageMaker Canvas provides a visual interface that enables business analysts and developers to build ML models without writing code:

  • Import data from various sources including Amazon S3, Redshift, and local files.
  • Automatically clean and prepare data for model building.
  • Data Wrangler Integration: Now includes full Data Wrangler capabilities with both visual and natural language interfaces for data preparation and transformation.
  • Build models for common use cases like prediction, categorization, and time series forecasting.
  • Evaluate models with easy-to-understand metrics and visualizations.
  • Generate predictions on new data and share insights with stakeholders.
  • Collaborate with data scientists using SageMaker Studio.

SageMaker Studio Lab

SageMaker Studio Lab provides free resources for learning machine learning:

  • Access to CPU and GPU compute instances for educational purposes.
  • Guided tutorials and courses for learning ML concepts and SageMaker AI capabilities.
  • Community features for sharing notebooks and collaborating on projects.
  • Upgraded to JupyterLab 4 as of August 8, 2025.
  • Easy migration path to full SageMaker AI when ready for production.

SageMaker AI ML Components

SageMaker Feature Store

SageMaker Feature Store is a purpose-built repository for storing, sharing, and managing machine learning features:

  • Centralized store for features and associated metadata for easy discovery and reuse.
  • Reduces repetitive data processing by allowing features to be created once and used for both training and inference.
  • Organizes features into FeatureGroups that describe Records.
  • Supports both online and offline stores:
    • Online store: For low-latency, real-time inference use cases, retaining only the latest feature values.
    • Offline store: For training and batch inference, storing historical feature data in Parquet format for optimized storage and queries.

Feature Store diagram showing how it fits into the machine learning pipeline, from raw data processing to feature serving for both training and inference

SageMaker JumpStart

SageMaker JumpStart provides pre-built solutions and foundation models to accelerate ML development:

  • Access to hundreds of pre-trained, open-source models for various problem types.
  • Support for foundation models including large language models (Llama, DeepSeek, Mistral, Falcon), text-to-image models, and embedding models.
  • Ability to fine-tune models on your own data before deployment using parameter-efficient methods (LoRA, QLoRA).
  • Integration with Amazon Bedrock — models hosted via JumpStart can be accessed through Bedrock’s managed API with advanced security controls and monitoring.
  • Solution templates that set up infrastructure for common use cases.
  • Executable example notebooks for learning SageMaker AI capabilities.

SageMaker Built-in Algorithms

SageMaker AI provides numerous built-in algorithms optimized for performance and scale, covering common machine learning tasks:

  • Supervised learning algorithms for regression and classification.
  • Unsupervised learning algorithms for clustering and anomaly detection.
  • Computer vision algorithms for image and video analysis.
  • Natural language processing algorithms for text analysis.
  • Time series forecasting algorithms for predicting future values.

For a detailed list of available algorithms, please refer to SageMaker Built-in Algorithms.

SageMaker HyperPod

SageMaker HyperPod provides purpose-built infrastructure for training and inference of foundation models at scale, reducing training time by up to 40%:

  • Designed for large-scale distributed training of foundation models with automatic best-configuration selection.
  • Provides fault-tolerant infrastructure with automatic recovery from instance failures.
  • Supports checkpointing to resume training from the last saved state.
  • EKS Integration: Run HyperPod on Amazon EKS with continuous scaling, custom AMIs, and customer managed key (CMK) support.
  • Flexible Instance Groups (2026): Specify multiple instance types and subnets within a single instance group, simplifying Karpenter auto-scaling configurations.
  • Flexible Training Plans: Reserve compute capacity (P5, P5e, P5en, Trn1, Trn2) with instant start times.
  • HyperPod Recipes: Pre-built fine-tuning workflows for popular models (DeepSeek-R1, Llama) to simplify customization.
  • Optimized for popular ML frameworks like PyTorch and TensorFlow.

AWS ML Accelerators

AWS offers custom silicon designed specifically for machine learning workloads:

  • AWS Inferentia2:
    • Second-generation purpose-built chip for deep learning inference.
    • Delivers 3x higher compute, 4x larger memory, up to 4x higher throughput, and up to 10x lower latency compared to first-gen Inferentia.
    • Optimized for LLMs, latent diffusion models, and vision transformers.
    • Available through Amazon EC2 Inf2 instances.
  • AWS Trainium:
    • Custom chips designed specifically for training deep learning models.
    • Provides up to 50% cost savings over comparable GPU-based instances.
    • Available through Amazon EC2 Trn1 instances.
  • AWS Trainium2:
    • Second-generation training chip — 4x faster, 4x more memory bandwidth, 3x more memory capacity than Trn1.
    • Offers 30-40% better price performance than GPU-based P5e and P5en instances.
    • Supports training and deploying models with hundreds of billions to trillion+ parameters.
    • Available through Amazon EC2 Trn2 and Trn2 UltraServer instances (GA December 2024).
  • AWS Trainium3:
    • Third-generation chip announced at re:Invent 2025, built on TSMC 3nm.
    • Delivers 2.52 PFLOPS per chip.
    • Supports NVLink Fusion for hybrid GPU/Trainium clusters.
Note: Amazon Elastic Inference is no longer available to new customers (discontinued April 15, 2023). Customers should use AWS Inferentia2-based instances (Inf2) for cost-effective, high-performance ML inference. SageMaker Notebook Instances also support Trn1 and Inf2 instance types (November 2024).

Model Quality and Responsible AI

SageMaker Clarify

SageMaker Clarify helps improve ML models by detecting potential bias and helping to explain the predictions that the models make:

  • Provides explainability for complex models including deep neural networks.
  • Offers comprehensive fairness metrics with customizable thresholds.
  • Includes techniques for detecting and mitigating bias in training data and model predictions.
  • Integrates with Model Cards for automatic documentation of fairness and explainability insights.
  • Supports automatic FM evaluation for accuracy, robustness, and toxicity metrics for generative AI.
  • Helps meet regulatory compliance requirements with pre-built reports.

SageMaker Model Monitor

SageMaker Model Monitor monitors the quality of SageMaker AI machine learning models in production:

  • Provides unified monitoring of data quality, model quality, bias, and explainability.
  • Offers early warning system to detect potential issues before they impact model performance.
  • Supports configurable automated responses to detected issues.
  • Allows for custom domain-specific and business-oriented monitoring metrics.
  • Includes specialized monitoring for foundation models, including prompt drift and output quality.

SageMaker Ground Truth

SageMaker Ground Truth provides automated data labeling using machine learning:

  • Uses active learning to automate the labeling of input data for certain built-in task types.
  • Supports labeling workflows for complex data types including video, audio, and multi-modal content.
  • Offers quality control workflows with consensus labeling and expert review.
  • Available as both a self-service offering and an AWS-managed offering (Ground Truth Plus).
  • Helps lower labeling costs by up to 70% through automation.

Diagram showing the SageMaker Ground Truth workflow with human labelers and automated labeling

Automation and Experimentation

SageMaker AutoPilot

SageMaker AutoPilot automates the end-to-end machine learning process while maintaining transparency:

  • Automatically analyzes data and selects appropriate algorithms.
  • Preprocesses data and performs feature engineering.
  • Trains and tunes multiple models to find the best performer.
  • Generates notebooks with the code used for model creation, enabling customization.
  • Provides explainability reports showing feature importance for model predictions.

Diagram showing the SageMaker AutoPilot workflow from data analysis through model deployment

SageMaker Automatic Model Tuning

SageMaker Automatic Model Tuning helps optimize hyperparameters to improve model performance:

  • Uses Bayesian optimization to efficiently search the hyperparameter space.
  • Supports random search and grid search strategies.
  • Can leverage managed spot training to reduce costs.
  • Provides warm start capability to use previous tuning jobs as starting points.

Best practices for hyperparameter tuning include:

  • Limit the search space: Focus on fewer, more impactful hyperparameters.
  • Choose appropriate ranges: Avoid overly broad ranges that waste computational resources.
  • Use logarithmic scales for parameters that vary by orders of magnitude.
  • Consider concurrent jobs carefully: While parallel jobs complete faster, sequential jobs often find better solutions.
  • Design distributed training to target your specific objective metrics.

SageMaker Experiments

SageMaker Experiments helps track, organize, and compare machine learning iterations:

  • Automatically captures inputs, parameters, configurations, and results for each run.
  • Organizes related runs into experiments for easy comparison.
  • Visualizes performance metrics across multiple runs to identify the best models.
  • Integrates with SageMaker Studio for a unified experience.
  • Serverless MLflow integration for automatic logging of metrics during model customization.

SageMaker Pipelines

SageMaker Pipelines provides a comprehensive MLOps solution:

  • Enables creation and editing of ML workflows with visual tools.
  • Offers pre-built templates for common ML workflows.
  • Includes built-in testing frameworks for validating models before deployment.
  • Provides comprehensive monitoring of pipeline execution.
  • Supports ML workflows that span multiple AWS accounts for enhanced security.

SageMaker Debugger

SageMaker Debugger provides tools to debug training jobs and resolve problems:

  • Automatically detects common training problems like overfitting, vanishing gradients, and exploding tensors.
  • Captures metrics and tensors during training for real-time and post-training analysis.
  • Sends alerts when anomalies are detected, enabling early intervention.
  • Provides visualization tools to understand model behavior.
  • Supports popular frameworks including TensorFlow, PyTorch, MXNet, and XGBoost.

Edge and Hybrid Deployment

⚠️ SageMaker Edge Manager – End of Life: SageMaker Edge Manager reached EOL on April 26, 2024. All references to edge packaging jobs, devices, and device fleets have been deleted. Resources created by Edge Manager (S3 packages, IoT things, IAM roles) continue to exist on their respective services. For edge deployment, use SageMaker Neo with AWS IoT Greengrass.

SageMaker Neo

SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge:

  • Optimizes models for a wide range of hardware platforms including CPUs, GPUs, and specialized ML accelerators.
  • Provides specialized techniques for optimizing large language models and other foundation models.
  • Includes advanced quantization techniques that preserve model accuracy while reducing size.
  • Supports intelligent partitioning of models across multiple devices or between edge and cloud.
  • Can be used with IoT Greengrass to perform machine learning inference locally on devices (recommended replacement for Edge Manager workflows).

SageMaker vs Amazon Bedrock

Understanding when to use SageMaker AI vs Amazon Bedrock for foundation models:

Criteria SageMaker AI / JumpStart Amazon Bedrock
Best For Full control over training, fine-tuning, and deployment infrastructure Serverless API access to FMs without infrastructure management
Customization Full fine-tuning, PEFT (LoRA/QLoRA), DPO, RLVR, distillation Fine-tuning, continued pre-training, model distillation via API
Infrastructure Customer-managed instances (GPU, Trainium, Inferentia) Fully managed, no instance provisioning
Model Selection Open-source models (Llama, DeepSeek, Mistral, Falcon, etc.) Proprietary + open models (Claude, Titan, Llama, Mistral, etc.)
Integration JumpStart models can be imported into Bedrock via Custom Model Import Both accessible via SageMaker Unified Studio

SageMaker AI Pricing

SageMaker AI follows a pay-as-you-go pricing model with no upfront commitments:

  • Users pay only for the resources they use across the ML workflow.
  • Costs are based on the instance types and duration of usage for notebooks, training, and inference.
  • Storage costs apply for data stored in SageMaker Feature Store, model artifacts, and other components.
  • Serverless options like SageMaker Serverless Inference charge based on duration and memory configuration.
  • Serverless Model Customization uses pay-per-token pricing for both training and inference.
  • Cost optimization features include:
    • Managed Spot Training for reduced training costs (up to 90% savings)
    • Multi-model endpoints for efficient hosting of multiple models
    • Inference Components for intelligent model packing and per-model scaling
    • Serverless Inference for pay-per-use model hosting
    • Inference Recommender for optimal instance selection
    • Auto-scaling to match resources with demand
    • SageMaker Savings Plans for committed usage discounts
    • Flexible Training Plans for reserved compute capacity at predictable pricing
    • Trainium/Inferentia instances for up to 50-70% cost savings vs GPUs

AWS Certification Exam Practice Questions

  • Questions are collected from various sources and answers reflect our understanding, which may differ from yours.
  • AWS services are updated frequently, so some information may become outdated.
  • AWS exam questions may not always reflect the latest service updates.
  • We welcome feedback and corrections to improve accuracy.
  1. A company has built a deep learning model and now wants to deploy it using the SageMaker AI Hosting Services. For inference, they want a cost-effective option that guarantees low latency but still comes at a fraction of the cost of using a GPU instance for your endpoint. As a machine learning Specialist, what feature should be used?
    1. Inference Pipeline
    2. AWS Inferentia
    3. SageMaker Ground Truth
    4. SageMaker Neo

    Answer: AWS Inferentia – AWS Inferentia (and Inferentia2) is designed specifically for high-performance, cost-effective ML inference, providing better performance at lower cost compared to general-purpose GPU instances. Note: Amazon Elastic Inference is no longer available and should not be selected.

  2. A trading company is experimenting with different datasets, algorithms, and hyperparameters to find the best combination for the machine learning problem. The company doesn’t want to limit the number of experiments the team can perform but wants to track the several hundred to over a thousand experiments throughout the modeling effort. Which Amazon SageMaker AI feature should they use to help manage your team’s experiments at scale?
    1. SageMaker Inference Pipeline
    2. SageMaker Experiments
    3. SageMaker Neo
    4. SageMaker model containers

    Answer: SageMaker Experiments – SageMaker Experiments is designed specifically for tracking, organizing, and comparing machine learning iterations at scale.

  3. A Machine Learning Specialist needs to monitor Amazon SageMaker AI in a production environment for analyzing records of actions taken by a user, role, or an AWS service. Which service should the Specialist use to meet these needs?
    1. AWS CloudTrail
    2. Amazon CloudWatch
    3. AWS Systems Manager
    4. AWS Config

    Answer: AWS CloudTrail – CloudTrail is the appropriate service for tracking API calls and user actions, while CloudWatch is better suited for performance monitoring and metrics.

  4. A company wants to train a large language model with hundreds of billions of parameters but is concerned about hardware failures interrupting multi-week training jobs. Which SageMaker AI feature is specifically designed to address this concern?
    1. SageMaker Managed Spot Training
    2. SageMaker HyperPod
    3. SageMaker Distributed Training
    4. SageMaker Neo

    Answer: SageMaker HyperPod – HyperPod is specifically designed for training foundation models at scale with automatic recovery from instance failures and checkpointing capabilities, reducing training time by up to 40%.

  5. An organization wants to deploy ML models at the edge on IoT devices. They were previously using SageMaker Edge Manager. What is the recommended approach now that Edge Manager has reached end of life?
    1. Use SageMaker Batch Transform with scheduled jobs
    2. Use SageMaker Neo with AWS IoT Greengrass
    3. Use SageMaker Serverless Inference
    4. Use Amazon Elastic Inference

    Answer: SageMaker Neo with AWS IoT Greengrass – SageMaker Neo optimizes models for edge hardware, and IoT Greengrass enables local inference on devices. This is the recommended replacement for Edge Manager (EOL April 2024). Note: Elastic Inference is also discontinued.

  6. A data science team needs to fine-tune a DeepSeek-R1 model for their specific use case but wants to avoid the complexity of managing infrastructure and selecting compute resources. Which SageMaker AI capability should they use?
    1. SageMaker Training with custom containers
    2. SageMaker Serverless Model Customization
    3. SageMaker Canvas
    4. SageMaker Studio Lab

    Answer: SageMaker Serverless Model Customization – This capability automatically provisions the right compute resources based on model and data size, supports advanced techniques (SFT, DPO, RLVR, RLAIF), and uses pay-per-token pricing without infrastructure management.

  7. A company wants to deploy a voice agent that requires real-time, continuous interaction where the model needs to receive audio input and generate responses simultaneously. Which SageMaker AI inference option is best suited for this use case?
    1. Real-time inference
    2. Asynchronous inference
    3. Bidirectional streaming
    4. Batch transform

    Answer: Bidirectional streaming – Introduced in 2025, bidirectional streaming enables real-time multi-modal applications by maintaining persistent WebSocket connections where data flows simultaneously in both directions, ideal for voice agents and live transcription.

  8. A company is deploying multiple foundation models and wants to optimize costs by efficiently sharing compute resources across models while maintaining individual scaling policies for each model. Which SageMaker AI feature should they use?
    1. Multi-model endpoints
    2. Inference Components
    3. Production Variants
    4. SageMaker Serverless Inference

    Answer: Inference Components – Inference Components abstract ML models and enable assigning dedicated resources and specific scaling policies per model while optimizing resource utilization through intelligent model packing on shared infrastructure.

  9. An ML team wants to reduce cold start latency when scaling their inference endpoints during traffic spikes. They want new model copies to become available faster on already-provisioned instances. Which 2026 SageMaker AI feature addresses this?
    1. SageMaker Inference Recommender
    2. Container Caching
    3. SageMaker Neo compilation
    4. Flexible Training Plans

    Answer: Container Caching – Container caching stores container images and model artifacts on already running instances, reducing cold start latency for scaling inference component operations that reuse existing instances.

  10. A company wants to use a unified platform where data engineers can process data with SQL and Spark, data scientists can train ML models, and AI developers can build generative AI applications — all using a single governed environment. Which AWS service provides this capability? (Select TWO)
    1. Amazon SageMaker Unified Studio
    2. Amazon SageMaker AI Studio
    3. Amazon SageMaker Lakehouse
    4. Amazon Bedrock
    5. AWS Glue Studio

    Answer: A, C – SageMaker Unified Studio provides the single IDE for all data, analytics, and AI workloads, while SageMaker Lakehouse provides the unified data architecture built on Apache Iceberg. Together they form the next-generation SageMaker platform.

References

Amazon SageMaker AI Documentation

Amazon SageMaker FAQs

Amazon SageMaker Pricing

Next-Generation Amazon SageMaker Documentation

Amazon SageMaker Lakehouse

Amazon SageMaker Data and AI Governance

AWS Trainium

AWS Inferentia

Machine Learning Concepts – Cheat Sheet

Machine Learning Concepts

📋 Certification Relevance (Updated June 2026)

This post covers Machine Learning concepts relevant for:

  • AWS Certified AI Practitioner (AIF-C01) — Domain 1: AI and ML Fundamentals (20%)
  • AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Domain 2: ML Model Development (26%)
  • AWS Certified Generative AI Developer – Professional (New in 2026)

Note: The AWS Certified Machine Learning – Specialty exam was retired on March 31, 2026. It has been replaced by the ML Engineer Associate and the new Generative AI Developer Professional certifications.

This post covers some of the basic Machine Learning concepts mostly relevant for the AWS AI and Machine Learning certification exams.

Machine Learning Lifecycle

Data Processing and Exploratory Analysis

  • To train a model, you need data.
  • Type of data that depends on the business problem that you want the model to solve (the inferences that you want the model to generate).
  • Process data includes data collection, data cleaning, data split, data exploring, preprocessing, transformation, formatting etc.

Feature Selection and Engineering

  • helps improve model accuracy and speed up training
  • remove irrelevant data inputs using domain knowledge for e.g. name
  • remove features which has same values, very low correlation, very little variance or lot of missing values
  • handle missing data using mean values or imputation
  • combine features which are related for e.g. height and age to height/age
  • convert or transform features to useful representation for e.g. date to day or hour
  • standardize data ranges across features

Missing Data

  • do nothing
  • remove the feature with lot of missing data points
  • remove samples with missing data, if the feature needs to be used
  • Impute using mean/median value
    • no impact and the dataset is not skewed
    • works with numerical values only. Do not use for categorical features.
    • doesn’t factor correlations between features
  • Impute using (Most Frequent) or (Zero/Constant) Values
    • works with categorical features
    • doesn’t factor correlations between features
    • can introduce bias
  • Impute using k-NN, Multivariate Imputation by Chained Equation (MICE), Deep Learning
    • more accurate than the mean, median or most frequent
    • Computationally expensive

Unbalanced Data

  • Source more real data
  • Oversampling instances of the minority class or undersampling instances of the majority class
  • Create or synthesize data using techniques like SMOTE (Synthetic Minority Oversampling TEchnique)

Label Encoding and One-hot Encoding

  • Models cannot multiply strings by the learned weights, encoding helps convert strings to numeric values.
  • Label encoding
    • Use Label encoding to provide lookup or map string data values to a numerical values
    • However, the values are random and would impact the model
  • One-hot encoding
    • Use One-hot encoding for Categorical features that have a discrete set of possible values.
    • One-hot encoding provide binary representation by converting data values into features without impacting the relationships
    • a binary vector is created for each categorical feature in the model that represents values as follows:
      • For values that apply to the example, set corresponding vector elements to 1.
      • Set all other elements to 0.
    • Multi-hot encoding is when multiple values are 1

Cleaning Data

  • Scaling or Normalization means converting floating-point feature values from their natural range (for example, 100 to 900) into a standard range (for example, 0 to 1 or -1 to +1)

Train a model

  • Model training includes both training and evaluating the model,
  • To train a model, algorithm is needed.
  • Data can be split into training data, validation data and test data
    • Algorithm sees and is directly influenced by the training data
    • Algorithm uses but is indirectly influenced by the validation data
    • Algorithm does not see the testing data during training
  • Training can be performed using normal parameters or features and hyperparameters

Supervised, Unsupervised, Semi-Supervised, and Reinforcement Learning

Supervised Learning

  • Uses labeled data — both input features and correct output (target) are provided
  • Model learns mapping between inputs and outputs
  • Types: Classification (categorical output) and Regression (continuous output)
  • Examples: spam detection, image classification, price prediction

Unsupervised Learning

  • Uses unlabeled data — no target variable provided
  • Model discovers hidden patterns or structures in data
  • Types: Clustering, Dimensionality Reduction, Anomaly Detection, Association
  • Examples: customer segmentation, PCA, anomaly detection

Semi-Supervised Learning

  • Combines a small amount of labeled data with a large amount of unlabeled data
  • Useful when labeling data is expensive or time-consuming
  • The model learns from labeled data and generalizes using unlabeled data patterns
  • Examples: medical image classification (few expert-labeled images + many unlabeled)

Self-Supervised Learning

  • A form of unsupervised learning where the model generates its own labels from the input data
  • The foundation of modern Large Language Models (LLMs) and Foundation Models
  • Techniques include masked language modeling (predict missing words) and next-token prediction
  • Enables pre-training on massive unlabeled datasets before fine-tuning on specific tasks
  • Examples: BERT (masked word prediction), GPT (next-token prediction)

Reinforcement Learning

  • Agent learns by interacting with an environment and receiving rewards or penalties
  • Goal is to maximize cumulative reward through trial and error
  • Key concepts: Agent, Environment, State, Action, Reward, Policy
  • Used in robotics, game playing, autonomous vehicles, recommendation systems
  • Reinforcement Learning from Human Feedback (RLHF) — used to fine-tune LLMs based on human preference rankings (key technique behind ChatGPT and similar models)

Splitting and Randomization

  • Always randomize the data before splitting

Hyperparameters

  • influence how the training occurs
  • Common hyperparameters are learning rate, epoch, batch size
  • Learning rate
    • size of the step taken during gradient descent optimization
    • Large learning rates can overshoot the correct solution
    • Small learning rates increase training time
  • Batch size
    • number of samples used to train at any one time
    • can be all (batch), one (stochastic), or some (mini-batch)
    • calculable from infrastructure
    • Small batch sizes tend to not get stuck in local minima
    • Large batch sizes can converge on the wrong solution at random.
  • Epochs
    • number of times the algorithm processes the entire training data
    • each epoch or run can see the model get closer to the desired state
  • depends on algorithm used

Evaluating the model

After training the model, evaluate it to determine whether the accuracy of the inferences is acceptable.

ML Model Insights

  • For binary classification models use accuracy metric called Area Under the (Receiver Operating Characteristic) Curve (AUC). AUC measures the ability of the model to predict a higher score for positive examples as compared to negative examples.
  • For regression tasks, use the industry standard root mean square error (RMSE) metric. It is a distance measure between the predicted numeric target and the actual numeric answer (ground truth). The smaller the value of the RMSE, the better is the predictive accuracy of the model.

Cross-Validation

  • is a technique for evaluating ML models by training several ML models on subsets of the available input data and evaluating them on the complementary subset of the data.
  • Use cross-validation to detect overfitting, ie, failing to generalize a pattern.
  • there is no separate validation data, involves splitting the training data into chunks of validation data and use it for validation

Optimization

  • Gradient Descent is used to optimize many different types of machine learning algorithms
  • Step size sets Learning rate
    • If the learning rate is too large, the minimum slope might be missed and the graph would oscillate
    • If the learning rate is too small, it requires too many steps which would take the process longer and is less efficient

Underfitting

  • Model is underfitting the training data when the model performs poorly on the training data because the model is unable to capture the relationship between the input examples (often called X) and the target values (often called Y).
  • To increase model flexibility
    • Add new domain-specific features and more feature Cartesian products, and change the types of feature processing used (e.g., increasing n-grams size)
    • Regularization – Decrease the amount of regularization used
    • Increase the amount of training data examples.
    • Increase the number of passes on the existing training data.

Overfitting

  • Model is overfitting the training data when the model performs well on the training data but does not perform well on the evaluation data because the model is memorizing the data it has seen and is unable to generalize to unseen examples.
  • To increase model flexibility
    • Feature selection: consider using fewer feature combinations, decrease n-grams size, and decrease the number of numeric attribute bins.
    • Simplify the model, by reducing the number of layers.
    • Regularization – technique to reduce the complexity of the model. Increase the amount of regularization used.
    • Early Stopping – a form of regularization while training a model with an iterative method, such as gradient descent.
    • Data Augmentation – process of artificially generating new data from existing data, primarily to train new ML models.
    • Dropout is a regularization technique that prevents overfitting.

Deep Learning and Neural Networks

  • Deep Learning is a subset of machine learning that uses neural networks with multiple layers (deep neural networks) to learn complex patterns
  • Excels at tasks involving unstructured data (images, text, audio)
  • Requires large amounts of data and significant compute resources

Neural Network Fundamentals

  • Composed of layers: Input Layer, Hidden Layers, and Output Layer
  • Each neuron applies weights, bias, and an activation function to its inputs
  • Common activation functions: ReLU, Sigmoid, Tanh, Softmax
  • Backpropagation — algorithm used to calculate gradients and update weights during training
  • A network with 2+ hidden layers is considered a “deep” neural network

Types of Neural Networks

Convolutional Neural Networks (CNNs)

  • Specialized for processing grid-like data (images, spatial data)
  • Uses convolutional layers with filters/kernels to detect features (edges, shapes, objects)
  • Key layers: Convolutional, Pooling (Max/Average), Fully Connected
  • Use cases: image classification, object detection, computer vision

Recurrent Neural Networks (RNNs)

  • Designed for sequential data (time series, text, speech)
  • Maintains internal state (memory) that captures information from previous time steps
  • LSTM (Long Short-Term Memory) — addresses vanishing gradient problem with gates (forget, input, output)
  • GRU (Gated Recurrent Unit) — simplified version of LSTM with fewer parameters
  • Largely superseded by Transformers for NLP tasks due to inability to parallelize

Transformers

  • Architecture introduced in “Attention Is All You Need” (2017) — foundation of modern LLMs
  • Processes entire input sequences in parallel using self-attention mechanism
  • Key innovation: Attention mechanism allows each token to attend to every other token regardless of distance
  • Components: Multi-Head Attention, Positional Encoding, Feed-Forward Layers, Layer Normalization
  • Encoder — processes input and creates representations (BERT-style models)
  • Decoder — generates output tokens auto-regressively (GPT-style models)
  • Encoder-Decoder — used for sequence-to-sequence tasks (T5, translation)
  • Use cases: NLP, text generation, translation, code generation, image generation

Generative Adversarial Networks (GANs)

  • Two networks compete: Generator (creates synthetic data) vs Discriminator (detects fake data)
  • Training improves both networks iteratively
  • Use cases: image generation, style transfer, data augmentation

Generative AI and Foundation Models

  • Generative AI is a subset of deep learning that creates new content (text, images, code, audio, video) based on patterns learned from training data
  • Foundation Models (FMs) are large pre-trained models trained on broad datasets that can be adapted for many downstream tasks
  • Examples: GPT-4, Claude, Amazon Titan, Amazon Nova, Llama, Gemini

Large Language Models (LLMs)

  • Foundation models trained on massive text corpora using self-supervised learning
  • Based on the Transformer architecture
  • Capabilities: text generation, summarization, translation, code generation, reasoning
  • Key characteristics:
    • Parameters — learned weights (billions to trillions); more parameters generally = more capability
    • Context window — maximum number of tokens the model can process at once
    • Tokens — basic units of text (words or sub-words) the model processes
    • Temperature — controls randomness of output (0 = deterministic, higher = more creative)

Customizing Foundation Models

Prompt Engineering

  • Crafting input prompts to guide model output without changing model weights
  • Zero-shot — no examples provided in prompt
  • Few-shot — providing examples in the prompt to guide the response
  • Chain-of-thought — guiding model to show reasoning steps
  • No training required; fastest and cheapest customization method

Retrieval-Augmented Generation (RAG)

  • Combines retrieval of external knowledge with generation of responses
  • Retrieves relevant documents from a knowledge base and includes them in the prompt context
  • Reduces hallucinations by grounding responses in factual data
  • No model retraining required; knowledge base can be updated independently
  • Components: Document store, Embedding model, Vector database, Retriever, Generator
  • AWS service: Amazon Bedrock Knowledge Bases

Fine-Tuning

  • Further training a pre-trained model on a task-specific labeled dataset
  • Updates model weights to specialize for specific domain or task
  • Requires labeled training data and compute resources
  • Types:
    • Full fine-tuning — updates all model parameters (expensive)
    • Parameter-Efficient Fine-Tuning (PEFT) — updates only a small subset of parameters
    • LoRA (Low-Rank Adaptation) — adds small trainable matrices to frozen model layers
  • AWS service: Amazon Bedrock and Amazon SageMaker AI

Continued Pre-Training

  • Training a foundation model on additional domain-specific unlabeled data
  • Teaches the model new domain knowledge (medical, legal, financial terminology)
  • More expensive than fine-tuning but creates deeper domain understanding

Transfer Learning

  • Technique of using a model trained on one task as the starting point for a related task
  • Reduces training time and data requirements significantly
  • Foundation models are the ultimate form of transfer learning — pre-trained on broad data, then adapted
  • Approaches: Feature extraction (freeze base layers) or Fine-tuning (update some/all layers)

Agentic AI

  • AI systems that can autonomously plan, reason, and take actions to accomplish goals
  • Uses tools, APIs, and knowledge bases to complete multi-step tasks
  • Key components: Planning, Memory, Tool use, Reasoning
  • AWS service: Amazon Bedrock Agents, Amazon Bedrock AgentCore

Classification Model Evaluation

Confusion Matrix

  • Confusion matrix represents the percentage of times each label was predicted in the training set during evaluation
  • An NxN table that summarizes how successful a classification model’s predictions were; that is, the correlation between the label and the model’s classification.
  • One axis of a confusion matrix is the label that the model predicted, and the other axis is the actual label.
  • N represents the number of classes. In a binary classification problem, N=2
    • For example, here is a sample confusion matrix for a binary classification problem:
Tumor (predicted) Non-Tumor (predicted)
Tumor (actual) 18 (True Positives) 1 (False Negatives)
Non-Tumor (actual) 6 (False Positives) 452 (True Negatives)
    • Confusion matrix shows that of the 19 samples that actually had tumors, the model correctly classified 18 as having tumors (18 true positives), and incorrectly classified 1 as not having a tumor (1 false negative).
    • Similarly, of 458 samples that actually did not have tumors, 452 were correctly classified (452 true negatives) and 6 were incorrectly classified (6 false positives).
  • Confusion matrix for a multi-class classification problem can help you determine mistake patterns. For example, a confusion matrix could reveal that a model trained to recognize handwritten digits tends to mistakenly predict 9 instead of 4, or 1 instead of 7.

Accuracy, Precision, Recall (Sensitivity) and Specificity

Accuracy

  • A metric for classification models, that identifies fraction of predictions that a classification model got right.
  • In Binary classification, calculated as (True Positives+True Negatives)/Total Number Of Examples
  • In Multi-class classification, calculated as Correct Predictions/Total Number Of Examples

Precision

  • A metric for classification models. that identifies the frequency with which a model was correct when predicting the positive class.
  • Calculated as True Positives/(True Positives + False Positives)

Recall – Sensitivity – True Positive Rate (TPR)

  • A metric for classification models that answers the following question: Out of all the possible positive labels, how many did the model correctly identify i.e. Number of correct positives out of actual positive results
  • Calculated as True Positives/(True Positives + False Negatives)
  • Important when – False Positives are acceptable as long as ALL positives are found for e.g. it is fine to predict Non-Tumor as Tumor as long as All the Tumors are correctly predicted

Specificity – True Negative Rate (TNR)

  • Number of correct negatives out of actual negative results
  • Calculated as True Negatives/(True Negatives + False Positives)
  • Important when – False Positives are unacceptable; it’s better to have false negatives for e.g. it is not fine to predict Non-Tumor as Tumor;

ROC and AUC

ROC (Receiver Operating Characteristic) Curve

  • An ROC curve (receiver operating characteristic curve) is curve of true positive rate vs. false positive rate at different classification thresholds.
  • An ROC curve is a graph showing the performance of a classification model at all classification thresholds.
  • An ROC curve plots True Positive Rate (TPR) vs. False Positive Rate (FPR) at different classification thresholds. Lowering the classification threshold classifies more items as positive, thus increasing both False Positives and True Positives.

    ROC Curve showing TP Rate vs. FP Rate at different classification thresholds.

AUC (Area under the ROC curve)

  • AUC stands for “Area under the ROC Curve.”
  • AUC measures the entire two-dimensional area underneath the entire ROC curve (think integral calculus) from (0,0) to (1,1).
  • AUC provides an aggregate measure of performance across all possible classification thresholds.
  • One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example.

AUC (Area under the ROC Curve).

F1 Score

  • F1 score (also F-score or F-measure) is a measure of a test’s accuracy.
  • It considers both the precision p and the recall r of the test to compute the score: p is the number of correct positive results divided by the number of all positive results returned by the classifier, and r is the number of correct positive results divided by the number of all relevant samples (all samples that should have been identified as positive).
  • Calculated as: F1 = 2 × (Precision × Recall) / (Precision + Recall)
  • Ranges from 0 to 1, with 1 being perfect precision and recall
  • Useful when you need a balance between precision and recall, especially with imbalanced datasets

Generative AI Evaluation Metrics

  • Traditional ML metrics (accuracy, F1) are insufficient for evaluating generative AI outputs
  • Generative AI requires metrics that measure quality, safety, relevance, and factual accuracy

Text Generation Metrics

BLEU (Bilingual Evaluation Understudy)

  • Measures n-gram overlap between generated text and reference text
  • Ranges from 0 to 1 (higher is better)
  • Originally designed for machine translation evaluation
  • Limitation: only measures lexical similarity, not semantic meaning

ROUGE (Recall-Oriented Understudy for Gisting Evaluation)

  • Measures overlap between generated summary and reference summary
  • ROUGE-N — n-gram overlap (ROUGE-1 = unigrams, ROUGE-2 = bigrams)
  • ROUGE-L — longest common subsequence
  • Commonly used for summarization tasks

Perplexity

  • Measures how well a model predicts a sequence of words
  • Lower perplexity = better model (model is less “surprised” by the text)
  • Useful for comparing language models but doesn’t directly measure output quality

BERTScore

  • Uses contextual embeddings to measure semantic similarity between generated and reference text
  • Better than BLEU/ROUGE at capturing meaning rather than just word overlap

Generative AI Safety Metrics

Hallucination Rate

  • Measures how often a model generates factually incorrect or fabricated information
  • Critical metric for production deployments
  • Mitigated by RAG, grounding, and guardrails

Groundedness

  • Measures whether model responses are supported by provided context/source documents
  • Key metric for RAG systems

Toxicity and Content Safety

  • Measures presence of harmful, biased, or inappropriate content in outputs
  • AWS service: Amazon Bedrock Guardrails for content filtering

Human Evaluation and LLM-as-a-Judge

  • Human evaluation remains the gold standard for assessing quality, helpfulness, and safety
  • LLM-as-a-Judge — using a capable LLM to evaluate outputs of another model (scalable alternative to human evaluation)
  • AWS service: Amazon Bedrock Evaluations for automated model evaluation

Responsible AI and ML Fairness

  • Ensuring AI systems are fair, transparent, accountable, and safe
  • Regulatory frameworks: EU AI Act, ISO 42001, NIST AI RMF

Bias in Machine Learning

  • Data Bias — bias present in training data (sampling bias, historical bias, measurement bias)
  • Algorithmic Bias — bias introduced by the model or training process
  • Selection Bias — non-representative training data
  • Detection: AWS Amazon SageMaker Clarify provides pre-training and post-training bias detection metrics

Explainability and Interpretability

  • Explainability — ability to understand why a model made a specific prediction
  • Techniques: SHAP (SHapley Additive exPlanations), feature importance, attention visualization
  • AWS service: Amazon SageMaker Clarify for model explainability
  • Critical for regulated industries (healthcare, finance) where decisions must be justified

Model Governance

  • Tracking model lineage, versioning, and approval workflows
  • Model cards documenting intended use, limitations, and evaluation results
  • AWS service: Amazon SageMaker Model Registry and SageMaker Data & AI Governance

Deploy the model

  • Re-engineer a model before integrating it with the application and deploy it.
  • Can be deployed as a Batch or as a Service (real-time endpoint)
  • Model Monitoring — continuously track model performance in production to detect drift
    • Data Drift — input data distribution changes over time compared to training data
    • Model Drift (Concept Drift) — relationship between input and output changes over time
    • Requires retraining or updating the model when drift is detected
  • A/B Testing — deploying multiple model variants to compare performance with real traffic
  • Shadow Deployment — running new model alongside production model without serving its predictions to users

AWS Machine Learning Services Summary

  • Amazon SageMaker AI — fully managed service to build, train, and deploy ML models at scale
  • Amazon Bedrock — managed service to build generative AI applications using foundation models (Amazon Titan, Amazon Nova, Claude, Llama, etc.)
  • Amazon Bedrock Knowledge Bases — managed RAG service
  • Amazon Bedrock Agents — build autonomous AI agents
  • Amazon Bedrock Guardrails — content filtering and safety controls
  • Amazon Q Developer — AI-powered coding assistant
  • Amazon SageMaker Clarify — bias detection and model explainability
  • Amazon Comprehend — NLP service for text analysis
  • Amazon Rekognition — computer vision service
  • Amazon Transcribe — speech-to-text
  • Amazon Polly — text-to-speech
  • Amazon Translate — language translation

References