AWS AI Professional (AIP-C01) Exam Learning Path

AWS Certified Generative AI Developer – Professional (AIP-C01) Overview

The AWS Certified Generative AI Developer – Professional (AIP-C01) is AWS’s professional-level certification for developers who build and deploy production-ready Generative AI solutions. Launched in 2025, this certification validates your ability to integrate foundation models into applications, implement RAG architectures, design agentic AI systems, and operationalize GenAI solutions on AWS.

Exam Detail Information
Exam Code AIP-C01
Full Name AWS Certified Generative AI Developer – Professional
Level Professional
Number of Questions 75 (+ 10 unscored)
Duration 180 minutes
Passing Score 750 / 1000
Cost $300 USD
Format Multiple choice & multiple response
Testing Pearson VUE (center or online proctored)
Languages English, Japanese, Korean, Simplified Chinese
Validity 3 years

Target Candidate Profile

  • 2+ years building production-grade applications on AWS
  • 1+ year hands-on experience implementing Generative AI solutions
  • Experience with AWS compute, storage, networking, and security services
  • Understanding of AWS deployment, IaC tools, and monitoring services
  • Familiarity with AI/ML concepts and data engineering

Recommended prior certifications (not required): AWS Certified AI Practitioner (AIF-C01), AWS Solutions Architect Associate, AWS Machine Learning Engineer Associate

AIP-C01 Exam Domains & Weightings

Domain Weight Key Topics
Domain 1: Foundation Model Integration, Data Management & Compliance 31% RAG implementation, vector stores, prompt engineering, FM selection & customization, data pipelines
Domain 2: Implementation & Integration 26% Agentic AI, tool integrations, model deployment, enterprise integration, CI/CD, troubleshooting
Domain 3: AI Safety, Security & Governance 20% Data privacy, model security, Guardrails, responsible AI, compliance, access control
Domain 4: Operational Efficiency & Optimization 12% Cost optimization, performance tuning, scaling, monitoring, A/B testing
Domain 5: Testing, Validation & Troubleshooting 11% Model evaluation metrics, benchmarking, quality assurance, debugging

Domain 1: Foundation Model Integration, Data Management & Compliance (31%)

This is the largest domain and covers the core of building GenAI solutions on AWS.

Key Topics

  • Solution Design: Architecture design using FMs, proof-of-concept implementations, Well-Architected Framework GenAI Lens
  • FM Selection & Configuration: Model benchmarking, cross-region inference, fine-tuning (LoRA, adapters), model lifecycle management via SageMaker Model Registry
  • Data Pipelines: Data validation workflows (AWS Glue Data Quality), multimodal data processing, input formatting for FM inference
  • Vector Stores: Vector database architecture (OpenSearch, Aurora pgvector, Bedrock Knowledge Bases), metadata frameworks, embedding solutions (Amazon Titan Embeddings)
  • Retrieval Mechanisms (RAG): Document chunking strategies, hybrid search (keyword + vector), reranking models, query expansion & decomposition
  • Prompt Engineering & Governance: Amazon Bedrock Prompt Management, parameterized templates, prompt flows, chain-of-thought patterns, quality assurance

AWS Services to Study

Domain 2: Implementation & Integration (26%)

This domain focuses on building production systems with agentic AI and enterprise integrations.

Key Topics

  • Agentic AI: Bedrock Agents, Strands Agents, AWS Agent Squad, MCP (Model Context Protocol), ReAct patterns, multi-agent systems
  • Tool Integrations: Function calling, MCP servers (Lambda & ECS), custom tool behaviors, error handling
  • Model Deployment: Lambda for on-demand inference, Bedrock provisioned throughput, SageMaker endpoints, container-based deployment
  • Enterprise Integration: API Gateway, EventBridge event-driven architectures, CI/CD pipelines (CodePipeline, CodeBuild), GenAI gateway architectures
  • Troubleshooting: Context window overflow, prompt debugging, retrieval system diagnostics, embedding drift monitoring

AWS Services to Study

  • Amazon Bedrock Agents – Autonomous AI agents with tool use
  • Amazon Q Developer – AI-powered development assistant
  • AWS Step Functions – Workflow orchestration for AI pipelines
  • AWS Lambda – Serverless inference, MCP servers
  • Amazon API Gateway – Enterprise API integrations
  • AWS CodePipeline / CodeBuild – CI/CD for GenAI

Domain 3: AI Safety, Security & Governance (20%)

Security and responsible AI are critical at the professional level.

Key Topics

  • Data Privacy: Data encryption (at rest/in transit), PII detection and redaction, data residency compliance
  • Model Security: IAM least-privilege access to FMs, identity federation, role-based access control
  • Guardrails: Amazon Bedrock Guardrails – content filtering, topic denial, PII redaction, grounding checks
  • Responsible AI: Bias detection, fairness evaluation, transparency, human-in-the-loop workflows
  • Compliance: Cross-jurisdiction deployments (Outposts, Wavelength), audit logging (CloudTrail), governance frameworks

AWS Services to Study

  • Amazon Bedrock Guardrails – Content filtering, responsible AI controls
  • AWS IAM – Fine-grained access control for AI services
  • AWS CloudTrail – Audit logging for AI operations
  • AWS KMS – Encryption key management
  • Amazon Macie – PII detection in data stores

Domain 4: Operational Efficiency & Optimization (12%)

Key Topics

  • Cost Optimization: Model cascading (smaller models for simple tasks), provisioned throughput vs. on-demand, right-sizing
  • Performance Tuning: Latency optimization, token processing capacity, GPU utilization
  • Scaling: Auto-scaling SageMaker endpoints, Bedrock cross-region inference, load balancing
  • Monitoring: CloudWatch metrics for AI workloads, observability pipelines (X-Ray), drift detection

Domain 5: Testing, Validation & Troubleshooting (11%)

Key Topics

  • Model Evaluation: Relevance scoring, hallucination detection, semantic drift, RAGAS metrics
  • Agent Evaluation: Task completion rates, tool usage effectiveness, Amazon Bedrock Agent evaluations
  • Retrieval Quality: Context matching verification, retrieval latency, embedding quality diagnostics
  • Deployment Validation: A/B testing, canary deployments, synthetic user workflows, automated quality checks

Recommended Study Resources

Video Courses

Course Platform Notes
Ultimate AWS Certified Generative AI Developer Professional by Stephane Maarek Udemy Comprehensive course with hands-on labs and 75-question practice exam
AWS Certified Generative AI Developer Professional AIP-C01 Udemy Security, governance, cost optimization focus
Exam Prep: AWS Certified Generative AI Developer AWS Skill Builder Official AWS exam prep (free with subscription)
Generative AI Developer Professional KodeKloud Hands-on labs with AWS sandbox environments

Practice Tests

Resource Platform Questions
[Practice Exams] AWS Certified Generative AI Developer Pro by Stephane Maarek & Abhishek Singh Udemy Multiple full-length exams with explanations
AWS Certification Official Practice Question Set AWS Skill Builder 20 official questions (free)
AWS Certification Official Pretest AWS Skill Builder Full-length readiness assessment
Whizlabs AIP-C01 Practice Tests Whizlabs Multiple practice exams with explanations

Documentation & Reading

10-Week Study Plan

Week Focus Area Activities
Week 1 Exam Overview & Foundations Read exam guide, review AI Services Cheat Sheet, understand all 5 domains and weightings
Week 2 Amazon Bedrock Core Study Amazon Bedrock, FM selection, model invocation APIs, Nova models, Titan Embeddings
Week 3 RAG & Vector Stores Study Bedrock Knowledge Bases, chunking strategies, OpenSearch vector search, hybrid search, reranking
Week 4 Prompt Engineering & Fine-tuning Bedrock Prompt Management, Prompt Flows, chain-of-thought, LoRA fine-tuning, SageMaker model customization
Week 5 Agentic AI & Tool Integration Study Bedrock Agents, Strands Agents, MCP, function calling, multi-agent orchestration, ReAct patterns
Week 6 Enterprise Integration & Deployment API Gateway integration, Step Functions workflows, CI/CD for GenAI (CodePipeline), container deployment patterns, Q Developer
Week 7 Security, Governance & Responsible AI Bedrock Guardrails, IAM for AI services, data privacy, PII handling, compliance, responsible AI practices
Week 8 Optimization & Monitoring Cost optimization (model cascading, provisioned throughput), performance tuning, CloudWatch metrics, X-Ray observability
Week 9 Testing, Evaluation & Troubleshooting Model evaluation metrics, agent evaluations, retrieval quality testing, deployment validation, debugging GenAI apps
Week 10 Review & Practice Exams Take 2-3 full practice exams, review weak areas, re-read exam guide, focus on scenario-based questions

Study Tips

  • Hands-on practice is essential – This is a professional-level exam; build actual RAG pipelines and deploy agents on AWS
  • Focus on Domain 1 & 2 – Together they represent 57% of the exam
  • Understand scenario-based questions – Questions are long and test architectural decision-making, not memorization
  • Know the trade-offs – When to use Bedrock vs. SageMaker, on-demand vs. provisioned throughput, different chunking strategies
  • Practice with time management – 180 minutes for 75 complex questions means ~2.4 minutes per question

AIP-C01 Practice Questions

Question 1

A company is building a customer support chatbot using Amazon Bedrock. The chatbot needs to answer questions based on 50,000 internal product documents that are updated weekly. The solution must minimize hallucinations and provide source citations. Which architecture best meets these requirements?

  1. Fine-tune a foundation model on all product documents monthly
  2. Use Amazon Bedrock Knowledge Bases with automatic chunking, vector store synchronization, and source attribution enabled
  3. Include all product documents in the system prompt for each request
  4. Train a custom model using Amazon SageMaker with the product documents as training data
Show Answer

Answer: B – Amazon Bedrock Knowledge Bases provides managed RAG with automatic document chunking, scheduled sync for weekly updates, vector store management, and built-in source attribution. Fine-tuning (A/D) doesn’t provide up-to-date factual recall, and including all documents in the prompt (C) exceeds context window limits.

Question 2

A developer is implementing an agentic AI solution that needs to query a company’s internal database, call external APIs, and generate reports. The solution must handle failures gracefully and maintain conversation state. Which combination of services should be used? (Select TWO)

  1. Amazon Bedrock Agents with action groups and Lambda functions
  2. Amazon Comprehend with custom entity recognition
  3. Amazon DynamoDB for conversation history and session state
  4. Amazon Kinesis Data Streams for real-time processing
  5. Amazon Rekognition for document analysis
Show Answer

Answer: A, C – Bedrock Agents with action groups handle tool orchestration (database queries, API calls) with built-in error handling and ReAct reasoning. DynamoDB stores conversation history for state management. Comprehend (B), Kinesis (D), and Rekognition (E) don’t address the agentic workflow requirements.

Question 3

An organization needs to ensure their GenAI application does not generate responses about competitor products, does not reveal PII from training data, and stays within approved topic boundaries. Which approach provides the MOST comprehensive solution?

  1. Implement input validation using AWS Lambda functions
  2. Configure Amazon Bedrock Guardrails with denied topics, PII filters, and content filters
  3. Use system prompts to instruct the model to avoid certain topics
  4. Fine-tune the model to remove knowledge about competitors
Show Answer

Answer: B – Amazon Bedrock Guardrails provides configurable denied topics, automated PII detection/redaction, and content filters that work at both input and output levels. System prompts (C) can be bypassed through prompt injection. Lambda validation (A) only handles input. Fine-tuning (D) cannot reliably remove specific knowledge.

Question 4

A team has deployed a GenAI application using Amazon Bedrock. After launch, they notice that response latency increases during peak hours and costs are 3x their budget. The application handles both simple FAQ queries and complex analytical questions. What is the MOST cost-effective optimization strategy?

  1. Switch all requests to the largest available model for better performance
  2. Implement model cascading: route simple queries to a smaller/cheaper model and complex queries to a larger model using a classification layer
  3. Purchase provisioned throughput for the maximum expected load
  4. Cache all responses in Amazon ElastiCache and serve cached answers for all queries
Show Answer

Answer: B – Model cascading routes simple queries to smaller, faster, cheaper models while reserving larger models for complex tasks. This optimizes both cost and latency. Using only the largest model (A) increases cost. Maximum provisioned throughput (C) over-provisions for average load. Caching all responses (D) doesn’t work for analytical questions requiring unique answers.

Question 5

A developer is building a RAG application and notices that retrieved documents are often irrelevant, leading to poor response quality. The documents are technical manuals with hierarchical structure (chapters, sections, subsections). Which combination of improvements will MOST effectively address retrieval quality? (Select TWO)

  1. Increase the chunk size to 10,000 tokens to capture more context
  2. Implement hierarchical chunking that preserves document structure and parent-child relationships
  3. Use hybrid search combining semantic vector search with keyword-based BM25 scoring
  4. Reduce the number of retrieved documents to 1 to increase precision
  5. Switch from vector search to simple keyword search
Show Answer

Answer: B, C – Hierarchical chunking preserves the document structure, maintaining context relationships between sections. Hybrid search combines the semantic understanding of vector search with the precision of keyword matching, improving relevance for technical content. Very large chunks (A) reduce precision. Only 1 document (D) may miss relevant information. Keyword-only search (E) loses semantic understanding.

Related Posts

References

Frequently Asked Questions

What is the AIP-C01 exam?

The AWS Certified AI Practitioner Professional (AIP-C01) validates ability to build, deploy, and operationalize generative AI solutions on AWS. It covers RAG implementation, agent design, MLOps, model security, and evaluation — requiring hands-on experience with Bedrock, SageMaker, and related services.

How does AIP-C01 differ from AIF-C01?

AIF-C01 (AI Practitioner) is foundational — testing conceptual knowledge of AI/ML. AIP-C01 (AI Professional) is advanced — testing hands-on ability to implement Gen AI solutions, fine-tune models, build agents, deploy with MLOps pipelines, and secure AI applications.

What experience do I need for AIP-C01?

AWS recommends 2+ years of hands-on experience building ML/Gen AI solutions on AWS, including working with Bedrock, SageMaker, and implementing RAG, fine-tuning, and agent architectures in production.

AWS AI & Generative AI Services – Cheat Sheet

AWS AI & Generative AI Services – Cheat Sheet

This is the definitive cheat sheet covering AI, Machine Learning, and Generative AI services on AWS — designed as the anchor page for both the AWS Certified AI Practitioner (AIF-C01) and AWS Certified Generative AI Developer – Professional (AIP-C01) exams.

Related Posts:

AI/ML/Generative AI Fundamentals

AI vs ML vs Deep Learning vs Generative AI

Concept Definition Examples
Artificial Intelligence (AI) Broad field of computer science focused on creating systems that can perform tasks requiring human intelligence Rule-based systems, expert systems, robotics
Machine Learning (ML) Subset of AI where systems learn from data without being explicitly programmed Fraud detection, recommendations, forecasting
Deep Learning (DL) Subset of ML using neural networks with multiple layers (deep neural networks) to learn complex patterns Image recognition, NLP, speech recognition
Generative AI (GenAI) Subset of DL that creates new content (text, images, code, video, audio) by learning patterns from training data ChatGPT, DALL-E, Amazon Nova, Claude

Learning Paradigms

  • Supervised Learning — model learns from labeled data (input-output pairs). Used for classification (spam/not spam) and regression (price prediction).
  • Unsupervised Learning — model finds patterns in unlabeled data. Used for clustering (customer segmentation), anomaly detection, and dimensionality reduction.
  • Semi-supervised Learning — combines small amount of labeled data with large amounts of unlabeled data.
  • Reinforcement Learning (RL) — agent learns by interacting with an environment, receiving rewards/penalties. Used for game playing, robotics, and RLHF in LLMs.
  • Self-supervised Learning — model generates its own labels from input data (e.g., predicting masked tokens). Used for pre-training foundation models.

Neural Networks Basics

  • Neurons/Nodes — basic computation units that receive inputs, apply weights, add bias, and pass through an activation function.
  • Layers — Input layer (receives data), Hidden layers (process data), Output layer (produces result).
  • Weights & Biases — parameters learned during training that determine the model’s behavior.
  • Activation Functions — introduce non-linearity (ReLU, Sigmoid, Softmax, Tanh).
  • Backpropagation — algorithm to compute gradients and update weights by propagating errors backward.
  • Loss Function — measures how far the model’s predictions are from actual values.
  • Transformer Architecture — foundation of modern LLMs; uses self-attention mechanism to process entire sequences in parallel (introduced in “Attention is All You Need” paper, 2017).
  • CNNs (Convolutional Neural Networks) — specialized for image/spatial data.
  • RNNs/LSTMs — sequential data processing (largely superseded by Transformers for NLP).
  • GANs (Generative Adversarial Networks) — generator + discriminator for image generation.
  • Diffusion Models — generate images/video by learning to denoise (e.g., Stable Diffusion, Nova Canvas).

📖 Deep Dive Guides: Bedrock vs SageMaker | RAG Architecture | Prompt Engineering | Responsible AI | AI Services Decision Guide

Foundation Model Concepts

Pre-training

  • Training a model on massive datasets (trillions of tokens) to learn general language/world knowledge.
  • Extremely expensive and resource-intensive (millions of GPU hours).
  • Results in a base model with broad capabilities but no specific task alignment.
  • Common objectives: next-token prediction (GPT-style), masked language modeling (BERT-style).

Fine-tuning Techniques

  • Instruction Tuning — fine-tuning on instruction-response pairs to make the model follow instructions better.
  • RLHF (Reinforcement Learning from Human Feedback) — trains a reward model from human preferences, then uses RL (PPO) to optimize the language model against that reward. Used to align models with human values.
  • DPO (Direct Preference Optimization) — simpler alternative to RLHF that directly optimizes on preference pairs without a separate reward model. More stable training.
  • LoRA / QLoRA — Parameter-Efficient Fine-Tuning (PEFT) that freezes base model and trains small adapter layers. Reduces compute by 90%+.
  • Continued Pre-training — further pre-training on domain-specific data to teach the model new knowledge (e.g., medical, legal, financial).
  • Distillation — training a smaller “student” model to mimic a larger “teacher” model’s outputs. Reduces inference cost while retaining most capability.

RAG (Retrieval Augmented Generation)

  • Combines information retrieval with text generation to ground LLM responses in external knowledge.
  • How it works: Query → Retrieve relevant documents from knowledge base → Augment prompt with retrieved context → Generate response.
  • Benefits: Reduces hallucinations, enables up-to-date responses, no model retraining needed, source attribution.
  • Components: Document ingestion, chunking strategy, embedding model, vector database, retrieval algorithm, re-ranking.
  • AWS Implementation: Amazon Bedrock Knowledge Bases, Amazon Kendra (GenAI Index), OpenSearch vector search.

Prompt Engineering

  • Zero-shot — asking the model to perform a task without any examples. Relies on pre-trained knowledge.
  • Few-shot (In-Context Learning) — providing a few examples in the prompt to guide the model’s output format and behavior.
  • Chain-of-Thought (CoT) — asking the model to “think step by step” to improve reasoning on complex tasks.
  • System Prompts — instructions that define the model’s role, behavior, and constraints.
  • Prompt Templates — reusable prompt structures with placeholders for dynamic content.
  • Prompt Chaining — breaking complex tasks into sequential prompts where output of one feeds input of next.

Key Parameters & Concepts

  • Tokenization — splitting text into tokens (subwords/words). Models process tokens, not characters. Affects context limits and pricing.
  • Embeddings — dense vector representations of text/images in high-dimensional space. Semantically similar items have similar embeddings. Used for search, RAG, and clustering.
  • Temperature — controls randomness of output. Low (0-0.3) = deterministic/focused, High (0.7-1.0) = creative/diverse. 0 = greedy decoding.
  • Top-p (Nucleus Sampling) — considers only tokens whose cumulative probability exceeds p. Top-p 0.9 = considers top 90% probability mass.
  • Top-k — limits token selection to the k most likely next tokens.
  • Context Window — maximum number of tokens (input + output) the model can process at once. Ranges from 4K to 1M+ tokens in modern models.
  • Max Tokens — limits the length of generated output.
  • Stop Sequences — tokens that signal the model to stop generating.
  • Hallucination — when a model generates plausible-sounding but factually incorrect information.
  • Grounding — techniques to anchor model responses in factual data (RAG, tool use, citations).

Responsible AI

Core Principles

  • Fairness & Bias — ensuring models don’t discriminate based on protected attributes (race, gender, age). Types: selection bias, measurement bias, representation bias, confirmation bias.
  • Explainability — ability to understand and explain how/why a model made a specific prediction. Techniques: SHAP, LIME, attention visualization, feature importance.
  • Transparency — openly communicating model capabilities, limitations, and intended use cases to users.
  • Robustness — model performs reliably across different inputs, including adversarial examples and edge cases.
  • Privacy & Security — protecting training data, user inputs, and model outputs. Preventing data leakage and prompt injection.
  • Governance — organizational policies, processes, and controls for responsible AI development and deployment.
  • Safety — preventing harmful outputs including toxic content, misinformation, and dangerous instructions.

AWS Responsible AI Tools

  • AWS AI Service Cards — transparency documentation for AWS AI services covering intended use cases, limitations, responsible AI design choices, and deployment best practices.
  • Amazon Bedrock Guardrails — configurable safeguards for GenAI applications:
    • Content filters (hate, insults, sexual, violence, misconduct)
    • Denied topics (topic avoidance policies)
    • Word/phrase filters
    • Sensitive information filters (PII redaction)
    • Contextual grounding checks (hallucination detection)
    • Automated Reasoning Checks (logical verification)
  • SageMaker Clarify — detects bias in data and models, provides feature attributions for explainability (note: moving to maintenance July 2026 for new customers).
  • Model Cards — documentation that describes a model’s intended use, performance metrics, limitations, and ethical considerations. Supported in SageMaker Model Registry.
  • Human-in-the-Loop (HITL) — keeping humans involved in AI decision-making for high-stakes scenarios. AWS A2I (Augmented AI) provided review workflows (note: moving to maintenance July 2026 for new customers).
  • Amazon Bedrock Model Evaluation — automatic evaluation (accuracy, robustness, toxicity), human evaluation, and LLM-as-a-judge for quality assessment.

Bias Mitigation Strategies

  • Pre-processing: Balance training data, remove sensitive attributes, data augmentation.
  • In-processing: Regularization techniques, adversarial debiasing, fairness constraints during training.
  • Post-processing: Calibrate outputs, threshold adjustment, reject option classification.
  • Monitoring: Continuously track model performance across demographic groups in production.

Agentic AI

What are AI Agents?

  • AI systems that can autonomously plan, reason, and execute multi-step tasks to achieve goals.
  • Go beyond simple prompt-response by taking actions, using tools, and adapting based on outcomes.
  • Can operate for extended periods, making decisions and course-correcting without human intervention.

Key Concepts

  • Tool Use (Function Calling) — agents invoke external tools (APIs, databases, code execution) to gather information or perform actions.
  • Multi-step Reasoning — breaking complex problems into steps, executing sequentially with intermediate evaluations.
  • Orchestration — coordinating multiple agents or components to complete complex workflows. Patterns: sequential, parallel, routing, supervisor.
  • Memory — maintaining context across interactions:
    • Short-term memory (conversation context within a session)
    • Long-term memory (persistent knowledge across sessions)
    • Episodic memory (past experiences and outcomes)
  • Planning — decomposing goals into actionable sub-tasks, determining execution order, handling dependencies.
  • Reflection — agents evaluate their own outputs and self-correct errors before responding.
  • Model Context Protocol (MCP) — open standard for connecting AI agents with external tools and data sources.
  • Agent2Agent (A2A) — protocol for inter-agent communication and collaboration.

AWS Agentic AI Services

  • Amazon Bedrock Agents — create agents that can break down tasks, call APIs, and access knowledge bases (transitioning to Bedrock Agents Classic, July 2026).
  • Amazon Bedrock AgentCore (GA 2025/2026) — enterprise-grade infrastructure for deploying and operating AI agents at scale:
    • AgentCore Runtime — serverless, scalable environment to host agents
    • AgentCore Gateway — MCP-compatible tool connectivity
    • AgentCore Identity — per-agent identity and least-privilege access
    • AgentCore Observability — monitoring, tracing, and debugging
    • AgentCore Code Interpreter — secure sandboxed code execution
    • AgentCore Optimization — continuous quality evaluation and improvement
  • Amazon Nova Act — browser automation agent for web-based tasks.
  • AWS Step Functions — orchestrate multi-step agent workflows with state management.

AWS AI Service Stack

AWS organizes AI/ML services into three layers:

Layer 1: AI Infrastructure (Compute & Silicon)

Service/Chip Purpose Key Details
AWS Trainium Custom chip for ML training Trainium2 (4x perf vs gen1), Trainium3 (3nm, 4.4x vs Trn2, GA Dec 2025)
AWS Inferentia Custom chip for ML inference Inferentia2 (4x throughput, 10x lower latency vs gen1), Inf2 instances
EC2 UltraServers Multi-instance AI clusters Trn2 UltraServers (64 Trainium2 chips, NeuronLink interconnect), Trn3 UltraServers
AWS AI Factories On-premises AI infrastructure Deploy AI training/inference infrastructure in customer data centers
AWS Neuron SDK Software for Trainium/Inferentia Integrates with PyTorch, JAX, TensorFlow. Compiler, runtime, profiler
EC2 P5/P5e/P5en GPU instances (NVIDIA) H100/H200 GPUs for training and inference
EC2 G6/G6e GPU instances (NVIDIA) L4/L40S GPUs for inference and graphics
AWS Graviton Arm-based general compute Best price-performance for inference serving and general ML workloads
Amazon EFA Elastic Fabric Adapter Low-latency networking for distributed training across instances

Layer 2: ML Platform (SageMaker AI)

  • Amazon SageMaker AI (rebranded from SageMaker, late 2024) — end-to-end ML platform for building, training, and deploying models.
Component Purpose
SageMaker Unified Studio Single IDE for data, analytics, and ML/AI development (integrates Bedrock)
SageMaker Canvas No-code ML for business analysts — point-and-click model building
SageMaker HyperPod Managed clusters for large-scale distributed training with auto-recovery
SageMaker Pipelines CI/CD for ML — define, automate, and manage ML workflows
SageMaker Feature Store Centralized repository for ML features (online + offline store)
SageMaker MLflow Managed MLflow for experiment tracking, model versioning, deployment
SageMaker Model Registry Central catalog to version, manage, and deploy models with approval workflows
SageMaker JumpStart Model hub with 400+ pre-trained models, one-click deploy, fine-tuning
SageMaker Endpoints Real-time inference hosting (single model or multi-model endpoints)
SageMaker Training Managed training with built-in algorithms, distributed training, spot instances
SageMaker Processing Run data processing and evaluation jobs at scale
SageMaker Lakehouse Unified access to data lakes and warehouses for ML

Layer 3: AI Applications & Services

Amazon Bedrock (Generative AI Platform)

  • Amazon Bedrock — fully managed service for building GenAI applications with foundation models.
  • Model Providers: Amazon (Nova), Anthropic (Claude), Meta (Llama), Mistral, Cohere, AI21 Labs, OpenAI, Stability AI.
  • Key Capabilities:
    • Model Inference — Converse API, InvokeModel, streaming, batch inference, cross-region inference
    • Knowledge Bases — managed RAG with vector stores (OpenSearch, Aurora, Pinecone, etc.)
    • Managed Knowledge Base (2026) — fully managed RAG primitive (storage + retrieval + embeddings + re-ranking)
    • Agents — multi-step task execution with tool use (transitioning to AgentCore)
    • Guardrails — content filtering, topic avoidance, PII protection, grounding checks
    • Model Customization — fine-tuning, continued pre-training, distillation
    • Model Evaluation — automatic metrics, human evaluation, LLM-as-judge
    • Flows — visual workflow builder for chaining prompts, agents, and knowledge bases

Amazon Nova Models

  • Nova Micro — text-only, fastest, lowest cost (128K context). Ideal for classification, summarization.
  • Nova Lite — multimodal (text + image + video input), cost-effective (300K context).
  • Nova Pro — balanced multimodal, strong accuracy/speed/cost trade-off (300K context).
  • Nova Premier — most capable, complex reasoning, agentic workflows, teacher model (1M context).
  • Nova Canvas — image generation with editing controls and watermarking.
  • Nova Reel — video generation (1280×720, 24fps, up to 6 seconds).
  • Nova Sonic — speech-to-speech for real-time conversational AI.
  • Nova 2 (Dec 2025) — next generation with extended thinking (adjustable levels), 1M token context, built-in tools:
    • Nova 2 Lite — fast, cost-effective reasoning model
    • Nova 2 Pro — most intelligent, complex agentic tasks
    • Nova 2 Sonic — next-gen speech with async tool calling
    • Nova 2 Omni — unified multimodal I/O (text + image generation)
  • Nova Act — browser automation agent for web tasks.
  • Nova Forge — custom model building program (open training).

Amazon Q Developer & Q Business

  • Amazon Q Developer — AI-powered coding assistant (evolved from CodeWhisperer):
    • Code generation, completion, and inline suggestions (15+ languages)
    • Agentic coding (autonomous multi-step development)
    • Security vulnerability scanning
    • Code transformation and modernization (Java, .NET upgrades)
    • CLI integration (natural language → commands)
    • Debugging and troubleshooting with CloudWatch integration
  • Amazon Q Business — AI assistant for enterprise knowledge (connects to 40+ data sources):
    • Natural language answers from company data
    • Document summarization and content creation
    • Task automation with plugins
    • Access control respecting existing permissions (ACL-aware)
  • Amazon Q in Console — chat assistant in AWS Management Console for troubleshooting and guidance.
⚠️ Note (July 2026): Amazon Q Developer IDE plugins reaching end-of-support April 2027. Successor is Kiro — AWS’s agentic development environment. Amazon Q Business and Amazon Kendra entering maintenance mode for new customers July 30, 2026.

AWS AI/ML Application Services

Service Category Purpose
Amazon Comprehend NLP Sentiment analysis, entity recognition, key phrase extraction, language detection, topic modeling
Amazon Rekognition Computer Vision Object/face detection, content moderation, celebrity recognition, text in images, custom labels
Amazon Polly Speech Text-to-speech with neural voices (60+ voices, 30+ languages), SSML support
Amazon Transcribe Speech Speech-to-text (ASR), real-time and batch, custom vocabularies, speaker identification
Amazon Translate Language Neural machine translation (75+ languages), real-time and batch, custom terminology
Amazon Textract Document AI OCR + intelligent document processing, extracts text, tables, forms, and queries from documents
Amazon Lex Conversational AI Build chatbots and voice bots with automatic speech recognition and NLU
Amazon Kendra Search Enterprise search with NLP, semantic understanding, GenAI index for RAG ⚠️ Maintenance mode July 2026
Amazon Personalize Recommendations Real-time personalization and recommendations (same tech as Amazon.com)
Amazon Forecast Time Series Time series forecasting using ML (closed to new customers since 2024)
Amazon HealthScribe Healthcare Generate clinical documentation from patient-clinician conversations
Amazon Bedrock AgentCore Agentic AI Deploy, manage, and optimize AI agents at scale (GA 2025/2026)

Decision Matrix: Use Case → Recommended Service

Use Case Recommended Service Why
Build GenAI apps with FMs (no ML expertise) Amazon Bedrock Serverless, multi-model, fully managed
Custom model training from scratch SageMaker AI + Trainium Full control over training, data, and infrastructure
Enterprise Q&A over company documents Amazon Q Business / Bedrock Knowledge Bases Connects to 40+ data sources, ACL-aware
AI coding assistant Amazon Q Developer / Kiro Inline completions, security scanning, agentic coding
Build and deploy AI agents Bedrock AgentCore Serverless runtime, MCP tools, identity, observability
Chatbot / virtual assistant Amazon Lex + Bedrock Lex for structure, Bedrock for natural responses
Document processing (forms, invoices) Amazon Textract Extracts structured data from documents at scale
Content moderation (images/video) Amazon Rekognition Pre-built moderation labels, custom labels for specifics
Sentiment analysis on customer feedback Amazon Comprehend Pre-built NLP models, no training needed
Real-time product recommendations Amazon Personalize Same ML tech as Amazon.com, real-time updates
Transcribe meetings/calls Amazon Transcribe Real-time ASR, speaker diarization, custom vocab
Generate speech from text Amazon Polly Neural TTS, SSML support, multiple voices
Translate content at scale Amazon Translate 75+ languages, real-time, custom terminology
No-code ML for business users SageMaker Canvas Point-and-click, AutoML, visual interface
Fine-tune FMs on proprietary data Bedrock Custom Models / SageMaker JumpStart Bedrock for serverless; SageMaker for full control
Prevent harmful GenAI outputs Amazon Bedrock Guardrails Content filters, PII, grounding checks, topic avoidance
Cost-effective GenAI inference at scale Bedrock + Nova models (or Inferentia2/Trainium) Nova = lowest cost in class; custom silicon for self-hosted
Clinical documentation from conversations Amazon HealthScribe Purpose-built for healthcare, HIPAA eligible

Quick Reference: All AWS AI/ML Services

Service One-Liner
Amazon Bedrock Fully managed GenAI platform with multi-provider foundation models
Amazon Bedrock AgentCore Enterprise infrastructure for deploying and operating AI agents at scale
Amazon Nova Amazon’s family of foundation models (text, multimodal, speech, image, video)
Amazon Q Developer AI coding assistant with code generation, security scanning, and transformation
Amazon Q Business Enterprise AI assistant for Q&A and task automation over company data
Amazon SageMaker AI End-to-end ML platform for building, training, and deploying custom models
Amazon Comprehend NLP service for sentiment, entities, key phrases, language detection
Amazon Rekognition Computer vision for object/face detection, moderation, and custom labels
Amazon Polly Text-to-speech with neural and standard voices
Amazon Transcribe Automatic speech recognition (speech-to-text)
Amazon Translate Neural machine translation for 75+ languages
Amazon Textract Extract text, tables, and forms from documents (OCR+)
Amazon Lex Build conversational chatbots and voice bots
Amazon Kendra Intelligent enterprise search with NLP and GenAI index
Amazon Personalize Real-time ML-powered personalization and recommendations
Amazon HealthScribe Generate clinical notes from patient-clinician conversations
AWS Trainium Custom AI chip optimized for training (Trn2, Trn3 instances)
AWS Inferentia Custom AI chip optimized for inference (Inf2 instances)
AWS Neuron SDK SDK for running ML workloads on Trainium and Inferentia chips
Amazon SageMaker Canvas No-code ML model building for business analysts
Amazon SageMaker HyperPod Managed clusters for distributed training with auto fault recovery

Exam Tips

AIF-C01 — AWS Certified AI Practitioner

  • Format: 65 questions, 90 minutes, 700/1000 passing score.
  • Domains:
    • Domain 1: Fundamentals of AI and ML (20%)
    • Domain 2: Fundamentals of Generative AI (24%)
    • Domain 3: Applications of Foundation Models (28%) — largest domain, most technical
    • Domain 4: Guidelines for Responsible AI (14%)
    • Domain 5: Security, Compliance, and Governance for AI Solutions (14%)
  • Key Focus Areas:
    • Domains 2+3 = 52% of exam — master Bedrock, RAG, prompt engineering, fine-tuning
    • Know the difference between AI vs ML vs DL vs GenAI
    • Understand when to use Bedrock vs SageMaker
    • RAG architecture and when to use it vs fine-tuning
    • Responsible AI principles and Bedrock Guardrails
    • Temperature, top-p effects on output
    • Know all AWS AI services at a high level (what each does)

AIP-C01 — AWS Certified Generative AI Developer – Professional

  • Format: 85 questions, 180 minutes, 750/1000 passing score.
  • Domains:
    • Domain 1: FM Selection and Integration (26%)
    • Domain 2: Data Management and Optimization (22%)
    • Domain 3: Model Performance and Compliance (31%) — largest domain
    • Domain 4: Security and Governance (21%)
  • Key Focus Areas:
    • Deep hands-on knowledge of Bedrock APIs, agents, knowledge bases, guardrails
    • RAG implementation details (chunking strategies, embedding models, vector stores)
    • Model customization (when fine-tuning vs RAG vs prompt engineering)
    • Agentic AI patterns (tool use, multi-step, AgentCore)
    • SageMaker for custom training and deployment
    • Model evaluation and monitoring in production
    • Security: data encryption, VPC endpoints, IAM for Bedrock, prompt injection mitigation
    • Cost optimization (model selection, batch inference, provisioned throughput)

Common Exam Scenarios

  • “Least operational overhead” → Bedrock (serverless) over SageMaker (managed infrastructure)
  • “Custom model with proprietary data” → Fine-tuning on Bedrock or SageMaker depending on control needed
  • “Reduce hallucinations” → RAG with Knowledge Bases + Guardrails grounding checks
  • “Enterprise search over internal docs” → Amazon Q Business or Bedrock Knowledge Bases
  • “Control AI outputs for safety” → Bedrock Guardrails
  • “Lowest cost inference” → Nova Micro (text) or Nova Lite (multimodal) on Bedrock
  • “Deploy agents in production” → Bedrock AgentCore (serverless, scalable, observable)
  • “Train trillion-parameter model” → Trainium3 UltraServers + SageMaker HyperPod

Practice Questions

Question 1 (AIF-C01)

A company wants to reduce hallucinations in their generative AI application that answers customer questions about company policies. The application uses Amazon Bedrock. What is the MOST effective approach?

  1. Increase the model temperature to generate more diverse responses
  2. Implement Retrieval Augmented Generation (RAG) with Amazon Bedrock Knowledge Bases
  3. Fine-tune the foundation model on company documents
  4. Switch to a larger foundation model
Show Answer

Answer: B – RAG grounds responses in actual company documents, directly reducing hallucinations. Fine-tuning (C) teaches style/format but doesn’t guarantee factual accuracy for specific documents. Higher temperature (A) increases randomness. Larger models (D) don’t inherently reduce hallucinations.

Question 2 (AIF-C01)

Which combination of techniques helps ensure responsible AI in a generative AI application? (Select TWO)

  1. Increase the context window size
  2. Configure Amazon Bedrock Guardrails with content filters and denied topics
  3. Use the lowest-cost foundation model available
  4. Implement human review workflows for high-stakes decisions
  5. Maximize the temperature parameter for creative outputs
Show Answer

Answer: B, D – Bedrock Guardrails (B) provides configurable safety controls to filter harmful content. Human-in-the-loop (D) ensures human oversight for critical decisions. Context window size (A), model cost (C), and temperature (E) are not responsible AI techniques.

Question 3 (AIP-C01)

A developer is building an AI agent that needs to autonomously execute multi-step workflows, call external APIs, and maintain state across interactions. The solution must be production-grade with monitoring and minimal infrastructure management. Which AWS service should they use?

  1. Amazon Lex with Lambda fulfillment functions
  2. Amazon Bedrock AgentCore with AgentCore Runtime and Observability
  3. AWS Step Functions with SageMaker endpoints
  4. Amazon Q Business with custom plugins
Show Answer

Answer: B – Bedrock AgentCore provides serverless runtime for agents, MCP-compatible tool connectivity (Gateway), built-in observability, and identity management — purpose-built for production AI agents. Lex (A) is for chatbots, not autonomous agents. Step Functions (C) requires more infrastructure management. Q Business (D) is for enterprise knowledge, not custom agent workflows.

Question 4 (AIP-C01)

A team needs to fine-tune a foundation model on their proprietary dataset with minimal compute cost. The dataset contains 10,000 instruction-response pairs. Which approach provides the BEST balance of performance improvement and cost?

  1. Full fine-tuning of the entire model on Amazon SageMaker with P5 GPU instances
  2. Continued pre-training on Amazon Bedrock with the full dataset
  3. Parameter-efficient fine-tuning (LoRA) through Amazon Bedrock custom models
  4. Distilling the model into a smaller variant using Nova Premier as teacher
Show Answer

Answer: C – LoRA fine-tuning on Bedrock trains only small adapter layers (reduces compute by 90%+) while the base model stays frozen. It’s ideal for instruction-tuning with limited data. Full fine-tuning (A) is expensive. Continued pre-training (B) is for teaching new knowledge, not task alignment. Distillation (D) creates a smaller model but doesn’t directly fine-tune on task data.

Question 5 (AIF-C01 / AIP-C01)

A company wants to deploy a generative AI solution with the following requirements: lowest possible latency for text summarization, minimal cost, and no infrastructure management. Which combination should they choose?

  1. Amazon Nova Premier on Bedrock with provisioned throughput
  2. Amazon Nova Micro on Bedrock with on-demand pricing
  3. Claude 3 Opus on Bedrock with batch inference
  4. Self-hosted Llama model on SageMaker with Inferentia2 instances
Show Answer

Answer: B – Nova Micro is the fastest text-only model (200+ tokens/sec), lowest cost, and Bedrock provides serverless (no infrastructure). Premier (A) is more capable but slower and costlier. Batch (C) has high latency. Self-hosted (D) requires infrastructure management.

Frequently Asked Questions

What is the difference between AI, ML, and Generative AI?

AI is the broadest category — machines performing tasks that typically require human intelligence. ML is a subset that learns from data without explicit programming. Generative AI is a subset of ML that creates new content (text, images, code) using foundation models trained on vast datasets.

What is the difference between Amazon Bedrock and SageMaker?

Bedrock provides access to pre-built foundation models for generative AI applications without ML expertise. SageMaker is a full ML platform for building, training, and deploying custom models from scratch. Use Bedrock for gen AI apps; SageMaker when you need complete control over model training.

What AWS certifications cover AI and Generative AI?

AWS offers two AI-focused certifications: AIF-C01 (AI Practitioner) for foundational knowledge of AI/ML/Gen AI concepts and AWS services, and AIP-C01 (AI Professional) for practitioners building and deploying Gen AI solutions. Both require knowledge of Bedrock, SageMaker, and responsible AI.

Detailed Guides

Exam Prep: AWS AI Professional (AIP-C01) Exam Learning Path

References