Table of Contents hide

AWS AI & Generative AI Services – Cheat Sheet

Responsible AI

Agentic AI

AWS AI Service Stack

Decision Matrix: Use Case → Recommended Service

Quick Reference: All AWS AI/ML Services

Exam Tips

Practice Questions

Frequently Asked Questions

Detailed Guides

AWS AI & Generative AI Services – Cheat Sheet

This is the definitive cheat sheet covering AI, Machine Learning, and Generative AI services on AWS — designed as the anchor page for both the AWS Certified AI Practitioner (AIF-C01) and AWS Certified Generative AI Developer – Professional (AIP-C01) exams.

Related Posts:

AI/ML/Generative AI Fundamentals

AI vs ML vs Deep Learning vs Generative AI

Concept	Definition	Examples
Artificial Intelligence (AI)	Broad field of computer science focused on creating systems that can perform tasks requiring human intelligence	Rule-based systems, expert systems, robotics
Machine Learning (ML)	Subset of AI where systems learn from data without being explicitly programmed	Fraud detection, recommendations, forecasting
Deep Learning (DL)	Subset of ML using neural networks with multiple layers (deep neural networks) to learn complex patterns	Image recognition, NLP, speech recognition
Generative AI (GenAI)	Subset of DL that creates new content (text, images, code, video, audio) by learning patterns from training data	ChatGPT, DALL-E, Amazon Nova, Claude

Learning Paradigms

Supervised Learning — model learns from labeled data (input-output pairs). Used for classification (spam/not spam) and regression (price prediction).
Unsupervised Learning — model finds patterns in unlabeled data. Used for clustering (customer segmentation), anomaly detection, and dimensionality reduction.

Semi-supervised Learning — combines small amount of labeled data with large amounts of unlabeled data.
Reinforcement Learning (RL) — agent learns by interacting with an environment, receiving rewards/penalties. Used for game playing, robotics, and RLHF in LLMs.
Self-supervised Learning — model generates its own labels from input data (e.g., predicting masked tokens). Used for pre-training foundation models.

Neural Networks Basics

Neurons/Nodes — basic computation units that receive inputs, apply weights, add bias, and pass through an activation function.
Layers — Input layer (receives data), Hidden layers (process data), Output layer (produces result).
Weights & Biases — parameters learned during training that determine the model’s behavior.

Activation Functions — introduce non-linearity (ReLU, Sigmoid, Softmax, Tanh).
Backpropagation — algorithm to compute gradients and update weights by propagating errors backward.
Loss Function — measures how far the model’s predictions are from actual values.
Transformer Architecture — foundation of modern LLMs; uses self-attention mechanism to process entire sequences in parallel (introduced in “Attention is All You Need” paper, 2017).

CNNs (Convolutional Neural Networks) — specialized for image/spatial data.
RNNs/LSTMs — sequential data processing (largely superseded by Transformers for NLP).
GANs (Generative Adversarial Networks) — generator + discriminator for image generation.
Diffusion Models — generate images/video by learning to denoise (e.g., Stable Diffusion, Nova Canvas).

📖 Deep Dive Guides: Bedrock vs SageMaker | RAG Architecture | Prompt Engineering | Responsible AI | AI Services Decision Guide

Foundation Model Concepts

Pre-training

Training a model on massive datasets (trillions of tokens) to learn general language/world knowledge.
Extremely expensive and resource-intensive (millions of GPU hours).
Results in a base model with broad capabilities but no specific task alignment.

Common objectives: next-token prediction (GPT-style), masked language modeling (BERT-style).

Fine-tuning Techniques

Instruction Tuning — fine-tuning on instruction-response pairs to make the model follow instructions better.
RLHF (Reinforcement Learning from Human Feedback) — trains a reward model from human preferences, then uses RL (PPO) to optimize the language model against that reward. Used to align models with human values.
DPO (Direct Preference Optimization) — simpler alternative to RLHF that directly optimizes on preference pairs without a separate reward model. More stable training.

LoRA / QLoRA — Parameter-Efficient Fine-Tuning (PEFT) that freezes base model and trains small adapter layers. Reduces compute by 90%+.
Continued Pre-training — further pre-training on domain-specific data to teach the model new knowledge (e.g., medical, legal, financial).
Distillation — training a smaller “student” model to mimic a larger “teacher” model’s outputs. Reduces inference cost while retaining most capability.

RAG (Retrieval Augmented Generation)

Combines information retrieval with text generation to ground LLM responses in external knowledge.
How it works: Query → Retrieve relevant documents from knowledge base → Augment prompt with retrieved context → Generate response.
Benefits: Reduces hallucinations, enables up-to-date responses, no model retraining needed, source attribution.

Components: Document ingestion, chunking strategy, embedding model, vector database, retrieval algorithm, re-ranking.
AWS Implementation: Amazon Bedrock Knowledge Bases, Amazon Kendra (GenAI Index), OpenSearch vector search.

Prompt Engineering

Zero-shot — asking the model to perform a task without any examples. Relies on pre-trained knowledge.

Few-shot (In-Context Learning) — providing a few examples in the prompt to guide the model’s output format and behavior.
Chain-of-Thought (CoT) — asking the model to “think step by step” to improve reasoning on complex tasks.
System Prompts — instructions that define the model’s role, behavior, and constraints.

Prompt Templates — reusable prompt structures with placeholders for dynamic content.
Prompt Chaining — breaking complex tasks into sequential prompts where output of one feeds input of next.

Key Parameters & Concepts

Tokenization — splitting text into tokens (subwords/words). Models process tokens, not characters. Affects context limits and pricing.

Embeddings — dense vector representations of text/images in high-dimensional space. Semantically similar items have similar embeddings. Used for search, RAG, and clustering.
Temperature — controls randomness of output. Low (0-0.3) = deterministic/focused, High (0.7-1.0) = creative/diverse. 0 = greedy decoding.
Top-p (Nucleus Sampling) — considers only tokens whose cumulative probability exceeds p. Top-p 0.9 = considers top 90% probability mass.

Top-k — limits token selection to the k most likely next tokens.
Context Window — maximum number of tokens (input + output) the model can process at once. Ranges from 4K to 1M+ tokens in modern models.
Max Tokens — limits the length of generated output.
Stop Sequences — tokens that signal the model to stop generating.

Hallucination — when a model generates plausible-sounding but factually incorrect information.
Grounding — techniques to anchor model responses in factual data (RAG, tool use, citations).

Responsible AI

Core Principles

Fairness & Bias — ensuring models don’t discriminate based on protected attributes (race, gender, age). Types: selection bias, measurement bias, representation bias, confirmation bias.

Explainability — ability to understand and explain how/why a model made a specific prediction. Techniques: SHAP, LIME, attention visualization, feature importance.
Transparency — openly communicating model capabilities, limitations, and intended use cases to users.
Robustness — model performs reliably across different inputs, including adversarial examples and edge cases.

Privacy & Security — protecting training data, user inputs, and model outputs. Preventing data leakage and prompt injection.
Governance — organizational policies, processes, and controls for responsible AI development and deployment.
Safety — preventing harmful outputs including toxic content, misinformation, and dangerous instructions.

AWS Responsible AI Tools

AWS AI Service Cards — transparency documentation for AWS AI services covering intended use cases, limitations, responsible AI design choices, and deployment best practices.
Amazon Bedrock Guardrails — configurable safeguards for GenAI applications:
- Content filters (hate, insults, sexual, violence, misconduct)
- Denied topics (topic avoidance policies)
- Word/phrase filters
- Sensitive information filters (PII redaction)
- Contextual grounding checks (hallucination detection)
- Automated Reasoning Checks (logical verification)
SageMaker Clarify — detects bias in data and models, provides feature attributions for explainability (note: moving to maintenance July 2026 for new customers).

Model Cards — documentation that describes a model’s intended use, performance metrics, limitations, and ethical considerations. Supported in SageMaker Model Registry.
Human-in-the-Loop (HITL) — keeping humans involved in AI decision-making for high-stakes scenarios. AWS A2I (Augmented AI) provided review workflows (note: moving to maintenance July 2026 for new customers).
Amazon Bedrock Model Evaluation — automatic evaluation (accuracy, robustness, toxicity), human evaluation, and LLM-as-a-judge for quality assessment.

Bias Mitigation Strategies

Pre-processing: Balance training data, remove sensitive attributes, data augmentation.
In-processing: Regularization techniques, adversarial debiasing, fairness constraints during training.
Post-processing: Calibrate outputs, threshold adjustment, reject option classification.
Monitoring: Continuously track model performance across demographic groups in production.

Agentic AI

What are AI Agents?

AI systems that can autonomously plan, reason, and execute multi-step tasks to achieve goals.
Go beyond simple prompt-response by taking actions, using tools, and adapting based on outcomes.

Can operate for extended periods, making decisions and course-correcting without human intervention.

Key Concepts

Tool Use (Function Calling) — agents invoke external tools (APIs, databases, code execution) to gather information or perform actions.
Multi-step Reasoning — breaking complex problems into steps, executing sequentially with intermediate evaluations.

Orchestration — coordinating multiple agents or components to complete complex workflows. Patterns: sequential, parallel, routing, supervisor.
Memory — maintaining context across interactions:
- Short-term memory (conversation context within a session)
- Long-term memory (persistent knowledge across sessions)
- Episodic memory (past experiences and outcomes)
Planning — decomposing goals into actionable sub-tasks, determining execution order, handling dependencies.

Reflection — agents evaluate their own outputs and self-correct errors before responding.
Model Context Protocol (MCP) — open standard for connecting AI agents with external tools and data sources.
Agent2Agent (A2A) — protocol for inter-agent communication and collaboration.

AWS Agentic AI Services

Amazon Bedrock Agents — create agents that can break down tasks, call APIs, and access knowledge bases (transitioning to Bedrock Agents Classic, July 2026).

Amazon Bedrock AgentCore (GA 2025/2026) — enterprise-grade infrastructure for deploying and operating AI agents at scale:
- AgentCore Runtime — serverless, scalable environment to host agents
- AgentCore Gateway — MCP-compatible tool connectivity
- AgentCore Identity — per-agent identity and least-privilege access
- AgentCore Observability — monitoring, tracing, and debugging
- AgentCore Code Interpreter — secure sandboxed code execution
- AgentCore Optimization — continuous quality evaluation and improvement
Amazon Nova Act — browser automation agent for web-based tasks.
AWS Step Functions — orchestrate multi-step agent workflows with state management.

AWS AI Service Stack

AWS organizes AI/ML services into three layers:

Layer 1: AI Infrastructure (Compute & Silicon)

Service/Chip	Purpose	Key Details
AWS Trainium	Custom chip for ML training	Trainium2 (4x perf vs gen1), Trainium3 (3nm, 4.4x vs Trn2, GA Dec 2025)
AWS Inferentia	Custom chip for ML inference	Inferentia2 (4x throughput, 10x lower latency vs gen1), Inf2 instances
EC2 UltraServers	Multi-instance AI clusters	Trn2 UltraServers (64 Trainium2 chips, NeuronLink interconnect), Trn3 UltraServers
AWS AI Factories	On-premises AI infrastructure	Deploy AI training/inference infrastructure in customer data centers
AWS Neuron SDK	Software for Trainium/Inferentia	Integrates with PyTorch, JAX, TensorFlow. Compiler, runtime, profiler
EC2 P5/P5e/P5en	GPU instances (NVIDIA)	H100/H200 GPUs for training and inference
EC2 G6/G6e	GPU instances (NVIDIA)	L4/L40S GPUs for inference and graphics
AWS Graviton	Arm-based general compute	Best price-performance for inference serving and general ML workloads
Amazon EFA	Elastic Fabric Adapter	Low-latency networking for distributed training across instances

Layer 2: ML Platform (SageMaker AI)

Amazon SageMaker AI (rebranded from SageMaker, late 2024) — end-to-end ML platform for building, training, and deploying models.

Component	Purpose
SageMaker Unified Studio	Single IDE for data, analytics, and ML/AI development (integrates Bedrock)
SageMaker Canvas	No-code ML for business analysts — point-and-click model building
SageMaker HyperPod	Managed clusters for large-scale distributed training with auto-recovery
SageMaker Pipelines	CI/CD for ML — define, automate, and manage ML workflows
SageMaker Feature Store	Centralized repository for ML features (online + offline store)
SageMaker MLflow	Managed MLflow for experiment tracking, model versioning, deployment
SageMaker Model Registry	Central catalog to version, manage, and deploy models with approval workflows
SageMaker JumpStart	Model hub with 400+ pre-trained models, one-click deploy, fine-tuning
SageMaker Endpoints	Real-time inference hosting (single model or multi-model endpoints)
SageMaker Training	Managed training with built-in algorithms, distributed training, spot instances
SageMaker Processing	Run data processing and evaluation jobs at scale
SageMaker Lakehouse	Unified access to data lakes and warehouses for ML

Layer 3: AI Applications & Services

Amazon Bedrock (Generative AI Platform)

Amazon Bedrock — fully managed service for building GenAI applications with foundation models.
Model Providers: Amazon (Nova), Anthropic (Claude), Meta (Llama), Mistral, Cohere, AI21 Labs, OpenAI, Stability AI.

Key Capabilities:
- Model Inference — Converse API, InvokeModel, streaming, batch inference, cross-region inference
- Knowledge Bases — managed RAG with vector stores (OpenSearch, Aurora, Pinecone, etc.)
- Managed Knowledge Base (2026) — fully managed RAG primitive (storage + retrieval + embeddings + re-ranking)
- Agents — multi-step task execution with tool use (transitioning to AgentCore)
- Guardrails — content filtering, topic avoidance, PII protection, grounding checks
- Model Customization — fine-tuning, continued pre-training, distillation
- Model Evaluation — automatic metrics, human evaluation, LLM-as-judge
- Flows — visual workflow builder for chaining prompts, agents, and knowledge bases

Amazon Nova Models

Nova Micro — text-only, fastest, lowest cost (128K context). Ideal for classification, summarization.
Nova Lite — multimodal (text + image + video input), cost-effective (300K context).

Nova Pro — balanced multimodal, strong accuracy/speed/cost trade-off (300K context).
Nova Premier — most capable, complex reasoning, agentic workflows, teacher model (1M context).
Nova Canvas — image generation with editing controls and watermarking.

Nova Reel — video generation (1280×720, 24fps, up to 6 seconds).
Nova Sonic — speech-to-speech for real-time conversational AI.
Nova 2 (Dec 2025) — next generation with extended thinking (adjustable levels), 1M token context, built-in tools:
- Nova 2 Lite — fast, cost-effective reasoning model
- Nova 2 Pro — most intelligent, complex agentic tasks
- Nova 2 Sonic — next-gen speech with async tool calling
- Nova 2 Omni — unified multimodal I/O (text + image generation)
Nova Act — browser automation agent for web tasks.
Nova Forge — custom model building program (open training).

Amazon Q Developer & Q Business

Amazon Q Developer — AI-powered coding assistant (evolved from CodeWhisperer):
- Code generation, completion, and inline suggestions (15+ languages)
- Agentic coding (autonomous multi-step development)
- Security vulnerability scanning
- Code transformation and modernization (Java, .NET upgrades)
- CLI integration (natural language → commands)
- Debugging and troubleshooting with CloudWatch integration
Amazon Q Business — AI assistant for enterprise knowledge (connects to 40+ data sources):
- Natural language answers from company data
- Document summarization and content creation
- Task automation with plugins
- Access control respecting existing permissions (ACL-aware)

Amazon Q in Console — chat assistant in AWS Management Console for troubleshooting and guidance.

⚠️ Note (July 2026): Amazon Q Developer IDE plugins reaching end-of-support April 2027. Successor is Kiro — AWS’s agentic development environment. Amazon Q Business and Amazon Kendra entering maintenance mode for new customers July 30, 2026.

AWS AI/ML Application Services

Service	Category	Purpose
Amazon Comprehend	NLP	Sentiment analysis, entity recognition, key phrase extraction, language detection, topic modeling
Amazon Rekognition	Computer Vision	Object/face detection, content moderation, celebrity recognition, text in images, custom labels
Amazon Polly	Speech	Text-to-speech with neural voices (60+ voices, 30+ languages), SSML support
Amazon Transcribe	Speech	Speech-to-text (ASR), real-time and batch, custom vocabularies, speaker identification
Amazon Translate	Language	Neural machine translation (75+ languages), real-time and batch, custom terminology
Amazon Textract	Document AI	OCR + intelligent document processing, extracts text, tables, forms, and queries from documents
Amazon Lex	Conversational AI	Build chatbots and voice bots with automatic speech recognition and NLU
Amazon Kendra	Search	Enterprise search with NLP, semantic understanding, GenAI index for RAG ⚠️ Maintenance mode July 2026
Amazon Personalize	Recommendations	Real-time personalization and recommendations (same tech as Amazon.com)
Amazon Forecast	Time Series	Time series forecasting using ML (closed to new customers since 2024)
Amazon HealthScribe	Healthcare	Generate clinical documentation from patient-clinician conversations
Amazon Bedrock AgentCore	Agentic AI	Deploy, manage, and optimize AI agents at scale (GA 2025/2026)

Decision Matrix: Use Case → Recommended Service

Use Case	Recommended Service	Why
Build GenAI apps with FMs (no ML expertise)	Amazon Bedrock	Serverless, multi-model, fully managed
Custom model training from scratch	SageMaker AI + Trainium	Full control over training, data, and infrastructure
Enterprise Q&A over company documents	Amazon Q Business / Bedrock Knowledge Bases	Connects to 40+ data sources, ACL-aware
AI coding assistant	Amazon Q Developer / Kiro	Inline completions, security scanning, agentic coding
Build and deploy AI agents	Bedrock AgentCore	Serverless runtime, MCP tools, identity, observability
Chatbot / virtual assistant	Amazon Lex + Bedrock	Lex for structure, Bedrock for natural responses
Document processing (forms, invoices)	Amazon Textract	Extracts structured data from documents at scale
Content moderation (images/video)	Amazon Rekognition	Pre-built moderation labels, custom labels for specifics
Sentiment analysis on customer feedback	Amazon Comprehend	Pre-built NLP models, no training needed
Real-time product recommendations	Amazon Personalize	Same ML tech as Amazon.com, real-time updates
Transcribe meetings/calls	Amazon Transcribe	Real-time ASR, speaker diarization, custom vocab
Generate speech from text	Amazon Polly	Neural TTS, SSML support, multiple voices
Translate content at scale	Amazon Translate	75+ languages, real-time, custom terminology
No-code ML for business users	SageMaker Canvas	Point-and-click, AutoML, visual interface
Fine-tune FMs on proprietary data	Bedrock Custom Models / SageMaker JumpStart	Bedrock for serverless; SageMaker for full control
Prevent harmful GenAI outputs	Amazon Bedrock Guardrails	Content filters, PII, grounding checks, topic avoidance
Cost-effective GenAI inference at scale	Bedrock + Nova models (or Inferentia2/Trainium)	Nova = lowest cost in class; custom silicon for self-hosted
Clinical documentation from conversations	Amazon HealthScribe	Purpose-built for healthcare, HIPAA eligible

Quick Reference: All AWS AI/ML Services

Service	One-Liner
Amazon Bedrock	Fully managed GenAI platform with multi-provider foundation models
Amazon Bedrock AgentCore	Enterprise infrastructure for deploying and operating AI agents at scale
Amazon Nova	Amazon’s family of foundation models (text, multimodal, speech, image, video)
Amazon Q Developer	AI coding assistant with code generation, security scanning, and transformation
Amazon Q Business	Enterprise AI assistant for Q&A and task automation over company data
Amazon SageMaker AI	End-to-end ML platform for building, training, and deploying custom models
Amazon Comprehend	NLP service for sentiment, entities, key phrases, language detection
Amazon Rekognition	Computer vision for object/face detection, moderation, and custom labels
Amazon Polly	Text-to-speech with neural and standard voices
Amazon Transcribe	Automatic speech recognition (speech-to-text)
Amazon Translate	Neural machine translation for 75+ languages
Amazon Textract	Extract text, tables, and forms from documents (OCR+)
Amazon Lex	Build conversational chatbots and voice bots
Amazon Kendra	Intelligent enterprise search with NLP and GenAI index
Amazon Personalize	Real-time ML-powered personalization and recommendations
Amazon HealthScribe	Generate clinical notes from patient-clinician conversations
AWS Trainium	Custom AI chip optimized for training (Trn2, Trn3 instances)
AWS Inferentia	Custom AI chip optimized for inference (Inf2 instances)
AWS Neuron SDK	SDK for running ML workloads on Trainium and Inferentia chips
Amazon SageMaker Canvas	No-code ML model building for business analysts
Amazon SageMaker HyperPod	Managed clusters for distributed training with auto fault recovery

Exam Tips

AIF-C01 — AWS Certified AI Practitioner

Format: 65 questions, 90 minutes, 700/1000 passing score.
Domains:
- Domain 1: Fundamentals of AI and ML (20%)
- Domain 2: Fundamentals of Generative AI (24%)
- Domain 3: Applications of Foundation Models (28%) — largest domain, most technical
- Domain 4: Guidelines for Responsible AI (14%)
- Domain 5: Security, Compliance, and Governance for AI Solutions (14%)
Key Focus Areas:
- Domains 2+3 = 52% of exam — master Bedrock, RAG, prompt engineering, fine-tuning
- Know the difference between AI vs ML vs DL vs GenAI
- Understand when to use Bedrock vs SageMaker
- RAG architecture and when to use it vs fine-tuning
- Responsible AI principles and Bedrock Guardrails
- Temperature, top-p effects on output
- Know all AWS AI services at a high level (what each does)

AIP-C01 — AWS Certified Generative AI Developer – Professional

Format: 85 questions, 180 minutes, 750/1000 passing score.
Domains:
- Domain 1: FM Selection and Integration (26%)
- Domain 2: Data Management and Optimization (22%)
- Domain 3: Model Performance and Compliance (31%) — largest domain
- Domain 4: Security and Governance (21%)
Key Focus Areas:
- Deep hands-on knowledge of Bedrock APIs, agents, knowledge bases, guardrails
- RAG implementation details (chunking strategies, embedding models, vector stores)
- Model customization (when fine-tuning vs RAG vs prompt engineering)
- Agentic AI patterns (tool use, multi-step, AgentCore)
- SageMaker for custom training and deployment
- Model evaluation and monitoring in production
- Security: data encryption, VPC endpoints, IAM for Bedrock, prompt injection mitigation
- Cost optimization (model selection, batch inference, provisioned throughput)

Common Exam Scenarios

“Least operational overhead” → Bedrock (serverless) over SageMaker (managed infrastructure)
“Custom model with proprietary data” → Fine-tuning on Bedrock or SageMaker depending on control needed

“Reduce hallucinations” → RAG with Knowledge Bases + Guardrails grounding checks
“Enterprise search over internal docs” → Amazon Q Business or Bedrock Knowledge Bases
“Control AI outputs for safety” → Bedrock Guardrails
“Lowest cost inference” → Nova Micro (text) or Nova Lite (multimodal) on Bedrock

“Deploy agents in production” → Bedrock AgentCore (serverless, scalable, observable)
“Train trillion-parameter model” → Trainium3 UltraServers + SageMaker HyperPod

Practice Questions

Question 1 (AIF-C01)

A company wants to reduce hallucinations in their generative AI application that answers customer questions about company policies. The application uses Amazon Bedrock. What is the MOST effective approach?

Increase the model temperature to generate more diverse responses
Implement Retrieval Augmented Generation (RAG) with Amazon Bedrock Knowledge Bases

Fine-tune the foundation model on company documents
Switch to a larger foundation model

Show Answer

Answer: B – RAG grounds responses in actual company documents, directly reducing hallucinations. Fine-tuning (C) teaches style/format but doesn’t guarantee factual accuracy for specific documents. Higher temperature (A) increases randomness. Larger models (D) don’t inherently reduce hallucinations.

Question 2 (AIF-C01)

Which combination of techniques helps ensure responsible AI in a generative AI application? (Select TWO)

Increase the context window size
Configure Amazon Bedrock Guardrails with content filters and denied topics

Use the lowest-cost foundation model available
Implement human review workflows for high-stakes decisions
Maximize the temperature parameter for creative outputs

Show Answer

Answer: B, D – Bedrock Guardrails (B) provides configurable safety controls to filter harmful content. Human-in-the-loop (D) ensures human oversight for critical decisions. Context window size (A), model cost (C), and temperature (E) are not responsible AI techniques.

Question 3 (AIP-C01)

A developer is building an AI agent that needs to autonomously execute multi-step workflows, call external APIs, and maintain state across interactions. The solution must be production-grade with monitoring and minimal infrastructure management. Which AWS service should they use?

Amazon Lex with Lambda fulfillment functions
Amazon Bedrock AgentCore with AgentCore Runtime and Observability
AWS Step Functions with SageMaker endpoints
Amazon Q Business with custom plugins

Show Answer

Answer: B – Bedrock AgentCore provides serverless runtime for agents, MCP-compatible tool connectivity (Gateway), built-in observability, and identity management — purpose-built for production AI agents. Lex (A) is for chatbots, not autonomous agents. Step Functions (C) requires more infrastructure management. Q Business (D) is for enterprise knowledge, not custom agent workflows.

Question 4 (AIP-C01)

A team needs to fine-tune a foundation model on their proprietary dataset with minimal compute cost. The dataset contains 10,000 instruction-response pairs. Which approach provides the BEST balance of performance improvement and cost?

Full fine-tuning of the entire model on Amazon SageMaker with P5 GPU instances
Continued pre-training on Amazon Bedrock with the full dataset
Parameter-efficient fine-tuning (LoRA) through Amazon Bedrock custom models
Distilling the model into a smaller variant using Nova Premier as teacher

Show Answer

Answer: C – LoRA fine-tuning on Bedrock trains only small adapter layers (reduces compute by 90%+) while the base model stays frozen. It’s ideal for instruction-tuning with limited data. Full fine-tuning (A) is expensive. Continued pre-training (B) is for teaching new knowledge, not task alignment. Distillation (D) creates a smaller model but doesn’t directly fine-tune on task data.

Question 5 (AIF-C01 / AIP-C01)

A company wants to deploy a generative AI solution with the following requirements: lowest possible latency for text summarization, minimal cost, and no infrastructure management. Which combination should they choose?

Amazon Nova Premier on Bedrock with provisioned throughput
Amazon Nova Micro on Bedrock with on-demand pricing
Claude 3 Opus on Bedrock with batch inference
Self-hosted Llama model on SageMaker with Inferentia2 instances

Show Answer

Answer: B – Nova Micro is the fastest text-only model (200+ tokens/sec), lowest cost, and Bedrock provides serverless (no infrastructure). Premier (A) is more capable but slower and costlier. Batch (C) has high latency. Self-hosted (D) requires infrastructure management.

Frequently Asked Questions

What is the difference between AI, ML, and Generative AI?

AI is the broadest category — machines performing tasks that typically require human intelligence. ML is a subset that learns from data without explicit programming. Generative AI is a subset of ML that creates new content (text, images, code) using foundation models trained on vast datasets.

What is the difference between Amazon Bedrock and SageMaker?

Bedrock provides access to pre-built foundation models for generative AI applications without ML expertise. SageMaker is a full ML platform for building, training, and deploying custom models from scratch. Use Bedrock for gen AI apps; SageMaker when you need complete control over model training.

What AWS certifications cover AI and Generative AI?

AWS offers two AI-focused certifications: AIF-C01 (AI Practitioner) for foundational knowledge of AI/ML/Gen AI concepts and AWS services, and AIP-C01 (AI Professional) for practitioners building and deploying Gen AI solutions. Both require knowledge of Bedrock, SageMaker, and responsible AI.

Detailed Guides

Exam Prep: AWS AI Professional (AIP-C01) Exam Learning Path

AWS AI & Generative AI Services – Cheat Sheet

AI/ML/Generative AI Fundamentals

AI vs ML vs Deep Learning vs Generative AI

Learning Paradigms

Neural Networks Basics

Foundation Model Concepts

Pre-training

Fine-tuning Techniques

RAG (Retrieval Augmented Generation)

Prompt Engineering

Key Parameters & Concepts

Responsible AI

Core Principles

AWS Responsible AI Tools

Bias Mitigation Strategies

Agentic AI

What are AI Agents?

Key Concepts

AWS Agentic AI Services

AWS AI Service Stack

Layer 1: AI Infrastructure (Compute & Silicon)

Layer 2: ML Platform (SageMaker AI)

Layer 3: AI Applications & Services

Amazon Bedrock (Generative AI Platform)

Amazon Nova Models

Amazon Q Developer & Q Business

AWS AI/ML Application Services

Decision Matrix: Use Case → Recommended Service

Quick Reference: All AWS AI/ML Services

Exam Tips

AIF-C01 — AWS Certified AI Practitioner

AIP-C01 — AWS Certified Generative AI Developer – Professional

Common Exam Scenarios

Practice Questions

Question 1 (AIF-C01)

Question 2 (AIF-C01)

Question 3 (AIP-C01)

Question 4 (AIP-C01)

Question 5 (AIF-C01 / AIP-C01)

Frequently Asked Questions

What is the difference between AI, ML, and Generative AI?

What is the difference between Amazon Bedrock and SageMaker?

What AWS certifications cover AI and Generative AI?

Detailed Guides

References