AWS AI & Generative AI Services – Cheat Sheet

AWS AI & Generative AI Services – Cheat Sheet

This is the definitive cheat sheet covering AI, Machine Learning, and Generative AI services on AWS — designed as the anchor page for both the AWS Certified AI Practitioner (AIF-C01) and AWS Certified Generative AI Developer – Professional (AIP-C01) exams.

Related Posts:

AI/ML/Generative AI Fundamentals

AI vs ML vs Deep Learning vs Generative AI

Concept Definition Examples
Artificial Intelligence (AI) Broad field of computer science focused on creating systems that can perform tasks requiring human intelligence Rule-based systems, expert systems, robotics
Machine Learning (ML) Subset of AI where systems learn from data without being explicitly programmed Fraud detection, recommendations, forecasting
Deep Learning (DL) Subset of ML using neural networks with multiple layers (deep neural networks) to learn complex patterns Image recognition, NLP, speech recognition
Generative AI (GenAI) Subset of DL that creates new content (text, images, code, video, audio) by learning patterns from training data ChatGPT, DALL-E, Amazon Nova, Claude

Learning Paradigms

  • Supervised Learning — model learns from labeled data (input-output pairs). Used for classification (spam/not spam) and regression (price prediction).
  • Unsupervised Learning — model finds patterns in unlabeled data. Used for clustering (customer segmentation), anomaly detection, and dimensionality reduction.
  • Semi-supervised Learning — combines small amount of labeled data with large amounts of unlabeled data.
  • Reinforcement Learning (RL) — agent learns by interacting with an environment, receiving rewards/penalties. Used for game playing, robotics, and RLHF in LLMs.
  • Self-supervised Learning — model generates its own labels from input data (e.g., predicting masked tokens). Used for pre-training foundation models.

Neural Networks Basics

  • Neurons/Nodes — basic computation units that receive inputs, apply weights, add bias, and pass through an activation function.
  • Layers — Input layer (receives data), Hidden layers (process data), Output layer (produces result).
  • Weights & Biases — parameters learned during training that determine the model’s behavior.
  • Activation Functions — introduce non-linearity (ReLU, Sigmoid, Softmax, Tanh).
  • Backpropagation — algorithm to compute gradients and update weights by propagating errors backward.
  • Loss Function — measures how far the model’s predictions are from actual values.
  • Transformer Architecture — foundation of modern LLMs; uses self-attention mechanism to process entire sequences in parallel (introduced in “Attention is All You Need” paper, 2017).
  • CNNs (Convolutional Neural Networks) — specialized for image/spatial data.
  • RNNs/LSTMs — sequential data processing (largely superseded by Transformers for NLP).
  • GANs (Generative Adversarial Networks) — generator + discriminator for image generation.
  • Diffusion Models — generate images/video by learning to denoise (e.g., Stable Diffusion, Nova Canvas).

📖 Deep Dive Guides: Bedrock vs SageMaker | RAG Architecture | Prompt Engineering | Responsible AI | AI Services Decision Guide

Foundation Model Concepts

Pre-training

  • Training a model on massive datasets (trillions of tokens) to learn general language/world knowledge.
  • Extremely expensive and resource-intensive (millions of GPU hours).
  • Results in a base model with broad capabilities but no specific task alignment.
  • Common objectives: next-token prediction (GPT-style), masked language modeling (BERT-style).

Fine-tuning Techniques

  • Instruction Tuning — fine-tuning on instruction-response pairs to make the model follow instructions better.
  • RLHF (Reinforcement Learning from Human Feedback) — trains a reward model from human preferences, then uses RL (PPO) to optimize the language model against that reward. Used to align models with human values.
  • DPO (Direct Preference Optimization) — simpler alternative to RLHF that directly optimizes on preference pairs without a separate reward model. More stable training.
  • LoRA / QLoRA — Parameter-Efficient Fine-Tuning (PEFT) that freezes base model and trains small adapter layers. Reduces compute by 90%+.
  • Continued Pre-training — further pre-training on domain-specific data to teach the model new knowledge (e.g., medical, legal, financial).
  • Distillation — training a smaller “student” model to mimic a larger “teacher” model’s outputs. Reduces inference cost while retaining most capability.

RAG (Retrieval Augmented Generation)

  • Combines information retrieval with text generation to ground LLM responses in external knowledge.
  • How it works: Query → Retrieve relevant documents from knowledge base → Augment prompt with retrieved context → Generate response.
  • Benefits: Reduces hallucinations, enables up-to-date responses, no model retraining needed, source attribution.
  • Components: Document ingestion, chunking strategy, embedding model, vector database, retrieval algorithm, re-ranking.
  • AWS Implementation: Amazon Bedrock Knowledge Bases, Amazon Kendra (GenAI Index), OpenSearch vector search.

Prompt Engineering

  • Zero-shot — asking the model to perform a task without any examples. Relies on pre-trained knowledge.
  • Few-shot (In-Context Learning) — providing a few examples in the prompt to guide the model’s output format and behavior.
  • Chain-of-Thought (CoT) — asking the model to “think step by step” to improve reasoning on complex tasks.
  • System Prompts — instructions that define the model’s role, behavior, and constraints.
  • Prompt Templates — reusable prompt structures with placeholders for dynamic content.
  • Prompt Chaining — breaking complex tasks into sequential prompts where output of one feeds input of next.

Key Parameters & Concepts

  • Tokenization — splitting text into tokens (subwords/words). Models process tokens, not characters. Affects context limits and pricing.
  • Embeddings — dense vector representations of text/images in high-dimensional space. Semantically similar items have similar embeddings. Used for search, RAG, and clustering.
  • Temperature — controls randomness of output. Low (0-0.3) = deterministic/focused, High (0.7-1.0) = creative/diverse. 0 = greedy decoding.
  • Top-p (Nucleus Sampling) — considers only tokens whose cumulative probability exceeds p. Top-p 0.9 = considers top 90% probability mass.
  • Top-k — limits token selection to the k most likely next tokens.
  • Context Window — maximum number of tokens (input + output) the model can process at once. Ranges from 4K to 1M+ tokens in modern models.
  • Max Tokens — limits the length of generated output.
  • Stop Sequences — tokens that signal the model to stop generating.
  • Hallucination — when a model generates plausible-sounding but factually incorrect information.
  • Grounding — techniques to anchor model responses in factual data (RAG, tool use, citations).

Responsible AI

Core Principles

  • Fairness & Bias — ensuring models don’t discriminate based on protected attributes (race, gender, age). Types: selection bias, measurement bias, representation bias, confirmation bias.
  • Explainability — ability to understand and explain how/why a model made a specific prediction. Techniques: SHAP, LIME, attention visualization, feature importance.
  • Transparency — openly communicating model capabilities, limitations, and intended use cases to users.
  • Robustness — model performs reliably across different inputs, including adversarial examples and edge cases.
  • Privacy & Security — protecting training data, user inputs, and model outputs. Preventing data leakage and prompt injection.
  • Governance — organizational policies, processes, and controls for responsible AI development and deployment.
  • Safety — preventing harmful outputs including toxic content, misinformation, and dangerous instructions.

AWS Responsible AI Tools

  • AWS AI Service Cards — transparency documentation for AWS AI services covering intended use cases, limitations, responsible AI design choices, and deployment best practices.
  • Amazon Bedrock Guardrails — configurable safeguards for GenAI applications:
    • Content filters (hate, insults, sexual, violence, misconduct)
    • Denied topics (topic avoidance policies)
    • Word/phrase filters
    • Sensitive information filters (PII redaction)
    • Contextual grounding checks (hallucination detection)
    • Automated Reasoning Checks (logical verification)
  • SageMaker Clarify — detects bias in data and models, provides feature attributions for explainability (note: moving to maintenance July 2026 for new customers).
  • Model Cards — documentation that describes a model’s intended use, performance metrics, limitations, and ethical considerations. Supported in SageMaker Model Registry.
  • Human-in-the-Loop (HITL) — keeping humans involved in AI decision-making for high-stakes scenarios. AWS A2I (Augmented AI) provided review workflows (note: moving to maintenance July 2026 for new customers).
  • Amazon Bedrock Model Evaluation — automatic evaluation (accuracy, robustness, toxicity), human evaluation, and LLM-as-a-judge for quality assessment.

Bias Mitigation Strategies

  • Pre-processing: Balance training data, remove sensitive attributes, data augmentation.
  • In-processing: Regularization techniques, adversarial debiasing, fairness constraints during training.
  • Post-processing: Calibrate outputs, threshold adjustment, reject option classification.
  • Monitoring: Continuously track model performance across demographic groups in production.

Agentic AI

What are AI Agents?

  • AI systems that can autonomously plan, reason, and execute multi-step tasks to achieve goals.
  • Go beyond simple prompt-response by taking actions, using tools, and adapting based on outcomes.
  • Can operate for extended periods, making decisions and course-correcting without human intervention.

Key Concepts

  • Tool Use (Function Calling) — agents invoke external tools (APIs, databases, code execution) to gather information or perform actions.
  • Multi-step Reasoning — breaking complex problems into steps, executing sequentially with intermediate evaluations.
  • Orchestration — coordinating multiple agents or components to complete complex workflows. Patterns: sequential, parallel, routing, supervisor.
  • Memory — maintaining context across interactions:
    • Short-term memory (conversation context within a session)
    • Long-term memory (persistent knowledge across sessions)
    • Episodic memory (past experiences and outcomes)
  • Planning — decomposing goals into actionable sub-tasks, determining execution order, handling dependencies.
  • Reflection — agents evaluate their own outputs and self-correct errors before responding.
  • Model Context Protocol (MCP) — open standard for connecting AI agents with external tools and data sources.
  • Agent2Agent (A2A) — protocol for inter-agent communication and collaboration.

AWS Agentic AI Services

  • Amazon Bedrock Agents — create agents that can break down tasks, call APIs, and access knowledge bases (transitioning to Bedrock Agents Classic, July 2026).
  • Amazon Bedrock AgentCore (GA 2025/2026) — enterprise-grade infrastructure for deploying and operating AI agents at scale:
    • AgentCore Runtime — serverless, scalable environment to host agents
    • AgentCore Gateway — MCP-compatible tool connectivity
    • AgentCore Identity — per-agent identity and least-privilege access
    • AgentCore Observability — monitoring, tracing, and debugging
    • AgentCore Code Interpreter — secure sandboxed code execution
    • AgentCore Optimization — continuous quality evaluation and improvement
  • Amazon Nova Act — browser automation agent for web-based tasks.
  • AWS Step Functions — orchestrate multi-step agent workflows with state management.

AWS AI Service Stack

AWS organizes AI/ML services into three layers:

Layer 1: AI Infrastructure (Compute & Silicon)

Service/Chip Purpose Key Details
AWS Trainium Custom chip for ML training Trainium2 (4x perf vs gen1), Trainium3 (3nm, 4.4x vs Trn2, GA Dec 2025)
AWS Inferentia Custom chip for ML inference Inferentia2 (4x throughput, 10x lower latency vs gen1), Inf2 instances
EC2 UltraServers Multi-instance AI clusters Trn2 UltraServers (64 Trainium2 chips, NeuronLink interconnect), Trn3 UltraServers
AWS AI Factories On-premises AI infrastructure Deploy AI training/inference infrastructure in customer data centers
AWS Neuron SDK Software for Trainium/Inferentia Integrates with PyTorch, JAX, TensorFlow. Compiler, runtime, profiler
EC2 P5/P5e/P5en GPU instances (NVIDIA) H100/H200 GPUs for training and inference
EC2 G6/G6e GPU instances (NVIDIA) L4/L40S GPUs for inference and graphics
AWS Graviton Arm-based general compute Best price-performance for inference serving and general ML workloads
Amazon EFA Elastic Fabric Adapter Low-latency networking for distributed training across instances

Layer 2: ML Platform (SageMaker AI)

  • Amazon SageMaker AI (rebranded from SageMaker, late 2024) — end-to-end ML platform for building, training, and deploying models.
Component Purpose
SageMaker Unified Studio Single IDE for data, analytics, and ML/AI development (integrates Bedrock)
SageMaker Canvas No-code ML for business analysts — point-and-click model building
SageMaker HyperPod Managed clusters for large-scale distributed training with auto-recovery
SageMaker Pipelines CI/CD for ML — define, automate, and manage ML workflows
SageMaker Feature Store Centralized repository for ML features (online + offline store)
SageMaker MLflow Managed MLflow for experiment tracking, model versioning, deployment
SageMaker Model Registry Central catalog to version, manage, and deploy models with approval workflows
SageMaker JumpStart Model hub with 400+ pre-trained models, one-click deploy, fine-tuning
SageMaker Endpoints Real-time inference hosting (single model or multi-model endpoints)
SageMaker Training Managed training with built-in algorithms, distributed training, spot instances
SageMaker Processing Run data processing and evaluation jobs at scale
SageMaker Lakehouse Unified access to data lakes and warehouses for ML

Layer 3: AI Applications & Services

Amazon Bedrock (Generative AI Platform)

  • Amazon Bedrock — fully managed service for building GenAI applications with foundation models.
  • Model Providers: Amazon (Nova), Anthropic (Claude), Meta (Llama), Mistral, Cohere, AI21 Labs, OpenAI, Stability AI.
  • Key Capabilities:
    • Model Inference — Converse API, InvokeModel, streaming, batch inference, cross-region inference
    • Knowledge Bases — managed RAG with vector stores (OpenSearch, Aurora, Pinecone, etc.)
    • Managed Knowledge Base (2026) — fully managed RAG primitive (storage + retrieval + embeddings + re-ranking)
    • Agents — multi-step task execution with tool use (transitioning to AgentCore)
    • Guardrails — content filtering, topic avoidance, PII protection, grounding checks
    • Model Customization — fine-tuning, continued pre-training, distillation
    • Model Evaluation — automatic metrics, human evaluation, LLM-as-judge
    • Flows — visual workflow builder for chaining prompts, agents, and knowledge bases

Amazon Nova Models

  • Nova Micro — text-only, fastest, lowest cost (128K context). Ideal for classification, summarization.
  • Nova Lite — multimodal (text + image + video input), cost-effective (300K context).
  • Nova Pro — balanced multimodal, strong accuracy/speed/cost trade-off (300K context).
  • Nova Premier — most capable, complex reasoning, agentic workflows, teacher model (1M context).
  • Nova Canvas — image generation with editing controls and watermarking.
  • Nova Reel — video generation (1280×720, 24fps, up to 6 seconds).
  • Nova Sonic — speech-to-speech for real-time conversational AI.
  • Nova 2 (Dec 2025) — next generation with extended thinking (adjustable levels), 1M token context, built-in tools:
    • Nova 2 Lite — fast, cost-effective reasoning model
    • Nova 2 Pro — most intelligent, complex agentic tasks
    • Nova 2 Sonic — next-gen speech with async tool calling
    • Nova 2 Omni — unified multimodal I/O (text + image generation)
  • Nova Act — browser automation agent for web tasks.
  • Nova Forge — custom model building program (open training).

Amazon Q Developer & Q Business

  • Amazon Q Developer — AI-powered coding assistant (evolved from CodeWhisperer):
    • Code generation, completion, and inline suggestions (15+ languages)
    • Agentic coding (autonomous multi-step development)
    • Security vulnerability scanning
    • Code transformation and modernization (Java, .NET upgrades)
    • CLI integration (natural language → commands)
    • Debugging and troubleshooting with CloudWatch integration
  • Amazon Q Business — AI assistant for enterprise knowledge (connects to 40+ data sources):
    • Natural language answers from company data
    • Document summarization and content creation
    • Task automation with plugins
    • Access control respecting existing permissions (ACL-aware)
  • Amazon Q in Console — chat assistant in AWS Management Console for troubleshooting and guidance.
⚠️ Note (July 2026): Amazon Q Developer IDE plugins reaching end-of-support April 2027. Successor is Kiro — AWS’s agentic development environment. Amazon Q Business and Amazon Kendra entering maintenance mode for new customers July 30, 2026.

AWS AI/ML Application Services

Service Category Purpose
Amazon Comprehend NLP Sentiment analysis, entity recognition, key phrase extraction, language detection, topic modeling
Amazon Rekognition Computer Vision Object/face detection, content moderation, celebrity recognition, text in images, custom labels
Amazon Polly Speech Text-to-speech with neural voices (60+ voices, 30+ languages), SSML support
Amazon Transcribe Speech Speech-to-text (ASR), real-time and batch, custom vocabularies, speaker identification
Amazon Translate Language Neural machine translation (75+ languages), real-time and batch, custom terminology
Amazon Textract Document AI OCR + intelligent document processing, extracts text, tables, forms, and queries from documents
Amazon Lex Conversational AI Build chatbots and voice bots with automatic speech recognition and NLU
Amazon Kendra Search Enterprise search with NLP, semantic understanding, GenAI index for RAG ⚠️ Maintenance mode July 2026
Amazon Personalize Recommendations Real-time personalization and recommendations (same tech as Amazon.com)
Amazon Forecast Time Series Time series forecasting using ML (closed to new customers since 2024)
Amazon HealthScribe Healthcare Generate clinical documentation from patient-clinician conversations
Amazon Bedrock AgentCore Agentic AI Deploy, manage, and optimize AI agents at scale (GA 2025/2026)

Decision Matrix: Use Case → Recommended Service

Use Case Recommended Service Why
Build GenAI apps with FMs (no ML expertise) Amazon Bedrock Serverless, multi-model, fully managed
Custom model training from scratch SageMaker AI + Trainium Full control over training, data, and infrastructure
Enterprise Q&A over company documents Amazon Q Business / Bedrock Knowledge Bases Connects to 40+ data sources, ACL-aware
AI coding assistant Amazon Q Developer / Kiro Inline completions, security scanning, agentic coding
Build and deploy AI agents Bedrock AgentCore Serverless runtime, MCP tools, identity, observability
Chatbot / virtual assistant Amazon Lex + Bedrock Lex for structure, Bedrock for natural responses
Document processing (forms, invoices) Amazon Textract Extracts structured data from documents at scale
Content moderation (images/video) Amazon Rekognition Pre-built moderation labels, custom labels for specifics
Sentiment analysis on customer feedback Amazon Comprehend Pre-built NLP models, no training needed
Real-time product recommendations Amazon Personalize Same ML tech as Amazon.com, real-time updates
Transcribe meetings/calls Amazon Transcribe Real-time ASR, speaker diarization, custom vocab
Generate speech from text Amazon Polly Neural TTS, SSML support, multiple voices
Translate content at scale Amazon Translate 75+ languages, real-time, custom terminology
No-code ML for business users SageMaker Canvas Point-and-click, AutoML, visual interface
Fine-tune FMs on proprietary data Bedrock Custom Models / SageMaker JumpStart Bedrock for serverless; SageMaker for full control
Prevent harmful GenAI outputs Amazon Bedrock Guardrails Content filters, PII, grounding checks, topic avoidance
Cost-effective GenAI inference at scale Bedrock + Nova models (or Inferentia2/Trainium) Nova = lowest cost in class; custom silicon for self-hosted
Clinical documentation from conversations Amazon HealthScribe Purpose-built for healthcare, HIPAA eligible

Quick Reference: All AWS AI/ML Services

Service One-Liner
Amazon Bedrock Fully managed GenAI platform with multi-provider foundation models
Amazon Bedrock AgentCore Enterprise infrastructure for deploying and operating AI agents at scale
Amazon Nova Amazon’s family of foundation models (text, multimodal, speech, image, video)
Amazon Q Developer AI coding assistant with code generation, security scanning, and transformation
Amazon Q Business Enterprise AI assistant for Q&A and task automation over company data
Amazon SageMaker AI End-to-end ML platform for building, training, and deploying custom models
Amazon Comprehend NLP service for sentiment, entities, key phrases, language detection
Amazon Rekognition Computer vision for object/face detection, moderation, and custom labels
Amazon Polly Text-to-speech with neural and standard voices
Amazon Transcribe Automatic speech recognition (speech-to-text)
Amazon Translate Neural machine translation for 75+ languages
Amazon Textract Extract text, tables, and forms from documents (OCR+)
Amazon Lex Build conversational chatbots and voice bots
Amazon Kendra Intelligent enterprise search with NLP and GenAI index
Amazon Personalize Real-time ML-powered personalization and recommendations
Amazon HealthScribe Generate clinical notes from patient-clinician conversations
AWS Trainium Custom AI chip optimized for training (Trn2, Trn3 instances)
AWS Inferentia Custom AI chip optimized for inference (Inf2 instances)
AWS Neuron SDK SDK for running ML workloads on Trainium and Inferentia chips
Amazon SageMaker Canvas No-code ML model building for business analysts
Amazon SageMaker HyperPod Managed clusters for distributed training with auto fault recovery

Exam Tips

AIF-C01 — AWS Certified AI Practitioner

  • Format: 65 questions, 90 minutes, 700/1000 passing score.
  • Domains:
    • Domain 1: Fundamentals of AI and ML (20%)
    • Domain 2: Fundamentals of Generative AI (24%)
    • Domain 3: Applications of Foundation Models (28%) — largest domain, most technical
    • Domain 4: Guidelines for Responsible AI (14%)
    • Domain 5: Security, Compliance, and Governance for AI Solutions (14%)
  • Key Focus Areas:
    • Domains 2+3 = 52% of exam — master Bedrock, RAG, prompt engineering, fine-tuning
    • Know the difference between AI vs ML vs DL vs GenAI
    • Understand when to use Bedrock vs SageMaker
    • RAG architecture and when to use it vs fine-tuning
    • Responsible AI principles and Bedrock Guardrails
    • Temperature, top-p effects on output
    • Know all AWS AI services at a high level (what each does)

AIP-C01 — AWS Certified Generative AI Developer – Professional

  • Format: 85 questions, 180 minutes, 750/1000 passing score.
  • Domains:
    • Domain 1: FM Selection and Integration (26%)
    • Domain 2: Data Management and Optimization (22%)
    • Domain 3: Model Performance and Compliance (31%) — largest domain
    • Domain 4: Security and Governance (21%)
  • Key Focus Areas:
    • Deep hands-on knowledge of Bedrock APIs, agents, knowledge bases, guardrails
    • RAG implementation details (chunking strategies, embedding models, vector stores)
    • Model customization (when fine-tuning vs RAG vs prompt engineering)
    • Agentic AI patterns (tool use, multi-step, AgentCore)
    • SageMaker for custom training and deployment
    • Model evaluation and monitoring in production
    • Security: data encryption, VPC endpoints, IAM for Bedrock, prompt injection mitigation
    • Cost optimization (model selection, batch inference, provisioned throughput)

Common Exam Scenarios

  • “Least operational overhead” → Bedrock (serverless) over SageMaker (managed infrastructure)
  • “Custom model with proprietary data” → Fine-tuning on Bedrock or SageMaker depending on control needed
  • “Reduce hallucinations” → RAG with Knowledge Bases + Guardrails grounding checks
  • “Enterprise search over internal docs” → Amazon Q Business or Bedrock Knowledge Bases
  • “Control AI outputs for safety” → Bedrock Guardrails
  • “Lowest cost inference” → Nova Micro (text) or Nova Lite (multimodal) on Bedrock
  • “Deploy agents in production” → Bedrock AgentCore (serverless, scalable, observable)
  • “Train trillion-parameter model” → Trainium3 UltraServers + SageMaker HyperPod

Practice Questions

Question 1 (AIF-C01)

A company wants to reduce hallucinations in their generative AI application that answers customer questions about company policies. The application uses Amazon Bedrock. What is the MOST effective approach?

  1. Increase the model temperature to generate more diverse responses
  2. Implement Retrieval Augmented Generation (RAG) with Amazon Bedrock Knowledge Bases
  3. Fine-tune the foundation model on company documents
  4. Switch to a larger foundation model
Show Answer

Answer: B – RAG grounds responses in actual company documents, directly reducing hallucinations. Fine-tuning (C) teaches style/format but doesn’t guarantee factual accuracy for specific documents. Higher temperature (A) increases randomness. Larger models (D) don’t inherently reduce hallucinations.

Question 2 (AIF-C01)

Which combination of techniques helps ensure responsible AI in a generative AI application? (Select TWO)

  1. Increase the context window size
  2. Configure Amazon Bedrock Guardrails with content filters and denied topics
  3. Use the lowest-cost foundation model available
  4. Implement human review workflows for high-stakes decisions
  5. Maximize the temperature parameter for creative outputs
Show Answer

Answer: B, D – Bedrock Guardrails (B) provides configurable safety controls to filter harmful content. Human-in-the-loop (D) ensures human oversight for critical decisions. Context window size (A), model cost (C), and temperature (E) are not responsible AI techniques.

Question 3 (AIP-C01)

A developer is building an AI agent that needs to autonomously execute multi-step workflows, call external APIs, and maintain state across interactions. The solution must be production-grade with monitoring and minimal infrastructure management. Which AWS service should they use?

  1. Amazon Lex with Lambda fulfillment functions
  2. Amazon Bedrock AgentCore with AgentCore Runtime and Observability
  3. AWS Step Functions with SageMaker endpoints
  4. Amazon Q Business with custom plugins
Show Answer

Answer: B – Bedrock AgentCore provides serverless runtime for agents, MCP-compatible tool connectivity (Gateway), built-in observability, and identity management — purpose-built for production AI agents. Lex (A) is for chatbots, not autonomous agents. Step Functions (C) requires more infrastructure management. Q Business (D) is for enterprise knowledge, not custom agent workflows.

Question 4 (AIP-C01)

A team needs to fine-tune a foundation model on their proprietary dataset with minimal compute cost. The dataset contains 10,000 instruction-response pairs. Which approach provides the BEST balance of performance improvement and cost?

  1. Full fine-tuning of the entire model on Amazon SageMaker with P5 GPU instances
  2. Continued pre-training on Amazon Bedrock with the full dataset
  3. Parameter-efficient fine-tuning (LoRA) through Amazon Bedrock custom models
  4. Distilling the model into a smaller variant using Nova Premier as teacher
Show Answer

Answer: C – LoRA fine-tuning on Bedrock trains only small adapter layers (reduces compute by 90%+) while the base model stays frozen. It’s ideal for instruction-tuning with limited data. Full fine-tuning (A) is expensive. Continued pre-training (B) is for teaching new knowledge, not task alignment. Distillation (D) creates a smaller model but doesn’t directly fine-tune on task data.

Question 5 (AIF-C01 / AIP-C01)

A company wants to deploy a generative AI solution with the following requirements: lowest possible latency for text summarization, minimal cost, and no infrastructure management. Which combination should they choose?

  1. Amazon Nova Premier on Bedrock with provisioned throughput
  2. Amazon Nova Micro on Bedrock with on-demand pricing
  3. Claude 3 Opus on Bedrock with batch inference
  4. Self-hosted Llama model on SageMaker with Inferentia2 instances
Show Answer

Answer: B – Nova Micro is the fastest text-only model (200+ tokens/sec), lowest cost, and Bedrock provides serverless (no infrastructure). Premier (A) is more capable but slower and costlier. Batch (C) has high latency. Self-hosted (D) requires infrastructure management.

Frequently Asked Questions

What is the difference between AI, ML, and Generative AI?

AI is the broadest category — machines performing tasks that typically require human intelligence. ML is a subset that learns from data without explicit programming. Generative AI is a subset of ML that creates new content (text, images, code) using foundation models trained on vast datasets.

What is the difference between Amazon Bedrock and SageMaker?

Bedrock provides access to pre-built foundation models for generative AI applications without ML expertise. SageMaker is a full ML platform for building, training, and deploying custom models from scratch. Use Bedrock for gen AI apps; SageMaker when you need complete control over model training.

What AWS certifications cover AI and Generative AI?

AWS offers two AI-focused certifications: AIF-C01 (AI Practitioner) for foundational knowledge of AI/ML/Gen AI concepts and AWS services, and AIP-C01 (AI Professional) for practitioners building and deploying Gen AI solutions. Both require knowledge of Bedrock, SageMaker, and responsible AI.

Detailed Guides

Exam Prep: AWS AI Professional (AIP-C01) Exam Learning Path

References

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.