Table of Contents hide

Amazon Bedrock – Fully Managed Foundation Models & Gen AI

Key Features

Pricing Model

Security & Data Privacy

Integration with AWS Services

Amazon Bedrock vs Amazon SageMaker AI

AWS Certification Exam Relevance

Amazon Bedrock Practice Questions

Frequently Asked Questions

Amazon Bedrock – Fully Managed Foundation Models & Gen AI

Amazon Bedrock Overview

Amazon Bedrock is a fully managed service that provides access to high-performing foundation models (FMs) from leading AI companies through a single, unified API.

was announced at AWS re:Invent 2022 and became generally available in September 2023.
enables building and scaling generative AI applications without managing any infrastructure (serverless).

offers a broad set of capabilities including model access, customization (fine-tuning, continued pre-training), RAG with Knowledge Bases, Agents, Guardrails, and model evaluation.
ensures data privacy — your data is not shared with model providers and is not used to improve or train the base models.
supports multiple inference modes: on-demand, batch (50% cheaper), and provisioned throughput for consistent performance.

integrates with AWS services you already use — IAM, CloudWatch, CloudTrail, VPC, Lambda, S3, Step Functions, and more.
as of 2026, the model catalog spans 18+ providers and over 110 individually addressable model variants including language, reasoning, image, video, speech, and embedding models.
supports OpenAI-compatible API endpoints (2026) including Responses API and Chat Completions API for simplified migration from existing OpenAI-based applications.

Amazon Bedrock Architecture
Your Application
↓ Bedrock API ↓
Foundation Models
Claude, Nova, Llama, Mistral
Knowledge Bases
RAG → Vector Store
Agents
Tool Use, Multi-step
Guardrails
Filters, PII, Grounding
Fine-tuning
Model Evaluation
Prompt Flows
AgentCore
Fully managed • No infrastructure • Your data never leaves your account

📖 Deep Dive Guides: Bedrock vs SageMaker | RAG Architecture | Prompt Engineering | Responsible AI | AI Services Decision Guide

Supported Foundation Models

Amazon Bedrock provides access to a diverse selection of foundation models from multiple leading AI providers, allowing customers to choose the best model for their specific use case.

Model Providers Comparison

Provider	Models	Strengths	Key Use Cases
Anthropic	Claude Opus 4, Claude Sonnet 4/4.5, Claude Haiku 3.5	Advanced reasoning, coding, long context (200K–1M tokens), agentic tasks	Complex analysis, code generation, enterprise workflows
Amazon	Nova Micro, Lite, Pro, Premier, Nova 2 Lite/Pro, Nova Canvas, Nova Reel, Nova Sonic	Best price-performance, multimodal (text, image, video, speech), fast inference	Cost-efficient workloads, content generation, RAG, agentic apps
Meta	Llama 3.3 70B, Llama 3.2 (11B/90B), Llama 3.1 (8B/70B/405B)	Open-weight, strong reasoning, multilingual	General-purpose, fine-tuning, on-device deployment
Mistral AI	Mistral Large, Mistral Small, Mixtral	Efficiency, multilingual, code generation	European language tasks, coding, cost-optimized inference
Cohere	Command R, Command R+	Enterprise search, RAG-optimized, multilingual	Enterprise search, document processing, RAG applications
AI21 Labs	Jamba 1.5 (Large/Mini), Jamba-Instruct	Long context (256K), efficient architecture (SSM+Transformer hybrid)	Document summarization, Q&A, long-form content
Stability AI	Stable Diffusion XL, SDXL variants, image editing models	High-quality image generation and editing	Creative content, marketing visuals, design prototyping
OpenAI	GPT models (added 2025–2026)	Broad capabilities, coding, reasoning	General-purpose AI, migration from OpenAI direct
DeepSeek	DeepSeek-R1	Advanced reasoning, open-weight	Math, science, complex reasoning tasks

Amazon Nova Foundation Models

Amazon Nova Micro — Text-only model optimized for speed (200+ tokens/sec) and lowest cost. 128K context. Ideal for summarization, translation, classification, and chat.

Amazon Nova Lite — Low-cost multimodal model (text, images, video input → text output). Best for document analysis, visual Q&A, and RAG.
Amazon Nova Pro — Balanced multimodal model offering strong accuracy, speed, and cost. Suitable for a wide range of complex tasks.
Amazon Nova Premier — Most capable model for complex reasoning, agentic workflows, and teacher model for distillation.

Amazon Nova Canvas — Image generation model with professional-grade outputs and customization controls.
Amazon Nova Reel — Video generation model supporting multi-shot videos up to 2 minutes with motion control.
Amazon Nova Sonic — Speech-to-speech model for conversational AI with natural turn-taking and expressivity.

Amazon Nova 2 (December 2025)

Nova 2 Lite and Nova 2 Pro — Next-generation models with significantly improved capabilities.
Support extended thinking with step-by-step reasoning and task decomposition.
Include three thinking intensity levels (low, medium, high) for controlling the balance of speed, intelligence, and cost.
Process up to 1M tokens of context, enabling analysis of extensive codebases, long documents, and videos.

Built-in tools: code interpreter, web grounding, and remote MCP tool support.
Nova 2 Sonic — Next-generation speech-to-speech model with expressive voices, native expressivity in multiple languages, natural turn-taking, and interruption handling.

Key Features

Model Access & Inference

Provides a unified API to access all foundation models — no need to manage separate integrations per provider.

Supports four families of runtime APIs:
- Invoke family — InvokeModel (synchronous), InvokeModelWithResponseStream (streaming), InvokeModelWithBidirectionalStream (full-duplex), AsyncInvoke (long-running, output to S3).
- Converse family — Model-agnostic interface for multi-turn conversations (Converse and ConverseStream).
- OpenAI-compatible family — ChatCompletions and Responses API for migration from OpenAI-based apps.
- Messages family — Anthropic Messages interface on the bedrock-mantle endpoint.
Supports cross-region inference with routing tiers: In-Region, Geo, and Global for optimized latency and availability.

Models can be swapped without rewriting application code thanks to the unified Converse API.

Model Customization

Fine-tuning — Supervised fine-tuning using labeled prompt-response pairs to improve model performance for specific use cases. Supports iterative fine-tuning for systematic refinement.
Continued Pre-training — Feed domain-specific unlabeled data to adapt a model’s knowledge to your industry (e.g., finance, healthcare, legal).

Reinforcement Fine-tuning (RFT) — Uses reinforcement learning from human or AI feedback (RLHF/RLAIF) to align model behavior with desired outcomes. Available for Nova, OpenAI GPT OSS, and Qwen models.
Model Distillation — Use a larger “teacher” model (e.g., Nova Premier) to create a smaller, more cost-effective “student” model with comparable accuracy.
Custom Model Import — Bring your own fine-tuned models (including open-weight models like DeepSeek-R1) into Bedrock for managed inference.

Creates a private copy of the model — your data is never shared with model providers.
Supports on-demand inference for custom models (no need to provision throughput for fine-tuned Amazon Nova models).

Knowledge Bases (RAG)

Provides fully managed Retrieval Augmented Generation (RAG) to ground model responses in your proprietary data.

Connects to data sources including Amazon S3, Confluence, SharePoint, Salesforce, and web crawlers.
Automatically handles document chunking, embedding generation, and vector storage.
Supports vector databases: Amazon OpenSearch Serverless, Amazon Aurora PostgreSQL, Pinecone, Redis Enterprise, and MongoDB Atlas.
Amazon Bedrock Managed Knowledge Base (2026) — Abstracts storage, retrieval, embeddings, and re-ranking into a single managed primitive. No need to choose or configure a vector database.

Supports hybrid search combining semantic (vector) and keyword (lexical) retrieval for improved accuracy.
Includes metadata filtering and access controls for multi-tenant scenarios.
Available in 22+ AWS Regions.

Agents

Bedrock Agents enable building AI applications that can plan and execute multi-step tasks autonomously.
Agents break down user requests, call APIs, access knowledge bases, and maintain conversation context.

Support action groups — define allowed actions via OpenAPI schemas or Lambda functions.
Include built-in capabilities: code interpreter, web search, and user confirmation for sensitive actions.
Amazon Bedrock AgentCore (2026) — A dedicated runtime for production-grade AI agents with:
- AgentCore Runtime — Serverless scaling and session isolation.
- AgentCore Memory — Conversation context and long-term preference storage.
- AgentCore Identity — Secure multi-IDP authentication.
- AgentCore Harness — Testing, evaluation, and continuous optimization.
- AgentCore Observability — Monitoring and debugging agent behavior.
- Guardrails Integration — Real-time evaluation of agent actions for safety.
Integrated with AWS Step Functions for complex orchestration workflows.

Guardrails

Amazon Bedrock Guardrails provides configurable safety controls to filter harmful content and enforce responsible AI policies.
Works with any foundation model on Bedrock and external models via the standalone ApplyGuardrail API (model-agnostic).

Key safeguards include:
- Content Filters — Block harmful content across categories: Hate, Insults, Sexual, Violence, Misconduct. Supports both text and images (88% blocking accuracy).
- Denied Topics — Define topics the model should refuse to discuss.
- Word Filters — Block specific words or phrases.
- Sensitive Information Filters (PII) — Detect and redact/block personally identifiable information with Block or Mask modes.
- Contextual Grounding — Detect hallucinations by checking response relevance to source data.
- Automated Reasoning Checks — Logic-based validation of model outputs.
- Prompt Attack Detection — Identify and block prompt injection attempts.
- Code Domain Protection (2025) — Filter harmful content in code elements including comments, variables, and string literals.
Supports IAM policy-based enforcement for organization-wide guardrail application.
Scalable: up to 50 calls/second via ApplyGuardrail API, 200 TUPS for content filters.

Model Evaluation

Amazon Bedrock Evaluations helps compare and select the best foundation model for your use case.

Supports automatic evaluation with predefined metrics: accuracy, robustness, toxicity, semantic similarity, and faithfulness.
Supports human evaluation with custom workflows for subjective quality assessment.
Can evaluate custom (fine-tuned) models against base models to validate improvements.

Evaluates RAG systems end-to-end — retrieval relevance, answer faithfulness, and response quality.
Model-agnostic: evaluate models running on Bedrock, other clouds, or on-premises (provide data in required format).
Supports LLM-as-a-judge evaluation pattern for automated scoring at scale.

Pricing Model

Amazon Bedrock offers flexible pricing options to match different workload requirements and budgets.

On-Demand Inference — Pay-per-use based on tokens processed (input and output). No commitments. Three tiers available:
- Flex — Best-effort capacity at standard pricing.
- Priority — Higher priority access during peak demand.
- Standard — Guaranteed capacity with highest priority.
Batch Inference — Process large volumes of data asynchronously at 50% of on-demand pricing. Ideal for non-time-sensitive workloads like bulk classification, summarization, or embeddings.
Provisioned Throughput (Reserved Capacity) — Reserve a fixed number of tokens-per-minute for 1 month or 3 months. Provides consistent, predictable performance. Billed monthly at a fixed rate per 1K tokens-per-minute.

Cross-Region Inference — Automatically routes requests to optimize for availability and latency across regions. Pricing varies by routing tier.
Knowledge Bases, Agents, and Guardrails have additional per-request pricing components.
Model customization (fine-tuning, continued pre-training) is charged based on tokens processed during training.

No upfront costs for on-demand; Bedrock spend can count toward AWS Enterprise Discount Programs (EDP).

Security & Data Privacy

Data Privacy
- Your data is NOT used to train or improve the base foundation models.
- Your data is NOT shared with any model providers.
- Inputs and outputs are not stored by Amazon Bedrock (unless you opt-in to logging).
- Custom model training creates a private copy exclusively for your use.
Encryption
- All data encrypted in transit (TLS) — only encrypted connections to the service are allowed.
- All data encrypted at rest using AWS KMS (customer-managed keys supported).
- Model artifacts and training data are encrypted with KMS CMKs.

Network Security
- AWS PrivateLink — Establish private VPC endpoints to access Bedrock without traversing the public internet.
- VPC endpoint policies for fine-grained access control.
- No public IP addresses required for VPC instances communicating with Bedrock.

Access Control
- AWS IAM for authentication and authorization.
- Resource-based and identity-based policies.
- AWS CloudTrail for complete API audit logging.
- Service Control Policies (SCPs) for organizational governance.
Compliance
- SOC 1/2/3, ISO 27001, HIPAA eligible, PCI DSS, FedRAMP.
- Available in AWS GovCloud (US) regions and AWS Top Secret cloud.

Integration with AWS Services

AWS Lambda — Invoke Bedrock models from serverless functions for event-driven AI applications.
Amazon S3 — Store training data, Knowledge Base source documents, and async inference outputs.
AWS Step Functions — Orchestrate multi-step generative AI workflows with direct Bedrock integration (InvokeModel and fine-tuning actions).

Amazon CloudWatch — Monitor model invocation metrics, latency, token usage, and set alarms.
AWS CloudTrail — Audit all Bedrock API calls for security and compliance.
Amazon EventBridge — Trigger workflows based on Bedrock events (e.g., fine-tuning job completion).

AWS IAM — Fine-grained access control for models, knowledge bases, agents, and guardrails.
Amazon OpenSearch Serverless — Vector database backend for Knowledge Bases.
Amazon Kendra — Enterprise search integration for RAG scenarios.
AWS PrivateLink / VPC — Private network connectivity to Bedrock.

Amazon SageMaker Unified Studio — Access Bedrock models within the unified SageMaker development environment.
Amazon Connect — Build AI-powered contact center experiences with Bedrock models.
Amazon Lex — Enhanced conversational AI bots powered by Bedrock foundation models.

Amazon Bedrock vs Amazon SageMaker AI

Criteria	Amazon Bedrock	Amazon SageMaker AI
Primary Purpose	Use pre-trained foundation models for inference and generative AI apps	Build, train, and deploy custom ML models from scratch
Infrastructure	Fully serverless — no infrastructure to manage	Managed but requires selecting instance types, scaling config
Model Access	Multi-provider FM marketplace (18+ providers)	JumpStart for pre-trained models + bring your own
Customization	Fine-tuning, continued pre-training, RFT, distillation	Full training pipeline control, any framework, any architecture
Best For	Generative AI applications, RAG, agents, content generation	Custom ML (tabular, time-series, computer vision), proprietary architectures
Skill Level	Developers and application builders	Data scientists and ML engineers
Pricing	Pay-per-token (input/output)	Pay-per-instance-hour (training & inference)
Orchestration	Built-in Agents, AgentCore, Knowledge Bases	MLOps pipelines, model registry, feature store
Safety Controls	Built-in Guardrails with content filtering, PII detection	Model Cards, Bias detection (Clarify), monitoring

When to Use Each

Choose Bedrock when:
- You need to use pre-trained foundation models for inference.
- You want the fastest path to production with no infrastructure management.
- Your use case is generative AI — chatbots, content generation, summarization, RAG, agents.
- You want to compare and switch between models from different providers.
- Your team consists of application developers rather than ML specialists.
Choose SageMaker AI when:
- You need to train custom models from scratch with proprietary architectures.
- Your workload involves traditional ML (classification, regression, forecasting) on structured data.
- You need complete control over the training pipeline, hyperparameters, and infrastructure.
- Your token volume is high and predictable (cost-effective at scale with reserved instances).
- Compliance requires complete VPC data isolation with self-hosted model endpoints.
- AI/ML is your core product and you need full MLOps lifecycle management.

Use Both — Most mature AWS AI deployments use Bedrock for generative AI and agentic applications while leveraging SageMaker AI for custom model training and specialized ML workloads.

AWS Certification Exam Relevance

Certification	Bedrock Topics Tested	Weight
AWS Certified AI Practitioner (AIF-C01)	Core Bedrock service, model selection, Knowledge Bases/RAG, Agents, Guardrails, responsible AI, fine-tuning concepts, pricing	High — Bedrock is central to this exam
AWS Solutions Architect Associate (SAA-C03)	When to use Bedrock vs SageMaker, integration patterns, security (VPC, encryption), serverless architecture	Medium — appears in ML/AI service selection questions
AWS Solutions Architect Professional (SAP-C02)	Architecture patterns with Bedrock, multi-account governance, data privacy, cross-region inference, cost optimization	Medium — complex scenario questions involving AI workloads

Key Exam Tips

AIF-C01: If a question asks for a way to use foundation models with no infrastructure, no deployment overhead, and access to multiple model providers — the answer is almost always Bedrock.
AIF-C01: RAG with Knowledge Bases is the standard approach for grounding model responses in enterprise data without fine-tuning.

AIF-C01: Guardrails is the answer for content filtering, PII protection, and responsible AI enforcement.
SAA-C03: Bedrock = serverless foundation models; SageMaker AI = custom model training and deployment.
SAP-C02: For enterprise scenarios requiring data privacy, remember: Bedrock data is never used for training, and VPC endpoints provide private connectivity.

Amazon Bedrock Practice Questions

A company wants to build a customer-facing chatbot that answers questions using information from their internal knowledge base stored in Amazon S3. The solution must be serverless, ground responses in actual company data, and prevent hallucinations. Which combination of Amazon Bedrock features should they use?
1. Amazon Bedrock with fine-tuning on company documents
2. Amazon Bedrock Knowledge Bases with Guardrails contextual grounding checks
3. Amazon SageMaker AI with a custom-trained LLM
4. Amazon Kendra with Amazon Lex
Show Answer

Answer: b – Bedrock Knowledge Bases provides managed RAG that grounds responses in company data from S3. Guardrails with contextual grounding checks validate that responses are faithful to the retrieved source data, helping prevent hallucinations. Fine-tuning teaches style/behavior but doesn’t ground in specific documents. SageMaker requires infrastructure management. Kendra + Lex doesn’t provide generative AI responses.
A financial services company needs to deploy a generative AI application that processes sensitive customer data. They require that: (1) data is never shared with model providers, (2) all traffic stays within their private network, and (3) all data is encrypted with their own keys. Which Amazon Bedrock security features address these requirements?
1. IAM policies and CloudTrail logging
2. AWS PrivateLink VPC endpoints, AWS KMS customer-managed keys, and Bedrock’s data privacy guarantee
3. Security groups and NACLs
4. AWS Shield and AWS WAF
Show Answer

Answer: b – Amazon Bedrock guarantees that customer data is never shared with model providers or used to train base models. AWS PrivateLink provides private VPC connectivity without internet exposure. AWS KMS with customer-managed keys provides encryption at rest and in transit with customer control. These three features together satisfy all stated requirements.
A startup needs to build a generative AI application quickly and wants to evaluate multiple foundation models before selecting one for production. They have application developers but no ML engineers. Which AWS service is most appropriate?
1. Amazon SageMaker AI with JumpStart
2. Amazon Bedrock with Model Evaluation
3. Amazon Comprehend
4. AWS Deep Learning AMIs on EC2
Show Answer

Answer: b – Amazon Bedrock provides serverless access to multiple foundation models through a unified API, requires no ML expertise or infrastructure management, and includes built-in Model Evaluation to compare models using automatic and human evaluation metrics. SageMaker JumpStart provides pre-trained models but requires more ML expertise and infrastructure decisions.
An enterprise wants to ensure their generative AI application blocks harmful content, detects prompt injection attacks, and redacts PII from both inputs and outputs. The solution must work across different foundation models and be enforced organization-wide via IAM policies. Which service should they use?
1. Amazon Comprehend for PII detection with custom Lambda filters
2. Amazon Bedrock Guardrails with IAM policy-based enforcement
3. AWS WAF with custom rules
4. Amazon Macie for sensitive data discovery
Show Answer

Answer: b – Amazon Bedrock Guardrails provides all the required capabilities: content filtering (harmful content blocking), prompt attack detection, and sensitive information filters with PII redaction (Block or Mask modes). It works with any model via the ApplyGuardrail API and supports IAM policy-based enforcement to apply guardrails organization-wide across all model inference calls.

A company has a large volume of product descriptions (5 million items) that need to be classified into categories using a foundation model. The task is not time-sensitive and they want to minimize costs. Which Amazon Bedrock inference option should they choose?
1. On-demand inference with Standard tier
2. Provisioned throughput with 3-month commitment
3. Batch inference
4. Cross-region inference with Global routing
Show Answer

Answer: c – Batch inference is ideal for large-volume, non-time-sensitive workloads and is priced at 50% of on-demand pricing, making it the most cost-effective option. Provisioned throughput is for consistent, predictable workloads requiring low latency. On-demand Standard tier is more expensive. Cross-region inference optimizes availability, not cost.

Frequently Asked Questions

What is Amazon Bedrock?

Amazon Bedrock is a fully managed service that provides access to foundation models from leading AI companies (Anthropic, Meta, Mistral, Amazon, and others) through a single API. It offers fine-tuning, RAG with Knowledge Bases, Agents, and Guardrails without managing infrastructure.

How does Bedrock differ from SageMaker?

Bedrock is for accessing and customizing pre-built foundation models without ML expertise. SageMaker is for building, training, and deploying custom ML models from scratch. Use Bedrock for generative AI applications; use SageMaker when you need full control over the model training pipeline.

Does AWS use my data to train Bedrock models?

No. AWS does not use any customer inputs or outputs to train or improve Amazon Bedrock foundation models. Your data remains private and is encrypted in transit and at rest.

Jayendra's Cloud Certification Blog

Amazon Bedrock – Foundation Models & Generative AI

Amazon Bedrock – Fully Managed Foundation Models & Gen AI

Amazon Bedrock Overview

Supported Foundation Models

Model Providers Comparison

Amazon Nova Foundation Models

Amazon Nova 2 (December 2025)

Key Features

Model Access & Inference

Model Customization

Knowledge Bases (RAG)

Agents

Guardrails

Model Evaluation

Pricing Model

Security & Data Privacy

Integration with AWS Services

Amazon Bedrock vs Amazon SageMaker AI

When to Use Each

AWS Certification Exam Relevance

Key Exam Tips

Amazon Bedrock Practice Questions

Frequently Asked Questions

What is Amazon Bedrock?

How does Bedrock differ from SageMaker?

Does AWS use my data to train Bedrock models?

References