Google Cloud provides a comprehensive suite of AI and Machine Learning services spanning the full ML lifecycle — from data preparation and model training to deployment, inference, and responsible AI governance.

In April 2026, Google rebranded Vertex AI as the Gemini Enterprise Agent Platform at Cloud Next ’26, consolidating all AI/ML services under an agent-first architecture.
Google Cloud AI services are broadly categorized into: AI Platform (Vertex AI / Gemini Enterprise Agent Platform), Foundation Models (Gemini), Pre-trained APIs, Conversational AI, AI Infrastructure (TPUs, GPUs), and Responsible AI tools.

Vertex AI / Gemini Enterprise Agent Platform

Vertex AI (now Gemini Enterprise Agent Platform since April 2026) is Google Cloud’s unified, fully managed ML platform for building, training, deploying, and scaling ML models and generative AI applications.
Provides a single environment combining AutoML and custom training with no-code, low-code, and code-first approaches.
Key components include:
- Vertex AI Workbench — managed Jupyter notebook environment for data exploration and ML development.
- Vertex AI Training — custom model training with distributed training support on GPUs and TPUs.
- Vertex AI Predictions — online and batch prediction endpoints with autoscaling.
- Vertex AI Pipelines — serverless ML workflow orchestration based on Kubeflow Pipelines or TFX.
- Vertex AI Model Registry — central repository to manage, version, and deploy models.
- Vertex AI Feature Store — managed feature storage for serving and sharing ML features at scale.
- Vertex AI Model Garden — catalog of 200+ foundation models including Gemini, Claude, Llama, and open-source models.
- Vertex AI Studio — UI for prompt engineering, model tuning, and testing generative AI models.
- Vertex AI Experiments — track, compare, and analyze ML experiments.
- Vertex AI Model Monitoring — detect data drift and model quality degradation in production.
Supports custom containers (Docker) for training and serving with any ML framework (TensorFlow, PyTorch, JAX, XGBoost, scikit-learn).
Provides pre-built containers for popular frameworks optimized for Google Cloud hardware.

Integrates with BigQuery, Cloud Storage, Dataflow, and other Google Cloud data services.
As of May 2026, Vertex AI has been fully migrated to Gemini Enterprise Agent Platform in the Google Cloud Console. All future updates are delivered through the Agent Platform.

Gemini (Foundation Model)

Gemini is Google’s family of multimodal foundation models from Google DeepMind, capable of understanding and generating text, images, audio, video, and code.

Gemini model family includes:
- Gemini 3 Pro — most capable model for complex reasoning, coding, and multimodal tasks.
- Gemini 3 Flash — optimized for speed and efficiency with near-Pro intelligence at lower cost.
- Gemini 3.5 Flash — latest model with Pro-level coding proficiency and parallel agentic execution at Flash-tier pricing.
- Gemini Nano — on-device model for mobile and edge deployments.
Supports multimodal inputs — can process text, images, audio, video, and code in a single prompt.

Offers a 1M+ token context window for processing large documents, codebases, and long videos.
Supports function calling, grounding with Google Search, and tool use for agentic applications.
Available through Vertex AI Studio, Vertex AI API, and Google AI Studio.

Supports fine-tuning and distillation to customize models for specific use cases.
Provides built-in safety filters with configurable thresholds for responsible deployment.
Gemini for Google Cloud (formerly Duet AI) provides AI-powered assistance across Google Cloud Console, Cloud Code, BigQuery, and other services.

Vertex AI Agent Builder

Vertex AI Agent Builder is Google Cloud’s comprehensive platform to build, scale, and govern reliable AI agents.
Key components include:
- Agent Development Kit (ADK) — open-source, code-first framework for building multi-agent systems.
- Agent Studio — low-code visual builder for designing agent workflows.
- Agent Engine — managed runtime for deploying and scaling agents in production.
- Agent Garden — collection of ready-to-use agent samples and tools.

Supports multi-agent orchestration where multiple agents collaborate on complex workflows.
Provider-agnostic — supports Gemini, Claude, Llama, and hundreds of third-party models from Model Garden.
Includes persistent memory, session management, and enterprise governance features.
Integrates with Google Workspace, third-party APIs, and enterprise data sources.

Supports the Agent-to-Agent (A2A) protocol for inter-agent communication across platforms.

Vertex AI Search

Vertex AI Search (part of AI Applications) brings together deep information retrieval, NLP, and LLM processing to understand user intent and return highly relevant results.
Goes beyond basic keyword matching using AI to deliver relevant results grounded in enterprise data.

Supports multiple data sources — websites, unstructured documents, structured data, and Cloud Storage.
Provides generative AI answers grounded in enterprise data with citations.
Includes Vertex AI Search for Commerce (formerly Recommendations AI) for e-commerce with:
- AI-driven product rankings and catalog enhancements.
- Conversational Commerce agent for guiding users from intent to purchase.
- Personalized search results and recommendations optimized for revenue.
Supports RAG (Retrieval Augmented Generation) patterns for grounding LLM responses in enterprise data.
Provides out-of-the-box search widgets and APIs for quick integration.

Document AI

Document AI is a fully managed platform for document understanding that uses ML and generative AI to extract, classify, and enrich data from documents.

Supports structured, semi-structured, and unstructured documents (invoices, receipts, contracts, forms, IDs).
Key capabilities:
- Document OCR — extract printed and handwritten text from documents and images.
- Form Parser — extract key-value pairs, tables, and checkboxes from forms.
- Specialized Processors — pre-trained models for invoices, receipts, bank statements, pay slips, W-2s, and procurement documents.
- Custom Document Extractor — train custom models for domain-specific documents.
- Document Splitter — classify and split multi-page documents.
- Document AI Warehouse — search, store, and govern documents at scale with AI-powered classification.
Integrates with BigQuery, Cloud Storage, and Vertex AI Pipelines for end-to-end document processing workflows.
Supports human-in-the-loop review for critical document processing.
Processes documents asynchronously in batch or synchronously in real-time.

Vision AI

Vision AI provides pre-trained models for image analysis and computer vision tasks via the Cloud Vision API.

Key features:
- Label Detection — identify objects, locations, activities, animal species, and products in images.
- OCR (Text Detection) — extract printed and handwritten text from images.
- Face Detection — detect faces along with associated attributes (joy, sorrow, anger, surprise).
- Landmark Detection — identify popular natural and man-made landmarks.
- Logo Detection — detect popular product and brand logos.
- SafeSearch Detection — detect explicit content (adult, violence, medical, racy).
- Image Properties — detect dominant colors and crop hints.
- Object Localization — detect and locate multiple objects in an image with bounding polygons.

Supports batch image annotation for processing large volumes of images.
Provides a Product Search feature to find similar products in a product catalog.
Imagen on Vertex AI — Google’s text-to-image generation model for creating and editing images from text prompts.

Veo on Vertex AI — video generation model for creating videos from text and image prompts.
Vision AI pre-trained API provides basic capabilities; for custom image classification or object detection, use AutoML on Vertex AI.

Cloud Speech-to-Text

Speech-to-Text converts audio to text using Google’s deep learning neural network algorithms.

Supports 125+ languages and variants with automatic language detection.
Key features:
- Real-time Streaming — transcribe audio from a microphone or streaming source in real-time.
- Batch Recognition — transcribe pre-recorded audio files up to 480 minutes.
- Multi-channel Recognition — transcribe separate channels (e.g., caller and agent in a call center).
- Speaker Diarization — identify who said what in multi-speaker audio.
- Automatic Punctuation — automatically add punctuation to transcripts.
- Word-level Confidence — confidence scores for individual words.
- Speech Adaptation — boost recognition of domain-specific terms and phrases.
- Chirp — universal speech model with state-of-the-art accuracy across languages.
Provides V2 API with improved accuracy using latest foundation models.
Supports noise robustness for transcribing audio in noisy environments.

Cloud Text-to-Speech

Text-to-Speech converts text into natural-sounding speech using Google’s AI.
Offers 700+ voices across 50+ languages and variants, including Neural2, Studio, and WaveNet voices.

Key features:
- WaveNet Voices — high-fidelity voices generated by DeepMind’s WaveNet model.
- Neural2 Voices — next-generation voices combining Tensor2Tensor with WaveNet for improved quality.
- Studio Voices — premium, human-like voices for professional applications.
- Custom Voice — create a unique voice using your own recordings.
- SSML Support — control pronunciation, speaking rate, pitch, and volume with Speech Synthesis Markup Language.
- Multi-speaker — generate audio with multiple distinct speakers in a single request.

Supports audio output in MP3, OGG Opus, LINEAR16, and MULAW formats.
Integrates with Dialogflow for voice-enabled conversational agents.

Cloud Natural Language AI

Natural Language AI uses ML to extract insights from unstructured text.
Key capabilities:
- Sentiment Analysis — understand the overall sentiment (positive/negative) of text at document and sentence level.
- Entity Analysis — identify entities (people, organizations, locations, events, products) and their types.
- Entity Sentiment Analysis — combine entity and sentiment analysis to understand sentiment about specific entities.
- Syntax Analysis — extract tokens and sentences, identify parts of speech, and create dependency parse trees.
- Content Classification — classify documents into 1,000+ predefined categories.
- Text Moderation — classify text into safety categories (toxic, insult, profanity, etc.).
Supports multiple languages for all analysis features.
Provides the Healthcare Natural Language API for extracting medical entities from clinical text.

For custom text classification or entity extraction, use AutoML Natural Language on Vertex AI.

Cloud Translation AI

Cloud Translation provides real-time language translation using neural machine translation (NMT).
Two editions available:
- Translation API Basic (v2) — simple, quick translations for 100+ languages.
- Translation API Advanced (v3) — enterprise features including glossaries, custom models, and batch translation.
Key features:
- AutoML Translation — train custom translation models with domain-specific terminology.
- Adaptive Translation — real-time customization using few-shot examples without training a full model.
- Glossaries — ensure consistent translation of domain-specific terms (brand names, product names).
- Batch Translation — translate large volumes of documents asynchronously.
- Language Detection — automatically detect the source language.
- Document Translation — translate documents while preserving formatting (PDF, DOCX).
Supports 130+ languages for text translation.
Integrates with Cloud Storage for batch processing and BigQuery for analytics.

Video Intelligence AI

Video Intelligence API enables understanding of video content by analyzing stored and streaming video.

Key features:
- Label Detection — recognize 20,000+ objects, places, and actions in video at shot, frame, or segment level.
- Shot Change Detection — detect scene transitions in video.
- Explicit Content Detection — identify inappropriate content in video.
- Speech Transcription — transcribe speech within video content.
- Text Detection (OCR) — detect and extract text appearing in video frames.
- Object Tracking — track objects across video frames with bounding boxes.
- Person Detection — detect people and track their poses in video.
- Face Detection — detect faces in video (without identification).
- Logo Detection — detect and track brand logos in video.

Supports both stored video (Cloud Storage, URIs) and streaming video analysis.
Provides rich metadata at video, shot, and frame levels for building searchable video archives.
Integrates with Cloud Storage, BigQuery, and Pub/Sub for automated video processing pipelines.

Contact Center AI (CCAI)

Contact Center AI Platform is a full-stack contact center solution for managing customer interactions across voice and digital channels.
Key components:
- CCAI Platform — full CCaaS (Contact Center as a Service) with routing, queuing, and workforce management.
- Dialogflow CX Virtual Agents — AI-powered virtual agents that handle customer interactions before routing to human agents.
- Agent Assist — provides real-time suggestions, knowledge articles, and smart replies to human agents during conversations.
- CCAI Insights — analyzes call transcripts to identify call drivers, sentiment, and conversation topics at scale.
- Conversational Agents — new name for Dialogflow CX in the CCAI context (renamed 2025).
Supports IVA-only deployments to add Google’s generative AI virtual agents without replacing existing contact center infrastructure.
Integrates with third-party CRM and telephony systems (Genesys, Avaya, NICE, Cisco).

Provides sentiment analysis, entity extraction, and intent detection for every conversation.
Supports both voice and digital channels (chat, email, SMS, social media).

Dialogflow (CX and ES)

Dialogflow is a natural language understanding platform for building conversational interfaces (chatbots, voice bots, IVR systems).

Two editions available:
- Dialogflow CX (Conversational Agents) — enterprise-grade edition for complex, multi-turn conversations.
  - Uses visual flow builder for designing conversation paths.
  - Supports state-based conversation management with pages, flows, and transition routes.
  - Provides built-in generative AI capabilities using Gemini for dynamic responses.
  - Supports data store agents for grounding responses in enterprise data.
  - Multi-language support with separate flows per language.
  - Advanced analytics and debugging tools.
- Dialogflow ES (Essentials) — standard edition for simpler, single-turn or basic multi-turn conversations.
  - Intent-based conversation model with contexts for state management.
  - Suitable for small to medium chatbots and simple IVR systems.
  - Simpler setup but less control over complex conversation flows.
Dialogflow CX is recommended for new projects. ES is maintained but CX provides superior capabilities for enterprise use cases.
Integrates with telephony partners, Google Chat, Slack, Facebook Messenger, Twilio, and custom channels.

Supports webhook fulfillment for dynamic responses and backend integration.

Recommendations AI

Recommendations AI (now part of Vertex AI Search for Commerce) delivers personalized product recommendations at scale using Google’s ML expertise.
Key recommendation types:
- Recommended for You — personalized suggestions based on user browsing and purchase history.
- Others You May Like — similar product recommendations based on collective user behavior.
- Frequently Bought Together — complementary product suggestions for cross-selling.
- Similar Items — visually or categorically similar products.
- Recently Viewed — personalized recall of previously viewed items.
Supports real-time user events for immediate personalization updates.
Provides A/B testing capabilities to measure recommendation quality impact on revenue.
Requires catalog data (products) and user events (views, add-to-cart, purchases) for model training.

Models improve automatically as more user interaction data is collected.
Recommendations AI has been consolidated into Vertex AI Search for Commerce / AI Commerce Search as of 2025.

Gemini for Google Cloud (formerly Duet AI)

Gemini for Google Cloud is an AI-powered collaborator embedded across Google Cloud services to boost developer and operator productivity.

Previously known as Duet AI for Google Cloud (rebranded to Gemini in February 2024).
Key capabilities across services:
- Gemini Code Assist — AI-powered code completion, generation, and explanation in Cloud Shell Editor, VS Code, JetBrains IDEs, and Cloud Workstations.
- Gemini in BigQuery — generate SQL queries, explain results, suggest optimizations using natural language.
- Gemini in Cloud Console — natural language assistance for cloud operations, troubleshooting, and configuration.
- Gemini in Looker — generate visualizations and formulas from natural language.
- Gemini Cloud Assist — AI-driven recommendations for design, operations, and troubleshooting.
- Gemini in Security — summarize security findings, explain threats, and suggest remediation in Security Command Center.
- Gemini in Databases — generate schemas, optimize queries, and explain database operations (Cloud SQL, Spanner, AlloyDB).

Gemini Code Assist supports 20+ programming languages with full codebase context awareness.
Available in two tiers: Gemini Code Assist Standard and Gemini Code Assist Enterprise with codebase customization.

AutoML

AutoML enables training custom, high-quality ML models with minimal ML expertise using transfer learning and neural architecture search.
AutoML is now integrated into Vertex AI and supports:
- AutoML Image Classification — classify images into custom categories.
- AutoML Object Detection — detect and locate custom objects in images.
- AutoML Text Classification — classify text documents into custom categories.
- AutoML Entity Extraction — extract custom entities from text.
- AutoML Sentiment Analysis — analyze sentiment with custom models.
- AutoML Translation — train custom neural machine translation models.
- AutoML Video Classification — classify video segments.
- AutoML Video Object Tracking — track custom objects in video.
- AutoML Tabular — train models on structured/tabular data for classification, regression, and forecasting.
Uses Google’s state-of-the-art transfer learning and neural architecture search technology.
Requires labeled training data — supports human labeling through Vertex AI Data Labeling service.
Provides model evaluation metrics (precision, recall, F1, confusion matrix) before deployment.
Trained models can be exported for edge deployment (TensorFlow Lite, TF.js, Core ML) or served via Vertex AI Endpoints.
Standalone AutoML products (automl.googleapis.com) have been migrated to Vertex AI. Use Vertex AI for all new AutoML workloads.

Cloud TPUs (Tensor Processing Units)

Cloud TPUs are Google’s custom-designed AI accelerators (ASICs) optimized for training and inference of large ML models using TensorFlow, PyTorch, and JAX.
TPU generations available on Google Cloud:
- TPU v5e — cost-efficient accelerator optimized for training and serving transformer models, text-to-image, and CNNs. 256 chips per Pod.
- TPU v6e (Trillium) — 6th generation with 4.7x peak compute improvement over v5e, doubled HBM capacity/bandwidth, and doubled ICI bandwidth. 256 chips per Pod.
- TPU v5p — high-performance variant optimized for large-scale training workloads.
- TPU7x (Ironwood) — 7th generation, Google’s most powerful TPU:
  - 4.6 petaFLOPS of peak FP8 compute per chip.
  - 192 GiB HBM3e memory per chip with 7.4 TB/s bandwidth.
  - 10x peak performance improvement over v5p.
  - 4x better performance per chip vs. v6e for training and inference.
  - 9,216-chip superpods delivering 42.5 exaFLOPS of FP8 compute.
  - 1.77 PB of directly accessible HBM capacity per superpod.
  - Each chip contains two TensorCores and four SparseCores.
TPUs are connected via high-speed Inter-Chip Interconnect (ICI) for efficient distributed training.
Support multislice training to scale beyond a single TPU Pod for training frontier models.
Available in Google Kubernetes Engine (GKE), Vertex AI, and Cloud TPU VMs.
Optimized for ML frameworks: JAX (best performance), TensorFlow, and PyTorch/XLA.
Support Queued Resources for managing TPU allocation in high-demand scenarios.
Only TPU v5e, v6e, and TPU7x are supported for Vertex AI model deployment. Earlier generations are deprecated for new workloads.

AI Infrastructure (GPUs and VMs)

Google Cloud provides GPU-accelerated VMs optimized for AI/ML workloads as part of AI Hypercomputer — a unified architecture integrating hardware, software, and flexible consumption models.
Key GPU VM families:
- A3 Mega VMs — powered by 8x NVIDIA H100 80GB GPUs with 3.2 Tbps GPU-to-GPU networking. Optimized for large-scale training.
- A3 Ultra VMs — powered by 8x NVIDIA H200 141GB GPUs (GA since late 2024). Superior memory bandwidth for large model training and inference.
- A2 Ultra VMs — powered by NVIDIA A100 80GB GPUs.
- G2 VMs — powered by NVIDIA L4 GPUs, optimized for inference and smaller training workloads.
Hypercompute Cluster — highly scalable clustering system for multi-node GPU workloads (GA 2024).
Key features:
- Dynamic Workload Scheduler — efficiently schedule and manage GPU/TPU workloads.
- Multislice/Multihost Training — scale training across multiple VMs/TPU slices.
- NVIDIA NVLink and NVSwitch — high-bandwidth GPU-to-GPU interconnect within nodes.
- GPUDirect-TCPXO — optimized networking stack for distributed GPU training.
Supports JetStream and vLLM for optimized LLM serving on both TPUs and GPUs.
Available with committed use discounts (CUDs) and on-demand pricing.
Integrates with GKE for container-orchestrated AI workloads and Vertex AI for managed training/serving.

Responsible AI

Google Cloud provides tools and frameworks for developing AI responsibly, aligned with Google’s AI Principles.
Key Responsible AI capabilities:
- Vertex Explainable AI (XAI) — understand model predictions through:
  - Feature-based Explanations — feature attributions showing how each input feature contributed to a prediction (Shapley values, Integrated Gradients, XRAI).
  - Example-based Explanations — identify training examples most similar to the input being explained.
- Model Cards — structured documentation describing model performance, intended use, limitations, and ethical considerations. Supports generating Model Cards automatically via Vertex AI Pipelines.
- Fairness Indicators — evaluate model performance across different demographic groups to identify potential bias.
- Data Cards — document dataset characteristics, collection methodology, and known biases.
- Safety Filters — configurable content filtering for generative AI models across categories (hate speech, harassment, sexually explicit, dangerous content).
- Guardrails — set boundaries on model behavior with system instructions and safety settings.
- Model Evaluation — evaluate generative models on safety, quality, and groundedness metrics.
Safety attribute scoring available in all Vertex AI generative AI APIs with configurable confidence thresholds.
Vertex AI provides built-in content filtering that can be tuned per use case.
Supports Responsible AI practices throughout the ML lifecycle: data collection, training, evaluation, deployment, and monitoring.
Google publishes annual Responsible AI Progress Reports detailing governance, safety testing, and red-teaming practices.

Google Cloud vs AWS AI Services Comparison

Category	Google Cloud Service	AWS Equivalent
ML Platform	Vertex AI / Gemini Enterprise Agent Platform	Amazon SageMaker
Foundation Model	Gemini (3 Pro, 3 Flash, 3.5 Flash, Nano)	Amazon Nova, Claude (via Bedrock)
Model Hub / API	Vertex AI Model Garden	Amazon Bedrock
AI Agent Builder	Vertex AI Agent Builder (ADK, Agent Engine)	Amazon Bedrock Agents
Enterprise Search	Vertex AI Search	Amazon Kendra / Amazon Q Business
AI Code Assistant	Gemini Code Assist	Amazon Q Developer (formerly CodeWhisperer)
Document Processing	Document AI	Amazon Textract
Image Analysis	Cloud Vision AI	Amazon Rekognition (Images)
Image Generation	Imagen on Vertex AI	Amazon Titan Image Generator / Amazon Nova Canvas
Video Generation	Veo on Vertex AI	Amazon Nova Reel
Video Analysis	Video Intelligence AI	Amazon Rekognition Video
Speech-to-Text	Cloud Speech-to-Text	Amazon Transcribe
Text-to-Speech	Cloud Text-to-Speech	Amazon Polly
NLP / Text Analysis	Cloud Natural Language AI	Amazon Comprehend
Translation	Cloud Translation AI	Amazon Translate
Conversational AI	Dialogflow CX (Conversational Agents)	Amazon Lex
Contact Center AI	CCAI Platform	Amazon Connect
Recommendations	Vertex AI Search for Commerce / Recommendations AI	Amazon Personalize
AutoML	Vertex AI AutoML	Amazon SageMaker Autopilot
Custom AI Chips	Cloud TPUs (v5e, v6e Trillium, TPU7x Ironwood)	AWS Trainium / Inferentia
GPU VMs (Training)	A3 Ultra (H200), A3 Mega (H100)	P5 (H100), P5e (H200) instances
GPU VMs (Inference)	G2 (L4 GPUs)	G5 (A10G), Inf2 (Inferentia2)
AI Infrastructure Platform	AI Hypercomputer	AWS AI Infrastructure (UltraClusters)
Explainability	Vertex Explainable AI	SageMaker Clarify
Model Documentation	Model Cards	SageMaker Model Cards
Bias Detection	Fairness Indicators	SageMaker Clarify (Bias Detection)
Forecasting	Vertex AI Forecasting (AutoML Tabular)	Amazon Forecast
Data Labeling	Vertex AI Data Labeling	SageMaker Ground Truth
Feature Store	Vertex AI Feature Store	SageMaker Feature Store
ML Pipelines	Vertex AI Pipelines	SageMaker Pipelines
Notebook Environment	Vertex AI Workbench	SageMaker Studio