Table of Contents
hide
Google Cloud AI Services Cheat Sheet
- Google Cloud provides a comprehensive suite of AI and Machine Learning services spanning the full ML lifecycle — from data preparation and model training to deployment, inference, and responsible AI governance.
- In April 2026, Google rebranded Vertex AI as the Gemini Enterprise Agent Platform at Cloud Next ’26, consolidating all AI/ML services under an agent-first architecture.
- Google Cloud AI services are broadly categorized into: AI Platform (Vertex AI / Gemini Enterprise Agent Platform), Foundation Models (Gemini), Pre-trained APIs, Conversational AI, AI Infrastructure (TPUs, GPUs), and Responsible AI tools.
Vertex AI / Gemini Enterprise Agent Platform
- Vertex AI (now Gemini Enterprise Agent Platform since April 2026) is Google Cloud’s unified, fully managed ML platform for building, training, deploying, and scaling ML models and generative AI applications.
- Provides a single environment combining AutoML and custom training with no-code, low-code, and code-first approaches.
- Key components include:
- Vertex AI Workbench — managed Jupyter notebook environment for data exploration and ML development.
- Vertex AI Training — custom model training with distributed training support on GPUs and TPUs.
- Vertex AI Predictions — online and batch prediction endpoints with autoscaling.
- Vertex AI Pipelines — serverless ML workflow orchestration based on Kubeflow Pipelines or TFX.
- Vertex AI Model Registry — central repository to manage, version, and deploy models.
- Vertex AI Feature Store — managed feature storage for serving and sharing ML features at scale.
- Vertex AI Model Garden — catalog of 200+ foundation models including Gemini, Claude, Llama, and open-source models.
- Vertex AI Studio — UI for prompt engineering, model tuning, and testing generative AI models.
- Vertex AI Experiments — track, compare, and analyze ML experiments.
- Vertex AI Model Monitoring — detect data drift and model quality degradation in production.
- Supports custom containers (Docker) for training and serving with any ML framework (TensorFlow, PyTorch, JAX, XGBoost, scikit-learn).
- Provides pre-built containers for popular frameworks optimized for Google Cloud hardware.
- Integrates with BigQuery, Cloud Storage, Dataflow, and other Google Cloud data services.
- As of May 2026, Vertex AI has been fully migrated to Gemini Enterprise Agent Platform in the Google Cloud Console. All future updates are delivered through the Agent Platform.
Gemini (Foundation Model)
- Gemini is Google’s family of multimodal foundation models from Google DeepMind, capable of understanding and generating text, images, audio, video, and code.
- Gemini model family includes:
- Gemini 3 Pro — most capable model for complex reasoning, coding, and multimodal tasks.
- Gemini 3 Flash — optimized for speed and efficiency with near-Pro intelligence at lower cost.
- Gemini 3.5 Flash — latest model with Pro-level coding proficiency and parallel agentic execution at Flash-tier pricing.
- Gemini Nano — on-device model for mobile and edge deployments.
- Supports multimodal inputs — can process text, images, audio, video, and code in a single prompt.
- Offers a 1M+ token context window for processing large documents, codebases, and long videos.
- Supports function calling, grounding with Google Search, and tool use for agentic applications.
- Available through Vertex AI Studio, Vertex AI API, and Google AI Studio.
- Supports fine-tuning and distillation to customize models for specific use cases.
- Provides built-in safety filters with configurable thresholds for responsible deployment.
- Gemini for Google Cloud (formerly Duet AI) provides AI-powered assistance across Google Cloud Console, Cloud Code, BigQuery, and other services.
Vertex AI Agent Builder
- Vertex AI Agent Builder is Google Cloud’s comprehensive platform to build, scale, and govern reliable AI agents.
- Key components include:
- Agent Development Kit (ADK) — open-source, code-first framework for building multi-agent systems.
- Agent Studio — low-code visual builder for designing agent workflows.
- Agent Engine — managed runtime for deploying and scaling agents in production.
- Agent Garden — collection of ready-to-use agent samples and tools.
- Supports multi-agent orchestration where multiple agents collaborate on complex workflows.
- Provider-agnostic — supports Gemini, Claude, Llama, and hundreds of third-party models from Model Garden.
- Includes persistent memory, session management, and enterprise governance features.
- Integrates with Google Workspace, third-party APIs, and enterprise data sources.
- Supports the Agent-to-Agent (A2A) protocol for inter-agent communication across platforms.
Vertex AI Search
- Vertex AI Search (part of AI Applications) brings together deep information retrieval, NLP, and LLM processing to understand user intent and return highly relevant results.
- Goes beyond basic keyword matching using AI to deliver relevant results grounded in enterprise data.
- Supports multiple data sources — websites, unstructured documents, structured data, and Cloud Storage.
- Provides generative AI answers grounded in enterprise data with citations.
- Includes Vertex AI Search for Commerce (formerly Recommendations AI) for e-commerce with:
- AI-driven product rankings and catalog enhancements.
- Conversational Commerce agent for guiding users from intent to purchase.
- Personalized search results and recommendations optimized for revenue.
- Supports RAG (Retrieval Augmented Generation) patterns for grounding LLM responses in enterprise data.
- Provides out-of-the-box search widgets and APIs for quick integration.
Document AI
- Document AI is a fully managed platform for document understanding that uses ML and generative AI to extract, classify, and enrich data from documents.
- Supports structured, semi-structured, and unstructured documents (invoices, receipts, contracts, forms, IDs).
- Key capabilities:
- Document OCR — extract printed and handwritten text from documents and images.
- Form Parser — extract key-value pairs, tables, and checkboxes from forms.
- Specialized Processors — pre-trained models for invoices, receipts, bank statements, pay slips, W-2s, and procurement documents.
- Custom Document Extractor — train custom models for domain-specific documents.
- Document Splitter — classify and split multi-page documents.
- Document AI Warehouse — search, store, and govern documents at scale with AI-powered classification.
- Integrates with BigQuery, Cloud Storage, and Vertex AI Pipelines for end-to-end document processing workflows.
- Supports human-in-the-loop review for critical document processing.
- Processes documents asynchronously in batch or synchronously in real-time.
Vision AI
- Vision AI provides pre-trained models for image analysis and computer vision tasks via the Cloud Vision API.
- Key features:
- Label Detection — identify objects, locations, activities, animal species, and products in images.
- OCR (Text Detection) — extract printed and handwritten text from images.
- Face Detection — detect faces along with associated attributes (joy, sorrow, anger, surprise).
- Landmark Detection — identify popular natural and man-made landmarks.
- Logo Detection — detect popular product and brand logos.
- SafeSearch Detection — detect explicit content (adult, violence, medical, racy).
- Image Properties — detect dominant colors and crop hints.
- Object Localization — detect and locate multiple objects in an image with bounding polygons.
- Supports batch image annotation for processing large volumes of images.
- Provides a Product Search feature to find similar products in a product catalog.
- Imagen on Vertex AI — Google’s text-to-image generation model for creating and editing images from text prompts.
- Veo on Vertex AI — video generation model for creating videos from text and image prompts.
- Vision AI pre-trained API provides basic capabilities; for custom image classification or object detection, use AutoML on Vertex AI.
Cloud Speech-to-Text
- Speech-to-Text converts audio to text using Google’s deep learning neural network algorithms.
- Supports 125+ languages and variants with automatic language detection.
- Key features:
- Real-time Streaming — transcribe audio from a microphone or streaming source in real-time.
- Batch Recognition — transcribe pre-recorded audio files up to 480 minutes.
- Multi-channel Recognition — transcribe separate channels (e.g., caller and agent in a call center).
- Speaker Diarization — identify who said what in multi-speaker audio.
- Automatic Punctuation — automatically add punctuation to transcripts.
- Word-level Confidence — confidence scores for individual words.
- Speech Adaptation — boost recognition of domain-specific terms and phrases.
- Chirp — universal speech model with state-of-the-art accuracy across languages.
- Provides V2 API with improved accuracy using latest foundation models.
- Supports noise robustness for transcribing audio in noisy environments.
Cloud Text-to-Speech
- Text-to-Speech converts text into natural-sounding speech using Google’s AI.
- Offers 700+ voices across 50+ languages and variants, including Neural2, Studio, and WaveNet voices.
- Key features:
- WaveNet Voices — high-fidelity voices generated by DeepMind’s WaveNet model.
- Neural2 Voices — next-generation voices combining Tensor2Tensor with WaveNet for improved quality.
- Studio Voices — premium, human-like voices for professional applications.
- Custom Voice — create a unique voice using your own recordings.
- SSML Support — control pronunciation, speaking rate, pitch, and volume with Speech Synthesis Markup Language.
- Multi-speaker — generate audio with multiple distinct speakers in a single request.
- Supports audio output in MP3, OGG Opus, LINEAR16, and MULAW formats.
- Integrates with Dialogflow for voice-enabled conversational agents.
Cloud Natural Language AI
- Natural Language AI uses ML to extract insights from unstructured text.
- Key capabilities:
- Sentiment Analysis — understand the overall sentiment (positive/negative) of text at document and sentence level.
- Entity Analysis — identify entities (people, organizations, locations, events, products) and their types.
- Entity Sentiment Analysis — combine entity and sentiment analysis to understand sentiment about specific entities.
- Syntax Analysis — extract tokens and sentences, identify parts of speech, and create dependency parse trees.
- Content Classification — classify documents into 1,000+ predefined categories.
- Text Moderation — classify text into safety categories (toxic, insult, profanity, etc.).
- Supports multiple languages for all analysis features.
- Provides the Healthcare Natural Language API for extracting medical entities from clinical text.
- For custom text classification or entity extraction, use AutoML Natural Language on Vertex AI.
Cloud Translation AI
- Cloud Translation provides real-time language translation using neural machine translation (NMT).
- Two editions available:
- Translation API Basic (v2) — simple, quick translations for 100+ languages.
- Translation API Advanced (v3) — enterprise features including glossaries, custom models, and batch translation.
- Key features:
- AutoML Translation — train custom translation models with domain-specific terminology.
- Adaptive Translation — real-time customization using few-shot examples without training a full model.
- Glossaries — ensure consistent translation of domain-specific terms (brand names, product names).
- Batch Translation — translate large volumes of documents asynchronously.
- Language Detection — automatically detect the source language.
- Document Translation — translate documents while preserving formatting (PDF, DOCX).
- Supports 130+ languages for text translation.
- Integrates with Cloud Storage for batch processing and BigQuery for analytics.
Video Intelligence AI
- Video Intelligence API enables understanding of video content by analyzing stored and streaming video.
- Key features:
- Label Detection — recognize 20,000+ objects, places, and actions in video at shot, frame, or segment level.
- Shot Change Detection — detect scene transitions in video.
- Explicit Content Detection — identify inappropriate content in video.
- Speech Transcription — transcribe speech within video content.
- Text Detection (OCR) — detect and extract text appearing in video frames.
- Object Tracking — track objects across video frames with bounding boxes.
- Person Detection — detect people and track their poses in video.
- Face Detection — detect faces in video (without identification).
- Logo Detection — detect and track brand logos in video.
- Supports both stored video (Cloud Storage, URIs) and streaming video analysis.
- Provides rich metadata at video, shot, and frame levels for building searchable video archives.
- Integrates with Cloud Storage, BigQuery, and Pub/Sub for automated video processing pipelines.
Contact Center AI (CCAI)
- Contact Center AI Platform is a full-stack contact center solution for managing customer interactions across voice and digital channels.
- Key components:
- CCAI Platform — full CCaaS (Contact Center as a Service) with routing, queuing, and workforce management.
- Dialogflow CX Virtual Agents — AI-powered virtual agents that handle customer interactions before routing to human agents.
- Agent Assist — provides real-time suggestions, knowledge articles, and smart replies to human agents during conversations.
- CCAI Insights — analyzes call transcripts to identify call drivers, sentiment, and conversation topics at scale.
- Conversational Agents — new name for Dialogflow CX in the CCAI context (renamed 2025).
- Supports IVA-only deployments to add Google’s generative AI virtual agents without replacing existing contact center infrastructure.
- Integrates with third-party CRM and telephony systems (Genesys, Avaya, NICE, Cisco).
- Provides sentiment analysis, entity extraction, and intent detection for every conversation.
- Supports both voice and digital channels (chat, email, SMS, social media).
Dialogflow (CX and ES)
- Dialogflow is a natural language understanding platform for building conversational interfaces (chatbots, voice bots, IVR systems).
- Two editions available:
- Dialogflow CX (Conversational Agents) — enterprise-grade edition for complex, multi-turn conversations.
- Uses visual flow builder for designing conversation paths.
- Supports state-based conversation management with pages, flows, and transition routes.
- Provides built-in generative AI capabilities using Gemini for dynamic responses.
- Supports data store agents for grounding responses in enterprise data.
- Multi-language support with separate flows per language.
- Advanced analytics and debugging tools.
- Dialogflow ES (Essentials) — standard edition for simpler, single-turn or basic multi-turn conversations.
- Intent-based conversation model with contexts for state management.
- Suitable for small to medium chatbots and simple IVR systems.
- Simpler setup but less control over complex conversation flows.
- Dialogflow CX (Conversational Agents) — enterprise-grade edition for complex, multi-turn conversations.
- Dialogflow CX is recommended for new projects. ES is maintained but CX provides superior capabilities for enterprise use cases.
- Integrates with telephony partners, Google Chat, Slack, Facebook Messenger, Twilio, and custom channels.
- Supports webhook fulfillment for dynamic responses and backend integration.
Recommendations AI
- Recommendations AI (now part of Vertex AI Search for Commerce) delivers personalized product recommendations at scale using Google’s ML expertise.
- Key recommendation types:
- Recommended for You — personalized suggestions based on user browsing and purchase history.
- Others You May Like — similar product recommendations based on collective user behavior.
- Frequently Bought Together — complementary product suggestions for cross-selling.
- Similar Items — visually or categorically similar products.
- Recently Viewed — personalized recall of previously viewed items.
- Supports real-time user events for immediate personalization updates.
- Provides A/B testing capabilities to measure recommendation quality impact on revenue.
- Requires catalog data (products) and user events (views, add-to-cart, purchases) for model training.
- Models improve automatically as more user interaction data is collected.
- Recommendations AI has been consolidated into Vertex AI Search for Commerce / AI Commerce Search as of 2025.
Gemini for Google Cloud (formerly Duet AI)
- Gemini for Google Cloud is an AI-powered collaborator embedded across Google Cloud services to boost developer and operator productivity.
- Previously known as Duet AI for Google Cloud (rebranded to Gemini in February 2024).
- Key capabilities across services:
- Gemini Code Assist — AI-powered code completion, generation, and explanation in Cloud Shell Editor, VS Code, JetBrains IDEs, and Cloud Workstations.
- Gemini in BigQuery — generate SQL queries, explain results, suggest optimizations using natural language.
- Gemini in Cloud Console — natural language assistance for cloud operations, troubleshooting, and configuration.
- Gemini in Looker — generate visualizations and formulas from natural language.
- Gemini Cloud Assist — AI-driven recommendations for design, operations, and troubleshooting.
- Gemini in Security — summarize security findings, explain threats, and suggest remediation in Security Command Center.
- Gemini in Databases — generate schemas, optimize queries, and explain database operations (Cloud SQL, Spanner, AlloyDB).
- Gemini Code Assist supports 20+ programming languages with full codebase context awareness.
- Available in two tiers: Gemini Code Assist Standard and Gemini Code Assist Enterprise with codebase customization.
AutoML
- AutoML enables training custom, high-quality ML models with minimal ML expertise using transfer learning and neural architecture search.
- AutoML is now integrated into Vertex AI and supports:
- AutoML Image Classification — classify images into custom categories.
- AutoML Object Detection — detect and locate custom objects in images.
- AutoML Text Classification — classify text documents into custom categories.
- AutoML Entity Extraction — extract custom entities from text.
- AutoML Sentiment Analysis — analyze sentiment with custom models.
- AutoML Translation — train custom neural machine translation models.
- AutoML Video Classification — classify video segments.
- AutoML Video Object Tracking — track custom objects in video.
- AutoML Tabular — train models on structured/tabular data for classification, regression, and forecasting.
- Uses Google’s state-of-the-art transfer learning and neural architecture search technology.
- Requires labeled training data — supports human labeling through Vertex AI Data Labeling service.
- Provides model evaluation metrics (precision, recall, F1, confusion matrix) before deployment.
- Trained models can be exported for edge deployment (TensorFlow Lite, TF.js, Core ML) or served via Vertex AI Endpoints.
- Standalone AutoML products (automl.googleapis.com) have been migrated to Vertex AI. Use Vertex AI for all new AutoML workloads.
Cloud TPUs (Tensor Processing Units)
- Cloud TPUs are Google’s custom-designed AI accelerators (ASICs) optimized for training and inference of large ML models using TensorFlow, PyTorch, and JAX.
- TPU generations available on Google Cloud:
- TPU v5e — cost-efficient accelerator optimized for training and serving transformer models, text-to-image, and CNNs. 256 chips per Pod.
- TPU v6e (Trillium) — 6th generation with 4.7x peak compute improvement over v5e, doubled HBM capacity/bandwidth, and doubled ICI bandwidth. 256 chips per Pod.
- TPU v5p — high-performance variant optimized for large-scale training workloads.
- TPU7x (Ironwood) — 7th generation, Google’s most powerful TPU:
- 4.6 petaFLOPS of peak FP8 compute per chip.
- 192 GiB HBM3e memory per chip with 7.4 TB/s bandwidth.
- 10x peak performance improvement over v5p.
- 4x better performance per chip vs. v6e for training and inference.
- 9,216-chip superpods delivering 42.5 exaFLOPS of FP8 compute.
- 1.77 PB of directly accessible HBM capacity per superpod.
- Each chip contains two TensorCores and four SparseCores.
- TPUs are connected via high-speed Inter-Chip Interconnect (ICI) for efficient distributed training.
- Support multislice training to scale beyond a single TPU Pod for training frontier models.
- Available in Google Kubernetes Engine (GKE), Vertex AI, and Cloud TPU VMs.
- Optimized for ML frameworks: JAX (best performance), TensorFlow, and PyTorch/XLA.
- Support Queued Resources for managing TPU allocation in high-demand scenarios.
- Only TPU v5e, v6e, and TPU7x are supported for Vertex AI model deployment. Earlier generations are deprecated for new workloads.
AI Infrastructure (GPUs and VMs)
- Google Cloud provides GPU-accelerated VMs optimized for AI/ML workloads as part of AI Hypercomputer — a unified architecture integrating hardware, software, and flexible consumption models.
- Key GPU VM families:
- A3 Mega VMs — powered by 8x NVIDIA H100 80GB GPUs with 3.2 Tbps GPU-to-GPU networking. Optimized for large-scale training.
- A3 Ultra VMs — powered by 8x NVIDIA H200 141GB GPUs (GA since late 2024). Superior memory bandwidth for large model training and inference.
- A2 Ultra VMs — powered by NVIDIA A100 80GB GPUs.
- G2 VMs — powered by NVIDIA L4 GPUs, optimized for inference and smaller training workloads.
- Hypercompute Cluster — highly scalable clustering system for multi-node GPU workloads (GA 2024).
- Key features:
- Dynamic Workload Scheduler — efficiently schedule and manage GPU/TPU workloads.
- Multislice/Multihost Training — scale training across multiple VMs/TPU slices.
- NVIDIA NVLink and NVSwitch — high-bandwidth GPU-to-GPU interconnect within nodes.
- GPUDirect-TCPXO — optimized networking stack for distributed GPU training.
- Supports JetStream and vLLM for optimized LLM serving on both TPUs and GPUs.
- Available with committed use discounts (CUDs) and on-demand pricing.
- Integrates with GKE for container-orchestrated AI workloads and Vertex AI for managed training/serving.
Responsible AI
- Google Cloud provides tools and frameworks for developing AI responsibly, aligned with Google’s AI Principles.
- Key Responsible AI capabilities:
- Vertex Explainable AI (XAI) — understand model predictions through:
- Feature-based Explanations — feature attributions showing how each input feature contributed to a prediction (Shapley values, Integrated Gradients, XRAI).
- Example-based Explanations — identify training examples most similar to the input being explained.
- Model Cards — structured documentation describing model performance, intended use, limitations, and ethical considerations. Supports generating Model Cards automatically via Vertex AI Pipelines.
- Fairness Indicators — evaluate model performance across different demographic groups to identify potential bias.
- Data Cards — document dataset characteristics, collection methodology, and known biases.
- Safety Filters — configurable content filtering for generative AI models across categories (hate speech, harassment, sexually explicit, dangerous content).
- Guardrails — set boundaries on model behavior with system instructions and safety settings.
- Model Evaluation — evaluate generative models on safety, quality, and groundedness metrics.
- Vertex Explainable AI (XAI) — understand model predictions through:
- Safety attribute scoring available in all Vertex AI generative AI APIs with configurable confidence thresholds.
- Vertex AI provides built-in content filtering that can be tuned per use case.
- Supports Responsible AI practices throughout the ML lifecycle: data collection, training, evaluation, deployment, and monitoring.
- Google publishes annual Responsible AI Progress Reports detailing governance, safety testing, and red-teaming practices.
Google Cloud vs AWS AI Services Comparison
| Category | Google Cloud Service | AWS Equivalent |
|---|---|---|
| ML Platform | Vertex AI / Gemini Enterprise Agent Platform | Amazon SageMaker |
| Foundation Model | Gemini (3 Pro, 3 Flash, 3.5 Flash, Nano) | Amazon Nova, Claude (via Bedrock) |
| Model Hub / API | Vertex AI Model Garden | Amazon Bedrock |
| AI Agent Builder | Vertex AI Agent Builder (ADK, Agent Engine) | Amazon Bedrock Agents |
| Enterprise Search | Vertex AI Search | Amazon Kendra / Amazon Q Business |
| AI Code Assistant | Gemini Code Assist | Amazon Q Developer (formerly CodeWhisperer) |
| Document Processing | Document AI | Amazon Textract |
| Image Analysis | Cloud Vision AI | Amazon Rekognition (Images) |
| Image Generation | Imagen on Vertex AI | Amazon Titan Image Generator / Amazon Nova Canvas |
| Video Generation | Veo on Vertex AI | Amazon Nova Reel |
| Video Analysis | Video Intelligence AI | Amazon Rekognition Video |
| Speech-to-Text | Cloud Speech-to-Text | Amazon Transcribe |
| Text-to-Speech | Cloud Text-to-Speech | Amazon Polly |
| NLP / Text Analysis | Cloud Natural Language AI | Amazon Comprehend |
| Translation | Cloud Translation AI | Amazon Translate |
| Conversational AI | Dialogflow CX (Conversational Agents) | Amazon Lex |
| Contact Center AI | CCAI Platform | Amazon Connect |
| Recommendations | Vertex AI Search for Commerce / Recommendations AI | Amazon Personalize |
| AutoML | Vertex AI AutoML | Amazon SageMaker Autopilot |
| Custom AI Chips | Cloud TPUs (v5e, v6e Trillium, TPU7x Ironwood) | AWS Trainium / Inferentia |
| GPU VMs (Training) | A3 Ultra (H200), A3 Mega (H100) | P5 (H100), P5e (H200) instances |
| GPU VMs (Inference) | G2 (L4 GPUs) | G5 (A10G), Inf2 (Inferentia2) |
| AI Infrastructure Platform | AI Hypercomputer | AWS AI Infrastructure (UltraClusters) |
| Explainability | Vertex Explainable AI | SageMaker Clarify |
| Model Documentation | Model Cards | SageMaker Model Cards |
| Bias Detection | Fairness Indicators | SageMaker Clarify (Bias Detection) |
| Forecasting | Vertex AI Forecasting (AutoML Tabular) | Amazon Forecast |
| Data Labeling | Vertex AI Data Labeling | SageMaker Ground Truth |
| Feature Store | Vertex AI Feature Store | SageMaker Feature Store |
| ML Pipelines | Vertex AI Pipelines | SageMaker Pipelines |
| Notebook Environment | Vertex AI Workbench | SageMaker Studio |