Google

Explore all 24 AI models from Google. Compare benchmarks, pricing, and capabilities across the full model lineup.

deepmind.google

Total Models: 24

Avg Benchmark: 62.8

Open Source: 10

Modalities: 5

Top Performing Model

Gemini 3.1 Pro

Avg benchmark score: 93.5

Most Affordable Model

Gemma 3 1B

Input price: $0.02/M tokens

All Models

24 models

Gemini 3.1 Pro

Google

Google's most capable model. 94.3% on GPQA Diamond, 80.6% on SWE-bench, 77.1% on ARC-AGI-2. #1 on 12 of 18 tracked benchmarks.

text, image, audio, video, code

Input

$2.00/M

Output

$12.00/M

Context

1.0M

Gemini 3 Flash

Google

Google's frontier-class model at Flash-level latency and cost. 90.4% on GPQA Diamond, 78% on SWE-bench, 1M context window.

text, image, audio, video, code

Input

$0.50/M

Output

$3.00/M

Context

1.0M

Gemini 3 Pro

Google

Most powerful Gemini model with native multimodal understanding. Supports adjustable reasoning depth via the thinking_level parameter.

text, image, audio, video, code

Input

$3.50/M

Output

$10.50/M

Context

1.0M

Gemini 3 Deep Think

Google

Specialized reasoning model designed for science, research, and complex engineering challenges.

text, image, audio, video

Input

$5.00/M

Output

$15.00/M

Context

1.0M

Veo 3.1

Google

An enhanced iteration of Google DeepMind's Veo series. It produces 8-second clips that can be extended up to 148 seconds through iterative generation. Veo 3.1 improves temporal consistency over long sequences, delivers higher-resolution output, and refines audio synchronization for extended storytelling and commercial content production.

Input

$3.00/M

Output

$80.00/M

Gemini 2.5 Flash Image

Google

A multimodal extension of Google's Gemini 2.5 Flash model that adds native image generation and editing capabilities alongside text understanding. This model enables conversational image creation, iterative visual refinement, and combined text-image output within a single unified interface, making it particularly effective for design iteration and creative brainstorming workflows.

Input

$0.15/M

Output

$30.00/M

Imagen 4

Google

Google DeepMind's fourth-generation image synthesis model capable of producing images up to 2K resolution with exceptional photorealism and compositional accuracy. Imagen 4 includes SynthID watermarking by default for responsible AI deployment, supports advanced inpainting and outpainting, and demonstrates industry-leading performance on text rendering and spatial reasoning tasks.

Input

$4.00/M

Output

$20.00/M

Veo 3

Google

Google DeepMind's flagship video generation model that natively produces joint audio-visual output in a single pass. Veo 3 leverages a Latent Diffusion Transformer to generate high-fidelity clips with synchronized dialogue, sound effects, and ambient audio without requiring a separate audio model. It demonstrates strong physical understanding and prompt adherence across diverse cinematic styles.

Input

$5.00/M

Output

$150.00/M

Gemini 2.5 Flash

Google

Google's fast and cost-efficient thinking model with strong reasoning capabilities.

text, image, audio, video

Input

$0.15/M

Output

$0.60/M

Context

1.0M

Gemini 2.5 Pro

Google

Google's most capable thinking model with breakthrough performance on reasoning and coding.

text, image, audio, video, code

Input

$1.25/M

Output

$10.00/M

Context

1.0M

Gemma 3 1B

Google

Smallest Gemma 3 model for edge and mobile deployment. Text-only with 128K context.

Input

$0.02/M

Output

$0.02/M

Context

128K

Gemma 3 27B

Google

Google's open-source multimodal model. Strong performance for its size with vision capabilities.

Input

$0.10/M

Output

$0.10/M

Context

128K

Gemma 3 12B

Google

Efficient open-source model from Google with multimodal capabilities at 12B parameters.

Input

$0.05/M

Output

$0.05/M

Context

128K

Gemma 3 4B

Google

Ultra-efficient open-source model from Google. Runs on mobile and edge devices.

Input

$0.02/M

Output

$0.02/M

Context

128K

PaliGemma2 28B

Google

Open vision-language model for image captioning, visual QA, and OCR tasks. Built on the Gemma 2 backbone.

Input

$0.30/M

Output

$0.60/M

Context

8K

PaliGemma2 10B

Google

Mid-size PaliGemma for efficient vision-language tasks. Strong OCR and document understanding.

Input

$0.15/M

Output

$0.30/M

Context

8K

Gemini 2.0 Flash

Google

Google's fastest multimodal model with native tool use and advanced agentic capabilities.

text, image, audio, video

Input

$0.10/M

Output

$0.40/M

Context

1.0M

Gemini 2.0 Flash-Lite

Google

Google's ultra-efficient model offering better performance than Gemini 1.5 Flash at the same cost point.

Input

$0.07/M

Output

$0.30/M

Context

1.0M

Gemini 2 Flash Thinking

Google

Experimental Gemini model with extended chain-of-thought reasoning. Transparent thinking process with strong performance on math and science.

Input

$0.15/M

Output

$0.60/M

Context

1.0M

Gemma 2 2B

Google

Smallest Gemma 2 model for efficient text processing on consumer hardware.

Input

$0.02/M

Output

$0.04/M

Context

8K

Gemma 2 9B

Google

Efficient open-source model from Google. Great performance-to-size ratio.

Input

$0.03/M

Output

$0.03/M

Context

8K

Gemma 2 27B

Google

Google's previous-gen open-source model with strong general capabilities.

Input

$0.07/M

Output

$0.07/M

Context

8K

CodeGemma 7B

Google

Google's open-source code-focused model based on the Gemma architecture.

Input

$0.03/M

Output

$0.03/M

Context

8K

Gemini 1.5 Pro

Google

Google's previous-gen flagship model with the longest context window in production.

text, image, audio, video

Input

$1.25/M

Output

$5.00/M

Context

2.1M
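
The per-million-token prices listed above can be turned into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost` and `PRICE_PER_M` are hypothetical names, the three prices are copied from the listing, and it assumes billing is strictly linear in token count, with no tiering, caching discounts, or minimum charges.

```python
from decimal import Decimal

# USD per million tokens as (input, output), copied from the listing above.
PRICE_PER_M = {
    "Gemini 3.1 Pro": (Decimal("2.00"), Decimal("12.00")),
    "Gemini 3 Flash": (Decimal("0.50"), Decimal("3.00")),
    "Gemma 3 1B": (Decimal("0.02"), Decimal("0.02")),
}

TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: linear cost estimate in USD for one request.

    Assumes cost = tokens * listed price / 1,000,000 for each direction.
    """
    input_price, output_price = PRICE_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / TOKENS_PER_M


if __name__ == "__main__":
    # Compare the same request size across the three example models.
    for name in PRICE_PER_M:
        print(f"{name}: ${estimate_cost(name, 10_000, 2_000)}")
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens on Gemini 3.1 Pro comes to $0.044. `Decimal` is used instead of floats so that small per-token amounts are not distorted by binary rounding.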
