
Gemini 2.0 Flash

Fast Gemini family optimized for high-throughput chat, extraction, and multimodal routing.

Creator: Google Gemini
Lifecycle: Active
Context: 1.0M tokens
Max output: 8.2K tokens
Released: Dec 11, 2024
Status: up
Input: $0.35 / 1M tokens
Output: $1.05 / 1M tokens
Cached read: / 1M tokens
Cached write: / 1M tokens
Batch discount: %
Source: Google AI pricing
Verified: Apr 1, 2026 (Medium)
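The listed rates can be turned into a per-request cost estimate. A minimal sketch, using the $0.35 / 1M input and $1.05 / 1M output rates above; the batch-discount parameter is an assumption, since the page leaves the discount percentage blank, so it defaults to zero here.

```python
# Token-cost estimator for the rates listed above
# ($0.35 per 1M input tokens, $1.05 per 1M output tokens).
INPUT_PER_M = 0.35
OUTPUT_PER_M = 1.05

def estimate_cost(input_tokens: int, output_tokens: int,
                  batch_discount_pct: float = 0.0) -> float:
    """Return the blended USD cost for one request.

    batch_discount_pct is hypothetical: the page does not list a
    batch discount value, so callers must supply their own.
    """
    cost = (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M
    return cost * (1 - batch_discount_pct / 100)

# Example: a 100K-token prompt with a 2K-token reply.
print(round(estimate_cost(100_000, 2_000), 4))  # → 0.0371
```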

Capabilities

Modalities: text, image, audio, video (input); text, audio (output)
Capabilities: webSearch, audioInput, imageInput, audioOutput, batchSupport, functionCalling, structuredOutputs
Strengths: High throughput, low blended token cost, broad multimodal coverage
Tradeoffs: Reasoning depth trails Pro-tier models
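When routing traffic, the capability flags above can be checked before dispatching a request. A minimal sketch: the flag set is copied from this page, and the `supports` helper is illustrative; real availability should be confirmed against the provider's documentation.

```python
# Minimal capability gate using the flag names listed above.
GEMINI_20_FLASH_CAPS = {
    "webSearch", "audioInput", "imageInput", "audioOutput",
    "batchSupport", "functionCalling", "structuredOutputs",
}

def supports(required: set, caps: set = GEMINI_20_FLASH_CAPS) -> bool:
    """True when every required capability flag is present."""
    return required <= caps  # subset test

print(supports({"functionCalling", "structuredOutputs"}))  # True
print(supports({"videoOutput"}))                           # False
```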
Benchmark Coverage

Benchmark | Version | Score | Date | Source | Notes
MMLU | 2026-Q1 | 82.1 accuracy | Jan 30, 2026 | Gemini model docs | Display as raw benchmark evidence only.

Release History

Release | Alias | Lifecycle | Release Date | Deprecation | Shutdown | Summary
Gemini 2.0 Flash | gemini-2.0-flash | Active | Dec 11, 2024 | | | Fast Gemini family tuned for cost-efficient throughput.

Host Coverage

Host | Type | Context | Pricing Note | Differences
Google AI Studio | first-party | 1.0M | Reference access to the Flash family. | Fast first-party rollout
Vertex AI | cloud | 1.0M | Enterprise Google Cloud hosting for Gemini traffic. | Vertex IAM and governance
Migration Guidance

Default Gemini family when cost and latency matter more than top-end reasoning depth.

Replacement models: gemini-2-5-pro

Breaking changes: Long-chain reasoning quality trails Pro-tier models.
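The migration guidance above can be sketched as a simple router: keep cost- and latency-sensitive traffic on Gemini 2.0 Flash and send reasoning-heavy requests to the listed replacement model. The boolean signal is illustrative, not part of the page.

```python
# Hedged routing sketch for the migration guidance above.
# Model identifiers are copied from this page.
def pick_model(needs_deep_reasoning: bool) -> str:
    """Route between the Flash default and its listed replacement."""
    if needs_deep_reasoning:
        return "gemini-2-5-pro"   # replacement model named on this page
    return "gemini-2.0-flash"     # default when cost/latency dominate

print(pick_model(False))  # gemini-2.0-flash
print(pick_model(True))   # gemini-2-5-pro
```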

Change Events
Date | Type | Title | Description | Source
Dec 11, 2024 | family_added | Family launched | Gemini 2.0 Flash launched for high-throughput multimodal workloads. | Gemini changelog

Other models from Google Gemini

Gemini 2.5 Flash, Gemini 2.5 Flash-Lite, Gemini 2.5 Pro