Gemini 2.0 Flash
Fast Gemini family optimized for high-throughput chat, extraction, and multimodal routing.
- Creator
- Google Gemini
- Lifecycle
- Active
- Context
- 1.0M
- Max output
- 8.2K
- Released
- Dec 11, 2024
- Status
- up
- Input
- $0.35 / 1M tokens
- Output
- $1.05 / 1M tokens
- Cached read
- — / 1M tokens
- Cached write
- — / 1M tokens
- Batch discount
- —%
- Source
- Google AI pricing
- Verified
- Apr 1, 2026 (Medium)
Capabilities
- Modalities
- textimageaudiovideo→textaudio
- Capabilities
- webSearchaudioInputimageInputaudioOutputbatchSupportfunctionCallingstructuredOutputs
- Strengths
- High throughput, Low blended token cost, Broad multimodal coverage
- Tradeoffs
- Reasoning depth trails Pro-tier models
Official Links
Benchmark Coverage
| Benchmark | Version | Score | Date | Source | Notes |
|---|---|---|---|---|---|
| MMLU | 2026-Q1 | 82.1 accuracy | Jan 30, 2026 | Gemini model docs | Display as raw benchmark evidence only. |
Release History
| Release | Alias | Lifecycle | Release Date | Deprecation | Shutdown | Summary |
|---|---|---|---|---|---|---|
| Gemini 2.0 Flash | gemini-2.0-flash | Active | Dec 11, 2024 | — | — | Fast Gemini family tuned for cost-efficient throughput. |
Host Coverage
| Host | Type | Context | Pricing Note | Differences |
|---|---|---|---|---|
| Google AI Studio | first-party | 1.0M | Reference access to the Flash family. | Fast first-party rollout |
| Vertex AI | cloud | 1.0M | Enterprise Google Cloud hosting for Gemini traffic. | Vertex IAM and governance |
Migration Guidance
Default Gemini family when cost and latency matter more than top-end reasoning depth.
Replacement models: gemini-2-5-pro
Breaking changes: Long-chain reasoning quality trails Pro-tier traffic.
Change Events
| Date | Type | Title | Description | Source |
|---|---|---|---|---|
| Dec 11, 2024 | family_added | Family launched | Gemini 2.0 Flash launched for high-throughput multimodal workloads. | Gemini changelog |