losclouds

GPT-OSS 120B on Groq

High-speed Groq-hosted GPT-OSS tier with reasoning, prompt caching, and tool support.

Creator: Groq
Lifecycle: Active
Context: 131.1K tokens
Max output: 65.5K tokens
Released: Aug 1, 2025
Status: up
Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens
Cached read: $0.07 / 1M tokens
Cached write: $0.15 / 1M tokens
Batch discount: %
Source: GPT-OSS 120B on Groq pricing
Verified: Apr 2, 2026 (High)
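
As a rough illustration of how the listed rates combine, the sketch below estimates the cost of a single request. It assumes that cached prompt tokens are billed at the cached-read rate in place of the input rate, and it ignores the cached-write rate because it equals the input rate in this listing. The function name `estimate_cost` and its parameters are hypothetical and are not part of any Groq API.

```python
# Rates copied from the listing above, in USD per 1M tokens.
INPUT_RATE = 0.15
OUTPUT_RATE = 0.60
CACHED_READ_RATE = 0.07

TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: `cached_tokens` is the part of `input_tokens` served from the
    prompt cache and is billed at the cached-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_RATE
        + cached_tokens * CACHED_READ_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 100K prompt tokens (40K of them cached) and 2K completion tokens.
    cost = estimate_cost(100_000, 2_000, cached_tokens=40_000)
    print(f"${cost:.4f}")  # about $0.0130
```

Under these assumptions, a fully cached prompt costs a little under half as much on the input side as an uncached one ($0.07 versus $0.15 per 1M tokens).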

Capabilities

Modalities: text (input), text (output)
Capabilities: reasoning, webSearch, batchSupport, codeExecution, promptCaching, functionCalling, structuredOutputs
Strengths: Fast hosted reasoning; cheap compared with frontier APIs
Tradeoffs: Provider page is host-oriented, not creator-oriented

Benchmark Coverage

Benchmark | Version | Score | Date | Source | Notes

Release History

Release | Alias | Lifecycle | Release Date | Deprecation | Shutdown | Summary
GPT-OSS 120B on Groq | groq-gpt-oss-120b | Active | Aug 1, 2025 | | | Current published model family snapshot.

Host Coverage

Host | Type | Context | Pricing Note | Differences
Groq API | first-party | 131.1K | Reference hosted GPT-OSS pricing on Groq. | Prompt caching; Structured outputs
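
The 131.1K context listed for the Groq API host and the 65.5K max output listed in the summary interact when sizing a request. The sketch below shows one way to compute how many completion tokens remain for a given prompt. It assumes the rounded figures correspond to 131,072 and 65,536 tokens, and that prompt and completion share a single context window. Neither assumption is stated on this page, and `max_completion_budget` is a hypothetical helper.

```python
# Assumed exact values behind the rounded "131.1K" and "65.5K" figures.
CONTEXT_WINDOW = 131_072
MAX_OUTPUT = 65_536


def max_completion_budget(prompt_tokens: int) -> int:
    """Return the largest completion size that still fits (hypothetical helper).

    Assumption: prompt and completion tokens share one context window, so the
    completion is capped by both the max-output limit and the space left over.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    for prompt in (1_000, 100_000, 200_000):
        print(prompt, max_completion_budget(prompt))
```

With these numbers, the max-output cap is the binding limit for prompts up to 65,536 tokens. Beyond that, the remaining context becomes the limit.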
Migration Guidance

Hosted reasoning option for teams prioritizing Groq latency.

Change Events
Date | Type | Title | Description | Source
Aug 1, 2025 | family_added | GPT-OSS 120B on Groq published | Initial public model family launch. | GPT-OSS 120B on Groq release notes

Other models from Groq

GPT-OSS 20B on Groq, Groq Compound, Llama 3.1 8B Instant on Groq, Llama 3.3 70B Versatile on Groq, Llama 4 Scout on Groq, Qwen3 32B on Groq