Llama 3.1 8B Instant on Groq

Ultra-cheap Groq text model for high-volume chat, classification, and routing.

Creator
Meta (served by Groq)
Lifecycle
Active
Context
131.1K tokens
Max output
8.2K tokens
Released
Sep 1, 2024
Status
up
Input
$0.05 / 1M tokens
Output
$0.08 / 1M tokens
Cached read
Not listed
Cached write
Not listed
Batch discount
Not listed
Source
Llama 3.1 8B Instant on Groq pricing
Verified
Apr 2, 2026 (High)
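
The listed rates turn into a workload estimate with simple arithmetic. The sketch below is illustrative only: the function name and the example workload (1M calls at roughly 400 input and 10 output tokens each) are assumptions, and cached or batch pricing is left out because the listing gives no values for them.

```python
from decimal import Decimal

# Rates copied from the listing above (USD per 1M tokens).
INPUT_USD_PER_MTOK = Decimal("0.05")
OUTPUT_USD_PER_MTOK = Decimal("0.08")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload at the listed per-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


# Hypothetical workload: 1M classification calls,
# about 400 input tokens and 10 output tokens per call.
calls = 1_000_000
total = estimate_cost_usd(calls * 400, calls * 10)
print(f"${total:.2f}")  # prints "$20.80"
```

Decimal is used instead of float so that small per-token rates do not pick up rounding noise when summed over large volumes.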

Capabilities

Modalities
text → text
Capabilities
batchSupport, promptCaching, functionCalling, structuredOutputs
Strengths
Lowest Groq cost, very fast
Tradeoffs
Smallest model in current Groq compare set

Benchmark Coverage

| Benchmark | Version | Score | Date | Source | Notes |
|---|---|---|---|---|---|

Release History

| Release | Alias | Lifecycle | Release Date | Deprecation | Shutdown | Summary |
|---|---|---|---|---|---|---|
| Llama 3.1 8B Instant on Groq | groq-llama-3-1-8b-instant | Active | Sep 1, 2024 | | | Current published model family snapshot. |

Host Coverage

| Host | Type | Context | Pricing Note | Differences |
|---|---|---|---|---|
| Groq API | first-party | 131.1K | Reference production Groq pricing. | Production model tier |

Migration Guidance

Default Groq budget tier for simple generation, routing, and classification.

Replacement models: groq-llama-3-3-70b-versatile
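
One way to act on this guidance is a small routing rule that keeps simple work on the budget tier and sends everything else to the listed replacement. This is a minimal sketch under stated assumptions: the function name and task labels are hypothetical, the token limits are the rounded figures from this listing (131.1K context, 8.2K max output), and the model strings are this catalog's aliases, which may not match the IDs the Groq API expects. It does not check whether the replacement model can fit the request.

```python
# Aliases as they appear in this listing (assumed, not verified API model IDs).
BUDGET_MODEL = "groq-llama-3-1-8b-instant"
REPLACEMENT_MODEL = "groq-llama-3-3-70b-versatile"

# Hypothetical labels for the task kinds the guidance names.
BUDGET_TASKS = {"simple_generation", "routing", "classification"}

# Rounded limits taken from the listing; treat them as approximations.
BUDGET_CONTEXT_TOKENS = 131_100
BUDGET_MAX_OUTPUT_TOKENS = 8_200


def pick_model(task_kind: str, prompt_tokens: int, max_output_tokens: int) -> str:
    """Return the budget model when the task is simple and fits its limits."""
    fits_output = max_output_tokens <= BUDGET_MAX_OUTPUT_TOKENS
    fits_context = prompt_tokens + max_output_tokens <= BUDGET_CONTEXT_TOKENS
    if task_kind in BUDGET_TASKS and fits_output and fits_context:
        return BUDGET_MODEL
    # Anything else goes to the replacement model named in the guidance.
    return REPLACEMENT_MODEL


print(pick_model("classification", 500, 16))
print(pick_model("long_report", 500, 16))
```

Keeping the rule as a pure function makes it easy to unit test and to swap in different task labels or limits later.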

Change Events

| Date | Type | Title | Description | Source |
|---|---|---|---|---|
| Sep 1, 2024 | family_added | Llama 3.1 8B Instant on Groq published | Initial public model family launch. | Llama 3.1 8B Instant on Groq release notes |

Other models from Groq

GPT-OSS 120B on Groq, GPT-OSS 20B on Groq, Groq Compound, Llama 3.3 70B Versatile on Groq, Llama 4 Scout on Groq, Qwen3 32B on Groq