losclouds

Llama 3.1 405B

Meta's flagship open-weight Llama model: 405B parameters, with quality comparable to GPT-4 on many tasks.

Creator: Meta AI
Lifecycle: Active
Context: 128.0K tokens
Max output: 8.2K tokens
Released: Jul 23, 2024
Status: unknown
Input: $3.50 / 1M tokens
Output: $3.50 / 1M tokens
Cached read: not listed
Cached write: not listed
Batch discount: not listed
Source: Llama 3.1 405B pricing
Verified: Apr 5, 2026 (High)
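As a rough illustration of how the listed rates translate into a per-request cost, the sketch below multiplies token counts by the $3.50 / 1M input and output prices and checks a request against the listed context and max-output figures. The function names are hypothetical. The exact limits (128,000 and 8,200 tokens) are assumptions, since the listing only gives the rounded values 128.0K and 8.2K, and the check assumes the context window covers prompt and completion together.

```python
# Illustrative only: rates and limits are copied from the listing above.
INPUT_USD_PER_MTOK = 3.50
OUTPUT_USD_PER_MTOK = 3.50
CONTEXT_TOKENS = 128_000     # assumed exact value behind the rounded "128.0K"
MAX_OUTPUT_TOKENS = 8_200    # assumed exact value behind the rounded "8.2K"


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in USD at the listed rates."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed context and max-output figures.

    Assumes the context window covers prompt and completion together.
    """
    return (output_tokens <= MAX_OUTPUT_TOKENS
            and input_tokens + output_tokens <= CONTEXT_TOKENS)


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token completion.
    print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
    print(fits_limits(100_000, 2_000))
```

Cached-read, cached-write, and batch pricing are left out because the listing gives no values for them.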

Capabilities

Modalities: text → text
Capabilities: batchSupport, promptCaching, functionCalling, structuredOutputs
Strengths: Largest open-weight model, GPT-4-class quality
Tradeoffs: Needs multi-GPU, expensive to host

Benchmark Coverage

Benchmark | Version | Score | Date | Source | Notes
MMLU | 2024 | 87.3% | Jul 1, 2024 | Meta | Vendor-reported
HumanEval | 2024 | 89% | Jul 1, 2024 | Meta | Vendor-reported

Release History

Release | Alias | Lifecycle | Release Date | Deprecation | Shutdown | Summary
Llama 3.1 405B | llama-3-1-405b | Active | Jul 23, 2024 | - | - | Current published model family snapshot.

Host Coverage

Host | Type | Context | Pricing Note | Differences
Together AI | cloud | 128.0K | $3.50/$3.50 per MTok. | -
DeepInfra | cloud | 128.0K | $1.79/$1.79 per MTok. | -
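To compare the two listed hosts on a concrete workload, a minimal sketch follows. It assumes each "$X/$Y per MTok" pricing note means input price / output price, which the listing does not state explicitly, and the function names are hypothetical.

```python
# Prices copied from the host rows above, read as (input, output) USD per MTok.
# That reading of the "$X/$Y" notation is an assumption.
HOST_PRICES = {
    "Together AI": (3.50, 3.50),
    "DeepInfra": (1.79, 1.79),
}


def workload_cost_usd(host: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    in_price, out_price = HOST_PRICES[host]
    return input_mtok * in_price + output_mtok * out_price


def cheapest_host(input_mtok: float, output_mtok: float) -> str:
    """Return the listed host with the lowest cost for the workload."""
    return min(HOST_PRICES,
               key=lambda h: workload_cost_usd(h, input_mtok, output_mtok))


if __name__ == "__main__":
    # Example workload: 50M input tokens and 10M output tokens.
    for host in HOST_PRICES:
        print(host, round(workload_cost_usd(host, 50, 10), 2))
    print("cheapest:", cheapest_host(50, 10))
```

Because both hosts list the same 128.0K context, price is the only difference this comparison captures; throughput, rate limits, and quantization are not covered by the listing.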

Migration Guidance

Use Llama 4 Maverick for better quality with cheaper MoE hosting.

Replacement models: llama-4-maverick
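One way to apply this guidance in client code is a small alias map that swaps in the replacement only when migration is switched on. This is a sketch under assumptions: `REPLACEMENTS` and `resolve_model` are hypothetical names, and the identifiers are the catalog aliases shown on this page, which may differ from the model IDs a given host expects.

```python
# Catalog aliases taken from this page; host-specific model IDs may differ.
REPLACEMENTS = {
    "llama-3-1-405b": "llama-4-maverick",
}


def resolve_model(alias: str, migrate: bool = False) -> str:
    """Return the replacement alias when migrating, else the alias unchanged."""
    if migrate:
        return REPLACEMENTS.get(alias, alias)
    return alias


if __name__ == "__main__":
    print(resolve_model("llama-3-1-405b"))                # unchanged
    print(resolve_model("llama-3-1-405b", migrate=True))  # replacement alias
```

Keeping the switch explicit lets the two models be compared on the same prompts before changing the default.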

Change Events

Date | Type | Title | Description | Source
Jul 23, 2024 | family_added | Llama 3.1 405B published | Initial public model family launch. | Llama 3.1 405B release notes

Other models from Meta AI

Llama 3.1 70B, Llama 3.1 8B, Llama 3.2 11B Vision, Llama 3.2 90B Vision, Llama 3.3 70B, Llama 4 Maverick, Llama 4 Scout