Pricing
Model Overview
Tracked token-pricing fields for this model family. Empty pricing fields stay hidden until the source publishes them.
| Price field | Value |
|---|---|
| Input | $0.60 / 1M tokens |
| Output | $1.80 / 1M tokens |
| Source | Llama Nemotron Ultra 253B pricing |
| Verified | Apr 5, 2026 (High) |
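As a rough illustration of how the per-token rates above translate into a request cost, the sketch below multiplies token counts by the listed input and output prices. The function name `estimate_cost_usd` and the example token counts are hypothetical and not part of any NVIDIA API; the rates are copied from the table and should be re-checked against the source before use.

```python
from decimal import Decimal

# Rates copied from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = Decimal("0.60")
OUTPUT_USD_PER_MTOK = Decimal("1.80")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative workload only: 200,000 input tokens and 50,000 output tokens.
    print(estimate_cost_usd(200_000, 50_000))
```

For the illustrative workload, the input side comes to $0.12 and the output side to $0.09, so roughly $0.21 in total. `Decimal` is used here to avoid floating-point rounding in currency arithmetic.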
Surface
Capabilities
Input and output modalities, enabled feature flags, strengths, and tradeoffs.
| Attribute | Values |
|---|---|
| Modalities | text to text |
| Capabilities | reasoning, batchSupport, promptCaching, functionCalling, structuredOutputs |
| Strengths | Frontier reasoning quality in an open-weight model; vendor-reported GPQA result |
| Tradeoffs | Needs 4×B100 or 8×H100 to self-host |
References
Official Links
Canonical launch, documentation, pricing, and release-note URLs.
| Reference | URL |
|---|---|
| Intro | https://developer.nvidia.com/blog/llama-nemotron-ultra-an-open-advanced-reasoning-model/ |
| Docs | https://developer.nvidia.com/nemotron |
| Pricing | https://build.nvidia.com/explore/discover |
| Release note | https://developer.nvidia.com/blog/ |
Coverage
Benchmark Coverage
Reported benchmark families, versions, scores, sources, and notes.
| Benchmark | Version | Score | Date | Source | Notes |
|---|---|---|---|---|---|
| GPQA | 2024 | 76.01 % | Apr 1, 2025 | NVIDIA | Reasoning ON, vendor-reported |
| AIME 2025 | 2025 | 72.5 % | Apr 1, 2025 | NVIDIA | Reasoning ON, vendor-reported |
| MATH-500 | 2024 | 97 % | Apr 1, 2025 | NVIDIA | Vendor-reported |
Lifecycle
Release History
Lifecycle transitions and release timeline for this model family.
| Release | Alias | Lifecycle | Release Date | Deprecation | Shutdown | Summary |
|---|---|---|---|---|---|---|
| Llama Nemotron Ultra 253B | nemotron-ultra-253b | Active | Apr 7, 2025 | — | — | Current published model family snapshot. |
Surfaces
Host Coverage
Provider-specific hosting, context, pricing notes, feature differences, and provider-status context.
| Host | Type | Context | Pricing Note | Differences |
|---|---|---|---|---|
| NVIDIA NIM | first-party | 131.1K | $0.60/$1.80 per MTok. | Thinking mode toggle; Multilingual |
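To show how the listed context size constrains a request, the following sketch computes how many tokens remain for output once the prompt is counted. It rests on two assumptions that the table does not confirm: that "131.1K" is a rounded display of 131,072 tokens, and that prompt and output share a single context window. The helper name `max_output_budget` is hypothetical.

```python
# Assumption: "131.1K" in the host table is a rounded display of 131,072 tokens (128 * 1024).
ASSUMED_CONTEXT_TOKENS = 131_072


def max_output_budget(prompt_tokens: int,
                      context_tokens: int = ASSUMED_CONTEXT_TOKENS) -> int:
    """Hypothetical helper: tokens left for output after the prompt, never negative.

    Assumes prompt and output tokens share one context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_tokens - prompt_tokens, 0)


if __name__ == "__main__":
    # Illustrative prompt size only.
    print(max_output_budget(100_000))
```

A result of zero means the prompt alone meets or exceeds the assumed window, so the request would need to be shortened before sending.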
Migration
Migration Guidance
Documented migration summary, successor families, and known breaking changes.
| Topic | Details |
|---|---|
| Summary | Open-weight reasoning model for quality-first workloads. |
Timeline
Change Events
Cataloged model-family updates and source references.
| Date | Type | Title | Description | Source |
|---|---|---|---|---|
| Apr 7, 2025 | family_added | Llama Nemotron Ultra 253B published | Initial public model family launch. | Llama Nemotron Ultra 253B release notes |
From NVIDIA