losclouds

AI Models / Compare

Llama Nemotron Super 49B

Best single-H100 reasoning model; toggleable think-mode, top AIME scores.

Creator
NVIDIA
Lifecycle
Active
Context
128.0K
Max output
32.8K
Released
Mar 18, 2025
Status
unknown
Input
$0.20 / 1M tokens
Output
$0.60 / 1M tokens
Cached read
/ 1M tokens
Cached write
/ 1M tokens
Batch discount
%
Source
Llama Nemotron Super 49B pricing
Verified
Apr 5, 2026 (High)

Capabilities

Modalities
texttext
Capabilities
reasoningbatchSupportpromptCachingfunctionCallingstructuredOutputs
Strengths
Fits single H100, Top AIME score for 50B class, Toggleable reasoning
Tradeoffs
Text-only, requires NIM API or own GPU
Official Links

Benchmark Coverage

BenchmarkVersionScoreDateSourceNotes
GPQA202466.67 %Mar 1, 2025NVIDIAVendor-reported
AIME 2025202582.71 %Mar 1, 2025NVIDIAVendor-reported, v1.5
MATH-500202497.4 %Mar 1, 2025NVIDIAVendor-reported, v1.5

Release History

ReleaseAliasLifecycleRelease DateDeprecationShutdownSummary
Llama Nemotron Super 49Bnemotron-super-49bActiveMar 18, 2025Current published model family snapshot.

Host Coverage

HostTypeContextPricing NoteDifferences
NVIDIA NIMfirst-party128.0K$0.20/$0.60 per MTok.Thinking mode toggle
Migration Guidance

Best Nemotron for single-GPU deployments. Use Ultra for max quality.

Replacement models: nemotron-ultra-253b

Change Events
DateTypeTitleDescriptionSource
Mar 18, 2025family_addedLlama Nemotron Super 49B publishedInitial public model family launch.Llama Nemotron Super 49B release notes

Other models from NVIDIA

Llama Nemotron Ultra 253B