AI Models / Compare
Llama Nemotron Super 49B
Best single-H100 reasoning model; toggleable think-mode, top AIME scores.
- Creator
- NVIDIA
- Lifecycle
- Active
- Context
- 128.0K
- Max output
- 32.8K
- Released
- Mar 18, 2025
- Status
- unknown
- Input
- $0.20 / 1M tokens
- Output
- $0.60 / 1M tokens
- Cached read
- — / 1M tokens
- Cached write
- — / 1M tokens
- Batch discount
- —%
- Source
- Llama Nemotron Super 49B pricing
- Verified
- Apr 5, 2026 (High)
Capabilities
- Modalities
- text→text
- Capabilities
- reasoningbatchSupportpromptCachingfunctionCallingstructuredOutputs
- Strengths
- Fits single H100, Top AIME score for 50B class, Toggleable reasoning
- Tradeoffs
- Text-only, requires NIM API or own GPU
Migration Guidance
Best Nemotron for single-GPU deployments. Use Ultra for max quality.
Replacement models: nemotron-ultra-253b