Text Generation · Transformers · Safetensors · PyTorch · nvidia · conversational
suhara committed · verified
Commit dd33c82 · 1 Parent(s): 2f9072b

Update README.md

Files changed (1):
  1. README.md +27 -17
README.md CHANGED
@@ -15,12 +15,14 @@ language:
  - ja
  library_name: transformers
  tags:
- - nvidia
- - llama-3
- - pytorch
+ - nvidia
+ - pytorch
  ---
  # NVIDIA-Nemotron-Nano-9B-v2

+ ![](./accuracy_chart.png)
+
+
  **Model Developer:** NVIDIA Corporation

  **Model Dates:**
@@ -43,33 +45,39 @@ The supported languages include: English, German, Spanish, French, Italian, and

  This model is ready for commercial use.

+
+ ## License/Terms of Use
+
+ GOVERNING TERMS: This trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
+
+
  ## Evaluation Results

- #### Benchmark Results (Reasoning On)
+ ### Benchmark Results (Reasoning On)

  We evaluated our model in \*\*Reasoning-On\*\* mode across all benchmarks.

- | Benchmark | NVIDIA-Nemotron-Nano-9B-v2 |
- | :---- | ----- |
- | AIME25 | 72.1% |
- | MATH500 | 97.8% |
- | GPQA | 64.0% |
- | LCB | 71.1% |
- | BFCL v3 | 66.9% |
- | IFEVAL-Prompt | 85.4% |
- | IFEVAL-Instruction | 90.3% |
+
+ | Benchmark | Qwen3-8B | NVIDIA-Nemotron-Nano-9B-v2 |
+ | :---- | ----: | ----: |
+ | AIME25 | 69.3% | 72.1% |
+ | MATH500 | 96.3% | 97.8% |
+ | GPQA | 59.6% | 64.0% |
+ | LCB | 59.5% | 71.1% |
+ | BFCL v3 | 66.3% | 66.9% |
+ | IFEval (Instruction Strict) | 89.4% | 90.3% |
+ | HLE | 4.4% | 6.5% |
+ | RULER (128K) | 74.1% | 78.9% |
+

  All evaluations were done using [NeMo-Skills](https://github.com/NVIDIA/NeMo-Skills/tree/main/docs).

- ### Reasoning Budget Control
+ ## Reasoning Budget Control

  This model supports runtime “thinking” budget control. During inference, the user can specify how many tokens the model is allowed to "think".

  ![](./acc-vs-budget.png)

- ## License/Terms of Use
-
- GOVERNING TERMS: This trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).

  ## Model Architecture

@@ -91,6 +99,8 @@ API Catalog 08/18/2025 via [https://catalog.ngc.nvidia.com/models](https://catal

  - [NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model](https://research.nvidia.com/labs/adlr/files/NVIDIA-Nemotron-Nano-2-Technical-Report.pdf)

+
+
  ## Computational Load

  Cumulative compute : 1.53E+24 FLOPS
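The "Reasoning Budget Control" section added in this commit says the user can cap, at inference time, how many tokens the model spends "thinking". As a rough sketch of how such a cap can be wired up with transformers, assuming the model emits its reasoning inside a `<think>…</think>` block (the tag, the two-pass pattern, and the helper name are this sketch's assumptions, not the model card's documented interface):

```python
# Hypothetical sketch of runtime "thinking" budget control: generate at most
# `thinking_budget` reasoning tokens, force-close the (assumed) think block,
# then generate the final answer. Not the model's documented API.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

def generate_with_budget(prompt: str, thinking_budget: int,
                         max_answer_tokens: int = 512) -> str:
    """Two-pass generation: think up to `thinking_budget` tokens, then answer."""
    # Pass 1: let the model reason, capped at the thinking budget.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    draft = model.generate(input_ids, max_new_tokens=thinking_budget)

    # Pass 2: if the reasoning block is still open, force-close it with the
    # assumed </think> tag, then generate the visible answer.
    text = tokenizer.decode(draft[0], skip_special_tokens=False)
    if "</think>" not in text:
        text += "\n</think>\n"
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**enc, max_new_tokens=max_answer_tokens)
    return tokenizer.decode(out[0][enc["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

The point of the forced close in the second pass is that the budget bounds the cost of reasoning without truncating the final answer, which matches the accuracy-versus-budget trade-off the `acc-vs-budget.png` figure illustrates.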
 
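Since the commit widens the benchmark table into a side-by-side comparison, the per-benchmark margins are worth making explicit. A few lines of Python over the scores transcribed from the table above (editor's illustration only):

```python
# Point deltas between Qwen3-8B and NVIDIA-Nemotron-Nano-9B-v2, transcribed
# from the benchmark table in this commit (Reasoning-On mode, percentages).
scores = {
    "AIME25": (69.3, 72.1),
    "MATH500": (96.3, 97.8),
    "GPQA": (59.6, 64.0),
    "LCB": (59.5, 71.1),
    "BFCL v3": (66.3, 66.9),
    "IFEval (Instruction Strict)": (89.4, 90.3),
    "HLE": (4.4, 6.5),
    "RULER (128K)": (74.1, 78.9),
}

for name, (qwen, nemotron) in scores.items():
    delta = nemotron - qwen
    print(f"{name:<28} {qwen:5.1f}% -> {nemotron:5.1f}%  ({delta:+.1f} pts)")
```

The largest margin is on LCB (+11.6 points); most of the others fall within a few points.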
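The cumulative-compute figure also supports a quick sanity check. Under the common 6·N·D training-FLOPs approximation (N parameters, D training tokens), 1.53E+24 FLOPS and 9B parameters imply a token count on the order of 10^13. Treating that approximation as applicable to this hybrid Mamba-Transformer model, and assuming all reported compute went into training, are this sketch's assumptions; the linked technical report is the authoritative source.

```python
# Back-of-envelope check of the reported cumulative compute using the common
# FLOPs ~= 6 * N * D approximation for dense-model training. Applying it to
# this model is an assumption made for illustration only.
total_flops = 1.53e24   # cumulative compute reported above
n_params = 9e9          # 9B parameters

implied_tokens = total_flops / (6 * n_params)
print(f"Implied training tokens under 6*N*D: {implied_tokens:.2e}")
# Prints ~2.83e13, i.e. tens of trillions of tokens if the approximation held.
```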