Text Generation · Transformers · Safetensors · PyTorch · nvidia · conversational
suhara committed · verified
Commit dd33c82 · 1 Parent(s): 2f9072b

Update README.md

Files changed (1):
  1. README.md +27 -17
README.md CHANGED
@@ -15,12 +15,14 @@ language:
  - ja
  library_name: transformers
  tags:
- - nvidia
- - llama-3
- - pytorch
+ - nvidia
+ - pytorch
  ---
  # NVIDIA-Nemotron-Nano-9B-v2

+ ![](./accuracy_chart.png)
+
+
  **Model Developer:** NVIDIA Corporation

  **Model Dates:**
@@ -43,33 +45,39 @@ The supported languages include: English, German, Spanish, French, Italian, and

  This model is ready for commercial use.

+
+ ## License/Terms of Use
+
+ GOVERNING TERMS: This trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
+
+
  ## Evaluation Results

- #### Benchmark Results (Reasoning On)
+ ### Benchmark Results (Reasoning On)

  We evaluated our model in \*\*Reasoning-On\*\* mode across all benchmarks.

- | Benchmark | NVIDIA-Nemotron-Nano-9B-v2 |
- | :---- | ----- |
- | AIME25 | 72.1% |
- | MATH500 | 97.8% |
- | GPQA | 64.0% |
- | LCB | 71.1% |
- | BFCL v3 | 66.9% |
- | IFEVAL-Prompt | 85.4% |
- | IFEVAL-Instruction | 90.3% |
+
+ | Benchmark | Qwen3-8B | NVIDIA-Nemotron-Nano-9B-v2 |
+ | :---- | ----: | ----: |
+ | AIME25 | 69.3% | 72.1% |
+ | MATH500 | 96.3% | 97.8% |
+ | GPQA | 59.6% | 64.0% |
+ | LCB | 59.5% | 71.1% |
+ | BFCL v3 | 66.3% | 66.9% |
+ | IFEval (Instruction Strict) | 89.4% | 90.3% |
+ | HLE | 4.4% | 6.5% |
+ | RULER (128K) | 74.1% | 78.9% |
+

  All evaluations were done using [NeMo-Skills](https://github.com/NVIDIA/NeMo-Skills/tree/main/docs).

- ### Reasoning Budget Control
+ ## Reasoning Budget Control

  This model supports runtime “thinking” budget control. During inference, the user can specify how many tokens the model is allowed to "think".

  ![](./acc-vs-budget.png)

- ## License/Terms of Use
-
- GOVERNING TERMS: This trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).

  ## Model Architecture

@@ -91,6 +99,8 @@ API Catalog 08/18/2025 via [https://catalog.ngc.nvidia.com/models](https://catal

  - [NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model](https://research.nvidia.com/labs/adlr/files/NVIDIA-Nemotron-Nano-2-Technical-Report.pdf)

+
+
  ## Computational Load

  Cumulative compute : 1.53E+24 FLOPS
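The "Reasoning Budget Control" section added in this commit says the user can cap, at inference time, how many tokens the model spends "thinking". As a rough sketch of how such a cap can be wired up with transformers, assuming the model emits its reasoning inside a `<think>…</think>` block (the tag, the two-pass pattern, and the helper name are this sketch's assumptions, not the model card's documented interface):

```python
# Hypothetical sketch of runtime "thinking" budget control: generate at most
# `thinking_budget` reasoning tokens, force-close the (assumed) think block,
# then generate the final answer. Not the model's documented API.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

def generate_with_budget(prompt: str, thinking_budget: int,
                         max_answer_tokens: int = 512) -> str:
    """Two-pass generation: think up to `thinking_budget` tokens, then answer."""
    # Pass 1: let the model reason, capped at the thinking budget.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    draft = model.generate(input_ids, max_new_tokens=thinking_budget)

    # Pass 2: if the reasoning block is still open, force-close it with the
    # assumed </think> tag, then generate the visible answer.
    text = tokenizer.decode(draft[0], skip_special_tokens=False)
    if "</think>" not in text:
        text += "\n</think>\n"
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**enc, max_new_tokens=max_answer_tokens)
    return tokenizer.decode(out[0][enc["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

The point of the forced close in the second pass is that the budget bounds the cost of reasoning without truncating the final answer, which matches the accuracy-versus-budget trade-off the `acc-vs-budget.png` figure illustrates.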
 
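Since the commit widens the benchmark table into a side-by-side comparison, the per-benchmark margins are worth making explicit. A few lines of Python over the scores transcribed from the table above (editor's illustration only):

```python
# Point deltas between Qwen3-8B and NVIDIA-Nemotron-Nano-9B-v2, transcribed
# from the benchmark table in this commit (Reasoning-On mode, percentages).
scores = {
    "AIME25": (69.3, 72.1),
    "MATH500": (96.3, 97.8),
    "GPQA": (59.6, 64.0),
    "LCB": (59.5, 71.1),
    "BFCL v3": (66.3, 66.9),
    "IFEval (Instruction Strict)": (89.4, 90.3),
    "HLE": (4.4, 6.5),
    "RULER (128K)": (74.1, 78.9),
}

for name, (qwen, nemotron) in scores.items():
    delta = nemotron - qwen
    print(f"{name:<28} {qwen:5.1f}% -> {nemotron:5.1f}%  ({delta:+.1f} pts)")
```

The largest margin is on LCB (+11.6 points); most of the others fall within a few points.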
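The cumulative-compute figure also supports a quick sanity check. Under the common 6·N·D training-FLOPs approximation (N parameters, D training tokens), 1.53E+24 FLOPS and 9B parameters imply a token count on the order of 10^13. Treating that approximation as applicable to this hybrid Mamba-Transformer model, and assuming all reported compute went into training, are this sketch's assumptions; the linked technical report is the authoritative source.

```python
# Back-of-envelope check of the reported cumulative compute using the common
# FLOPs ~= 6 * N * D approximation for dense-model training. Applying it to
# this model is an assumption made for illustration only.
total_flops = 1.53e24   # cumulative compute reported above
n_params = 9e9          # 9B parameters

implied_tokens = total_flops / (6 * n_params)
print(f"Implied training tokens under 6*N*D: {implied_tokens:.2e}")
# Prints ~2.83e13, i.e. tens of trillions of tokens if the approximation held.
```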