Update README.md

README.md (CHANGED)

```diff
@@ -129,9 +129,10 @@ Our models are designed and optimized to run on NVIDIA GPU-accelerated systems.
 ## Software Integration
 
 - Runtime Engine(s): NeMo 25.07.nemotron-nano-v2
-- Supported Hardware Microarchitecture Compatibility: NVIDIA A10G, NVIDIA H100-80GB, NVIDIA A100
+- Supported Hardware Microarchitecture Compatibility: NVIDIA A10G, NVIDIA H100-80GB, NVIDIA A100, Jetson AGX Thor
 - Operating System(s): Linux
 
+
 ### **Use it with Transformers**
 
 The snippet below shows how to use this model with Huggingface Transformers (tested on version 4.48.3).
```
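
The Transformers snippet itself sits outside this hunk, so it is not visible in the diff. As a rough sketch only, assuming a placeholder model ID (`nvidia/NVIDIA-Nemotron-Nano-9B-v2`) and generic generation settings that are not taken from this change, usage with Transformers might look like:

```python
# Hedged sketch: the model ID, dtype, and generation settings below are assumptions
# for illustration; the README's own snippet is the authoritative version.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"  # placeholder repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # hybrid Mamba-Transformer checkpoints may ship custom code
)

# Format a chat-style prompt with the model's own chat template, then generate.
messages = [{"role": "user", "content": "Summarize the Mamba architecture in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```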

````diff
@@ -276,6 +277,9 @@ docker run --runtime nvidia --gpus all \
     --mamba_ssm_cache_dtype float32
 ```
 
+For Jetson AGX Thor, please use [this vLLM container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/vllm?version=25.09-py3).
+
+
 #### Using Budget Control with a vLLM Server
 
 The thinking budget allows developers to keep accuracy high and meet response-time targets \- which is especially crucial for customer support, autonomous agent steps, and edge devices where every millisecond counts.
````
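
The budget-control client itself is also outside this hunk. As an illustrative sketch only, assuming an OpenAI-compatible vLLM server on `localhost:8000` and using `max_tokens` as a crude stand-in for the README's thinking-budget mechanism (neither detail comes from this change), a latency-bounded request might look like:

```python
# Hedged sketch: endpoint, model name, and the use of max_tokens as a "thinking budget"
# are assumptions for illustration, not the README's actual budget-control client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

THINKING_BUDGET = 512  # assumed token cap to keep worst-case latency predictable

response = client.chat.completions.create(
    model="nvidia/NVIDIA-Nemotron-Nano-9B-v2",  # placeholder model name
    messages=[{"role": "user", "content": "Classify this support ticket: 'My order never arrived.'"}],
    max_tokens=THINKING_BUDGET,
    temperature=0.6,
)
print(response.choices[0].message.content)
```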