clean requirements and update README
- README.md +30 -31
- requirements.txt +11 -14
README.md
CHANGED
@@ -31,7 +31,7 @@ pip install -r requirements.txt
 
 * **Python**: 3.8 or higher
 * **CUDA**: 11.0 or higher (for GPU support)
-*
+* All other dependencies and exact versions are specified in [requirements.txt](requirements.txt).
 
 ## Model description
 
@@ -67,23 +67,23 @@ pip install -r requirements.txt
 
 3. **Hyperparameters**
 
-| Parameter
-|
-| `num_train_epochs`
-| `per_device_train_batch_size`
-| `gradient_accumulation_steps`
-| `per_device_eval_batch_size`
-| `learning_rate`
-| `weight_decay`
-| `warmup_steps`
-| `max_seq_length`
-| `evaluation_strategy`
-| `eval_steps`
-| `save_strategy`
-| `logging_steps`
-| `optimizer`
-| `gradient_checkpointing`
-| `seed`
+| Parameter | Value |
+| -------------------------------- | -----------------: |
+| `num_train_epochs` | 3 |
+| `per_device_train_batch_size` | 40 |
+| `gradient_accumulation_steps` | 1 |
+| `per_device_eval_batch_size` | 1 |
+| `learning_rate` | 2e-4 |
+| `weight_decay` | 0.01 |
+| `warmup_steps` | 500 |
+| `max_seq_length` | 512 |
+| `evaluation_strategy` | steps (every 100) |
+| `eval_steps` | 100 |
+| `save_strategy` | steps (every 1000) |
+| `logging_steps` | 50 |
+| `optimizer` | adamw\_8bit |
+| `gradient_checkpointing` | false |
+| `seed` | 3407 |
 | `EarlyStoppingCallback patience` | 4 evals |
 
 4. **Training & push**
@@ -118,32 +118,31 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ## Framework versions
 
 ```text
-apex==0.1
 bitsandbytes==0.45.5
 datasets==3.2.0
-flash_attn==2.7.3
 hatchet==1.4.0
 importlib_metadata==8.6.1
 lit==18.1.8
-matplotlib
-numpy
-packaging
-pandas
+matplotlib
+numpy
+packaging
+pandas
 psutil==6.1.1
 pybind11==2.13.6
 pytest==8.1.1
 redis==6.0.0
-scipy
-setuptools==
-Sphinx
-sphinx_gallery
-sphinx_rtd_theme
+scipy
+setuptools==70.3.0
+Sphinx
+sphinx_gallery
+sphinx_rtd_theme
 tabulate==0.9.0
-torch==2.7.
+torch==2.7.0
 transformers==4.47.1
 trl==0.15.2
 unsloth==2025.4.1
 unsloth_zoo==2025.4.2
-
+cut_cross_entropy
+wandb
 wheel==0.45.1
 ```
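The hyperparameter table in the README hunk implies a few quantities it does not list directly. The sketch below derives them with plain arithmetic. The values are copied from the table; `NUM_DEVICES`, `HPARAMS`, and `derived_schedule` are hypothetical names introduced for illustration, and the single-GPU setting is an assumption because the README does not state a device count.

```python
# Values copied from the hyperparameter table in the README diff.
HPARAMS = {
    "per_device_train_batch_size": 40,
    "gradient_accumulation_steps": 1,
    "eval_steps": 100,
    "save_steps": 1000,            # from "steps (every 1000)"
    "early_stopping_patience": 4,  # counted in evaluations, not steps
}

NUM_DEVICES = 1  # assumption: the README does not say how many GPUs were used


def derived_schedule(hp: dict, num_devices: int = NUM_DEVICES) -> dict:
    """Compute quantities implied by the table but not listed in it."""
    return {
        # Samples consumed per optimizer step across all devices.
        "effective_batch_size": (
            hp["per_device_train_batch_size"]
            * hp["gradient_accumulation_steps"]
            * num_devices
        ),
        # How many evaluations run between two saved checkpoints.
        "evals_per_checkpoint": hp["save_steps"] // hp["eval_steps"],
        # Training steps without improvement before early stopping triggers.
        "early_stop_after_steps": hp["early_stopping_patience"] * hp["eval_steps"],
    }


if __name__ == "__main__":
    for key, value in derived_schedule(HPARAMS).items():
        print(f"{key}: {value}")
```

Under these assumptions, early stopping would trigger after 400 steps without improvement, which is less than one checkpoint interval.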
requirements.txt
CHANGED
@@ -1,31 +1,28 @@
-apex==0.1
 bitsandbytes==0.45.5
 datasets==3.2.0
-flash_attn==2.7.3
 hatchet==1.4.0
-importlib_metadata==8.0.0
 importlib_metadata==8.6.1
 lit==18.1.8
-matplotlib
-numpy
-packaging
-pandas
+matplotlib
+numpy
+packaging
+pandas
 psutil==6.1.1
 pybind11==2.13.6
 pytest==8.1.1
 redis==6.0.0
-scipy
-setuptools==79.0.0
+scipy
 setuptools==70.3.0
-Sphinx
-sphinx_gallery
-sphinx_rtd_theme
+Sphinx
+sphinx_gallery
+sphinx_rtd_theme
 tabulate==0.9.0
 torch==2.7.0
-torch==2.7.0a0+ecf3bae40a.nv25.2
 transformers==4.47.1
 trl==0.15.2
 unsloth==2025.4.1
 unsloth_zoo==2025.4.2
-
+cut_cross_entropy
+wandb
 wheel==0.45.1
+
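The old requirements.txt pinned `importlib_metadata`, `setuptools`, and `torch` twice each with different versions, and this commit removes one of each pair. A check along the following lines could catch such duplicates before they are committed. It is a minimal sketch, not part of the repository: `find_conflicting_pins` is a hypothetical helper, the sample text holds only the duplicated lines from the old file, and the parser only understands plain `name==version` lines (no extras, markers, or `-r` includes).

```python
import re
from collections import defaultdict

# The duplicated pins from the old requirements.txt, plus one clean line.
OLD_REQUIREMENTS = """\
importlib_metadata==8.0.0
importlib_metadata==8.6.1
setuptools==79.0.0
setuptools==70.3.0
torch==2.7.0
torch==2.7.0a0+ecf3bae40a.nv25.2
wheel==0.45.1
"""


def find_conflicting_pins(text: str) -> dict:
    """Return {normalized_name: [versions]} for packages listed more than once."""
    pins = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        # Treat "-", "_" and "." as equivalent and ignore case in names.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        pins[key].append(version.strip() if sep else None)
    return {name: versions for name, versions in pins.items() if len(versions) > 1}


if __name__ == "__main__":
    for name, versions in find_conflicting_pins(OLD_REQUIREMENTS).items():
        print(f"{name}: {versions}")
```

Run against the new file contents, the same function should return an empty dict, since each package now appears once.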