update readme
README.md

---
library_name: peft
license: apache-2.0
base_model: unsloth/SmolLM2-360M-Instruct
tags:
- unsloth
- trl
---

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pesi/SmolLM2-360M-Instruct-TaiwanChat_CLOUD/runs/9fnxruem)

# SmolLM2-360M-Instruct-TaiwanChat

This model is a fine-tuned version of [unsloth/SmolLM2-360M-Instruct](https://huggingface.co/unsloth/SmolLM2-360M-Instruct) on the TaiwanChat dataset, using Unsloth's 4-bit quantization and LoRA adapters for efficient instruction following in Traditional Chinese.

## Installation

```bash
pip install -r requirements.txt
```

## Requirements

* **Python**: 3.8 or higher
* **CUDA**: 11.0 or higher (for GPU support)
* See [requirements.txt](requirements.txt) for exact package versions.

## Model description

* **Base**: SmolLM2-360M-Instruct (360M parameters)
* **Quantization**: 4-bit weight quantization (activations kept in full precision)
* **Adapters**: LoRA with rank `r=16`, alpha `α=16`, dropout `0.0`, applied to the projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`)
* **Dataset**: TaiwanChat (`yentinglin/TaiwanChat`), 600k filtered examples with a maximum length of 512, streamed and deduplicated, then split 90% train / 10% validation
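
For reference, here is a minimal sketch of how the settings above map onto Unsloth's `FastLanguageModel` API. It is an illustration assuming defaults for anything the card does not list, not the actual training script:

```python
from unsloth import FastLanguageModel

# Load the base model with 4-bit weight quantization.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/SmolLM2-360M-Instruct",
    max_seq_length=512,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank, alpha, dropout, and target modules listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```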

## Intended uses & limitations

**Intended uses:**

* Conversational AI and chatbots handling Traditional Chinese queries (e.g., weather questions, FAQs).
* Instruction following in a dialogue format.

**Limitations:**

* The model's small capacity may cause occasional hallucinations or vague answers.
* Performance was measured on a 10% hold-out split; discrepancies with real-world data may reduce quality.
* 4-bit quantization and adapter-based tuning trade some accuracy for efficiency.

## Training procedure

1. **Data preparation**

   * Streamed 600k examples from the Hugging Face dataset, filtered them to `max_len=512`, cleaned assistant markers with a regex, then shuffled and split them with `Dataset.train_test_split(test_size=0.1)` (see the data-preparation sketch after this list).

2. **Model & training setup**

   * Loaded the base model with `FastLanguageModel.from_pretrained(..., load_in_4bit=True, full_finetuning=False)`.
   * Applied LoRA adapters via `FastLanguageModel.get_peft_model(...)`, as sketched under "Model description".
   * Used a `LoggingSFTTrainer` subclass to catch empty-label and NaN-loss cases during evaluation.

3. **Hyperparameters** (see the trainer sketch after this list)

   | Parameter                        | Value              |
   | -------------------------------- | -----------------: |
   | `num_train_epochs`               | 3                  |
   | `per_device_train_batch_size`    | 40                 |
   | `gradient_accumulation_steps`    | 1                  |
   | `per_device_eval_batch_size`     | 1                  |
   | `learning_rate`                  | 2e-4               |
   | `weight_decay`                   | 0.01               |
   | `warmup_steps`                   | 500                |
   | `max_seq_length`                 | 512                |
   | `evaluation_strategy`            | steps (every 100)  |
   | `eval_steps`                     | 100                |
   | `save_strategy`                  | steps (every 1000) |
   | `logging_steps`                  | 50                 |
   | `optimizer`                      | adamw_8bit         |
   | `gradient_checkpointing`         | false              |
   | `seed`                           | 3407               |
   | `EarlyStoppingCallback` patience | 4 evals            |

4. **Training & push**

   * Ran `trainer.train()`, merged the LoRA weights, then pushed the merged 16-bit model to `Luigi/SmolLM2-360M-Instruct-TaiwanChat` on Hugging Face via `model.push_to_hub_merged()` (sketched at the end of the trainer example below).
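
The data-preparation step might look roughly like the following. This is a hedged sketch: the `text` field name, the marker-cleaning regex, and the character-based length filter are all assumptions, since the card does not show the actual preprocessing code:

```python
import re
from datasets import Dataset, load_dataset

MAX_LEN = 512   # from the card
SEED = 3407     # training seed from the card, assumed reused for the split

# Stream the dataset so the 600k examples never need to fit in memory at once.
stream = load_dataset("yentinglin/TaiwanChat", split="train", streaming=True)

# Hypothetical assistant-marker pattern; the real regex is not shown on the card.
marker_re = re.compile(r"ASSISTANT:\s*")

texts, seen = [], set()
for example in stream:
    if len(texts) >= 600_000:
        break
    text = marker_re.sub("", str(example.get("text", "")))  # field name assumed
    if not text or len(text) > MAX_LEN or text in seen:     # filter + deduplicate
        continue
    seen.add(text)
    texts.append(text)

# Materialize, then shuffle and split 90/10 as the card describes.
dataset = Dataset.from_dict({"text": texts})
splits = dataset.train_test_split(test_size=0.1, shuffle=True, seed=SEED)
train_ds, eval_ds = splits["train"], splits["test"]
```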
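
Likewise, the hyperparameter table translates to a TRL trainer configuration roughly as below, using the stock `SFTTrainer` in place of the card's `LoggingSFTTrainer` subclass (whose source is not shown) and a hypothetical `output_dir`:

```python
from transformers import EarlyStoppingCallback
from trl import SFTConfig, SFTTrainer

config = SFTConfig(
    output_dir="outputs",            # hypothetical path
    max_seq_length=512,
    num_train_epochs=3,
    per_device_train_batch_size=40,
    gradient_accumulation_steps=1,
    per_device_eval_batch_size=1,
    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_steps=500,
    eval_strategy="steps",           # listed as evaluation_strategy in the table
    eval_steps=100,
    save_strategy="steps",
    save_steps=1000,
    logging_steps=50,
    optim="adamw_8bit",
    gradient_checkpointing=False,
    seed=3407,
    load_best_model_at_end=True,     # required by EarlyStoppingCallback
)

trainer = SFTTrainer(
    model=model,                     # from the Model description sketch
    processing_class=tokenizer,
    train_dataset=train_ds,          # from the data-preparation sketch
    eval_dataset=eval_ds,
    args=config,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=4)],
)
trainer.train()

# Merge the LoRA weights and push the 16-bit model, as described in step 4.
model.push_to_hub_merged(
    "Luigi/SmolLM2-360M-Instruct-TaiwanChat", tokenizer, save_method="merged_16bit"
)
```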

## Example inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model (the LoRA weights are already merged into the base,
# so it loads as a plain causal LM rather than a PEFT adapter).
tokenizer = AutoTokenizer.from_pretrained("Luigi/SmolLM2-360M-Instruct-TaiwanChat")
model = AutoModelForCausalLM.from_pretrained(
    "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
    torch_dtype=torch.float16,
).eval().to("cuda")

# Query: "What's the weather like in Taipei today?"
test_prompt = "請問台北今天的天氣如何?"
inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
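
The snippet above feeds the raw prompt straight to the model. Since the base model is instruction-tuned, wrapping the query with the tokenizer's chat template (assuming the merged checkpoint keeps the SmolLM2-Instruct template) usually yields better-formed replies:

```python
# Format the query as a chat turn before generating.
messages = [{"role": "user", "content": test_prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```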

## Framework versions

```text
apex==0.1
bitsandbytes==0.45.5
datasets==3.2.0
flash_attn==2.7.3
hatchet==1.4.0
importlib_metadata==8.6.1
lit==18.1.8
matplotlib==3.10.3
numpy==2.2.5
packaging==25.0
pandas==2.2.3
psutil==6.1.1
pybind11==2.13.6
pytest==8.1.1
redis==6.0.0
scipy==1.15.3
setuptools==79.0.0
Sphinx==8.2.3
sphinx_gallery==0.19.0
sphinx_rtd_theme==3.0.2
tabulate==0.9.0
torch==2.7.0a0+ecf3bae40a.nv25.2
transformers==4.47.1
trl==0.15.2
unsloth==2025.4.1
unsloth_zoo==2025.4.2
vllm==0.8.5.post1
wheel==0.45.1
```