Luigi committed
Commit 24c7d0b · 1 Parent(s): f650378

update readme

Files changed (1):
  1. README.md +118 -29

README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  library_name: peft
  license: apache-2.0
- base_model: unsloth/SmolLM2-360M-Instruct
  tags:
  - unsloth
  - trl
@@ -19,42 +19,131 @@ should probably proofread and complete it, then remove this comment. -->
  [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pesi/SmolLM2-360M-Instruct-TaiwanChat_CLOUD/runs/9fnxruem)
  # SmolLM2-360M-Instruct-TaiwanChat

- This model is a fine-tuned version of [unsloth/SmolLM2-360M-Instruct](https://huggingface.co/unsloth/SmolLM2-360M-Instruct) on an unknown dataset.

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure

- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 3407
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 4
- - optimizer: Use adamw_8bit with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_ratio: 0.1
- - lr_scheduler_warmup_steps: 10
- - training_steps: 60
- - mixed_precision_training: Native AMP
-
- ### Framework versions
-
- - PEFT 0.14.0
- - Transformers 4.47.1
- - Pytorch 2.5.1+cu124
- - Datasets 3.2.0
- - Tokenizers 0.21.0

  ---
  library_name: peft
  license: apache-2.0
+ base_model: unsloth/SmolLM2-360M-Instruct
  tags:
  - unsloth
  - trl
 
  [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pesi/SmolLM2-360M-Instruct-TaiwanChat_CLOUD/runs/9fnxruem)
  # SmolLM2-360M-Instruct-TaiwanChat

+ This model is a fine-tuned version of [unsloth/SmolLM2-360M-Instruct](https://huggingface.co/unsloth/SmolLM2-360M-Instruct) on the TaiwanChat dataset, using Unsloth’s 4-bit quantization and LoRA adapters for efficient instruction-following in Traditional Chinese.
+
+ ## Installation
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ## Requirements
+
+ * **Python**: 3.8 or higher
+ * **CUDA**: 11.0 or higher (for GPU support)
+ * See [requirements.txt](requirements.txt) for exact package versions.

  ## Model description

+ * **Base**: SmolLM2-360M-Instruct (360M parameters)
+ * **Quantization**: 4-bit weight quantization (activations in full precision)
+ * **Adapters**: LoRA with rank `r=16`, alpha `α=16`, dropout `0.0`, applied to the projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`); a configuration sketch follows this list
+ * **Dataset**: TaiwanChat (`yentinglin/TaiwanChat`) — 600k filtered examples, max length 512, streamed and deduplicated, then split 90% train / 10% validation

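+ A minimal configuration sketch of the setup above, assuming Unsloth's `FastLanguageModel` API (the same calls referenced in the training procedure below); this is illustrative, not the exact training script:
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ # Load the base model with 4-bit weight quantization
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     "unsloth/SmolLM2-360M-Instruct",
+     max_seq_length=512,
+     load_in_4bit=True,
+     full_finetuning=False,
+ )
+
+ # Attach LoRA adapters to the projection layers listed above
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,
+     lora_alpha=16,
+     lora_dropout=0.0,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+ )
+ ```
+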
  ## Intended uses & limitations

+ **Intended uses:**
+
+ * Conversational AI and chatbots handling Traditional Chinese queries (e.g., weather questions, FAQs).
+ * Instruction-following in a dialogue format.

+ **Limitations:**

+ * Limited model capacity (360M parameters) may cause occasional hallucinations or vague answers.
+ * Performance was measured on a 10% hold-out split; distribution shift in real-world data may reduce quality.
+ * Quantization and adapter-based tuning trade off some accuracy for efficiency.

  ## Training procedure

+ 1. **Data preparation**
+
+    * Streamed 600k examples from the HF dataset, filtered to `max_len=512`, cleaned assistant markers via regex, then shuffled and split with `Dataset.train_test_split(test_size=0.1)` (see the data-preparation sketch after this list).
+
+ 2. **Model & training setup**
+
+    * Loaded the base model with `FastLanguageModel.from_pretrained(..., load_in_4bit=True, full_finetuning=False)`.
+    * Applied LoRA adapters via `FastLanguageModel.get_peft_model(...)`.
+    * Used a `LoggingSFTTrainer` subclass to catch empty-label and NaN-loss cases during evaluation.
+
+ 3. **Hyperparameters**
+
+    | Parameter                        | Value              |
+    | -------------------------------- | -----------------: |
+    | `num_train_epochs`               | 3                  |
+    | `per_device_train_batch_size`    | 40                 |
+    | `gradient_accumulation_steps`    | 1                  |
+    | `per_device_eval_batch_size`     | 1                  |
+    | `learning_rate`                  | 2e-4               |
+    | `weight_decay`                   | 0.01               |
+    | `warmup_steps`                   | 500                |
+    | `max_seq_length`                 | 512                |
+    | `evaluation_strategy`            | steps (every 100)  |
+    | `eval_steps`                     | 100                |
+    | `save_strategy`                  | steps (every 1000) |
+    | `logging_steps`                  | 50                 |
+    | `optimizer`                      | adamw_8bit         |
+    | `gradient_checkpointing`         | false              |
+    | `seed`                           | 3407               |
+    | `EarlyStoppingCallback` patience | 4 evals            |
+
+ 4. **Training & push**
+
+    * Ran `trainer.train()`, merged the LoRA weights, then pushed the merged 16-bit model to `Luigi/SmolLM2-360M-Instruct-TaiwanChat` on Hugging Face via `model.push_to_hub_merged()` (see the training sketch after this list).
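+
+ A hedged sketch of the data-preparation step. It assumes `yentinglin/TaiwanChat` exposes a `messages` list of role/content dicts; the rendering helper and regex are illustrative, and the length filter is shown in characters rather than tokens:
+
+ ```python
+ import re
+ from datasets import Dataset, load_dataset
+
+ # Stream the source dataset instead of downloading it wholesale
+ stream = load_dataset("yentinglin/TaiwanChat", split="train", streaming=True)
+
+ def render(example):
+     # Hypothetical flattening of the chat turns into one training string
+     text = "\n".join(f"{m['role']}: {m['content']}" for m in example["messages"])
+     # Clean stray assistant markers (illustrative pattern, not the exact one used)
+     return {"text": re.sub(r"\bassistant[::]\s*", "", text)}
+
+ seen, rows = set(), []
+ for ex in stream.take(600_000):
+     rec = render(ex)
+     if len(rec["text"]) <= 512 and rec["text"] not in seen:  # length filter + dedup
+         seen.add(rec["text"])
+         rows.append(rec)
+
+ # Materialize, shuffle, and split 90% train / 10% validation
+ ds = Dataset.from_list(rows).shuffle(seed=3407)
+ splits = ds.train_test_split(test_size=0.1)
+ train_ds, eval_ds = splits["train"], splits["test"]
+ ```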
+
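+ And a training sketch matching the hyperparameter table; the stock `SFTTrainer` stands in for the card's `LoggingSFTTrainer` subclass, and `output_dir` plus the best-model settings are assumptions:
+
+ ```python
+ from transformers import EarlyStoppingCallback
+ from trl import SFTConfig, SFTTrainer
+
+ args = SFTConfig(
+     output_dir="outputs",              # assumption: not stated in the card
+     dataset_text_field="text",
+     max_seq_length=512,
+     num_train_epochs=3,
+     per_device_train_batch_size=40,
+     gradient_accumulation_steps=1,
+     per_device_eval_batch_size=1,
+     learning_rate=2e-4,
+     weight_decay=0.01,
+     warmup_steps=500,
+     eval_strategy="steps",
+     eval_steps=100,
+     save_strategy="steps",
+     save_steps=1000,
+     logging_steps=50,
+     optim="adamw_8bit",
+     gradient_checkpointing=False,
+     seed=3407,
+     load_best_model_at_end=True,       # required for early stopping
+     metric_for_best_model="eval_loss",
+ )
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,
+     args=args,
+     train_dataset=train_ds,
+     eval_dataset=eval_ds,
+     callbacks=[EarlyStoppingCallback(early_stopping_patience=4)],
+ )
+ trainer.train()
+
+ # Unsloth helper: merge the LoRA adapters into the base weights and push 16-bit
+ model.push_to_hub_merged(
+     "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
+     tokenizer,
+     save_method="merged_16bit",
+ )
+ ```
+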
+ ## Example inference
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the merged model (LoRA weights are already merged, so no PEFT wrapper is needed)
+ tokenizer = AutoTokenizer.from_pretrained("Luigi/SmolLM2-360M-Instruct-TaiwanChat")
+ model = AutoModelForCausalLM.from_pretrained(
+     "Luigi/SmolLM2-360M-Instruct-TaiwanChat",
+     torch_dtype=torch.float16,
+ ).eval().to("cuda")
+
+ # Query: "How is the weather in Taipei today?"
+ test_prompt = "請問台北今天的天氣如何?"
+ inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=100,
+     do_sample=True,
+     temperature=0.8,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
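+
+ Since the base model is instruction-tuned, wrapping the query in the chat template usually beats a raw prompt; a variant under that assumption, reusing the objects above:
+
+ ```python
+ messages = [{"role": "user", "content": test_prompt}]
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,  # append the assistant-turn marker
+     return_tensors="pt",
+ ).to(model.device)
+ outputs = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.8)
+ # Decode only the newly generated tokens
+ print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```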
+
+ ## Framework versions
+
+ ```text
+ apex==0.1
+ bitsandbytes==0.45.5
+ datasets==3.2.0
+ flash_attn==2.7.3
+ hatchet==1.4.0
+ importlib_metadata==8.6.1
+ lit==18.1.8
+ matplotlib==3.10.3
+ numpy==2.2.5
+ packaging==25.0
+ pandas==2.2.3
+ psutil==6.1.1
+ pybind11==2.13.6
+ pytest==8.1.1
+ redis==6.0.0
+ scipy==1.15.3
+ setuptools==79.0.0
+ Sphinx==8.2.3
+ sphinx_gallery==0.19.0
+ sphinx_rtd_theme==3.0.2
+ tabulate==0.9.0
+ torch==2.7.0a0+ecf3bae40a.nv25.2
+ transformers==4.47.1
+ trl==0.15.2
+ unsloth==2025.4.1
+ unsloth_zoo==2025.4.2
+ vllm==0.8.5.post1
+ wheel==0.45.1
+ ```