---
model_name: Qwen2.5-Argunaut-1-1.5B-SFT
license: apache-2.0
datasets:
- DebateLabKIT/deepa2-conversations
- DebateLabKIT/deep-argmap-conversations
- allenai/tulu-3-sft-mixture
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- logic
- argumentation
- critical-thinking
- argument-mapping
- trl
- sft
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---

# Model Card for Qwen2.5-Argunaut-1-1.5B-SFT

🧪 _Experimental, not recommended for use in teaching._

This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct).
It has been trained using [TRL](https://github.com/huggingface/trl).

📘 [HF Blog Article](https://huggingface.co/blog/ggbetz/argunauts-phase-1)

## Quick start

```python
from transformers import pipeline

question = "Are you familiar with Argdown syntax? What's its purpose?"
# Chat-style text generation with this model
generator = pipeline("text-generation", model="DebateLabKIT/Qwen2.5-Argunaut-1-1.5B-SFT", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
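
If you want more control over decoding, the same exchange can be run without the pipeline helper. A minimal sketch using the standard `transformers` chat-template API (dtype and device settings here are assumptions, adjust to your hardware):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DebateLabKIT/Qwen2.5-Argunaut-1-1.5B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "Are you familiar with Argdown syntax? What's its purpose?"}]
# Render the conversation with the model's chat template and move it to the model's device
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```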

## Evaluation

### Chat Experience

_coming soon_

### Metrics

_coming soon_

## SFT dataset mixture

| Dataset | Weight (examples) | Weight (tokens) |
|:--------|:-----------------:|:---------------:|
| DebateLabKIT/deepa2-conversations | 25% | 49% |
| DebateLabKIT/deep-argmap-conversations | 25% | 18% |
| allenai/tulu-3-sft-mixture | 50% | 33% |
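
The exact mixing script isn't included in this card; below is a minimal sketch of how an example-level 25/25/50 mixture like the one above can be built with the 🤗 `datasets` library (split names, seed, and stopping strategy are assumptions):

```python
from datasets import interleave_datasets, load_dataset

# Load the three source datasets (assuming their "train" splits)
deepa2 = load_dataset("DebateLabKIT/deepa2-conversations", split="train")
argmap = load_dataset("DebateLabKIT/deep-argmap-conversations", split="train")
tulu = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# Sample examples with probabilities matching the example weights in the table
mixture = interleave_datasets(
    [deepa2, argmap, tulu],
    probabilities=[0.25, 0.25, 0.50],
    seed=42,
    stopping_strategy="first_exhausted",
)
```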

## Training procedure

Trained with SFT on **1M examples** for 1 epoch, with

* context length 8196
* packing (TRL implementation)

```yaml
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 32
gradient_accumulation_steps: 1
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 5.0e-6
lr_scheduler_type: cosine
warmup_ratio: 0.1
```
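
These hyperparameters map one-to-one onto TRL's `SFTConfig`. A condensed sketch of a corresponding training script (output path, precision flag, and dataset wiring are placeholders, not the original setup):

```python
from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="Qwen2.5-Argunaut-1-1.5B-SFT",      # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=5.0e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_seq_length=8196,                           # context length from above
    packing=True,                                  # TRL's sequence packing
    bf16=True,                                     # assumption: bf16 on H100
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",            # base model from the card
    args=training_args,
    train_dataset=mixture,                         # e.g. the interleaved SFT mixture sketched above
)
trainer.train()
```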

Hardware: 4 × H100 GPUs.

_This work was performed on the HoreKa supercomputer funded by the
Ministry of Science, Research and the Arts Baden-Württemberg and by
the Federal Ministry of Education and Research._

### Framework versions

- TRL: 0.14.0
- Transformers: 4.46.3
- PyTorch: 2.4.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3

## Credits

This work wouldn't be possible without all the **great contributions from the open LLM community**. Thank you! Special kudos go to

- @philschmid for his latest [fine-tuning boilerplate](https://www.philschmid.de/fine-tune-llms-in-2025)
- @lvwerra, @lewtun et al. for building and maintaining [trl](https://github.com/huggingface/trl)
- @cognitivecomputations for sharing [spectrum](https://github.com/cognitivecomputations/spectrum/tree/main)
- @allenai for releasing [tulu-3-sft-mixture](https://huggingface.co/datasets/allenai/tulu-3-sft-mixture)
- @qwen for building [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)