Improve language tag
#1 by lbourdois - opened

README.md CHANGED
```diff
@@ -1,105 +1,119 @@
 ---
 model_name: Qwen2.5-Argunaut-1-1.5B-SFT
 license: apache-2.0
 datasets:
 - DebateLabKIT/deepa2-conversations
 - DebateLabKIT/deep-argmap-conversations
 - allenai/tulu-3-sft-mixture
 base_model:
 - Qwen/Qwen2.5-1.5B-Instruct
 pipeline_tag: text-generation
 library_name: transformers
 tags:
 - logic
 - argumentation
 - critical-thinking
 - argument-mapping
 - trl
 - sft
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
 ---
```

The PR adds a `language:` block with three-letter ISO 639 codes to the YAML front matter; the rest of the README, reproduced below, is unchanged.

# Model Card for Qwen2.5-Argunaut-1-1.5B-SFT

🧪 _Experimental, not recommended for use in teaching._

This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct).
It has been trained using [TRL](https://github.com/huggingface/trl).

📘 [HF Blog Article](https://huggingface.co/blog/ggbetz/argunauts-phase-1)

## Quick start

```python
from transformers import pipeline

question = "Are you familiar with Argdown syntax? What's its purpose?"
# Load this card's model and run a single chat turn
generator = pipeline("text-generation", model="DebateLabKIT/Qwen2.5-Argunaut-1-1.5B-SFT", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
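
For more control over decoding than the `pipeline` helper offers, the same chat turn can be run with `AutoModelForCausalLM` directly. This is a minimal sketch, assuming the repo ships the standard Qwen2.5 chat template; adjust dtype and device placement to your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DebateLabKIT/Qwen2.5-Argunaut-1-1.5B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Are you familiar with Argdown syntax? What's its purpose?"}]
# Render the conversation with the model's chat template, then generate
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```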

## Evaluation

### Chat Experience

_coming soon_

### Metrics

_coming soon_

## SFT dataset mixture

|Dataset|Weight (examples)|Weight (tokens)|
|:------|:----:|:----:|
|DebateLabKIT/deepa2-conversations|25%|49%|
|DebateLabKIT/deep-argmap-conversations|25%|18%|
|allenai/tulu-3-sft-mixture|50%|33%|
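
For illustration, a mixture with the example-level weights above could be assembled with 🤗 Datasets as sketched below. This is not the published training code; the split names and seed are assumptions, and `interleave_datasets` requires the three corpora to share a compatible schema.

```python
from datasets import load_dataset, interleave_datasets

# Illustrative sketch: sample the three corpora with the
# example-level weights reported in the table above.
deepa2 = load_dataset("DebateLabKIT/deepa2-conversations", split="train")
argmap = load_dataset("DebateLabKIT/deep-argmap-conversations", split="train")
tulu = load_dataset("allenai/tulu-3-sft-mixture", split="train")

mixture = interleave_datasets(
    [deepa2, argmap, tulu],
    probabilities=[0.25, 0.25, 0.50],   # 25% / 25% / 50% by examples
    seed=42,                            # assumed seed
    stopping_strategy="all_exhausted",  # keep sampling until every set is used
)
```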

## Training procedure

Trained with SFT on **1M examples** for 1 epoch with

* context length 8196
* packing (trl implementation)

```yaml
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 32
gradient_accumulation_steps: 1
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 5.0e-6
lr_scheduler_type: cosine
warmup_ratio: 0.1
```
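
As a hedged sketch (the full training script is not reproduced in this card), these parameters map onto TRL's `SFTConfig`/`SFTTrainer` roughly as follows. The output path is hypothetical, and the dataset stands in for the full mixture described above.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Stand-in for the full SFT mixture; see the dataset table above
train_dataset = load_dataset("allenai/tulu-3-sft-mixture", split="train")

config = SFTConfig(
    output_dir="qwen2.5-argunaut-1-1.5b-sft",  # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=5.0e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_seq_length=8196,  # context length reported above
    packing=True,         # TRL's packing implementation
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # the base model
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```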

Hardware: 4 x H100 GPUs.

_This work was performed on the HoreKa supercomputer funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal Ministry of Education and Research._

### Framework versions

- TRL: 0.14.0
- Transformers: 4.46.3
- Pytorch: 2.4.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3

## Credits

This work wouldn't be possible without all the **great contributions from the open LLM community**. Thank you! Special kudos go to:

- @philschmid for his latest [fine-tuning boilerplate](https://www.philschmid.de/fine-tune-llms-in-2025)
- @lvwerra, @lewtun et al. for building and maintaining [trl](https://github.com/huggingface/trl)
- @cognitivecomputations for sharing [spectrum](https://github.com/cognitivecomputations/spectrum/tree/main)
- @allenai for releasing [tulu-3-sft-mixture](https://huggingface.co/datasets/allenai/tulu-3-sft-mixture)
- @qwen for building [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)