shawhin/Qwen2.5-0.5B-DPO

shawhin committed on Feb 26

Commit d29d2f6 · verified · 1 Parent(s): 3e198d0

Update README.md

Files changed (1)

README.md +1 -4

@@ -11,16 +11,13 @@ licence: license
 # Model Card for Qwen2.5-0.5B-DPO
-Fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) to generate YouTube titles based on my preferences.
 Video link: coming soon! <br>
 [Blog link](https://shawhin.medium.com/fine-tuning-llms-on-human-feedback-rlhf-dpo-1c693dbc4cbf) <br>
 [GitHub Repo](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/dpo) <br>
 [Training Dataset](https://huggingface.co/datasets/shawhin/youtube-titles-dpo)
-This model is a .
-It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
 ```python

 # Model Card for Qwen2.5-0.5B-DPO
+Fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) to generate YouTube titles based on my preferences. It was trained using [TRL](https://github.com/huggingface/trl).
 Video link: coming soon! <br>
 [Blog link](https://shawhin.medium.com/fine-tuning-llms-on-human-feedback-rlhf-dpo-1c693dbc4cbf) <br>
 [GitHub Repo](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/dpo) <br>
 [Training Dataset](https://huggingface.co/datasets/shawhin/youtube-titles-dpo)
 ## Quick start
 ```python