shawhin committed on
Commit d29d2f6 · verified · 1 Parent(s): 3e198d0

Update README.md

Files changed (1)
  1. README.md +1 -4
README.md CHANGED
@@ -11,16 +11,13 @@ licence: license

  # Model Card for Qwen2.5-0.5B-DPO

- Fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) to generate YouTube titles based on my preferences.
+ Fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) to generate YouTube titles based on my preferences. It was trained using [TRL](https://github.com/huggingface/trl).

  Video link: coming soon! <br>
  [Blog link](https://shawhin.medium.com/fine-tuning-llms-on-human-feedback-rlhf-dpo-1c693dbc4cbf) <br>
  [GitHub Repo](https://github.com/ShawhinT/YouTube-Blog/tree/main/LLMs/dpo) <br>
  [Training Dataset](https://huggingface.co/datasets/shawhin/youtube-titles-dpo)

- This model is a .
- It has been trained using [TRL](https://github.com/huggingface/trl).
-
  ## Quick start

  ```python