glorgao committed on
Commit 4345b13 · verified · 1 Parent(s): 0611686

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -6,6 +6,6 @@ base_model:
  - princeton-nlp/Llama-3-Base-8B-SFT
  ---
 
- This model is fine-tuned from the princeton-nlp/Llama-3-Base-8B-SFT model using the SelectiveDPO algorithm on the Ultrafeedback_binarized dataset.
+ This model is fine-tuned from the princeton-nlp/Llama-3-Base-8B-SFT model using the [SelectiveDPO](https://huggingface.co/papers/2502.09650) on the Ultrafeedback_binarized dataset.
 
  For the recipe to reproduce this model, please visit our [GitHub page](https://github.com/glorgao/SelectiveDPO).