Update README.md
README.md
CHANGED
@@ -24,6 +24,8 @@ should probably proofread and complete it, then remove this comment. -->
 # phi2-lora-quantized-distilabel-intel-orca-dpo-pairs
 
 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on [distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs).
+The full training notebook can be found [here](https://colab.research.google.com/drive/1PGMj7jlkJaCiSNNihA2NtpILsRgkRXrJ?usp=sharing).
+
 It achieves the following results on the evaluation set:
 - Loss: 0.0972
 - Rewards/chosen: 0.2699