Dataset?
#1, opened by 0xbitches
Hello, thanks for releasing the weights. Are there plans to open source the dataset used for cDPO as well?
Also, what SFT dataset was used?
SFT is 10k from LDJnr/Capybara and 10k generated from various models.
DPO is 5k generated using the model itself.
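For readers who want to assemble a similar SFT mix, here is a minimal sketch. It assumes the 10k rows from each source are picked by uniform random sampling, which the reply above does not confirm. The function name `build_sft_mix`, the row format, and the placeholder list sizes are all hypothetical; real rows would come from LDJnr/Capybara and from whatever generated data is available.

```python
import random

def build_sft_mix(capybara_rows, generated_rows, n_each=10_000, seed=0):
    """Sample n_each rows from each source and shuffle them together.

    Assumption: uniform random sampling; the actual selection method is unknown.
    """
    rng = random.Random(seed)
    # Guard against a source that is smaller than the requested sample size.
    picked = (
        rng.sample(capybara_rows, min(n_each, len(capybara_rows)))
        + rng.sample(generated_rows, min(n_each, len(generated_rows)))
    )
    rng.shuffle(picked)
    return picked

# Placeholder rows standing in for real conversations (sizes are arbitrary).
capybara = [{"source": "LDJnr/Capybara", "id": i} for i in range(16_000)]
generated = [{"source": "generated", "id": i} for i in range(12_000)]

mix = build_sft_mix(capybara, generated)
print(len(mix))
```

Fixing the seed keeps the sampled subset reproducible across runs.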
What prompt did you use to generate the DPO dataset samples?
> DPO is 5k generated using the model itself.
Did you manually validate the pairs generated by the model?