Dataset?

#1
opened by 0xbitches

Hello, thanks for releasing the weights. Are there plans to open source the dataset used for cDPO as well?

Also, which SFT dataset was used?

Owner

The SFT data is 10k samples from LDJnr/Capybara plus 10k samples generated by various models.
The DPO data is 5k samples generated using the model itself.

What prompt did you use to generate the DPO dataset samples?

DPO is 5k generated using the model itself

Did you manually validate the pairs generated by the model?
