# Enhancing Paraphrase Type Generation: The Impact of DPO and RLHF Evaluated with Human-Ranked Data
This model is a fine-tuned version of meta-llama/Llama-3.1-8B. It achieves the following results on the evaluation set:

- Loss: 0.7851
- Accuracy: 0.4925
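The paper title indicates DPO training evaluated with human-ranked data. As an illustration only (the function name, prompt, and paraphrases below are hypothetical, not taken from this card), a human ranking of candidate paraphrases can be expanded into the `(prompt, chosen, rejected)` triples that DPO-style preference trainers consume:

```python
from itertools import combinations

def ranking_to_preference_pairs(prompt, ranked_outputs):
    """Expand a human ranking (best candidate first) into pairwise
    preference records: every higher-ranked output is 'chosen' over
    every lower-ranked one."""
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": rejected}
        for chosen, rejected in combinations(ranked_outputs, 2)
    ]

# Illustrative data: one prompt with three paraphrases ranked by annotators.
pairs = ranking_to_preference_pairs(
    "Paraphrase: The cat sat on the mat.",
    [
        "The cat was sitting on the mat.",  # rank 1 (best)
        "On the mat sat the cat.",          # rank 2
        "Cat. Mat. Sitting.",               # rank 3 (worst)
    ],
)
# 3 ranked outputs -> 3 pairwise preferences
```

A ranking of *n* outputs yields *n*(*n*−1)/2 pairs, so even small human-ranked sets produce a usable number of preference examples.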
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
### Training hyperparameters

The following hyperparameters were used during training:
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 34   | 0.7886          | 0.4925   |
| No log        | 2.0   | 68   | 0.7860          | 0.4925   |
| No log        | 3.0   | 102  | 0.7851          | 0.4925   |
## Citation

```bibtex
@misc{lübbers2025enhancingparaphrasetypegeneration,
  title={Enhancing Paraphrase Type Generation: The Impact of DPO and RLHF Evaluated with Human-Ranked Data},
  author={Christopher Lee Lübbers},
  year={2025},
  eprint={2506.02018},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.02018},
}
```
Base model: meta-llama/Llama-3.1-8B