BLEUBERI: BLEU is a surprisingly effective reward for instruction following
Paper: arXiv:2505.11080
This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".