This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".
Yapei Chang (yapeichang)
AI & ML interests: NLP
Recent Activity
Updated a collection about 6 hours ago: BLEUBERI
Updated a dataset about 6 hours ago: yapeichang/BLEUBERI-Tulu3-50k
Updated a collection 2 days ago: BLEUBERI
Collections: 1
Models: 10
yapeichang/Qwen2.5-7B-SFT • Text Generation • 1
yapeichang/Qwen2.5-3B-SFT • Text Generation • 2
yapeichang/Qwen2.5-3B-RM8B • Text Generation • 2
yapeichang/Qwen2.5-3B-BLEUBERI • Text Generation • 3
yapeichang/Llama-3.1-8B-SFT • Text Generation • 2
yapeichang/Llama-3.1-8B-RM8B • Text Generation • 3
yapeichang/Qwen2.5-7B-RM8B • Text Generation • 1
yapeichang/Llama-3.1-8B-BLEUBERI • Text Generation • 1
yapeichang/Llama-3.1-8B • Text Generation • 10
yapeichang/Qwen2.5-7B-BLEUBERI • Text Generation • 27 • 1