@ccocks-deca on Hugging Face

📢 New Dataset: OpenSynth Battles

We've released OpenSynth Battles, a benchmark dataset featuring generations from five large language models on shared prompts. Each prompt includes:

Responses from gpt-oss-120b, deepseek-v3.1-thinking, deepseek-v3.1-instruct, moonshotai/kimi-k2-instruct, and deepseek-r1-0528

Automated scoring by gpt-oss-120b

Useful for model comparison, automated evaluation research, and prompt-level performance analysis.
The dataset ships without predefined splits (no separate train/validation/test sets).
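
For anyone wanting to try the prompt-level comparison, here is a minimal sketch of how per-model mean scores and win rates could be computed. The record layout (`prompt`, `scores`), the model names `model_a`/`model_b`, and the toy numbers are all made up for illustration; the real column names and score scale in the dataset may differ, so check the dataset card before adapting this.

```python
from collections import defaultdict

# Toy records with a HYPOTHETICAL layout; these are not rows from the dataset.
# In practice you would load the data (e.g. with the `datasets` library) and
# map its actual columns into this shape.
toy_rows = [
    {"prompt": "p1", "scores": {"model_a": 8, "model_b": 9}},
    {"prompt": "p2", "scores": {"model_a": 7, "model_b": 5}},
    {"prompt": "p3", "scores": {"model_a": 6, "model_b": 6}},
]


def summarize(rows):
    """Return mean judge score and win rate per model.

    A prompt's win is split evenly between models that tie for the top score.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    wins = defaultdict(float)

    for row in rows:
        scores = row["scores"]
        if not scores:
            continue  # skip prompts with no scored responses
        best = max(scores.values())
        winners = [m for m, s in scores.items() if s == best]
        for model, score in scores.items():
            totals[model] += score
            counts[model] += 1
        for model in winners:
            wins[model] += 1 / len(winners)

    return {
        model: {
            "mean_score": totals[model] / counts[model],
            "win_rate": wins[model] / counts[model],
        }
        for model in totals
    }


summary = summarize(toy_rows)
for model, stats in sorted(summary.items()):
    print(model, stats)
```

Splitting ties evenly is just one convention; counting a tie as a win for every tied model, or dropping tied prompts, would give different win rates.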

🔗 https://huggingface.co/datasets/ccocks-deca/open-synth-battles
