Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities Paper • 2507.06261 • Published Jul 7, 2025
Preference-grounded Token-level Guidance for Language Model Fine-tuning Paper • 2306.00398 • Published Jun 1, 2023
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference Paper • 2402.08265 • Published Feb 13, 2024
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model Paper • 2501.02790 • Published Jan 6, 2025