Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 38
Panacea: Pareto Alignment via Preference Adaptation for LLMs Paper • 2402.02030 • Published Feb 3, 2024 • 10
Open LLM Leaderboard: Track, rank and evaluate open LLMs and chatbots • Running on CPU Upgrade • 12.4k