AugCon
Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity Paper • 2405.16579 • Published May 26
quanshr/Qwen-DailyM-32B-LoRA Updated Jul 16 • 4 • 1
quanshr/DailyM Viewer • Updated Jul 16 • 1k • 63 • 1
quanshr/DailyM-SFT Viewer • Updated Jul 19 • 117k • 62 • 1
DMoERM
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling Paper • 2403.01197 • Published Mar 2
quanshr/mtmc-rlhf Viewer • Updated May 10 • 21.7k • 55 • 9