From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval Paper • 2505.23059 • Published 8 days ago • 13
Retrieval Sources Collection Retrieval sources for retrieval-augmented code generation. • 6 items • Updated Jun 2, 2024 • 5
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 862
Article Formatting Datasets for Chat Template Compatibility By nroggendorff • Jun 28, 2024 • 8
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Paper • 2402.14740 • Published Feb 22, 2024 • 13
HARP: Hesitation-Aware Reframing in Transformer Inference Pass Paper • 2412.07282 • Published Dec 10, 2024 • 4