QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 87
Article TinyAgents: A Minimal Experiment with Code Agents and MCP Tools By albertvillanova • May 16 • 29
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback Paper • 2309.00267 • Published Sep 1, 2023 • 50
Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning Paper • 2504.20835 • Published Apr 29 • 1
Phi-4 (All Versions) Collection Microsoft's Phi-4 models including Reasoning + Reasoning Plus & mini. Includes Dynamic 2.0 GGUF, 4-bit & 16-bit versions. Includes Unsloth's bug fixes • 20 items • Updated 29 days ago • 71
Article ChatGPT-4o's Image Generation Capabilities and Its Wild Examples By prithivMLmods • Apr 5 • 20
Article Preference Optimization for Vision Language Models By qgallouedec and 3 others • Jul 10, 2024 • 79
GraphWiz: An Instruction-Following Language Model for Graph Problems Paper • 2402.16029 • Published Feb 25, 2024 • 3