Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2504.10481

🤔 Reasoning about Reasoning

papers and articles about reasoning LLMs

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27 • 39
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published Mar 31 • 60
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published about 1 month ago • 81

ryzenai-finetuned-llms

This collection contains finetuned models that are targetted to run on Ryzen AI platforms

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published 24 days ago • 84
VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published 17 days ago • 73

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2 • 10
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published 28 days ago • 42
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published 24 days ago • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published 24 days ago • 84

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 118
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 111
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 124

IAAR-Shanghai/xVerify-8B-I

Updated 23 days ago • 123 • 4
IAAR-Shanghai/xVerify-9B-C

Updated 23 days ago • 148 • 4
IAAR-Shanghai/xVerify-0.5B-I

Updated 23 days ago • 773 • 5
IAAR-Shanghai/xVerify-3B-Ia

Updated 23 days ago • 298 • 4

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 78
When an LLM is apprehensive about its answers -- and when its uncertainty is justified

Paper • 2503.01688 • Published Mar 3 • 21
Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57
Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 48

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 124
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published 20 days ago • 119
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published 17 days ago • 80

RL+reason model

about 22 hours ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 121
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 5

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Paper • 2501.09751 • Published Jan 16 • 49
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 41
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 391
s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 120

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 31
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 27
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 102

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs