- lusxvr/nanoVLM-222M
  Image-Text-to-Text • Updated • 5.3k • 83
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 31
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 89
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 86
Juan Rafael Paulino
JuanRafap
AI & ML interests
None yet
Recent Activity
updated a collection about 21 hours ago: Interés
updated a collection about 21 hours ago: Models
updated a collection 1 day ago: Models
Organizations
None yet
Collections 5
models 0
None public yet
datasets 0
None public yet