Library - a JuanRafap Collection

Library

updated 4 days ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 2.21k • 89
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 32
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published 30 days ago • 94
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 87
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published 30 days ago • 131
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Paper • 2505.24298 • Published 30 days ago • 25
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

Paper • 2505.20355 • Published May 26 • 36
Interleaved Reasoning for Large Language Models via Reinforcement Learning

Paper • 2505.19640 • Published May 26 • 13
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow

Paper • 2505.17399 • Published May 23 • 14
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26 • 42
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23 • 59
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20 • 61
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning

Paper • 2505.16410 • Published May 22 • 56
JULI: Jailbreak Large Language Models by Self-Introspection

Paper • 2505.11790 • Published May 17
Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Paper • 2505.13438 • Published May 19 • 35
Reward Reasoning Model

Paper • 2505.14674 • Published May 20 • 35
RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 77
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

Paper • 2505.12504 • Published May 18 • 23
Neuro-Symbolic Query Compiler

Paper • 2505.11932 • Published May 17 • 16
Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models

Paper • 2506.01320 • Published 27 days ago • 16
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published 24 days ago • 25
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Paper • 2506.00070 • Published about 1 month ago • 28
A Controllable Examination for Long-Context Language Models

Paper • 2506.02921 • Published 26 days ago • 32
MotionSight: Boosting Fine-Grained Motion Understanding in Multimodal LLMs

Paper • 2506.01674 • Published 27 days ago • 27
CodeContests+: High-Quality Test Case Generation for Competitive Programming

Paper • 2506.05817 • Published 23 days ago • 8
FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion

Paper • 2506.01111 • Published 28 days ago • 29
Reinforcement Pre-Training

Paper • 2506.08007 • Published 20 days ago • 234
GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior

Paper • 2506.08012 • Published 20 days ago • 7
Dreamland: Controllable World Creation with Simulator and Generative Models

Paper • 2506.08006 • Published 20 days ago • 7
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance

Paper • 2506.06444 • Published 23 days ago • 72
BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation

Paper • 2506.07530 • Published 20 days ago • 18
Solving Inequality Proofs with Large Language Models

Paper • 2506.07927 • Published 20 days ago • 20
Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better

Paper • 2506.09040 • Published 19 days ago • 34
Through the Valley: Path to Effective Long CoT Training for Small Language Models

Paper • 2506.07712 • Published 20 days ago • 18
Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework

Paper • 2506.02454 • Published 26 days ago • 5
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation

Paper • 2506.04614 • Published 24 days ago • 16
Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

Paper • 2506.06205 • Published 23 days ago • 28
Magistral

Paper • 2506.10910 • Published 17 days ago • 60
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team

Paper • 2506.14234 • Published 12 days ago • 38
Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers

Paper • 2506.14702 • Published 12 days ago • 3
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Paper • 2506.06962 • Published 21 days ago • 28
DoTA-RAG: Dynamic of Thought Aggregation RAG

Paper • 2506.12571 • Published 15 days ago • 47
syftr: Pareto-Optimal Generative AI

Paper • 2505.20266 • Published May 26
Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published 14 days ago • 58
LoRA-Edit: Controllable First-Frame-Guided Video Editing via Mask-Aware LoRA Fine-Tuning

Paper • 2506.10082 • Published 18 days ago • 8
General-Reasoner: Advancing LLM Reasoning Across All Domains

Paper • 2505.14652 • Published May 20 • 22
Optimizing Length Compression in Large Reasoning Models

Paper • 2506.14755 • Published 12 days ago • 10
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation

Paper • 2506.17202 • Published 9 days ago • 9
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs

Paper • 2506.18896 • Published 6 days ago • 26
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published 10 days ago • 79
