Daily Papers

by AK and the research community

May 22

Submitted by

hyungjoochae

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

·
21 authors

3

Submitted by

ChenMnZ

Scaling Law for Quantization-Aware Training

·
11 authors

1

Submitted by

Lingaaaaaaa

MMaDA: Multimodal Large Diffusion Language Models

·
7 authors

2

Submitted by

MingxingLi

UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning

·
8 authors

4

Submitted by

siyue

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

·
6 authors

1

Submitted by

zizi-0123

Efficient Agent Training for Computer Use

·
3 authors

1

Submitted by

Emaad

This Time is Different: An Observability Perspective on Time Series Foundation Models

·
17 authors

Submitted by

PeterV09

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

·
8 authors

1

Submitted by

Amanda2023

When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning

·
9 authors

1

Submitted by

knightnemo

Vid2World: Crafting Video Diffusion Models to Interactive World Models

·
5 authors

1

Submitted by

Snyhlxde

lmgame-Bench: How Good are LLMs at Playing Games?

·
9 authors

2

Submitted by

xw-eric

Constructing a 3D Town from a Single Image

·
5 authors

Submitted by

JamesMile

Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs

·
11 authors

Submitted by

afdsafas

IA-T2I: Internet-Augmented Text-to-Image Generation

·
6 authors

Submitted by

TongZheng1999

Learning to Reason via Mixture-of-Thought for Logical Reasoning

·
5 authors

1

Submitted by

nonstopfor

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

·
11 authors

Submitted by

horseee

dKV-Cache: The Cache for Diffusion Language Models

·
4 authors

Submitted by

nonstopfor

Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!

·
6 authors

Submitted by

yangjunxiao2021

BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs

·
12 authors

1

Submitted by

manchery

RLVR-World: Training World Models with Reinforcement Learning

·
4 authors

1

Submitted by

xw-eric

Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space

·
8 authors

Submitted by

sinwang

ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning

·
5 authors

Submitted by

yzhuang

Text Generation Beyond Discrete Token Sampling

·
5 authors

Submitted by

IvanTang

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

·
17 authors

1

Submitted by

Ziruibest

Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs

·
9 authors

Submitted by

yanyc

VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models

·
12 authors

Submitted by

huangsiteng

VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL

·
7 authors

Submitted by

Ziruibest

Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models

·
12 authors

Submitted by

Mellen

PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration

·
3 authors

Submitted by

sunshinekevin

RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning

·
6 authors

1

Submitted by

craigwu

Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM

·
3 authors

Submitted by

zxbsmk

WebNovelBench: Placing LLM Novelists on the Web Novel Distribution

·
3 authors

Submitted by

pittawat

Prior Prompt Engineering for Reinforcement Fine-Tuning

·
4 authors

1

Submitted by

bytehxf

DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling

·
6 authors

Submitted by

hisoka94

Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach

·
5 authors

Submitted by

ernlavr

MultiHal: Multilingual Dataset for Knowledge-Graph Grounded Evaluation of LLM Hallucinations

·
4 authors

1

Submitted by

shainaraza

HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation

·
8 authors

1

Submitted by

yapeichang

BLEUBERI: BLEU is a surprisingly effective reward for instruction following

·
7 authors

Submitted by

Fengzhuo

BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms

·
9 authors

Submitted by

shivamag99

The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

·
5 authors

Submitted by

ishikaa

Language Specific Knowledge: Do Models Know Better in X than in English?

·
3 authors

Submitted by

NathanRoll

In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties

·
6 authors