Daily Papers

by AK and the research community

Jun 9

Submitted by

zlatamaria

Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA

·
9 authors

3

Submitted by

Shunian

FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion

·
8 authors

Submitted by

abhi1nandy2

Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs

·
3 authors

1

Submitted by

russwang

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

·
13 authors

Submitted by

dror44

Sentinel: SOTA model to protect against prompt injections

·
2 authors

1

Submitted by

chenguolin

PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

·
7 authors

2

Submitted by

DarthZhu

Is Extending Modality The Right Path Towards Omni-Modality?

·
4 authors

1

Submitted by

thomagram

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

·
10 authors

Submitted by

dcml0714

Audio-Aware Large Language Models as Judges for Speaking Styles

·
11 authors

3

Submitted by

cg1177

Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision

·
8 authors

1

Submitted by

Hoyard

3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model

·
7 authors

1

Submitted by

zhwang01

CodeContests+: High-Quality Test Case Generation for Competitive Programming

·
5 authors

1

Submitted by

EmetTheGolum

Peer-Ranked Precision: Creating a Foundational Dataset for Fine-Tuning Vision Models from DataSeeds' Annotated Imagery

·
4 authors

1

Submitted by

JohnCage

Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward

·
8 authors

Submitted by

guineapig

HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

·
3 authors

1

Submitted by

salman-abdullah

MIRIAD: Augmenting LLMs with millions of medical query-response pairs

·
10 authors

Submitted by

MauroC

Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data

·
6 authors

Submitted by

benshi34

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

·
6 authors

Submitted by

sy1998

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

·
10 authors

Submitted by

lss727

Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning

·
9 authors

Submitted by

neildlf

GuideX: Guided Synthetic Data Generation for Zero-Shot Information Extraction

·
4 authors

1

Submitted by

DhavalPatel

AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance

·
8 authors

Submitted by

scott-yjyang

Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

·
11 authors

Submitted by

totolacky

Sparsified State-Space Models are Efficient Highway Networks

·
5 authors