Daily Papers

by AK and the research community

Apr 3

Submitted by

zbhpku

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Peking University

Submitted by

LZXzju

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

10 authors

Submitted by

yxl66666

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

37 authors

Submitted by

wangzx1994

Generative World Renderer

ShandaAI

Shanda AI Research Tokyo

Submitted by

wuzhi-hao

EgoSim: Egocentric World Simulator for Embodied Interaction Generation

8 authors

Submitted by

Huaxiu

Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

UNC-ChapelHill

University of North Carolina at Chapel Hill

Submitted by

orres

LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model

7 authors

Submitted by

owl10

UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving

Huster

Huazhong University of Science and Technology

Submitted by

chengtim

VOID: Video Object and Interaction Deletion

netflix

Submitted by

junhao910323

GPA: Learning GUI Process Automation from Demonstrations

Salesforce

Submitted by

marinero4972

VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification

PKU

Peking University

Submitted by

dominoer

FlowSlider: Training-Free Continuous Image Editing via Fidelity-Steering Decomposition

3 authors

Submitted by

Yuanshi

Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers

National University of Singapore

Submitted by

chongjie

Omni123: Exploring 3D Native Foundation Models with Limited 3D Data by Unifying Text to 2D and 3D Generation

7 authors

Submitted by

Wonjoon-Jin

DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data

POSTECH

Pohang University of Science and Technology

Submitted by

vardaan123

Automatic Image-Level Morphological Trait Annotation for Organismal Images

osunlp

Submitted by

taesiri

Apriel-Reasoner: RL Post-Training for General-Purpose and Efficient Reasoning

ServiceNow

Submitted by

Razvan27

Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

AISE research lab at TU Delft

Submitted by

zhensuuu

Executing as You Generate: Hiding Execution Latency in LLM Code Generation

Singapore Management University

Submitted by

patrickamadeus

LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation

MBZUAI

Mohamed Bin Zayed University of Artificial Intelligence

Submitted by

Yuheng02

UniRecGen: Unifying Multi-View 3D Reconstruction and Generation

13 authors

Submitted by

Aratako

T5Gemma-TTS Technical Report

2 authors

Submitted by

taesiri

Woosh: A Sound Effects Foundation Model

Sony

Submitted by

tux

Friends and Grandmothers in Silico: Localizing Entity Cells in Language Models

4 authors