TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting Paper • 2503.17032 • Published 4 days ago • 16
Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation Paper • 2503.13424 • Published 8 days ago • 25
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations Paper • 2503.06273 • Published 17 days ago • 5
QE4PE: Word-level Quality Estimation for Human Post-Editing Paper • 2503.03044 • Published 21 days ago • 6
LONGCODEU: Benchmarking Long-Context Language Models on Long Code Understanding Paper • 2503.04359 • Published 19 days ago • 6
R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning Paper • 2503.05379 • Published 18 days ago • 33
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published 18 days ago • 75
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation Paper • 2503.01370 • Published 22 days ago • 12
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published 19 days ago • 66
Persian Text Datasets Collection • Collection of some good Persian datasets • 21 items • Updated 19 days ago • 3
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities Paper • 2503.03983 • Published 20 days ago • 22
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality 22 days ago • 69