FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 31
UniversalRAG: Retrieval-Augmented Generation over Multiple Corpora with Diverse Modalities and Granularities Paper • 2504.20734 • Published 19 days ago • 61
Boosting Generative Image Modeling via Joint Image-Feature Synthesis Paper • 2504.16064 • Published 26 days ago • 14
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning Paper • 2504.17192 • Published 24 days ago • 109
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published Apr 17 • 48
WORLDMEM: Long-term Consistent World Simulation with Memory Paper • 2504.12369 • Published Apr 16 • 33
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models Paper • 2504.04718 • Published Apr 7 • 41
Wan: Open and Advanced Large-Scale Video Generative Models Paper • 2503.20314 • Published Mar 26 • 51
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation Paper • 2503.09151 • Published Mar 12 • 32
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 72
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates Paper • 2503.07216 • Published Mar 10 • 32
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching Paper • 2503.05179 • Published Mar 7 • 46
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model Paper • 2502.13449 • Published Feb 19 • 46
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models Paper • 2502.12464 • Published Feb 18 • 28
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published Feb 16 • 60
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published Feb 13 • 149