MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection Paper • 2404.04910 • Published Apr 7, 2024
DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models Paper • 2503.04240 • Published Mar 6
Science-T2I: Addressing Scientific Illusions in Image Synthesis Paper • 2504.13129 • Published Apr 17 • 3
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark Paper • 2504.14693 • Published Apr 20
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments Paper • 2503.08604 • Published Mar 11
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model Paper • 2505.23606 • Published May 29 • 15
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Paper • 2506.11928 • Published Jun 13 • 24
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action Paper • 2505.01583 • Published May 2 • 9
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration Paper • 2504.08591 • Published Apr 11 • 19
Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models Paper • 2405.15687 • Published May 24, 2024
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting Paper • 2503.08677 • Published Mar 11 • 29
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11 • 69
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published Feb 27 • 28
SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory Paper • 2411.11922 • Published Nov 18, 2024 • 19
Video Understanding with Large Language Models: A Survey Paper • 2312.17432 • Published Dec 29, 2023 • 3
DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks Paper • 2307.05628 • Published Jul 11, 2023 • 10
Cross Contrasting Feature Perturbation for Domain Generalization Paper • 2307.12502 • Published Jul 24, 2023