Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning Paper • 2410.06373 • Published Oct 8, 2024 • 36
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published 2 days ago • 56
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models Paper • 2503.24235 • Published 3 days ago • 42
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 5 days ago • 65
Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features Paper • 2504.00557 • Published 3 days ago • 12