VQ-Logits: Compressing the Output Bottleneck of Large Language Models via Vector Quantized Logits Paper • 2505.10202 • Published May 15
Power-Law Decay Loss for Large Language Model Finetuning: A Theory Perspective Paper • 2505.16900 • Published May 22
ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention Paper • 2505.10222 • Published May 15
Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective Paper • 2505.17997 • Published May 23
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems Paper • 2505.18943 • Published May 25
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer Paper • 2304.11818 • Published Apr 24, 2023
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis Paper • 2503.22420 • Published Mar 28
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation Paper • 2503.04872 • Published Mar 6
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface Paper • 2503.01342 • Published Mar 3