Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark Paper • 2504.13143 • Published Apr 17, 2025 • 8
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning Paper • 2503.06960 • Published Mar 10, 2025 • 3
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning Paper • 2312.11420 • Published Dec 18, 2023 • 2
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23, 2024 • 39
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning Paper • 2406.12742 • Published Jun 18, 2024 • 15