Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation Paper • 2506.01565 • Published Jun 2 • 3
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7 • 48
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale Paper • 2406.19280 • Published Jun 27, 2024 • 65