ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning Paper • 2506.09513 • Published 16 days ago • 92
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published 19 days ago • 105
Evaluation Suite for Hallucination of Multilingual LLMs Collection Datasets for the paper "Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations" • 4 items • Updated 22 days ago • 2
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts Paper • 2504.21117 • Published Apr 29 • 25
Article MIEB: The Benchmark That Stress-Tests Image-Text Embeddings Like Never Before By isaacchung and 2 others • Apr 24 • 14