-
LLaVA-o1: Let Vision Language Models Reason Step-by-Step
Paper • 2411.10440 • Published • 113 -
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?
Paper • 2411.06469 • Published • 17 -
Sharingan: Extract User Action Sequence from Desktop Recordings
Paper • 2411.08768 • Published • 10 -
AnimateAnything: Consistent and Controllable Animation for Video Generation
Paper • 2411.10836 • Published • 23
Collections
Discover the best community collections!
Collections including paper arxiv:2411.06469
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 58 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 52 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 42 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 54
-
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
Paper • 2501.09484 • Published • 16 -
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
Paper • 2501.07171 • Published • 45 -
Multimodal Contrastive Representation Learning in Augmented Biomedical Knowledge Graphs
Paper • 2501.01644 • Published • 1 -
Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain
Paper • 2412.20309 • Published