Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 133
view article Article Supercharge your OCR Pipelines with Open Models +5 merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 314
view article Article How Financial News Can Be Used to Train Good Financial Models SelmaNajih001 • Oct 8, 2025 • 10
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 310
view article Article Dingo Tested It for You: Nano Banana Delivers Strong Results chupei • Oct 3, 2025 • 1
view article Article Introducing RTEB: A New Standard for Retrieval Evaluation +4 fzliu, KennethEnevoldsen, Samoed, isaacchung, tomaarsen, fzoll • Oct 1, 2025 • 144
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published Sep 26, 2025 • 164
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos Paper • 2506.10857 • Published Jun 12, 2025 • 30
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations Paper • 2412.07626 • Published Dec 10, 2024 • 30
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 161