HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published 8 days ago • 57
Text2SQL is Not Enough: Unifying AI and Databases with TAG Paper • 2408.14717 • Published Aug 27 • 23
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents Paper • 2408.07199 • Published Aug 13 • 20
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications Paper • 2408.11878 • Published Aug 20 • 50
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models Paper • 2402.05935 • Published Feb 8 • 15
AIMO Progress Prize Collection Models and datasets used in the winning solution to the AIMO 1st Progress Prize • 7 items • Updated Jul 19 • 9
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published Jul 1 • 75
Plan, Generate and Complicate: Improving Low-resource Dialogue State Tracking via Easy-to-Difficult Zero-shot Data Augmentation Paper • 2406.08860 • Published Jun 13 • 1
CapS-Adapter: Caption-based MultiModal Adapter in Zero-Shot Classification Paper • 2405.16591 • Published May 26 • 1
Training-Free Unsupervised Prompt for Vision-Language Models Paper • 2404.16339 • Published Apr 25 • 1
Make It Count: Text-to-Image Generation with an Accurate Number of Objects Paper • 2406.10210 • Published Jun 14 • 76
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20 • 45
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published May 16 • 26