FinSage: A Multi-aspect RAG System for Financial Filings Question Answering • Paper • 2504.14493 • Published Apr 20
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation • Paper • 2506.14028 • Published 12 days ago • 88
DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines • Paper • 2212.10557 • Published Dec 20, 2022
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks • Paper • 2412.04626 • Published Dec 5, 2024 • 14
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction • Paper • 2503.15661 • Published Mar 19 • 2
FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering • Paper • 2412.07030 • Published Dec 9, 2024
Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA • Paper • 2505.16293 • Published May 22 • 2
Rendering-Aware Reinforcement Learning for Vector Graphics Generation • Paper • 2505.20793 • Published May 27 • 11
StarFlow: Generating Structured Workflow Outputs From Sketch Images • Paper • 2503.21889 • Published Mar 27 • 1
Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows • Paper • 2505.24189 • Published 30 days ago • 5
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts • Paper • 2505.18962 • Published May 25 • 12
Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering • Paper • 2306.09996 • Published Jun 16, 2023
Benchmarking Vision Language Models for Cultural Understanding • Paper • 2407.10920 • Published Jul 15, 2024
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding • Paper • 2306.08832 • Published Jun 15, 2023