MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs Paper • 2505.24858 • Published 11 days ago • 17
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published 12 days ago • 56
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation Paper • 2212.07981 • Published Dec 15, 2022
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models Paper • 2502.19918 • Published Feb 27
Learning to Reason via Mixture-of-Thought for Logical Reasoning Paper • 2505.15817 • Published 20 days ago • 17
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective Paper • 2505.15045 • Published 21 days ago • 54
PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving Paper • 2503.21821 • Published Mar 26 • 17