ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models • Paper • 2505.13444 • Published 22 days ago • 16
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents • Paper • 2404.10774 • Published Apr 16, 2024 • 3
MiniCheck & LLM-AggreFact • Collection • MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents • 7 items • Updated 25 days ago • 4
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization • Paper • 2402.13249 • Published Feb 20, 2024 • 13