Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models Paper • 2506.06751 • Published 5 days ago
FIRE: Fact-checking with Iterative Retrieval and Verification Paper • 2411.00784 • Published Oct 17, 2024
FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning Paper • 2506.02515 • Published 9 days ago