An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging Paper • 2502.09056 • Published Feb 13, 2025 • 32
LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models Paper • 2307.07889 • Published Jul 15, 2023 • 1
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models Paper • 2405.13684 • Published May 22, 2024
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models Paper • 2303.08896 • Published Mar 15, 2023 • 4
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization Paper • 2301.12307 • Published Jan 28, 2023 • 3