Reranking-based Generation for Unbiased Perspective Summarization
Abstract
Reranking and preference tuning improve the quality of LLM-generated perspective summaries, as measured by language model-based metrics, which prove more reliable than traditional ones.
Generating unbiased summaries in real-world settings such as political perspective summarization remains a crucial application of Large Language Models (LLMs). Yet existing evaluation frameworks rely on traditional metrics to measure key attributes such as coverage and faithfulness without verifying their applicability, and efforts to develop improved summarizers are still nascent. We address these gaps by (1) identifying reliable metrics for measuring perspective summary quality, and (2) investigating the efficacy of LLM-based methods beyond zero-shot inference. Specifically, we build a test set for benchmarking metric reliability using human annotations and show that traditional metrics underperform compared to language model-based metrics, which prove to be strong evaluators. Using these metrics, we show that reranking-based methods yield strong results, and that preference tuning with synthetically generated, reranking-labeled data further boosts performance. We hope these findings contribute to the reliable evaluation and development of perspective summarization methods.
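To make the reranking idea concrete, the sketch below samples several candidate perspective summaries and keeps the one rated highest by an LLM-based evaluator. This is a minimal illustration, not the paper's implementation: the `generate` and `judge` callables are hypothetical placeholders standing in for whatever summarizer and LLM-based metric (e.g., a judge scoring coverage and faithfulness) one plugs in.

```python
# Minimal sketch of reranking-based generation for perspective summarization.
# Assumptions: `generate` samples one candidate summary for a given perspective,
# and `judge` returns a scalar quality score (higher is better). Both are
# user-supplied and do not reflect the paper's exact prompts or models.
from typing import Callable, List, Tuple


def rerank_generate(
    source_docs: List[str],
    perspective: str,
    generate: Callable[[List[str], str], str],
    judge: Callable[[List[str], str, str], float],
    num_candidates: int = 8,
) -> Tuple[str, float]:
    """Sample several candidate summaries and keep the highest-scoring one."""
    best_summary, best_score = "", float("-inf")
    for _ in range(num_candidates):
        candidate = generate(source_docs, perspective)
        score = judge(source_docs, perspective, candidate)
        if score > best_score:
            best_summary, best_score = candidate, score
    return best_summary, best_score
```

The same judge scores could, in principle, label preferred versus rejected candidate pairs to build synthetic preference-tuning data, as the abstract describes; the evaluator prompts and training setup used in the paper are specified there, not here.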
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API.
- Towards Multi-dimensional Evaluation of LLM Summarization across Domains and Languages (2025)
- StrucSum: Graph-Structured Reasoning for Long Document Extractive Summarization with LLMs (2025)
- Principled Content Selection to Generate Diverse and Personalized Multi-Document Summaries (2025)
- Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts (2025)
- An Empirical Study of Many-to-Many Summarization with Large Language Models (2025)
- An Empirical Study of Evaluating Long-form Question Answering (2025)
- Evaluation Should Not Ignore Variation: On the Impact of Reference Set Choice on Summarization Metrics (2025)