SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
Abstract
SuperWriter-Agent enhances long-form text generation by integrating structured planning and refinement, achieving top performance with a 7B model and hierarchical Direct Preference Optimization.
Long-form text generation remains a significant challenge for large language models (LLMs), particularly in maintaining coherence, ensuring logical consistency, and preserving text quality as sequence length increases. To address these limitations, we propose SuperWriter-Agent, an agent-based framework designed to enhance the quality and consistency of long-form text generation. SuperWriter-Agent introduces explicit structured thinking through planning and refinement stages into the generation pipeline, guiding the model to follow a more deliberate and cognitively grounded process akin to that of a professional writer. Based on this framework, we construct a supervised fine-tuning dataset to train SuperWriter-LM, a 7B model. We further develop a hierarchical Direct Preference Optimization (DPO) procedure that uses Monte Carlo Tree Search (MCTS) to propagate final quality assessments back through the generation tree and optimize each generation step accordingly. Empirical results across diverse benchmarks demonstrate that SuperWriter-LM achieves state-of-the-art performance, surpassing even larger-scale baselines in both automatic and human evaluation. Comprehensive ablation studies further confirm the effectiveness of hierarchical DPO and underscore the value of incorporating structured thinking steps to improve the quality of long-form text generation.
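The abstract describes propagating a final quality assessment back through intermediate generation steps via MCTS, then optimizing each step with DPO. The paper does not give implementation details here, so the following is only a minimal sketch of the general idea: a tree of generation steps whose node values are updated by MCTS-style backpropagation of a final score, with sibling steps of differing value turned into (chosen, rejected) pairs for step-level preference training. All names (`StepNode`, `backpropagate`, `preference_pairs`) and the margin heuristic are illustrative assumptions, not the authors' actual method.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class StepNode:
    """One generation step (e.g. a plan, draft, or refinement) in the search tree.
    (Hypothetical structure, not from the paper.)"""
    text: str
    parent: Optional["StepNode"] = None
    children: List["StepNode"] = field(default_factory=list)
    value_sum: float = 0.0   # accumulated quality from completed rollouts
    visits: int = 0

    def value(self) -> float:
        """Mean quality of all completed texts that passed through this step."""
        return self.value_sum / self.visits if self.visits else 0.0

def backpropagate(leaf: StepNode, final_quality: float) -> None:
    """MCTS-style credit assignment: push the final text's quality score
    up the path so intermediate steps inherit credit for the outcome."""
    node: Optional[StepNode] = leaf
    while node is not None:
        node.value_sum += final_quality
        node.visits += 1
        node = node.parent

def preference_pairs(parent: StepNode, margin: float = 0.1) -> List[Tuple[str, str]]:
    """Turn sibling steps whose propagated values differ by at least `margin`
    into (chosen, rejected) pairs, usable as step-level DPO training data."""
    pairs: List[Tuple[str, str]] = []
    for better in parent.children:
        for worse in parent.children:
            if better.value() - worse.value() >= margin:
                pairs.append((better.text, worse.text))
    return pairs
```

For example, two alternative drafts expanded from the same plan node would each be completed and scored; after backpropagation, the higher-valued draft becomes the "chosen" side of a DPO pair against its lower-valued sibling.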
Community
This is an automated message from Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- XtraGPT: LLMs for Human-AI Collaboration on Controllable Academic Paper Revision (2025)
- Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning (2025)
- Writing Like the Best: Exemplar-Based Expository Text Generation (2025)
- LongMagpie: A Self-synthesis Method for Generating Large-scale Long-context Instructions (2025)
- GreenMind: A Next-Generation Vietnamese Large Language Model for Structured and Logical Reasoning (2025)
- RECAST: Strengthening LLMs' Complex Instruction Following with Constraint-Verifiable Data (2025)
- The Price of Format: Diversity Collapse in LLMs (2025)