Abstract
LATTICE, a hierarchical retrieval framework, enables efficient and accurate reasoning over large document collections using a semantic tree structure and a traversal algorithm that calibrates relevance scores.
Modern IR systems are increasingly tasked with answering complex, multi-faceted queries that require deep reasoning rather than simple keyword or semantic matching. While LLM-based IR has shown great promise, the prevailing retrieve-then-rerank paradigm inherits the limitations of embedding-based retrieval; parametric generative approaches are difficult to update with new information; and long-context methods that place the entire corpus in context are computationally infeasible for large document collections. To address these challenges, we introduce LATTICE, a hierarchical retrieval framework that enables an LLM to reason over and navigate large corpora with logarithmic search complexity by imposing a semantic tree structure on the corpus. Our approach consists of two stages: (1) an offline phase that organizes the corpus into a semantic hierarchy via either a bottom-up agglomerative strategy or a top-down divisive strategy using multi-level summaries, and (2) an online traversal phase where a search LLM navigates this tree. A central challenge in such LLM-guided search is that the model's relevance judgments are noisy, context-dependent, and unaware of the hierarchy, making cross-branch and cross-level comparisons difficult. To overcome this, we propose a traversal algorithm that estimates calibrated latent relevance scores from local LLM outputs and aggregates them into a global path relevance metric. Our training-free framework achieves state-of-the-art zero-shot performance on the reasoning-intensive BRIGHT benchmark, with improvements of up to 9% in Recall@100 and 5% in nDCG@10 over the next best zero-shot baseline. Furthermore, compared to the fine-tuned SOTA method DIVER-v2, LATTICE attains comparable results on BRIGHT subsets that use a static corpus for evaluation.
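To make the online phase concrete, here is a minimal, self-contained sketch of one plausible realization: a best-first search whose frontier is ordered by a path relevance score aggregated from calibrated local LLM judgments. Everything here (the `Node` layout, the softmax calibration, the log-score aggregation, and the `llm_local_scores` stub) is an illustrative assumption, not the paper's exact formulation.

```python
import heapq
import math
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    """A node in the semantic tree: internal nodes carry summaries of
    their subtree; leaves carry individual documents."""
    summary: str
    children: list["Node"] = field(default_factory=list)
    doc_id: Optional[int] = None  # set only on leaves

def llm_local_scores(query: str, node: Node) -> list[float]:
    """Hypothetical stub for the search LLM: given a query and one node's
    children (via their summaries), return a raw relevance score per child.
    In practice these judgments are noisy and context-dependent."""
    return [1.0 / (i + 1) for i, _ in enumerate(node.children)]  # placeholder

def calibrate(raw: list[float]) -> list[float]:
    """Toy calibration: softmax-normalize raw scores so that outputs from
    different prompts become comparable probabilities over siblings."""
    exps = [math.exp(s) for s in raw]
    z = sum(exps)
    return [e / z for e in exps]

def traverse(root: Node, query: str, budget: int = 32, top_k: int = 10):
    """Best-first search ordered by global path relevance: the sum of log
    calibrated scores along the root-to-node path. heapq is a min-heap,
    so scores are negated; a counter breaks ties without comparing Nodes."""
    counter = 0
    frontier = [(0.0, counter, root)]  # (negated path log-relevance, tiebreak, node)
    results: list[tuple[int, float]] = []
    expansions = 0
    while frontier and expansions < budget:
        neg_logrel, _, node = heapq.heappop(frontier)
        if node.doc_id is not None:          # leaf: emit a candidate document
            results.append((node.doc_id, -neg_logrel))
            continue
        expansions += 1                      # one LLM call per expanded node
        for child, p in zip(node.children, calibrate(llm_local_scores(query, node))):
            counter += 1
            heapq.heappush(frontier, (neg_logrel - math.log(p + 1e-9), counter, child))
    results.sort(key=lambda r: -r[1])
    return results[:top_k]
```

Ordering the frontier by accumulated path scores, rather than by raw per-prompt LLM outputs, is what lets partially explored branches at different depths compete on a common scale.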
Community
LATTICE turns retrieval into an LLM-driven navigation problem over a semantic scaffold, providing the computational tractability needed for large corpora.
Project page: https://nilesh2797.github.io/publications/lattice/
Paper link: https://arxiv.org/abs/2510.13217
Code link: https://github.com/nilesh2797/lattice
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Hierarchical Semantic Retrieval with Cobweb (2025)
- Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering (2025)
- Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness (2025)
- MoLoRAG: Bootstrapping Document Understanding via Multi-modal Logic-aware Retrieval (2025)
- Scalable In-context Ranking with Generative Models (2025)
- Test-time Corpus Feedback: From Retrieval to RAG (2025)
- FinGEAR: Financial Mapping-Guided Enhanced Answer Retrieval (2025)