LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation
Abstract
LegalSearchLM outperforms existing models at retrieving relevant legal cases by reasoning over the legal elements of the query case and generating content grounded in the target cases, as demonstrated on LEGAR BENCH, a large-scale Korean LCR benchmark.
Legal Case Retrieval (LCR), which retrieves cases relevant to a query case, is a fundamental task for legal professionals in research and decision-making. However, existing studies on LCR face two major limitations. First, they are evaluated on relatively small retrieval corpora (e.g., 100 to 55K cases) and use a narrow range of criminal query types, which does not sufficiently reflect the complexity of real-world legal retrieval scenarios. Second, their reliance on embedding-based or lexical matching methods often results in limited representations and legally irrelevant matches. To address these issues, we present: (1) LEGAR BENCH, the first large-scale Korean LCR benchmark, covering 411 diverse crime types in queries over 1.2M legal cases; and (2) LegalSearchLM, a retrieval model that performs legal element reasoning over the query case and directly generates content grounded in the target cases through constrained decoding. Experimental results show that LegalSearchLM outperforms baselines by 6-20% on LEGAR BENCH, achieving state-of-the-art performance. It also generalizes strongly to out-of-domain cases, outperforming naive generative models trained on in-domain data by 15%.
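The abstract does not spell out how constrained decoding is implemented. As a rough illustration of the general idea only, the Python sketch below restricts generation to token sequences that actually occur in a corpus by walking a prefix trie. Everything in it is an assumption made for illustration: the helper names (`build_trie`, `constrained_greedy_decode`, `overlap_scorer`), the `<eos>` marker, the toy corpus, and the word-overlap scorer, which stands in for a language model's next-token scores. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-sequence marker (assumption)


def build_trie(sequences: Sequence[Sequence[str]]) -> Dict:
    """Build a prefix trie over token sequences taken from corpus cases."""
    root: Dict = {}
    for seq in sequences:
        node = root
        for tok in list(seq) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    trie: Dict,
    score: Callable[[List[str], str], float],
    max_len: int = 32,
) -> List[str]:
    """Greedily pick the best-scoring token among those the trie allows."""
    node, out = trie, []
    for _ in range(max_len):
        allowed = sorted(node)  # sorted for deterministic tie-breaking
        if not allowed:
            break
        best = max(allowed, key=lambda tok: score(out, tok))
        if best == END:
            break
        out.append(best)
        node = node[best]
    return out


def overlap_scorer(query_tokens: Sequence[str]) -> Callable[[List[str], str], float]:
    """Toy stand-in for model scores: 1.0 if the token appears in the query."""
    vocab = set(query_tokens)
    return lambda prefix, tok: 1.0 if tok in vocab else 0.0


# Toy corpus of "legal element" phrases (made up for this example).
corpus_elements = [
    ["theft", "of", "vehicle"],
    ["theft", "by", "deception"],
    ["assault", "with", "weapon"],
]
query_tokens = ["the", "defendant", "committed", "theft", "of", "a", "vehicle"]

trie = build_trie(corpus_elements)
result = constrained_greedy_decode(trie, overlap_scorer(query_tokens))
print(result)
```

Because each step chooses only among continuations present in the trie, the output is always a sequence taken from the indexed corpus (or a prefix of one, if `max_len` is reached), which is the sense in which generation is grounded in the target cases. A practical system would replace the toy scorer with model logits and would typically use beam search rather than greedy selection.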
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- GuRE: Generative Query REwriter for Legal Passage Retrieval (2025)
- LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval (2025)
- UQLegalAI@COLIEE2025: Advancing Legal Case Retrieval with Large Language Models and Graph Neural Networks (2025)
- Incorporating Legal Structure in Retrieval-Augmented Generation: A Case Study on Copyright Fair Use (2025)
- Improving the Accuracy and Efficiency of Legal Document Tagging with Large Language Models and Instruction Prompts (2025)
- Legal Rule Induction: Towards Generalizable Principle Discovery from Analogous Judicial Precedents (2025)