---
license: mit
language:
- fr
base_model:
- EuroBERT/EuroBERT-610m
pipeline_tag: token-classification
tags:
- token classification
- hallucination detection
- transformers
- question answer
datasets:
- KRLabsOrg/ragtruth-fr-translated
---

# LettuceDetect: French Hallucination Detection Model

**Model Name:** KRLabsOrg/lettucedect-610m-eurobert-fr-v1

**Organization:** KRLabsOrg

**GitHub:** https://github.com/KRLabsOrg/LettuceDetect

## Overview

LettuceDetect is a transformer-based model for hallucination detection on context and answer pairs, designed for multilingual Retrieval-Augmented Generation (RAG) applications. This model is built on **EuroBERT-610M**, chosen for its extended context support (up to **8192 tokens**) and strong multilingual capabilities. Long-context support matters when long, detailed documents must be processed to determine whether an answer is supported by the provided context.

**This is our large French model, built on the EuroBERT-610M architecture.**

## Model Details

- **Architecture:** EuroBERT-610M with extended context support (up to 8192 tokens)
- **Task:** Token Classification / Hallucination Detection
- **Training Dataset:** RagTruth-FR (translated from the original RAGTruth dataset)
- **Language:** French

## How It Works

The model is trained to identify tokens in the French answer text that are not supported by the given context. During inference, the model returns token-level predictions, which are then aggregated into spans. This lets users see exactly which parts of the answer are considered hallucinated.

## Usage

### Installation

Install the `lettucedetect` package:

```bash
pip install lettucedetect
```

### Using the model

```python
from lettucedetect.models.inference import HallucinationDetector

# For a transformer-based approach:
detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/lettucedect-610m-eurobert-fr-v1",
    lang="fr",
    trust_remote_code=True,
)

contexts = [
    "La France est un pays d'Europe. La capitale de la France est Paris. "
    "La population de la France est de 67 millions."
]
question = "Quelle est la capitale de la France? Quelle est la population de la France?"
# The answer contradicts the context on the population figure (69 vs. 67 millions).
answer = "La capitale de la France est Paris. La population de la France est de 69 millions."

# Get span-level predictions indicating which parts of the answer are considered hallucinated.
predictions = detector.predict(
    context=contexts, question=question, answer=answer, output_format="spans"
)
print("Prédictions:", predictions)

# Prédictions: [{'start': 36, 'end': 81, 'confidence': 0.9726481437683105, 'text': ' La population de la France est de 69 millions.'}]
```

## Performance

**Results on Translated RAGTruth-FR**

We evaluate our French models on RAGTruth-FR, a French translation of the [RAGTruth](https://aclanthology.org/2024.acl-long.585/) dataset. The EuroBERT-610M French model achieves an F1 score of 73.13%, outperforming the prompt-based GPT-4.1-mini baseline (62.37%) by 10.76 percentage points.

Detailed performance metrics are given in the table below. Δ F1 is the difference from the GPT-4.1-mini F1 score, in percentage points.

| Language | Model           | Precision (%) | Recall (%) | F1 (%) | GPT-4.1-mini F1 (%) | Δ F1 (pp) |
|----------|-----------------|---------------|------------|--------|---------------------|-----------|
| French   | EuroBERT-210M   | 58.86         | 74.34      | 65.70  | 62.37               | +3.33     |
| French   | EuroBERT-610M   | **67.08**     | **80.38**  | **73.13** | 62.37            | **+10.76** |

The 610M model performs best, improving F1 by 7.43 percentage points over the 210M model (73.13% vs. 65.70%). Its recall is particularly strong: it detects 80.38% of hallucinated content, compared with 74.34% for the 210M model.

## Citing

If you use the model or the tool, please cite the following paper:

```bibtex
@misc{Kovacs:2025,
      title={LettuceDetect: A Hallucination Detection Framework for RAG Applications},
      author={Ádám Kovács and Gábor Recski},
      year={2025},
      eprint={2502.17125},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.17125},
}
```
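
## Illustrative Sketch: Token-to-Span Aggregation

The "How It Works" section says that token-level predictions are aggregated into spans. The snippet below is a minimal sketch of one way such an aggregation could work, not the library's actual implementation. The function name `merge_token_predictions`, the input layout (character offsets, 0/1 labels, and per-token probabilities), and the choice to average probabilities into a span confidence are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def merge_token_predictions(
    answer: str,
    offsets: List[Tuple[int, int]],
    labels: List[int],
    probs: List[float],
) -> List[Dict]:
    """Merge consecutive tokens labeled 1 (hallucinated) into character spans.

    Hypothetical helper: `offsets` are (start, end) character positions of each
    token in `answer`, `labels` are 0/1 predictions, and `probs` are the
    per-token probabilities of the hallucinated class.
    """
    spans: List[Dict] = []
    current = None  # span being built, or None when outside a hallucinated run

    for (start, end), label, prob in zip(offsets, labels, probs):
        if label == 1:
            if current is None:
                # Open a new span at the first hallucinated token of a run.
                current = {"start": start, "end": end, "probs": [prob]}
            else:
                # Extend the open span to cover this token.
                current["end"] = end
                current["probs"].append(prob)
        elif current is not None:
            # A supported token closes the open span.
            spans.append(current)
            current = None

    if current is not None:
        spans.append(current)

    # Assumption: span confidence is the mean of its token probabilities.
    return [
        {
            "start": s["start"],
            "end": s["end"],
            "confidence": sum(s["probs"]) / len(s["probs"]),
            "text": answer[s["start"]:s["end"]],
        }
        for s in spans
    ]


# Toy input with hand-written word-level offsets (not real model output).
answer = "Paris a 69 millions"
offsets = [(0, 5), (6, 7), (8, 10), (11, 19)]
labels = [0, 0, 1, 1]
probs = [0.125, 0.25, 0.875, 0.625]

print(merge_token_predictions(answer, offsets, labels, probs))
```

Here the last two tokens are merged into a single span covering `69 millions`. A real tokenizer produces subword offsets rather than word offsets, but the same merging logic applies.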