TinyLettuce (Ettin-68M): Efficient Hallucination Detection

Model Name: tinylettuce-ettin-68m-en

Organization: KRLabsOrg

GitHub: https://github.com/KRLabsOrg/LettuceDetect

Ettin encoders: https://arxiv.org/pdf/2507.11412

Overview

TinyLettuce is a lightweight token‑classification model that flags unsupported spans in an answer given its context (span aggregation is performed downstream). Built on the 68M‑parameter Ettin encoder with a token‑classification head, it targets real‑time CPU inference and low‑cost domain fine‑tuning. This variant is trained only on our synthetic data and the RAGTruth dataset for hallucination detection. It has the highest accuracy among the TinyLettuce sizes and performs well given its size: 74.97% F1 on RAGTruth versus 76.07% for LettuceDetect-ModernBERT-base.
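
The downstream span aggregation mentioned above can be sketched as follows. This is a minimal illustration, not the library's actual implementation: the function name `merge_token_predictions`, the toy tokenization, and the character offsets are all assumptions made for the example. It merges runs of consecutive tokens labeled 1 into character spans over the answer.

```python
from typing import Dict, List, Tuple


def merge_token_predictions(
    answer: str,
    offsets: List[Tuple[int, int]],
    labels: List[int],
) -> List[Dict[str, object]]:
    """Merge runs of consecutive tokens labeled 1 into character spans.

    offsets[i] is the (start, end) character range of token i in `answer`
    (end exclusive); labels[i] is 0 (supported) or 1 (hallucinated).
    """
    runs: List[List[int]] = []
    current = None  # [start, end] of the run currently being built
    for (start, end), label in zip(offsets, labels):
        if label == 1:
            if current is None:
                current = [start, end]  # open a new run
            else:
                current[1] = end  # extend the open run
        elif current is not None:
            runs.append(current)  # a supported token closes the run
            current = None
    if current is not None:
        runs.append(current)
    return [{"start": s, "end": e, "text": answer[s:e]} for s, e in runs]


# Hypothetical tokenization of a short answer (offsets chosen by hand).
answer = "The maximum daily dose is 3200mg."
offsets = [(0, 3), (4, 11), (12, 17), (18, 22), (23, 25), (26, 30), (30, 32), (32, 33)]
labels = [0, 0, 0, 0, 0, 1, 1, 0]
print(merge_token_predictions(answer, offsets, labels))
# [{'start': 26, 'end': 32, 'text': '3200mg'}]
```

Merging adjacent flagged tokens keeps sub-word pieces such as "3200" and "mg" in a single reported span.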

Model Details

  • Architecture: Ettin encoder (68M) + token‑classification head
  • Task: token classification (0 = supported, 1 = hallucinated)
  • Input: [CLS] context [SEP] question [SEP] answer [SEP], up to 4096 tokens
  • Language: English; License: MIT
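
The input layout listed above can be illustrated with a small sketch. It assumes pre-tokenized inputs and assumes that context and question positions receive the ignore label −100 so that only answer tokens are scored. The helper `build_example` is hypothetical and is not part of lettucedetect.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # assumed label for positions that should not be scored


def build_example(
    context_tokens: List[str],
    question_tokens: List[str],
    answer_tokens: List[str],
    answer_labels: List[int],
    max_length: int = 4096,
) -> Tuple[List[str], List[int]]:
    """Lay out [CLS] context [SEP] question [SEP] answer [SEP] with labels."""
    if len(answer_tokens) != len(answer_labels):
        raise ValueError("each answer token needs exactly one label")
    # Prefix: special tokens, context, and question are never scored.
    tokens = ["[CLS]"] + context_tokens + ["[SEP]"] + question_tokens + ["[SEP]"]
    labels = [IGNORE_INDEX] * len(tokens)
    # Answer tokens carry the 0/1 labels; the final [SEP] is ignored.
    tokens += answer_tokens + ["[SEP]"]
    labels += answer_labels + [IGNORE_INDEX]
    # Naive right-truncation for illustration only; a real pipeline would
    # more likely shorten the context so the answer is preserved.
    return tokens[:max_length], labels[:max_length]


tokens, labels = build_example(
    context_tokens=["max", "dose", "2400mg"],
    question_tokens=["max", "dose", "?"],
    answer_tokens=["it", "is", "3200mg"],
    answer_labels=[0, 0, 1],
)
print(list(zip(tokens, labels)))
```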

Training Data

  • RAGTruth (English), span‑level labels; no synthetic data mixed

Training Procedure

  • Tokenizer: AutoTokenizer; DataCollatorForTokenClassification; label pad −100
  • Max length: 4096; batch size: 16; epochs: 5
  • Optimizer: AdamW (lr 1e‑5, weight_decay 0.01)
  • Hardware: Single A100 80GB
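
To show why labels are padded with −100, here is a dependency-free sketch of label padding and a masked loss. It only mirrors the idea behind DataCollatorForTokenClassification and the default ignore index of PyTorch's cross-entropy loss. The helpers `pad_labels` and `masked_cross_entropy` are hypothetical names introduced for this example.

```python
import math
from typing import List

IGNORE_INDEX = -100  # label pad value from the training setup above


def pad_labels(batch: List[List[int]], pad_value: int = IGNORE_INDEX) -> List[List[int]]:
    """Right-pad every label sequence to the longest length in the batch."""
    width = max(len(seq) for seq in batch)
    return [seq + [pad_value] * (width - len(seq)) for seq in batch]


def masked_cross_entropy(probs: List[List[float]], labels: List[int]) -> float:
    """Mean negative log-likelihood over positions not labeled IGNORE_INDEX.

    probs[i] holds [p(supported), p(hallucinated)] for position i.
    """
    losses = [-math.log(p[y]) for p, y in zip(probs, labels) if y != IGNORE_INDEX]
    return sum(losses) / len(losses) if losses else 0.0


padded = pad_labels([[0, 1, 0], [1]])
print(padded)  # [[0, 1, 0], [1, -100, -100]]

# Only the first position contributes; the padded position is skipped.
loss = masked_cross_entropy([[0.5, 0.5], [0.9, 0.1]], [1, IGNORE_INDEX])
print(round(loss, 4))  # 0.6931
```

Because padded positions are excluded from the mean, batch padding does not change the loss of the real tokens.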

Results (RAGTruth)

This model is designed primarily for fine-tuning on smaller, domain-specific samples, rather than for general use.

It performs well on the RAGTruth benchmark, coming close to our LettuceDetect-base model (150M‑parameter ModernBERT).

Model                              Parameters   F1 (%)
TinyLettuce-68M                    68M          74.97
LettuceDetect-base (ModernBERT)    150M         76.07
LettuceDetect-large (ModernBERT)   395M         79.22
Llama-2-13B (RAGTruth FT)          13B          78.70
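
The size and accuracy trade-off in these results can be made concrete with a little arithmetic. The sketch below uses only the figures reported above. The helper names `f1_gap` and `param_ratio` are illustrative.

```python
# (parameter count, F1 in %) as reported in the results above.
RESULTS = {
    "TinyLettuce-68M": (68e6, 74.97),
    "LettuceDetect-base (ModernBERT)": (150e6, 76.07),
    "LettuceDetect-large (ModernBERT)": (395e6, 79.22),
    "Llama-2-13B (RAGTruth FT)": (13e9, 78.70),
}


def f1_gap(small: str, large: str) -> float:
    """F1 points by which `large` leads `small`."""
    return RESULTS[large][1] - RESULTS[small][1]


def param_ratio(small: str, large: str) -> float:
    """How many times more parameters `large` has than `small`."""
    return RESULTS[large][0] / RESULTS[small][0]


tiny = "TinyLettuce-68M"
for other in RESULTS:
    if other != tiny:
        print(f"{other}: +{f1_gap(tiny, other):.2f} F1, {param_ratio(tiny, other):.1f}x parameters")
# LettuceDetect-base (ModernBERT): +1.10 F1, 2.2x parameters
# LettuceDetect-large (ModernBERT): +4.25 F1, 5.8x parameters
# Llama-2-13B (RAGTruth FT): +3.73 F1, 191.2x parameters
```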

Usage

First install lettucedetect:

pip install lettucedetect

Then use it:

from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/tinylettuce-ettin-68m-en",
)

spans = detector.predict(
    context=[
        "Ibuprofen is an NSAID that reduces inflammation and pain. The typical adult dose is 400-600mg every 6-8 hours, not exceeding 2400mg daily."
    ],
    question="What is the maximum daily dose of ibuprofen?",
    answer="The maximum daily dose of ibuprofen for adults is 3200mg.",
    output_format="spans",
)
print(spans)
# Output: [{"start": 51, "end": 57, "text": "3200mg"}]
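
Once spans are returned, a common next step is to mark them in the answer for display. The sketch below assumes each span is a dict with "start" and "end" keys, as in the output above, and assumes the offsets are 0-based and end-exclusive. Verify that assumption against the detector's actual output before relying on it. The `highlight` helper is hypothetical and not part of lettucedetect.

```python
from typing import Dict, List


def highlight(
    answer: str,
    spans: List[Dict[str, object]],
    open_mark: str = "[[",
    close_mark: str = "]]",
) -> str:
    """Wrap each flagged character range of `answer` in marker strings."""
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s["start"]):
        start, end = int(span["start"]), int(span["end"])
        if start < cursor:
            continue  # skip spans that overlap one already emitted
        pieces.append(answer[cursor:start])  # unflagged text before the span
        pieces.append(open_mark + answer[start:end] + close_mark)
        cursor = end
    pieces.append(answer[cursor:])  # remaining unflagged tail
    return "".join(pieces)


answer = "The maximum daily dose of ibuprofen for adults is 3200mg."
start = answer.index("3200mg")  # offsets computed here, not taken from the model
example_spans = [{"start": start, "end": start + len("3200mg"), "text": "3200mg"}]
print(highlight(answer, example_spans))
# The maximum daily dose of ibuprofen for adults is [[3200mg]].
```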

Citing

If you use the model or the tool, please cite the following paper:

@misc{Kovacs:2025,
      title={LettuceDetect: A Hallucination Detection Framework for RAG Applications}, 
      author={Ádám Kovács and Gábor Recski},
      year={2025},
      eprint={2502.17125},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.17125}, 
}