---
base_model:
  - meta-llama/Meta-Llama-3-8B
datasets:
  - future7/CogniBench
  - future7/CogniBench-L
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - text faithfulness
  - hallucination detection
  - RAG evaluation
  - cognitive statements
  - factual consistency
---

# CogniDet: Cognitive Faithfulness Detector for LLMs

CogniDet is a state-of-the-art model for detecting both factual and cognitive hallucinations in Large Language Model (LLM) outputs. Developed as part of the CogniBench framework, it specifically addresses the challenge of evaluating inference-based statements that go beyond simple fact regurgitation. The model is presented in the paper [CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models](https://arxiv.org/abs/2505.20767).

## Key Features ✨

1. **Dual Detection Capability**: identifies both
   - **Factual hallucinations**: claims that contradict the provided context
   - **Cognitive hallucinations**: unsupported inferences or evaluations
2. **Legal-Inspired Rigor**: incorporates a tiered evaluation framework (Rational → Grounded → Unequivocal) inspired by legal evidence standards (see the sketch after this list)
3. **Efficient Inference**: single-pass detection with an 8B-parameter Llama 3 backbone, faster than NLI-based methods
4. **Large-Scale Training**: trained on CogniBench-L (24k+ dialogues, 234k+ annotated sentences)
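The snippet below is only an illustrative sketch of how downstream code might represent the two hallucination categories and the three faithfulness tiers named above; the enum names and the `Finding` structure are assumptions, not part of the released model or its output schema.

```python
from dataclasses import dataclass
from enum import Enum

class HallucinationType(Enum):
    FACTUAL = "factual"        # claim contradicts the provided context
    COGNITIVE = "cognitive"    # inference/evaluation not supported by the context

class FaithfulnessTier(Enum):
    # Tiered evaluation framework from CogniBench, ordered from least to most strict
    RATIONAL = "rational"
    GROUNDED = "grounded"
    UNEQUIVOCAL = "unequivocal"

@dataclass
class Finding:
    sentence: str                    # the flagged response sentence
    kind: HallucinationType          # factual vs. cognitive
    tier_violated: FaithfulnessTier  # strictest tier the sentence fails

def summarize(findings: list[Finding]) -> dict[str, int]:
    # Hypothetical aggregation: count flagged sentences per hallucination type
    counts = {t.value: 0 for t in HallucinationType}
    for f in findings:
        counts[f.kind.value] += 1
    return counts
```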

## Performance 🚀

| Detection Type          | F1 Score |
|-------------------------|----------|
| Overall                 | 70.30    |
| Factual Hallucination   | 64.40    |
| Cognitive Hallucination | 73.80    |

CogniDet outperforms baselines such as SelfCheckGPT (61.1 F1 on cognitive hallucinations) and RAGTruth (45.3 F1 on factual hallucinations).
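For reference, sentence-level F1 scores like those above can be computed from per-sentence predictions. The snippet below is a minimal sketch using scikit-learn and made-up labels (1 = hallucinated, 0 = faithful); the actual evaluation protocol is defined by CogniBench, not by this snippet.

```python
from sklearn.metrics import f1_score

# Hypothetical per-sentence gold labels and model predictions
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

# F1 is the harmonic mean of precision and recall over the positive (hallucinated) class
print(f"F1: {f1_score(gold, pred):.2f}")
```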

## Usage 💻

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "future7/CogniDet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def detect_hallucinations(context, response):
    # Prompt format: context, model response, then a cue for the detector's findings
    prompt = f"CONTEXT: {context}\nRESPONSE: {response}\nHALLUCINATIONS:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
context = "Moringa trees grow in USDA zones 9-10. Flowering occurs annually in spring."
response = "In cold regions, Moringa can bloom twice yearly if grown indoors."

print(detect_hallucinations(context, response))
# Output: "Bloom frequency claims in cold regions are speculative"
```
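On a GPU, loading the 8B backbone in half precision keeps memory manageable. A minimal sketch, assuming `torch` and `accelerate` are installed (`device_map="auto"` requires accelerate); remember to move tokenized inputs to `model.device` before calling `generate`:

```python
import torch
from transformers import AutoModelForCausalLM

# Half-precision weights, automatically placed on available devices
model = AutoModelForCausalLM.from_pretrained(
    "future7/CogniDet",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```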

## Training Data 🔬

Trained on CogniBench-L (see the loading sketch below), featuring:

- 7,058 knowledge-grounded dialogues
- 234,164 sentence-level annotations
- Balanced coverage across 15+ domains (Medical, Legal, etc.)
- Auto-labeled via a rigorous pipeline (82.2% agreement with human annotators)
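If the dataset loads with the standard `datasets` loader, a minimal sketch looks like the following; the split and column names are not documented here, so inspect the dataset card for the actual schema rather than relying on this snippet.

```python
from datasets import load_dataset

# Load CogniBench-L from the Hugging Face Hub
ds = load_dataset("future7/CogniBench-L")

print(ds)  # inspect available splits
# Column names vary by dataset; look at one record before assuming a schema
first_split = list(ds.keys())[0]
print(ds[first_split][0])
```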

## Limitations ⚠️

1. Best performance on English knowledge-grounded dialogues
2. Domain-specific applications (e.g., clinical diagnosis) may require fine-tuning
3. Context window limited to 8K tokens, so long retrieved contexts need truncation (see the sketch below)
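A minimal truncation sketch, assuming the tokenizer loaded above; the 8,192-token budget, the scaffolding slack, and the choice to keep the earliest context tokens are assumptions for illustration, not prescribed by the model card.

```python
MAX_TOKENS = 8192  # assumed context window from the limitation above

def truncate_context(context, response, tokenizer, reserve_for_output=100):
    # Budget = window minus the response, generation headroom, and prompt scaffolding slack
    response_len = len(tokenizer(response)["input_ids"])
    budget = MAX_TOKENS - response_len - reserve_for_output - 32
    ids = tokenizer(context)["input_ids"][:budget]  # keep the earliest context tokens
    return tokenizer.decode(ids, skip_special_tokens=True)
```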

## Citation 📚

If you use CogniDet, please cite the CogniBench paper:

```bibtex
@inproceedings{tang2025cognibench,
  title     = {CogniBench: A Legal-inspired Framework for Assessing Cognitive Faithfulness of LLMs},
  author    = {Tang, Xiaqiang and Li, Jian and Hu, Keyu and Nan, Du and Li, Xiaolong and Zhang, Xi and Sun, Weigao and Xie, Sihong},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year      = {2025},
  pages     = {xxx--xxx},  % page range to be added
  publisher = {Association for Computational Linguistics},
  location  = {Vienna, Austria},
  url       = {https://arxiv.org/abs/2505.20767},
  archivePrefix = {arXiv},
  eprint    = {2505.20767},
  primaryClass  = {cs.CL}
}
```

## Resources 🔗