---
base_model:
  - meta-llama/Meta-Llama-3-8B
datasets:
  - future7/CogniBench
  - future7/CogniBench-L
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - text faithfulness
  - hallucination detection
  - RAG evaluation
  - cognitive statements
  - factual consistency
---

# CogniDet: Cognitive Faithfulness Detector for LLMs

CogniDet is a state-of-the-art model for detecting both factual and cognitive hallucinations in Large Language Model (LLM) outputs. Developed as part of the CogniBench framework, it specifically addresses the challenge of evaluating inference-based statements that go beyond simple fact regurgitation. The model is presented in the paper [CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models](https://arxiv.org/abs/2505.20767).

## Key Features ✨

1. **Dual Detection Capability**: identifies both
   - **Factual hallucinations**: claims that contradict the provided context
   - **Cognitive hallucinations**: unsupported inferences or evaluations
2. **Legal-Inspired Rigor**: incorporates a tiered evaluation framework (Rational → Grounded → Unequivocal) inspired by legal evidence standards (see the sketch after this list)
3. **Efficient Inference**: single-pass detection with an 8B-parameter Llama 3 backbone, faster than NLI-based methods
4. **Large-Scale Training**: trained on CogniBench-L (24k+ dialogues, 234k+ annotated sentences)
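The snippet below is only an illustrative sketch of how downstream code might represent the two hallucination categories and the three faithfulness tiers named above; the enum names and the `Finding` structure are assumptions, not part of the released model or its output schema.

```python
from dataclasses import dataclass
from enum import Enum

class HallucinationType(Enum):
    FACTUAL = "factual"        # claim contradicts the provided context
    COGNITIVE = "cognitive"    # inference/evaluation not supported by the context

class FaithfulnessTier(Enum):
    # Tiered evaluation framework from CogniBench, ordered from least to most strict
    RATIONAL = "rational"
    GROUNDED = "grounded"
    UNEQUIVOCAL = "unequivocal"

@dataclass
class Finding:
    sentence: str                    # the flagged response sentence
    kind: HallucinationType          # factual vs. cognitive
    tier_violated: FaithfulnessTier  # strictest tier the sentence fails

def summarize(findings: list[Finding]) -> dict[str, int]:
    # Hypothetical aggregation: count flagged sentences per hallucination type
    counts = {t.value: 0 for t in HallucinationType}
    for f in findings:
        counts[f.kind.value] += 1
    return counts
```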

## Performance 🚀

| Detection Type          | F1 Score |
|-------------------------|----------|
| Overall                 | 70.30    |
| Factual Hallucination   | 64.40    |
| Cognitive Hallucination | 73.80    |

CogniDet outperforms baselines such as SelfCheckGPT (61.1 F1 on cognitive hallucinations) and RAGTruth (45.3 F1 on factual hallucinations).
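For reference, sentence-level F1 scores like those above can be computed from per-sentence predictions. The snippet below is a minimal sketch using scikit-learn and made-up labels (1 = hallucinated, 0 = faithful); the actual evaluation protocol is defined by CogniBench, not by this snippet.

```python
from sklearn.metrics import f1_score

# Hypothetical per-sentence gold labels and model predictions
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

# F1 is the harmonic mean of precision and recall over the positive (hallucinated) class
print(f"F1: {f1_score(gold, pred):.2f}")
```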

## Usage 💻

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "future7/CogniDet"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def detect_hallucinations(context, response):
    # Prompt format: context, model response, then a cue for the detector's findings
    prompt = f"CONTEXT: {context}\nRESPONSE: {response}\nHALLUCINATIONS:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
context = "Moringa trees grow in USDA zones 9-10. Flowering occurs annually in spring."
response = "In cold regions, Moringa can bloom twice yearly if grown indoors."

print(detect_hallucinations(context, response))
# Output: "Bloom frequency claims in cold regions are speculative"
```
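On a GPU, loading the 8B backbone in half precision keeps memory manageable. A minimal sketch, assuming `torch` and `accelerate` are installed (`device_map="auto"` requires accelerate); remember to move tokenized inputs to `model.device` before calling `generate`:

```python
import torch
from transformers import AutoModelForCausalLM

# Half-precision weights, automatically placed on available devices
model = AutoModelForCausalLM.from_pretrained(
    "future7/CogniDet",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```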

## Training Data 🔬

Trained on CogniBench-L (see the loading sketch below), featuring:

- 7,058 knowledge-grounded dialogues
- 234,164 sentence-level annotations
- Balanced coverage across 15+ domains (Medical, Legal, etc.)
- Auto-labeled via a rigorous pipeline (82.2% agreement with human annotators)
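If the dataset loads with the standard `datasets` loader, a minimal sketch looks like the following; the split and column names are not documented here, so inspect the dataset card for the actual schema rather than relying on this snippet.

```python
from datasets import load_dataset

# Load CogniBench-L from the Hugging Face Hub
ds = load_dataset("future7/CogniBench-L")

print(ds)  # inspect available splits
# Column names vary by dataset; look at one record before assuming a schema
first_split = list(ds.keys())[0]
print(ds[first_split][0])
```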

## Limitations ⚠️

1. Best performance on English knowledge-grounded dialogues
2. Domain-specific applications (e.g., clinical diagnosis) may require fine-tuning
3. Context window limited to 8K tokens, so long retrieved contexts need truncation (see the sketch below)
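A minimal truncation sketch, assuming the tokenizer loaded above; the 8,192-token budget, the scaffolding slack, and the choice to keep the earliest context tokens are assumptions for illustration, not prescribed by the model card.

```python
MAX_TOKENS = 8192  # assumed context window from the limitation above

def truncate_context(context, response, tokenizer, reserve_for_output=100):
    # Budget = window minus the response, generation headroom, and prompt scaffolding slack
    response_len = len(tokenizer(response)["input_ids"])
    budget = MAX_TOKENS - response_len - reserve_for_output - 32
    ids = tokenizer(context)["input_ids"][:budget]  # keep the earliest context tokens
    return tokenizer.decode(ids, skip_special_tokens=True)
```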

## Citation 📚

If you use CogniDet, please cite the CogniBench paper:

```bibtex
@inproceedings{tang2025cognibench,
  title     = {CogniBench: A Legal-inspired Framework for Assessing Cognitive Faithfulness of LLMs},
  author    = {Tang, Xiaqiang and Li, Jian and Hu, Keyu and Nan, Du and Li, Xiaolong and Zhang, Xi and Sun, Weigao and Xie, Sihong},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year      = {2025},
  pages     = {xxx--xxx},  % page range to be added
  publisher = {Association for Computational Linguistics},
  location  = {Vienna, Austria},
  url       = {https://arxiv.org/abs/2505.20767},
  archivePrefix = {arXiv},
  eprint    = {2505.20767},
  primaryClass  = {cs.CL}
}
```

## Resources 🔗