@@ -10,12 +10,117 @@ license: apache-2.0
 language:
 - en
 ---
-# Uploaded  model
-- **Developed by:** mendrika261
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/qwen3-8b-unsloth-bnb-4bit
 This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

+# RAG Context Evaluator - Qwen3-8B Fine-tuned 🚀
+## Model Details 📋
+**License:** apache-2.0
+**Finetuned from model:** unsloth/qwen3-8b-unsloth-bnb-4bit
+**Model type:** Text Generation (Specialized for RAG Evaluation)
+**Quantization:** Q8_0
+## Model Description 🎯
+This model is fine-tuned to evaluate the quality of retrieved contexts in Retrieval-Augmented Generation (RAG) systems. Given a user query and the passages retrieved for it, it scores the passages on eight metrics drawn from information retrieval and RAG evaluation, and writes a short comment justifying each score.
+## Intended Uses 💡
+### Primary Use Cases 🎯
+- **RAG System Evaluation**: Automatically assess the quality of retrieved contexts for question-answering systems
+- **Information Retrieval Quality Control**: Evaluate how well retrieved documents match user queries
+- **Academic Research**: Support research in information retrieval and RAG system optimization
+### Evaluation Metrics 📊
+The model evaluates retrieved contexts using the following metrics:
+1. **Completeness** 📝 - How thoroughly the retrieved context addresses the query
+2. **Clarity** ✨ - How clear and understandable the retrieved information is
+3. **Conciseness** 🎪 - How efficiently the information is presented without redundancy
+4. **Precision** 🎯 - How accurate and relevant the retrieved information is
+5. **Recall** 🔍 - How comprehensive the retrieved information is in covering the query
+6. **MRR (Mean Reciprocal Rank)** 📈 - How early the first relevant passage appears in the ranked results
+7. **NDCG (Normalized Discounted Cumulative Gain)** 📊 - Ranking quality with graded relevance, where relevant passages placed lower in the list count for less
+8. **Relevance** 🔗 - Overall relevance of retrieved contexts to the query
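For reference, MRR and NDCG have standard definitions in information retrieval. The sketch below implements those textbook formulas on lists of relevance labels; it is not taken from this model's training code, and the function names are illustrative. The model itself outputs a judged score, so treat this only as a way to sanity-check its ranking scores against labeled data.

```python
import math
from typing import Sequence


def reciprocal_rank(relevances: Sequence[float]) -> float:
    """Return 1/rank of the first relevant item (relevance > 0), or 0.0 if none."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: Sequence[Sequence[float]]) -> float:
    """Average the reciprocal rank over several ranked result lists."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(q) for q in queries) / len(queries)


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Each input list holds the relevance labels of the retrieved passages in ranked order, for example `[0, 1, 0]` when only the second passage is relevant.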
+## Training Data 📚
+### Example Training Instance
+```json
+{
+  "instruction": "Evaluate the agent's response according to the metrics: completeness, clarity, conciseness, precision, recall, mrr, ndcg, relevance",
+  "input": {
+    "question": "Question about retrieved context",
+    "retrieved_contexts": "[Multiple numbered passages with source citations]"
+  },
+  "output": [
+    {
+      "name": "completeness",
+      "value": 1,
+      "comment": "Detailed evaluation comment"
+    }
+    // ... other metrics
+  ]
+}
+```
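If the model reproduces the output format of the training instance above, its response can be parsed into a dictionary keyed by metric name. This is a minimal sketch under that assumption: `parse_evaluation` and `missing_metrics` are hypothetical helpers, and the code assumes the first `[` and last `]` in the response delimit the JSON list.

```python
import json
from typing import Dict, List

EXPECTED_METRICS = (
    "completeness", "clarity", "conciseness", "precision",
    "recall", "mrr", "ndcg", "relevance",
)


def parse_evaluation(raw: str) -> Dict[str, dict]:
    """Extract the JSON list of metric objects from raw model text."""
    start = raw.find("[")
    end = raw.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no JSON list found in model output")
    items = json.loads(raw[start:end + 1])
    # Index each metric object by its name for easy lookup
    return {
        item["name"]: {"value": item["value"],
                       "comment": item.get("comment", "")}
        for item in items
    }


def missing_metrics(scores: Dict[str, dict]) -> List[str]:
    """List the expected metrics that the model did not report."""
    return [m for m in EXPECTED_METRICS if m not in scores]
```

Checking `missing_metrics` before using the scores guards against truncated generations, which drop the metrics at the end of the list first.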
+## Performance and Limitations ⚡
+### Strengths
+- Specialized for RAG evaluation
+- Multi-dimensional assessment capability
+- Detailed explanatory comments for each metric
+### Limitations
+- **Context Length**: Performance may vary with very long retrieved contexts
+## Ethical Considerations 🤝
+- The model should be used as a tool to assist human evaluators, not replace human judgment entirely
+- Evaluations should be validated by domain experts for critical applications
+## Technical Specifications 🔧
+- **Base Model**: Qwen3-8B
+- **Quantization**: Q8_0
+## Usage Example 💻
+```python
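+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load the tokenizer and model from the Hugging Face Hub
+model_name = "mendrika261/rag-evaluator-qwen3-8b"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+# Example evaluation prompt
+prompt = """Evaluate the agent's response according to the metrics: completeness, clarity, conciseness, precision, recall, mrr, ndcg, relevance
+Question: [Your question here]
+Retrieved contexts: [Your retrieved contexts here]"""
+# Tokenize and move the inputs to the same device as the model
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+# Set an explicit generation budget; the default limit may cut the evaluation short
+outputs = model.generate(**inputs, max_new_tokens=1024)
+# Decode only the newly generated tokens, not the echoed prompt
+prompt_length = inputs["input_ids"].shape[1]
+evaluation = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
+print(evaluation)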
+```
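The placeholders in the prompt above can be filled programmatically. The helper below is a sketch only: the card describes the contexts as numbered passages with source citations but does not specify the exact layout, so the `[n] text (source: ...)` format and the `build_prompt` name are assumptions to adapt to your data.

```python
from typing import Sequence, Tuple

INSTRUCTION = (
    "Evaluate the agent's response according to the metrics: "
    "completeness, clarity, conciseness, precision, recall, mrr, ndcg, relevance"
)


def build_prompt(question: str, passages: Sequence[Tuple[str, str]]) -> str:
    """Build an evaluation prompt from a question and (text, source) pairs.

    The passage layout is an assumed format, not one confirmed by the card.
    """
    numbered = "\n".join(
        f"[{i}] {text} (source: {source})"
        for i, (text, source) in enumerate(passages, start=1)
    )
    return f"{INSTRUCTION}\nQuestion: {question}\nRetrieved contexts:\n{numbered}"
```

Passages should be passed in the order the retriever ranked them, since the MRR and NDCG metrics depend on position.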
+## Citation 📄
+If you use this model in your research, please cite:
+```bibtex
+@misc{mendrika261-rag-evaluator,
+  title={RAG Context Evaluator - Qwen3-8B Fine-tuned},
+  author={mendrika261},
+  year={2025},
+  howpublished={\url{https://huggingface.co/mendrika261/rag-evaluation}}
+}
+```
+## Contact 📧
+For questions or issues regarding this model, please contact the developer through the Hugging Face model repository.
+---
 This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.