---
language: fa
base_model: Qwen/Qwen2.5-14B-Instruct
datasets:
- safora/PersianSciQA-Extractive
tags:
- qwen
- question-answering
- persian
- farsi
- qlora
- scientific-documents
license: apache-2.0
---
# PersianSciQA-Qwen2.5-14B: A QLoRA Fine-Tuned Model for Scientific Extractive QA in Persian
## Model Description
This repository contains the **PersianSciQA-Qwen2.5-14B** model, a fine-tuned version of `Qwen/Qwen2.5-14B-Instruct` specialized for **extractive question answering on scientific texts in the Persian language**.
The model was trained with QLoRA for parameter-efficient fine-tuning. Its primary function is to analyze a given scientific `context` and answer a `question` based **solely** on the information within that context.
A key feature of its training is the strict instruction to output the exact phrase `CANNOT_ANSWER` if the context does not contain the information required to answer the question. This makes the model a reliable tool for closed-domain, evidence-based QA tasks.
## How to Use
To use this model, you must follow the specific prompt template it was trained on. The prompt enforces the model's role as a scientific assistant and its strict answering policy.
Here is a complete example using the `transformers` library:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Set the model ID
model_id = "safora/PersianSciQA-Qwen2.5-14B"
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
device_map="auto"
)
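# Optional: with limited GPU memory, the model can instead be loaded in 4-bit,
# matching the NF4 quantization used during training (assumption: requires the
# bitsandbytes package):
#
# from transformers import BitsAndBytesConfig
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     quantization_config=BitsAndBytesConfig(
#         load_in_4bit=True,
#         bnb_4bit_quant_type="nf4",
#         bnb_4bit_compute_dtype=torch.bfloat16,
#     ),
#     device_map="auto",
# )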
# 1. Define the prompt template (MUST match the training format).
#    The Persian instructions tell the model to answer strictly from the given
#    context and to output exactly "CANNOT_ANSWER" when the context is insufficient.
prompt_template = (
'شما یک دستیار متخصص در زمینه اسناد علمی هستید. وظیفه شما این است که به سوال پرسیده شده، **فقط و فقط** بر اساس متن زمینه (Context) ارائه شده پاسخ دهید. پاسخ شما باید دقیق و خلاصه باشد.\n\n'
'**دستورالعمل مهم:** اگر اطلاعات لازم برای پاسخ دادن به سوال در متن زمینه وجود ندارد، باید **دقیقا** عبارت "CANNOT_ANSWER" را به عنوان پاسخ بنویسید و هیچ توضیح اضافهای ندهید.\n\n'
'**زمینه (Context):**\n---\n{context}\n---\n\n'
'**سوال (Question):**\n{question}\n\n'
'**پاسخ (Answer):** '
)
# 2. Provide your context and question (here: a Persian passage on perovskite
#    solar cells, asking about their laboratory efficiency)
context = "سلولهای خورشیدی پروسکایت به دلیل هزینه تولید پایین و بازدهی بالا، به عنوان یک فناوری نوظهور مورد توجه قرار گرفتهاند. بازدهی آزمایشگاهی این سلولها به بیش از ۲۵ درصد رسیده است، اما پایداری طولانیمدت آنها همچنان یک چالش اصلی محسوب میشود."
question = "بازدهی سلولهای خورشیدی پروسکایت در آزمایشگاه چقدر است؟"
# Example of a question that cannot be answered from the context:
# question = "این سلول ها اولین بار در چه سالی ساخته شدند؟"
# 3. Format the prompt
prompt = prompt_template.format(context=context, question=question)
# 4. Generate the response
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generation_output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,  # deterministic decoding suits extractive QA
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id
)
# Decode and print the output
response = tokenizer.decode(generation_output[0], skip_special_tokens=True)
# The generated text will be after the prompt
answer = response.split("**پاسخ (Answer):**")[-1].strip()
print(answer)
# Expected output: به بیش از ۲۵ درصد رسیده است ("has reached more than 25 percent")
# For the unanswerable question, expected output: CANNOT_ANSWER
```
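Splitting on the answer marker works for this template, but a marker-free alternative is to decode only the newly generated tokens. A minimal sketch, reusing `inputs` and `generation_output` from the example above:

```python
# Decode only the tokens produced after the prompt, instead of splitting on a marker.
new_tokens = generation_output[0][inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# The refusal phrase can be checked programmatically.
if answer == "CANNOT_ANSWER":
    print("The context does not contain the required information.")
else:
    print(answer)
```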
## Training Details

### Model

The base model is `Qwen/Qwen2.5-14B-Instruct`, a highly capable instruction-tuned large language model.

### Dataset

The model was fine-tuned on the `safora/PersianSciQA-Extractive` dataset, which contains `(context, question, model_answer)` triplets derived from Persian scientific documents. The dataset is split into:

- **Train:** used for training the model.
- **Validation:** used for evaluating the model during training.
- **Test:** a held-out set reserved for final model evaluation.

### Fine-Tuning Procedure

The model was fine-tuned using QLoRA (Quantized Low-Rank Adaptation), which significantly reduces memory usage while maintaining high performance. Training was performed with the `trl` and `peft` libraries.

### Hyperparameters

The following key hyperparameters were used during training:
| Parameter | Value |
|---|---|
| **LoRA configuration** | |
| `r` (rank) | 16 |
| `lora_alpha` | 32 |
| `lora_dropout` | 0.05 |
| `target_modules` | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| **Training arguments** | |
| `learning_rate` | 2e-5 |
| `optimizer` | `paged_adamw_32bit` |
| `lr_scheduler_type` | `cosine` |
| `num_train_epochs` | 1 |
| `per_device_train_batch_size` | 1 |
| `gradient_accumulation_steps` | 8 |
| `effective_batch_size` | 8 |
| `quantization` | 4-bit (NF4) |
| `compute_dtype` | `bfloat16` |
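For reproducibility, the sketch below reconstructs a plausible training setup from the table above. It is **not** the authors' exact script: `SFTTrainer` argument names vary across `trl` versions, the `output_dir` is arbitrary, and the dataset field names (`context`, `question`, `model_answer`) come from the dataset description.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer

base_model_id = "Qwen/Qwen2.5-14B-Instruct"

# 4-bit NF4 quantization with bfloat16 compute, as listed in the table above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# LoRA configuration matching the table above.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

dataset = load_dataset("safora/PersianSciQA-Extractive")

# prompt_template is the exact template shown in the "How to Use" section.
prompt_template = (
    'شما یک دستیار متخصص در زمینه اسناد علمی هستید. وظیفه شما این است که به سوال پرسیده شده، **فقط و فقط** بر اساس متن زمینه (Context) ارائه شده پاسخ دهید. پاسخ شما باید دقیق و خلاصه باشد.\n\n'
    '**دستورالعمل مهم:** اگر اطلاعات لازم برای پاسخ دادن به سوال در متن زمینه وجود ندارد، باید **دقیقا** عبارت "CANNOT_ANSWER" را به عنوان پاسخ بنویسید و هیچ توضیح اضافهای ندهید.\n\n'
    '**زمینه (Context):**\n---\n{context}\n---\n\n'
    '**سوال (Question):**\n{question}\n\n'
    '**پاسخ (Answer):** '
)

def formatting_func(example):
    # Assemble prompt + target answer into a single training string.
    return prompt_template.format(
        context=example["context"], question=example["question"]
    ) + example["model_answer"]

training_args = TrainingArguments(
    output_dir="PersianSciQA-Qwen2.5-14B",
    learning_rate=2e-5,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8
    bf16=True,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    peft_config=peft_config,
    formatting_func=formatting_func,
)
trainer.train()
```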
## Evaluation

The model's performance has not yet been formally evaluated on the held-out test split. The test split of the `safora/PersianSciQA-Extractive` dataset, containing 1049 samples, is available for this purpose. Community contributions to evaluate and benchmark this model are welcome.
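As a starting point, a minimal exact-match evaluation over the test split could look like the sketch below. It reuses `model`, `tokenizer`, and `prompt_template` from the usage example above; the field names are assumptions based on the dataset description, and exact match is only one possible metric for extractive QA.

```python
from datasets import load_dataset

test_set = load_dataset("safora/PersianSciQA-Extractive", split="test")

correct = 0
for example in test_set:
    prompt = prompt_template.format(
        context=example["context"], question=example["question"]
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens.
    prediction = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip()
    correct += int(prediction == example["model_answer"].strip())

print(f"Exact match on {len(test_set)} samples: {correct / len(test_set):.3f}")
```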
## Citation

If you use this model in your research or work, please cite it as follows:

```bibtex
@misc{persiansciqa_qwen2.5_14b,
  author       = {Jolfaei, Safora},
  title        = {PersianSciQA-Qwen2.5-14B: A QLoRA Fine-Tuned Model for Scientific Extractive QA in Persian},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/safora/PersianSciQA-Qwen2.5-14B}}
}
```