# Qwen2.5-1.5B Arabic Summarizer
This model is a fine-tuned version of `unsloth/Qwen2.5-1.5B-Instruct` for Arabic summarization. It was trained using the TRL and `transformers` libraries with Parameter-Efficient Fine-Tuning (PEFT) via LoRA.
## Model Description

This is a 1.5B-parameter small language model (SLM) fine-tuned on a synthetically generated dataset for Arabic summarization. The reference summaries were generated by a larger model (`Qwen/Qwen2.5-14B-Instruct-AWQ`) on Arabic documents derived from the GEM/xlsum dataset.

The model was trained with supervised fine-tuning (SFT) and LoRA adapters, enabling training on consumer GPUs with limited memory (e.g., 16 GB).
## Intended Use

This model is intended for generating concise, accurate Arabic summaries from input texts. It performs best with the exact prompt format seen during training (shown in the How to Use section below).
## Training Data

Training used a synthetic summarization dataset created as follows (an illustrative preprocessing sketch follows the list):

- Source: Arabic subset of the `GEM/xlsum` dataset
- Steps:
  - Noise cleaning and Arabic text normalization
  - Length filtering (300–2500 characters)
  - Duplicate removal (SHA-1 hashing)
  - Language filtering (Arabic-dominant documents only)
  - Topic stratification using TF-IDF + NMF (~5,000 samples)
  - Synthetic summaries generated by `Qwen/Qwen2.5-14B-Instruct-AWQ`
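The pipeline can be approximated with `datasets` and scikit-learn. The snippet below is a minimal sketch of the steps above; the exact normalization rules, thresholds, topic count, and the `GEM/xlsum` config name (`"arabic"`) are illustrative assumptions, not the script used to build the released dataset.

```python
# Hypothetical reconstruction of the preprocessing steps listed above.
import hashlib
import re

import numpy as np
from datasets import load_dataset
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

def normalize_arabic(text: str) -> str:
    """Light normalization: drop diacritics, unify alef forms, collapse whitespace."""
    text = re.sub(r"[\u064B-\u0652]", "", text)   # remove tashkeel
    text = re.sub(r"[إأآ]", "ا", text)            # unify alef variants
    return re.sub(r"\s+", " ", text).strip()

def is_arabic_dominant(text: str, threshold: float = 0.7) -> bool:
    """Keep documents whose alphabetic characters are mostly in the Arabic block."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    arabic = sum("\u0600" <= c <= "\u06FF" for c in letters)
    return arabic / len(letters) >= threshold

ds = load_dataset("GEM/xlsum", "arabic", split="train")        # config name assumed
ds = ds.map(lambda ex: {"text": normalize_arabic(ex["text"])})

seen_hashes = set()
def keep(example) -> bool:
    text = example["text"]
    if not 300 <= len(text) <= 2500:                           # length filter
        return False
    if not is_arabic_dominant(text):                           # language filter
        return False
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()    # SHA-1 de-duplication
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    return True

ds = ds.filter(keep)

# Topic stratification: TF-IDF features, NMF topics, then sample evenly per topic.
tfidf = TfidfVectorizer(max_features=20000)
topics = NMF(n_components=10, random_state=0).fit_transform(
    tfidf.fit_transform(ds["text"])
).argmax(axis=1)
per_topic = 5000 // 10
keep_idx = np.concatenate([np.where(topics == t)[0][:per_topic] for t in range(10)])
stratified = ds.select(keep_idx)
```

The roughly 5,000 stratified documents would then be passed to `Qwen/Qwen2.5-14B-Instruct-AWQ` to generate the reference summaries.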
## Training Procedure

Key details (an illustrative TRL/PEFT configuration sketch follows the list):

- Base model: `unsloth/Qwen2.5-1.5B-Instruct`
- LoRA (PEFT) settings: `r=16`, `alpha=16`, `dropout=0.1`
- Target modules: `q_proj`, `v_proj`, `up_proj`, `down_proj`
- Quantization: 4-bit NF4 (`bnb_4bit_compute_dtype=torch.bfloat16`)
- Optimizer: `paged_adamw_32bit`
- Learning rate: 2e-4, cosine schedule, 3% warmup
- Epochs: 2
- Batch size: 2 per device, 4 gradient-accumulation steps
- Evaluation: based on validation loss
- Gradient clipping: 0.3
- Checkpointing: the checkpoint with the best eval loss is kept
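The hyperparameters above map onto `peft` and TRL's `SFTTrainer` roughly as follows. This is a minimal sketch, not the authors' training script: the output directory, evaluation/save cadence, `bf16` flag, and the placeholder dataset are assumptions, and the real prompt/summary formatting is omitted.

```python
# Hedged sketch of the SFT + LoRA setup using the listed hyperparameters.
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen2.5-1.5B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen2.5-1.5B-Instruct")

# LoRA adapters on the listed attention/MLP projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Placeholder data: in practice these are the prompt + reference-summary pairs.
train_dataset = Dataset.from_dict({"text": ["<instruction + document>\n\nالملخص: <summary>"]})
eval_dataset = train_dataset

training_args = SFTConfig(
    output_dir="qwen2.5-1.5b-arabic-sum",        # assumed
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_grad_norm=0.3,                           # gradient clipping
    optim="paged_adamw_32bit",
    eval_strategy="epoch",                       # evaluate on validation loss
    save_strategy="epoch",
    load_best_model_at_end=True,                 # keep the best-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    bf16=True,                                   # assumed
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    peft_config=lora_config,
    processing_class=tokenizer,
)
trainer.train()
```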
## Framework Versions
- TRL: 0.18.0
- Transformers: 4.52.3
- PyTorch: 2.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Qwen2.5-1.5B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/Qwen2.5-1.5B-Instruct")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the LoRA adapters and merge them into the base weights
model = PeftModel.from_pretrained(base_model, "ml-maverick/Qwen2.5-1.5B-Instruct-ArabicSum")
model = model.merge_and_unload()
model.eval()

# Prompt format used during training: instruction, then the document, then "الملخص:" ("Summary:")
instruction = (
    "أنت كاتب عربي محترف ذو خبرة واسعة في تلخيص النصوص بدقة وإيجاز."
    " عند استلام نص، اتبع الخطوات التالية لضمان تقديم ملخص فعّال:\n"
    "1. قم بتحليل المحتوى بعناية لتحديد الفكرة الرئيسية.\n"
    "2. استخرج المعلومات الجوهرية.\n"
    "3. صغ ملخصًا واضحًا وموجزًا لا يتجاوز ثلاث جمل.\n"
    "4. تجنب التفاصيل غير الموجودة، والتزم بالدقة.\n\n"
)
text = "أظهرت دراسة حديثة أن..."
input_prompt = f"{instruction}{text}\n\nالملخص:"

inputs = tokenizer(input_prompt, return_tensors="pt").to(model.device)

generation_config = GenerationConfig(
    max_new_tokens=200,
    do_sample=True,          # sampling is required for temperature/top_p to take effect
    temperature=0.4,
    top_p=0.9,
    repetition_penalty=1.1,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

with torch.no_grad():
    output_ids = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,  # use the tokenizer's mask (pad == eos here)
        generation_config=generation_config,
    )

# Keep only the text after the "الملخص:" marker
output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
summary = output.split("الملخص:")[-1].strip()
print("Generated Summary:", summary)
```
## Limitations and Bias

- Synthetic-data bias: the reference summaries were produced by `Qwen/Qwen2.5-14B-Instruct-AWQ`, so the model can inherit that model's biases and stylistic habits.
- Prompt sensitivity: output quality degrades without the exact training prompt format.
- Arabic only: the model is not intended for other languages.
- Factual accuracy is not guaranteed; summaries may omit or misstate details.
## Citation

```bibtex
@misc{vonwerra2022trl,
  title        = {{TRL: Transformer Reinforcement Learning}},
  author       = {Leandro von Werra et al.},
  year         = {2022},
  howpublished = {\url{https://github.com/huggingface/trl}}
}

@article{qwen2024qwen2,
  title   = {{Qwen2}: A Strong Large Language Model Family},
  author  = {Qwen Team},
  journal = {arXiv preprint arXiv:2406.01175},
  year    = {2024}
}

@article{wolf2020transformers,
  title   = {Transformers: State-of-the-Art NLP},
  author  = {Wolf, Thomas et al.},
  journal = {arXiv preprint arXiv:1910.03771},
  year    = {2020}
}

@article{lhoest2021datasets,
  title   = {Datasets: A Community Library},
  author  = {Lhoest, Quentin et al.},
  journal = {arXiv preprint arXiv:2109.02844},
  year    = {2021}
}

@software{peft,
  title  = {{PEFT}: Parameter-Efficient Fine-Tuning},
  author = {Hugging Face},
  year   = {2023},
  url    = {https://github.com/huggingface/peft}
}
```