Hala: a series of lightweight Arabic language models (instruction following and translation) and an Arabic instruction dataset.
Paper: Hala Technical Report: Building Arabic‑Centric Instruction & Translation Models at Scale
Authors: Hasan Abed Al Kader Hammoud*, Mohammad Zbeeb*, Bernard Ghanem
Affiliation: King Abdullah University of Science and Technology (KAUST)
*Equal contribution
In Arabic, حلا (Hala) conveys sweetness and beauty—qualities long associated with the language itself. In this spirit, we call our models Hala.
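from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "hammh0a/Hala-1.2B"  # pick a released Hala model

# Load the tokenizer and model; dtype and device placement are chosen automatically
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Use the chat template
# (system: "You are an expert physics assistant."; user: "Briefly explain the
# conservation principle in physics and give me an everyday example.")
messages = [
    {"role": "system", "content": "أنت مساعد خبير في الفيزياء."},
    {"role": "user", "content": "اشرح بإيجاز مبدأ الانحفاظ في الفيزياء، وأعطني مثالاً يومياً."},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Greedy decoding (do_sample=False), up to 256 new tokens
pipe = pipeline("text-generation", model=model, tokenizer=tok)
out = pipe(prompt, max_new_tokens=256, do_sample=False)

# By default, generated_text contains the prompt followed by the completion
print(out[0]["generated_text"])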
Hala models are placed at the end of each size category; best Average per category is in bold.
Size | Model Name | Params | AlGhafa | ArabicMMLU | EXAMS | MadinahQA | AraTrust | ArbMMLU‑HT | Average |
---|---|---|---|---|---|---|---|---|---|
≤2B | meta-llama/Llama-3.2-1B | 1B | 33.9 | 26.5 | 21.2 | 25.7 | 37.1 | 23.9 | 28.0 |
≤2B | Qwen/Qwen2-1.5B-Instruct | 1.5B | 53.1 | 49.2 | 35.2 | 45.5 | 68.9 | 37.4 | 48.2 |
≤2B | Qwen/Qwen2.5-1.5B-Instruct | 1.5B | 48.4 | 43.5 | 31.8 | 38.2 | 70.8 | 35.9 | 44.8 |
≤2B | Sakalti/Saka-1.5B | 1.5B | 51.4 | 40.0 | 31.3 | 31.5 | 47.5 | 33.5 | 39.2 |
≤2B | Qwen/Qwen3-1.7B-Base | 1.7B | 56.8 | 49.7 | 38.2 | 40.0 | 75.6 | 43.9 | 50.7 |
≤2B | Qwen/Qwen1.5-1.8B | 1.8B | 32.7 | 26.7 | 23.8 | 26.0 | 31.5 | 23.6 | 27.4 |
≤2B | silma-ai/SILMA-Kashif-2B-Instruct-v1.0 | 2B | 59.7 | 45.6 | 33.1 | 38.8 | 73.3 | 35.8 | 47.7 |
≤2B | google/gemma-2-2b-it | 2B | 34.1 | 30.1 | 23.6 | 20.1 | 31.2 | 23.4 | 27.1 |
≤2B | LiquidAI/LFM2-350M | 350M | 39.0 | 35.2 | 30.9 | 28.3 | 43.3 | 29.1 | 34.3 |
≤2B | Hala‑350M | 350M | 51.4 | 41.2 | 36.9 | 34.5 | 52.1 | 35.4 | 41.9 |
≤2B | LiquidAI/LFM2-700M | 700M | 50.1 | 38.3 | 34.3 | 32.5 | 56.3 | 37.2 | 41.4 |
≤2B | Hala‑700M | 700M | 55.5 | 45.9 | 40.6 | 34.7 | 65.2 | 39.4 | 46.9 |
≤2B | LiquidAI/LFM2-1.2B | 1.2B | 53.8 | 45.2 | 35.0 | 34.7 | 65.6 | 43.4 | 46.3 |
≤2B | Hala‑1.2B | 1.2B | 59.2 | 48.6 | 43.4 | 41.6 | 71.7 | 44.2 | 51.4 |
Size | Model Name | Params | AlGhafa | ArabicMMLU | EXAMS | MadinahQA | AraTrust | ArbMMLU‑HT | Average |
---|---|---|---|---|---|---|---|---|---|
7B–9B | CohereForAI/c4ai-command-r7b-arabic-02-2025 | 7B | 74.8 | 59.3 | 65.0 | 63.8 | 80.5 | 50.1 | 65.6 |
7B–9B | JasperV13/Yehia-7B-DPO-Reasoning-preview | 7B | 75.1 | 66.3 | 51.8 | 54.9 | 81.9 | 55.1 | 64.2 |
7B–9B | Navid-AI/Yehia-7B-preview | 7B | 70.8 | 64.9 | 52.1 | 54.4 | 87.5 | 53.4 | 63.9 |
7B–9B | JasperV13/Yehia-7B-Reasoning-preview | 7B | 75.2 | 66.3 | 52.7 | 55.0 | 80.8 | 55.2 | 64.2 |
7B–9B | ALLaM-AI/ALLaM-7B-Instruct-preview | 7B | 69.5 | 64.9 | 51.6 | 54.2 | 86.9 | 52.8 | 63.3 |
7B–9B | Qwen/Qwen2-7B-Instruct | 7B | 73.2 | 60.0 | 47.3 | 59.5 | 82.8 | 51.3 | 62.4 |
7B–9B | Qwen/Qwen3-8B-Base | 8B | 74.8 | 65.0 | 52.5 | 52.2 | 83.4 | 61.5 | 64.9 |
7B–9B | QCRI/Fanar-1-9B-Instruct | 9B | 76.4 | 65.8 | 52.7 | 73.3 | 88.3 | 58.6 | 69.2 |
7B–9B | Hala‑9B | 9B | 78.3 | 65.6 | 53.8 | 70.4 | 89.6 | 61.4 | 69.9 |
Evaluation protocol: lighteval on ArabicMMLU (OALL‑2) excluding AlRage.
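The tables do not say how the Average column is computed. A minimal sketch, assuming it is the unweighted mean of the six benchmark columns, recomputes it for the Hala rows and compares it with the reported value. The scores are copied from the tables above; the names `ROWS`, `mean_score`, and `check_averages` and the 0.1 tolerance (to absorb rounding of the published one-decimal scores) are illustrative choices, not part of the Hala release.

```python
# Assumption: Average = unweighted mean of the six benchmark columns.
# Column order: AlGhafa, ArabicMMLU, EXAMS, MadinahQA, AraTrust, ArbMMLU-HT.
# Each entry maps a model to (benchmark scores, reported Average).
ROWS = {
    "Hala-350M": ([51.4, 41.2, 36.9, 34.5, 52.1, 35.4], 41.9),
    "Hala-700M": ([55.5, 45.9, 40.6, 34.7, 65.2, 39.4], 46.9),
    "Hala-1.2B": ([59.2, 48.6, 43.4, 41.6, 71.7, 44.2], 51.4),
    "Hala-9B": ([78.3, 65.6, 53.8, 70.4, 89.6, 61.4], 69.9),
}


def mean_score(scores):
    """Unweighted mean of a list of benchmark scores."""
    return sum(scores) / len(scores)


def check_averages(rows, tol=0.1):
    """Return {model: (recomputed mean, reported Average, agrees within tol)}."""
    result = {}
    for name, (scores, reported) in rows.items():
        recomputed = mean_score(scores)
        result[name] = (recomputed, reported, abs(recomputed - reported) <= tol)
    return result


if __name__ == "__main__":
    for name, (recomputed, reported, ok) in check_averages(ROWS).items():
        status = "ok" if ok else "mismatch"
        print(f"{name}: recomputed {recomputed:.2f}, reported {reported} ({status})")
```

Under this assumption, every Hala row agrees with its reported Average to within the tolerance. The residual differences are what you would expect if the published averages were computed from unrounded per-benchmark scores.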
If you find Hala useful, please cite:
@misc{hammoud2025halatechnicalreportbuilding,
  title={Hala Technical Report: Building Arabic-Centric Instruction \& Translation Models at Scale},
  author={Hasan Abed Al Kader Hammoud and Mohammad Zbeeb and Bernard Ghanem},
  year={2025},
  url={https://arxiv.org/abs/2509.14008},
}