Hala: Arabic‑Centric Instruction & Translation Models

Paper: Hala Technical Report: Building Arabic‑Centric Instruction & Translation Models at Scale

Authors: Hasan Abed Al Kader Hammoud*, Mohammad Zbeeb*, Bernard Ghanem

Affiliation: King Abdullah University of Science and Technology (KAUST)

*Equal contribution

In Arabic, حلا (Hala) conveys sweetness and beauty—qualities long associated with the language itself. In this spirit, we call our models Hala.


🔗 Quick Links

Paper: https://arxiv.org/abs/2509.14008
Model: https://huggingface.co/hammh0a/Hala-1.2B

Example

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "hammh0a/Hala-1.2B"  # pick a released Hala model

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat-formatted prompt. (System: "You are an expert physics assistant.";
# user: "Briefly explain the principle of conservation in physics, and give me an everyday example.")
messages = [
    {"role": "system", "content": "أنت مساعد خبير في الفيزياء."},
    {"role": "user", "content": "اشرح بإيجاز مبدأ الانحفاظ في الفيزياء، وأعطني مثالاً يومياً."},
]

prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipe = pipeline("text-generation", model=model, tokenizer=tok)
out = pipe(prompt, max_new_tokens=256, do_sample=False)  # greedy decoding

# By default the pipeline returns the prompt followed by the completion.
print(out[0]["generated_text"])
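
Hala is trained for translation as well as instruction following. The sketch below sends an English→Arabic request through the same pipeline; the instruction wording is an illustrative assumption, not necessarily the exact template used during training.

# Translation sketch (assumed prompt wording; reuses tok and pipe from above)
translation_messages = [
    {"role": "user", "content": "Translate the following sentence into Arabic: 'Knowledge is light.'"},
]
translation_prompt = tok.apply_chat_template(
    translation_messages, tokenize=False, add_generation_prompt=True
)
out = pipe(translation_prompt, max_new_tokens=64, do_sample=False)
print(out[0]["generated_text"])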

📊 Results

Hala models are listed at the end of each size category; the best Average per category is shown in bold.

≤2B parameters

| Size | Model Name | Params | AlGhafa | ArabicMMLU | EXAMS | MadinahQA | AraTrust | ArbMMLU‑HT | Average |
|------|------------|--------|---------|------------|-------|-----------|----------|------------|---------|
| ≤2B | meta-llama/Llama-3.2-1B | 1B | 33.9 | 26.5 | 21.2 | 25.7 | 37.1 | 23.9 | 28.0 |
| ≤2B | Qwen/Qwen2-1.5B-Instruct | 1.5B | 53.1 | 49.2 | 35.2 | 45.5 | 68.9 | 37.4 | 48.2 |
| ≤2B | Qwen/Qwen2.5-1.5B-Instruct | 1.5B | 48.4 | 43.5 | 31.8 | 38.2 | 70.8 | 35.9 | 44.8 |
| ≤2B | Sakalti/Saka-1.5B | 1.5B | 51.4 | 40.0 | 31.3 | 31.5 | 47.5 | 33.5 | 39.2 |
| ≤2B | Qwen/Qwen3-1.7B-Base | 1.7B | 56.8 | 49.7 | 38.2 | 40.0 | 75.6 | 43.9 | 50.7 |
| ≤2B | Qwen/Qwen1.5-1.8B | 1.8B | 32.7 | 26.7 | 23.8 | 26.0 | 31.5 | 23.6 | 27.4 |
| ≤2B | silma-ai/SILMA-Kashif-2B-Instruct-v1.0 | 2B | 59.7 | 45.6 | 33.1 | 38.8 | 73.3 | 35.8 | 47.7 |
| ≤2B | google/gemma-2-2b-it | 2B | 34.1 | 30.1 | 23.6 | 20.1 | 31.2 | 23.4 | 27.1 |
| ≤2B | LiquidAI/LFM2-350M | 350M | 39.0 | 35.2 | 30.9 | 28.3 | 43.3 | 29.1 | 34.3 |
| ≤2B | Hala‑350M | 350M | 51.4 | 41.2 | 36.9 | 34.5 | 52.1 | 35.4 | 41.9 |
| ≤2B | LiquidAI/LFM2-700M | 700M | 50.1 | 38.3 | 34.3 | 32.5 | 56.3 | 37.2 | 41.4 |
| ≤2B | Hala‑700M | 700M | 55.5 | 45.9 | 40.6 | 34.7 | 65.2 | 39.4 | 46.9 |
| ≤2B | LiquidAI/LFM2-1.2B | 1.2B | 53.8 | 45.2 | 35.0 | 34.7 | 65.6 | 43.4 | 46.3 |
| ≤2B | Hala‑1.2B | 1.2B | 59.2 | 48.6 | 43.4 | 41.6 | 71.7 | 44.2 | **51.4** |

7B–9B parameters

| Size | Model Name | Params | AlGhafa | ArabicMMLU | EXAMS | MadinahQA | AraTrust | ArbMMLU‑HT | Average |
|------|------------|--------|---------|------------|-------|-----------|----------|------------|---------|
| 7B–9B | CohereForAI/c4ai-command-r7b-arabic-02-2025 | 7B | 74.8 | 59.3 | 65.0 | 63.8 | 80.5 | 50.1 | 65.6 |
| 7B–9B | JasperV13/Yehia-7B-DPO-Reasoning-preview | 7B | 75.1 | 66.3 | 51.8 | 54.9 | 81.9 | 55.1 | 64.2 |
| 7B–9B | Navid-AI/Yehia-7B-preview | 7B | 70.8 | 64.9 | 52.1 | 54.4 | 87.5 | 53.4 | 63.9 |
| 7B–9B | JasperV13/Yehia-7B-Reasoning-preview | 7B | 75.2 | 66.3 | 52.7 | 55.0 | 80.8 | 55.2 | 64.2 |
| 7B–9B | ALLaM-AI/ALLaM-7B-Instruct-preview | 7B | 69.5 | 64.9 | 51.6 | 54.2 | 86.9 | 52.8 | 63.3 |
| 7B–9B | Qwen/Qwen2-7B-Instruct | 7B | 73.2 | 60.0 | 47.3 | 59.5 | 82.8 | 51.3 | 62.4 |
| 7B–9B | Qwen/Qwen3-8B-Base | 8B | 74.8 | 65.0 | 52.5 | 52.2 | 83.4 | 61.5 | 64.9 |
| 7B–9B | QCRI/Fanar-1-9B-Instruct | 9B | 76.4 | 65.8 | 52.7 | 73.3 | 88.3 | 58.6 | 69.2 |
| 7B–9B | Hala‑9B | 9B | 78.3 | 65.6 | 53.8 | 70.4 | 89.6 | 61.4 | **69.9** |

Evaluation protocol: lighteval on the OALL‑2 benchmark suite (AlGhafa, ArabicMMLU, EXAMS, MadinahQA, AraTrust, ArbMMLU‑HT), excluding AlRage.
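
The Average column appears to be the unweighted mean of the six benchmark scores; the short check below recomputes it for the two bold rows (the equal-weight aggregation is an assumption, and small rounding differences are possible).

# Recompute the Average column as an unweighted mean (assumed aggregation)
rows = {
    "Hala-1.2B": [59.2, 48.6, 43.4, 41.6, 71.7, 44.2],  # table Average: 51.4
    "Hala-9B":   [78.3, 65.6, 53.8, 70.4, 89.6, 61.4],  # table Average: 69.9
}
for name, scores in rows.items():
    print(f"{name}: mean = {sum(scores) / len(scores):.2f}")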


📚 Citation

If you find Hala useful, please cite:

@misc{hammoud2025halatechnicalreportbuilding,
      title={Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale}, 
      author={Hasan Abed Al Kader Hammoud and Mohammad Zbeeb and Bernard Ghanem},
      year={2025},
      url={https://arxiv.org/abs/2509.14008}, 
}