arxiv:2509.14008

Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale

Published on Sep 17
Submitted by Hasan Abed Al Kader Hammoud on Sep 18
#1 Paper of the day

Abstract

Hala, a family of Arabic-centric instruction and translation models, achieves state-of-the-art results using a translate-and-tune pipeline, slerp merging, and fine-tuning on high-quality bilingual supervision.

AI-generated summary

We present Hala, a family of Arabic-centric instruction and translation models built with our translate-and-tune pipeline. We first compress a strong AR↔EN teacher to FP8 (yielding ~2× higher throughput with no quality loss) and use it to create high-fidelity bilingual supervision. A lightweight language model, LFM2-1.2B, is then fine-tuned on this data and used to translate high-quality English instruction sets into Arabic, producing a million-scale corpus tailored to instruction following. We train Hala models at 350M, 700M, 1.2B, and 9B parameters, and apply slerp merging to balance Arabic specialization with base-model strengths. On Arabic-centric benchmarks, Hala achieves state-of-the-art results within both the "nano" (≤2B) and "small" (7-9B) categories, outperforming the respective base models. We release models, data, evaluation, and recipes to accelerate research in Arabic NLP.
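Two of the techniques above are concrete enough to sketch. First, the FP8 compression of the AR↔EN teacher. The report's abstract does not spell out the quantization recipe, so the snippet below is only a minimal per-tensor E4M3 round-trip in PyTorch; the `fp8_roundtrip` helper, the per-tensor scaling, and the synthetic weight tensor are illustrative assumptions, not the authors' pipeline.

```python
import torch

# Minimal per-tensor FP8 (E4M3) round-trip; assumes PyTorch >= 2.1.
# Illustrative only: the report does not describe its exact FP8 recipe.
FP8_MAX = 448.0  # largest finite value in torch.float8_e4m3fn

def fp8_roundtrip(w: torch.Tensor) -> torch.Tensor:
    """Quantize a weight tensor to FP8 E4M3, then dequantize it back."""
    scale = w.abs().max().clamp(min=1e-12) / FP8_MAX  # per-tensor scale
    q = (w / scale).to(torch.float8_e4m3fn)           # quantize
    return q.to(w.dtype) * scale                      # dequantize

w = torch.randn(4096, 4096, dtype=torch.bfloat16)
err = (w - fp8_roundtrip(w)).abs().float().mean().item()
print(f"mean |w - fp8(w)|: {err:.2e}")
```

Second, slerp merging. Spherical linear interpolation between a fine-tuned checkpoint and its base is a common model-merging technique; the per-tensor sketch below shows the core computation. The `merge_state_dicts` helper and the interpolation factor `t` are hypothetical names, and the paper's exact merge configuration is not given in the abstract.

```python
import torch

def slerp(w0: torch.Tensor, w1: torch.Tensor, t: float,
          eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Flattens each tensor to a vector, interpolates along the great circle
    between them, and falls back to plain lerp when they are near-colinear.
    """
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    cos_omega = torch.dot(v0, v1) / (v0.norm() * v1.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))  # angle between vectors
    if omega < eps:  # near-colinear: slerp degenerates to lerp
        merged = (1.0 - t) * v0 + t * v1
    else:
        s = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / s) * v0 \
               + (torch.sin(t * omega) / s) * v1
    return merged.reshape(w0.shape).to(w0.dtype)

def merge_state_dicts(base_sd: dict, tuned_sd: dict, t: float = 0.5) -> dict:
    """Merge two matching state dicts tensor-by-tensor with slerp."""
    return {name: slerp(base_sd[name], tuned_sd[name], t) for name in base_sd}
```

Setting `t` closer to 1 weights the merge toward the Arabic-specialized checkpoint; per the abstract, this balance is what lets Hala keep base-model strengths while specializing in Arabic.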

Community

Paper author and submitter:

A series of state-of-the-art nano- and small-scale Arabic language models.

Models citing this paper: 5

Datasets citing this paper: 1

Spaces citing this paper: 0

Collections including this paper: 2