Gemma-3-4B-Tigrinya
Gemma-3-4B-Tigrinya is a fully fine-tuned adaptation of Google's Gemma-3-4B-pt for Tigrinya (ትግርኛ).
This model demonstrates good generation and completion capabilities for Tigrinya content, particularly in news and historical contexts.
Purpose: Tigrinya is a low-resource language with limited high-performance open models available. This release aims to reduce barriers to entry for research and application development in the Tigrinya language space.
Model Details
- Model Type: Causal Language Model (Autoregressive)
- Base Model: google/gemma-3-4b-pt
- Parameters: 4 billion
- Architecture: Gemma 3 (`Gemma3ForCausalLM`)
- Training Precision: BF16 with gradient checkpointing
- Max Sequence Length: 2048 tokens
- Tokenizer: AutoTokenizer from the base Gemma model, with the pad token set to the EOS token (see the loading sketch after this list)
- Hardware Requirements:
  - Minimum 8GB GPU memory for inference
  - 16GB+ recommended for optimal performance
  - Trained on NVIDIA GH200 120GB
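The card does not prescribe a particular loading recipe beyond the pipeline example in the Usage section below; as a minimal sketch, the model and tokenizer can be loaded directly with the settings listed above (BF16 weights, pad token set to EOS). `model_id` is the repository name:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "luel/gemma-3-4b-tigrinya"

# The tokenizer is the base Gemma tokenizer; per the details above,
# the pad token is set to the EOS token.
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Load the weights in BF16, matching the training precision and staying
# within the ~8GB minimum GPU memory noted above.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```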
Dataset
- Size: 59 million tokens
- Language: Tigrinya (ትግርኛ)
- Sources: News articles, public content, and web-scraped Tigrinya text
- Format: Plain text files (.txt)
- Split: 97% training / 3% validation
- Preprocessing (a minimal sketch follows this list):
  - UTF-8 encoding validation
  - Basic text cleaning
  - No vocabulary extension (uses the base Gemma tokenizer)
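The preprocessing code itself is not published; the sketch below is a hypothetical illustration of the UTF-8 validation, basic cleaning, and 97/3 split described above. The `corpus` directory and helper names are assumptions:

```python
import random
from pathlib import Path

def read_valid_utf8(path: Path) -> str | None:
    # Keep only files that decode as strict UTF-8.
    try:
        return path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return None

def clean(text: str) -> str:
    # "Basic text cleaning" here: trim whitespace and drop empty lines.
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)

# Hypothetical 97% / 3% train/validation split over the corpus files.
docs = [clean(t) for p in sorted(Path("corpus").glob("*.txt"))
        if (t := read_valid_utf8(p)) is not None]
random.Random(42).shuffle(docs)
cut = int(len(docs) * 0.97)
train_docs, val_docs = docs[:cut], docs[cut:]
```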
Training Details
- Training Framework: Hugging Face Transformers `Trainer` (a configuration sketch follows this list)
- Optimizer: AdamW
- Learning Rate: 5e-5 with cosine schedule
- Warmup Steps: 300
- Weight Decay: 0.01
- Batch Size: 4 per device, 8 gradient accumulation steps (effective batch size: 32)
- Epochs: 6 maximum (early stopped with patience=2)
- Evaluation: ~twice per epoch with early stopping
- Mixed Precision: BF16 with gradient checkpointing
- Hardware: NVIDIA GH200 120GB
- Multi-GPU: Accelerate framework for efficient training
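The training script is not included in the card; the sketch below mirrors the hyperparameters listed above using the `Trainer` API. Argument names follow recent Transformers releases (`eval_strategy` was `evaluation_strategy` before v4.41), and `model`, `train_dataset`, and `eval_dataset` are placeholders:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-3-4b-tigrinya",      # illustrative output path
    learning_rate=5e-5,                    # AdamW is the Trainer default optimizer
    lr_scheduler_type="cosine",
    warmup_steps=300,
    weight_decay=0.01,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,         # effective batch size: 4 * 8 = 32
    num_train_epochs=6,
    bf16=True,
    gradient_checkpointing=True,
    eval_strategy="steps",                 # evaluated roughly twice per epoch
    save_strategy="steps",
    load_best_model_at_end=True,           # required for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,                           # the loaded Gemma model
    args=training_args,
    train_dataset=train_dataset,           # tokenized 97% training split
    eval_dataset=eval_dataset,             # tokenized 3% validation split
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```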
Usage
```python
from transformers import pipeline

# Load the model and create a text-generation pipeline.
generator = pipeline(
    "text-generation",
    model="luel/gemma-3-4b-tigrinya",
    device_map="auto",  # place the model on GPU if available
)

prompt = "ጥንታዊ ስልጣነ"

# Generate text; max_length, temperature, repetition_penalty, top_p, etc.
# can be adjusted as needed.
outputs = generator(
    prompt,
    max_length=1024,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
)
print(outputs[0]["generated_text"])
```
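A note on length control: `max_length` counts the prompt tokens as well as the generated ones. To bound only the newly generated text, standard Transformers `generate()` arguments such as `max_new_tokens` also work through the pipeline (the value below is illustrative):

```python
outputs = generator(
    prompt,
    max_new_tokens=256,       # limit applies to generated tokens only
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
)
```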
Examples
The following examples demonstrate the model's capabilities across different contexts. All examples were generated with `max_length=250`, `temperature=0.6`, `top_p=0.9`, and `repetition_penalty=1.2`.
Example 1:
Input:
ቅዱስ ያሬድ ኣብ ታሪኽ ኣፍሪቃ
Generated:
ቅዱስ ያሬድ ኣብ ታሪኽ ኣፍሪቃ ዓቢ ተራ ዝተጻወቱ 5 ምሁራት - ቅዱስ ያሬድን ትውልደ ቤተ ክርስትያንን, ቅዱስ ያሬድ፡ ንዘመናዊ ሙዚቃ፣ ስነ-ጽሑፍን ግጥምን መበገሲ ከምዝኾኑ ይእመን።
ኣብ ማህደር ሃይማኖታዊ ደርፊታት ኦሮሞ ብቐረባ ምስራሕ ዘካየዶ ኢትዮጵያውያን ሊቕ ሰዓቢ ሕሳብ (ኤክሰፐርት ቁጠባ)፡ እቲ ልዕሊ 1, 200 ዓመት ዕድመ ዘለዎ ናይ ቀደም ሙዚቃ ኣሎ ክብሉ ገሊጾም።
"ቀዳሞት ድርፊታትን ነዋሕቲ መልሓሶምን ካብቶም ናይ ቀደም በዓላት\'ዩ ዝርከብ፤ እዞም ውሩያት ህላዌኦም ግን ዳግማይ ክጅምሩ የብሎምን" ይብሉ።ሙዚቃ፡ ብዛዕባ ፍ
Example 2:
Input:
ፕሪሜርሊግ እዚ ዓመት
Generated:
ፕሪሜርሊግ እዚ ዓመት ብዓወት ማንቸስተር ሲቲ ተዛዚሙ - ፕሪሜር ሊግ፡ ዝደመቝን ዘደንቕን ኣፍሪቃውያን ሸቶታት ቀጥታዊ ሸፈነ ብራይተን-ማንችስትር ሲቲ (ሰሌዳ ጋንታ), ኣብ መወዳእታ ግጥም ናይ'ዚ ወድድር ሰኑይ ምሸት ክጋጠማ እየን።
ኣብ ታሪክ ንፈለማ እዋን ድማ ስፐርስ ካብ ሜዳአን ወጻኢ ከካየድኦ ይኽእላ'ለዋ።ብራይተን ነታ ዋንጫ ቅድሚ ትሽዓተ ደቓይቕ ከተረጋግጸላ እንከላ፤ ኣብ ሜዳ 'ክረንስ ቻለንጅስ ፓወር ኢስታድዮም' ኣብ ልዕሊ ሲቲ ዓወት ከመዝግብሉዶ ይኸውን?ሲቲ ግን
ሕጂ'ውን ቀልጢፋ ማዕረ ትገብር'ያ ዘላ።መን ኮን ይፈልጡ...?
Example 3:
Input:
ጥንታዊ ስልጣነ ኣኽሱም
Generated:
ጥንታዊ ስልጣነ ኣኽሱም
ኣብ ትግራይ ካብ ዝርከቡ ጥንታውያን ሓድግታት እቲ ቐዳማይ ስልጣነ ኣኽሱም እዩ፡፡ ኣብ ዓለምና ካብቶም ቀዲሞም ዝተመስረቱ ናይ ከተማ ምሕደራ ዘለዎም ጥንታዊ ስልጣነታት ሓደ ከምዝኾነ ይፍለጥ። ብቀዳማዊ ነገስታት ኣኽሱም ዝተመርሐ ህዝቢ ኣኽሱም
ኣብቲ ዞባ ንዘመናት ዝጸንሐ ብዙሕ ባህሊ፣ መንግስትን ታሪኽን ኣለዎ። ኣኽሱም ካብ መበል 1ይ ክፍለ-ዘመን ድሕሪ ልደተ ክርስቶስ ጀሚሩ ክሳዕ መበል 9ይ ክፍለ ዘመን ኣብ ማእኸላይ ምብራቕ ዓባይ ሃገር ኮይና ሰሪሓ
Evaluation
| Metric | Split | Value |
|---|---|---|
| Perplexity | validation | 2.50 |
| Training loss | train | 0.48 |
| Validation loss | validation | 0.91 |

Validation corpus: the held-out 3% split of the 59M-token dataset.
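As a sanity check, perplexity is the exponential of the mean cross-entropy loss, so the reported validation numbers are self-consistent:

```python
import math

# exp(0.91) ≈ 2.48, matching the reported validation perplexity of 2.50
# once rounding of the loss is taken into account.
print(math.exp(0.91))  # 2.4843...
```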
Limitations
- The model inherits limitations from the base Gemma-3 4B architecture.
- Limited exposure to specialized technical or scientific Tigrinya vocabulary.
- May not handle complex multilingual tasks optimally.
- Generated text should be reviewed for accuracy, especially for factual content.
Citation
```bibtex
@misc{gemma-3-4b-tigrinya,
  author = {luel},
  title = {Gemma-3-4B-Tigrinya: A Fine-tuned Language Model for Tigrinya},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/luel/gemma-3-4b-tigrinya}}
}
```
Acknowledgements
This model builds upon Google's Gemma 3 4B model. We acknowledge Google for making their foundation models available to the community, enabling the development of language-specific adaptations like this one.