Gemma-3-4B-Tigrinya
Gemma-3-4B-Tigrinya is a fully fine-tuned adaptation of Google's Gemma-3-4B-pt for Tigrinya (ትግርኛ).
This model demonstrates good generation and completion capabilities for Tigrinya content, particularly in news and historical contexts.
Purpose: Tigrinya is a low-resource language with limited high-performance open models available. This release aims to reduce barriers to entry for research and application development in the Tigrinya language space.
Model Details
- Model Type: Causal Language Model (Autoregressive)
- Base Model: google/gemma-3-4b-pt
- Parameters: 4 billion
- Architecture: Gemma 3 (`Gemma3ForCausalLM`)
- Training Precision: BF16 with gradient checkpointing
- Max Sequence Length: 2048 tokens
- Tokenizer: AutoTokenizer from the base Gemma model, with the pad token set to the EOS token (see the loading sketch after this list)
- Hardware Requirements:
  - Minimum 8GB GPU memory for inference
  - 16GB+ recommended for optimal performance
  - Trained on NVIDIA GH200 120GB
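The card does not prescribe a particular loading recipe beyond the pipeline example in the Usage section below; as a minimal sketch, the model and tokenizer can be loaded directly with the settings listed above (BF16 weights, pad token set to EOS). `model_id` is the repository name:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "luel/gemma-3-4b-tigrinya"

# The tokenizer is the base Gemma tokenizer; per the details above,
# the pad token is set to the EOS token.
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Load the weights in BF16, matching the training precision and staying
# within the ~8GB minimum GPU memory noted above.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```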
Dataset
- Size: 59 million tokens
- Language: Tigrinya (ትግርኛ)
- Sources: News articles, public content, and web-scraped Tigrinya text
- Format: Plain text files (.txt)
- Split: 97% training / 3% validation
- Preprocessing (a minimal sketch follows this list):
  - UTF-8 encoding validation
  - Basic text cleaning
  - No vocabulary extension (uses the base Gemma tokenizer)
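The preprocessing code itself is not published; the sketch below is a hypothetical illustration of the UTF-8 validation, basic cleaning, and 97/3 split described above. The `corpus` directory and helper names are assumptions:

```python
import random
from pathlib import Path

def read_valid_utf8(path: Path) -> str | None:
    # Keep only files that decode as strict UTF-8.
    try:
        return path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return None

def clean(text: str) -> str:
    # "Basic text cleaning" here: trim whitespace and drop empty lines.
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)

# Hypothetical 97% / 3% train/validation split over the corpus files.
docs = [clean(t) for p in sorted(Path("corpus").glob("*.txt"))
        if (t := read_valid_utf8(p)) is not None]
random.Random(42).shuffle(docs)
cut = int(len(docs) * 0.97)
train_docs, val_docs = docs[:cut], docs[cut:]
```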
Training Details
- Training Framework: Hugging Face Transformers `Trainer` (a configuration sketch follows this list)
- Optimizer: AdamW
- Learning Rate: 5e-5 with cosine schedule
- Warmup Steps: 300
- Weight Decay: 0.01
- Batch Size: 4 per device, 8 gradient accumulation steps (effective batch size: 32)
- Epochs: 6 maximum (early stopped with patience=2)
- Evaluation: ~twice per epoch with early stopping
- Mixed Precision: BF16 with gradient checkpointing
- Hardware: NVIDIA GH200 120GB
- Multi-GPU: Accelerate framework for efficient training
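The training script is not included in the card; the sketch below mirrors the hyperparameters listed above using the `Trainer` API. Argument names follow recent Transformers releases (`eval_strategy` was `evaluation_strategy` before v4.41), and `model`, `train_dataset`, and `eval_dataset` are placeholders:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-3-4b-tigrinya",      # illustrative output path
    learning_rate=5e-5,                    # AdamW is the Trainer default optimizer
    lr_scheduler_type="cosine",
    warmup_steps=300,
    weight_decay=0.01,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,         # effective batch size: 4 * 8 = 32
    num_train_epochs=6,
    bf16=True,
    gradient_checkpointing=True,
    eval_strategy="steps",                 # evaluated roughly twice per epoch
    save_strategy="steps",
    load_best_model_at_end=True,           # required for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,                           # the loaded Gemma model
    args=training_args,
    train_dataset=train_dataset,           # tokenized 97% training split
    eval_dataset=eval_dataset,             # tokenized 3% validation split
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```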
Usage
```python
from transformers import pipeline

# Load the model and create a text-generation pipeline.
generator = pipeline(
    "text-generation",
    model="luel/gemma-3-4b-tigrinya",
    device_map="auto",  # place the model on GPU if available
)

prompt = "ጥንታዊ ስልጣነ"

# Generate text; max_length, temperature, repetition_penalty, top_p, etc.
# can be adjusted as needed.
outputs = generator(
    prompt,
    max_length=1024,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
)
print(outputs[0]["generated_text"])
```
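A note on length control: `max_length` counts the prompt tokens as well as the generated ones. To bound only the newly generated text, standard Transformers `generate()` arguments such as `max_new_tokens` also work through the pipeline (the value below is illustrative):

```python
outputs = generator(
    prompt,
    max_new_tokens=256,       # limit applies to generated tokens only
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,
)
```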
Examples
The following examples demonstrate the model's capabilities across different contexts. All examples were generated with `max_length=250`, `temperature=0.6`, `top_p=0.9`, and `repetition_penalty=1.2`.
Example 1:
Input:
ቅዱስ ያሬድ ኣብ ታሪኽ ኣፍሪቃ
Generated:
ቅዱስ ያሬድ ኣብ ታሪኽ ኣፍሪቃ ዓቢ ተራ ዝተጻወቱ 5 ምሁራት - ቅዱስ ያሬድን ትውልደ ቤተ ክርስትያንን, ቅዱስ ያሬድ፡ ንዘመናዊ ሙዚቃ፣ ስነ-ጽሑፍን ግጥምን መበገሲ ከምዝኾኑ ይእመን።
ኣብ ማህደር ሃይማኖታዊ ደርፊታት ኦሮሞ ብቐረባ ምስራሕ ዘካየዶ ኢትዮጵያውያን ሊቕ ሰዓቢ ሕሳብ (ኤክሰፐርት ቁጠባ)፡ እቲ ልዕሊ 1, 200 ዓመት ዕድመ ዘለዎ ናይ ቀደም ሙዚቃ ኣሎ ክብሉ ገሊጾም።
"ቀዳሞት ድርፊታትን ነዋሕቲ መልሓሶምን ካብቶም ናይ ቀደም በዓላት\'ዩ ዝርከብ፤ እዞም ውሩያት ህላዌኦም ግን ዳግማይ ክጅምሩ የብሎምን" ይብሉ።ሙዚቃ፡ ብዛዕባ ፍ
Example 2:
Input:
ፕሪሜርሊግ እዚ ዓመት
Generated:
ፕሪሜርሊግ እዚ ዓመት ብዓወት ማንቸስተር ሲቲ ተዛዚሙ - ፕሪሜር ሊግ፡ ዝደመቝን ዘደንቕን ኣፍሪቃውያን ሸቶታት ቀጥታዊ ሸፈነ ብራይተን-ማንችስትር ሲቲ (ሰሌዳ ጋንታ), ኣብ መወዳእታ ግጥም ናይ'ዚ ወድድር ሰኑይ ምሸት ክጋጠማ እየን።
ኣብ ታሪክ ንፈለማ እዋን ድማ ስፐርስ ካብ ሜዳአን ወጻኢ ከካየድኦ ይኽእላ'ለዋ።ብራይተን ነታ ዋንጫ ቅድሚ ትሽዓተ ደቓይቕ ከተረጋግጸላ እንከላ፤ ኣብ ሜዳ 'ክረንስ ቻለንጅስ ፓወር ኢስታድዮም' ኣብ ልዕሊ ሲቲ ዓወት ከመዝግብሉዶ ይኸውን?ሲቲ ግን
ሕጂ'ውን ቀልጢፋ ማዕረ ትገብር'ያ ዘላ።መን ኮን ይፈልጡ...?
Example 3:
Input:
ጥንታዊ ስልጣነ ኣኽሱም
Generated:
ጥንታዊ ስልጣነ ኣኽሱም
ኣብ ትግራይ ካብ ዝርከቡ ጥንታውያን ሓድግታት እቲ ቐዳማይ ስልጣነ ኣኽሱም እዩ፡፡ ኣብ ዓለምና ካብቶም ቀዲሞም ዝተመስረቱ ናይ ከተማ ምሕደራ ዘለዎም ጥንታዊ ስልጣነታት ሓደ ከምዝኾነ ይፍለጥ። ብቀዳማዊ ነገስታት ኣኽሱም ዝተመርሐ ህዝቢ ኣኽሱም
ኣብቲ ዞባ ንዘመናት ዝጸንሐ ብዙሕ ባህሊ፣ መንግስትን ታሪኽን ኣለዎ። ኣኽሱም ካብ መበል 1ይ ክፍለ-ዘመን ድሕሪ ልደተ ክርስቶስ ጀሚሩ ክሳዕ መበል 9ይ ክፍለ ዘመን ኣብ ማእኸላይ ምብራቕ ዓባይ ሃገር ኮይና ሰሪሓ
Evaluation
| Metric | Split | Value |
|---|---|---|
| Perplexity | validation | 2.50 |
| Training loss | train | 0.48 |
| Validation loss | validation | 0.91 |

Validation corpus: the held-out 3% split of the 59M-token dataset.
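As a sanity check, perplexity is the exponential of the mean cross-entropy loss, so the reported validation numbers are self-consistent:

```python
import math

# exp(0.91) ≈ 2.48, matching the reported validation perplexity of 2.50
# once rounding of the loss is taken into account.
print(math.exp(0.91))  # 2.4843...
```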
Limitations
- The model inherits limitations from the base Gemma-3 4B architecture.
- Limited exposure to specialized technical or scientific Tigrinya vocabulary.
- May not handle complex multilingual tasks optimally.
- Generated text should be reviewed for accuracy, especially for factual content.
Citation
```bibtex
@misc{gemma-3-4b-tigrinya,
  author = {luel},
  title = {Gemma-3-4B-Tigrinya: A Fine-tuned Language Model for Tigrinya},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/luel/gemma-3-4b-tigrinya}}
}
```
Acknowledgements
This model builds upon Google's Gemma 3 4B model. We acknowledge Google for making their foundation models available to the community, enabling the development of language-specific adaptations like this one.