# Gemma-3-4B-Tigrinya-QA
Gemma-3-4B-Tigrinya-QA is a two-stage fine-tuned adaptation of Google's Gemma-3-4B specifically optimized for question-answering in Tigrinya (แตแแญแ).
The model answers questions in Tigrinya across a range of domains, including history, culture, and general knowledge.
Purpose: Tigrinya is a low-resource language with few high-performing open models available. This release aims to reduce barriers to entry for research and application development in the Tigrinya language space.
## Model Details
- Model Type: Instruction-tuned Causal Language Model
- Base Model: luel/gemma-3-4b-tigrinya (stage 1: 60M tokens)
- Parameters: 4 billion
- Architecture: Gemma 3 (`Gemma3ForCausalLM`)
- Training Precision: BF16 with TF32 acceleration (see the snippet after this list)
- Max Sequence Length: 1024 tokens
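
As noted in the Training Precision item, TF32 acceleration refers to PyTorch's TensorFloat-32 mode for matmul and cuDNN kernels on Ampere-or-newer GPUs. As a minimal sketch (not the author's exact training script), it is typically enabled like this:

```python
import torch

# Allow TensorFloat-32 kernels on Ampere+ GPUs, used alongside BF16 training.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
```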
## Training Process
### Stage 1: General Text Generation
- Base: Gemma-3-4B -> luel/gemma-3-4b-tigrinya
- Data: 60M tokens of mixed-domain Tigrinya (news, web, literature)
- Purpose: Language adaptation and vocabulary expansion
### Stage 2: Instruction Fine-tuning (This Model)
- Base: luel/gemma-3-4b-tigrinya -> luel/gemma-3-4b-tigrinya-qa
- Data: 67.5k curated Q&A pairs across governance, society, politics, culture, history, proverbs, etc.
- Format: Gemma chat template with user/assistant turns
## Dataset (Stage 2)
- Size: 67.5k question-answer pairs
- Language: Tigrinya (แตแแญแ)
- Domains: Geography, culture, history, politics, general knowledge
- Format: Chat template with `<start_of_turn>user` and `<start_of_turn>model` markers (see the example after this list)
- Split: 95% training / 5% validation
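
As an illustration of this format, here is how one pair (borrowed from the Examples section below) would be rendered for training:

```
<start_of_turn>user
แแแซแต แแญแ แฃแแชแซ แแพแ แฃแจแแต แฅแฎแ?<end_of_turn>
<start_of_turn>model
แขแตแฎแตแซแฃ แแกแฒแฃ แคแญแตแซแ แถแแแซแแข<end_of_turn>
```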
## Training Details (Stage 2)
- Training Framework: Hugging Face Transformers with `SFTTrainer` (a configuration sketch follows this list)
- Optimizer: AdamW with cosine learning rate schedule
- Learning Rate: 2e-5 with 3% warmup
- Weight Decay: 0.01
- Batch Size: 6 per device, 2 gradient accumulation steps (effective batch size: 12)
- Epochs: 3
- Evaluation: Every 500 steps
- Mixed Precision: BF16 with gradient checkpointing
- Hardware: NVIDIA GH200 120GB
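
For concreteness, here is a minimal sketch of an equivalent `SFTTrainer` setup. Parameter names follow recent TRL/Transformers releases; `output_dir`, `model`, and the dataset objects are placeholders, not the author's actual script:

```python
from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir="gemma-3-4b-tigrinya-qa",  # placeholder
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=2,        # effective batch size: 6 * 2 = 12
    num_train_epochs=3,
    eval_strategy="steps",
    eval_steps=500,
    bf16=True,
    gradient_checkpointing=True,
)

trainer = SFTTrainer(
    model=model,             # the Stage 1 model, luel/gemma-3-4b-tigrinya
    args=args,
    train_dataset=train_ds,  # 95% split of the Q&A pairs
    eval_dataset=eval_ds,    # 5% held-out split
)
trainer.train()
```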
## Usage
First, install the Transformers library (version 4.50 or higher):

```bash
pip install -U transformers
```
Then, you can use it for inference as follows:
```python
from transformers import Gemma3ForCausalLM, AutoTokenizer
import torch

model_id = "luel/gemma-3-4b-tigrinya-qa"
tok = AutoTokenizer.from_pretrained(model_id)
model = Gemma3ForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

question = "แตแแซแญ แฃแ แญ แตแญแจแฅ?"
messages = [{"role": "user", "content": question}]

# Render the Gemma chat template and tokenize the prompt.
prompt = tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tok(prompt, return_tensors="pt").to(model.device)

# You can add and adjust max_length, temperature, repetition_penalty, top_p, ...
outputs = model.generate(**inputs, max_new_tokens=1024)

# Keep special tokens so the model's turn can be extracted below.
full_response = tok.decode(outputs[0], skip_special_tokens=False)
model_response = full_response.split("<start_of_turn>model\n")[-1].split("<end_of_turn>")[0].strip()
print(model_response)
```
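
If you prefer not to split on template markers, you can instead decode only the newly generated tokens:

```python
# Slice off the prompt tokens and decode just the generated continuation.
gen_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tok.decode(gen_tokens, skip_special_tokens=True))
```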
## Chat Template

```
<start_of_turn>user
{user_message}<end_of_turn>
<start_of_turn>model
{model_response}<end_of_turn>
```
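
For reference, `tok.apply_chat_template` (as used in the Usage snippet) renders messages into exactly this layout; aside from any leading special tokens such as `<bos>`, a quick check looks like this (the user message here is hypothetical):

```python
messages = [{"role": "user", "content": "แฐแแ แฒ!"}]  # hypothetical user message
print(tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))
# <start_of_turn>user
# แฐแแ แฒ!<end_of_turn>
# <start_of_turn>model
```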
## Examples
Below is a short sample showcasing the model's response quality (generated with `max_new_tokens=512`, `temperature=0.6`, `top_p=0.9`).
Q (user): แแแซแต แแญแ แฃแแชแซ แแพแ แฃแจแแต แฅแฎแ?
A (model): แขแตแฎแตแซแฃ แแกแฒแฃ แคแญแตแซแ แถแแแซแแข
Q (user): แแแต แฒแแต แแตแ แแแแซแต แฃแแชแซ แฅแแณแญ แแแฅแฝแฒ แฅแฉ แแแแแแ?
A (model): แแแต แฒแแต แแตแ แฃแฅ แแแ แฃแแณแ แแแแฅแฒ แแฐแแแแ แแแซแต แฃแแชแซ แแแแฐแ แญแแแณ แจแแแฝแฅแ แแญแฃแจแแ แณแชแญ แแแ แฅแฉแข
Q (user): แแ แแตแ แฅแ แแแฃแฎแข แแต แแแ แฃแญแตแแจแญ
A (model): แแต แจแแข แฃแญแตแณแแฃแฅแข
Q (user): แฅแแแฃ แตแแซแญ แฅแแแฉ แแแจแ
A (model): แตแแซแญ แฃแฅ แฐแแ แขแตแฎแตแซ แฅแตแญแจแฅ แญแแ แฅแซแข แญแฅแฐ แจแฐแแฃ แแแ แฎแญแ แซแแฆแต แแ แญแฒ แจแฐแแณแต แตแ แฃแญแฑแแก แฝแจแก แแญแจแ แฅแจแแข แตแแซแญ แแฐแแแแฉ แแแขแฃ แณแชแฝ แแแแแ แแแณแ แฃแแ แแแแ แฅแแแแต แแแฅแจแ แฅแซแข แฅแแฐแแ แฐแแฅแฎแ แตแ-แฅแ แฃแ แแแญแณแตแ แแแแ แต แฆแณ แฅแซแข
Q (user): แฐแแญ แแแแณแญ แฅแแณแญ แแแต แฅแฉ?
A (model): แฐแแญ แแแแณแญ แแแแณแต แฅแแญแแญแแแแฃ แฅแแญแฃแตแแแ แฅแแญ แฃแแแแญแ แฅแแแจ แแแแแต แแแฅ แแแแตแณแ แฃแฐแซแญแ แจแแแฝแตแข แฐแแญ แแแแณแญ แฉแ แแ แแแจ แแฐแแต แจแแแแแฆ แญแแฅแญแข
## Evaluation

| Metric | Split | Value |
|---|---|---|
| Evaluation Loss | validation | 1.025 |
| Perplexity | validation | 2.79 |
| Token Accuracy | validation | 75% |
| Training Loss | final | 0.963 |
Validation corpus: 5% held-out split from 67.5k Q&A pairs
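
The reported perplexity is consistent with the evaluation loss, since perplexity is the exponential of the mean cross-entropy loss:

```python
import math

eval_loss = 1.025
print(round(math.exp(eval_loss), 2))  # 2.79
```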
## Limitations
- Language Mixing: May occasionally, though rarely, mix Amharic or English words into responses
- Domain Scope: Optimized for general Q&A; may not handle highly specialized technical queries optimally
- Factual Accuracy: Generated answers should be verified for factual correctness
- Context Length: Limited to 1024 tokens for both input and output (matching the training maximum sequence length)
- Base Model Limitations: Inherits limitations from the base Gemma-3-4B architecture
- No Multimodal: Text-only model; cannot process images, audio, or other media
- Bias: May reflect societal biases present in training data
## Citation

```bibtex
@misc{gemma-3-4b-tigrinya-qa,
  author       = {Luel},
  title        = {Gemma-3-4B-Tigrinya-QA: A Fine-tuned Question-Answering Model for Tigrinya},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/luel/gemma-3-4b-tigrinya-qa}}
}
```
## Acknowledgements
This model builds upon Google's Gemma 3 4B foundation model and the Stage 1 Tigrinya language adaptation (luel/gemma-3-4b-tigrinya). We acknowledge Google for making their foundation models available to the community, enabling the development of language-specific instruction-tuned models like this one.