# Gemma-3-4B-Tigrinya-QA
Gemma-3-4B-Tigrinya-QA is a two-stage fine-tuned adaptation of Google's Gemma-3-4B specifically optimized for question-answering in Tigrinya (แตแแญแ).
The model answers questions in Tigrinya across a range of domains, including history, culture, and general knowledge.
Purpose: Tigrinya is a low-resource language with few high-performing open models available. This release aims to reduce barriers to entry for research and application development in the Tigrinya language space.
## Model Details
- Model Type: Instruction-tuned Causal Language Model
- Base Model: luel/gemma-3-4b-tigrinya (stage 1: 60M tokens)
- Parameters: 4 billion
- Architecture: Gemma 3 (`Gemma3ForCausalLM`)
- Training Precision: BF16 with TF32 acceleration (see the snippet after this list)
- Max Sequence Length: 1024 tokens
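
As noted in the Training Precision item, TF32 acceleration refers to PyTorch's TensorFloat-32 mode for matmul and cuDNN kernels on Ampere-or-newer GPUs. As a minimal sketch (not the author's exact training script), it is typically enabled like this:

```python
import torch

# Allow TensorFloat-32 kernels on Ampere+ GPUs, used alongside BF16 training.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
```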
## Training Process
### Stage 1: General Text Generation
- Base: Gemma-3-4B -> luel/gemma-3-4b-tigrinya
- Data: 60M tokens of mixed-domain Tigrinya (news, web, literature)
- Purpose: Language adaptation and vocabulary expansion
### Stage 2: Instruction Fine-tuning (This Model)
- Base: luel/gemma-3-4b-tigrinya -> luel/gemma-3-4b-tigrinya-qa
- Data: 67.5k curated Q&A pairs across governance, society, politics, culture, history, proverbs, etc.
- Format: Gemma chat template with user/assistant turns
## Dataset (Stage 2)
- Size: 67.5k question-answer pairs
- Language: Tigrinya (แตแแญแ)
- Domains: Geography, culture, history, politics, general knowledge
- Format: Chat template with `<start_of_turn>user` and `<start_of_turn>model` markers (see the example after this list)
- Split: 95% training / 5% validation
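
As an illustration of this format, here is how one pair (borrowed from the Examples section below) would be rendered for training:

```
<start_of_turn>user
แแแซแต แแญแ แฃแแชแซ แแพแ แฃแจแแต แฅแฎแ?<end_of_turn>
<start_of_turn>model
แขแตแฎแตแซแฃ แแกแฒแฃ แคแญแตแซแ แถแแแซแแข<end_of_turn>
```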
## Training Details (Stage 2)
- Training Framework: Hugging Face Transformers with `SFTTrainer` (a configuration sketch follows this list)
- Optimizer: AdamW with cosine learning rate schedule
- Learning Rate: 2e-5 with 3% warmup
- Weight Decay: 0.01
- Batch Size: 6 per device, 2 gradient accumulation steps (effective batch size: 12)
- Epochs: 3
- Evaluation: Every 500 steps
- Mixed Precision: BF16 with gradient checkpointing
- Hardware: NVIDIA GH200 120GB
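
For concreteness, here is a minimal sketch of an equivalent `SFTTrainer` setup. Parameter names follow recent TRL/Transformers releases; `output_dir`, `model`, and the dataset objects are placeholders, not the author's actual script:

```python
from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir="gemma-3-4b-tigrinya-qa",  # placeholder
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=2,        # effective batch size: 6 * 2 = 12
    num_train_epochs=3,
    eval_strategy="steps",
    eval_steps=500,
    bf16=True,
    gradient_checkpointing=True,
)

trainer = SFTTrainer(
    model=model,             # the Stage 1 model, luel/gemma-3-4b-tigrinya
    args=args,
    train_dataset=train_ds,  # 95% split of the Q&A pairs
    eval_dataset=eval_ds,    # 5% held-out split
)
trainer.train()
```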
## Usage
First, install the Transformers library (version 4.50 or higher):

```bash
pip install -U transformers
```
Then, you can use it for inference as follows:
```python
from transformers import Gemma3ForCausalLM, AutoTokenizer
import torch

model_id = "luel/gemma-3-4b-tigrinya-qa"
tok = AutoTokenizer.from_pretrained(model_id)
model = Gemma3ForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

question = "แตแแซแญ แฃแ แญ แตแญแจแฅ?"
messages = [{"role": "user", "content": question}]

# Render the Gemma chat template and tokenize the prompt.
prompt = tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tok(prompt, return_tensors="pt").to(model.device)

# You can add and adjust max_length, temperature, repetition_penalty, top_p, ...
outputs = model.generate(**inputs, max_new_tokens=1024)

# Keep special tokens so the model's turn can be extracted below.
full_response = tok.decode(outputs[0], skip_special_tokens=False)
model_response = full_response.split("<start_of_turn>model\n")[-1].split("<end_of_turn>")[0].strip()
print(model_response)
```
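
If you prefer not to split on template markers, you can instead decode only the newly generated tokens:

```python
# Slice off the prompt tokens and decode just the generated continuation.
gen_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tok.decode(gen_tokens, skip_special_tokens=True))
```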
## Chat Template

```
<start_of_turn>user
{user_message}<end_of_turn>
<start_of_turn>model
{model_response}<end_of_turn>
```
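
For reference, `tok.apply_chat_template` (as used in the Usage snippet) renders messages into exactly this layout; aside from any leading special tokens such as `<bos>`, a quick check looks like this (the user message here is hypothetical):

```python
messages = [{"role": "user", "content": "แฐแแ แฒ!"}]  # hypothetical user message
print(tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))
# <start_of_turn>user
# แฐแแ แฒ!<end_of_turn>
# <start_of_turn>model
```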
## Examples
Below is a short sample showcasing the model's response quality (generated with `max_new_tokens=512`, `temperature=0.6`, `top_p=0.9`).
Q (user): แแแซแต แแญแ แฃแแชแซ แแพแ แฃแจแแต แฅแฎแ?
A (model): แขแตแฎแตแซแฃ แแกแฒแฃ แคแญแตแซแ แถแแแซแแข
Q (user): แแแต แฒแแต แแตแ แแแแซแต แฃแแชแซ แฅแแณแญ แแแฅแฝแฒ แฅแฉ แแแแแแ?
A (model): แแแต แฒแแต แแตแ แฃแฅ แแแ แฃแแณแ แแแแฅแฒ แแฐแแแแ แแแซแต แฃแแชแซ แแแแฐแ แญแแแณ แจแแแฝแฅแ แแญแฃแจแแ แณแชแญ แแแ แฅแฉแข
Q (user): แแ แแตแ แฅแ แแแฃแฎแข แแต แแแ แฃแญแตแแจแญ
A (model): แแต แจแแข แฃแญแตแณแแฃแฅแข
Q (user): แฅแแแฃ แตแแซแญ แฅแแแฉ แแแจแ
A (model): แตแแซแญ แฃแฅ แฐแแ แขแตแฎแตแซ แฅแตแญแจแฅ แญแแ แฅแซแข แญแฅแฐ แจแฐแแฃ แแแ แฎแญแ แซแแฆแต แแ แญแฒ แจแฐแแณแต แตแ แฃแญแฑแแก แฝแจแก แแญแจแ แฅแจแแข แตแแซแญ แแฐแแแแฉ แแแขแฃ แณแชแฝ แแแแแ แแแณแ แฃแแ แแแแ แฅแแแแต แแแฅแจแ แฅแซแข แฅแแฐแแ แฐแแฅแฎแ แตแ-แฅแ แฃแ แแแญแณแตแ แแแแ แต แฆแณ แฅแซแข
Q (user): แฐแแญ แแแแณแญ แฅแแณแญ แแแต แฅแฉ?
A (model): แฐแแญ แแแแณแญ แแแแณแต แฅแแญแแญแแแแฃ แฅแแญแฃแตแแแ แฅแแญ แฃแแแแญแ แฅแแแจ แแแแแต แแแฅ แแแแตแณแ แฃแฐแซแญแ แจแแแฝแตแข แฐแแญ แแแแณแญ แฉแ แแ แแแจ แแฐแแต แจแแแแแฆ แญแแฅแญแข
## Evaluation

| Metric | Split | Value |
|---|---|---|
| Evaluation Loss | validation | 1.025 |
| Perplexity | validation | 2.79 |
| Token Accuracy | validation | 75% |
| Training Loss | final | 0.963 |
Validation corpus: 5% held-out split from 67.5k Q&A pairs
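
The reported perplexity is consistent with the evaluation loss, since perplexity is the exponential of the mean cross-entropy loss:

```python
import math

eval_loss = 1.025
print(round(math.exp(eval_loss), 2))  # 2.79
```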
## Limitations
- Language Mixing: May occasionally, though rarely, mix Amharic or English words into responses
- Domain Scope: Optimized for general Q&A; may not handle highly specialized technical queries optimally
- Factual Accuracy: Generated answers should be verified for factual correctness
- Context Length: Limited to 1024 tokens for both input and output (matching the training maximum sequence length)
- Base Model Limitations: Inherits limitations from the base Gemma-3-4B architecture
- No Multimodal: Text-only model; cannot process images, audio, or other media
- Bias: May reflect societal biases present in training data
## Citation

```bibtex
@misc{gemma-3-4b-tigrinya-qa,
  author       = {Luel},
  title        = {Gemma-3-4B-Tigrinya-QA: A Fine-tuned Question-Answering Model for Tigrinya},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/luel/gemma-3-4b-tigrinya-qa}}
}
```
## Acknowledgements
This model builds upon Google's Gemma 3 4B foundation model and the Stage 1 Tigrinya language adaptation (luel/gemma-3-4b-tigrinya). We acknowledge Google for making their foundation models available to the community, enabling the development of language-specific instruction-tuned models like this one.