# Ganda Gemma 1B
A fine-tuned Gemma 3 1B instruction model specialized for English-to-Luganda translation and Luganda conversational AI. The model accepts input in both English and Luganda but outputs responses exclusively in Luganda.
## Translation Performance

### Model Comparison

| Model | Parameters | BLEU | chrF++ | Efficiency* |
|---|---|---|---|---|
| Gemma 3 4B | 4B | 1.1 | 20.05 | 0.28 |
| Gemma 3 27B | 27B | 3.65 | 31.37 | 0.14 |
| GPT-5 Mini | N/A | 5.14 | 36.55 | N/A |
| Ganda Gemma 1B | 1B | 6.99 | 40.32 | 6.99 |
| Gemini 2.0 Flash | Large | 7.94 | 43.38 | N/A |

\*Efficiency = BLEU score ÷ parameters (in billions)
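The efficiency column follows directly from the footnote formula; a quick sanity check in Python reproduces the table's values:

```python
# Reproduce the Efficiency column: BLEU score divided by parameters in billions.
scores = {
    "Gemma 3 4B": (1.1, 4.0),
    "Gemma 3 27B": (3.65, 27.0),
    "Ganda Gemma 1B": (6.99, 1.0),
}
for name, (bleu, params_b) in scores.items():
    print(f"{name}: {bleu / params_b:.2f} BLEU per billion parameters")
# -> 0.28, 0.14, and 6.99 respectively, matching the table
```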
### Key Performance Insights

- **Efficiency leader:** Achieves 6.99 BLEU per billion parameters, the highest efficiency ratio in the comparison
- **Size advantage:** Outperforms Gemma 3 4B (4x larger) by 535% on BLEU score
- **Competitive quality:** Exceeds GPT-5 Mini on both BLEU (6.99 vs. 5.14) and chrF++ (40.32 vs. 36.55), with a known 1B parameter count
- **Practical deployment:** Runs efficiently on consumer hardware while maintaining quality
### Evaluation Details

- **Dataset:** FLORES-200 English→Luganda (1,012 translation pairs)
- **Metrics:** BLEU (bilingual evaluation understudy) and chrF++ (character n-gram F-score)
- **Protocol:** Zero-shot translation performance
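The card does not ship the evaluation script; as a rough illustration, BLEU and chrF++ are conventionally computed with the sacrebleu library (the strings below are placeholders, not FLORES-200 data):

```python
from sacrebleu.metrics import BLEU, CHRF

# Placeholder strings; a real run would score the model's 1,012 FLORES-200
# English->Luganda outputs against the reference translations.
hypotheses = ["model output sentence one", "model output sentence two"]
references = [["reference translation one", "reference translation two"]]

bleu = BLEU()
chrf = CHRF(word_order=2)  # word_order=2 selects the chrF++ variant
print("BLEU:  ", bleu.corpus_score(hypotheses, references).score)
print("chrF++:", chrf.corpus_score(hypotheses, references).score)
```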
## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("CraneAILabs/ganda-gemma-1b")
tokenizer = AutoTokenizer.from_pretrained("CraneAILabs/ganda-gemma-1b")

# Translate to Luganda
prompt = "Translate to Luganda: Hello, how are you today?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.3)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
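The snippet above uses a plain text prompt. If the fine-tune preserves Gemma's chat template (an assumption; this card does not say either way), transformers can format the turn structure via `apply_chat_template`:

```python
# Sketch, assuming the fine-tune keeps Gemma's chat template.
messages = [{"role": "user", "content": "Translate to Luganda: Good morning!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```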
## Language Capabilities
- Input Languages: English + Luganda
- Output Language: Luganda only
- Primary Focus: English-to-Luganda translation and Luganda conversation
## Capabilities
- Translation: English-to-Luganda translation
- Conversational AI: Natural dialogue in Luganda
- Summarization: Text summarization in Luganda
- Writing: Creative and informational writing in Luganda
- Question Answering: General knowledge responses in Luganda
## Usage Examples

### Basic Translation
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("CraneAILabs/ganda-gemma-1b")
tokenizer = AutoTokenizer.from_pretrained("CraneAILabs/ganda-gemma-1b")

# English to Luganda translation
prompt = "Translate to Luganda: Welcome to our school"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.3,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Luganda Conversation

```python
# Direct Luganda conversation
prompt = "Oli otya! Osobola okuntuyamba leero?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.3)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Using the Pipeline

```python
import torch
from transformers import pipeline

# Create a text-generation pipeline
generator = pipeline(
    "text-generation",
    model="CraneAILabs/ganda-gemma-1b",
    tokenizer="CraneAILabs/ganda-gemma-1b",
    device=0 if torch.cuda.is_available() else -1,
)

# Generate Luganda text
result = generator(
    "Translate to Luganda: Welcome to our school",
    max_new_tokens=100,
    temperature=0.3,
    do_sample=True,
)
print(result[0]["generated_text"])
```
### Ollama Usage

```bash
# Run with different quantizations
ollama run crane-ai-labs/ganda-gemma-1b:q4-k-m   # Recommended balance
ollama run crane-ai-labs/ganda-gemma-1b:q8-0     # Higher quality
ollama run crane-ai-labs/ganda-gemma-1b:q4-k-s   # Smaller size
ollama run crane-ai-labs/ganda-gemma-1b:f16      # Original quality
```
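Ollama also serves a local HTTP API on port 11434; a minimal sketch using Python's requests library (standard Ollama `/api/generate` endpoint, same model tags as above):

```python
import requests

# Query a locally running Ollama server (default port 11434).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "crane-ai-labs/ganda-gemma-1b:q4-k-m",
        "prompt": "Translate to Luganda: Welcome to our school",
        "stream": False,
        "options": {"temperature": 0.3},
    },
)
print(resp.json()["response"])
```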
## GGUF Advantages
- Fast CPU Inference: Optimized for CPU-only environments
- Memory Efficient: Multiple quantization levels available
- Easy Integration: Works with llama.cpp, Ollama, and other GGUF-compatible tools (see the sketch after this list)
- Cross-Platform: Supports Windows, macOS, and Linux
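As one example of llama.cpp integration, a minimal sketch using the llama-cpp-python bindings; the GGUF filename here is an assumption, so substitute whichever quantization you downloaded:

```python
from llama_cpp import Llama

# Load a quantized GGUF (filename is an assumption; use your downloaded file).
llm = Llama(model_path="ganda-gemma-1b.Q4_K_M.gguf", n_ctx=2048)

out = llm(
    "Translate to Luganda: Welcome to our school",
    max_tokens=100,
    temperature=0.3,
)
print(out["choices"][0]["text"])
```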
## Available Quantizations

| Quantization | File Size | Memory Usage | Quality | Use Case |
|---|---|---|---|---|
| F16 | ~2.0 GB | ~3.5 GB | Best | GPU inference |
| Q8_0 | ~1.1 GB | ~2.0 GB | High | High-end CPU |
| Q4_K_M | ~600 MB | ~1.2 GB | Good | Recommended |
| Q4_K_S | ~550 MB | ~1.1 GB | Decent | Low memory |
| Q4_0 | ~550 MB | ~1.0 GB | Basic | Very low memory |
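To fetch a single quantization rather than cloning the whole repository, huggingface_hub's `hf_hub_download` works; note that both the repo id and the filename below are assumptions, so check the repository's file listing for the exact names:

```python
from huggingface_hub import hf_hub_download

# repo_id and filename are assumptions; verify against the repo's file list.
path = hf_hub_download(
    repo_id="CraneAILabs/ganda-gemma-1b-GGUF",
    filename="ganda-gemma-1b.Q4_K_M.gguf",
)
print(path)  # local cache path to the downloaded GGUF
```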
## Related Models
- Original Model: CraneAILabs/ganda-gemma-1b - Full precision HuggingFace model
- Mobile (LiteRT): CraneAILabs/ganda-gemma-1b-litert - Optimized for Android/iOS
## License
This model is released under the Gemma Terms of Use. Please review the terms before use.
Built with ❤️ by Crane AI Labs
Ganda Gemma - Your helpful Luganda AI companion, optimized for CPU!