Ganda Gemma 1B

A fine-tuned Gemma 3 1B instruction model specialized for English-to-Luganda translation and Luganda conversational AI. The model accepts input in both English and Luganda but outputs responses exclusively in Luganda.

📊 Translation Performance

Model Comparison

| Model | Parameters | BLEU | chrF++ | Efficiency* |
|------------------|-------|------|--------|------|
| Gemma 3 4B | 4B | 1.1 | 20.05 | 0.28 |
| Gemma 3 27B | 27B | 3.65 | 31.37 | 0.14 |
| GPT-5 Mini | N/A | 5.14 | 36.55 | N/A |
| Ganda Gemma 1B | 1B | 6.99 | 40.32 | 6.99 |
| Gemini 2.0 Flash | Large | 7.94 | 43.38 | N/A |

*Efficiency = BLEU score ÷ parameters (in billions)

Key Performance Insights

🎯 Efficiency Leader: Achieves 6.99 BLEU per billion parameters, the highest efficiency ratio in this comparison
🚀 Size Advantage: Outperforms Gemma 3 4B (4x larger) by 535% on BLEU score
💎 Competitive Quality: Surpasses GPT-5 Mini on both BLEU (6.99 vs. 5.14) and chrF++ (40.32 vs. 36.55), with a known 1B parameter count
⚡ Practical Deployment: Runs efficiently on consumer hardware while maintaining translation quality

Evaluation Details

  • Dataset: FLORES-200 Englishβ†’Luganda (1,012 translation pairs)
  • Metrics: BLEU (bilingual evaluation understudy) and chrF++ (character F-score)
  • Evaluation: Zero-shot translation performance
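
For reference, both metrics can be computed with the sacrebleu library. A minimal sketch, assuming the model's Luganda outputs and the FLORES-200 references have already been collected as parallel lists of strings (the placeholder strings below are illustrative):

import sacrebleu

# Parallel lists: one hypothesis per source sentence, and one reference
# set (a single reference per segment, hence one inner list).
hypotheses = ["<model translation 1>", "<model translation 2>"]
references = [["<reference translation 1>", "<reference translation 2>"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)  # word_order=2 gives chrF++
print(f"BLEU: {bleu.score:.2f}  chrF++: {chrf.score:.2f}")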

🚀 Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("CraneAILabs/ganda-gemma-1b")
tokenizer = AutoTokenizer.from_pretrained("CraneAILabs/ganda-gemma-1b")

# Translate to Luganda
prompt = "Translate to Luganda: Hello, how are you today?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.3, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

🌍 Language Capabilities

  • Input Languages: English + Luganda
  • Output Language: Luganda only
  • Primary Focus: English-to-Luganda translation and Luganda conversation

🎯 Capabilities

  • Translation: English-to-Luganda translation
  • Conversational AI: Natural dialogue in Luganda
  • Summarization: Text summarization in Luganda
  • Writing: Creative and informational writing in Luganda
  • Question Answering: General knowledge responses in Luganda
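
This card only documents the prompt format for translation ("Translate to Luganda: ..."); the phrasing for the other tasks is an assumption. A sketch in the same plain-prompt style, reusing the model and tokenizer from the Quick Start:

# Hypothetical task prompts following the translation prompt style;
# the model answers in Luganda regardless of the input language.
for prompt in [
    "Summarize in Luganda: <longer English or Luganda text>",  # summarization
    "What is the capital city of Uganda?",                     # question answering
]:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.3, do_sample=True)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))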

💻 Usage Examples

Basic Translation

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("CraneAILabs/ganda-gemma-1b")
tokenizer = AutoTokenizer.from_pretrained("CraneAILabs/ganda-gemma-1b")

# English to Luganda translation
prompt = "Translate to Luganda: Welcome to our school"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.3,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Luganda Conversation

# Direct Luganda conversation (reuses the model and tokenizer loaded above)
prompt = "Oli otya! Osobola okuntuyamba leero?"  # "How are you! Can you help me today?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.3, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Using the Pipeline

import torch
from transformers import pipeline

# Create a text generation pipeline
generator = pipeline(
    "text-generation",
    model="CraneAILabs/ganda-gemma-1b",
    tokenizer="CraneAILabs/ganda-gemma-1b",
    device=0 if torch.cuda.is_available() else -1
)

# Generate Luganda text
result = generator(
    "Translate to Luganda: Welcome to our school",
    max_new_tokens=100,
    temperature=0.3,
    do_sample=True
)
print(result[0]['generated_text'])
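
Note: by default the text-generation pipeline returns the prompt together with the completion; pass return_full_text=False in the call above to get only the generated Luganda text.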

Ollama Usage

# Run with different quantizations
ollama run crane-ai-labs/ganda-gemma-1b:q4-k-m    # Recommended balance
ollama run crane-ai-labs/ganda-gemma-1b:q8-0      # Higher quality
ollama run crane-ai-labs/ganda-gemma-1b:q4-k-s    # Smaller size
ollama run crane-ai-labs/ganda-gemma-1b:f16       # Original quality
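
Once a tag is pulled, Ollama also serves a local REST API (port 11434 by default). A minimal sketch from Python, assuming the q4-k-m tag from the commands above:

import requests

# Query the local Ollama server's generate endpoint
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "crane-ai-labs/ganda-gemma-1b:q4-k-m",
        "prompt": "Translate to Luganda: Good morning!",
        "stream": False,  # return a single JSON object instead of a token stream
    },
)
print(response.json()["response"])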

🚀 GGUF Advantages

  • Fast CPU Inference: Optimized for CPU-only environments
  • Memory Efficient: Multiple quantization levels available
  • Easy Integration: Works with llama.cpp, Ollama, and other GGUF-compatible tools (see the sketch after this list)
  • Cross-Platform: Supports Windows, macOS, and Linux
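
For example, with the llama-cpp-python bindings (the GGUF filename below is illustrative; substitute the actual file you downloaded from this repo):

from llama_cpp import Llama

# Load a quantized GGUF build on CPU
llm = Llama(model_path="ganda-gemma-1b.Q4_K_M.gguf", n_ctx=2048)

output = llm(
    "Translate to Luganda: Hello, how are you today?",
    max_tokens=100,
    temperature=0.3,
)
print(output["choices"][0]["text"])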

📦 Available Quantizations

| Quantization | File Size | Memory Usage | Quality | Use Case |
|--------|--------|--------|--------|-----------------|
| F16 | ~2.0GB | ~3.5GB | Best | GPU inference |
| Q8_0 | ~1.1GB | ~2.0GB | High | High-end CPU |
| Q4_K_M | ~600MB | ~1.2GB | Good | Recommended |
| Q4_K_S | ~550MB | ~1.1GB | Decent | Low memory |
| Q4_0 | ~550MB | ~1.0GB | Basic | Very low memory |

📄 License

This model is released under the Gemma Terms of Use. Please review the terms before use.


Built with ❤️ by Crane AI Labs

Ganda Gemma - Your helpful Luganda AI companion, optimized for CPU!
