# Gemma-3-12B Telugu Fine-tuned Model
This is a fine-tuned version of the Gemma-3-12B model for Telugu. It was trained on a custom Telugu question-answering dataset to improve its Telugu understanding and generation.
## Model Details
- Base Model: google/gemma-3-12b-pt
- Fine-tuning Method: full-parameter fine-tuning (FULL)
- Training Dataset: Custom Telugu QA pairs dataset
- Validation Split: 10%
- Training Data: all available samples were used, with 10% held out for validation (see the split sketch below)
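The dataset itself is custom and unpublished, so as a minimal sketch of how such a split can be produced with the Hugging Face `datasets` library (the file name `telugu_qa.jsonl` is a placeholder):

```python
from datasets import load_dataset

# Placeholder file name; the actual Telugu QA dataset is custom and unpublished.
dataset = load_dataset("json", data_files="telugu_qa.jsonl", split="train")

# Hold out 10% of the samples for validation, matching the split above.
splits = dataset.train_test_split(test_size=0.10, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```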
## Training Parameters
- Per Device Batch Size: 2
- Gradient Accumulation Steps: 32
- Learning Rate: 2e-5
- Number of Epochs: 3.0
- Learning Rate Scheduler: cosine
- Warmup Ratio: 0.03
- Max Sequence Length: 4096
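The card does not say which training stack was used. Assuming the Hugging Face `Trainer`, the hyperparameters above would map onto `TrainingArguments` roughly as follows; `output_dir` and the DeepSpeed config path are placeholders, and the 4096-token max sequence length would be enforced at tokenization time rather than here:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the run configuration from the values above.
training_args = TrainingArguments(
    output_dir="gemma-3-12b-pt-telugu-sft",  # placeholder
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,  # effective batch size of 64 per device
    learning_rate=2e-5,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,                       # see Hardware Configuration below
    deepspeed="ds_zero2.json",       # hypothetical ZeRO Stage 2 config path
)
```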
## Hardware Configuration
- Precision: BF16
- DeepSpeed: Enabled (Stage 2)
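The exact DeepSpeed configuration is not published. A minimal ZeRO Stage 2 config consistent with the BF16 setting above might look like the sketch below; the `"auto"` values are filled in from `TrainingArguments` by the Hugging Face integration, and the dict can be passed directly as `TrainingArguments(..., deepspeed=ds_config)`:

```python
# Hypothetical ZeRO Stage 2 configuration matching the settings above.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,          # overlap gradient reduction with backprop
        "contiguous_gradients": True,  # reduce memory fragmentation
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}
```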
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "bharathkumar1922001/gemma-3-12b-pt-telugu-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # the model was trained in BF16
    device_map="auto",           # spread the 12B weights across available GPUs
)

# Example prompt (in Telugu): "Hello, how are you?"
prompt = "నమస్కారం, మీరు ఎలా ఉన్నారు?"

# Generate a response (max_new_tokens caps only the newly generated text)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Intended Use
This model is designed for Telugu language tasks including:
- Question answering (see the example after this list)
- Conversation
- Text generation
- General Telugu language understanding and generation
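As an illustration of QA-style prompting, the sketch below reuses the `model` and `tokenizer` loaded in the Usage section; the question and the sampling settings are illustrative, not values the model was tuned with:

```python
# Example Telugu question: "What is the capital of Andhra Pradesh?"
question = "ఆంధ్రప్రదేశ్ రాజధాని ఏమిటి?"
inputs = tokenizer(question, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```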
## Telugu Language Support
This model has been specifically fine-tuned to enhance its capabilities in Telugu, one of the Dravidian languages spoken primarily in the Indian states of Andhra Pradesh and Telangana.