
Gemma-3-12B Telugu Fine-tuned Model

This is a fine-tuned version of the Gemma-3-12B model for Telugu. The model was trained on a custom Telugu question-answering dataset to strengthen its Telugu understanding and generation.

Model Details

  • Base Model: google/gemma-3-12b-pt
  • Fine-tuning Method: Full-parameter fine-tuning (all model weights updated; no adapters)
  • Training Dataset: Custom Telugu QA pairs dataset
  • Validation Split: 10%
  • Data Coverage: all available samples were used, with the 10% validation split held out from training
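
For context, a 10% validation split like the one described above can be reproduced with the datasets library. The file name below is a placeholder; the actual Telugu QA dataset used for training is custom and has not been published.

from datasets import load_dataset

# Placeholder data file - the real training set is a private custom Telugu QA collection.
dataset = load_dataset("json", data_files="telugu_qa_pairs.json", split="train")

# Hold out 10% of the samples for validation, matching the split reported above.
splits = dataset.train_test_split(test_size=0.10, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
print(len(train_ds), len(eval_ds))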

Training Parameters

  • Per Device Batch Size: 2
  • Gradient Accumulation Steps: 32
  • Learning Rate: 2e-5
  • Number of Epochs: 3.0
  • Learning Rate Scheduler: cosine
  • Warmup Ratio: 0.03
  • Max Sequence Length: 4096
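
As a rough sketch, these hyperparameters map onto Hugging Face TrainingArguments as shown below. The output directory and DeepSpeed config filename are assumptions, and the max sequence length of 4096 is applied at tokenization time (or via the SFT trainer) rather than here. The effective batch size is 2 × 32 = 64 sequences per GPU before multiplying by the number of devices.

from transformers import TrainingArguments

# Sketch of the reported training setup; output_dir and the DeepSpeed config path are assumed names.
training_args = TrainingArguments(
    output_dir="gemma-3-12b-telugu-sft",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,   # effective batch of 64 sequences per GPU
    learning_rate=2e-5,
    num_train_epochs=3.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    bf16=True,
    deepspeed="ds_config_zero2.json",  # ZeRO Stage 2, see Hardware Configuration
)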

Hardware Configuration

  • Precision: BF16
  • DeepSpeed: Enabled (Stage 2)
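
The exact DeepSpeed configuration used for training has not been published; a minimal ZeRO Stage 2 config consistent with the BF16 setting above might look like the following, written here as a Python dict dumped to the JSON file referenced in the sketch above.

import json

# Assumed minimal ZeRO Stage 2 config; "auto" values defer to the Trainer's own arguments.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

with open("ds_config_zero2.json", "w") as f:
    json.dump(ds_config, f, indent=2)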

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer (bfloat16 matches the training precision and halves memory vs. float32)
model_name = "bharathkumar1922001/gemma-3-12b-pt-telugu-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example prompt (Telugu for "Hello, how are you?")
prompt = "నమస్కారం, మీరు ఎలా ఉన్నారు?"

# Generate a response
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
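
If the uploaded tokenizer ships a chat template (an assumption for this checkpoint rather than a documented fact), prompts can also be formatted with apply_chat_template, which reproduces an instruction-style input format:

# Format the prompt with the tokenizer's chat template, if one is provided.
messages = [{"role": "user", "content": "తెలంగాణ రాజధాని ఏది?"}]  # "What is the capital of Telangana?"
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))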

Intended Use

This model is designed for Telugu language tasks including:

  • Question answering
  • Conversation
  • Text generation
  • General Telugu language understanding and generation
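
For the open-ended tasks in this list (conversation and free-form text generation), sampling usually yields more natural Telugu output than greedy decoding. The values below are illustrative defaults rather than settings published with the model; inputs refers to a tokenized prompt prepared as in the Usage section.

# Illustrative sampling settings for conversational / open-ended generation.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,       # assumed value
    top_p=0.9,             # assumed value
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))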

Telugu Language Support

This model has been specifically fine-tuned to enhance its capabilities in Telugu, one of the Dravidian languages spoken primarily in the Indian states of Andhra Pradesh and Telangana.
