Model Card for GPT-2 Tigrinya Medium

Model Summary

This is a GPT-2 model trained from scratch on 20.6 million tokens of Tigrinya text, drawn primarily from news sources.

Model Description

  • Model type: GPT-2
  • Language: Tigrinya (α‰΅αŒαˆ­αŠ›)
  • Finetuned from model: None (trained from scratch, not initialized from a pretrained checkpoint)

Model Architecture

  • Parameters: 51.9M
  • Context Window: 128 tokens
  • Vocabulary Size: 52,000

Training Details

  • Training regime: fp16 mixed precision
  • Number of Epochs: 12
  • Batch Size: 6 (with gradient accumulation steps of 8)
  • Learning Rate: 5e-4
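
The training script for this model is not published, but the reported hyperparameters can be expressed as a hypothetical Hugging Face `TrainingArguments` configuration (an illustrative sketch, not the authors' code; `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters;
# the actual training setup may have differed in other respects.
args = TrainingArguments(
    output_dir="gpt2-tigrinya-medium",   # placeholder path
    num_train_epochs=12,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=8,       # effective batch size: 6 * 8 = 48
    learning_rate=5e-4,
    fp16=True,                           # fp16 mixed-precision training
)
```

With gradient accumulation of 8, optimizer updates see an effective batch of 48 sequences even though only 6 fit on the device at once.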

Evaluation

  • Training Perplexity: 28.6
  • Training Loss: 3.12
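
For reference, perplexity is the exponential of the mean per-token cross-entropy loss. A minimal sketch of the relation (note the two figures above need not match exactly, as they may have been measured at different points in training):

```python
import math

def perplexity(mean_nll: float) -> float:
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)

# A loss of 0.0 corresponds to a perplexity of 1.0 (perfect prediction).
```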

Usage

```python
from transformers import pipeline

# Load the model
generator = pipeline('text-generation', model='luel/gpt2-tigrinya-medium')

prompt = "ክልል ትግራይ"  # "Tigray Region"

# Generate text (max_length includes the prompt tokens)
text = generator(prompt, max_length=100)[0]['generated_text']
print(text)
```

Limitations

  • Limited context window of 128 tokens.
  • Best suited for medium-length Tigrinya text generation.
  • Outputs should be reviewed for accuracy.
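
Because of the 128-token context window, longer inputs must be split before they can be processed. One simple approach is non-overlapping windows over the token ids (a sketch; `chunk_tokens` is a hypothetical helper, not part of this model's API):

```python
def chunk_tokens(ids, window=128):
    """Split a token-id sequence into non-overlapping windows
    that each fit within the model's context size."""
    return [ids[i:i + window] for i in range(0, len(ids), window)]

# 300 tokens -> windows of 128, 128, and 44 tokens.
chunks = chunk_tokens(list(range(300)))
```

Overlapping (strided) windows preserve more cross-chunk context at the cost of redundant computation.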