IndoBERT Sentiment Classifier for Social Media Posts – Universitas XYZ

This model is IndoBERT fine-tuned for sentiment analysis of Indonesian-language tweets about university services. It classifies each input text as positive, neutral, or negative.

🧠 Model Description

The model is built on indobert-base-p2, a BERT-based transformer pre-trained on over 220 million Indonesian words. It was fine-tuned on 7500 tweets about online academic services, balanced across the three sentiment labels.

  • Label classes: Positive, Neutral, Negative
  • Preprocessing: Case folding, punctuation removal, stopword removal, stemming, tokenization (using IndoBERT tokenizer)
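
The text-cleaning steps listed above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the stopword set is a tiny made-up subset, and stemming is left as an optional callable (`stem`, a hypothetical hook for whichever Indonesian stemmer was used). Tokenization with the IndoBERT tokenizer would follow as a separate step.

```python
import string

# Illustrative stopword subset only; the list actually used is not given in this card.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk"}

# Maps every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text, stopwords=STOPWORDS, stem=None):
    """Case folding, punctuation removal, stopword removal, optional stemming."""
    text = text.lower()                  # case folding
    text = text.translate(_PUNCT_TABLE)  # punctuation removal
    words = [w for w in text.split() if w not in stopwords]
    if stem is not None:                 # hypothetical hook for an Indonesian stemmer
        words = [stem(w) for w in words]
    return " ".join(words)


print(preprocess("Pelayanan akademik di Universitas XYZ sangat CEPAT dan ramah!"))
# pelayanan akademik universitas xyz sangat cepat ramah
```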

✅ Intended Use

  • Analyzing Indonesian tweets about universities
  • Sentiment-driven dashboards for academic service quality
  • NLP applications in education sector
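
For the uses above, inference might look like the following sketch. It assumes the checkpoint loads through the standard `transformers` sequence-classification classes and that `torch` is installed. `MODEL_ID` is a placeholder, since this card does not state the repository id, and the label names are read from the checkpoint's own config rather than assumed here.

```python
import math

# Placeholder: this card does not state the repository id; substitute the real one.
MODEL_ID = "your-username/indobert-sentiment"  # hypothetical


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(texts, model_id=MODEL_ID):
    """Classify a list of texts; needs `transformers` and `torch` installed."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    # Label names come from the checkpoint's config, not from this sketch.
    return [top_label(row, model.config.id2label) for row in logits.tolist()]
```

Calling `classify(["Layanan akademik online sangat membantu"])` would return one `(label, probability)` pair per input. If the model was trained on preprocessed text, the same cleaning should presumably be applied before calling it.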

⚠️ Limitations

  • Domain-specific to university-related sentiment
  • May not generalize well to informal or slang-heavy text
  • Sarcasm or mixed-sentiment detection is not supported
  • Doesn’t handle toxicity or hate speech detection

📊 Dataset

  • Source: Custom crawled tweets via keywords and hashtags (e.g. #telkomuniversity, Universitas XYZ)
  • Size: 7500 samples
  • Split: 70% train, 10% validation, 20% test
  • Labels: 2500 positive, 2500 neutral, 2500 negative
  • Language: Indonesian
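
As a worked check of the figures above, a 70/10/20 split of 7500 balanced samples gives 5250 training, 750 validation, and 1500 test samples. The sketch below assumes the split was stratified by label, which this card does not state. The helper name `stratified_split` is hypothetical, and the seed value is borrowed from the training setup purely for illustration.

```python
import random


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Return (train, val, test) index lists with per-class proportions preserved."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


labels = ["positive"] * 2500 + ["neutral"] * 2500 + ["negative"] * 2500
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 5250 750 1500
```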

βš™οΈ Training Procedure

Hyperparameters

  • Learning rate: 5e-5
  • Batch size: 8
  • Epochs: 3
  • Optimizer: Adam (β1=0.9, β2=0.999, ε=1e-8)
  • Scheduler: Linear
  • Seed: 42
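
To make the linear scheduler concrete, the sketch below computes the number of optimizer steps and the learning rate at a few points. It assumes a 5250-sample training split, a single device, no gradient accumulation, no warmup, and that the last partial batch of each epoch is kept. None of these details are stated in this card. The function `linear_lr` is a hypothetical stand-in written here, not a library call.

```python
import math


def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_size = 5250                 # assumed: 70% of 7500 samples
batch_size, epochs = 8, 3
steps_per_epoch = math.ceil(train_size / batch_size)  # 657
total_steps = steps_per_epoch * epochs                # 1971

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate starts at 5e-5 and reaches zero at step 1971.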

Framework Versions

  • Transformers: 4.24.0
  • PyTorch: 1.13.0
  • Tokenizers: 0.13.2

📈 Evaluation Metrics

Metric      Score
Accuracy    89%
F1 Score    88%
Precision   87%
Recall      89%
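
This card does not say how precision, recall, and F1 were averaged across the three classes. The sketch below assumes macro averaging and shows how the four metrics could be computed from predictions. The function name `macro_report` and the toy labels are made up for illustration and are not the model's outputs.

```python
def macro_report(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only; these are not the model's predictions.
y_true = ["pos", "pos", "neu", "neu", "neg", "neg"]
y_pred = ["pos", "neu", "neu", "neu", "neg", "pos"]
print(macro_report(y_true, y_pred, ["pos", "neu", "neg"]))
```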

💻 Deployment Context

This model was integrated into a Django-based sentiment dashboard application with:

  • A custom Twitter crawler
  • Real-time sentiment classification
  • Wordclouds and sentiment breakdowns by time period
  • Admin tools for filtering, deleting, and exporting data
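
The sentiment breakdown by time period could be computed along these lines. This is a framework-independent sketch in plain Python, not the dashboard's Django code. The `(date, label)` row shape and the function name `sentiment_by_month` are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date


def sentiment_by_month(posts):
    """Count labels per (year, month) from (date, label) pairs."""
    buckets = defaultdict(Counter)
    for posted_on, label in posts:
        buckets[(posted_on.year, posted_on.month)][label] += 1
    return {period: dict(counts) for period, counts in sorted(buckets.items())}


# Made-up rows for illustration; not data from the dashboard.
posts = [
    (date(2023, 1, 5), "positive"),
    (date(2023, 1, 9), "negative"),
    (date(2023, 1, 20), "positive"),
    (date(2023, 2, 2), "neutral"),
]
print(sentiment_by_month(posts))
```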

📄 Citation

If you use this model or its components, please cite:

@article{wijaya2023indobert,
  author    = {Kurniadi Ahmad Wijaya and Ade Romadhony and Donni Richasdy},
  title     = {Implementasi Model IndoBERT pada Dashboard Sentimen Media Sosial (Studi Kasus Universitas XYZ)},
  journal   = {eProceedings of Engineering},
  volume    = {10},
  number    = {4},
  year      = {2023},
  month     = {September},
  url       = {https://openlibrary.telkomuniversity.ac.id},
}