license: apache-2.0
tags:
  - text-classification
  - topic-analysis
  - vietnamese
  - vsfc
  - phobert
language:
  - vi
datasets:
  - uit-vsfc
model-index:
  - name: VSFC Topic Classifier (PhoBERT)
    results:
      - task:
          type: text-classification
          name: Topic Classification
        dataset:
          name: UIT-VSFC
          type: uit-vsfc
        metrics:
          - type: accuracy
            value: 89.1346
          - type: f1
            value: 89.0436

VSFC Topic Classifier using PhoBERT

This model is vinai/phobert-base fine-tuned on UIT-VSFC (the Vietnamese Students' Feedback Corpus) to classify the topic of a student feedback sentence.

🧠 Model Details

  • Model type: Transformer (BERT-based)
  • Base model: vinai/phobert-base
  • Fine-tuned task: Sentence-level topic classification
  • Target labels: Lecturer, Training program, Facility, Others
  • Tokenizer: BPE (PhoBERT's fastBPE-based tokenizer)

📚 Training Data

  • Dataset: UIT-VSFC
  • Language: Vietnamese
  • License: Academic use
  • Student feedback is a valuable resource for interdisciplinary research that combines sentiment analysis and education.

🚀 How to Use

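import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("tmt3103/VSFC-topic-classify-phoBERT")
model = AutoModelForSequenceClassification.from_pretrained("tmt3103/VSFC-topic-classify-phoBERT")

# Tokenize one feedback sentence ("The lecturer is friendly and lovely")
inputs = tokenizer("Giảng viên thân thiện dễ thương", return_tensors="pt")
with torch.no_grad():  # inference only, so gradients are not needed
    outputs = model(**inputs)
# Index of the highest-scoring topic class
predicted_class = outputs.logits.argmax(dim=-1).item()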
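
The snippet above yields only a class index. The following is a minimal sketch of how that index and the raw logits might be turned into a readable label with a confidence score. The label order in `LABELS` is an assumption (it follows the order in which the target labels are listed in this card) and should be checked against `model.config.id2label` for the actual checkpoint. The helper names `softmax` and `describe_prediction` and the example logits are hypothetical and used only for illustration.

```python
import math

# ASSUMPTION: label order follows the "Target labels" list in this card.
# Confirm against model.config.id2label before relying on it.
LABELS = ["Lecturer", "Training program", "Facility", "Others"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for illustration only; with the real model one could pass
# outputs.logits[0].detach().tolist() instead.
example_logits = [3.2, 0.1, -1.0, 0.4]
label, prob = describe_prediction(example_logits)
print(f"{label}: {prob:.3f}")
```

Reporting the probability alongside the label makes it easier to spot low-confidence predictions, for instance feedback that falls between Training program and Others.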