---
license: apache-2.0
tags:
  - text-classification
  - topic-analysis
  - vietnamese
  - vsfc
  - phobert
language:
  - vi
datasets:
  - uit-vsfc
model-index:
  - name: VSFC Topic Classifier (PhoBERT)
    results:
      - task:
          type: text-classification
          name: Topic Classification
        dataset:
          name: UIT-VSFC
          type: uit-vsfc
        metrics:
          - type: accuracy
            value: 89.1346
          - type: f1
            value: 89.0436
---

# VSFC Topic Classifier using PhoBERT

This model is fine-tuned from [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base) on the UIT-VSFC (Vietnamese Students' Feedback Corpus) dataset to classify the topic of Vietnamese student feedback sentences.

## 🧠 Model Details

- **Model type**: Transformer (BERT-based)
- **Base model**: [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base)
- **Fine-tuned task**: Sentence-level topic classification
- **Target labels**: Lecturer, Training program, Facility, Others
- **Tokenizer**: Byte-Pair Encoding (BPE), via the PhoBERT tokenizer
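
As a rough illustration of how the four target labels relate to the classifier's output, the sketch below turns a vector of four logits into per-label probabilities and picks the most likely topic. The index order in `ID2LABEL`, the helper names `softmax` and `decode_topic`, and the sample logits are assumptions made for this example; the authoritative mapping is whatever the checkpoint stores in `model.config.id2label`.

```python
import math

# Assumption: this index-to-label order is illustrative only. The real
# mapping is stored in the checkpoint's config (model.config.id2label).
ID2LABEL = {0: "Lecturer", 1: "Training program", 2: "Facility", 3: "Others"}


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_topic(logits, id2label=ID2LABEL):
    """Return the most likely label and a label -> probability mapping."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], {id2label[i]: p for i, p in enumerate(probs)}


# Hypothetical logits for a single sentence (not real model output).
label, scores = decode_topic([3.2, 0.1, -1.0, 0.4])
print(label)
print(scores)
```

In practice the logits would come from `outputs.logits[0].tolist()` for a single input sentence, and the label names should be read from the model config rather than hard-coded.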

## 📚 Training Data

- **Dataset**: [UIT-VSFC](https://drive.google.com/drive/folders/1xclbjHHK58zk2X6iqbvMPS2rcy9y9E0X)
- **Language**: Vietnamese
- **License**: Academic use
- Student feedback is a vital resource for interdisciplinary research that combines sentiment analysis and education.

## 🚀 How to Use

```python
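import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
model_id = "tmt3103/VSFC-topic-classify-phoBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example feedback: "The lecturer is friendly and lovely"
inputs = tokenizer("Giảng viên thân thiện dễ thương", return_tensors="pt")
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()

# Label names come from the model config and may be generic (e.g. LABEL_0)
print(predicted_class, model.config.id2label[predicted_class])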