---
license: apache-2.0
tags:
- text-classification
- topic-analysis
- vietnamese
- vsfc
- phobert
language:
- vi
datasets:
- uit-vsfc
model-index:
- name: VSFC Topic Classifier (PhoBERT)
  results:
  - task:
      type: text-classification
      name: Topic Classification
    dataset:
      name: UIT-VSFC
      type: uit-vsfc
    metrics:
    - type: accuracy
      value: 89.1346
    - type: f1
      value: 89.0436
---

# VSFC Topic Classifier using PhoBERT

This model is fine-tuned from [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base) on UIT-VSFC (the Vietnamese Students' Feedback Corpus) to classify the topic of student feedback.

## 🧠 Model Details

- **Model type**: Transformer (BERT-based)
- **Base model**: [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base)
- **Fine-tuned task**: Sentence-level topic classification
- **Target labels**: Lecturer, Training program, Facility, Others
- **Tokenizer**: Byte-Pair Encoding (BPE), as used by PhoBERT

## 📚 Training Data

- **Dataset**: [UIT-VSFC](https://drive.google.com/drive/folders/1xclbjHHK58zk2X6iqbvMPS2rcy9y9E0X)
- **Language**: Vietnamese
- **License**: Academic use
- Student feedback is a vital resource for interdisciplinary research that combines sentiment analysis and education.

## 🚀 How to Use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("tmt3103/VSFC-topic-classify-phoBERT")
model = AutoModelForSequenceClassification.from_pretrained("tmt3103/VSFC-topic-classify-phoBERT")

# Tokenize one feedback sentence and run a forward pass
inputs = tokenizer("Giảng viên thân thiện dễ thương", return_tensors="pt")
outputs = model(**inputs)

# Index of the highest-scoring topic class
predicted_class = outputs.logits.argmax(dim=-1).item()
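
# --- Optional sketch: turn the logits into a readable label and probability ---
# Assumption: the checkpoint's config carries an id2label mapping for the four
# topics. If it was not set during fine-tuning, transformers falls back to
# generic names such as "LABEL_0". `top_label` is a hypothetical helper written
# for this example; it is not part of transformers.
import math


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class.

    logits: list of floats for one sentence; id2label: dict mapping int -> str.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # numerically stable softmax
    total = sum(exps)
    best = max(range(len(logits)), key=lambda i: exps[i])
    return id2label[best], exps[best] / total


# With the objects created above, this could be called as:
#   label, prob = top_label(outputs.logits[0].tolist(), model.config.id2label)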