DistilBERT Sentiment Analysis Model

Overview

This repository contains a fine-tuned DistilBERT model trained for sentiment analysis on TripAdvisor reviews. The model predicts sentiment scores on a scale of 1 to 5 based on review text.

  • Base Model: distilbert-base-uncased
  • Training Dataset: nhull/tripadvisor-split-dataset-v2
  • Use Case: Sentiment classification for customer reviews to derive insights into customer satisfaction.
  • Output: Sentiment labels (1-5).

Model Details

  • Learning Rate: 3e-05
  • Batch Size: 64
  • Epochs: 10 (with early stopping)
  • Patience: 5 (epochs without improvement)
  • Tokenizer: distilbert-base-uncased
  • Framework: PyTorch + Hugging Face Transformers
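The interaction between the epoch limit and the patience setting can be sketched as below. This is a minimal illustration, not the training script used for this model: the card does not say which validation quantity is monitored, so the sketch assumes validation loss (lower is better), and the function name `stopping_epoch` is hypothetical.

```python
from typing import Sequence


def stopping_epoch(val_losses: Sequence[float],
                   max_epochs: int = 10,
                   patience: int = 5) -> int:
    """Return the number of epochs run before training stops.

    Assumption: training stops once `patience` consecutive epochs pass
    without the validation loss improving on its best value so far,
    or when `max_epochs` is reached, whichever comes first.
    """
    best = float("inf")
    stale = 0        # consecutive epochs without improvement
    epochs_run = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        epochs_run = epoch
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epochs_run
```

Under this definition, a limit of 10 epochs with a patience of 5 means early stopping cannot trigger before the end of epoch 6.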

Intended Use

This model classifies hotel reviews by sentiment, assigning each review a star rating from 1 to 5.
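A minimal inference sketch with Hugging Face Transformers might look like the following. It rests on several assumptions: the model ID is the repository name shown on the model page, the tokenizer is loaded from `distilbert-base-uncased` as listed under Model Details, and class index 0 to 4 is assumed to correspond to star ratings 1 to 5 (check `model.config.id2label` before relying on this). The helper names `index_to_stars` and `predict_stars` are hypothetical.

```python
from typing import List, Sequence

MODEL_ID = "nhull/distilbert-sentiment-model"  # repository name from the model page
TOKENIZER_ID = "distilbert-base-uncased"       # tokenizer listed under Model Details


def index_to_stars(class_index: int) -> int:
    """Map a class index to a star rating.

    Assumption: index 0..4 corresponds to 1..5 stars.
    """
    if not 0 <= class_index <= 4:
        raise ValueError(f"unexpected class index: {class_index}")
    return class_index + 1


def predict_stars(texts: Sequence[str]) -> List[int]:
    """Predict a 1-5 star rating for each review text."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    batch = tokenizer(list(texts), padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return [index_to_stars(i) for i in logits.argmax(dim=-1).tolist()]
```

Calling `predict_stars(["The room was spotless and the staff were lovely."])` downloads the weights on first use and returns a list with one integer rating.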


Dataset

The dataset used for training, validation, and testing is nhull/tripadvisor-split-dataset-v2. It consists of:

  • Training Set: 30,400 reviews
  • Validation Set: 1,600 reviews
  • Test Set: 8,000 reviews

All splits are balanced across five sentiment labels.
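The split sizes above imply the following proportions and per-label counts. This is a small arithmetic sketch that assumes "balanced" means an equal number of reviews per label within each split; the helper name `describe_splits` is hypothetical.

```python
SPLITS = {"train": 30_400, "validation": 1_600, "test": 8_000}
NUM_LABELS = 5


def describe_splits(splits, num_labels):
    """Return each split's share of the data and its per-label count.

    Assumption: every split holds the same number of reviews per label.
    """
    total = sum(splits.values())
    return {
        name: {"share": size / total, "per_label": size // num_labels}
        for name, size in splits.items()
    }


SUMMARY = describe_splits(SPLITS, NUM_LABELS)
for name, info in SUMMARY.items():
    print(f"{name}: {info['share']:.0%} of data, {info['per_label']} per label")
```

The splits total 40,000 reviews (76% training, 4% validation, 20% test), and the 1,600 reviews per label in the test split match the Support column of the classification report.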


Test Performance

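The mean absolute error on the test set is 0.3934 stars: predictions deviate from the true rating by about 0.39 stars on average. The confusion matrix below implies a much smaller signed bias, with predictions about 0.04 stars too high on average.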

| Metric | Value |
| --- | --- |
| Accuracy | 0.6391 |
| Precision | 0.6416 |
| Recall | 0.6391 |
| F1-Score | 0.6400 |

Classification Report (Test Set)

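| Label | Precision | Recall | F1-Score | Support |
| --- | --- | --- | --- | --- |
| 1 | 0.7483 | 0.6856 | 0.7156 | 1600 |
| 2 | 0.5445 | 0.5544 | 0.5494 | 1600 |
| 3 | 0.6000 | 0.6281 | 0.6137 | 1600 |
| 4 | 0.5828 | 0.5894 | 0.5861 | 1600 |
| 5 | 0.7326 | 0.7381 | 0.7354 | 1600 |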

Confusion Matrix (Test Set)

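| True \ Predicted | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| 1 | 1097 | 437 | 60 | 3 | 3 |
| 2 | 327 | 887 | 344 | 34 | 8 |
| 3 | 37 | 278 | 1005 | 254 | 26 |
| 4 | 3 | 21 | 239 | 943 | 394 |
| 5 | 2 | 6 | 27 | 384 | 1181 |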
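The summary metrics can be re-derived from the confusion matrix above. The sketch below is a plain-Python recomputation, not the evaluation script used for this model. It assumes the reported precision, recall, and F1 are unweighted (macro) averages over the five labels; the function name `summarize` is hypothetical.

```python
# Test-set confusion matrix copied from the table above.
# Rows are true labels 1..5, columns are predicted labels 1..5.
CONFUSION = [
    [1097, 437, 60, 3, 3],
    [327, 887, 344, 34, 8],
    [37, 278, 1005, 254, 26],
    [3, 21, 239, 943, 394],
    [2, 6, 27, 384, 1181],
]


def summarize(matrix):
    """Derive overall metrics from a square confusion matrix (rows = true)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    precision = [matrix[k][k] / col_sums[k] for k in range(n)]
    recall = [matrix[k][k] / row_sums[k] for k in range(n)]
    f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

    # (true index, predicted index, count) for every cell; the index
    # difference equals the difference in stars.
    cells = [(i, j, matrix[i][j]) for i in range(n) for j in range(n)]
    return {
        "accuracy": sum(matrix[k][k] for k in range(n)) / total,
        "macro_precision": sum(precision) / n,
        "macro_recall": sum(recall) / n,
        "macro_f1": sum(f1) / n,
        "mean_abs_error": sum(abs(j - i) * c for i, j, c in cells) / total,
        "mean_signed_error": sum((j - i) * c for i, j, c in cells) / total,
        "within_one_star": sum(c for i, j, c in cells if abs(j - i) <= 1) / total,
    }


METRICS = summarize(CONFUSION)
for name, value in METRICS.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the script reproduces the reported accuracy, precision, recall, and F1 to four decimals; macro and weighted averages coincide here because every label has 1,600 test reviews. It also gives a mean absolute error of about 0.3934 stars and a mean signed error of about +0.0351 stars, and shows that roughly 97.1% of predictions fall within one star of the true label.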

Files Included

  • validation_results_distilbert.csv: Contains correctly classified reviews with their real and predicted labels.

Limitations

  1. Domain-Specific: The model was trained on TripAdvisor reviews, so it may not generalize to other types of reviews or domains without further fine-tuning.
  2. Subjectivity: Sentiment annotations are subjective and may not fully represent every user's perception.
  3. Performance: Mid-range sentiment labels (2, 3, and 4) have lower precision and recall than the extreme sentiment labels (1 and 5); labels 2 and 4 have the lowest F1-scores.