Sentiment Analysis Model (SAM)

A sentiment analysis model built using the Burn deep learning framework in Rust, fine-tuned on the MTEB Tweet Sentiment Extraction dataset and exposed via a Rocket API.

🧠 Model Details

  • Architecture: Transformer Encoder with 4 layers, 8 attention heads, d_model=256, and d_ff=1024 (matching the learner summary below; sketched in code after this list).
  • Embeddings: Token and positional embeddings with a maximum sequence length of 256.
  • Output Layer: Linear layer mapping to 3 sentiment classes: Negative, Neutral, Positive.
  • Activation Function: Softmax for multi-class classification.
  • Dropout: Applied at a rate of 0.1 in two places, after the embeddings and before the output layer, to prevent overfitting.
  • Training Framework: Burn in Rust.
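For reference, these modules map directly onto Burn's built-in configs. Below is a minimal sketch of the configuration, assuming Burn's burn::nn API (the constructor argument order and the with_* setters follow Burn's derive(Config) convention, but may differ slightly between Burn versions):

```rust
use burn::nn::transformer::TransformerEncoderConfig;
use burn::nn::{DropoutConfig, EmbeddingConfig, LinearConfig};

/// Hyperparameters mirroring the learner summary below: 4 layers, 8 heads,
/// d_model = 256, d_ff = 1024, BERT-cased vocab (28,996 tokens), 3 classes.
fn model_configs() -> (
    TransformerEncoderConfig,
    EmbeddingConfig,
    EmbeddingConfig,
    DropoutConfig,
    LinearConfig,
) {
    let transformer = TransformerEncoderConfig::new(256, 1024, 8, 4)
        .with_dropout(0.1)
        .with_norm_first(true)     // pre-layer-norm, as in the summary
        .with_quiet_softmax(true); // attention softmax variant from the summary
    let embedding_token = EmbeddingConfig::new(28996, 256); // token embeddings
    let embedding_pos = EmbeddingConfig::new(256, 256); // positions, max_seq_length = 256
    let dropout = DropoutConfig::new(0.1); // reused for embed and output dropout
    let output = LinearConfig::new(256, 3); // logits over Negative/Neutral/Positive
    (transformer, embedding_token, embedding_pos, dropout, output)
}
```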

πŸ“š Training Data

  • Dataset: MTEB Tweet Sentiment Extraction
  • Size: 100,000 training samples.
  • Preprocessing: Tokenized with the BertCasedTokenizer (see the sketch after this list).
  • Batching: Mini-batch gradient descent with a batch size of 32.
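The BertCasedTokenizer (as in Burn's text-classification example) wraps the Hugging Face tokenizers crate, so the preprocessing step can be sketched as follows. This is a minimal illustration: from_pretrained assumes the crate's http feature and network access, and the truncation to 256 tokens matches the model's maximum sequence length:

```rust
use tokenizers::Tokenizer;

/// Tokenize a mini-batch of tweets with the cased BERT vocabulary,
/// truncating each sequence to the model's max length of 256 tokens.
fn tokenize_batch(tokenizer: &Tokenizer, texts: &[&str]) -> Vec<Vec<u32>> {
    texts
        .iter()
        .map(|text| {
            let encoding = tokenizer
                .encode(*text, true) // true = add special tokens ([CLS], [SEP])
                .expect("tokenization failed");
            encoding.get_ids().iter().take(256).copied().collect()
        })
        .collect()
}

fn main() {
    let tokenizer = Tokenizer::from_pretrained("bert-base-cased", None)
        .expect("failed to load the bert-base-cased tokenizer");
    let batch = tokenize_batch(&tokenizer, &["I love the new features in this app!"]);
    println!("{} sequences, first has {} tokens", batch.len(), batch[0].len());
}
```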

βš™οΈ Training Configuration

  • Optimizer: AdamW with weight decay 0.01 and a base learning rate of 1e-4, a robust default for training transformer models.
  • Learning Rate Scheduler: Noam scheduler with 5,000 warm-up steps, the schedule originally designed for transformers; see the configuration sketch after this list.
  • Loss Function: CrossEntropyLoss with label smoothing (0.1) and class balancing.
  • Gradient Clipping: Applied with a maximum norm of 1.0.
  • Early Stopping: Implemented with a patience of 2 epochs.
  • Epochs: Trained for up to 5 epochs with early stopping based on validation loss.
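Assuming these pieces come from Burn's optim, lr_scheduler, and nn::loss modules, the whole configuration can be sketched in a few lines (setter names follow Burn's derive(Config) convention and may vary across versions; the per-class weights shown are placeholders for the actual class-balancing weights):

```rust
use burn::grad_clipping::GradientClippingConfig;
use burn::lr_scheduler::noam::NoamLrSchedulerConfig;
use burn::nn::loss::CrossEntropyLossConfig;
use burn::optim::AdamWConfig;

fn training_configs() -> (AdamWConfig, NoamLrSchedulerConfig, CrossEntropyLossConfig) {
    // AdamW with weight decay 0.01 and gradient-norm clipping at 1.0.
    let optimizer = AdamWConfig::new()
        .with_weight_decay(0.01)
        .with_grad_clipping(Some(GradientClippingConfig::Norm(1.0)));

    // Noam schedule: linear warm-up for 5,000 steps, then inverse-sqrt
    // decay, scaled by the base learning rate 1e-4 and d_model = 256.
    let scheduler = NoamLrSchedulerConfig::new(1e-4)
        .with_warmup_steps(5000)
        .with_model_size(256);

    // Cross-entropy with label smoothing 0.1; the per-class weights
    // implement the class balancing (placeholder values here).
    let loss = CrossEntropyLossConfig::new()
        .with_smoothing(Some(0.1))
        .with_weights(Some(vec![1.0, 1.0, 1.0]));

    (optimizer, scheduler, loss)
}
```

Early stopping (patience of 2 epochs on validation loss) is typically wired into Burn's LearnerBuilder rather than these configs.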

πŸ“ˆ Evaluation Metrics

  • Learner Summary:
```
TextClassificationModel {
  transformer: TransformerEncoder {d_model: 256, d_ff: 1024, n_heads: 8, n_layers: 4, dropout: 0.1, norm_first: true, quiet_softmax: true, params: 3159040}
  embedding_token: Embedding {n_embedding: 28996, d_model: 256, params: 7422976}
  embedding_pos: Embedding {n_embedding: 256, d_model: 256, params: 65536}
  embed_dropout: Dropout {prob: 0.1}
  output_dropout: Dropout {prob: 0.1}
  output: Linear {d_input: 256, d_output: 3, bias: true, params: 771}
  n_classes: 3
  max_seq_length: 256
  params: 10648323
}
```
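As a sanity check, the per-module parameter counts follow directly from the sizes above:

  • Token embeddings: 28,996 × 256 = 7,422,976
  • Positional embeddings: 256 × 256 = 65,536
  • Output layer: 256 × 3 weights + 3 biases = 771
  • Transformer: 4 layers × (4·(256² + 256) for the Q/K/V/output projections + (256·1024 + 1024) + (1024·256 + 256) for the feed-forward + 2·(2·256) for the layer norms) = 4 × 789,760 = 3,159,040

which sums to the reported total of 10,648,323.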
| Split | Metric        | Min.     | Epoch (min) | Max.     | Epoch (max) |
|-------|---------------|----------|-------------|----------|-------------|
| Train | Loss          | 1.120    | 5           | 1.171    | 1           |
| Train | Accuracy      | 33.743   | 2           | 37.814   | 1           |
| Train | Learning Rate | 2.763e-8 | 1           | 7.648e-8 | 2           |
| Valid | Loss          | 1.102    | 4           | 1.110    | 1           |
| Valid | Accuracy      | 32.760   | 2           | 36.900   | 5           |
  • TODO:
    • Tweak hyperparameters to alleviate underfitting; per the table above, the effective learning rate never exceeded ~7.6e-8, so the base learning rate and Noam warm-up likely need retuning.
    • Enhance logging and monitoring.

πŸš€ Usage

  • API Endpoint: POST /predict (see the Rocket handler sketch below).
  • Example Request:
```json
{
  "text": "I love the new features in this app!"
}
```
  • Example Response:
```json
{
  "sentiment": "Positive"
}
```
  • Steps to Run: TODO once the service is dockerized and deployed to Hugging Face Spaces.
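Until those steps land, here is a minimal sketch of how the /predict endpoint could be served with Rocket 0.5 (the type names and the classify stub are illustrative assumptions, not this repo's actual code; the real handler would run the Burn model from the sketches above):

```rust
use rocket::serde::{json::Json, Deserialize, Serialize};
use rocket::{launch, post, routes};

#[derive(Deserialize)]
#[serde(crate = "rocket::serde")]
struct PredictRequest {
    text: String,
}

#[derive(Serialize)]
#[serde(crate = "rocket::serde")]
struct PredictResponse {
    sentiment: String,
}

/// Hypothetical stub: the real version tokenizes `text`, runs the
/// transformer encoder, and argmaxes over the 3 softmax outputs.
fn classify(text: &str) -> String {
    let _ = text;
    "Positive".to_string()
}

#[post("/predict", data = "<req>")]
fn predict(req: Json<PredictRequest>) -> Json<PredictResponse> {
    Json(PredictResponse {
        sentiment: classify(&req.text),
    })
}

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![predict])
}
```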