# Sentiment Analysis Model (SAM)
A sentiment analysis model built using the Burn deep learning framework in Rust, fine-tuned on the MTEB Tweet Sentiment Extraction dataset and exposed via a Rocket API.
## 🧠 Model Details
- Architecture: Transformer encoder with 4 layers, 8 attention heads, d_model=256, and d_ff=1024, matching the learner summary below (see the Burn sketch after this list).
- Embeddings: Learned token embeddings (28,996-entry BERT cased vocabulary) and positional embeddings, with a maximum sequence length of 256.
- Output Layer: Linear layer mapping to 3 sentiment classes: Negative, Neutral, Positive.
- Activation Function: Softmax over the three classes at inference time; during training the raw logits feed the cross-entropy loss.
- Dropout: Two dropout layers with rate 0.1 (one after the embeddings, one before the output layer) to reduce overfitting.
- Training Framework: Burn in Rust.
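
For reference, here is a minimal sketch of how this architecture maps onto Burn's `nn` modules. The struct fields mirror the learner summary in the Evaluation Metrics section; the padding mask is omitted and method names (e.g. `repeat_dim`) follow recent Burn releases, so treat this as a sketch rather than the exact training code:

```rust
use burn::{
    nn::{
        transformer::{TransformerEncoder, TransformerEncoderConfig, TransformerEncoderInput},
        Dropout, DropoutConfig, Embedding, EmbeddingConfig, Linear, LinearConfig,
    },
    prelude::*,
};

/// Mirrors the learner summary: 4 layers, 8 heads, d_model=256, d_ff=1024.
#[derive(Module, Debug)]
pub struct TextClassificationModel<B: Backend> {
    transformer: TransformerEncoder<B>,
    embedding_token: Embedding<B>,
    embedding_pos: Embedding<B>,
    embed_dropout: Dropout,
    output_dropout: Dropout,
    output: Linear<B>,
    n_classes: usize,
    max_seq_length: usize,
}

impl<B: Backend> TextClassificationModel<B> {
    pub fn new(device: &B::Device) -> Self {
        let transformer = TransformerEncoderConfig::new(256, 1024, 8, 4)
            .with_dropout(0.1)
            .with_norm_first(true)
            .with_quiet_softmax(true)
            .init(device);
        Self {
            transformer,
            embedding_token: EmbeddingConfig::new(28_996, 256).init(device), // BERT cased vocab
            embedding_pos: EmbeddingConfig::new(256, 256).init(device),      // learned positions
            embed_dropout: DropoutConfig::new(0.1).init(),
            output_dropout: DropoutConfig::new(0.1).init(),
            output: LinearConfig::new(256, 3).init(device), // 3 sentiment classes
            n_classes: 3,
            max_seq_length: 256,
        }
    }

    /// tokens: [batch, seq] of token ids; returns [batch, n_classes] logits.
    pub fn forward(&self, tokens: Tensor<B, 2, Int>) -> Tensor<B, 2> {
        let [batch, seq] = tokens.dims();
        let device = tokens.device();

        // Sum of learned token and position embeddings.
        let positions = Tensor::<B, 1, Int>::arange(0..seq as i64, &device)
            .reshape([1, seq])
            .repeat_dim(0, batch);
        let x = self.embedding_token.forward(tokens) + self.embedding_pos.forward(positions);
        let x = self.embed_dropout.forward(x);

        // Padding mask omitted for brevity; real batches should attach one
        // via TransformerEncoderInput::mask_pad.
        let encoded = self.transformer.forward(TransformerEncoderInput::new(x));

        // Classify from the first token's representation: [batch, 1, 256] -> [batch, 256].
        let cls = encoded.slice([0..batch, 0..1]).reshape([batch, 256]);
        self.output.forward(self.output_dropout.forward(cls))
    }
}
```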
## 📊 Training Data
- Dataset: MTEB Tweet Sentiment Extraction
- Size: 100,000 training samples.
- Preprocessing: Tokenized with the BertCasedTokenizer (see the sketch after this list).
- Batching: Mini-batch gradient descent with a batch size of 32.
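
For illustration, the tokenization step can be reproduced with the Hugging Face `tokenizers` crate, which Burn's text-classification example wraps as `BertCasedTokenizer`; the `from_pretrained` call assumes the crate's `http` feature is enabled:

```rust
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the pretrained BERT cased tokenizer (28,996-token WordPiece vocabulary).
    let tokenizer = Tokenizer::from_pretrained("bert-base-cased", None)?;

    // Encode one tweet; `true` adds the [CLS]/[SEP] special tokens.
    let encoding = tokenizer.encode("I love the new features in this app!", true)?;
    let ids: Vec<u32> = encoding.get_ids().to_vec();
    println!("{} tokens: {:?}", ids.len(), ids);
    Ok(())
}
```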
## ⚙️ Training Configuration
- Optimizer: AdamW with weight decay 0.01 and a base learning rate of 1e-4; decoupled weight decay is a solid default for transformer training.
- Learning Rate Scheduler: Noam scheduler with 5,000 warm-up steps (linear warm-up, then inverse-square-root decay), the schedule originally proposed for transformers.
- Loss Function: CrossEntropyLoss with label smoothing (0.1) and class balancing.
- Gradient Clipping: Applied with a maximum norm of 1.0.
- Early Stopping: Implemented with a patience of 2 epochs.
- Epochs: Trained for up to 5 epochs with early stopping based on validation loss.
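
As a rough sketch, these settings map onto Burn's `optim`, `lr_scheduler`, and `loss` configs as below. Exact signatures vary between Burn releases (e.g. whether `init` returns a `Result`), and the class weights shown are illustrative placeholders, not the values used in training:

```rust
use burn::grad_clipping::GradientClippingConfig;
use burn::lr_scheduler::noam::{NoamLrScheduler, NoamLrSchedulerConfig};
use burn::nn::loss::{CrossEntropyLoss, CrossEntropyLossConfig};
use burn::optim::AdamWConfig;
use burn::prelude::*;

fn training_components<B: Backend>(
    device: &B::Device,
) -> (AdamWConfig, NoamLrScheduler, CrossEntropyLoss<B>) {
    // AdamW with decoupled weight decay and gradient-norm clipping at 1.0;
    // the config is later handed to Burn's LearnerBuilder.
    let optimizer = AdamWConfig::new()
        .with_weight_decay(0.01)
        .with_grad_clipping(Some(GradientClippingConfig::Norm(1.0)));

    // Noam schedule: linear warm-up over 5,000 steps, then inverse-square-root
    // decay, scaled by d_model = 256; 1e-4 sets the learning-rate scale.
    let scheduler = NoamLrSchedulerConfig::new(1e-4)
        .with_warmup_steps(5_000)
        .with_model_size(256)
        .init();

    // Cross-entropy with label smoothing 0.1. Class balancing is expressed as
    // per-class weights; the values below are placeholders, not the weights
    // actually used in training.
    let loss = CrossEntropyLossConfig::new()
        .with_smoothing(Some(0.1))
        .with_weights(Some(vec![1.0, 1.0, 1.0]))
        .init(device);

    (optimizer, scheduler, loss)
}
```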
## 📈 Evaluation Metrics
- Learner Summary:

```text
TextClassificationModel {
  transformer: TransformerEncoder {d_model: 256, d_ff: 1024, n_heads: 8, n_layers: 4, dropout: 0.1, norm_first: true, quiet_softmax: true, params: 3159040}
  embedding_token: Embedding {n_embedding: 28996, d_model: 256, params: 7422976}
  embedding_pos: Embedding {n_embedding: 256, d_model: 256, params: 65536}
  embed_dropout: Dropout {prob: 0.1}
  output_dropout: Dropout {prob: 0.1}
  output: Linear {d_input: 256, d_output: 3, bias: true, params: 771}
  n_classes: 3
  max_seq_length: 256
  params: 10648323
}
```
| Split | Metric | Min. | Epoch | Max. | Epoch |
|---|---|---|---|---|---|
| Train | Loss | 1.120 | 5 | 1.171 | 1 |
| Train | Accuracy (%) | 33.743 | 2 | 37.814 | 1 |
| Train | Learning Rate | 2.763e-8 | 1 | 7.648e-8 | 2 |
| Valid | Loss | 1.102 | 4 | 1.110 | 1 |
| Valid | Accuracy (%) | 32.760 | 2 | 36.900 | 5 |
- TODO:
- Tweak hyperparameters (e.g., a higher peak learning rate or more epochs) to alleviate the underfitting suggested by the near-chance accuracy above.
- Enhance logging and monitoring.
## 🚀 Usage
- API Endpoint: `POST /predict` (see the Rocket handler sketch after this list).
- Example Request:

```json
{
  "text": "I love the new features in this app!"
}
```
- Example Response:

```json
{
  "sentiment": "Positive"
}
```
- Steps to Run: TODO once the service is dockerized and deployed to Hugging Face Spaces.
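
In the meantime, here is a minimal sketch of the `/predict` endpoint in Rocket 0.5 (with its `json` feature enabled); `run_model` is a hypothetical placeholder for the real tokenize-and-predict pipeline:

```rust
#[macro_use]
extern crate rocket;

use rocket::serde::{json::Json, Deserialize, Serialize};

#[derive(Deserialize)]
#[serde(crate = "rocket::serde")]
struct PredictRequest {
    text: String,
}

#[derive(Serialize)]
#[serde(crate = "rocket::serde")]
struct PredictResponse {
    sentiment: String,
}

// Hypothetical placeholder: the real handler tokenizes the text, runs the
// Burn model, and maps the argmax logit to Negative/Neutral/Positive.
fn run_model(_text: &str) -> String {
    "Positive".to_string()
}

#[post("/predict", format = "json", data = "<req>")]
fn predict(req: Json<PredictRequest>) -> Json<PredictResponse> {
    Json(PredictResponse {
        sentiment: run_model(&req.text),
    })
}

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![predict])
}
```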