# BERT IMDb Ensemble for Sentiment Analysis
## Model description
This is an ensemble of 3 BERT-base-uncased models fine-tuned on the IMDb dataset for binary sentiment classification (positive vs. negative reviews).
Each model was trained with a different random seed, and predictions are combined using weighted or unweighted averaging for more robust performance.
- Base model: `bert-base-uncased`
- Task: Sentiment classification (binary: 0 = negative, 1 = positive)
- Ensembling strategy: Weighted logits averaging
## Training procedure
Dataset: IMDb (train/test split from the Hugging Face `datasets` library)

Preprocessing (see the sketch below):
- Tokenization with the `bert-base-uncased` tokenizer
- Truncation at 512 tokens
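A minimal sketch of this preprocessing, assuming the IMDb split from the Hugging Face `datasets` library; this is illustrative, not the exact training script:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the IMDb train/test split from the Hugging Face Hub
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Truncate each review to BERT's 512-token limit
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)
```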
Hyperparameters (a training sketch follows this list):
- Epochs: 2
- Batch size: 8
- Optimizer: AdamW (default in `Trainer`)
- FP16: Enabled
- Seeds: `[42, 123, 999]`
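The exact training script is not included in this card; the sketch below shows how the listed hyperparameters might map onto `TrainingArguments`/`Trainer` (the output path is a placeholder, and `tokenized` refers to the preprocessing sketch above):

```python
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments, set_seed)

seed = 42  # repeated for each of the seeds [42, 123, 999]
set_seed(seed)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

args = TrainingArguments(
    output_dir=f"bert-imdb-seed{seed}",  # placeholder output path
    num_train_epochs=2,
    per_device_train_batch_size=8,
    fp16=True,                           # mixed-precision training
    seed=seed,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,                 # enables dynamic padding of batches
)
trainer.train()
```

`Trainer` uses AdamW by default, so no optimizer is configured explicitly.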
## Evaluation results
Across the three models, results are very consistent:
| Model (Seed) | Epochs | Val. Accuracy | Val. Macro F1 |
|---|---|---|---|
| 42 | 2 | 93.74% | 0.9374 |
| 123 | 2 | 93.84% | 0.9383 |
| 999 | 2 | 93.98% | 0.9398 |
Ensemble performance (weighted example: `[0.2, 0.2, 0.6]`) improves stability and helps reduce variance across seeds.
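A sketch of weighted logits averaging over the three fine-tuned checkpoints; the per-seed checkpoint names below are hypothetical placeholders, and the weights match the example above:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical per-seed checkpoint paths; substitute the actual ones
checkpoints = ["bert-imdb-seed42", "bert-imdb-seed123", "bert-imdb-seed999"]
weights = [0.2, 0.2, 0.6]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
models = [
    AutoModelForSequenceClassification.from_pretrained(ckpt).eval()
    for ckpt in checkpoints
]

inputs = tokenizer("This movie was an absolute masterpiece!", return_tensors="pt")

with torch.no_grad():
    # Weighted sum of the per-model logits, then a single softmax
    logits = sum(w * m(**inputs).logits for w, m in zip(weights, models))

probs = torch.softmax(logits, dim=-1)
print(probs.argmax(dim=-1).item())  # 1 = positive, 0 = negative
```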
## How to use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("ByteMeHarder-404/bert-imdb-ensemble")
model = AutoModelForSequenceClassification.from_pretrained("ByteMeHarder-404/bert-imdb-ensemble")

# Tokenize a single review and run it through the model
inputs = tokenizer("This movie was an absolute masterpiece!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(probs)  # tensor([[0.01, 0.99]]) -> positive sentiment
```