# BERT IMDb Ensemble for Sentiment Analysis
## Model description
This is an ensemble of 3 BERT-base-uncased models fine-tuned on the IMDb dataset for binary sentiment classification (positive vs. negative reviews).
Each model was trained with a different random seed, and predictions are combined using weighted or unweighted averaging for more robust performance.
- Base model: `bert-base-uncased`
- Task: Sentiment classification (binary: 0 = negative, 1 = positive)
- Ensembling strategy: Weighted logits averaging
## Training procedure
Dataset: IMDb (train/test split from the Hugging Face `datasets` library)

Preprocessing (see the sketch below):
- Tokenization with the `bert-base-uncased` tokenizer
- Truncation at 512 tokens
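A minimal sketch of this preprocessing, assuming the IMDb split from the Hugging Face `datasets` library; this is illustrative, not the exact training script:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the IMDb train/test split from the Hugging Face Hub
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Truncate each review to BERT's 512-token limit
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)
```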
Hyperparameters (a training sketch follows this list):
- Epochs: 2
- Batch size: 8
- Optimizer: AdamW (default in `Trainer`)
- FP16: Enabled
- Seeds: `[42, 123, 999]`
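The exact training script is not included in this card; the sketch below shows how the listed hyperparameters might map onto `TrainingArguments`/`Trainer` (the output path is a placeholder, and `tokenized` refers to the preprocessing sketch above):

```python
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments, set_seed)

seed = 42  # repeated for each of the seeds [42, 123, 999]
set_seed(seed)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

args = TrainingArguments(
    output_dir=f"bert-imdb-seed{seed}",  # placeholder output path
    num_train_epochs=2,
    per_device_train_batch_size=8,
    fp16=True,                           # mixed-precision training
    seed=seed,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,                 # enables dynamic padding of batches
)
trainer.train()
```

`Trainer` uses AdamW by default, so no optimizer is configured explicitly.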
## Evaluation results
Across the three models, results are very consistent:
| Model (Seed) | Epochs | Val. Accuracy | Val. Macro F1 |
|---|---|---|---|
| 42 | 2 | 93.74% | 0.9374 |
| 123 | 2 | 93.84% | 0.9383 |
| 999 | 2 | 93.98% | 0.9398 |
Ensemble performance (weighted example: `[0.2, 0.2, 0.6]`) improves stability and helps reduce variance across seeds.
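A sketch of weighted logits averaging over the three fine-tuned checkpoints; the per-seed checkpoint names below are hypothetical placeholders, and the weights match the example above:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical per-seed checkpoint paths; substitute the actual ones
checkpoints = ["bert-imdb-seed42", "bert-imdb-seed123", "bert-imdb-seed999"]
weights = [0.2, 0.2, 0.6]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
models = [
    AutoModelForSequenceClassification.from_pretrained(ckpt).eval()
    for ckpt in checkpoints
]

inputs = tokenizer("This movie was an absolute masterpiece!", return_tensors="pt")

with torch.no_grad():
    # Weighted sum of the per-model logits, then a single softmax
    logits = sum(w * m(**inputs).logits for w, m in zip(weights, models))

probs = torch.softmax(logits, dim=-1)
print(probs.argmax(dim=-1).item())  # 1 = positive, 0 = negative
```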
## How to use
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("ByteMeHarder-404/bert-imdb-ensemble")
model = AutoModelForSequenceClassification.from_pretrained("ByteMeHarder-404/bert-imdb-ensemble")

# Tokenize a single review and run it through the model
inputs = tokenizer("This movie was an absolute masterpiece!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(probs)  # tensor([[0.01, 0.99]]) -> positive sentiment
```