bert-finetuned-sst2

This model is a fine-tuned version of bert-base-uncased on the SST-2 dataset. It achieves the following results on the evaluation set:

Loss: 0.6081
Accuracy: 0.64
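
To put these numbers in context, the following sketch compares the reported loss with the cross-entropy of a two-class model that always predicts 50/50, and estimates how much the accuracy could vary given that only 100 evaluation examples were used. The normal-approximation interval is an assumption made for illustration, not something computed during training.

```python
import math

# Values reported in this card
eval_loss = 0.6081
accuracy = 0.64
n_eval = 100  # evaluation examples used, per this card

# Cross-entropy of a binary classifier that always predicts 50/50
chance_loss = math.log(2)

# Normal-approximation 95% interval for a proportion (an assumption;
# it treats the 100 evaluation examples as independent draws)
std_err = math.sqrt(accuracy * (1 - accuracy) / n_eval)
ci_low = accuracy - 1.96 * std_err
ci_high = accuracy + 1.96 * std_err

print(f"chance-level loss: {chance_loss:.4f}")
print(f"accuracy 95% interval: [{ci_low:.3f}, {ci_high:.3f}]")
```

Under these assumptions the loss is below the chance level of about 0.693, and the interval around the accuracy is roughly 0.09 wide on each side, so small differences between epochs should be read with care.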

Model description

bert-finetuned-sst2 is based on the BERT base architecture, which has 12 transformer layers, and uses an uncased vocabulary: text is lowercased before tokenization, so the model does not distinguish between uppercase and lowercase letters. BERT was introduced by researchers at Google AI Language in a 2018 paper and has since become a staple for NLP tasks because its bidirectional encoder represents each word in the context of the whole sentence. This model is fine-tuned to classify sentences as positive or negative, which makes it a candidate for analyzing customer feedback, social media posts, and other text where sentiment matters.
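
As a rough illustration of the classification step, the sketch below turns the two logits a sequence-classification head produces into a sentiment label and a probability. The helper names (`softmax`, `logits_to_sentiment`) and the example logits are hypothetical, and the label mapping assumes the GLUE SST-2 convention (0 = negative, 1 = positive) rather than anything read from this checkpoint's config.

```python
import math

# Assumed label mapping: GLUE SST-2 convention (0 = negative, 1 = positive)
ID2LABEL = {0: "negative", 1: "positive"}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_sentiment(logits):
    """Return (label, probability) for one sentence's two logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for a single sentence, for illustration only
label, prob = logits_to_sentiment([-0.3, 0.9])
print(label, round(prob, 3))
```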

Intended uses & limitations

bert-finetuned-sst2 is intended for sentiment analysis in applications such as social media monitoring, customer feedback analysis, and market research. It is trained on English text only. Users should be aware of potential biases in the training data, which could influence the model's outputs. Note that this checkpoint was fine-tuned on only 100 examples and reaches 0.64 accuracy on the evaluation subset. It may also perform worse on text that differs significantly from the training data, such as highly technical documents, and it is not expected to work on languages other than English.

Training and evaluation data

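The model was trained and evaluated on the SST-2 dataset. We randomly selected 100 training examples from the train split and 100 evaluation examples from the validation split.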

How to use

from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding

raw_datasets = load_dataset("glue", "sst2")
checkpoint = "zhuchi76/bert-finetuned-sst2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)


def tokenize_function(example):
    return tokenizer(example["sentence"], truncation=True)


tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)

small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(100))
small_eval_dataset = tokenized_datasets["validation"].shuffle(seed=42).select(range(100))

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

import evaluate
import numpy as np
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

training_args = TrainingArguments(output_dir="bert-finetuned-sst2",
                                  evaluation_strategy="epoch",
                                  hub_model_id="zhuchi76/bert-finetuned-sst2")

model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# GLUE SST-2 metric (reports accuracy)
metric = evaluate.load("glue", "sst2")


# Assumed definition: mirrors the evaluation step below
# (argmax over the logits, then the GLUE SST-2 metric)
def compute_metrics(eval_preds):
    logits, labels = eval_preds
    preds = np.argmax(logits, axis=-1)
    return metric.compute(predictions=preds, references=labels)


trainer = Trainer(
    model,
    training_args,
    train_dataset=small_train_dataset,  # 100-example subset, small enough for CPU
    eval_dataset=small_eval_dataset,  # 100-example subset, small enough for CPU
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

# Evaluation
predictions = trainer.predict(small_eval_dataset)
print(predictions.predictions.shape, predictions.label_ids.shape)
preds = np.argmax(predictions.predictions, axis=-1)
print(metric.compute(predictions=preds, references=predictions.label_ids))

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3.0
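
The sketch below works through what these settings imply for the step count and the learning rate over training. It assumes a single device, no gradient accumulation, and zero warmup steps, none of which are stated in this card; the resulting 13 steps per epoch and 39 total steps do match the Step column in the training results.

```python
import math

# Hyperparameters listed above
learning_rate = 5e-05
train_batch_size = 8
num_epochs = 3
n_train = 100  # training examples used, per this card

# Assumes one device and no gradient accumulation
steps_per_epoch = math.ceil(n_train / train_batch_size)
total_steps = steps_per_epoch * num_epochs


def linear_lr(step, warmup_steps=0):
    """Linear warmup, then linear decay to zero (warmup_steps=0 is an assumption)."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at the end of each epoch
for step in (0, steps_per_epoch, 2 * steps_per_epoch, total_steps):
    print(step, f"{linear_lr(step):.2e}")
```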

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	13	0.6714	0.57
No log	2.0	26	0.6477	0.65
No log	3.0	39	0.6081	0.64

Framework versions

Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2
