---
license: mit
datasets:
  - sepidmnorozy/Indonesian_sentiment
language:
  - id
base_model:
  - google-bert/bert-base-uncased
pipeline_tag: text-classification
---

This repository contains a BERT model fine-tuned for Indonesian sentiment analysis. The model classifies text into two sentiment categories: 0 (negative) and 1 (positive). Below is a summary of the model's performance and training details.

## Model Performance Summary

The model achieved the following performance metrics on the evaluation dataset:

| Class | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 0 (negative) | 0.88 | 0.84 | 0.86 | 799 |
| 1 (positive) | 0.92 | 0.94 | 0.93 | 1467 |

Accuracy: 0.91

| Average | Precision | Recall | F1-Score |
|---------|-----------|--------|----------|
| Macro avg | 0.90 | 0.89 | 0.90 |
| Weighted avg | 0.90 | 0.91 | 0.90 |
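
These figures follow the layout of scikit-learn's `classification_report`. As a minimal, hypothetical sketch of how such a table is produced, assuming `y_true` and `y_pred` hold the gold and predicted labels for the evaluation split (the short lists below are illustrative placeholders, not the real evaluation data):

```python
from sklearn.metrics import accuracy_score, classification_report

# Illustrative placeholders: integer labels (0 = negative, 1 = positive)
# for the evaluation split and the model's predictions on it.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(classification_report(y_true, y_pred, target_names=["negative", "positive"]))
print("Accuracy:", round(accuracy_score(y_true, y_pred), 2))
```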

## Training Details

| Detail | Value |
|--------|-------|
| Model Architecture | BERT (Bidirectional Encoder Representations from Transformers) |
| Task | Sentiment Analysis (Binary Classification) |
| Epochs | 5 |
| Hardware | NVIDIA A100 GPU |
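
For reference, a fine-tuning run matching these details could look like the sketch below. This is not the exact training script: the dataset and base model come from the metadata above and the epoch count from the table, but the column names, split names, batch size, and learning rate are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Dataset and base model taken from the metadata at the top of this card
dataset = load_dataset("sepidmnorozy/Indonesian_sentiment")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "google-bert/bert-base-uncased", num_labels=2
)

def tokenize(batch):
    # Assumes the dataset stores raw text in a "text" column
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="bert-sentiment-analisis-indo",
    num_train_epochs=5,              # epoch count from the table above
    per_device_train_batch_size=32,  # assumption
    learning_rate=2e-5,              # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],  # split name is an assumption
    tokenizer=tokenizer,             # enables dynamic padding during batching
)
trainer.train()
```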

## How to Use

### 1. Install Dependencies

Ensure you have the necessary libraries installed:

```bash
pip install transformers torch
```

### 2. Load the Model

You can load the fine-tuned BERT model using the transformers library:

```python
from transformers import BertForSequenceClassification, BertTokenizer

# Load the fine-tuned model and tokenizer from the Hugging Face Hub
model = BertForSequenceClassification.from_pretrained("bibrani/bert-sentiment-analisis-indo")
tokenizer = BertTokenizer.from_pretrained("bibrani/bert-sentiment-analisis-indo")
```

### 3. Preprocess and Predict

Preprocess your input text and make predictions:

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load the fine-tuned model and tokenizer from the Hugging Face Hub
model_path = "bibrani/bert-sentiment-analisis-indo"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path)

# Run on GPU if available, otherwise on CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()
print(device)

def predict_sentiment(text):
    """Predicts the sentiment of a given text.

    Args:
        text (str): The input text.

    Returns:
        str: "Negative sentiment" or "Positive sentiment".
    """
    # Tokenize the input text
    inputs = tokenizer(
        text,
        padding="max_length",
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )

    # Move inputs to the device
    input_ids = inputs.input_ids.to(device)
    attention_mask = inputs.attention_mask.to(device)

    # Perform inference
    with torch.no_grad():
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits

    # Get the predicted class (0 = negative, 1 = positive)
    predicted_class = torch.argmax(logits, dim=1).item()

    if predicted_class == 0:
        return "Negative sentiment"
    else:
        return "Positive sentiment"

# Example usage (an Indonesian restaurant review with a negative tone)
text_to_predict = "jadi cerita nya saya sedang ingin makan spaghetti dengan meatball yang kalau menurut ekspektasi saya adalah bakso yang terbuat dari cingcang yang biasa digunakan di menu pasta , setelah sampai , ternyata bakso yang digunakan adalah bakso olahan yang biasa dipakai di tukang bakso , bahkan bentuk nya tidak bulat"
sentiment = predict_sentiment(text_to_predict)
print(f"Text: {text_to_predict}")
print(f"Sentiment: {sentiment}")
```

## Results Interpretation

- Class 0: represents negative sentiment.
- Class 1: represents positive sentiment.

The model shows strong performance across both classes, with slightly higher precision and recall for positive sentiment (class 1).
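
If a confidence score is needed alongside the predicted class, the logits can be converted to probabilities with a softmax. A minimal sketch, reusing the `model`, `tokenizer`, and `device` objects from the usage example above (`predict_with_confidence` is a hypothetical helper, not part of this repository):

```python
import torch

def predict_with_confidence(text):
    """Returns (label, probability) for the predicted class."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt").to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Softmax turns the two logits into a probability distribution
    probs = torch.softmax(logits, dim=1).squeeze()
    predicted_class = int(torch.argmax(probs).item())
    label = "Negative sentiment" if predicted_class == 0 else "Positive sentiment"
    return label, probs[predicted_class].item()
```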