---
license: mit
datasets:
- sepidmnorozy/Indonesian_sentiment
language:
- id
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
---

This repository contains a fine-tuned BERT model for sentiment analysis of Indonesian text. The model has been trained to classify text into two sentiment categories: 0 (negative) and 1 (positive). Below is a summary of the model's performance and training details.

# Model Performance Summary

The model achieved the following performance metrics on the evaluation dataset:

| Class | Precision | Recall | F1-Score | Support |
| ----- | --------- | ------ | -------- | ------- |
| 0     | 0.88      | 0.84   | 0.86     | 799     |
| 1     | 0.92      | 0.94   | 0.93     | 1467    |

Accuracy: 0.91

| Average      | Precision | Recall | F1-Score |
| ------------ | --------- | ------ | -------- |
| Macro Avg    | 0.90      | 0.89   | 0.90     |
| Weighted Avg | 0.90      | 0.91   | 0.90     |

# Training Details

| Detail             | Value                                                           |
| ------------------ | --------------------------------------------------------------- |
| Model Architecture | BERT (Bidirectional Encoder Representations from Transformers)  |
| Task               | Sentiment Analysis (Binary Classification)                      |
| Epochs             | 5                                                               |
| Hardware           | NVIDIA A100 GPU                                                 |

# How to Use

## 1. Install Dependencies

Ensure you have the necessary libraries installed:

```bash
pip install transformers torch
```

## 2. Load the Model

You can load the fine-tuned BERT model using the transformers library:

```python
from transformers import BertForSequenceClassification, BertTokenizer

# Load the fine-tuned model and tokenizer
model = BertForSequenceClassification.from_pretrained("bibrani/bert-sentiment-analisis-indo")
tokenizer = BertTokenizer.from_pretrained("bibrani/bert-sentiment-analisis-indo")
```

## 3. Preprocess and Predict

Preprocess your input text and make predictions:

```python
# Use this model to predict whether a sentence is negative or positive.
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load the saved model and tokenizer
model_path = "bibrani/bert-sentiment-analisis-indo"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path)

# Select the device and move the model to it
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
print(device)


def predict_sentiment(text):
    """Predicts the sentiment of a given text.

    Args:
        text (str): The input text.

    Returns:
        str: "Negative sentiment" or "Positive sentiment".
    """
    # Tokenize the input text
    inputs = tokenizer(
        text,
        padding="max_length",
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )

    # Move inputs to the device
    input_ids = inputs.input_ids.to(device)
    attention_mask = inputs.attention_mask.to(device)

    # Perform inference
    with torch.no_grad():
        outputs = model(input_ids, attention_mask=attention_mask)
        logits = outputs.logits

    # Get the predicted class
    predicted_class = torch.argmax(logits, dim=1).item()

    if predicted_class == 0:
        return "Negative sentiment"
    return "Positive sentiment"


# Example usage with an Indonesian restaurant review
text_to_predict = "jadi cerita nya saya sedang ingin makan spaghetti dengan meatball yang kalau menurut ekspektasi saya adalah bakso yang terbuat dari cingcang yang biasa digunakan di menu pasta , setelah sampai , ternyata bakso yang digunakan adalah bakso olahan yang biasa dipakai di tukang bakso , bahkan bentuk nya tidak bulat"
sentiment = predict_sentiment(text_to_predict)
print(f"Text: {text_to_predict}")
print(f"Sentiment: {sentiment}")
```

# Results Interpretation

- Class 0: Represents negative sentiment.
- Class 1: Represents positive sentiment.
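
If you also want a confidence score rather than just the label, the logits can be converted into class probabilities with a softmax. Below is a minimal sketch, assuming the `model`, `tokenizer`, and `device` objects already loaded in the snippet above; the helper name and the example sentence are hypothetical:

```python
import torch
import torch.nn.functional as F


def predict_with_confidence(text):
    """Returns the predicted sentiment label and its softmax probability."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt").to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Convert logits of shape (1, 2) into a probability distribution over the two classes
    probs = F.softmax(logits, dim=1).squeeze()
    predicted_class = int(torch.argmax(probs).item())
    label = "Negative sentiment" if predicted_class == 0 else "Positive sentiment"
    return label, probs[predicted_class].item()


# Hypothetical example input: "the food is very delicious"
label, confidence = predict_with_confidence("makanannya enak sekali")
print(f"{label} (confidence: {confidence:.2f})")
```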
The model shows strong performance across both classes, with slightly higher precision and recall for positive sentiment (class 1).
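
For quick experiments, the same checkpoint can also be run through the transformers `pipeline` API, which bundles tokenization, inference, and label mapping into a single call. This is a sketch rather than the author's documented usage: the returned label strings (e.g. `LABEL_0`/`LABEL_1`) depend on the `id2label` mapping stored in the model config, which is assumed to be the default here.

```python
from transformers import pipeline

# Build a text-classification pipeline from the fine-tuned checkpoint.
# device=-1 runs on CPU; pass device=0 to use the first GPU instead.
classifier = pipeline(
    "text-classification",
    model="bibrani/bert-sentiment-analisis-indo",
    device=-1,
)

# Hypothetical example input: "the meatballs were not tasty"
print(classifier("bakso nya tidak enak"))
# e.g. [{'label': 'LABEL_0', 'score': ...}], where LABEL_0 corresponds to class 0 (negative),
# assuming the default id2label mapping
```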