---
language: en
license: mit
library_name: transformers
tags:
- sentiment-analysis
- text-classification
- pytorch
- distilbert
- imdb
datasets:
- imdb
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: imdb-sentiment-analysis-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: IMDB
      type: imdb
      split: test
    metrics:
    - type: accuracy
      value: 0.865
      name: Accuracy
    - type: f1
      value: 0.8672
      name: F1 Score
---

# Sentiment Analysis Model v2.0

This is version 2.0 of the sentiment analysis model: a DistilBERT classifier fine-tuned on IMDB movie reviews, with additional challenging examples added to better handle difficult cases such as negation, sarcasm, and subtle expressions.

## Model Details

- **Model Type:** DistilBERT (fine-tuned)
- **Task:** Binary Sentiment Classification (Positive/Negative)
- **Training Data:** IMDB Movie Reviews Dataset
- **Language:** English
- **License:** MIT
- **Version:** 2.0

## Performance

| Metric | Value |
|--------|-------|
| Accuracy | 86.50% |
| F1 Score | 0.8672 |
| Precision | 84.21% |
| Recall | 89.47% |
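
The four metrics in the table are all derived from the same confusion matrix. The following is a minimal sketch of those relationships; the counts are hypothetical and chosen only for illustration, and they are not taken from this model's evaluation run.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only
metrics = classification_metrics(tp=90, fp=15, fn=10, tn=85)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because recall is higher than precision in the table, the model produces more false positives than false negatives on the test split.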

## Training Details

The model was trained on the IMDB dataset augmented with challenging examples specifically designed to improve performance on difficult sentiment analysis cases.

### Training Hyperparameters

- Learning Rate: 2e-5
- Batch Size: 16 (effective batch size: 32 with gradient accumulation)
- Epochs: 3
- Optimizer: AdamW with weight decay
- Mixed Precision: FP16
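
As a rough sketch, the hyperparameters above could be expressed with the keyword names used by `transformers.TrainingArguments`. The accumulation step count, device count, weight decay value, and output directory are assumptions, since this card does not state them.

```python
# Keyword names follow transformers.TrainingArguments.
# Values marked "assumed" or "hypothetical" are not stated in this card.
training_kwargs = {
    "output_dir": "./results",           # hypothetical path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 2,    # assumed: 16 * 2 = 32 on one device
    "num_train_epochs": 3,
    "weight_decay": 0.01,                # assumed placeholder value
    "fp16": True,                        # mixed precision; needs a CUDA GPU
}

num_devices = 1  # assumed single-GPU training

# Effective batch size = per-device batch * accumulation steps * devices
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)
print(effective_batch_size)  # 32
```

The dictionary can be unpacked as `TrainingArguments(**training_kwargs)`. `Trainer` uses an AdamW optimizer by default, so no separate optimizer setup is needed for this configuration.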

## Usage

### Direct Use with Pipeline

```python
from transformers import pipeline

# Load the model
sentiment = pipeline("sentiment-analysis", model="shane-reaume/imdb-sentiment-analysis-v2")

# Analyze text
result = sentiment("I really enjoyed this movie!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.9998}] (exact score may vary)

# Batch processing
texts = [
    "This movie was absolutely amazing, I loved every minute of it!",
    "The acting was terrible and the plot made no sense at all."
]
results = sentiment(texts)
for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Sentiment: {result['label']}, Score: {result['score']:.4f}")
```

### Loading Model Directly

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "shane-reaume/imdb-sentiment-analysis-v2"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
text = "I really enjoyed this movie!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    
# Process outputs
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
prediction = torch.argmax(probabilities, dim=-1).item()
confidence = probabilities[0][prediction].item()

# Map prediction to label (assumes index 0 = negative, 1 = positive)
sentiment_label = "POSITIVE" if prediction == 1 else "NEGATIVE"
print(f"Sentiment: {sentiment_label}, Confidence: {confidence:.4f}")
```

## Limitations

- The model is trained primarily on movie reviews and may not perform as well on other domains.
- Despite the additional challenging training examples, the model may still struggle with certain types of text:
  - Sarcasm and irony
  - Mixed sentiment expressions
  - Subtle negative expressions
  - Complex negations
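
One way to check these limitations on your own data is to run a small set of probe sentences through the classifier and collect the disagreements. The sketch below assumes a `classify` callable that returns a label string. The probe sentences are illustrative and do not come from the training or test data, and `keyword_classifier` is a hypothetical stand-in so the example runs without downloading the model.

```python
def find_failures(classify, cases):
    """Return (text, expected, predicted) for every case the classifier gets wrong."""
    failures = []
    for text, expected in cases:
        predicted = classify(text)
        if predicted != expected:
            failures.append((text, expected, predicted))
    return failures


# Illustrative probe sentences (hypothetical)
probe_cases = [
    ("Oh great, another two hours of my life I will never get back.", "NEGATIVE"),  # sarcasm
    ("I can't say I didn't enjoy it.", "POSITIVE"),  # complex negation
    ("I have seen better performances at a school play.", "NEGATIVE"),  # subtle negative
]


def keyword_classifier(text):
    """Naive stand-in classifier: positive only if an obvious keyword appears."""
    lowered = text.lower()
    return "POSITIVE" if ("great" in lowered or "good" in lowered) else "NEGATIVE"


failures = find_failures(keyword_classifier, probe_cases)
for text, expected, predicted in failures:
    print(f"expected {expected}, got {predicted}: {text}")
```

To probe the real model, pass a wrapper around the pipeline from the Usage section, for example `lambda t: sentiment(t)[0]["label"]`, assuming the model reports `POSITIVE` and `NEGATIVE` labels as shown there.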

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{sentiment-analysis-model,
  author = {Your Name},
  title = {Sentiment Analysis Model based on DistilBERT},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/shane-reaume/imdb-sentiment-analysis-v2}}
}
```