	---
	language: en
	license: mit
	library_name: transformers
	tags:
	- sentiment-analysis
	- text-classification
	- pytorch
	- distilbert
	- imdb
	datasets:
	- imdb
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	model-index:
	- name: imdb-sentiment-analysis-v2
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: IMDB
	      type: imdb
	      split: test
	    metrics:
	    - type: accuracy
	      value: 86.5
	      name: Accuracy
	    - type: f1
	      value: 0.8672
	      name: F1 Score
	---

	# Sentiment Analysis Model v2.0

	Version 2.0 of the sentiment analysis model. It is fine-tuned on the IMDB dataset augmented with additional challenging examples, with the goal of better handling difficult cases such as negation, sarcasm, and subtle expressions.

	## Model Details

	- Model Type: DistilBERT (fine-tuned)
	- Task: Binary Sentiment Classification (Positive/Negative)
	- Training Data: IMDB Movie Reviews Dataset
	- Language: English
	- License: MIT
	- Version: 2.0

	## Performance

	| Metric | Value |
	|--------|-------|
	| Accuracy | 86.50% |
	| F1 Score | 0.8672 |
	| Precision | 84.21% |
	| Recall | 89.47% |
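
The F1 score is the harmonic mean of precision and recall, so the figures in the table can be cross-checked against each other. A minimal sketch, using only the rounded values reported above (the helper name `f1_from_precision_recall` is illustrative and not part of the model's code):

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the Performance table.
precision = 0.8421
recall = 0.8947

f1 = f1_from_precision_recall(precision, recall)
print(f"F1 recomputed from rounded precision/recall: {f1:.4f}")
```

The recomputed value lands close to the reported 0.8672 but differs in the fourth decimal place. The card does not say how the metrics were averaged or rounded, so the source of that small gap is not known.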

	## Training Details

	The model was trained on the IMDB dataset augmented with challenging examples specifically designed to improve performance on difficult sentiment analysis cases.

	### Training Hyperparameters

	- Learning Rate: 2e-5
	- Batch Size: 16 (effective batch size: 32 with gradient accumulation)
	- Epochs: 3
	- Optimizer: AdamW with weight decay
	- Mixed Precision: FP16
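
The batch-size entry implies two gradient accumulation steps per optimizer update (32 / 16), assuming single-device training, which the card does not state. As a rough sketch, the listed settings could be collected as shown below. The keys mirror `transformers.TrainingArguments` parameter names, and the weight-decay value of 0.01 is an assumption because the card gives no number.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments parameter names.
hparams = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 2,  # derived: 32 / 16, assuming one device
    "num_train_epochs": 3,
    "weight_decay": 0.01,  # assumed value; not given in the card
    "fp16": True,  # mixed precision
}

# Effective batch size = per-device batch size * accumulation steps.
effective_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)
print(f"Effective batch size: {effective_batch_size}")
```

With `transformers` installed, a dictionary like this could be unpacked into `TrainingArguments(output_dir=..., **hparams)`; AdamW is the `Trainer` default optimizer, so it needs no extra setting.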

	## Usage

	### Direct Use with Pipeline

	```python
	from transformers import pipeline

	# Load the model
	sentiment = pipeline("sentiment-analysis", model="shane-reaume/imdb-sentiment-analysis-v2")

	# Analyze a single text
	result = sentiment("I really enjoyed this movie!")
	print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]

	# Batch processing: pass a list of texts, get one result per text
	texts = [
	    "This movie was absolutely amazing, I loved every minute of it!",
	    "The acting was terrible and the plot made no sense at all.",
	]
	results = sentiment(texts)
	for text, result in zip(texts, results):
	    print(f"Text: {text}")
	    print(f"Sentiment: {result['label']}, Score: {result['score']:.4f}")
	```

	### Loading Model Directly

	```python
	from transformers import AutoModelForSequenceClassification, AutoTokenizer
	import torch

	# Load model and tokenizer
	model_name = "shane-reaume/imdb-sentiment-analysis-v2"
	model = AutoModelForSequenceClassification.from_pretrained(model_name)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	# Prepare text
	text = "I really enjoyed this movie!"
	inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

	# Get prediction
	with torch.no_grad():
	    outputs = model(**inputs)

	# Process outputs
	probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
	prediction = torch.argmax(probabilities, dim=-1).item()
	confidence = probabilities[0][prediction].item()

	# Map prediction to label (0: negative, 1: positive)
	sentiment_label = "POSITIVE" if prediction == 1 else "NEGATIVE"
	print(f"Sentiment: {sentiment_label}, Confidence: {confidence:.4f}")
	```
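
To make the output-processing step concrete, here is a small worked example of the same softmax-and-argmax logic in plain Python, with no model download required. The logit values are made up for illustration; real logits come from `outputs.logits`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one input: index 0 = negative, index 1 = positive.
logits = [-2.0, 3.0]

probabilities = softmax(logits)
# Index of the largest probability (the argmax).
prediction = max(range(len(probabilities)), key=probabilities.__getitem__)
confidence = probabilities[prediction]

sentiment_label = "POSITIVE" if prediction == 1 else "NEGATIVE"
print(f"Sentiment: {sentiment_label}, Confidence: {confidence:.4f}")
```

For two classes, the winning probability equals the sigmoid of the difference between the two logits, so only the gap between them affects the confidence.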

	## Limitations

	- The model is trained primarily on movie reviews and may not perform as well on other domains.
	- The model may struggle with certain types of text:
	  - Sarcasm and irony
	  - Mixed sentiment expressions
	  - Subtle negative expressions
	  - Complex negations
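
One possible way to work around these limitations is to treat low-confidence predictions as uncertain and route them for manual review. The sketch below assumes results in the pipeline output format shown in the Usage section. The function name `flag_uncertain`, the 0.75 threshold, and the sample scores are illustrative assumptions; a suitable threshold would need to be tuned on held-out data.

```python
def flag_uncertain(results, threshold=0.75):
    """Split pipeline-style results into confident and uncertain lists.

    Each result is a dict with 'label' and 'score' keys.
    """
    confident, uncertain = [], []
    for item in results:
        if item["score"] >= threshold:
            confident.append(item)
        else:
            uncertain.append(item)
    return confident, uncertain


# Made-up scores for illustration only.
sample_results = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.61},
]

confident, uncertain = flag_uncertain(sample_results)
print(f"{len(confident)} confident, {len(uncertain)} flagged for review")
```

This will not catch inputs where the model is confidently wrong, which can happen with sarcasm, so it complements rather than replaces evaluation on in-domain examples.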

	## Citation

	If you use this model in your research, please cite:

	```
	@misc{sentiment-analysis-model,
	  author = {Your Name},
	  title = {Sentiment Analysis Model based on DistilBERT},
	  year = {2023},
	  publisher = {Hugging Face},
	  howpublished = {\url{https://huggingface.co/shane-reaume/imdb-sentiment-analysis-v2}}
	}
	```