ayushsinha committed
Commit ef4d654 · verified · 1 Parent(s): 88a17cc

Create README.md

Files changed (1)
  1. README.md +85 -0
README.md ADDED
@@ -0,0 +1,85 @@
+ # DistilBERT Base Uncased Quantized Model for Sentiment Analysis
+
+ This repository hosts a Float16-quantized version of DistilBERT Base Uncased, fine-tuned for sentiment analysis on IMDb reviews. Quantization reduces the model size for deployment in resource-constrained environments; measured accuracy is reported under Performance Metrics.
+
+ ## Model Details
+
+ - **Model Architecture:** DistilBERT Base Uncased
+ - **Task:** Sentiment Analysis
+ - **Dataset:** IMDb Reviews
+ - **Quantization:** Float16
+ - **Fine-tuning Framework:** Hugging Face Transformers
+
+ ## Usage
+
+ ### Installation
+
+ ```sh
+ pip install transformers torch
+ ```
+
+ ### Loading the Model
+
+ ```python
+ from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
+ import torch
+
+ model_name = "AventIQ-AI/distilbert-base-uncased-sentiment-analysis"
+ tokenizer = DistilBertTokenizer.from_pretrained(model_name)
+ model = DistilBertForSequenceClassification.from_pretrained(model_name)
+
+ def predict_sentiment(text):
+     # Tokenize the input, truncating to DistilBERT's 512-token limit
+     inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
+     with torch.no_grad():  # gradients are not needed for inference
+         logits = model(**inputs).logits
+     predicted_class_id = torch.argmax(logits, dim=-1).item()
+     # Label mapping used in this README: 1 = Positive, anything else = Negative
+     return "Positive" if predicted_class_id == 1 else "Negative"
+
+ # Test the model with a sample sentence
+ test_text = "I absolutely loved the movie! It was fantastic."
+ print(f"Sentiment: {predict_sentiment(test_text)}")
+ ```
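`predict_sentiment` returns only a label. If a confidence score is also useful, the two logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python so that it runs without downloading the model; the helper name `logits_to_probs` and the example logit values are illustrative assumptions, not outputs of this model.

```python
import math

def logits_to_probs(logits):
    """Convert a list of raw logits into probabilities with a numerically stable softmax."""
    m = max(logits)  # subtract the maximum to avoid overflow in exp()
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits in the order [Negative, Positive]; not produced by the actual model.
example_logits = [-1.0, 1.0]
probs = logits_to_probs(example_logits)
label = "Positive" if probs[1] > probs[0] else "Negative"
print(f"{label} (confidence {max(probs):.2f})")
```

On real model output, `torch.softmax(logits, dim=-1)` performs the same computation directly on the tensor.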
+
+ ## Performance Metrics
+
+ - **Accuracy:** 0.56
+ - **F1 Score:** 0.56
+ - **Precision:** 0.68
+ - **Recall:** 0.56
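The README does not say how these scores were averaged across the two classes. Recall equal to accuracy is what support-weighted averaging always produces, so weighted averaging is a plausible reading, though that is an assumption. The sketch below computes accuracy and weighted precision, recall, and F1 in plain Python on a small made-up label set; the function name and the toy labels are hypothetical and unrelated to the reported numbers.

```python
def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = (tp + fn) / n  # class support as a fraction of all examples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels (1 = Positive, 0 = Negative); made up for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0]
print(weighted_metrics(y_true, y_pred))
```

In this toy case the classifier over-predicts the positive class, which yields a weighted precision above the accuracy while weighted recall matches it exactly.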
+
+ ## Fine-Tuning Details
+
+ ### Dataset
+
+ The model was fine-tuned on the IMDb Reviews dataset, which contains both positive and negative sentiment examples.
+
+ ### Training
+
+ - Number of epochs: 3
+ - Batch size: 16
+ - Evaluation strategy: epoch
+ - Learning rate: 2e-5
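These hyperparameters determine how many optimizer steps the run takes. As a rough worked example, assuming the full 25,000-review IMDb training split was used on a single device with no gradient accumulation (none of which the README states), the step count can be sketched as follows. The function name is illustrative.

```python
import math

def total_training_steps(num_examples, batch_size, num_epochs):
    """Return (steps per epoch, total optimizer steps) for a single-device run
    with no gradient accumulation; a final partial batch counts as one step."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * num_epochs

# Epochs and batch size come from the list above; 25,000 is the size of the
# standard IMDb training split and is an assumption about this particular run.
steps_per_epoch, total_steps = total_training_steps(25_000, batch_size=16, num_epochs=3)
print(steps_per_epoch, total_steps)
```

With evaluation run once per epoch, a run of this shape would include three evaluation passes.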
+
+ ### Quantization
+
+ Post-training quantization to Float16 was applied using PyTorch's built-in quantization framework to reduce the model size and improve inference efficiency.
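To give a sense of the size reduction: Float16 stores each weight in 2 bytes instead of the 4 bytes used by Float32. The arithmetic below assumes roughly 66 million parameters, the commonly cited size of DistilBERT base. The exact count for this checkpoint, including the classification head, is not given in the README, so the figures are an estimate only.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

def weights_size_mb(num_params, dtype):
    """Approximate size of the weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6

# Assumed parameter count for DistilBERT base; not taken from this repository.
approx_params = 66_000_000
for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weights_size_mb(approx_params, dtype):.0f} MB")
```

Under that assumption the stored weights shrink by half; file sizes on disk will differ slightly because of metadata and any tensors kept in another dtype.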
+
+ ## Repository Structure
+
+ ```
+ .
+ ├── model/               # Contains the quantized model files
+ ├── tokenizer_config/    # Tokenizer configuration and vocabulary files
+ ├── model.safetensors    # Fine-tuned model weights
+ └── README.md            # Model documentation
+ ```
+
+ ## Limitations
+
+ - The model may not generalize well to domains outside the fine-tuning dataset.
+ - Quantization may result in minor accuracy degradation compared to full-precision models.
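The degradation mentioned in the second point comes from rounding: Float16 keeps only about three significant decimal digits. A small standard-library sketch (the helper name is illustrative) shows the round-trip error for a single value. It illustrates the number format only and says nothing about this model's measured accuracy.

```python
import struct

def to_float16_and_back(x):
    """Round a Python float to IEEE 754 half precision and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

value = 0.1
rounded = to_float16_and_back(value)
print(f"original={value!r} float16={rounded!r} abs_error={abs(value - rounded):.2e}")
```

Values that are exactly representable in half precision, such as 0.5, survive the round trip unchanged; most weights do not, and the small per-weight errors can add up across layers.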
+
+ ## Contributing
+
+ Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
+