Sengil committed on
Commit
c43097f
·
verified ·
1 Parent(s): 3bf4fed

Create README.md

Files changed (1)
  1. README.md +129 -0
README.md ADDED
@@ -0,0 +1,129 @@
1
+ ---
2
+ library_name: tensorflow
3
+ tags:
4
+ - sentiment-analysis
5
+ - aspect-based-sentiment-analysis
6
+ - tensorflow
7
+ - keras
8
+ language:
9
+ - tr
10
+ metrics:
11
+ - accuracy
12
+ pipeline_tag: text-classification
13
+ datasets:
14
+ - Sengil/Turkish-ABSA-Wsynthetic
15
+ ---
16
+
17
+
18
+ # 🇹🇷 Turkish Aspect-Based Sentiment Analysis (ABSA) – BiLSTM + Word2Vec
19
+
20
+ This model performs aspect-based sentiment analysis (ABSA) on Turkish sentences. Given a sentence and a specific aspect, it predicts the sentiment polarity (Negative, Neutral, Positive) associated with that aspect.
21
+
22
+ ## 🧠 Model Details
23
+
24
+ - **Model Type:** BiLSTM (Bidirectional Long Short-Term Memory) + Word2Vec
25
+ - **Developer:** [Sengil](https://huggingface.co/Sengil)
26
+ - **Library:** Keras
27
+ - **Input Format:** `"Sentence [ASP] Aspect"`
28
+ - **Labels:** 0 = Negative, 1 = Neutral, 2 = Positive
29
+ - **Training Dataset:** [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic)
30
+
31
+ ## 📊 Evaluation Results
32
+
33
+ The model achieved the following performance on the test set:
34
+
35
+ | Class | Precision | Recall | F1-Score | Support |
36
+ |----------|-----------|--------|----------|---------|
37
+ | Negative | 0.89 | 0.91 | 0.90 | 896 |
38
+ | Neutral | 0.70 | 0.64 | 0.67 | 140 |
39
+ | Positive | 0.92 | 0.92 | 0.92 | 1178 |
40
+ | **Overall** | | | **0.90** | 2214 |
41
+
42
+ - **Overall Accuracy:** 90%
43
+ - **Macro-Averaged F1-Score:** 83%
44
+ - **Weighted-Averaged F1-Score:** 90%
45
+
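The macro and weighted averages can be cross-checked against the per-class rows of the table. The snippet below is only an illustrative sketch: it recomputes both figures from the rounded F1 scores and support counts listed above, so it agrees with the reported values only up to rounding, and its variable names are not part of the model's code.

```python
# Per-class F1 scores and support counts copied from the evaluation table.
f1 = {"Negative": 0.90, "Neutral": 0.67, "Positive": 0.92}
support = {"Negative": 896, "Neutral": 140, "Positive": 1178}

total = sum(support.values())

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total

print(f"Total support: {total}")
print(f"Macro F1: {macro_f1:.2f}")
print(f"Weighted F1: {weighted_f1:.2f}")
```

The gap between the two averages comes from the small Neutral class: it pulls the macro average down but carries little weight in the weighted one.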
46
+ ## 🚀 Usage Example
47
+
48
+ ```python
49
+ import pickle
50
+ import numpy as np
51
+ from tensorflow.keras.models import load_model
52
+ from tensorflow.keras.preprocessing.sequence import pad_sequences
53
+
54
+ # Load the model and tokenizer
55
+ model = load_model("absa_bilstm_model.keras")
56
+ with open("tokenizer.pkl", "rb") as f:
57
+     tokenizer = pickle.load(f)
58
+
59
+ # Maximum sentence length used during training
60
+ max_len = 84 # Adjust this value based on your training configuration
61
+
62
+ # Prediction function
63
+ def predict_sentiment(sentence, aspect):
64
+     input_text = f"{sentence} [ASP] {aspect}"
65
+     sequence = tokenizer.texts_to_sequences([input_text])
66
+     padded_sequence = pad_sequences(sequence, maxlen=max_len, padding='post')
67
+     prediction = model.predict(padded_sequence)
68
+     label = np.argmax(prediction, axis=1)[0]
69
+     labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
70
+     return labels[label]
71
+
72
+ # Example usage
73
+ sentence = "Manzara şahane evet ama servis rezalet."
74
+ aspect = "Servis"
75
+ print(f"Sentiment for '{aspect}': {predict_sentiment(sentence, aspect)}")
76
+ ```
77
+
78
+ ## 🏋️‍♀️ Training Details
79
+
80
+ * **Embedding:** Word2Vec (dimension: 100)
81
+ * **Model Architecture:**
82
+
83
+   * Embedding layer (initialized with pre-trained Word2Vec weights)
84
+   * 2 x BiLSTM layers (each with 100 units, dropout: 0.3)
85
+   * Conv1D layer (100 filters, kernel size: 5)
86
+   * Global Max Pooling
87
+   * Dense layer (16 units, ReLU activation)
88
+   * Output layer (3 units, softmax activation)
89
+ * **Training Parameters:**
90
+
91
+   * Loss Function: `sparse_categorical_crossentropy`
92
+   * Optimizer: Adam
93
+   * Epochs: 35 (with early stopping)
94
+   * Batch Size: 128
95
+   * Learning Rate: 1e-3 (adjusted dynamically with ReduceLROnPlateau)
96
+
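The layer stack and training parameters listed above could be assembled roughly as follows. This is a minimal sketch, not the author's training script: the function name `build_absa_model`, the `max_len=84` default (taken from the usage example), the placement of the 0.3 dropout inside the LSTM layers, the ReLU activation on the convolution, and all callback settings are assumptions made for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_absa_model(vocab_size, embedding_matrix=None, max_len=84, embedding_dim=100):
    """Assemble the listed layer stack (details marked below are assumptions)."""
    # Use a pre-trained Word2Vec matrix if supplied, otherwise random initialization.
    if embedding_matrix is not None:
        initializer = keras.initializers.Constant(embedding_matrix)
    else:
        initializer = "uniform"

    inputs = keras.Input(shape=(max_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embedding_dim,
                         embeddings_initializer=initializer)(inputs)
    # Assumption: dropout 0.3 is applied as the LSTM input dropout;
    # it could equally be a separate Dropout layer.
    x = layers.Bidirectional(layers.LSTM(100, return_sequences=True, dropout=0.3))(x)
    x = layers.Bidirectional(layers.LSTM(100, return_sequences=True, dropout=0.3))(x)
    # Assumption: ReLU activation on the convolution (not stated in the list).
    x = layers.Conv1D(100, 5, activation="relu")(x)
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dense(16, activation="relu")(x)
    outputs = layers.Dense(3, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Assumption: monitored quantity, patience, and factor are placeholder values.
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]
```

A call such as `model.fit(X_train, y_train, epochs=35, batch_size=128, validation_split=0.1, callbacks=callbacks)` would match the listed epoch count and batch size; `X_train`, `y_train`, and the validation split are hypothetical here.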
97
+ ## 📚 Training Data
98
+
99
+ The model was trained on the [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic) dataset, which comprises semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis, particularly in the restaurant domain.
100
+
101
+ ## ⚠️ Limitations
102
+
103
+ * Performance on the Neutral class (F1 0.67) is lower than on the Negative and Positive classes (0.90 and 0.92), possibly due to class imbalance in the training data.
104
+ * The model may struggle with rare or ambiguous aspects not well represented in the training set.
105
+ * Complex sentence structures or ironic expressions may affect the model's accuracy.
106
+
107
+ ## 📄 Citation
108
+
109
+ ```
110
+ @misc{turkish_absa_bilstm_word2vec,
111
+ title = {Turkish Aspect-Based Sentiment Analysis using BiLSTM + Word2Vec},
112
+ author = {Sengil},
113
+ year = {2025},
114
+ url = {https://huggingface.co/Sengil/Turkish-ABSA-BiLSTM-Word2Vec}
115
+ }
116
+ ```
117
+
118
+ ## 📬 Contact
119
+
120
+ For questions or feedback, please reach out via the [Hugging Face profile](https://huggingface.co/Sengil).
121
+