---
library_name: tensorflow
tags:
- sentiment-analysis
- aspect-based-sentiment-analysis
- tensorflow
- keras
language:
- tr
metrics:
- accuracy
pipeline_tag: text-classification
datasets:
- Sengil/Turkish-ABSA-Wsynthetic
---


# 🇹🇷 Turkish Aspect-Based Sentiment Analysis (ABSA) – BiLSTM + Word2Vec

This model performs aspect-based sentiment analysis (ABSA) on Turkish sentences. Given a sentence and a specific aspect, it predicts the sentiment polarity (Negative, Neutral, Positive) associated with that aspect.

## 🧠 Model Details

- **Model Type:** BiLSTM (Bidirectional Long Short-Term Memory) + Word2Vec
- **Developer:** [Sengil](https://huggingface.co/Sengil)
- **Library:** Keras
- **Input Format:** `"Sentence [ASP] Aspect"`
- **Labels:** 0 = Negative, 1 = Neutral, 2 = Positive
- **Training Dataset:** [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic)

## 📊 Evaluation Results

The model achieved the following performance on the test set:

| Class    | Precision | Recall | F1-Score | Support |
|----------|-----------|--------|----------|---------|
| Negative | 0.89      | 0.91   | 0.90     | 896     |
| Neutral  | 0.70      | 0.64   | 0.67     | 140     |
| Positive | 0.92      | 0.92   | 0.92     | 1178    |
| **Overall** |           |        | **0.90** | 2214    |

- **Overall Accuracy:** 90%
- **Macro-Averaged F1-Score:** 83%
- **Weighted-Averaged F1-Score:** 90%

## 🚀 Usage Example

Download the model and tokenizer from the Hugging Face Hub:
```python
from huggingface_hub import hf_hub_download
import pickle
from tensorflow.keras.models import load_model

model_path = hf_hub_download(repo_id="Sengil/Turkish-ABSA-BiLSTM-Word2Vec", filename="absa_bilstm_model.keras")
tokenizer_path = hf_hub_download(repo_id="Sengil/Turkish-ABSA-BiLSTM-Word2Vec", filename="tokenizer.pkl")

# load model
model = load_model(model_path)

# load tokenizer
with open(tokenizer_path, "rb") as f:
    tokenizer = pickle.load(f)
```

Preprocess the input:
```python
import re

def preprocess_turkish(text):
    text = text.lower()
    # Replace URLs and @mentions with placeholder tokens
    text = re.sub(r"http\S+|www\S+|https\S+", "<url>", text)
    text = re.sub(r"@\w+", "<user>", text)
    # Keep only alphanumerics, Turkish letters, and whitespace (note: this also
    # strips the angle brackets, leaving bare "url"/"user" tokens)
    text = re.sub(r"[^a-zA-Z0-9çğıöşüÇĞİÖŞÜ\s]", " ", text)
    # Squeeze runs of 3+ identical characters down to 2 (e.g. "çoook" -> "çook")
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Collapse whitespace
    text = re.sub(r"\s+", " ", text).strip()
    return text
```
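For example, the cleaning steps normalize a noisy review as follows (illustrative input, not taken from the dataset):

```python
print(preprocess_turkish("Yemekler çoook güzeldi!!! @garson https://ornek.com"))
# -> "yemekler çook güzeldi user url"
```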

Predict the sentiment of a (sentence, aspect) pair:
```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def predict_sentiment(sentence, aspect, max_len=84):
    # Build the "Sentence [ASP] Aspect" input and clean it
    input_text = sentence + " [ASP] " + aspect
    cleaned = preprocess_turkish(input_text)
    # Tokenize and pad to the fixed sequence length
    tokenized = tokenizer.texts_to_sequences([cleaned])
    padded = pad_sequences(tokenized, maxlen=max_len, padding='post')

    pred = model.predict(padded)
    label = np.argmax(pred)
    labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
    return labels[label]
```

Run an example:
```python
# "The view is wonderful, yes, but the service is terrible."
sentence = "Manzara sahane evet ama servis rezalet."
aspect = "manzara"

prediction = predict_sentiment(sentence, aspect)
print("prediction:", prediction)
```
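To score many (sentence, aspect) pairs efficiently, a single batched `model.predict` call avoids per-sample overhead. A minimal sketch, reusing `model`, `tokenizer`, and `preprocess_turkish` from above (`predict_batch` is a hypothetical helper, not part of the repository):

```python
def predict_batch(pairs, max_len=84):
    # One forward pass over all (sentence, aspect) pairs
    texts = [preprocess_turkish(s + " [ASP] " + a) for s, a in pairs]
    padded = pad_sequences(tokenizer.texts_to_sequences(texts),
                           maxlen=max_len, padding='post')
    preds = model.predict(padded, verbose=0)
    labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
    return [labels[i] for i in np.argmax(preds, axis=1)]

pairs = [
    ("Manzara sahane evet ama servis rezalet.", "manzara"),
    ("Manzara sahane evet ama servis rezalet.", "servis"),
]
print(predict_batch(pairs))  # one label per pair, e.g. ['Positive', 'Negative']
```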

## 🏋️‍♀️ Training Details

* **Embedding:** Word2Vec (dimension: 100)
* **Model Architecture** (see the sketch after this list):

  * Embedding layer (initialized with pre-trained Word2Vec weights)
  * 2 x BiLSTM layers (each with 100 units, dropout: 0.3)
  * Conv1D layer (100 filters, kernel size: 5)
  * Global Max Pooling
  * Dense layer (16 units, ReLU activation)
  * Output layer (3 units, softmax activation)
* **Training Parameters:**

  * Loss Function: `sparse_categorical_crossentropy`
  * Optimizer: Adam
  * Epochs: 35 (with early stopping)
  * Batch Size: 128
  * Learning Rate: 1e-3 (adjusted dynamically with ReduceLROnPlateau)
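
For reference, here is a minimal Keras sketch of the stack described above. The vocabulary size and embedding matrix are placeholders (in practice they come from the fitted tokenizer and the trained Word2Vec model), so this illustrates the layout rather than reproducing the exact training script:

```python
import numpy as np
from tensorflow.keras import layers, models, optimizers, initializers

VOCAB_SIZE = 20_000  # placeholder; the actual value comes from the fitted tokenizer
MAX_LEN = 84         # padding length used in the usage example
EMB_DIM = 100        # Word2Vec dimension stated above
embedding_matrix = np.zeros((VOCAB_SIZE, EMB_DIM))  # would hold Word2Vec weights

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, EMB_DIM,
                     embeddings_initializer=initializers.Constant(embedding_matrix)),
    layers.Bidirectional(layers.LSTM(100, return_sequences=True, dropout=0.3)),
    layers.Bidirectional(layers.LSTM(100, return_sequences=True, dropout=0.3)),
    layers.Conv1D(100, kernel_size=5, activation='relu'),
    layers.GlobalMaxPooling1D(),
    layers.Dense(16, activation='relu'),
    layers.Dense(3, activation='softmax'),
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```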

## 📚 Training Data

The model was trained on the [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic) dataset, which comprises semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis, particularly in the restaurant domain.

## ⚠️ Limitations

* Performance on the Neutral class is lower than on the other classes, likely due to class imbalance in the training data.
* The model may struggle with rare or ambiguous aspects that are poorly represented in the training set.
* Complex sentence structures and ironic or sarcastic expressions may reduce the model's accuracy.

## 📄 Citation

```bibtex
@misc{turkish_absa_bilstm_word2vec,
  title  = {Turkish Aspect-Based Sentiment Analysis using BiLSTM + Word2Vec},
  author = {Sengil},
  year   = {2025},
  url    = {https://huggingface.co/Sengil/Turkish-ABSA-BiLSTM-Word2Vec}
}
```

## 📬 Contact

For questions or feedback, please reach out via [Hugging Face profile](https://huggingface.co/Sengil).