---
library_name: tensorflow
tags:
- sentiment-analysis
- aspect-based-sentiment-analysis
- tensorflow
- keras
language:
- tr
metrics:
- accuracy
pipeline_tag: text-classification
datasets:
- Sengil/Turkish-ABSA-Wsynthetic
---
# 🇹🇷 Turkish Aspect-Based Sentiment Analysis (ABSA) – BiLSTM + Word2Vec
This model performs aspect-based sentiment analysis (ABSA) on Turkish sentences. Given a sentence and a specific aspect, it predicts the sentiment polarity (Negative, Neutral, Positive) associated with that aspect.
## 🧠 Model Details
- **Model Type:** BiLSTM (Bidirectional Long Short-Term Memory) + Word2Vec
- **Developer:** [Sengil](https://huggingface.co/Sengil)
- **Library:** Keras
- **Input Format:** `"Sentence [ASP] Aspect"`
- **Labels:** 0 = Negative, 1 = Neutral, 2 = Positive
- **Training Dataset:** [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic)
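
The input format above can be sketched with a small helper. The function name `build_model_input` and the whitespace stripping are illustrative assumptions, not part of the released code; the point is only that one sentence can be paired with several aspects, each producing its own input string.

```python
# Hypothetical helper (not from the model repository): joins a sentence and an
# aspect with the "[ASP]" marker described in the "Input Format" entry.
ASPECT_SEPARATOR = " [ASP] "


def build_model_input(sentence: str, aspect: str) -> str:
    """Return the 'Sentence [ASP] Aspect' string for one sentence/aspect pair."""
    # Stripping surrounding whitespace is an assumption made for tidiness.
    return sentence.strip() + ASPECT_SEPARATOR + aspect.strip()


sentence = "Manzara sahane evet ama servis rezalet."
# The same sentence is queried once per aspect of interest.
for aspect in ["manzara", "servis"]:
    print(build_model_input(sentence, aspect))
```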
## 📊 Evaluation Results
The model achieved the following performance on the test set:
| Class | Precision | Recall | F1-Score | Support |
|----------|-----------|--------|----------|---------|
| Negative | 0.89 | 0.91 | 0.90 | 896 |
| Neutral | 0.70 | 0.64 | 0.67 | 140 |
| Positive | 0.92 | 0.92 | 0.92 | 1178 |
| **Overall** | | | **0.90** | 2214 |
- **Overall Accuracy:** 90%
- **Macro-Averaged F1-Score:** 83%
- **Weighted-Averaged F1-Score:** 90%
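
As a quick consistency check, the macro and weighted averages can be recomputed from the per-class rows of the table. This is a sketch that uses only the rounded values shown above, so the results agree with the reported figures only to two decimals.

```python
# Per-class F1 and support copied from the evaluation table above.
report = {
    "Negative": {"f1": 0.90, "support": 896},
    "Neutral": {"f1": 0.67, "support": 140},
    "Positive": {"f1": 0.92, "support": 1178},
}

total_support = sum(row["support"] for row in report.values())

# Macro average: every class counts equally, regardless of its size.
macro_f1 = sum(row["f1"] for row in report.values()) / len(report)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(row["f1"] * row["support"] for row in report.values()) / total_support

print(total_support)          # 2214
print(round(macro_f1, 2))     # 0.83
print(round(weighted_f1, 2))  # 0.9
```

The gap between the two averages comes from the Neutral class: it has the lowest F1 but only 140 of the 2,214 test examples, so it pulls the macro average down much more than the weighted one.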
## 🚀 Usage Example
Download the model and tokenizer from the Hugging Face Hub:
```python
import pickle

from huggingface_hub import hf_hub_download
from tensorflow.keras.models import load_model

# Fetch the model and tokenizer files from the Hub (cached locally after the first call)
model_path = hf_hub_download(repo_id="Sengil/Turkish-ABSA-BiLSTM-Word2Vec", filename="absa_bilstm_model.keras")
tokenizer_path = hf_hub_download(repo_id="Sengil/Turkish-ABSA-BiLSTM-Word2Vec", filename="tokenizer.pkl")

# Load the Keras model
model = load_model(model_path)

# Load the fitted tokenizer (only unpickle files from a source you trust)
with open(tokenizer_path, "rb") as f:
    tokenizer = pickle.load(f)
```
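Preprocess the input:
```python
import re

import nltk

# Kept from the original setup; preprocess_turkish itself does not use NLTK
nltk.download('punkt')


def preprocess_turkish(text):
    """Normalize raw Turkish text before tokenization."""
    text = text.lower()
    # Replace URLs and user mentions with placeholder tags
    text = re.sub(r"http\S+|www\S+|https\S+", "<url>", text)
    text = re.sub(r"@\w+", "<user>", text)
    # Replace everything except letters (including Turkish ones), digits and whitespace
    text = re.sub(r"[^a-zA-Z0-9çğıöşüÇĞİÖŞÜ\s]", " ", text)
    # Collapse any character repeated three or more times down to two
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Collapse runs of whitespace and trim the ends
    text = re.sub(r"\s+", " ", text).strip()
    return text
```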
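
To make the effect of these steps concrete, the sketch below repeats the same regular expressions (without the NLTK download, which the function does not need) and runs them on a made-up review. The sample sentence, user name and URL are invented for illustration.

```python
import re


def preprocess_turkish(text):
    # Same steps as the preprocessing function in this card.
    text = text.lower()
    text = re.sub(r"http\S+|www\S+|https\S+", "<url>", text)
    text = re.sub(r"@\w+", "<user>", text)
    text = re.sub(r"[^a-zA-Z0-9çğıöşüÇĞİÖŞÜ\s]", " ", text)
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text


sample = "Yemekler çoook güzeldi!!! @ahmet https://example.com [ASP] yemek"
print(preprocess_turkish(sample))
# yemekler çook güzeldi user url asp yemek

# Python lowercases "İ" to "i" plus a combining dot (U+0307); the combining dot
# is not in the allowed character set, so it is replaced by a space.
print(preprocess_turkish("İyi"))
# i yi
```

Two things follow from the order of the steps: the punctuation filter runs after the placeholder substitution, so `<url>`, `<user>` and the `[ASP]` marker reach the tokenizer as the plain tokens `url`, `user` and `asp`; and words starting with a capital `İ` are split after lowercasing. Inputs at inference time should go through exactly the same function so that they match what the tokenizer was fitted on.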
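Predict the sentiment for a sentence and aspect:
```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences


def predict_sentiment(sentence, aspect, max_len=84):
    """Return the predicted sentiment label for `aspect` within `sentence`."""
    # Build the "Sentence [ASP] Aspect" input and clean it
    input_text = sentence + " [ASP] " + aspect
    cleaned = preprocess_turkish(input_text)
    # Convert to token ids and pad to the fixed input length
    tokenized = tokenizer.texts_to_sequences([cleaned])
    padded = pad_sequences(tokenized, maxlen=max_len, padding='post')
    # pred has shape (1, 3): one probability per class
    pred = model.predict(padded)
    label = int(np.argmax(pred, axis=-1)[0])
    # Turkish label names: Negative, Neutral, Positive
    labels = {0: "Negatif", 1: "Nötr", 2: "Pozitif"}
    return labels[label]
```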
Run a prediction:
```python
sentence = "Manzara sahane evet ama servis rezalet."
aspect = "manzara"

prediction = predict_sentiment(sentence, aspect)
print("prediction:", prediction)
```
## 🏋️♀️ Training Details
* **Embedding:** Word2Vec (dimension: 100)
* **Model Architecture:**
  * Embedding layer (initialized with pre-trained Word2Vec weights)
  * 2 × BiLSTM layers (each with 100 units, dropout: 0.3)
  * Conv1D layer (100 filters, kernel size: 5)
  * Global Max Pooling
  * Dense layer (16 units, ReLU activation)
  * Output layer (3 units, softmax activation)
* **Training Parameters:**
  * Loss Function: `sparse_categorical_crossentropy`
  * Optimizer: Adam
  * Epochs: 35 (with early stopping)
  * Batch Size: 128
  * Learning Rate: 1e-3 (adjusted dynamically with ReduceLROnPlateau)
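
The architecture and training parameters listed above could be expressed in Keras roughly as follows. This is a sketch under stated assumptions, not the author's training script: the function name `build_absa_model`, the vocabulary size, the use of the LSTM `dropout` argument (rather than separate Dropout layers), the Conv1D activation, and the callback settings (`monitor`, `patience`, `factor`) are all guesses; the input length of 84 is taken from the usage example.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_absa_model(vocab_size, max_len=84, embedding_dim=100, embedding_matrix=None):
    """Sketch of the listed layers; hyperparameters not in the card are assumptions."""
    # Optionally start from pre-trained Word2Vec vectors of shape (vocab_size, embedding_dim)
    if embedding_matrix is not None:
        init = keras.initializers.Constant(embedding_matrix)
    else:
        init = "uniform"

    inputs = keras.Input(shape=(max_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embedding_dim, embeddings_initializer=init)(inputs)
    # Two stacked BiLSTM layers; both return sequences so Conv1D sees every time step
    for _ in range(2):
        x = layers.Bidirectional(layers.LSTM(100, return_sequences=True, dropout=0.3))(x)
    x = layers.Conv1D(filters=100, kernel_size=5, activation="relu")(x)  # activation assumed
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dense(16, activation="relu")(x)
    outputs = layers.Dense(3, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Callback settings below are assumed values, shown only to illustrate the mechanism
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

model = build_absa_model(vocab_size=20000)  # vocabulary size is a placeholder
model.summary()
```

Training would then call `model.fit(..., epochs=35, batch_size=128, callbacks=callbacks)` with integer labels 0, 1 and 2, which is what `sparse_categorical_crossentropy` expects.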
## 📚 Training Data
The model was trained on the [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic) dataset, which comprises semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis, particularly in the restaurant domain.
## ⚠️ Limitations
* Performance on the Neutral class (F1 0.67) is clearly lower than on the Negative (0.90) and Positive (0.92) classes, possibly due to class imbalance in the training data.
* The model may struggle with rare or ambiguous aspects not well represented in the training set.
* Complex sentence structures or ironic expressions may affect the model's accuracy.
## 📄 Citation
```
@misc{turkish_absa_bilstm_word2vec,
title = {Turkish Aspect-Based Sentiment Analysis using BiLSTM + Word2Vec},
author = {Sengil},
year = {2025},
url = {https://huggingface.co/Sengil/Turkish-ABSA-BiLSTM-Word2Vec}
}
```
## 📬 Contact
For questions or feedback, please reach out via the developer's [Hugging Face profile](https://huggingface.co/Sengil).