---
library_name: tensorflow
tags:
- sentiment-analysis
- aspect-based-sentiment-analysis
- tensorflow
- keras
language:
- tr
metrics:
- accuracy
pipeline_tag: text-classification
datasets:
- Sengil/Turkish-ABSA-Wsynthetic
---

# 🇹🇷 Turkish Aspect-Based Sentiment Analysis (ABSA) – BiLSTM + Word2Vec

This model performs aspect-based sentiment analysis (ABSA) on Turkish sentences. Given a sentence and a specific aspect, it predicts the sentiment polarity (Negative, Neutral, Positive) associated with that aspect.

## 🧠 Model Details

- **Model Type:** BiLSTM (Bidirectional Long Short-Term Memory) + Word2Vec
- **Developer:** [Sengil](https://huggingface.co/Sengil)
- **Library:** Keras
- **Input Format:** `"Sentence [ASP] Aspect"`
- **Labels:** 0 = Negative, 1 = Neutral, 2 = Positive
- **Training Dataset:** [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic)

## 📊 Evaluation Results

The model achieved the following performance on the test set:

| Class       | Precision | Recall | F1-Score | Support |
|-------------|-----------|--------|----------|---------|
| Negative    | 0.89      | 0.91   | 0.90     | 896     |
| Neutral     | 0.70      | 0.64   | 0.67     | 140     |
| Positive    | 0.92      | 0.92   | 0.92     | 1178    |
| **Overall** |           |        | **0.90** | 2214    |

- **Overall Accuracy:** 90%
- **Macro-Averaged F1-Score:** 83%
- **Weighted-Averaged F1-Score:** 90%

## 🚀 Usage Example

Download the model and tokenizer from the Hugging Face Hub:

```python
from huggingface_hub import hf_hub_download
import pickle
from tensorflow.keras.models import load_model

model_path = hf_hub_download(repo_id="Sengil/Turkish-ABSA-BiLSTM-Word2Vec", filename="absa_bilstm_model.keras")
tokenizer_path = hf_hub_download(repo_id="Sengil/Turkish-ABSA-BiLSTM-Word2Vec", filename="tokenizer.pkl")

# load model
model = load_model(model_path)

# load tokenizer
with open(tokenizer_path, "rb") as f:
    tokenizer = pickle.load(f)
```

Preprocess the input:

```python
import re
import nltk

nltk.download('punkt')

def preprocess_turkish(text):
    text = text.lower()
    # remove URLs and @mentions
    text = re.sub(r"http\S+|www\S+|https\S+", "", text)
    text = re.sub(r"@\w+", "", text)
    # replace everything except letters, digits, and whitespace with a space
    # (this also turns the "[ASP]" separator into the plain token "asp")
    text = re.sub(r"[^a-zA-Z0-9çğıöşüÇĞİÖŞÜ\s]", " ", text)
    # collapse runs of three or more identical characters down to two
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # normalize whitespace
    text = re.sub(r"\s+", " ", text).strip()
    return text
```

Predict the sentiment for a sentence and aspect:

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def predict_sentiment(sentence, aspect, max_len=84):
    # build the "Sentence [ASP] Aspect" input expected by the model
    input_text = sentence + " [ASP] " + aspect
    cleaned = preprocess_turkish(input_text)
    tokenized = tokenizer.texts_to_sequences([cleaned])
    padded = pad_sequences(tokenized, maxlen=max_len, padding='post')
    pred = model.predict(padded)
    # pred has shape (1, 3); pick the most probable class for the single input
    label = int(np.argmax(pred[0]))
    labels = {0: "Negatif", 1: "Nötr", 2: "Pozitif"}
    return labels[label]
```

Run a prediction:

```python
# "The view is great, yes, but the service is terrible."
sentence = "Manzara sahane evet ama servis rezalet."
aspect = "manzara"
predict = predict_sentiment(sentence, aspect)
print("predict:", predict)
```

## 🏋️‍♀️ Training Details

* **Embedding:** Word2Vec (dimension: 100)
* **Model Architecture:**
  * Embedding layer (initialized with pre-trained Word2Vec weights)
  * 2 x BiLSTM layers (each with 100 units, dropout: 0.3)
  * Conv1D layer (100 filters, kernel size: 5)
  * Global Max Pooling
  * Dense layer (16 units, ReLU activation)
  * Output layer (3 units, softmax activation)
* **Training Parameters:**
  * Loss Function: `sparse_categorical_crossentropy`
  * Optimizer: Adam
  * Epochs: 35 (with early stopping)
  * Batch Size: 128
  * Learning Rate: 1e-3 (adjusted dynamically with ReduceLROnPlateau)

## 📚 Training Data

The model was trained on the [Sengil/Turkish-ABSA-Wsynthetic](https://huggingface.co/datasets/Sengil/Turkish-ABSA-Wsynthetic) dataset, which comprises semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis, particularly in the restaurant domain.

## ⚠️ Limitations

* Performance on the Neutral class is lower than on the other classes, possibly due to class imbalance in the training data.
* The model may struggle with rare or ambiguous aspects not well represented in the training set.
* Complex sentence structures or ironic expressions may affect the model's accuracy.

## 📄 Citation

```
@misc{turkish_absa_bilstm_word2vec,
  title  = {Turkish Aspect-Based Sentiment Analysis using BiLSTM + Word2Vec},
  author = {Sengil},
  year   = {2025},
  url    = {https://huggingface.co/Sengil/Turkish-ABSA-BiLSTM-Word2Vec}
}
```

## 📬 Contact

For questions or feedback, please reach out via [Hugging Face profile](https://huggingface.co/Sengil).
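
## 🔎 Checking the Averaged F1 Scores

The macro and weighted F1 averages reported under Evaluation Results can be approximately reproduced from the per-class rows of the results table. The snippet below is a minimal sketch and is not part of the original evaluation code: the names `per_class`, `macro_f1`, and `weighted_f1` are illustrative, and because it starts from scores already rounded to two decimals, its results only agree with the reported figures after rounding.

```python
# Per-class F1 and support values copied from the Evaluation Results table.
per_class = {
    "Negative": {"f1": 0.90, "support": 896},
    "Neutral": {"f1": 0.67, "support": 140},
    "Positive": {"f1": 0.92, "support": 1178},
}

total_support = sum(row["support"] for row in per_class.values())

# Macro average: every class counts equally, regardless of its size.
macro_f1 = sum(row["f1"] for row in per_class.values()) / len(per_class)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = (
    sum(row["f1"] * row["support"] for row in per_class.values()) / total_support
)

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.2f}")
print(f"weighted F1:   {weighted_f1:.2f}")
```

The gap between the two averages comes from the Neutral class: its lower F1 pulls the macro average down, while its small support (140 of 2214 examples) gives it little influence on the weighted average.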