Unggah-Ungguh-Javanese-LSTM-Classifier
Unggah-Ungguh-Javanese-LSTM-Classifier is a CNN-BiLSTM model for Javanese honorific level classification. This model is part of the Unggah-Ungguh project and serves as a strong non-transformer baseline for the task introduced in the paper "Do Language Models Understand Honorific Systems in Javanese?".
Model description
- Model type: Convolutional + Bidirectional LSTM classifier
- Language: Javanese
- License: CC-BY-NC 4.0
- Framework: Keras (TensorFlow backend)
- Training data: A curated dataset of Javanese sentences annotated with honorific labels
Model Sources
- Project Page: https://javanesehonorifics.github.io/
- Repository: https://github.com/JavaneseHonorifics
- Paper: https://arxiv.org/abs/2502.20864
Using the model
from Baseline_LSTM import Config, build_model
import tensorflow as tf
import json
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Load tokenizer
with open("tokenizer.json", "r", encoding="utf-8") as f:
    tokenizer_data = json.load(f)
# Use the "word_index" mapping if present; otherwise treat the file itself as the mapping
word_index = tokenizer_data.get("word_index", tokenizer_data)
config = Config()
# Tokenize example sentence
text = "Mbak Srini mangan pecel ajange pincuk"
tokens = [word_index.get(word, word_index.get("<unk>", 1)) for word in text.split()]
tokens_padded = pad_sequences([tokens], maxlen=config.MAX_LEN, padding='post')
# Load model
model = build_model(config)
model.load_weights("baseline_lstm_model.h5")
# Predict
prediction = model.predict(tokens_padded)
label = prediction.argmax(axis=1)[0]
print("Predicted class:", label)
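The snippet above prints only a class index. As a minimal sketch of the steps around `model.predict`, the pure-Python helpers below show roughly what the encoding step does and how an index could be turned into a readable label. The names `encode`, `decode`, and `ASSUMED_LABELS` are hypothetical, and the label names and their order are an assumption: this card does not state the index-to-label mapping, so check the training code before relying on it. The `encode` helper mirrors `pad_sequences(..., padding='post')` with its default `truncating='pre'`, which keeps the last `maxlen` tokens of an over-long sentence.

```python
# ASSUMPTION: label names and their order are illustrative only; the real
# index-to-label mapping is not given in this card.
ASSUMED_LABELS = ["Ngoko", "Ngoko Alus", "Krama", "Krama Alus"]


def encode(text, word_index, max_len, unk_id=1, pad_id=0):
    """Map whitespace tokens to ids, then truncate and post-pad to max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    fallback = word_index.get("<unk>", unk_id)
    ids = [word_index.get(word, fallback) for word in text.split()]
    # Keep the last max_len tokens, as pad_sequences does by default.
    ids = ids[-max_len:]
    # Append pad ids on the right (padding='post').
    return ids + [pad_id] * (max_len - len(ids))


def decode(probabilities, labels=ASSUMED_LABELS):
    """Return a (label, score) pair for each row of class scores."""
    results = []
    for row in probabilities:
        best = max(range(len(row)), key=lambda i: row[i])
        results.append((labels[best], float(row[best])))
    return results


# Toy vocabulary for illustration; the real one comes from tokenizer.json.
toy_index = {"<unk>": 1, "mangan": 2, "pecel": 3}
print(encode("Srini mangan pecel", toy_index, max_len=5))  # [1, 2, 3, 0, 0]
print(decode([[0.1, 0.2, 0.6, 0.1]]))  # [('Krama', 0.6)]
```

With the real model, `decode(prediction)` could be applied directly to the array returned by `model.predict`, provided the label list matches the order used during training.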
License and Use
Unggah-Ungguh is licensed under CC-BY-NC 4.0.
Citation
@article{farhansyah2025language,
title={Do Language Models Understand Honorific Systems in Javanese?},
author={Farhansyah, Mohammad Rifqi and Darmawan, Iwan and Kusumawardhana, Adryan and Winata, Genta Indra and Aji, Alham Fikri and Wijaya, Derry Tanti},
journal={arXiv preprint arXiv:2502.20864},
year={2025}
}