JavaneseHonorifics
/

Unggah-Ungguh-Javanese-Distilbert-Classifier

Text Classification

rifqifarhansyah committed on May 26

Commit 52ec92b (verified) · 1 Parent(s): a9e828a

Update README.md

Files changed (1)

README.md +56 -3

@@ -1,3 +1,56 @@
----
-license: cc-by-nc-4.0
----

+---
+license: cc-by-nc-4.0
+language:
+- jv
+datasets:
+- JavaneseHonorifics/Unggah-Ungguh
+base_model:
+- w11wo/javanese-distilbert-small-imdb
+pipeline_tag: text-classification
+library_name: transformers
+---
+# Unggah-Ungguh-Javanese-Distilbert-Classifier
+Unggah-Ungguh-Javanese-Distilbert-Classifier is part of the Unggah-Ungguh model family. It is a classifier for the Javanese honorific classification task described in "Do Language Models Understand Honorific Systems in Javanese?". See [our paper](https://arxiv.org/abs/2502.20864) for more information.
+## Model description
+- **Model type:** A classifier trained on the carefully curated Unggah-Ungguh dataset, which represents the rules and systems of Javanese honorifics.
+- **Language(s) (NLP):** Javanese
+- **License:** CC-BY-NC 4.0
+- **Finetuned from model:** w11wo/javanese-distilbert-small-imdb
+## Model Sources
+- **Project Page:** https://javanesehonorifics.github.io/
+- **Repository:** https://github.com/JavaneseHonorifics
+- **Paper:** https://arxiv.org/abs/2502.20864
+## Using the model
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+
+# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
+model_path = "JavaneseHonorifics/Unggah-Ungguh-Javanese-Distilbert-Classifier"
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model = AutoModelForSequenceClassification.from_pretrained(model_path)
+
+# Tokenize a single Javanese sentence as a batch of one
+INPUT_TEXT = "Mbak Srini mangan pecel ajange pincuk"
+tokenized_input = tokenizer([INPUT_TEXT], return_tensors="pt", truncation=True, padding=True)
+
+# Run inference without tracking gradients and take the highest-scoring class
+with torch.no_grad():
+    outputs = model(**tokenized_input)
+    y_pred = outputs.logits.argmax(-1)
+print("Predicted class:", y_pred.item())
+```
+## License and Use
+Unggah-Ungguh is licensed under CC-BY-NC 4.0.
+## Citation
+```bibtex
+@article{farhansyah2025language,
+  title={Do Language Models Understand Honorific Systems in Javanese?},
+  author={Farhansyah, Mohammad Rifqi and Darmawan, Iwan and Kusumawardhana, Adryan and Winata, Genta Indra and Aji, Alham Fikri and Wijaya, Derry Tanti},
+  journal={arXiv preprint arXiv:2502.20864},
+  year={2025}
+}
+```
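
The usage snippet in the README prints only the index of the predicted class. As a minimal sketch of a possible follow-up step, the helper below turns a list of logits into a label name and a confidence score. It is not part of the model card: the function names, the example logits, the four-class count, and the generic `LABEL_n` names are all hypothetical placeholders chosen so the example runs without downloading the model.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits, id2label):
    """Return (label, confidence) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical values for illustration only; they are not real model outputs,
# and the label names are placeholders rather than the model's actual classes.
example_logits = [0.2, 2.9, -1.1, 0.4]
example_id2label = {0: "LABEL_0", 1: "LABEL_1", 2: "LABEL_2", 3: "LABEL_3"}

label, confidence = describe_prediction(example_logits, example_id2label)
print(f"{label} ({confidence:.2f})")
```

With the real model, the same helper could presumably be called with `outputs.logits[0].tolist()` and `model.config.id2label`, which holds whatever label names the published config defines.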