rifqifarhansyah committed · Commit 52ec92b (verified) · 1 parent: a9e828a

Update README.md

  1. README.md +56 -3
README.md CHANGED
@@ -1,3 +1,56 @@
- ---
- license: cc-by-nc-4.0
- ---
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - jv
+ datasets:
+ - JavaneseHonorifics/Unggah-Ungguh
+ base_model:
+ - w11wo/javanese-distilbert-small-imdb
+ pipeline_tag: text-classification
+ library_name: transformers
+ ---
+ # Unggah-Ungguh-Javanese-Distilbert-Classifier
+
+ Unggah-Ungguh-Javanese-Distilbert-Classifier is part of the Unggah-Ungguh model family. It is a classifier for the Javanese honorific classification task described in "Do Language Models Understand Honorific Systems in Javanese?". Check out [our paper](https://arxiv.org/abs/2502.20864) for more information!
+
+ ## Model description
+ - **Model type:** A classifier trained on the highly curated Unggah-Ungguh dataset, which represents Javanese honorific rules and systems.
+ - **Language(s) (NLP):** Javanese
+ - **License:** CC-BY-NC 4.0
+ - **Finetuned from model:** w11wo/javanese-distilbert-small-imdb
+
+ ## Model Sources
+
+ - **Project Page:** https://javanesehonorifics.github.io/
+ - **Repository:** https://github.com/JavaneseHonorifics
+ - **Paper:** https://arxiv.org/abs/2502.20864
+
+ ## Using the model
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+ model_path = "JavaneseHonorifics/Unggah-Ungguh-Javanese-Distilbert-Classifier"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ model = AutoModelForSequenceClassification.from_pretrained(model_path)
+ INPUT_TEXT = "Mbak Srini mangan pecel ajange pincuk"
+ tokenized_input = tokenizer([INPUT_TEXT], return_tensors="pt", truncation=True, padding=True)
+ with torch.no_grad():
+     outputs = model(**tokenized_input)
+ y_pred = outputs.logits.argmax(-1)
+ print("Predicted class:", y_pred.item())
+ ```
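The snippet above prints only a class index. As a minimal sketch of how that index might be turned into a readable label with a confidence score, the helper below applies a softmax to one row of logits and looks the winning index up in an id-to-label mapping. The function name `describe_prediction`, the example logits, the number of classes, and the `LABEL_n` names are all hypothetical and chosen for illustration; the real mapping would come from `model.config.id2label`, whose contents are not stated in this README.

```python
import math


def describe_prediction(logits, id2label):
    """Return (label, probability) for the highest-scoring class.

    `logits` is a plain list of floats for a single input;
    `id2label` maps integer class indices to label strings.
    """
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the largest probability (first one wins on ties).
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to the bare index if the mapping has no entry for it.
    return id2label.get(best, str(best)), probs[best]


# Hypothetical logits and label names, for illustration only.
example_logits = [0.1, 2.3, -1.0, 0.4]
example_labels = {0: "LABEL_0", 1: "LABEL_1", 2: "LABEL_2", 3: "LABEL_3"}
label, prob = describe_prediction(example_logits, example_labels)
print(f"Predicted label: {label} (p={prob:.3f})")
```

With the model loaded as in the snippet above, the call would presumably be `describe_prediction(outputs.logits[0].tolist(), model.config.id2label)`. If the checkpoint was saved without custom label names, the mapping may contain only generic names, in which case the paper and dataset card are the places to check what each index means.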
+
+ ## License and Use
+
+ Unggah-Ungguh is licensed under the CC-BY-NC 4.0 license.
+
+ ## Citation
+
+ ```bibtex
+ @article{farhansyah2025language,
+   title={Do Language Models Understand Honorific Systems in Javanese?},
+   author={Farhansyah, Mohammad Rifqi and Darmawan, Iwan and Kusumawardhana, Adryan and Winata, Genta Indra and Aji, Alham Fikri and Wijaya, Derry Tanti},
+   journal={arXiv preprint arXiv:2502.20864},
+   year={2025}
+ }
+ ```