sbaru committed on
Commit 3b1efa8 · verified
1 Parent(s): 0b4bbab

Create README.md

  1. README.md +65 -0
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ license: mit
+ datasets:
+ - Junhoee/Jeju-Standard-Translation
+ language:
+ - ko
+ metrics:
+ - sacrebleu
+ - chrf
+ - bertscore
+ base_model:
+ - gogamza/kobart-base-v2
+ tags:
+ - nlp
+ - translation
+ - seq2seq
+ - low-resource-language
+ - korean-dialect
+ - jeju-dialect
+ - kobart
+ ---
+ # Jeju Satoru (제주 사토루)
+
+ ## Project Overview
+ 'Jeju Satoru' is a **bidirectional translation model between Jeju and Standard Korean**, developed to help preserve the Jeju language, which UNESCO has designated an **'endangered language'**. By improving digital accessibility for Jeju speakers, the model aims to help address digital exclusion.
+
+ ## Model Information
+ - **Base model**: `gogamza/kobart-base-v2`
+ - **Model architecture**: Seq2Seq (encoder-decoder)
+ - **Training data**: Trained on about 930,000 sentence pairs from the [Junhoee/Jeju-Standard-Translation](https://huggingface.co/datasets/Junhoee/Jeju-Standard-Translation) dataset published on Hugging Face.
+
+ ## Performance Evaluation
+ The model was evaluated with three quantitative metrics: SacreBLEU, chrF, and BERTScore.
+
+ | Direction | SacreBLEU | chrF | BERTScore |
+ |-------------------|-----------|-------|-----------|
+ | Jeju → Standard | 77.19 | 83.02 | 0.97 |
+ | Standard → Jeju | 64.86 | 72.68 | 0.94 |
+
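
The chrF column in the table above is a character n-gram F-score. As a rough illustration of what that metric measures, the sketch below implements a simplified, stdlib-only, sentence-level version. The names `char_ngrams` and `simple_chrf` are hypothetical, the defaults (`max_n=6`, `beta=2`) are the commonly used chrF settings rather than anything stated in this README, and the result will not reproduce the reported scores, since the README does not say which implementation or settings produced them.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of order n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF on a 0-100 scale (illustrative only)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Compare a hypothetical model output against a reference translation.
print(simple_chrf("우리 집은 편안하다.", "우리 집은 편안하다."))
print(simple_chrf("우리 집이 편하다.", "우리 집은 편안하다."))
```

For scores comparable to the table, a standard implementation such as the `sacrebleu` package would be the more appropriate tool; this sketch only shows the shape of the computation.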
+ ## Usage
+ You can load the model and run inference with the `pipeline` API from the `transformers` library.
+
+ **1. Install the libraries**
+ ```bash
+ pip install transformers torch
+ ```
+
+ ```python
+ from transformers import pipeline
+
+ # Load the translation pipeline
+ translator = pipeline(
+     "translation",
+     model="sbaru/jeju-satoru"
+ )
+
+ # Jeju -> Standard Korean translation example
+ jeju_sentence = '[제주] 우리 집이 펜안허다.'
+ result = translator(jeju_sentence, max_length=128)
+ print(f"Input: {jeju_sentence}")
+ print(f"Output: {result[0]['translation_text']}")
+
+ # Standard Korean -> Jeju translation example
+ standard_sentence = '[표준] 우리 집은 편안하다.'
+ result = translator(standard_sentence, max_length=128)
+ print(f"Input: {standard_sentence}")
+ print(f"Output: {result[0]['translation_text']}")
+ ```
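
Both examples above select the translation direction by prefixing the input with a tag (`[제주]` for a Jeju sentence, `[표준]` for a Standard Korean one). A small wrapper can keep that convention in one place. This is a minimal sketch: `SOURCE_TAGS`, `tag_sentence`, and `translate` are hypothetical names, and the reading that the tag marks the source variety is inferred from the two examples, not stated in the README.

```python
from typing import Callable

# Tags copied from the README examples. Assumption: each tag marks the
# variety of the *input* sentence; no other tags are documented.
SOURCE_TAGS = {
    "jeju": "[제주]",      # Jeju input, Standard Korean output
    "standard": "[표준]",  # Standard Korean input, Jeju output
}


def tag_sentence(sentence: str, source: str) -> str:
    """Prefix a sentence with the tag for its source variety."""
    if source not in SOURCE_TAGS:
        raise ValueError(f"source must be one of {sorted(SOURCE_TAGS)}")
    return f"{SOURCE_TAGS[source]} {sentence.strip()}"


def translate(translator: Callable, sentence: str, source: str,
              max_length: int = 128) -> str:
    """Tag the sentence, run the translator, and return the output text.

    `translator` is expected to behave like a transformers translation
    pipeline: called with a string, it returns a list of dicts that
    contain a 'translation_text' key.
    """
    result = translator(tag_sentence(sentence, source), max_length=max_length)
    return result[0]["translation_text"]
```

With the real model, pass the `translator` pipeline object created in the README snippet, for example `translate(translator, "우리 집이 펜안허다.", source="jeju")`.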