# Jeju Satoru

## Project Overview

'Jeju Satoru' is a **bidirectional Jeju-Standard Korean translation model** developed to preserve the Jeju language, which is designated as an **'endangered language'** by UNESCO. The model aims to bridge the digital divide for elderly Jeju dialect speakers by improving their digital accessibility.

## Model Information

* **Base Model**: KoBART (`gogamza/kobart-base-v2`)
* **Model Architecture**: Seq2Seq (Encoder-Decoder structure)
* **Training Data**: The model was trained on a large-scale dataset of approximately 930,000 sentence pairs, built from the publicly available [Junhoee/Jeju-Standard-Translation](https://huggingface.co/datasets/Junhoee/Jeju-Standard-Translation) dataset, which is primarily based on text from the KakaoBrain JIT (Jeju-Island-Translation) corpus and transcribed data from the AI Hub Jeju dialect speech dataset.
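
For orientation, the corpus can be pulled straight from the Hub with the `datasets` library. This is a minimal sketch, assuming the usual `train` split name; inspect the printed schema for the actual splits and column names.

```python
from datasets import load_dataset

# Pull the public parallel corpus referenced above.
ds = load_dataset("Junhoee/Jeju-Standard-Translation")

print(ds)              # available splits and column names
print(ds["train"][0])  # one Jeju/Standard sentence pair (assumes a "train" split)
```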

## Training Strategy and Parameters

The model was trained with a **two-stage strategy**, domain adaptation followed by translation fine-tuning, to handle the complexities of the Jeju dialect:

1. **Domain Adaptation**: The model was first trained separately on Standard Korean and Jeju dialect sentences to help it deeply understand the grammar and style of each variety.
2. **Translation Fine-Tuning**: The final stage trained the model on the bidirectional dataset, with `[제주]` (Jeju) and `[표준]` (Standard) tags added to each sentence to explicitly guide the translation direction (see the sketch below).
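
A minimal sketch of the tagging scheme from step 2. The convention that the tag marks the language variety of the *input* sentence, and the field names used here, are illustrative assumptions rather than the project's actual preprocessing code.

```python
# Build bidirectional training pairs by prepending a direction tag.
# Assumption: the tag marks the language variety of the input sentence.
def to_tagged_pairs(standard: str, jeju: str) -> list[dict]:
    return [
        {"input": f"[표준] {standard}", "target": jeju},   # Standard -> Jeju
        {"input": f"[제주] {jeju}", "target": standard},   # Jeju -> Standard
    ]

# Example with a well-known pair: "어서 오세요" (Standard) / "혼저 옵서예" (Jeju).
for pair in to_tagged_pairs("어서 오세요", "혼저 옵서예"):
    print(pair["input"], "->", pair["target"])
```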

The following key hyperparameters and techniques were applied for performance optimization:

* **Learning Rate**: 2e-5
* **Epochs**: 3
* **Batch Size**: 128
* **Weight Decay**: 0.01
* **Generation Beams**: 5
* **GPU Memory Efficiency**: Mixed-precision training (FP16) was used to reduce training time, along with Gradient Accumulation (Steps: 16).
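
For reference, these settings map onto `transformers`' `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the project's actual training script; in particular, the per-device batch size of 8 is an assumption chosen so that 8 × 16 accumulation steps reproduces the effective batch size of 128.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="jeju-satoru",          # hypothetical output path
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=8,     # assumed: 8 * 16 accumulation = 128 effective
    gradient_accumulation_steps=16,
    weight_decay=0.01,
    fp16=True,                         # mixed-precision training
    predict_with_generate=True,
    generation_num_beams=5,            # beam search during generation/evaluation
)
```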

## Performance Evaluation

The model's performance was evaluated comprehensively, using both quantitative metrics and a qualitative review.

### Quantitative Evaluation

| Direction | SacreBLEU | CHRF | BERTScore |
|---|---|---|---|
| Standard → Jeju Dialect | 64.86 | 72.68 | 0.94 |
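
All three metrics are available through the Hugging Face `evaluate` library. The sketch below shows the metric calls under the assumption that `preds` and `refs` hold model outputs and gold translations from a held-out test set; it is not the project's exact evaluation script.

```python
import evaluate

preds = ["혼저 옵서예"]  # model outputs (placeholder)
refs = ["혼저 옵서예"]   # gold translations (placeholder)

sacrebleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
bertscore = evaluate.load("bertscore")

# SacreBLEU and CHRF take one list of references per prediction.
print(sacrebleu.compute(predictions=preds, references=[[r] for r in refs])["score"])
print(chrf.compute(predictions=preds, references=[[r] for r in refs])["score"])
print(bertscore.compute(predictions=preds, references=refs, lang="ko")["f1"])
```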

### Qualitative Evaluation (Summary)

* **Adequacy**: The model accurately captures the meaning of most source sentences.
* **Fluency**: The translated sentences are grammatically correct and natural-sounding.
* **Tone**: The model generally preserves tone, but has some limitations in fully reflecting the nuances and characteristic colloquial endings of the Jeju dialect.

## How to Use

You can load the model and run inference with the `transformers` library's `pipeline` function, as shown below.
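A minimal sketch, with two assumptions: the Hub model ID below is a placeholder for this repository's actual ID, and the direction tag is prepended to the input exactly as during fine-tuning.

```python
from transformers import pipeline

# Placeholder Hub ID: replace with this repository's actual model ID.
translator = pipeline("text2text-generation", model="your-org/jeju-satoru")

# Prepend the direction tag, matching the fine-tuning scheme:
# "[표준]" marks a Standard Korean input to be translated into Jeju dialect.
result = translator("[표준] 어서 오세요.", num_beams=5, max_length=64)
print(result[0]["generated_text"])
```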