sbaru committed · verified
Commit 82a6f03 · 1 Parent(s): f24515e

Update README.md

Files changed (1): README.md (+13 -13)
README.md CHANGED
 
# Jeju Satoru

## Project Overview
'Jeju Satoru' is a **bidirectional Jeju-Standard Korean translation model** developed to preserve the Jeju language, which is designated as an **'endangered language'** by UNESCO. The model aims to bridge the digital divide for elderly Jeju dialect speakers by improving their digital accessibility.

## Model Information
* **Base Model**: KoBART (`gogamza/kobart-base-v2`)
* **Model Architecture**: Seq2Seq (Encoder-Decoder structure)
* **Training Data**: The model was trained on a large-scale dataset of approximately 930,000 sentence pairs, built from the publicly available [Junhoee/Jeju-Standard-Translation](https://huggingface.co/datasets/Junhoee/Jeju-Standard-Translation) dataset, which is primarily based on text from the KakaoBrain JIT (Jeju-Island-Translation) corpus and transcribed data from the AI Hub Jeju dialect speech dataset. A short loading sketch follows this list.
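
Because the corpus above is public on the Hugging Face Hub, it can be inspected with the `datasets` library. This is only an illustrative sketch; the split names and column layout are whatever the dataset repository actually defines, and the `"train"` split used below is an assumption.

```python
# Illustrative only: inspect the public corpus the training data was built from.
from datasets import load_dataset

ds = load_dataset("Junhoee/Jeju-Standard-Translation")
print(ds)              # lists the available splits and their column names
print(ds["train"][0])  # peek at one record (assumes a "train" split exists)
```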

## Training Strategy and Parameters
Our model was trained using a **two-stage domain adaptation method** to handle the complexities of the Jeju dialect.

1. **Domain Adaptation**: The model was trained separately on Standard Korean and Jeju dialect sentences to help it deeply understand the grammar and style of each.
2. **Translation Fine-Tuning**: The final stage trained the model on the bidirectional dataset, with `[제주]` (Jeju) and `[표준]` (Standard) tags added to each sentence to explicitly guide the translation direction (a sketch of this tagging scheme follows the list).
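
To make the tagging concrete, here is a minimal, hypothetical preprocessing sketch. It assumes the tag is prepended to the input sentence and names the *target* variety; the project's actual convention (tag placement, and whether the tag marks the source or the target) is not documented here, so treat both as assumptions.

```python
# Hypothetical sketch of direction tagging for bidirectional fine-tuning.
# Assumption: the prepended tag names the *target* variety of the translation.

def make_example(src: str, tgt: str, target_tag: str) -> dict:
    """Build one training example whose input carries a direction tag."""
    return {"input": f"{target_tag} {src}", "target": tgt}

# One sentence pair yields two examples, one per translation direction.
pair = {"jeju": "폭싹 속았수다", "standard": "정말 수고하셨습니다"}
examples = [
    make_example(pair["jeju"], pair["standard"], "[표준]"),  # Jeju -> Standard
    make_example(pair["standard"], pair["jeju"], "[제주]"),  # Standard -> Jeju
]
print(examples)
```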

The following key hyperparameters and techniques were applied for performance optimization (a training-configuration sketch follows the list):
* **Learning Rate**: 2e-5
* **Epochs**: 3
* **Batch Size**: 128
* **Weight Decay**: 0.01
* **Generation Beams**: 5
* **GPU Memory Efficiency**: Mixed-precision training (FP16) was used to reduce training time, along with Gradient Accumulation (Steps: 16).
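
Below is a minimal sketch, not the project's actual training script, showing how these settings could map onto Hugging Face `Seq2SeqTrainingArguments`. The split of the batch size of 128 into a per-device batch of 8 with 16 gradient-accumulation steps, and the output directory name, are assumptions.

```python
# Sketch of a training configuration matching the listed hyperparameters.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="jeju-satoru-kobart",   # hypothetical output path
    learning_rate=2e-5,                # Learning Rate
    num_train_epochs=3,                # Epochs
    per_device_train_batch_size=8,     # assumed: 8 x 16 accumulation = 128 effective
    gradient_accumulation_steps=16,    # Gradient Accumulation (Steps: 16)
    weight_decay=0.01,                 # Weight Decay
    fp16=True,                         # mixed-precision training
    predict_with_generate=True,
    generation_num_beams=5,            # Generation Beams
)
```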

## Performance Evaluation
The model's performance was evaluated comprehensively with both quantitative metrics and a qualitative review; a metric-computation sketch follows the table below.

### Quantitative Evaluation
| Direction | SacreBLEU | CHRF | BERTScore |
| --- | --- | --- | --- |
| … | … | … | … |
| Standard → Jeju Dialect | 64.86 | 72.68 | 0.94 |
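
As a reference for how scores like these can be computed in principle, here is a hedged sketch using the `evaluate` library. The actual evaluation script, test split, and BERTScore backbone are not specified in the card, so the example strings and settings below are illustrative only.

```python
# Illustrative scoring sketch (not the project's evaluation script).
import evaluate

predictions = ["정말 수고하셨습니다"]   # model outputs (placeholder)
references  = ["정말 수고하셨습니다"]   # gold sentences (placeholder)

sacrebleu = evaluate.load("sacrebleu").compute(
    predictions=predictions, references=[[r] for r in references])
chrf = evaluate.load("chrf").compute(
    predictions=predictions, references=[[r] for r in references])
bertscore = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="ko")

print(sacrebleu["score"], chrf["score"],
      sum(bertscore["f1"]) / len(bertscore["f1"]))
```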
 
### Qualitative Evaluation (Summary)
* **Adequacy**: The model accurately captures the meaning of most source sentences.
* **Fluency**: The translated sentences are grammatically correct and natural-sounding.
* **Tone**: The model is generally good at maintaining tone, but it has some limitations in fully reflecting the nuances and specific colloquial endings of the Jeju dialect.
 
## How to Use
You can load the model and run inference with the `transformers` library's `pipeline` function, as in the sketch below.
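
A minimal usage sketch, assuming a text2text-generation pipeline: the model id is a placeholder (the repository name is not stated in this section), and the need for a `[제주]`/`[표준]` tag in the input, as well as which variety the tag names, are assumptions carried over from the training description above.

```python
# Minimal inference sketch; the model id below is a placeholder.
from transformers import pipeline

translator = pipeline(
    "text2text-generation",
    model="your-username/jeju-satoru",  # replace with the actual Hub repo id
)

# Jeju dialect -> Standard Korean (direction-tag convention assumed)
result = translator("[표준] 폭싹 속았수다", num_beams=5, max_length=128)
print(result[0]["generated_text"])
```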