---
license: mit
language:
- ko
base_model:
- paust/pko-t5-base
pipeline_tag: translation
---
# Model Card for Darong/BlueT

<!-- Provide a quick summary of what the model is/does. -->
An English-Korean translation model.

### Model Description

<!-- Provide a longer summary of what this model is. -->
This is a translation model fine-tuned from paust/pko-t5-base for English-Korean translation.
It supports both directions, English-to-Korean and Korean-to-English, and the honorific (polite) speech
level can also be set for English-to-Korean translation.


- **Developed by:** BlueAI
- **Model type:** t5.1.1.base
- **Language(s) (NLP):** Korean
- **License:** MIT
- **Finetuned from model:** paust/pko-t5-base

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import pipeline, T5TokenizerFast

# The tokenizer is loaded from the base model; the fine-tuned weights come from Darong/BlueT.
tokenizer_name = "paust/pko-t5-base"
tokenizer = T5TokenizerFast.from_pretrained(tokenizer_name)
model_path = "Darong/BlueT"
translator = pipeline("translation", model=model_path, tokenizer=tokenizer, max_length=255)

# English -> Korean
prefix = "E2K: "
source = "This model is an English-Korean translation model."
target = translator(prefix + source)
print(target[0]["translation_text"])

# Korean -> English
prefix = "K2E: "
source = "이 모델은 영어-한국어 번역 모델입니다."
target = translator(prefix + source)
print(target[0]["translation_text"])
```
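
The usage example above selects the translation direction purely through the task prefix (`E2K: ` or `K2E: `). A minimal sketch of a wrapper that keeps the prefix handling in one place is shown below. The function names `build_prompt` and `translate` and the direction keys `"en-ko"` and `"ko-en"` are hypothetical and not part of this model or of `transformers`; only the two prefixes come from the example above. The prefix for the honorific setting is not documented here, so it is not covered.

```python
from typing import Callable, Dict, List

# Task prefixes taken from the usage example; the dictionary keys are
# illustrative names chosen for this sketch.
PREFIXES: Dict[str, str] = {"en-ko": "E2K: ", "ko-en": "K2E: "}


def build_prompt(text: str, direction: str) -> str:
    """Prepend the task prefix for the requested direction."""
    if direction not in PREFIXES:
        raise ValueError(
            f"unknown direction {direction!r}; expected one of {sorted(PREFIXES)}"
        )
    return PREFIXES[direction] + text.strip()


def translate(
    translator: Callable[[str], List[dict]], text: str, direction: str
) -> str:
    """Run a pipeline-like callable on the prefixed prompt and return the text."""
    result = translator(build_prompt(text, direction))
    return result[0]["translation_text"]
```

With the `translator` pipeline created in the example, a call would look like `translate(translator, "Good morning.", "en-ko")`. Taking the translator as an argument keeps the helper independent of how the pipeline is constructed.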

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

This model was trained on data from AI Hub and on data built in-house.
The English-to-Korean training set contains more than 18 million sentences, and the Korean-to-English training set contains more than 12 million sentences.