Update README.md
README.md
CHANGED
@@ -8,11 +8,11 @@ tags:
 - text2text-generation
 base_model: google/deplot
 ---
-# **
+# **ko-deplot**
 
-
+ko-deplot is a Korean Visual-QA model based on Google's Pix2Struct architecture. It was fine-tuned from [Deplot](https://huggingface.co/google/deplot) using Korean chart image-text pairs.
 
-
+ko-deplot is a Korean Visual-QA model based on Google's Pix2Struct architecture, fine-tuned from the [Deplot](https://huggingface.co/google/deplot) model on a Korean chart image-text pair dataset.
 
 - **Developed by:** [NUUA](https://www.nuua.ai/en/)
 - **Model type:** Visual Question Answering
@@ -28,8 +28,8 @@ You can run a prediction by querying an input image together with a question as
 from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
 from PIL import Image
 
-processor = Pix2StructProcessor.from_pretrained('nuua/
-model = Pix2StructForConditionalGeneration.from_pretrained('nuua/
+processor = Pix2StructProcessor.from_pretrained('nuua/ko-deplot')
+model = Pix2StructForConditionalGeneration.from_pretrained('nuua/ko-deplot')
 
 IMAGE_PATH = "LOCAL_PATH_TO_IMAGE"
 image = Image.open(IMAGE_PATH)
@@ -39,6 +39,19 @@ predictions = model.generate(**inputs, max_new_tokens=512)
 print(processor.decode(predictions[0], skip_special_tokens=True))
 ```
 
+# **Tokenizer Details**
+The model's tokenizer vocab was extended from 50,344 to 65,536 tokens using the following:
+
+- Complete Korean Jamo
+- [Additional Korean Jamo](http://koreantypography.org/wp-content/uploads/2016/02/kst_12_7_2_06.pdf)
+- Ko-Electra tokens
+
+The model's tokenizer vocab was extended from 50,344 to 65,536 tokens using the following, after which training was carried out:
+
+- Complete Korean Jamo
+- [Additional Korean Jamo](http://koreantypography.org/wp-content/uploads/2016/02/kst_12_7_2_06.pdf)
+- Ko-Electra Korean tokens
+
 # **Training Details**
 
 ## Training Data
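The tokenizer extension added in the hunk above is described only as a vocab count change plus three token sources. A minimal sketch of how such an extension might be done with the transformers API is shown below; `extra_korean_tokens` is a hypothetical placeholder for the Jamo and Ko-Electra token lists, and the `resize_token_embeddings` step is an assumption rather than a documented part of the ko-deplot recipe.

```python
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration

# Start from the base Deplot checkpoint that ko-deplot was fine-tuned from.
processor = Pix2StructProcessor.from_pretrained("google/deplot")
model = Pix2StructForConditionalGeneration.from_pretrained("google/deplot")

# Hypothetical stand-in for the complete Korean Jamo, the additional Jamo set,
# and the Ko-Electra tokens listed in the README.
extra_korean_tokens = ["가", "나", "다", "ㄱ", "ㄴ", "ㄷ"]

# add_tokens skips entries already in the vocab; the embedding matrix is then
# resized so every new token id has a row (assumed to be supported here).
num_added = processor.tokenizer.add_tokens(extra_korean_tokens)
model.resize_token_embeddings(len(processor.tokenizer))
print(f"added {num_added} tokens; vocab size is now {len(processor.tokenizer)}")
```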
@@ -61,7 +74,7 @@ The model was first exposed to a short warmup stage, following its [original pap
 
 ## Hardware
 
-
+ko-deplot was trained on an A100 80G GPU.
 
 Trained using an A100 80G GPU.
 
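For reference, the usage snippet touched in the hunks above is split across the diff and omits the preprocessing line. Assembled into one runnable example, it would look roughly like the sketch below; the `QUESTION` placeholder and the exact `processor(...)` call follow the standard Pix2Struct API and are not copied from the README.

```python
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
from PIL import Image

processor = Pix2StructProcessor.from_pretrained('nuua/ko-deplot')
model = Pix2StructForConditionalGeneration.from_pretrained('nuua/ko-deplot')

IMAGE_PATH = "LOCAL_PATH_TO_IMAGE"   # path to a local chart image
QUESTION = "YOUR_QUESTION"           # question about the chart (Korean or English)

image = Image.open(IMAGE_PATH)

# Pair the chart image with the question; the processor renders the text
# prompt and patches the image the way Pix2Struct expects.
inputs = processor(images=image, text=QUESTION, return_tensors="pt")

predictions = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(predictions[0], skip_special_tokens=True))
```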