---
license: apache-2.0
language:
- ko
- en
pipeline_tag: visual-question-answering
tags:
- text2text-generation
base_model: google/deplot
---
					
						
# **ko-deplot**

ko-deplot is a Korean Visual-QA model based on Google's Pix2Struct architecture. It was fine-tuned from [Deplot](https://huggingface.co/google/deplot) using Korean chart image-text pairs.
					
						
- **Developed by:** [NUUA](https://www.nuua.ai/en/)
- **Model type:** Visual Question Answering
- **License:** apache-2.0
- **Finetuned from model:** [google/deplot](https://huggingface.co/google/deplot)
					
						
# **Model Usage**

You can run a prediction by querying an input image together with a question as follows:
					
						
```python
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration
from PIL import Image

# Load the fine-tuned processor and model from the Hugging Face Hub
processor = Pix2StructProcessor.from_pretrained('nuua/ko-deplot')
model = Pix2StructForConditionalGeneration.from_pretrained('nuua/ko-deplot')

IMAGE_PATH = "LOCAL_PATH_TO_IMAGE"
image = Image.open(IMAGE_PATH)

# Encode the chart image together with the table-extraction prompt
inputs = processor(images=image, text="Generate underlying data table of the figure below:", return_tensors="pt")
predictions = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(predictions[0], skip_special_tokens=True))
```
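The decoded prediction is a single linearized table. Assuming it follows the DePlot convention of "|" between cells and "<0x0A>" between rows, a small helper like the hypothetical `parse_table` below can split it back into rows, continuing the example above:

```python
# Hypothetical helper, assuming the DePlot linearization convention:
# "|" separates cells and "<0x0A>" (or a raw newline) separates rows.
def parse_table(linearized: str) -> list[list[str]]:
    rows = linearized.replace("<0x0A>", "\n").split("\n")
    return [[cell.strip() for cell in row.split("|")] for row in rows if row.strip()]

decoded = processor.decode(predictions[0], skip_special_tokens=True)
for row in parse_table(decoded):
    print(row)
```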
					
						
# **Tokenizer Details**

The model's tokenizer vocab was extended from 50,344 to 65,536 tokens using the following (see the sketch after the list):

- Complete Korean Jamo
- [Additional Korean Jamo](http://koreantypography.org/wp-content/uploads/2016/02/kst_12_7_2_06.pdf)
- Ko-Electra tokens
					
						
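A minimal sketch of how such an extension can be done with the `transformers` API, assuming you start from the base `google/deplot` checkpoint; the `new_tokens` list here is a hypothetical stand-in for the actual Jamo and Ko-Electra token lists:

```python
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration

processor = Pix2StructProcessor.from_pretrained("google/deplot")
model = Pix2StructForConditionalGeneration.from_pretrained("google/deplot")

# Hypothetical stand-in for the real token lists (complete Korean Jamo,
# additional Jamo, Ko-Electra tokens)
new_tokens = ["ㄱ", "ㄴ", "ㄷ", "가", "나", "다"]

# add_tokens skips entries already in the vocab and returns the number added
processor.tokenizer.add_tokens(new_tokens)

# Grow the model's embedding matrix to match the enlarged vocab
model.resize_token_embeddings(len(processor.tokenizer))
```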
					
						
# **Training Details**

## Training Data

Synthetic chart data generated with three libraries was used (see the sketch after the list):
					
						
- [GenPlot](https://github.com/brendanartley/genplot)
- [Chart.js](https://github.com/chartjs/Chart.js)
- [Plotly](https://github.com/plotly/plotly.py)
					
						
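For illustration, a minimal sketch of how one Plotly-based image-text pair might be produced; the labels, values, title, and file name are hypothetical, and static image export additionally requires the `kaleido` package:

```python
import plotly.graph_objects as go

# Hypothetical underlying table for one synthetic training pair
labels = ["1월", "2월", "3월"]
values = [120, 95, 143]

# Render the chart image (static export requires kaleido)
fig = go.Figure(go.Bar(x=labels, y=values))
fig.update_layout(title="월별 판매량")
fig.write_image("chart.png")

# Linearize the same table as the text target, DePlot-style
target = "월 | 판매량 <0x0A> " + " <0x0A> ".join(f"{l} | {v}" for l, v in zip(labels, values))
```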
## Training Procedure

The model was first exposed to a short warmup stage to learn Korean text, following its [original paper](https://arxiv.org/pdf/2210.03347.pdf). It was then trained on the chart data for 50,000 steps.
					
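As a rough illustration of that second stage, a sketch of a fine-tuning loop over (chart image, linearized table) pairs; `train_pairs`, the reused prompt, and all hyperparameters are hypothetical, not the actual training configuration:

```python
import torch
from transformers import Pix2StructProcessor, Pix2StructForConditionalGeneration

processor = Pix2StructProcessor.from_pretrained("google/deplot")
model = Pix2StructForConditionalGeneration.from_pretrained("google/deplot")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # illustrative LR

model.train()
for step, (image, target_text) in enumerate(train_pairs):  # hypothetical iterable
    inputs = processor(
        images=image,
        text="Generate underlying data table of the figure below:",
        return_tensors="pt",
    )
    # Teacher forcing: the linearized table is the decoder target
    labels = processor.tokenizer(target_text, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step >= 50_000:
        break
```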
						
# **Technical Specifications**

## Hardware

ko-deplot was trained using an A100 80GB GPU.
					
						
# **Contact**

For any questions or suggestions, please use the discussion tab. If you want to contact us directly, email [email protected].