Korean TrOCR model

  • A TrOCR model cannot recognize characters that are missing from its decoder's tokenizer. This model therefore uses a decoder whose tokenizer covers Hangul choseong (initial consonants), so that standalone choseong characters are not decoded as UNK.
  • It was built using know-how gained from the 2023 Kyowon Group AI OCR Challenge.
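As a rough illustration of why jamo-level coverage matters: Python's standard unicodedata module can decompose a precomposed Hangul syllable into its jamo (choseong, jungseong, jongseong) and recompose it, which is the same NFC normalization applied to the model's output in the usage example below.

```python
import unicodedata

# NFD splits a precomposed Hangul syllable into its jamo:
# choseong (initial consonant), jungseong (vowel), jongseong (final consonant)
decomposed = unicodedata.normalize("NFD", "한")
print([unicodedata.name(ch) for ch in decomposed])

# NFC recomposes the jamo back into the single precomposed syllable
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == "한")
```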

training datasets

AI Hub

model structure

how to use

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel, AutoTokenizer
import requests
import unicodedata
from io import BytesIO
from PIL import Image

# load the image processor, encoder-decoder model, and the choseong-aware tokenizer
processor = TrOCRProcessor.from_pretrained("ddobokki/ko-trocr")
model = VisionEncoderDecoderModel.from_pretrained("ddobokki/ko-trocr")
tokenizer = AutoTokenizer.from_pretrained("ddobokki/ko-trocr")

# fetch an example image
url = "https://raw.githubusercontent.com/ddobokki/ocr_img_example/master/g.jpg"
response = requests.get(url)
response.raise_for_status()
img = Image.open(BytesIO(response.content))

# run OCR
pixel_values = processor(img, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_length=64)
generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

# the decoder may emit decomposed jamo; NFC recomposes them into full syllables
generated_text = unicodedata.normalize("NFC", generated_text)
print(generated_text)
```
Model size: 214M params (Safetensors; tensor types: I64, FP16)