How to finetune the model

by ZiaTohidi - opened

Hello, thanks a lot for this great library!
Can you tell me how I can gather a custom dataset and fine-tune this TrOCR model?
What tools do you recommend for labeling, and what format should the dataset be in?
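
Right now I'm imagining a layout like the sketch below: a folder of cropped text-line images plus a manifest pairing each image with its transcription. The file names, fields, and the `OCRLinesDataset` class are just placeholders I made up, not anything Hezar requires:

```python
# Hypothetical layout: crops/0001.png, crops/0002.png, ... plus labels.jsonl,
# where each line looks like {"image": "crops/0001.png", "text": "متن نمونه"}.
import json
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class OCRLinesDataset(Dataset):
    """Minimal (image, text) pair dataset for OCR fine-tuning."""

    def __init__(self, root: str, manifest: str = "labels.jsonl"):
        self.root = Path(root)
        with open(self.root / manifest, encoding="utf-8") as f:
            self.samples = [json.loads(line) for line in f if line.strip()]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        image = Image.open(self.root / sample["image"]).convert("RGB")
        return image, sample["text"]
```

Does something like this make sense, or does Hezar expect a different format?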

Hezar AI org

Hello @ZiaTohidi, I highly recommend going for less resource-intensive (non-Transformer-based) models like CRNN. The TrOCR model needs a lot of samples to reach acceptable performance.
Also, this model has a successor at hezarai/trocr-base-fa-v2.
Why would you need to fine-tune this model? Is it for research purposes or production? If you aim to use an OCR model in production, I recommend hezarai/crnn-fa-printed-96-long. That model is 20x faster and more accurate by orders of magnitude! It also supports up to 96 characters in an image, which is plenty for most use cases, even commercial ones.
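
For reference, running that model follows the usual Hezar quick-start pattern, roughly the sketch below (the image path is only an example, and it's worth double-checking the exact call signatures against the current README):

```python
from hezar.models import Model

# Load the recommended CRNN OCR model from the Hub
model = Model.load("hezarai/crnn-fa-printed-96-long")

# Recognize the text in a cropped text-line image (path is illustrative)
texts = model.predict(["id_card_crop.png"])
print(texts)
```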

Hello @arxyzan, thanks for providing this library!
Actually, I have already fine-tuned a model with crnn-fa-printed-96-long, but I didn't get suitable output. The task is extracting text from an Iranian ID card, so I was thinking of using the Transformer-based model for it. But now, based on what you have said, what is your recommendation? I also have no limit on the dataset, because I can generate new fake ID cards.
So what is your suggestion for me: use CRNN or TrOCR v2, or should I train a model manually? Thanks!

Hezar AI org

Hi @ArshaKhaksar, can you please explain why you did not get the desired results by fine-tuning the CRNN model? I'm asking because I did the same thing with CRNN 4 years ago and it worked really well.
So I'd be glad if you could tell me:

  • How did you detect and extract the text regions from the cards? I used CRAFT back then and it was pretty solid.
  • How did you fine-tune the CRNN? What CER/WER results did you get on your custom dataset (a quick way to compute these is sketched after this list)? Anything higher than a 10% error rate is considered not good; mine was 4% on our base dataset.
  • Have you tried our vanilla CRNN model without fine-tuning? How good or bad were the results?
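
For the CER/WER numbers, a simple way to check is to compare the model's predictions against your ground-truth transcriptions with a metrics package such as jiwer; a rough sketch with made-up sample strings:

```python
# pip install jiwer
from jiwer import cer, wer

# Ground-truth transcriptions vs. model predictions (illustrative samples)
references = ["۱۲۳۴۵۶۷۸۹۰", "نام خانوادگی"]
hypotheses = ["۱۲۳۴۵۶۷۸۹۰", "نام خانوادکی"]

print(f"CER: {cer(references, hypotheses):.2%}")  # character error rate
print(f"WER: {wer(references, hypotheses):.2%}")  # word error rate
```

That makes it easy to check your results against the ~10% threshold mentioned above.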
