Persian-OCR

Persian-OCR is a deep learning model for Optical Character Recognition (OCR), designed specifically for Persian text.
The model employs a CNN + Transformer architecture trained with CTC loss to extract text from images.
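
To make the CTC part concrete, here is a minimal sketch of greedy CTC decoding: take the most likely class per time step, collapse consecutive repeats, then drop the blank symbol. This is an illustration, not the repository's own decoder. The function name `ctc_greedy_decode`, the toy vocabulary, and the choice of index 0 for the blank are all assumptions; check `model.py` and `vocab.json` for the real convention.

```python
from typing import Dict, List, Optional, Sequence

BLANK_INDEX = 0  # assumption: the CTC blank sits at index 0 (not confirmed by the source)


def ctc_greedy_decode(
    frame_ids: Sequence[int],
    idx_to_char: Dict[int, str],
    blank: int = BLANK_INDEX,
) -> str:
    """Collapse repeated per-frame predictions and remove blanks."""
    chars: List[str] = []
    previous: Optional[int] = None
    for idx in frame_ids:
        # Emit a character only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != previous and idx != blank:
            chars.append(idx_to_char[idx])
        previous = idx
    return "".join(chars)


# Toy vocabulary and per-frame argmax output, invented for illustration.
toy_vocab = {1: "س", 2: "ل", 3: "ا", 4: "م"}
frames = [0, 1, 1, 0, 2, 3, 3, 0, 0, 4]
decoded = ctc_greedy_decode(frames, toy_vocab)
print(decoded)
```

A blank between two identical indices keeps both characters, which is how CTC represents doubled letters.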

The model was trained on a custom dataset of approximately 600,000 synthetic Persian text images.
These images were generated from Wikipedia text using 49 different Persian fonts, with sequence lengths ranging from 0 to 150 characters.

On this dataset, the model achieves a sequence accuracy of 96%.
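
The source does not define "sequence accuracy". A common reading is the fraction of images whose predicted text matches the reference exactly, and the sketch below assumes that definition. The function name `sequence_accuracy` and the sample strings are hypothetical.

```python
from typing import Sequence


def sequence_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match their reference string exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    exact_matches = sum(pred == ref for pred, ref in zip(predictions, references))
    return exact_matches / len(references)


# Invented example: three of four lines are recognized without any error.
preds = ["سلام", "کتاب", "مدرسه", "دانشگاه"]
refs = ["سلام", "کتاب", "مدرسه", "دانشکده"]
score = sequence_accuracy(preds, refs)
print(score)
```

Under this definition a single wrong character makes the whole sequence count as an error, so it is stricter than character-level accuracy.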

Because the training data is synthetic, the model may benefit from further fine-tuning on real-world images. Contributions and collaborations are welcome.

🤝 Contributing

Contributions are welcome! If you have a dataset of real-world Persian text or improvements to the model, please open an issue or submit a pull request.

📬 Contact

For collaboration or inquiries, please reach out via [email protected]

Files

  • pytorch_model.bin : PyTorch model weights
  • vocab.json : Character vocabulary
  • model.py : Python script defining the CNN + Transformer OCR model
  • utils.py : Utility functions for OCR, including ocr_page and load_vocab
  • config.json : Model configuration
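
The usage code reads `vocab["idx_to_char"]` and converts its keys with `int(k)`. The conversion is needed because JSON object keys are always strings, so integer indices come back as text after a round trip. The sketch below shows this with a toy vocabulary invented for illustration; the real `vocab.json` may contain other fields.

```python
import json

# Toy vocabulary, invented for illustration only.
toy_vocab = {"idx_to_char": {1: "ا", 2: "ب"}}

# Serializing turns the integer keys into strings.
serialized = json.dumps(toy_vocab, ensure_ascii=False)
restored = json.loads(serialized)
restored_keys = list(restored["idx_to_char"].keys())
print(restored_keys)

# Convert the keys back to integers so they can be indexed by class id.
idx_to_char = {int(k): v for k, v in restored["idx_to_char"].items()}
print(idx_to_char[1])
```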

Installation

pip install torch torchvision huggingface_hub

Usage


import torch
import json
import sys
import importlib.util
from huggingface_hub import hf_hub_download

# 1️⃣ Load vocab
vocab_path = hf_hub_download("farbodpya/Persian-OCR", "vocab.json")
with open(vocab_path, "r", encoding="utf-8") as f:
    vocab = json.load(f)
idx_to_char = {int(k): v for k, v in vocab["idx_to_char"].items()}

# 2️⃣ Import model.py 
model_file = hf_hub_download("farbodpya/Persian-OCR", "model.py")
spec_model = importlib.util.spec_from_file_location("model", model_file)
model_module = importlib.util.module_from_spec(spec_model)
sys.modules["model"] = model_module
spec_model.loader.exec_module(model_module)
from model import CNN_Transformer_OCR

# 3️⃣ Import utils.py 
utils_file = hf_hub_download("farbodpya/Persian-OCR", "utils.py")
spec_utils = importlib.util.spec_from_file_location("utils", utils_file)
utils_module = importlib.util.module_from_spec(spec_utils)
sys.modules["utils"] = utils_module
spec_utils.loader.exec_module(utils_module)
from utils import ocr_page

# 4️⃣ Load model weights
weights_path = hf_hub_download("farbodpya/Persian-OCR", "pytorch_model.bin")
# The extra class is presumably reserved for the CTC blank symbol
model = CNN_Transformer_OCR(num_classes=len(idx_to_char) + 1)
model.load_state_dict(torch.load(weights_path, map_location="cpu"))
model.eval()

# 5️⃣ Run OCR on an image
img_path = "sample.png"  # replace with your own image
text = ocr_page(img_path, model, idx_to_char)
print("\n=== Final OCR Page ===\n", text)
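
Steps 2 and 3 above repeat the same `importlib` pattern. One way to remove the duplication is a small helper, sketched below. The name `load_module_from_file` is hypothetical and not part of the repository, and the demo uses a throwaway local file so it runs without network access.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_file(name: str, path: str) -> ModuleType:
    """Import a standalone .py file and register it under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports inside the file can resolve it.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Offline demo with a temporary module file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_module.py"
    demo_path.write_text("GREETING = 'salam'\n", encoding="utf-8")
    demo = load_module_from_file("demo_module", str(demo_path))

print(demo.GREETING)
```

With this helper, the model and utility files could be loaded as `load_module_from_file("model", model_file)` and `load_module_from_file("utils", utils_file)`, using the paths returned by `hf_hub_download`.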

