Description: This model is a fine-tuned version of MobileNetV2 designed for Optical Character Recognition (OCR) of rare and extended Unicode characters, including phonetic symbols (IPA), Cyrillic extensions, Greek numerals, and archaic Latin letters. It is optimized to detect and classify these characters in visually complex environments featuring blurred backgrounds, varying colors, and rotated glyphs. The model is lightweight and suitable for real-time applications or deployment on edge devices.
Use Cases:
Linguistic text recognition (ancient, phonetic, or symbolic scripts)
Custom CAPTCHAs
OCR preprocessing in complex visual settings
Fine-tuning Details:
Base model: google/mobilenet_v2_1.0_224 (MobileNetV2)
Dataset: Synthetic dataset with character-level images featuring randomized rotation, color variation, and background blur
Input size: 160×160 RGB
Output: Multi-class classification over 13 unique symbols
Example Input/Output:
Input: Image of a rotated, colored character on a textured background
Output: Unicode label, e.g., 'ʕ' or 'ϸ'
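To illustrate how a 13-way classification output could be turned into a Unicode label, here is a minimal sketch in plain Python. It assumes the model returns one raw score (logit) per class. The label list is hypothetical: only 'ʕ' and 'ϸ' appear in this card, so the other eleven entries are placeholders, and the class order is an assumption as well. The names `LABELS`, `softmax`, and `decode` are illustrative and not part of the model's API.

```python
import math

# Hypothetical label list: only 'ʕ' and 'ϸ' come from the model card.
# The other 11 entries are placeholders, and the order is assumed.
LABELS = ["ʕ", "ϸ"] + [f"<class_{i}>" for i in range(2, 13)]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return the most likely label and its probability."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for illustration: class index 1 has the highest score.
example_logits = [0.1] * 13
example_logits[1] = 4.0
label, confidence = decode(example_logits)
print(label, round(confidence, 3))
```

In practice the label order should be read from the model's own configuration rather than hard-coded as it is here.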
Training log (final epoch):
Epoch 5 finished | Avg Loss: 1.7347 | Avg Accuracy: 0.9908
Epoch 5/5 | Validation Loss: 1.7091 | Validation Accuracy: 0.9954
Training Details: The model was fine-tuned on a small custom dataset, synthetically generated to simulate challenging OCR conditions. The validation set is one tenth of the total dataset, which provides a basic generalization check while leaving most of the limited data for training.
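A 90/10 split like the one described above could be produced as in the sketch below. This is not the author's actual procedure: the card does not say whether the split was random or stratified. The function name `split_train_val`, the fixed seed, and the file names are all hypothetical.

```python
import random


def split_train_val(samples, val_fraction=0.1, seed=0):
    """Shuffle samples reproducibly and hold out a fraction for validation."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # local RNG, global state untouched
    n_val = max(1, int(len(items) * val_fraction))
    return items[n_val:], items[:n_val]


# Hypothetical file names standing in for the synthetic character images.
samples = [f"img_{i:04d}.png" for i in range(1000)]
train, val = split_train_val(samples)
print(len(train), len(val))  # 900 100
```

With only 13 classes and limited data, a stratified split (equal class proportions in both subsets) may be a safer choice than the purely random one shown here.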
Model tree for Blast02/ocr-special-characters-classification-mobilenetv2
Base model
google/mobilenet_v2_1.0_224