# PaddleOCR v4 (PP-OCRv4) ## Model Description **PP-OCRv4** is the fourth-generation end-to-end optical character recognition system from the PaddlePaddle team. It combines a lightweight **text detection → angle classification → text recognition** pipeline with improved training techniques and data augmentation, delivering higher accuracy and robustness while staying efficient for real-time use. PP-OCRv4 supports multilingual OCR (Latin and non-Latin scripts), irregular layouts (rotated/curved text), and challenging inputs such as noisy or low-resolution images often found in mobile and document-scan scenarios. ## Features - **End-to-end OCR**: text detection, optional angle classification, and text recognition in one pipeline. - **Multilingual support**: pretrained models for English, Chinese, and dozens of other languages; easy finetuning for domain text. - **Robust in real-world conditions**: handles rotation, perspective distortion, blur, low light, and complex backgrounds. - **Lightweight & fast**: practical for both mobile apps and large-scale server deployments. - **Flexible I/O**: works with photos, scans, screenshots, receipts, invoices, ID cards, dashboards, and UI text. - **Extensible**: swap components (detector/recognizer), add language packs, or finetune on domain datasets. ## Use Cases - Document digitization (invoices, receipts, forms, contracts) - RPA and back-office automation (screen/OCR flows) - Mobile scanning apps and camera-based translation/read-aloud - Industrial and retail analytics (labels, price tags, shelf tags) - Accessibility (screen-readers and read-aloud applications) ## Inputs and Outputs **Input**: Image (photo, scan, or screenshot). **Output**: A list of detected text regions, each with: - bounding box (rectangular or polygonal) - recognized text string - optional confidence score and orientation --- ## How to use > ⚠️ **Hardware requirement:** the model currently runs **only on Qualcomm NPUs** (e.g., Snapdragon-powered AIPC). > Apple NPU support is planned next. ### 1) Install Nexa-SDK - Download and follow the steps under "Deploy Section" Nexa's model page: [Download Windows arm64 SDK](https://sdk.nexa.ai/model/PaddleOCR%20v4) - (Other platforms coming soon) ### 2) Get an access token Create a token in the Model Hub, then log in: ```bash nexa config set license '' ``` ### 3) Run the model Running: ```bash nexa infer NexaAI/paddleocr-npu ``` --- ## License - Licensed under [Apache-2.0](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/LICENSE) ## References - GitHub repo: [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) - Model zoo & documentation: [Models list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/models_list_en.md)