prithivMLmods/Bpe-vocab-n-OCR

@@ -5,30 +5,27 @@ language:
 - zh
 base_model:
 - prithivMLmods/Qwen2-VL-OCR-2B-Instruct
-pipeline_tag: image-text-to-text
 library_name: transformers
 tags:
 - text-generation-inference
 ---
-![xvzxfv.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/2MNYn7ZsVkqX9lGVkJV47.png)
-# **Tokenized-OCR**
-**Tokenized-OCR** is an advanced OCR-based text extraction tool optimized for generating structured, tokenized outputs. Built upon a powerful vision-language architecture with enhanced OCR and multilingual support, Tokenized-OCR accurately extracts text from images and returns it as a comma-separated sequence.
 #### Key Enhancements:
-* **Advanced OCR Engine**: Fine-tuned on extensive datasets, Tokenized-OCR ensures precise text recognition and tokenization.
 * **Optimized for Tokenized Output**: Produces structured comma-separated text, making it ideal for downstream NLP tasks, automation pipelines, and database integrations.
 * **Enhanced Multilingual OCR**: Supports text extraction in multiple languages, including English, Chinese, Japanese, Korean, Arabic, and more.
 * **Multimodal Processing**: Seamlessly processes both image and text inputs, providing structured tokenized outputs.
 * **Secure and Optimized Model Weights**: Employs safetensors for efficient and secure model loading.
-### Demo Inference
-```python
-Instruction : "Extract and return the tokenized OCR text from the image, ensuring separated by commas."
-```
 ![sdsdfsd.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XT1Qe2WxVzclETv6Rmgfs.png)
 ### How to Use
@@ -37,7 +34,7 @@ Instruction : "Extract and return the tokenized OCR text from the image, ensurin
 from transformers import Qwen2VLForConditionalGeneration, AutoTokenizer, AutoProcessor
 from qwen_vl_utils import process_vision_info
-# Load the Tokenized-OCR model with optimized parameters
 model = Qwen2VLForConditionalGeneration.from_pretrained(
     "prithivMLmods/Tokenized-OCR", torch_dtype="auto", device_map="auto"
 )
@@ -50,7 +47,7 @@ model = Qwen2VLForConditionalGeneration.from_pretrained(
 #     device_map="auto",
 # )
-# Load the default processor for Tokenized-OCR
 processor = AutoProcessor.from_pretrained("prithivMLmods/Tokenized-OCR")
 # Define the input messages with both an image and a text prompt
@@ -94,19 +91,22 @@ print(output_text)
 ### **Key Features**
-1. **High-Accuracy OCR Processing**
-   - Extracts and tokenizes text from images with exceptional precision.
-2. **Multilingual Text Recognition**
-   - Supports multiple languages, ensuring comprehensive OCR capabilities.
-3. **Comma-Separated Tokenized Output**
-   - Generates structured text for seamless NLP and data processing tasks.
-4. **Efficient Image & Text Processing**
-   - Handles both visual and textual inputs, ensuring accurate OCR-based extraction.
-5. **Optimized for Secure Deployment**
-   - Uses safetensors for enhanced security and model efficiency.
-**Tokenized-OCR** revolutionizes text extraction from images, providing tokenized outputs that are easy to integrate into automated workflows, search engines, and language processing applications.

 - zh
 base_model:
 - prithivMLmods/Qwen2-VL-OCR-2B-Instruct
+pipeline_tag: image-to-text
 library_name: transformers
 tags:
 - text-generation-inference
+- bpe
+- ocr
 ---
+# **Bpe-vocab-n-OCR**
+![bpe.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/f3-ZGWYpFHmRn7L-hSvtt.png)
+**Bpe-vocab-n-OCR** is an advanced OCR-based text extraction tool optimized for generating structured, tokenized outputs. Built upon a powerful vision-language architecture with enhanced OCR and multilingual support, Bpe-vocab-n-OCR accurately extracts text from images and returns it as a comma-separated sequence.
 #### Key Enhancements:
+* **Advanced OCR Engine**: Fine-tuned on extensive datasets, Bpe-vocab-n-OCR ensures precise text recognition and tokenization.
 * **Optimized for Tokenized Output**: Produces structured comma-separated text, making it ideal for downstream NLP tasks, automation pipelines, and database integrations.
 * **Enhanced Multilingual OCR**: Supports text extraction in multiple languages, including English, Chinese, Japanese, Korean, Arabic, and more.
 * **Multimodal Processing**: Seamlessly processes both image and text inputs, providing structured tokenized outputs.
 * **Secure and Optimized Model Weights**: Employs safetensors for efficient and secure model loading.
 ![sdsdfsd.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XT1Qe2WxVzclETv6Rmgfs.png)
 ### How to Use
 from transformers import Qwen2VLForConditionalGeneration, AutoTokenizer, AutoProcessor
 from qwen_vl_utils import process_vision_info
+# Load the Bpe-vocab-n-OCR model with optimized parameters
 model = Qwen2VLForConditionalGeneration.from_pretrained(
     "prithivMLmods/Tokenized-OCR", torch_dtype="auto", device_map="auto"
 )
 #     device_map="auto",
 # )
+# Load the default processor for Bpe-vocab-n-OCR
 processor = AutoProcessor.from_pretrained("prithivMLmods/Tokenized-OCR")
 # Define the input messages with both an image and a text prompt
 ### **Key Features**
+1. **High-Accuracy OCR Processing**
+   * Extracts and tokenizes text from images with exceptional precision.
+2. **Multilingual Text Recognition**
+   * Supports multiple languages, ensuring comprehensive OCR capabilities.
+3. **Comma-Separated Tokenized Output**
+   * Generates structured text for seamless NLP and data processing tasks.
+4. **Efficient Image & Text Processing**
+   * Handles both visual and textual inputs, ensuring accurate OCR-based extraction.
+5. **Optimized for Secure Deployment**
+   * Uses safetensors for enhanced security and model efficiency.
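
The comma-separated output described under Key Features can be split back into individual tokens before it is passed to downstream processing. The following is a minimal sketch, assuming the model returns a single string with commas as separators; the helper name `split_tokens` and the sample string are hypothetical and not part of the model card.

```python
def split_tokens(output_text: str) -> list:
    """Split a comma-separated OCR string into a list of non-empty tokens."""
    # Strip whitespace around each piece and drop empty pieces left by
    # leading, trailing, or doubled commas.
    return [piece.strip() for piece in output_text.split(",") if piece.strip()]


# Hypothetical sample string; real model output may look different.
sample = "Hello, world, 2024,, OCR "
tokens = split_tokens(sample)
print(tokens)  # ['Hello', 'world', '2024', 'OCR']
```

A plain split like this cannot tell separator commas from commas that appear in the recognized text itself, so it is only suitable when the source text is unlikely to contain commas.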