prithivMLmods committed
Commit 78389d3 · verified · 1 Parent(s): 384fde3

Update README.md (#1)


- Update README.md (4def177d643981761b9fa066253c1b2b52bfaf07)

Files changed (1)
  1. README.md +24 -24
README.md CHANGED
@@ -5,30 +5,27 @@ language:
  - zh
  base_model:
  - prithivMLmods/Qwen2-VL-OCR-2B-Instruct
- pipeline_tag: image-text-to-text
  library_name: transformers
  tags:
  - text-generation-inference
  ---
- ![xvzxfv.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/2MNYn7ZsVkqX9lGVkJV47.png)
- # **Tokenized-OCR**

- **Tokenized-OCR** is an advanced OCR-based text extraction tool optimized for generating structured, tokenized outputs. Built upon a powerful vision-language architecture with enhanced OCR and multilingual support, Tokenized-OCR accurately extracts text from images and returns it as a comma-separated sequence.

  #### Key Enhancements:

- * **Advanced OCR Engine**: Fine-tuned on extensive datasets, Tokenized-OCR ensures precise text recognition and tokenization.
  * **Optimized for Tokenized Output**: Produces structured comma-separated text, making it ideal for downstream NLP tasks, automation pipelines, and database integrations.
  * **Enhanced Multilingual OCR**: Supports text extraction in multiple languages, including English, Chinese, Japanese, Korean, Arabic, and more.
  * **Multimodal Processing**: Seamlessly processes both image and text inputs, providing structured tokenized outputs.
  * **Secure and Optimized Model Weights**: Employs safetensors for efficient and secure model loading.

- ### Demo Inference
-
- ```python
- Instruction : "Extract and return the tokenized OCR text from the image, ensuring separated by commas."
- ```
-
  ![sdsdfsd.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XT1Qe2WxVzclETv6Rmgfs.png)

  ### How to Use
  ### How to Use
@@ -37,7 +34,7 @@ Instruction : "Extract and return the tokenized OCR text from the image, ensurin
  from transformers import Qwen2VLForConditionalGeneration, AutoTokenizer, AutoProcessor
  from qwen_vl_utils import process_vision_info

- # Load the Tokenized-OCR model with optimized parameters
  model = Qwen2VLForConditionalGeneration.from_pretrained(
  "prithivMLmods/Tokenized-OCR", torch_dtype="auto", device_map="auto"
  )
@@ -50,7 +47,7 @@ model = Qwen2VLForConditionalGeneration.from_pretrained(
  # device_map="auto",
  # )

- # Load the default processor for Tokenized-OCR
  processor = AutoProcessor.from_pretrained("prithivMLmods/Tokenized-OCR")

  # Define the input messages with both an image and a text prompt
@@ -94,19 +91,22 @@ print(output_text)
  ### **Key Features**

- 1. **High-Accuracy OCR Processing**
- - Extracts and tokenizes text from images with exceptional precision.

- 2. **Multilingual Text Recognition**
- - Supports multiple languages, ensuring comprehensive OCR capabilities.

- 3. **Comma-Separated Tokenized Output**
- - Generates structured text for seamless NLP and data processing tasks.

- 4. **Efficient Image & Text Processing**
- - Handles both visual and textual inputs, ensuring accurate OCR-based extraction.

- 5. **Optimized for Secure Deployment**
- - Uses safetensors for enhanced security and model efficiency.

- **Tokenized-OCR** revolutionizes text extraction from images, providing tokenized outputs that are easy to integrate into automated workflows, search engines, and language processing applications.
 
  - zh
  base_model:
  - prithivMLmods/Qwen2-VL-OCR-2B-Instruct
+ pipeline_tag: image-to-text
  library_name: transformers
  tags:
  - text-generation-inference
+ - bpe
+ - ocr
  ---
+ # **Bpe-vocab-n-OCR**

+ ![bpe.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/f3-ZGWYpFHmRn7L-hSvtt.png)
+
+ **Bpe-vocab-n-OCR** is an advanced OCR-based text extraction tool optimized for generating structured, tokenized outputs. Built upon a powerful vision-language architecture with enhanced OCR and multilingual support, Bpe-vocab-n-OCR accurately extracts text from images and returns it as a comma-separated sequence.

  #### Key Enhancements:

+ * **Advanced OCR Engine**: Fine-tuned on extensive datasets, Bpe-vocab-n-OCR ensures precise text recognition and tokenization.
  * **Optimized for Tokenized Output**: Produces structured comma-separated text, making it ideal for downstream NLP tasks, automation pipelines, and database integrations.
  * **Enhanced Multilingual OCR**: Supports text extraction in multiple languages, including English, Chinese, Japanese, Korean, Arabic, and more.
  * **Multimodal Processing**: Seamlessly processes both image and text inputs, providing structured tokenized outputs.
  * **Secure and Optimized Model Weights**: Employs safetensors for efficient and secure model loading.

  ![sdsdfsd.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/XT1Qe2WxVzclETv6Rmgfs.png)

  ### How to Use
 
  from transformers import Qwen2VLForConditionalGeneration, AutoTokenizer, AutoProcessor
  from qwen_vl_utils import process_vision_info

+ # Load the Bpe-vocab-n-OCR model with optimized parameters
  model = Qwen2VLForConditionalGeneration.from_pretrained(
  "prithivMLmods/Tokenized-OCR", torch_dtype="auto", device_map="auto"
  )
 
  # device_map="auto",
  # )

+ # Load the default processor for Bpe-vocab-n-OCR
  processor = AutoProcessor.from_pretrained("prithivMLmods/Tokenized-OCR")

  # Define the input messages with both an image and a text prompt
 
  ### **Key Features**

+ 1. **High-Accuracy OCR Processing**
+
+ * Extracts and tokenizes text from images with exceptional precision.
+
+ 2. **Multilingual Text Recognition**
+
+ * Supports multiple languages, ensuring comprehensive OCR capabilities.
+
+ 3. **Comma-Separated Tokenized Output**

+ * Generates structured text for seamless NLP and data processing tasks.

+ 4. **Efficient Image & Text Processing**

+ * Handles both visual and textual inputs, ensuring accurate OCR-based extraction.

+ 5. **Optimized for Secure Deployment**

+ * Uses safetensors for enhanced security and model efficiency.
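
The README describes the model's output as a comma-separated sequence intended for downstream NLP and data processing. As a rough illustration of that post-processing step, the sketch below splits such a string into a list of tokens. It is not part of the model or of `transformers`: the helper name `split_ocr_tokens` and the sample string are hypothetical, and it assumes the decoded output is plain text with commas between tokens.

```python
def split_ocr_tokens(output_text: str) -> list[str]:
    """Split comma-separated OCR output into a list of clean tokens."""
    # Trim whitespace around each piece and drop empty pieces, which
    # appear when the text has a trailing or doubled comma.
    return [piece.strip() for piece in output_text.split(",") if piece.strip()]


# Hypothetical model output, used only to illustrate the format.
sample_output = "Invoice, No, 1234, Total, 56.00, "
tokens = split_ocr_tokens(sample_output)
print(tokens)
```

Note that a plain split cannot tell a separator comma from a comma that is part of the recognized text, so text such as "1,234" would be broken into two tokens under this assumption.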