prithivMLmods committed on
Commit 9738a7f · verified · 1 Parent(s): a54254d

Update README.md

Files changed (1)
  1. README.md +12 -6
README.md CHANGED
@@ -15,12 +15,17 @@ tags:
  - Latex
  - VLM
  - Plain_Text
+ - KIE
+ - Equations
+ - VQA
  ---
- # Qwen2-VL-OCR-2B-Instruct [ VL / OCR ]
+ # **Qwen2-VL-OCR-2B-Instruct [ VL / OCR ]**
 
  ![aaaaaaaaaaa.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/s42kASSQCoJAyYMJkoEuD.png)
 
- The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-VL-2B-Instruct**, tailored for tasks that involve **Optical Character Recognition (OCR)**, **image-to-text conversion**, and **math problem solving with LaTeX formatting**. This model integrates a conversational approach with visual and textual understanding to handle multi-modal tasks effectively.
+ > The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-VL-2B-Instruct**, tailored for tasks that involve **Optical Character Recognition (OCR)**, **image-to-text conversion**, and **math problem solving with LaTeX formatting**. This model integrates a conversational approach with visual and textual understanding to handle multi-modal tasks effectively.
+
+ [![Open Demo in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://huggingface.co/prithivMLmods/Qwen2-VL-OCR-2B-Instruct/blob/main/Demo/ocrtest_qwen.ipynb)
 
  #### Key Enhancements:
 
@@ -32,6 +37,11 @@ The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-V
 
  * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.
 
+
+ ### Sample Inference
+
+ ![123.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/TlsmcTqoQMvaBhwo8tGeU.png)
+
  | **File Name** | **Size** | **Description** | **Upload Status** |
  |---------------------------|------------|------------------------------------------------|-------------------|
  | `.gitattributes` | 1.52 kB | Configures LFS tracking for specific model files. | Initial commit |
@@ -46,11 +56,7 @@ The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-V
  | `vocab.json` | 2.78 MB | Vocabulary file for tokenization. | Uploaded |
 
  ---
- ### Sample Inference with Doc
-
- ![123.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/TlsmcTqoQMvaBhwo8tGeU.png)
 
- **📍Demo**: https://huggingface.co/prithivMLmods/Qwen2-VL-OCR-2B-Instruct/blob/main/Demo/ocrtest_qwen.ipynb
  ### How to Use
 
  ```python