Add pipeline tag and paper link to QARI-OCR model card
#1 opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,13 +1,13 @@
 ---
 library_name: transformers
-tags: []
+tags:
+- image-to-text
 ---
 
 # Model Card for Model ID
 
 <!-- Provide a quick summary of what the model is/does. -->
-
-
+This model is designed for Arabic Optical Character Recognition (OCR).
 
 ## Model Details
 
@@ -17,20 +17,20 @@ tags: []
 
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
-- **Developed by:**
+- **Developed by:** Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
-- **Model type:**
-- **Language(s) (NLP):**
+- **Model type:** Vision-Language Model for OCR
+- **Language(s) (NLP):** Arabic
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:**
+- **Finetuned from model [optional]:** Qwen2-VL-2B-Instruct
 
 ### Model Sources [optional]
 
 <!-- Provide the basic links for the model. -->
 
 - **Repository:** [More Information Needed]
-- **Paper
+- **Paper:** [QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation](https://huggingface.co/papers/2506.02295)
 - **Demo [optional]:** [More Information Needed]
 
 ## Uses
@@ -41,7 +41,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
-
+This model can be directly used for recognizing Arabic text in images.
 
 ### Downstream Use [optional]
 
@@ -53,7 +53,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 
 <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
-
+This model is specifically designed for Arabic text and might not perform well on other languages.
 
 ## Bias, Risks, and Limitations
 
@@ -79,7 +79,7 @@ Use the code below to get started with the model.
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-
+Trained on specialized synthetic datasets.
 
 ### Training Procedure
 
@@ -174,7 +174,16 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 **BibTeX:**
 
-
+```
+@misc{QariOCR2025,
+title={QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation},
+author={Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila},
+year={2025},
+archivePrefix={arXiv},
+url={https://arxiv.org/abs/2506.02295},
+note={Accessed: 2025-03-03}
+}
+```
 
 **APA:**
 
@@ -196,15 +205,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 ## Model Card Contact
 
-[More Information Needed]
-
-```
-@misc{QariOCR2025,
-title={QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation},
-author={Ahmed Wasfy, Omer Nacar, Abdelakreem Elkhateb, Mahmoud Reda, Omar Elshehy, Adel Ammar, Wadii Boulila},
-year={2025},
-archivePrefix={arXiv},
-url={https://arxiv.org/abs/2506.02295},
-note={Accessed: 2025-03-03}
-}
-```
+[More Information Needed]
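The first hunk changes the card's metadata block from an empty `tags: []` to a list containing `image-to-text`. A minimal sketch of how that change reads once parsed is shown below. The helper `parse_front_matter` and the `OLD_CARD` / `NEW_CARD` strings are hypothetical, written only for this illustration; the parser is a deliberately tiny stand-in that handles flat `key: value` pairs and simple lists, not a full YAML implementation.

```python
from typing import Dict, List, Optional, Union

Value = Union[str, List[str]]


def parse_front_matter(card_text: str) -> Dict[str, Value]:
    """Parse the leading '---' block of a model card into a dict.

    Hypothetical helper: supports only flat `key: value` pairs and
    block lists of plain strings, which covers the metadata in this diff.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present

    meta: Dict[str, Value] = {}
    current_key: Optional[str] = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item belonging to the most recent key.
            items = meta[current_key]
            if not isinstance(items, list):
                items = []
                meta[current_key] = items
            items.append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A literal [] is an empty list; anything else is kept as a string.
            meta[current_key] = [] if value == "[]" else value
    return meta


# Front matter before and after the change, copied from the first hunk.
OLD_CARD = """---
library_name: transformers
tags: []
---

# Model Card for Model ID
"""

NEW_CARD = """---
library_name: transformers
tags:
- image-to-text
---

# Model Card for Model ID
"""

old_meta = parse_front_matter(OLD_CARD)
new_meta = parse_front_matter(NEW_CARD)
print(old_meta["tags"])  # []
print(new_meta["tags"])  # ['image-to-text']
```

For real cards, a proper YAML parser or `ModelCard.load` from `huggingface_hub` would likely be the more robust choice; the sketch only makes the before/after metadata concrete.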