Update README.md
README.md CHANGED
@@ -150,16 +150,7 @@ print(output_text)
 ## References
 
 * **YaRN: Efficient Context Window Extension of Large Language Models**
-[https://arxiv.org/pdf/2309.00071](https://arxiv.org/pdf/2309.00071)
-
 * **Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution**
-[https://arxiv.org/pdf/2409.12191](https://arxiv.org/pdf/2409.12191)
-
 * **Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond**
-[https://arxiv.org/pdf/2308.12966](https://arxiv.org/pdf/2308.12966)
-
 * **A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy**
-
-
-* **Ground-R1: Incentivizing Grounded Visual Reasoning via Reinforcement Learning**
-[https://arxiv.org/pdf/2505.20272](https://arxiv.org/pdf/2505.20272)
+* **Ground-R1: Incentivizing Grounded Visual Reasoning via Reinforcement Learning**