# visionOCR-3B-061125-GGUF
The visionOCR-3B-061125 model is a fine-tuned version of Qwen/Qwen2.5-VL-3B-Instruct, optimized for Document-Level Optical Character Recognition (OCR), long-context vision-language understanding, and accurate image-to-text conversion with mathematical LaTeX formatting. Built on top of the Qwen2.5-VL architecture, this model significantly improves document comprehension, structured data extraction, and visual reasoning across diverse input formats.
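One common way to run GGUF files like these locally is llama.cpp's multimodal CLI. The sketch below rests on several assumptions: a recent llama.cpp build that ships the `llama-mtmd-cli` tool, a matching vision-projector (mmproj) GGUF (none is listed in the file table below), and placeholder local file names.

```shell
# Sketch only: assumes a recent llama.cpp build with multimodal support
# and a matching mmproj (vision projector) file, which is not listed in
# the file table on this card. File names here are placeholders.
./llama-mtmd-cli \
  -m visionOCR-3B-061125-Q4_K_M.gguf \
  --mmproj mmproj-visionOCR-3B-061125.gguf \
  --image invoice.png \
  -p "Extract the text from this document, formatting equations in LaTeX."
```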
## Model Files
| File Name | Size | Format | Description |
|---|---|---|---|
| visionOCR-3B-061125-BF16.gguf | 6.18 GB | BF16 | Brain floating point, 16-bit (full precision) |
| visionOCR-3B-061125-Q6_K.gguf | 2.54 GB | Q6_K | 6-bit quantized |
| visionOCR-3B-061125-Q5_K_M.gguf | 2.22 GB | Q5_K_M | 5-bit quantized, medium quality |
| visionOCR-3B-061125-Q4_K_M.gguf | 1.93 GB | Q4_K_M | 4-bit quantized, medium quality |
| visionOCR-3B-061125-Q3_K_M.gguf | 1.59 GB | Q3_K_M | 3-bit quantized, medium quality |
| visionOCR-3B-061125-Q3_K_S.gguf | 1.45 GB | Q3_K_S | 3-bit quantized, small variant (lower quality) |
| visionOCR-3B-061125-Q2_K.gguf | 1.27 GB | Q2_K | 2-bit quantized |
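As a rough sanity check on these figures, the effective bits-per-weight of each quant can be backed out from the BF16 file size (2 bytes per weight). The sketch below ignores GGUF metadata overhead and the split between quantized and non-quantized tensors, so the numbers are approximate.

```python
# Back-of-the-envelope check of the table above: derive an approximate
# weight count from the BF16 file (2 bytes per weight) and compute the
# effective bits-per-weight of each quant. Metadata overhead and
# non-quantized tensors are ignored (assumption), so numbers are rough.
SIZES_GB = {
    "BF16": 6.18,
    "Q6_K": 2.54,
    "Q5_K_M": 2.22,
    "Q4_K_M": 1.93,
    "Q3_K_M": 1.59,
    "Q3_K_S": 1.45,
    "Q2_K": 1.27,
}

PARAMS = SIZES_GB["BF16"] * 1e9 / 2  # ~3.09e9 weights at 16 bits each

def bits_per_weight(quant: str) -> float:
    """Effective average bits stored per weight for a given quant type."""
    return SIZES_GB[quant] * 1e9 * 8 / PARAMS

for quant in SIZES_GB:
    print(f"{quant:8s} ~{bits_per_weight(quant):.2f} bits/weight")
```

For example, Q6_K works out to roughly 6.6 bits per weight, in line with its name plus k-quant overhead.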
## Quants Usage
(Sorted by size, not necessarily by quality; IQ-quants are often preferable to similarly sized non-IQ quants.)
For choosing among the lower-bit quant types, ikawrakow's comparison graph (lower is better) is a useful reference.
## Model tree for prithivMLmods/visionOCR-3B-061125-GGUF

- Base model: Qwen/Qwen2.5-VL-3B-Instruct
- Finetuned from: prithivMLmods/visionOCR-3B-061125