CoF-SFT-VL Model
This is a supervised fine-tuned (SFT) vision-language model based on Qwen/Qwen2.5-VL-7B-Instruct
. It is trained on the CoF-SFT-Data-5.4k dataset, which contains 5.4k image-text reasoning examples.
Model Details
- Base model: Qwen/Qwen2.5-VL-7B-Instruct
- Training data: 5.4k curated reasoning samples from xintongzhang/CoF-SFT-Data-5.4k
- Framework: Transformers
Resources
- Project page: https://cof-reasoning.github.io/
- Paper: https://arxiv.org/abs/2505.15436
- Downloads last month
- 14
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support