Mol-VL is a Vision-Language Model for Optical Chemical Structure Understanding (OCSU).
To take advantage of existing pretrained VLMs, we adopt the weights from Qwen2-VL. Mol-VL-7B is further finetuned on Vis-CheBI20 training set.
For technical details, please refer to OCSU. Training and evaluation scripts will be available recently, stay tuned!
If you find our work useful in your research, please consider citing:
@article{fan2025ocsu,
title={OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery},
author={Fan, Siqi and Xie, Yuguang and Cai, Bowen and Xie, Ailin and Liu, Gaochao and Qiao, Mu and Xing, Jie and Nie, Zaiqing},
journal={arXiv preprint arXiv:2501.15415},
year={2025}
}
- Downloads last month
- 0
Inference Providers
NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API:
The model has no library tag.