DeepPerception / README.md
MaxyLee's picture
Update README.md
150ffbf verified
|
raw
history blame
1.07 kB
metadata
license: apache-2.0
language:
  - en
metrics:
  - accuracy
base_model:
  - Qwen/Qwen2-VL-7B-Instruct
pipeline_tag: image-text-to-text

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Xinyu Ma, Ziyang Ding, Zhicong Luo, Chi Chen, Zonghao Guo, Derek F. Wong, Xiaoyi Feng, Maosong Sun


This is the official repository of DeepPerception, an MLLM enhanced with cognitive visual perception capabilities.