I can't get the model to run.

#8
by andresinaka - opened

Hi, I'm new to this world and I'm trying to run the model, but I'm failing.

My code looks like:

import torch
from transformers import AutoModelForCausalLM

from deepseek_vl2.models import DeepseekVLV2Processor, DeepseekVLV2ForCausalLM
from deepseek_vl2.utils.io import load_pil_images


# specify the path to the model
model_path = "deepseek-ai/deepseek-vl2-tiny"

When I run it, I get an error:

> python my_extraction.py

Traceback (most recent call last):
  File "/Users/andresinaka/Desktop/LLMs/my_extraction.py", line 4, in <module>
    from deepseek_vl2.models import DeepseekVLV2Processor, DeepseekVLV2ForCausalLM
ModuleNotFoundError: No module named 'deepseek_vl2'

Not sure what's going on; I'm following the README...

I have torch installed:

pip show torch
Name: torch
Version: 2.5.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: /Users/andres.canal/Desktop/LLMs/myenv/lib/python3.12/site-packages
Requires: filelock, fsspec, jinja2, networkx, setuptools, sympy, typing-extensions
Required-by: accelerate, easyocr, llms, torchvision

I have transformers installed:

pip show transformers
Name: transformers
Version: 4.49.0.dev0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /Users/andres.canal/Desktop/LLMs/myenv/lib/python3.12/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: llms

I also didn't manage to get it working, but it seems I'm a few steps ahead of you. First of all, you should
git clone https://github.com/deepseek-ai/DeepSeek-VL2.git
then install the module with the following command, according to the README:
pip install -e .

After that you will notice it requires torch==2.0.1... and that torch version is only available for Python 3.10 (not higher...) :)
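
Putting it together, the setup that got me this far looks roughly like this (assuming Python 3.10 is installed; the venv name is just an example):

python3.10 -m venv vl2-env
source vl2-env/bin/activate
git clone https://github.com/deepseek-ai/DeepSeek-VL2.git
cd DeepSeek-VL2
pip install -e .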

Now I'm stuck on a missing xformers module error:

File ~\python_virtual_envs\torch_test\lib\site-packages\deepseek_vl2\models\siglip_vit.py:16
14 from timm.models._manipulate import named_apply, checkpoint_seq, adapt_input_conv
15 from transformers.modeling_utils import is_flash_attn_2_available
---> 16 from xformers.ops import memory_efficient_attention
17 from functools import partial
20 if is_flash_attn_2_available():

ModuleNotFoundError: No module named 'xformers'

Thanks @beednarz-p100 !

I really can't understand why they don't make this friendlier to use, with better documentation. I'm so used to Ollama that this feels like a nightmare hahahah...

Hi @beednarz-p100 , I was also stuck on the xformers error. I tried a few solutions, such as installing different xformers versions based on the CUDA version. You can get them from here: https://github.com/facebookresearch/xformers#installing-xformers
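
For example, something along these lines (the cu121 index URL is just an example taken from their install instructions; the right one depends on your CUDA build):

pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu121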

My CUDA version is 12.2, so I tried the xformers build for CUDA 12.4, but I couldn't get it to work.

Please let me know if you were able to get it to work.

Hello @miral-songhela ,
DeepSeek-VL2 uses torch==2.0.1, so you should be able to make it work by installing xformers==0.0.20 --> https://github.com/facebookresearch/xformers/issues/752#issuecomment-1555756372
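
That is, something like this (inside the same Python 3.10 / torch==2.0.1 environment):

pip install xformers==0.0.20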
