mlx-lm not ready

#1 by Weiguo - opened
    model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-6bit")
                       ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/miniconda3/lib/python3.13/site-packages/mlx_lm/utils.py", line 266, in load
    model, config = load_model(model_path, lazy)
                    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/miniconda3/lib/python3.13/site-packages/mlx_lm/utils.py", line 184, in load_model
    model_class, model_args_class = get_model_classes(config=config)
                                    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/Users/xxxx/miniconda3/lib/python3.13/site-packages/mlx_lm/utils.py", line 71, in _get_classes
    raise ValueError(msg)
ValueError: Model type qwen3_next not supported.
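For context, the traceback shows mlx-lm resolving the model_type field from the repo's config.json to a module under mlx_lm/models/, so the error just means the installed version has no qwen3_next module yet. A quick check (a sketch; the mlx_lm.__version__ attribute and module path reflect my understanding of the package layout):

# Sketch: does the installed mlx-lm ship a qwen3_next model module?
# mlx-lm resolves config.json's "model_type" to mlx_lm.models.<model_type>.
import importlib.util
import mlx_lm

print("mlx-lm version:", mlx_lm.__version__)
print("qwen3_next supported:",
      importlib.util.find_spec("mlx_lm.models.qwen3_next") is not None)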

Encountered the same issue.

Use the main branch of mlx-lm.

Same issue here; tried on both LM Studio (version 0.3.25) and Osaurus.

Updating didn't help.

No luck for me either; I'm getting "Error in inspecting model architecture 'Qwen3NextForCausalLM'" in

(APIServer pid=85654) ERROR 09-14 20:01:04 [registry.py:449] Error in inspecting model architecture 'Qwen3NextForCausalLM'
(APIServer pid=85654) ERROR 09-14 20:01:04 [registry.py:449] Traceback (most recent call last):
(APIServer pid=85654) ERROR 09-14 20:01:04 [registry.py:449] File "/Users/ljubomir/python3-venv/torch313/lib/python3.13/site-packages/vllm/model_executor/models/registry.py", line 867, in _run_in_subprocess
(APIServer pid=85654) ERROR 09-14 20:01:04 [registry.py:449] returned.check_returncode()
(APIServer pid=85654) ERROR 09-14 20:01:04 [registry.py:449] ~~~~~~~~~~~~~~~~~~~~~~~~~^^

when running

(torch313) ljubomir@macbook2(:):~/llama.cpp$ VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 vllm serve models/Qwen3-Next-80B-A3B-Instruct-4bit --port 8000 --tensor-parallel-size 4 --max-model-len 262144

I updated/installed both mlx-lm and transformers from GitHub head, just in case, so I don't rely on releases:

$ uv pip install git+https://github.com/huggingface/transformers.git@main
$ uv pip install git+https://github.com/ml-explore/mlx-lm.git@main
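A quick sanity check (a sketch) that the from-source installs actually took effect rather than falling back to cached wheels:

# Sketch: both should now report versions corresponding to git main.
import mlx_lm
import transformers

print("transformers:", transformers.__version__)
print("mlx-lm:", mlx_lm.__version__)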

🥲 Failed to load model

Error when loading model: ValueError: Model type qwen3_next not supported.

Tried on LM Studio (version 0.3.25, Build 2).

MLX Community org

Should be working now. The new version has not been released yet, so you'd have to update LM Studio to the beta or install mlx-lm from source.

This is the command that worked for me to upgrade to the source version. If you have an older version of mlx-lm installed, you need to specify these flags. I believe LM Studio's fork of mlx-lm does not have support yet, so I have just been using mlx-lm outside of LM Studio for this model.

pip install --upgrade --force-reinstall git+https://github.com/ml-explore/mlx-lm.git
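For anyone running it outside LM Studio, here is a minimal load-and-generate sketch following the mlx-lm README pattern (the prompt text is just an example):

# Minimal sketch: generate with mlx-lm installed from source.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-6bit")

prompt = "Hello, who are you?"
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, verbose=True)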

Thank you. Indeed, LM Studio is running great now! Getting ~50 tps on an old M2 MacBook Pro. You guys making all this happen are the best! 😊
