Latest SOTA models supported on Qualcomm NPU.
AI & ML interests
On-device AI deployment and research
Latest SOTA models supported on Intel NPU
Language models that take vision and/or audio input, hand-picked by the Nexa team.
NexaQuant compresses models with 100% accuracy recovery.
Nexa AI infrastructure for running Qwen3-VL on GPU, NPU, and CPU
- NexaAI/Qwen3-VL-4B-Instruct-GGUF
  Image-Text-to-Text • 4B • Updated • 23.2k • 24
- NexaAI/Qwen3-VL-4B-Thinking-GGUF
  Image-Text-to-Text • 4B • Updated • 6.89k • 6
- NexaAI/Qwen3-VL-8B-Instruct-GGUF
  Image-Text-to-Text • 8B • Updated • 24.8k • 18
- NexaAI/Qwen3-VL-8B-Thinking-GGUF
  Image-Text-to-Text • 8B • Updated • 13.4k • 12
Latest SOTA models supported on Apple Neural Engine
Text generation models in MLX format, hand-picked by the Nexa team.
Text generation models in GGUF format, hand-picked by the Nexa team.
Tiny, multimodal on-device models developed by Nexa AI.