Tags: Text Generation, MLX, Safetensors, mixtral, all use cases, creative, creative writing, all genres, tool calls, tool use, llama 3.1, llama-3, llama3, llama-3.1, problem solving, deep thinking, reasoning, deep reasoning, story, writing, fiction, roleplaying, bfloat16, role play, sillytavern, backyard, context 128k, mergekit, Merge, Mixture of Experts, conversational, 6-bit
Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q6-mlx
This model, Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q6-mlx, was converted to MLX format from DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B using mlx-lm version 0.26.0.
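For reference, a 6-bit conversion like this one can be reproduced with the mlx_lm.convert command that ships with mlx-lm. The exact flags used for this repo are not recorded on the card, so the invocation below is a sketch under assumed settings:

# Assumed reconstruction of the conversion step, not the original command.
# --hf-path points at the source repo; -q with --q-bits 6 requests 6-bit quantization;
# --mlx-path names the output directory.
mlx_lm.convert \
    --hf-path DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B \
    -q --q-bits 6 \
    --mlx-path Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q6-mlx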
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate

# Load the quantized model and its tokenizer from the local path or Hub repo
model, tokenizer = load("Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q6-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
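The model can also be run without writing any Python, via the mlx_lm.generate entry point that mlx-lm installs. A minimal sketch; the prompt and token budget below are placeholders:

# Generate from the command line; --max-tokens caps the response length.
mlx_lm.generate \
    --model Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-q6-mlx \
    --prompt "hello" \
    --max-tokens 256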