# luigi86/Q3-30b-A3b-Pentiment_mlx-8bpw

This model (`luigi86/Q3-30b-A3b-Pentiment_mlx-8bpw`) was converted to MLX format from `allura-org/Q3-30b-A3b-Pentiment` using mlx-lm version 0.24.1.

## Use with mlx

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("luigi86/Q3-30b-A3b-Pentiment_mlx-8bpw")

prompt = "hello"

# If the tokenizer ships a chat template, format the prompt with it.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```

## Original model card

### Pentiment

*Submit to silence.*

Not too sure how I feel about this one, but yolo! :3
Triple-stage RP/general tune of Qwen3-30B-A3B Base (finetuned, merged for stabilization, aligned).

#### Format

Use ChatML. Thinking may or may not work; YMMV!
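For reference, a minimal sketch of what a ChatML prompt looks like when built by hand (illustrative only; `to_chatml` is a hypothetical helper, and in practice `tokenizer.apply_chat_template` does this for you):

```python
# Build a ChatML-style prompt by hand (illustrative sketch only;
# normally tokenizer.apply_chat_template handles this).
def to_chatml(messages):
    # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # End with the opening of an assistant turn so the model
    # continues generating from there.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful roleplay assistant."},
    {"role": "user", "content": "hello"},
]
print(to_chatml(messages))
```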

#### Quants

GGUF:

- todo! :3

EXL2:

EXL3:

GPTQ (8bit and 4bit):

#### Thankses

Special thanks to Alibaba for training the base model and the regular instruct model, as well as Gryphe for training the Pantheon model also used in the merging step. Special thanks to Artus for making the exllama quants. Special thanks to allura for being cute :3

#### Postmortem

Never merge with Qwen 3 Instruct. It's not worth it: it will destroy your model and turn it right back into Qwen 3 Instruct, with all of its issues.
