# luigi86/Q3-30b-A3b-Pentiment_mlx-8bpw

This model (`luigi86/Q3-30b-A3b-Pentiment_mlx-8bpw`) was converted to MLX format from `allura-org/Q3-30b-A3b-Pentiment` using mlx-lm version 0.24.1.

## Use with mlx

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("luigi86/Q3-30b-A3b-Pentiment_mlx-8bpw")

prompt = "hello"

# If the tokenizer ships a chat template, format the prompt with it.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```

## Original model card

### Pentiment

*Submit to silence.*

Not too sure how I feel about this one, but yolo! :3
Triple-stage RP/general tune of Qwen3-30B-A3B Base (finetuned, merged for stabilization, aligned).

#### Format

Use ChatML. Thinking may or may not work; YMMV!
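For reference, a minimal sketch of what a ChatML prompt looks like when built by hand (illustrative only; `to_chatml` is a hypothetical helper, and in practice `tokenizer.apply_chat_template` does this for you):

```python
# Build a ChatML-style prompt by hand (illustrative sketch only;
# normally tokenizer.apply_chat_template handles this).
def to_chatml(messages):
    # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # End with the opening of an assistant turn so the model
    # continues generating from there.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful roleplay assistant."},
    {"role": "user", "content": "hello"},
]
print(to_chatml(messages))
```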

#### Quants

GGUF:

- todo! :3

EXL2:

EXL3:

GPTQ (8bit and 4bit):

#### Thankses

Special thanks to Alibaba for training the base model and the regular instruct model, as well as Gryphe for training the Pantheon model also used in the merging step. Special thanks to Artus for making the exllama quants. Special thanks to allura for being cute :3

#### Postmortem

Never merge with Qwen 3 Instruct. It's not worth it: it will destroy your model and turn it right back into Qwen 3 Instruct, with all of its issues.
