
FlatOrcamaid-13b-v0.2 4bit MLX

MLX quants of NeverSleep/FlatOrcamaid-13b-v0.2

This is an MLX quant of FlatOrcamaid; MLX is for use with Apple silicon. The 4bpw quant seems to work well on my 16 GB M1 MBP; 8bpw needs more RAM.

Documentation on MLX

Other Quants:

- MLX: 8bit, 4bit

- Exllama: 8bpw, 6bpw, 5bpw, 4bpw

Prompt template: Custom format, or Alpaca

Custom format:

SillyTavern config files: Context, Instruct.

Alpaca:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
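If you are building the Alpaca prompt in code rather than pasting it into a front end, the template above can be filled in with a small helper. This is a minimal sketch; the `ALPACA_TEMPLATE` constant and `format_alpaca` function are my own names, not part of MLX or any library.

```python
# Alpaca prompt format as shown in this model card.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{prompt}\n\n"
    "### Response:\n"
)


def format_alpaca(instruction: str) -> str:
    """Fill the {prompt} slot with the user's instruction."""
    return ALPACA_TEMPLATE.format(prompt=instruction)


if __name__ == "__main__":
    print(format_alpaca("Summarize the plot of Hamlet in one sentence."))
```

The string returned by `format_alpaca` can then be passed as the prompt to whatever MLX generation code you are using.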

Contact

Kooten on Discord.
