# InfinityKuno-2x7B

GGUF-Imatrix quantizations of InfinityKuno-2x7B.
An experimental model built from Endevor/InfinityRP-v1-7B and SanjiWatsuki/Kunoichi-DPO-v2-7B, merged into a Mixture-of-Experts (MoE) model with 2x7B parameters.
## Perplexity

Measured with llama.cpp's `perplexity` tool on a private roleplay dataset.
| Format | PPL |
| --- | --- |
| FP16 | 3.2686 +/- 0.12496 |
| Q8_0 | 3.2738 +/- 0.12570 |
| Q5_K_M | 3.2589 +/- 0.12430 |
| IQ4_NL | 3.2689 +/- 0.12487 |
| IQ3_M | 3.3097 +/- 0.12233 |
| IQ2_M | 3.4658 +/- 0.13077 |
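For reference, perplexity is the exponential of the mean negative log-likelihood per token, so lower is better. A minimal sketch of the metric itself (illustrative only; the table above was produced by llama.cpp, not this function):

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp of the average negative log-likelihood per token.
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical example: three tokens assigned probabilities 0.5, 0.25, 0.125
logprobs = [math.log(0.5), math.log(0.25), math.log(0.125)]
print(perplexity(logprobs))  # ~4.0
```

The close spacing of the FP16 through IQ4_NL rows shows how little quality the larger quantizations give up on this dataset.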
## Prompt format

Alpaca, Extended Alpaca, or Roleplay-Alpaca. Any Alpaca-based prompt formatting should work fine.
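For readers unfamiliar with it, the standard Alpaca template looks like the following (a generic template, not taken from this card; the exact preamble wording may vary between Alpaca variants):

```text
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```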