Discussion: An inquiry, perhaps an idea.

#15
by Diavator - opened

https://huggingface.co/Undi95/MLewd-ReMM-L2-Chat-20B
I love this model, and in general the entire 20B series from Undi95, but the problem is that they are designed for a 4K context, and I really miss having a longer one. It is difficult for me to describe this model's features in technical terms, since I evaluate it as an ordinary user, but I will try to explain what I like so much about it.

  1. The model perfectly preserves the character's personality, without altering or "improving" it over time.
  2. It understands character cards in SillyTavern (ST) very well. According to my observations, its understanding is definitely higher than 85%.
  3. The general understanding of the context and situation in the RP is very good.
  4. The general writing style and quality of the text are also at a very high level.
  5. This model completely lacks a moral and ethical block; the LLM itself easily initiates ERP without additional “persuasion”, as is needed with other models. And the description of the action during the ERP itself does not come down to 2-3 dry, boring phrases; it is really interesting to read.
  6. The model does not focus on any one feature of the user’s or character’s appearance, but juggles them, so the text does not feel repetitive.

I would really like a similar model for 32K, even with less knowledge.

P.S. I really like how this model, https://huggingface.co/Undi95/Nethena-MLewd-Xwin-23B, conveys the emotions and reactions of the characters. Sometimes you really believe them, and there is very little repetition.

I imagine you're looking for advice on models? I ask because these are out of scope due to their size.

So, there's been a general increase in quality at the 7B parameter size over time, to the point where they can seemingly compete with the older 13Bs and even 20Bs. That has made them popular, considering most people can run them fast locally, and they are also cheap to finetune and improve upon...

From IkariDev who worked with Undi on Noromaid:

image.png

So with that said...

You can try this one:

https://huggingface.co/Lewdiculous/Kunocchini-7b-128k-test-GGUF-Imatrix

I didn't use Toppy-M-7B much, but people also say it can handle long context, so that's another one to try.

And tell us how it does at 32K context (that may be pushing it hard, so play around; hopefully it's stable at 16K at least. I mostly tested only 12K...)

Lewdiculous changed discussion title from An inquiry, perhaps an idea. to Discussion: An inquiry, perhaps an idea.

Reopen if necessary.

Lewdiculous changed discussion status to closed

I usually don't use more than 8-12K of context; that's enough for my RP needs. But 8K models are rarer than 32K ones, since almost everyone makes new models on Mistral.

I totally agree that the best 7B models outperform, or at least don't perform much worse than, most 13B or 20B models. And the MoEs I tested were sometimes better at simple questions, but they are really bad at multi-turn convos: the first 2-3 answers are really good, but then they repeat all the time, and tweaking inference parameters or changing the prompts hasn't helped.

@Diavator

I usually don't use more than 8-12K of context; that's enough for my RP needs. But 8K models are rarer than 32K ones, since almost everyone makes new models on Mistral.

You don't have to use the full 32K, haha, they will work fine with 8K, 10K, 12K... :3

The one I linked is the one I tested the most.
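
To put rough numbers on "you don't have to use the full 32K": here is a minimal sketch of how the KV cache grows with the context size you pick. The defaults are assumptions, not something stated in this thread: a Mistral-7B-style layout (32 layers, 8 KV heads, head dimension 128) with an unquantized fp16 cache. The function name `kv_cache_bytes` is just illustrative, and real backends add their own overhead on top, so treat the result as a ballpark only.

```python
def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_value=2):
    """Rough KV cache size in bytes for a given context length.

    Defaults are ASSUMED Mistral-7B-style dimensions with an fp16 cache;
    adjust them for the model you actually run.
    """
    # Factor of 2 because both keys and values are cached for every layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_ctx


if __name__ == "__main__":
    for n_ctx in (8192, 12288, 32768):
        gib = kv_cache_bytes(n_ctx) / 2**30
        print(f"{n_ctx:>6} tokens -> {gib:.2f} GiB KV cache")
```

Under those assumptions, an 8K context needs about a quarter of the cache memory that 32K does, so running a long-context model at 8K-12K is cheaper as well as fine quality-wise.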
