Context length?

#8 opened by AIGUYCONTENT

I downloaded this quant last night: https://huggingface.co/BigHuggyD/anthracite-org_magnum-v2-123b_exl2_8.0bpw_h8

I would like to know the suggested context length. I currently have it set to 55,000 (a random number).

Also, this model does not work with cfg-cache and guidance_scale turned on in Oobabooga. Mr. Oobabooga himself points to a paper claiming that turning on cfg-cache can make the model smarter: https://www.reddit.com/r/Oobabooga/comments/1cf9bso/what_does_guidance_scale_parameter_do/

Considering how quants essentially perform a lobotomy on models, I am hoping to get cfg-cache working with this model.
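
For context: guidance_scale is classifier-free guidance (CFG), which contrasts the prompted logits against an unconditional pass at each decoding step, and cfg-cache, as I understand it, allocates the extra KV cache that second pass needs. Below is a minimal sketch of the same mechanism through the plain Transformers generate API rather than Oobabooga's code path; the model ID is only inferred from the quant name, the prompt and guidance value are placeholders, and the unquantized 123B weights need far more VRAM than the exl2 quant.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed full-precision repo behind the exl2 quant linked above (placeholder).
model_id = "anthracite-org/magnum-v2-123b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer(
    "Write a short scene set in a lighthouse.",  # placeholder prompt
    return_tensors="pt",
).to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.8,
    guidance_scale=1.5,  # > 1.0 enables CFG; 1.0 disables it
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```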

Anthracite org

We train on 8192 ctx, but you can try more and see if it becomes incoherent; it varies by samplers and use case.
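
For anyone loading the exl2 quant outside the web UI, here is a minimal sketch of pinning that context length with the exllamav2 Python API; the local path is a placeholder, and in Oobabooga the equivalent setting (as I recall) is the max_seq_len field on the ExLlamav2 loader.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer

# Placeholder path to the downloaded exl2 quant
config = ExLlamaV2Config()
config.model_dir = "/models/anthracite-org_magnum-v2-123b_exl2_8.0bpw_h8"
config.prepare()

# Cap the context at the trained length; raise it to probe for incoherence
config.max_seq_len = 8192

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # cache size follows max_seq_len
model.load_autosplit(cache)                # split the 123B weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)
```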

"I am hoping to get cfg-cache working with this model."
Hope you get it working! Report back if it does.
