[Suggestion] Proper quantization labelling, attribution
Hey, thanks for coming up with EXL2 quants for Snowdrop!
Would be cool to see this listed under Snowdrop's quantizations properly, similar to how these ones do it; it'll also be useful for people browsing options from the Snowdrop model page:
Just a suggestion, but here's how other quanters handle it in their model cards. I think changing the base model to Snowdrop might be enough?
https://huggingface.co/mradermacher/QwQ-Snowdrop-GGUF/edit/main/README.md
https://huggingface.co/mradermacher/QwQ-Snowdrop-i1-GGUF/edit/main/README.md
https://huggingface.co/janboe91/QwQ-32B-Snowdrop-v0-4bit/edit/main/README.md
https://huggingface.co/DevQuasar/trashpanda-org.QwQ-32B-Snowdrop-v0-GGUF/edit/main/README.md
Again, thanks for the quant!
I added Snowdrop to https://huggingface.co/ReadyArt/QwQ-32B-Snowdrop-v0_EXL2_4.0bpw_H8 under `base_model` for the 4.0bpw quant, which is what I think you wanted?
Can you confirm that's what you're asking for? If so I'll go ahead and apply the changes and slowly (probably over the next week or two) do the same for our other 300+ models.
It's not my intention to take credit for the models; I just don't have an automated process for changing the model cards, and 300+ models are a lot to keep track of. Usually I just copy the repository, quant the weights, and upload the quants.
I'll look into changing the way I do things, maybe using my own small model card instead of reusing the original.
Hmm, checked Snowdrop and the quant was still getting listed as a merge, wonder what's up with that...
No worries, figured it was something automated and it's definitely a hassle to do everything manually. Thanks for looking into it, and for your efforts in quanting, appreciate it.
I think quanters like Bartowski automate the entire thing. I'm not that fancy, but I'll definitely look into changing how I do things and updating the existing models once I do so. It's probably a good idea anyway, as the existing model cards have media (i.e. images) from their creators. Thank you for bringing this up.
Cool, no worries, if anything I'm grateful you heard us out. Thanks and looking forward to it, no rush though!
I think to get it working properly you only need `base_model: trashpanda-org/QwQ-32B-Snowdrop-v0` on the models. However, since Hugging Face doesn't directly support EXL2 configs, you also need to add `base_model_relation: quantized` to force the repo to be treated as a quantized model.
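Put together, the README's YAML front matter would look something like this (just the two keys from the Hub's model-card metadata; the rest of the card stays as-is):

```yaml
---
base_model: trashpanda-org/QwQ-32B-Snowdrop-v0
base_model_relation: quantized
---
```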
This does appear to be the case. I'll figure out what I'm going to do with the 300+ models we already have... going to be fun.
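If the bulk update ever gets scripted, the front-matter patch itself is simple string work. Here's a rough sketch; the helper name and logic are mine, not a Hub API, and a real run over 300+ repos would wrap this in huggingface_hub download/upload calls, which are omitted here:

```python
# Hypothetical helper: insert or overwrite the base_model /
# base_model_relation keys in a model card's YAML front matter
# (the block between the opening and closing '---' lines).
def patch_model_card(readme: str, base_model: str) -> str:
    lines = readme.splitlines()
    # No front matter yet: create a minimal block at the top.
    if not lines or lines[0].strip() != "---":
        header = [
            "---",
            f"base_model: {base_model}",
            "base_model_relation: quantized",
            "---",
            "",
        ]
        return "\n".join(header + lines)
    # Find the closing '---' of the existing front matter.
    end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    # Drop any existing values for the two keys, then re-add them,
    # so running the patch twice doesn't duplicate entries.
    body = [
        l for l in lines[1:end]
        if not l.startswith(("base_model:", "base_model_relation:"))
    ]
    body += [f"base_model: {base_model}", "base_model_relation: quantized"]
    return "\n".join(["---"] + body + lines[end:])
```

The idempotence matters for a slow rollout: re-running the script on an already-patched card leaves it unchanged instead of stacking duplicate keys.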