Timon

KeyboardMasher

AI & ML interests

None yet

Recent Activity

Organizations

None yet

KeyboardMasher's activity

reacted to bartowski's post with πŸ‘ 6 days ago
Access requests enabled for latest GLM models

While a fix is being implemented (https://github.com/ggml-org/llama.cpp/pull/12957), I want to leave the models up for visibility and continued discussion, but I want to prevent accidental downloads of known-broken models (even though there are settings that could fix them at runtime for now)

With this goal, I've enabled access requests. I don't actually want your data, and I'm sorry, but I don't think there's a way around that. That's what I'm going to do for now, and I'll remove the gate when a fix is up and verified and I have a chance to re-convert and quantize!

Hope you don't mind in the meantime :D
New activity in bartowski/QVQ-72B-Preview-GGUF 3 months ago

llama.cpp inference too slow?

3
#6 opened 4 months ago by
ygsun
reacted to fdaudens's post with 😎 3 months ago
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into 550+ NEW models on Hugging Face. Total downloads? 2.5M, nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. 🚀

The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version, with 1M downloads alone.
reacted to bartowski's post with πŸ‘ 4 months ago
Switching to author_model-name

I posted a poll on Twitter, and others have expressed interest in my using the convention of including the author name in the model path when I upload.

It has a couple of advantages. First and foremost, of course, is ensuring clarity about who uploaded the original model (did Qwen upload Qwen2.6? Or did someone fine-tune Qwen2.5 and name it 2.6 for fun?)

The second is that it avoids collisions: if multiple people upload the same model and I try to quant them both, I would normally end up colliding and being unable to upload both.

I'll be implementing the change next week, there are just two final details I'm unsure about:

First, should the files also inherit the author's name?

Second, what to do in the case that the author name + model name pushes us past the character limit?

I haven't yet decided how to handle either case, so feedback is welcome, but I'm also just providing this as a heads-up.
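The naming scheme and the open character-limit question can be sketched as a small helper. Note that the 96-character limit, the function name, and the truncation policy below are all assumptions for illustration, not Hugging Face's actual rules or bartowski's decided approach:

```python
def quant_repo_name(author: str, model: str, max_len: int = 96) -> str:
    """Build an author_model-name style repo name.

    max_len=96 is an assumed limit, and keeping the model name intact
    while shortening the author prefix is just one possible policy.
    """
    name = f"{author}_{model}"
    if len(name) > max_len:
        # How many characters of the author name we can keep,
        # leaving room for the underscore and the full model name.
        keep = max_len - len(model) - 1
        name = f"{author[:keep]}_{model}" if keep > 0 else model[:max_len]
    return name

print(quant_repo_name("Qwen", "Qwen2.5-72B-Instruct-GGUF"))
```

This keeps the original author visible in every upload while guaranteeing that two different authors' versions of the same model name never collide.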
New activity in allenai/OLMo-2-1124-7B-GGUF 5 months ago

Instruct version?

3
#1 opened 5 months ago by
KeyboardMasher
New activity in Nexusflow/Athene-70B 5 months ago

we need llama athene 3.1 70b

5
#5 opened 9 months ago by
gopi87
New activity in bartowski/granite-3.0-8b-instruct-GGUF 6 months ago

Continuous output

1
8
#1 opened 6 months ago by
kth8
New activity in pabloce/dolphin-2.8-gemma-7b-GGUF about 1 year ago

Q8_0 file is damaged.

5
#1 opened about 1 year ago by
KeyboardMasher