gg-hf-gm

community

Activity Feed

AI & ML interests

None defined yet.

gg-hf-gm's activity

danielhanchen

posted an update 3 days ago

Post

3007

New DeepSeek-R1-0528 1.65-bit Dynamic GGUF!

Running the model locally is now even easier! It fits on a 192GB MacBook and runs at 7 tokens/s.

DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-GGUF
Qwen3-8B DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF

And read our Guide: https://docs.unsloth.ai/basics/deepseek-r1-0528
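As a rough sanity check on the "fits on a 192GB MacBook" claim, the weight size can be estimated from the parameter count and the average bits per weight. This is a back-of-the-envelope sketch, not Unsloth's method: the 671 billion parameter count is an assumption that does not appear in the post, "1.65-bit" is treated as a flat average, and the helper name is hypothetical. Dynamic quants keep some layers at higher precision and the runtime needs extra memory for the KV cache, so the file sizes listed in the repo are the authoritative numbers.

```python
def estimate_weight_size_gb(num_params: float, avg_bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    total_bits = num_params * avg_bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


# Assumption (not stated in the post): DeepSeek-R1 has about 671 billion parameters.
ASSUMED_PARAMS = 671e9

size_gb = estimate_weight_size_gb(ASSUMED_PARAMS, 1.65)
fits = size_gb < 192  # compares weights only; ignores KV cache and OS overhead

print(f"~{size_gb:.0f} GB of weights; fits in 192 GB: {fits}")
```

Under these assumptions the weights come to roughly 138 GB, which leaves some headroom on a 192GB machine.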

PhilCulliton

authored 2 papers 7 days ago

Predicting Severe Sepsis Using Text from the Electronic Health Record

Paper • 1711.11536 • Published Nov 30, 2017

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31, 2024 • 77

Marksherwood

updated 2 models 17 days ago

google/gemma-3n-E2B-it-litert-preview

Image-Text-to-Text • Updated 16 days ago • 315

google/gemma-3n-E4B-it-litert-preview

Image-Text-to-Text • Updated 11 days ago • 926

reach-vb

posted an update 17 days ago

Post

3644

hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend! 💥

as you know, we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/download speeds too): https://huggingface.co/blog/xet-on-the-hub. now that we are certain the backend can scale even with big models like Llama 4/Qwen 3, we're moving to the next phase: inviting impactful orgs and users on the hub over. as you are a big part of the open source ML community, we would love to onboard you next and create some excitement about it in the community too!

in terms of actual steps - it should be as simple as one of the org admins joining via hf.co/join/xet - we'll take care of the rest.

p.s. you'd need the latest hf_xet version of the huggingface_hub lib, but everything else should be the same: https://huggingface.co/docs/hub/storage-backends#using-xet-storage

p.p.s. this is fully backwards compatible so everything will work as it should! 🤗
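Before joining, it can help to confirm which client versions are installed locally. The sketch below only reads package metadata from the current Python environment. The helper names are hypothetical, and it does not check against any specific minimum version, since the post does not name one.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string for a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def xet_client_status() -> dict:
    """Report the versions of the two packages mentioned in the post."""
    return {
        "huggingface_hub": installed_version("huggingface_hub"),
        "hf_xet": installed_version("hf_xet"),
    }


status = xet_client_status()
for name, version in status.items():
    print(f"{name}: {version or 'not installed'}")
```

If either package shows as not installed or looks old, upgrading both as described in the linked docs is the safest path.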

16 replies

osanseviero

updated 2 models 18 days ago

google/gemma-3n-E4B-it-litert-preview

Image-Text-to-Text • Updated 11 days ago • 926

google/gemma-3n-E2B-it-litert-preview

Image-Text-to-Text • Updated 16 days ago • 315

PhilCulliton

authored a paper 23 days ago

Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation

Paper • 2505.00612 • Published May 1 • 9

wanxinxw

authored 3 papers 24 days ago

danielhanchen

posted an update about 1 month ago

Post

1882

💜 Qwen3 128K Context Length: We've released Dynamic 2.0 GGUFs + 4-bit safetensors!
Fixed: Now works on any inference engine and fixed issues with the chat template.
Qwen3 GGUFs:
30B-A3B: unsloth/Qwen3-30B-A3B-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-GGUF
32B: unsloth/Qwen3-32B-GGUF

Read our guide on running Qwen3 here: https://docs.unsloth.ai/basics/qwen3-how-to-run-and-finetune

128K Context Length:
30B-A3B: unsloth/Qwen3-30B-A3B-128K-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-128K-GGUF
32B: unsloth/Qwen3-32B-128K-GGUF

All Qwen3 uploads: unsloth/qwen3-680edabfb790c8c34a242f95

sanmikoyejo

authored a paper about 1 month ago

The Leaderboard Illusion

Paper • 2504.20879 • Published Apr 29 • 70

danielhanchen

posted an update about 1 month ago

Post

5823

🦥 Introducing Unsloth Dynamic v2.0 GGUFs!
Our v2.0 quants set new benchmarks on 5-shot MMLU and KL Divergence, meaning you can now run & fine-tune quantized LLMs while preserving as much accuracy as possible.

Llama 4: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
DeepSeek-R1: unsloth/DeepSeek-R1-GGUF-UD
Gemma 3: unsloth/gemma-3-27b-it-GGUF

We made selective layer quantization much smarter. Instead of modifying only a subset of layers, we now dynamically quantize all layers, so each layer can use a different bit width. Our dynamic method can now be applied to all LLM architectures, not just MoEs.

Blog with Details: https://docs.unsloth.ai/basics/dynamic-v2.0

All our future GGUF uploads will leverage Dynamic 2.0 and our hand-curated 300K–1.5M token calibration dataset to improve conversational chat performance.

For accurate benchmarking, we built an evaluation framework to match the reported 5-shot MMLU scores of Llama 4 and Gemma 3. This allowed apples-to-apples comparisons between full-precision models and Dynamic v2.0, QAT, and standard iMatrix quants.

Dynamic v2.0 aims to minimize the performance gap between full-precision models and their quantized counterparts.
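To make the KL Divergence metric mentioned above concrete, here is a minimal sketch of how it compares a full-precision model's next-token distribution with a quantized model's. This is an illustration only, not Unsloth's evaluation framework: the logits are made-up toy values and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; p is the reference (full-precision) distribution."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Toy logits over a 3-token vocabulary (illustrative values, not real model output).
full_precision_logits = [2.0, 1.0, 0.1]
quantized_logits = [1.9, 1.1, 0.0]

kl = kl_divergence(softmax(full_precision_logits), softmax(quantized_logits))
print(f"KL(full || quantized) = {kl:.6f} nats")
```

A value of zero means the quantized model reproduces the reference distribution exactly; larger values mean the quantized outputs drift further from full precision. In practice the value would be averaged over many token positions.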

philschmid

posted an update about 2 months ago

Post

2937

Gemini 2.5 Flash is here! We are excited to launch our first hybrid reasoning Gemini model. In 2.5 Flash, developers can turn thinking off.

**TL;DR:**
- 🧠 Controllable "Thinking" with a thinking budget of up to 24k tokens
- 🌌 1 million token multimodal input context for text, image, video, audio, and PDF
- 🛠️ Function calling, structured output, Google Search & code execution
- 🏦 $0.15 per 1M input tokens; $0.60 (thinking off) or $3.50 (thinking on) per 1M output tokens (thinking tokens are billed as output tokens)
- 💡 Knowledge cutoff of January 2025
- 🚀 Rate limits (free tier): 10 RPM, 500 requests/day
- 🏅 Outperforms 2.0 Flash on every benchmark

Try it ⬇️
https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-flash-preview-04-17
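The pricing above can be turned into a quick cost estimate. This is a sketch that uses only the per-million-token prices quoted in the post; the function name and the example token counts are made up for illustration, and current prices should be checked against the official pricing page.

```python
# Prices as quoted in the post (USD per 1M tokens).
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = {False: 0.60, True: 3.50}  # keyed by "thinking on?"


def estimate_cost_usd(input_tokens: int, output_tokens: int, thinking: bool) -> float:
    """Estimate request cost; output_tokens must include any thinking tokens."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M[thinking]
    return input_cost + output_cost


# Hypothetical workload: 200k input tokens and 50k output tokens.
cost_off = estimate_cost_usd(200_000, 50_000, thinking=False)
cost_on = estimate_cost_usd(200_000, 50_000, thinking=True)

print(f"thinking off: ${cost_off:.3f}")
print(f"thinking on:  ${cost_on:.3f}")
```

For this workload the estimate is about $0.06 with thinking off and about $0.205 with thinking on. Because thinking tokens are billed as output, a real thinking-on request would usually have a higher output token count as well.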

1 reply

pcuenq

authored a paper about 2 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 188

reach-vb

authored a paper about 2 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 188

sanmikoyejo

authored a paper about 2 months ago

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7 • 105

danielhanchen

posted an update about 2 months ago

Post

4946

You can now run Llama 4 on your own local device! 🦙
Run our Dynamic 1.78-bit and 2.71-bit Llama 4 GGUFs:
unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF

You can run them on llama.cpp and other inference engines. See our guide here: https://docs.unsloth.ai/basics/tutorial-how-to-run-and-fine-tune-llama-4

1 reply

Team members 78
