
Bertrand Chevrier

kramp

AI & ML interests

Text-to-speech, AI for music writing

Organizations

Hugging Face · Team 7 · huggingPartyParis · Social Post Explorers · private beta for deeplinks · Fine Video · Hugging Face FineVideo · Changelog

kramp's activity

upvoted a changelog 12 days ago

AI-generated Abstract summaries on Hugging Face Papers

upvoted a changelog 13 days ago

Filter by MCP compatibility available in HF Spaces

upvoted 2 articles 14 days ago

The Transformers Library: standardizing model definitions

By lysandre and 3 others
reacted to AdinaY's post with 🔥 14 days ago
Dolphin 🔥 A multimodal document image parsing model from ByteDance, built on an analyze-then-parse paradigm.

ByteDance/Dolphin

✨ MIT licensed
✨ Handles text, tables, figures & formulas via:
- Reading-order layout analysis
- Parallel parsing with smart prompts

upvoted an article 26 days ago

AI Personas: The Impact of Design Choices

By giadap and 1 other
reacted to wolfram's post with 👍 26 days ago
Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science).

A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.
2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.
3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.
4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.
5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.
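The ~98% figure follows directly from the scores quoted above; a minimal sketch of the arithmetic (model labels abbreviated from the post, the helper itself is purely illustrative):

```python
# MMLU-Pro (Computer Science) scores as quoted in the post.
scores = {
    "Qwen3-235B-A22B (Fireworks API)": 83.66,
    "Qwen3-30B-A3B (Unsloth quant)": 82.20,
    "Qwen3-32B": 82.20,
    "Qwen3-30B (MLX)": 79.51,
    "Qwen3-0.6B": 37.56,
}

# Express each score relative to the best (frontier) result.
frontier = scores["Qwen3-235B-A22B (Fireworks API)"]
relative = {name: score / frontier for name, score in scores.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.1%} of frontier accuracy")
# The 30B-A3B quant lands at ~98.3% of the 235B model's score.
```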

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!