NEW: Real-time conversational AI models can now run 100% locally in your browser! 🤯
🔒 Privacy by design (no data leaves your device)
💰 Completely free... forever
📦 Zero installation required, just visit a website
⚡️ Blazingly fast WebGPU-accelerated inference
For those interested, here's how it works:
- Silero VAD for voice activity detection
- Whisper for speech recognition
- SmolLM2-1.7B for text generation
- Kokoro for text-to-speech
Powered by Transformers.js and ONNX Runtime Web! 🤗 I hope you like it!
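For the curious, here is a minimal sketch of what the speech-recognition and text-generation hops can look like with Transformers.js on WebGPU. The model checkpoints, quantization settings, and the respond() helper below are illustrative assumptions rather than the demo's actual code; the Silero VAD and Kokoro stages are only noted in comments.

```ts
// Illustrative sketch (not the demo's source): Whisper ASR + SmolLM2 generation
// in the browser with Transformers.js, accelerated via WebGPU.
import { pipeline } from "@huggingface/transformers";

// Load both models once at startup. Model IDs and dtypes are assumptions.
const asr = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-base",                   // assumed Whisper checkpoint
  { device: "webgpu" },
);
const llm = await pipeline(
  "text-generation",
  "HuggingFaceTB/SmolLM2-1.7B-Instruct",   // assumed SmolLM2 checkpoint
  { device: "webgpu", dtype: "q4f16" },    // quantized weights for browser memory budgets
);

// `audio`: mono Float32Array at 16 kHz, e.g. a segment Silero VAD flagged as speech.
export async function respond(audio: Float32Array): Promise<string> {
  // Speech -> text with Whisper.
  const transcription: any = await asr(audio);
  // Text -> reply with SmolLM2, using the chat-style message format.
  const messages = [{ role: "user", content: transcription.text }];
  const output: any = await llm(messages, { max_new_tokens: 256 });
  const reply: string = output[0].generated_text.at(-1).content;
  // The reply would then be synthesized back to audio with Kokoro (TTS) before playback.
  return reply;
}
```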
🚀 NEW DROP: run your own on-device LLM in minutes, on any phone.
Today we're open-sourcing everything you need to put Qwen3-0.6B straight into a production-ready mobile app:
🎥 Watch Qwen3-0.6B chat in real time on any smartphone!
📊 TPS benchmarks: slides comparing tokens per second across heterogeneous mobile devices (see the sketch after this list).
💻 Plug-and-play source: copy the source into your project and run it, for Android (Kotlin & Java) and iOS (Swift).
🤝 Cross-platform, one pipeline: ZETIC.MLange auto-tunes kernels for every device we've tested.
👨‍💻 Ready for production: swap in your own model, re-benchmark with one command, publish.
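For context, here is a generic sketch of how a tokens-per-second figure like those in the slides is usually computed: count the tokens the model generated and divide by the wall-clock decode time. This is illustrative timing code, not ZETIC.MLange's benchmark harness; the generate callback is a hypothetical stand-in for whatever on-device runtime is being measured.

```ts
// Generic tokens-per-second (TPS) measurement: tokens generated / wall-clock seconds.
// `generate` is a hypothetical stand-in for the runtime under test and is assumed
// to return the list of generated tokens.
type GenerateFn = (prompt: string, maxNewTokens: number) => Promise<string[]>;

export async function measureTps(
  generate: GenerateFn,
  prompt: string,
  maxNewTokens = 128,
): Promise<number> {
  const start = performance.now();                   // milliseconds
  const tokens = await generate(prompt, maxNewTokens);
  const elapsedSeconds = (performance.now() - start) / 1000;
  return tokens.length / elapsedSeconds;             // tokens per second
}
```

In practice, benchmark numbers are usually averaged over several runs per device, often with prefill and decode timed separately.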
We built this to show that cloud-free LLMs are ready today. Dive in, fork it, and tag ZETIC.ai when you launch your own on-device assistant, game NPC, or offline content generator; we'll spotlight the best projects.