NEW: Real-time conversational AI models can now run 100% locally in your browser! 🤯
🔒 Privacy by design (no data leaves your device)
💰 Completely free... forever
📦 Zero installation required, just visit a website
⚡️ Blazingly fast WebGPU-accelerated inference
For those interested, here's how it works:
- Silero VAD for voice activity detection
- Whisper for speech recognition
- SmolLM2-1.7B for text generation
- Kokoro for text-to-speech
Powered by Transformers.js and ONNX Runtime Web! 🤗 I hope you like it!
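For the curious, here is a minimal sketch of what the speech-recognition and text-generation hops can look like with Transformers.js on WebGPU. The model checkpoints, quantization settings, and the respond() helper below are illustrative assumptions rather than the demo's actual code; the Silero VAD and Kokoro stages are only noted in comments.

```ts
// Illustrative sketch (not the demo's source): Whisper ASR + SmolLM2 generation
// in the browser with Transformers.js, accelerated via WebGPU.
import { pipeline } from "@huggingface/transformers";

// Load both models once at startup. Model IDs and dtypes are assumptions.
const asr = await pipeline(
  "automatic-speech-recognition",
  "Xenova/whisper-base",                   // assumed Whisper checkpoint
  { device: "webgpu" },
);
const llm = await pipeline(
  "text-generation",
  "HuggingFaceTB/SmolLM2-1.7B-Instruct",   // assumed SmolLM2 checkpoint
  { device: "webgpu", dtype: "q4f16" },    // quantized weights for browser memory budgets
);

// `audio`: mono Float32Array at 16 kHz, e.g. a segment Silero VAD flagged as speech.
export async function respond(audio: Float32Array): Promise<string> {
  // Speech -> text with Whisper.
  const transcription: any = await asr(audio);
  // Text -> reply with SmolLM2, using the chat-style message format.
  const messages = [{ role: "user", content: transcription.text }];
  const output: any = await llm(messages, { max_new_tokens: 256 });
  const reply: string = output[0].generated_text.at(-1).content;
  // The reply would then be synthesized back to audio with Kokoro (TTS) before playback.
  return reply;
}
```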
🚀 NEW DROP: run your own on-device LLM in minutes, on any phone.
Today we're open-sourcing everything you need to put Qwen3-0.6B straight into a production-ready mobile app:
🎥 Watch Qwen3-0.6B chat in real time on any smartphone!
📊 TPS benchmarks: slides comparing tokens per second across heterogeneous mobile devices (see the sketch after this list).
💻 Plug-and-play source: copy the source into your project and run it, for Android (Kotlin & Java) and iOS (Swift).
🤝 Cross-platform, one pipeline: ZETIC.MLange auto-tunes kernels for every device we've tested.
👨‍💻 Ready for production: swap in your own model, re-benchmark with one command, publish.
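For context, here is a generic sketch of how a tokens-per-second figure like those in the slides is usually computed: count the tokens the model generated and divide by the wall-clock decode time. This is illustrative timing code, not ZETIC.MLange's benchmark harness; the generate callback is a hypothetical stand-in for whatever on-device runtime is being measured.

```ts
// Generic tokens-per-second (TPS) measurement: tokens generated / wall-clock seconds.
// `generate` is a hypothetical stand-in for the runtime under test and is assumed
// to return the list of generated tokens.
type GenerateFn = (prompt: string, maxNewTokens: number) => Promise<string[]>;

export async function measureTps(
  generate: GenerateFn,
  prompt: string,
  maxNewTokens = 128,
): Promise<number> {
  const start = performance.now();                   // milliseconds
  const tokens = await generate(prompt, maxNewTokens);
  const elapsedSeconds = (performance.now() - start) / 1000;
  return tokens.length / elapsedSeconds;             // tokens per second
}
```

In practice, benchmark numbers are usually averaged over several runs per device, often with prefill and decode timed separately.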
We built this to show that cloud-free LLMs are ready today. Dive in, fork it, and tag ZETIC.ai when you launch your own on-device assistant, game NPC, or offline content generator; we'll spotlight the best projects.