Delyan Raychev

delyanr
·
https://delyan.org
  • DelyanRaychev
  • draychev

AI & ML interests

None yet

Recent Activity

reacted to mitkox's post with 🔥 about 2 months ago
I just threw Qwen3-0.6B in BF16 into an on-device AI drag race on AMD Strix Halo with vLLM: 564 tokens/sec on short 100-token sprints and 96 tokens/sec on 8K-token marathons. TL;DR: You don't just run AI on AMD. You negotiate with it. The hardware absolutely delivers. Spoiler alert: there is exactly ONE configuration where vLLM + ROCm + Triton + PyTorch + drivers + Ubuntu kernel all work at the same time. Finding it required the patience of a saint. Consumer AMD for AI inference is the ultimate "budget warrior" play: insane performance-per-euro, but you need hardcore technical skills that would make a senior sysadmin nod in quiet respect.

Organizations

None yet

models 0

None public yet

datasets 0

None public yet