It's the first open-source coding model in this size class that clearly matches GPT-4o's coding capabilities!
Completes the previous two Qwen 2.5 Coder releases with 4 new sizes: 0.5B, 3B, 14B, 32B
Supports long context up to 128K (for the 14B and 32B models)
Drop-in replacement for GPT-4o as a coding assistant in Cursor or for Artifacts!
Models available right now on the Hub, under the Apache 2.0 license!
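If you want to try one of the new checkpoints locally, here is a minimal sketch using transformers; the repo id `Qwen/Qwen2.5-Coder-32B-Instruct` and the example prompt are assumptions for illustration, not details from the post.

```python
# Minimal sketch: load a Qwen 2.5 Coder checkpoint from the Hub and generate code.
# The repo id below is an assumption for illustration; pick the size that fits your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-32B-Instruct"  # assumed repo id; smaller sizes (0.5B/3B/14B) are also mentioned in the post
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Hypothetical prompt, just to exercise the chat template.
messages = [{"role": "user", "content": "Write a Python function that checks whether a number is prime."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```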
๐๏ธ "We need digital sobriety." @sasha challenges Big Tech's race for nuclear energy on BBC AI Decoded. Instead of pursuing more power, shouldn't we first ask if we really need AI everywhere?
Did you guys know that if you try to link a prepaid card to huggingface it won't work, but then if you press the button again it links anyway? Then you can lock the card (deny any charges), and get resources for free? You're welcome :P
reacted to yongchanghao's post · 4 months ago
We just released a paper (NeuZip) that losslessly compresses model weights in VRAM so you can run larger models. This should be particularly useful when VRAM is insufficient during training/inference. Specifically, we look inside each floating-point number and find that the exponent bits are highly compressible (as shown in the figure below).
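A quick way to see why this observation holds (a toy sketch only, not the NeuZip method itself, and using synthetic Gaussian weights as a stand-in for a trained layer): the exponent bits of typical weights take very few distinct values, so even a generic compressor shrinks them dramatically, while the full float bytes barely compress at all.

```python
# Illustrative sketch (not the NeuZip algorithm): compare how well the exponent bits
# of float32 "weights" compress versus the full raw bytes.
import zlib
import numpy as np

# Stand-in for trained weights: roughly Gaussian values, as in many neural-net layers.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=1_000_000).astype(np.float32)

bits = weights.view(np.uint32)
exponents = ((bits >> 23) & 0xFF).astype(np.uint8)  # the 8 exponent bits of each float32

raw_bytes = weights.tobytes()
exp_bytes = exponents.tobytes()

ratio_full = len(zlib.compress(raw_bytes)) / len(raw_bytes)
ratio_exp = len(zlib.compress(exp_bytes)) / len(exp_bytes)

print(f"compressed/original, full float32 bytes : {ratio_full:.2f}")  # close to 1.0 (mantissas look random)
print(f"compressed/original, exponent bytes only: {ratio_exp:.2f}")   # much smaller (few distinct exponents)
```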