Article Wan2.1 + DFloat11 Enables High-quality Text-to-video With 24GB VRAM By LeanQuant • 17 days ago • 1
DFloat11 | FLUX.1 Collection Losslessly compressed FLUX.1: requires < 20GB VRAM to run. • 5 items • Updated May 8 • 1
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published Apr 15 • 28
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published Mar 20 • 74
Llama Collection All our SOTA Llama models that crush competition :) • 6 items • Updated Nov 5, 2024 • 1