70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published 6 days ago • 13
Llama Collection All our SOTA Llama models that crush competition :) • 6 items • Updated Nov 5, 2024 • 1
xmadai/Mistral-Large-Instruct-2407-xMADai-INT4 Text Generation • Updated Oct 30, 2024 • 121 • 6
xmadai/Llama-3.1-Nemotron-70B-Instruct-xMADai-INT4 Text Generation • Updated Oct 30, 2024 • 1 • 4