LPD Collection Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation • 6 items • Updated 10 days ago • 1
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation Paper • 2507.01957 • Published 10 days ago • 18
Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published 18 days ago • 38
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published May 28 • 42
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation Paper • 2505.18875 • Published May 24 • 41 • 2
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization Paper • 2403.12422 • Published Mar 19, 2024 • 1
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training Paper • 2410.19313 • Published Oct 25, 2024 • 19
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity Paper • 2502.01776 • Published Feb 3 • 2
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache Paper • 2502.10424 • Published Feb 5 • 1
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation Paper • 2505.18875 • Published May 24 • 41
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published May 16 • 73