Article: Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub • By drbh and 6 others • Jun 12 • 125
FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation Paper • 2506.04956 • Published Jun 5 • 3
Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs Paper • 2504.07866 • Published Apr 10 • 12
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 184
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5 • 35
Volume estimates for unions of convex sets, and the Kakeya set conjecture in three dimensions Paper • 2502.17655 • Published Feb 24 • 1
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8 • 111
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text Paper • 2501.15654 • Published Jan 26 • 15
xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference Paper • 2503.13427 • Published Mar 17 • 3
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published Mar 18 • 153
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned variants in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 632
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning Paper • 2502.06060 • Published Feb 9 • 38