Malthe August Bordin Bresler

maltheaugust

AI & ML interests

None yet

Organizations

None yet

maltheaugust's activity

upvoted a paper 6 days ago

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published 14 days ago • 117

upvoted 2 papers 10 days ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published 11 days ago • 115

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published 13 days ago • 101

upvoted 2 papers 13 days ago

JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published 21 days ago • 59

ZClip: Adaptive Spike Mitigation for LLM Pre-Training

Paper • 2504.02507 • Published 18 days ago • 76

upvoted a paper 20 days ago

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published 27 days ago • 75

upvoted a paper 24 days ago

LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation

Paper • 2503.19950 • Published 27 days ago • 11

upvoted 6 papers about 1 month ago

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 140

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 119

φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation

Paper • 2503.13288 • Published Mar 17 • 50

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 96

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 157

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 68