Delulu: A Verified Multi-Lingual Benchmark for Code Hallucination Detection in Fill-in-the-Middle Tasks Paper • 2605.07024 • Published 24 days ago • 2
SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection Paper • 2509.16060 • Published Sep 19, 2025 • 1
Refusal in Language Models Is Mediated by a Single Direction Paper • 2406.11717 • Published Jun 17, 2024 • 13
Qwen 3.5 - 0.8, 2, 4, 9, 27, 35B - regular / uncensored Collection Min 256k context + images: Reg, Heretic, Heretic fine tunes of Qwen 3.5 in all sizes. • 43 items • Updated 1 day ago • 45
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 ggerganov, ngxson, allozaur, lysandre, victor, julien-c • Feb 20 • 507
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 itazap, ariG23498, ArthurZ, sergiopaniego, merve, pcuenq • Dec 18, 2025 • 124
Article Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models nvidia • Dec 15, 2025 • 111
The Bestiary Collection Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 5 items • Updated 10 days ago • 114
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated Feb 25 • 141
Granite Quantized Models Collection Quantized versions of IBM Granite models. • 47 items • Updated 4 days ago • 36
On Path to Multimodal Generalist: General-Level and General-Bench Paper • 2505.04620 • Published May 7, 2025 • 83