-
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
Paper • 2508.01059 • Published • 33 -
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper • 2508.01191 • Published • 225 -
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Paper • 2508.05629 • Published • 158 -
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 159
Collections
Discover the best community collections!
Collections including paper arxiv:2508.01059
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.7k • 1.15k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 14 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 1.04k • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 61
-
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
Paper • 2405.19504 • Published • 3 -
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Paper • 2506.20452 • Published • 19 -
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 69 -
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Paper • 2507.18553 • Published • 39
-
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report
Paper • 2508.01059 • Published • 33 -
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper • 2508.01191 • Published • 225 -
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Paper • 2508.05629 • Published • 158 -
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 159
-
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
Paper • 2405.19504 • Published • 3 -
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Paper • 2506.20452 • Published • 19 -
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 69 -
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Paper • 2507.18553 • Published • 39
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.7k • 1.15k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 14 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 1.04k • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 61