On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting Paper • 2508.11408 • Published 7 days ago • 6 • 5
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published 3 days ago • 24 • 3
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published 21 days ago • 219 • 11
Cyber-Zero: Training Cybersecurity Agents without Runtime Paper • 2508.00910 • Published 24 days ago • 8
Personalized Safety Alignment for Text-to-Image Diffusion Models Paper • 2508.01151 • Published 21 days ago • 8
Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe Paper • 2508.01691 • Published 19 days ago • 9
A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models Paper • 2508.01548 • Published 20 days ago • 13
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo Paper • 2508.02317 • Published 18 days ago • 16
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report Paper • 2508.01059 • Published 21 days ago • 32
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Paper • 2507.16784 • Published Jul 22 • 116 • 10
Favicon Trojans: Executable Steganography Via Ico Alpha Channel Exploitation Paper • 2507.09074 • Published Jul 11 • 6 • 5
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published May 20 • 23 • 6