DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 16 days ago • 302
Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published Dec 20, 2024 • 38
Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • Sep 27, 2024 • 40
Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean Paper • 2403.10882 • Published Mar 16, 2024 • 5
X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment Paper • 2403.11399 • Published Mar 18, 2024 • 6
BOK-VQA: Bilingual Outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining Paper • 2401.06443 • Published Jan 12, 2024 • 2
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 711
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning Paper • 2402.15506 • Published Feb 23, 2024 • 14
In deep reinforcement learning, a pruned network is a good network Paper • 2402.12479 • Published Feb 19, 2024 • 19
Mixtures of Experts Unlock Parameter Scaling for Deep RL Paper • 2402.08609 • Published Feb 13, 2024 • 35
Large-scale Reinforcement Learning for Diffusion Models Paper • 2401.12244 • Published Jan 20, 2024 • 29