Bigger isn't always better: how to choose the most efficient model for context-specific tasks 🌱🧑🏼‍💻 By sasha • 5 days ago • 14
Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face By dvgodoy • Feb 11 • 38
OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve By codelion • 13 days ago • 18
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 143
Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment By NormalUhr • Feb 11 • 41
Building an Open Ecosystem for Time Series Forecasting: Introducing TimesFM in Hugging Face By Nutanix and 1 other • 14 days ago • 13
Mitigating False Negatives in Multiple Negatives Ranking Loss for Retriever Training By dragonkue • 8 days ago • 6
System Prompt Learning: Teaching LLMs to Learn Problem-Solving Strategies from Experience By codelion • about 7 hours ago • 4