LLM Training - a JM-Brun Collection

LLM Training

updated 4 days ago

The Differences Between Direct Alignment Algorithms are a Blur

Paper • 2502.01237 • Published Feb 3 • 115

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published 9 days ago • 116