
Hiring 💼

Quentin Gallouédec PRO

qgallouedec

huggingface



Recent Activity

posted an update about 13 hours ago

TRL v1.3 ships day-one training support for Qwen 3.6 🚀

The new Qwen 3.6 family (`Qwen/Qwen3.6-27B`, `Qwen/Qwen3.6-35B-A3B`) reuses the Qwen3.5-MoE architecture but ships a slightly different chat template, so we updated the stack end-to-end: new training template with `{% generation %}` markers, tool-call response schema routing, tiny test models for the VLM matrix.

SFT with assistant-only loss works out of the box:

```python
from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-27B",
    args=SFTConfig(assistant_only_loss=True),
    train_dataset=dataset,
)
trainer.train()
```

So does GRPO tool-calling: just hand `tools=[...]` to `GRPOTrainer`.

v1.3 also brings a new experimental TPO trainer (Triple Preference Optimization), speculative decoding in `trl vllm-serve` (Qwen3 MTP / Eagle3 drafts), 12 more KTO ↔ DPO alignment PRs (KTO promotion to stable is now in reach), three more `{% generation %}` chat templates (Gemma/Gemma 2, Phi-3, GLM-4-MoE), and a chunky SFT entropy bug fix.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.3.0
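To make the assistant-only loss mentioned in the post more concrete, here is a minimal pure-Python sketch of the general idea: tokens outside the assistant turns get a label that the loss ignores. This is not TRL's implementation. The helper name `mask_non_assistant` and the token ids are hypothetical, and the sketch assumes an assistant mask is already available (for instance, one derived from the `{% generation %}` markers in the chat template).

```python
# Label value that PyTorch's cross-entropy loss skips by default.
IGNORE_INDEX = -100


def mask_non_assistant(input_ids, assistant_mask):
    """Keep assistant tokens as labels; replace all other positions with IGNORE_INDEX.

    Hypothetical helper for illustration only, not part of TRL.
    """
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [
        token if is_assistant else IGNORE_INDEX
        for token, is_assistant in zip(input_ids, assistant_mask)
    ]


# Made-up token ids: three prompt tokens followed by two assistant tokens.
input_ids = [101, 102, 103, 201, 202]
assistant_mask = [0, 0, 0, 1, 1]

labels = mask_non_assistant(input_ids, assistant_mask)
print(labels)  # [-100, -100, -100, 201, 202]
```

With labels built this way, only the two assistant tokens would contribute to the training loss, which is the behavior `assistant_only_loss=True` is meant to provide.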

updated a dataset 1 day ago

hf-doc-build/doc-build

updated a bucket 1 day ago

hf-doc-build/doc



published an article 27 days ago

Article

TRL v1.0: Post-Training Library Built to Move with the Field

+2

27 days ago

•

49

published an article about 2 months ago

Article

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

+7

Mar 10

•

133

published an article 5 months ago

Article

Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand

Dec 4, 2025

•

68

published an article 5 months ago

Article

20x Faster TRL Fine-tuning with RapidFire AI

+1

Nov 21, 2025

•

27

published an article 9 months ago

Article

Vision Language Model Alignment in TRL ⚡️

+3

Aug 7, 2025

•

110

published an article 9 months ago

Article

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

+3

Jul 29, 2025

•

221

published an article 10 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

+21

Jul 8, 2025

•

770

published an article 11 months ago

Article

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

+4

Jun 3, 2025

•

101

published an article about 1 year ago

Article

Gotchas in Tokenizer Behavior Every Developer Should Know

Apr 18, 2025

•

72

published an article about 1 year ago

Article

Open R1: Update #3

Mar 11, 2025

•

297

published an article about 1 year ago

Article

Open-R1: Update #1

Feb 2, 2025

•

305

published an article over 1 year ago

Article

Visualize and understand GPU memory in PyTorch

Dec 24, 2024

•

269

published an article almost 2 years ago

Article

Preference Optimization for Vision Language Models

+2

Jul 10, 2024

•

93

published an article about 2 years ago

Article

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

+2

Apr 22, 2024

•

81
