Speeding Up Training


Section under construction. Feel free to contribute!

vLLM for fast generation in online methods

Online methods such as Online DPO or Nash-MD require the model to generate completions during training, which is often slow and can significantly increase training time. To speed up generation, you can use vLLM, a library that enables fast generation through PagedAttention. TRL's online trainers support vLLM, greatly improving training speed.

To use vLLM, first install it using:

pip install vllm
Online DPO

Then, enable it by passing use_vllm=True in the training arguments.

from trl import OnlineDPOConfig

training_args = OnlineDPOConfig(..., use_vllm=True)
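
For context, here is a minimal end-to-end sketch of where use_vllm=True fits in an Online DPO run. The model name, dataset, and judge below are illustrative placeholders; substitute your own.

# Minimal sketch of an Online DPO run with vLLM generation enabled.
# The model, dataset, and judge are illustrative assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import OnlineDPOConfig, OnlineDPOTrainer, PairRMJudge

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

# A judge ranks pairs of completions during online training
# (PairRMJudge requires the optional llm-blender dependency).
judge = PairRMJudge()

train_dataset = load_dataset("trl-lib/ultrafeedback-prompt", split="train")

# use_vllm=True routes completion generation through vLLM.
training_args = OnlineDPOConfig(output_dir="online-dpo-vllm", use_vllm=True)

trainer = OnlineDPOTrainer(
    model=model,
    judge=judge,
    args=training_args,
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()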