---
license: other
license_name: tencent-hunyuan-community
license_link: https://huggingface.co/Tencent-Hunyuan/HunyuanDiT/blob/main/LICENSE.txt
language:
  - en
---

# HunyuanDiT TensorRT Acceleration

English | 中文

We provide a TensorRT version of HunyuanDiT for inference acceleration (faster than Flash Attention). You can convert the PyTorch model to a TensorRT engine with the following steps.

1. Download the dependencies from Hugging Face.

```shell
cd HunyuanDiT
# Use the huggingface-cli tool to download the model.
huggingface-cli download Tencent-Hunyuan/TensorRT-libs --local-dir ./ckpts/t2i/model_trt
```
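If you prefer the Python API over the CLI, the same download can be done with `huggingface_hub.snapshot_download`. A minimal sketch; the helper name is our own, and the lazy import assumes `huggingface_hub` is installed:

```python
def fetch_trt_libs(local_dir: str = "./ckpts/t2i/model_trt") -> str:
    """Mirror of the huggingface-cli command above, via the Python API."""
    # Lazy import so the helper can be defined even without huggingface_hub.
    from huggingface_hub import snapshot_download

    # Downloads the full Tencent-Hunyuan/TensorRT-libs repo into local_dir
    # and returns the local folder path.
    return snapshot_download(
        repo_id="Tencent-Hunyuan/TensorRT-libs",
        local_dir=local_dir,
    )
```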

2. Install the TensorRT dependencies.

```shell
sh trt/install.sh
```

3. Build the TensorRT engine.

Method 1: Use the prebuilt engine

We provide some prebuilt TensorRT engines.

| Supported GPU | Download Link | Remote Path |
|:---|:---|:---|
| GeForce RTX 3090 | HuggingFace | engines/RTX3090/model_onnx.plan |
| GeForce RTX 4090 | HuggingFace | engines/RTX4090/model_onnx.plan |
| A100 | HuggingFace | engines/A100/model_onnx.plan |

Use the following command to download and place the engine in the specified location.

```shell
huggingface-cli download Tencent-Hunyuan/TensorRT-engine <Remote Path> --local-dir ./ckpts/t2i/model_trt/engine
```
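To pick the right prebuilt engine automatically, the table above can be encoded as a small lookup. A sketch; `engine_remote_path` and its use of `torch.cuda.get_device_name` are our own additions, not part of the repo:

```python
# Remote paths of the prebuilt engines, taken from the table above.
PREBUILT_ENGINES = {
    "RTX 3090": "engines/RTX3090/model_onnx.plan",
    "RTX 4090": "engines/RTX4090/model_onnx.plan",
    "A100": "engines/A100/model_onnx.plan",
}


def engine_remote_path(gpu_name: str) -> str:
    """Map a GPU name (e.g. from torch.cuda.get_device_name(0)) to a remote path."""
    for key, path in PREBUILT_ENGINES.items():
        if key in gpu_name:
            return path
    raise ValueError(f"No prebuilt engine for {gpu_name!r}; build your own (Method 2).")
```

For example, `engine_remote_path("NVIDIA GeForce RTX 4090")` gives the value to substitute for `<Remote Path>` in the command above.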

Method 2: Build your own engine

If you are using a different GPU, you can build the engine using the following command.

```shell
# Set the TensorRT build environment variables first. We provide a script to set up the environment.
source trt/activate.sh

# Method 1: Build the TensorRT engine. By default, it reads the `ckpts` folder in the current directory.
sh trt/build_engine.sh

# Method 2: If your model directory is not `ckpts`, specify the model directory.
sh trt/build_engine.sh </path/to/ckpts>
```
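After building, you can sanity-check that the plan file deserializes before running inference. A sketch assuming the `tensorrt` Python package from step 2 is importable; the helper name and default path are our own:

```python
def check_engine(plan_path: str = "./ckpts/t2i/model_trt/engine/model_onnx.plan") -> bool:
    """Return True if the serialized TensorRT engine deserializes successfully."""
    import tensorrt as trt  # lazy import: requires the TensorRT runtime installed

    logger = trt.Logger(trt.Logger.WARNING)
    with open(plan_path, "rb") as f:
        engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
    # deserialize_cuda_engine returns None on failure.
    return engine is not None
```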
4. Run inference with the TensorRT model.
```shell
# Run inference using the prompt-enhancement model + the HunyuanDiT TensorRT model.
python sample_t2i.py --prompt "渔舟唱晚" --infer-mode trt

# Disable prompt enhancement (saves GPU memory).
python sample_t2i.py --prompt "渔舟唱晚" --infer-mode trt --no-enhance
```
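For scripted batch generation, the two invocations above can be built programmatically. A sketch; the command-builder helpers are hypothetical and simply reproduce the flags shown:

```python
import subprocess
import sys


def trt_inference_cmd(prompt: str, enhance: bool = True) -> list[str]:
    """Build the sample_t2i.py command line for TensorRT inference."""
    cmd = [sys.executable, "sample_t2i.py", "--prompt", prompt, "--infer-mode", "trt"]
    if not enhance:
        cmd.append("--no-enhance")  # skip prompt enhancement to save GPU memory
    return cmd


def run_trt_inference(prompt: str, enhance: bool = True) -> None:
    """Run inference as a subprocess from the HunyuanDiT repo root."""
    subprocess.run(trt_inference_cmd(prompt, enhance), check=True)
```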