---
license: other
base_model: black-forest-labs/FLUX.1-schnell
tags:
- flux
- flux1-schnell
- text-to-image
- diffusers
- tensorrt
- tensorrt-rtx
- nvidia
- ampere
- bf16
---

# Flux.1-schnell TensorRT-RTX BF16 Ampere

TensorRT-RTX optimized engines for Flux.1-schnell on the NVIDIA Ampere architecture (RTX 30 series) with BF16 precision.

## Model Details

- **Base Model**: black-forest-labs/FLUX.1-schnell
- **Architecture**: Ampere (Compute Capability 8.6)
- **Precision**: BF16 (bfloat16, 16-bit brain floating point)
- **TensorRT-RTX Version**: 1.0.0.21
- **Image Resolution**: 1024x1024
- **Batch Size**: 1 (static)

## Engine Files

This repository contains 4 TensorRT engine files:

- `clip.plan` - CLIP text encoder
- `t5.plan` - T5 text encoder
- `transformer.plan` - Flux transformer model
- `vae.plan` - VAE decoder

**Total Size**: 16.7 GB

## Hardware Requirements

- NVIDIA RTX 30 series GPU (e.g. RTX 3080, RTX 3090) with Compute Capability 8.6. Note that TensorRT engines are architecture-specific; they will not run on GPUs with a different compute capability (the A100, for example, is CC 8.0).
- Minimum 24 GB VRAM recommended
- TensorRT-RTX 1.0.0.21 runtime

## Usage

```python
# Example usage with the TensorRT-RTX Flux demo pipeline
from nvidia_demos.TensorRT_RTX.demo.flux1_dev.pipelines.flux_pipeline import FluxPipeline

pipeline = FluxPipeline(
    cache_dir="./cache",
    hf_token="your_hf_token"
)

# Load pre-built engines
pipeline.load_engines(
    transformer_precision="bf16",
    opt_batch_size=1,
    opt_height=1024,
    opt_width=1024
)

# Generate an image
image = pipeline.infer(
    prompt="A beautiful landscape with mountains",
    height=1024,
    width=1024
)
```

## Performance

- **Inference Speed**: ~8-12 seconds per image (RTX 3090)
- **Memory Usage**: ~18-20 GB VRAM
- **Optimizations**: Static shapes, BF16 precision, Ampere-specific kernels

## License

This model follows the Flux.1-schnell license terms. Please refer to the original model repository for licensing details.

## Built With

- [TensorRT-RTX 1.0.0.21](https://developer.nvidia.com/tensorrt)
- [NVIDIA Flux Demo](https://github.com/NVIDIA/TensorRT-RTX/)
- Built on an NVIDIA GeForce RTX 3090 (Ampere, CC 8.6)
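As a quick sanity check before launching the pipeline, you can verify that all four engine files listed in "Engine Files" are present in your local engine directory. A minimal stdlib sketch; the `missing_engines` helper is hypothetical and not part of the TensorRT-RTX or demo-pipeline API:

```python
from pathlib import Path

# Engine file names as listed in the "Engine Files" section above.
EXPECTED_ENGINES = ["clip.plan", "t5.plan", "transformer.plan", "vae.plan"]

def missing_engines(engine_dir: str) -> list[str]:
    """Return the names of expected engine files not found in engine_dir."""
    d = Path(engine_dir)
    return [name for name in EXPECTED_ENGINES if not (d / name).is_file()]

if __name__ == "__main__":
    missing = missing_engines("./engines")
    if missing:
        print(f"Missing engine files: {missing}")
    else:
        print("All engine files present.")
```

Running this against an empty or incomplete directory reports which `.plan` files still need to be downloaded before `load_engines` will succeed.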