Transformers documentation
Efficient Inference on a Single GPU
You are viewing version v4.21.3. A newer version, v4.57.1, is available.
This document will be completed soon with information on how to run inference on a single GPU. In the meantime, you can check out the guide for training on a single GPU and the guide for inference on CPUs.