Transformers documentation

Optimum

Optimum is an optimization library that supports quantization for Intel, Furiosa, ONNX Runtime, GPTQ, and lower-level PyTorch quantization functions. It is designed to improve performance on specific hardware (Intel CPUs/HPUs, AMD GPUs, Furiosa NPUs, etc.) and with model accelerators like ONNX Runtime.
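To make the term concrete, here is a minimal sketch of what quantization does to a set of weights. It is not Optimum's API: `quantize_int8` and `dequantize_int8` are hypothetical helper names, and the symmetric per-tensor int8 scheme is only an assumption chosen for illustration. Real backends typically use more elaborate schemes, such as per-channel scales, zero points, and calibration.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one shared scale (symmetric, per-tensor)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any positive scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from int8 codes and their scale."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]  # illustrative values, not from a real model
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`). That small, bounded error is the price paid for storing each weight in 8 bits instead of 32.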
