Optimum
Optimum is an optimization library that supports quantization for Intel, Furiosa, ONNX Runtime, GPTQ, and lower-level PyTorch quantization functions. It is designed to improve performance on specific hardware (Intel CPUs/HPUs, AMD GPUs, Furiosa NPUs, and others) and with model accelerators like ONNX Runtime.