TinyLLama TensorRT LLM Edition.

This repo contains the TensorRT LLM version of TinyLlama Model. The conversion is done to support Float16 precision on Nvidia TensorRT.

Downloads last month
44
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support