---
license: mit
language:
- en
tags:
- text-generation-inference
- text
---
This repo contains the TensorRT-LLM version of the TinyLlama model. The checkpoint was converted to run at float16 (FP16) precision with NVIDIA TensorRT-LLM.
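Below is a minimal usage sketch showing how an FP16 TensorRT-LLM engine like this one might be loaded and queried with TensorRT-LLM's high-level Python API. The engine path is a placeholder, and exact class and parameter names can differ between TensorRT-LLM releases, so treat this as an illustration rather than the repo's official instructions.

```python
# Hypothetical sketch: running the converted TinyLlama FP16 engine with the
# TensorRT-LLM Python LLM API. The path below is a placeholder, not part of
# this repo's documented layout.
from tensorrt_llm import LLM, SamplingParams

# Point this at the directory containing the built TensorRT-LLM engine files.
llm = LLM(model="path/to/tinyllama-trtllm-fp16")  # placeholder path

# Basic sampling settings; adjust as needed.
sampling_params = SamplingParams(max_tokens=64, temperature=0.7)

outputs = llm.generate(
    ["Explain what TensorRT-LLM does in one sentence."],
    sampling_params,
)

for output in outputs:
    print(output.outputs[0].text)
```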