Owner

Included a link to QAT (Quantization Aware Training), and the quote relating to bfloat16 is now included.

I just finished a shortened bfloat16 definition for CUDA in another project. It is down to 5 lines of code, from the over 5000 lines Nvidia provides in /usr/local/cuda/include/cuda_bf16.h. And it works!
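For context, the core of such a minimal bfloat16 type can be sketched in a few lines of host-side C++. This is an illustrative sketch only, not the actual code from that project: it converts by plain truncation (keeping the top 16 bits of a float32), while Nvidia's cuda_bf16.h additionally handles round-to-nearest-even, NaN cases, device intrinsics, and arithmetic operators.

```cpp
#include <cstdint>
#include <cstring>

// Minimal bfloat16: store the upper 16 bits of an IEEE 754 float32.
// Sign and exponent are preserved exactly; the mantissa is truncated
// from 23 bits to 7 bits.
struct bf16 {
    uint16_t bits;
    explicit bf16(float f) {
        uint32_t u;
        std::memcpy(&u, &f, sizeof u);   // bit-exact view of the float
        bits = static_cast<uint16_t>(u >> 16);
    }
    operator float() const {
        uint32_t u = static_cast<uint32_t>(bits) << 16;  // zero-fill low mantissa bits
        float f;
        std::memcpy(&f, &u, sizeof f);
        return f;
    }
};
```

Values whose mantissa fits in 7 bits (such as 1.5f or -2.0f) round-trip exactly; everything else loses low-order precision, which is the trade bfloat16 makes for keeping float32's dynamic range.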

Owner

Looks good!

kreier changed pull request status to merged
