license: llama2
The model was trained with TRL. It did not fit on my RTX 3090 without significantly reducing the batch size and applying 4-bit quantization.
Training did not fully converge.
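For reference, a minimal sketch of what a TRL training setup with 4-bit quantization and a reduced batch size might look like. This is not the actual training configuration: the base model, dataset, and hyperparameters are placeholders, and the API shown follows recent TRL versions (`SFTConfig`/`SFTTrainer`).

```python
# Hypothetical TRL fine-tuning setup under a 24 GB VRAM constraint.
# Model and dataset names are placeholders, not the real training config.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit (NF4) quantization so the base model fits in 3090 memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",        # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder data

args = SFTConfig(
    output_dir="out",
    per_device_train_batch_size=1,     # dropped to fit in VRAM
    gradient_accumulation_steps=16,    # recover some effective batch size
)

trainer = SFTTrainer(model=model, train_dataset=dataset, args=args)
trainer.train()
```

Gradient accumulation partially compensates for the tiny per-device batch, though the very small true batch size and the quantization noise may be part of why convergence suffered. This is a configuration sketch and requires a GPU plus access to the model weights to run.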
 |