New Language
Hi Team,
Is it possible to use the pre-trained version to 'teach' the model a new language? If so, how many hours of audio would you recommend?
Cheers
I would also like to ask about the GPU VRAM requirement for training/finetuning the smallest model, on a single GPU rather than a cluster.
GPU VRAM depends on sequence length. All of our finetuning was done with short speech sequences and fits quite easily on 80 GB. You can also use PEFT/LoRA etc. to reduce VRAM requirements; check the GitHub for a link to an Unsloth PEFT implementation.
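For anyone looking for a starting point, here is a minimal sketch of wrapping a model with LoRA adapters via the Hugging Face `peft` library to cut trainable-parameter and optimizer memory. The checkpoint id and `target_modules` names are placeholders I've made up; adapt them to this repo's actual architecture (the Unsloth implementation linked on GitHub is a tested setup).

```python
# Minimal LoRA setup sketch using Hugging Face peft.
# NOTE: "your-org/your-model" and the target_modules list below are
# placeholders, not this repo's actual names; adjust before use.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-model",        # placeholder checkpoint id
    torch_dtype=torch.bfloat16,   # half precision roughly halves weight memory
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                         # adapter rank: lower rank = fewer trainable params
    lora_alpha=32,                # scaling factor applied to the adapter updates
    target_modules=["q_proj", "v_proj"],  # common choice; match your model's layer names
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # LoRA typically trains well under 1% of total params
```

Because only the small adapter matrices receive gradients, optimizer states shrink accordingly, which is where most of the VRAM savings come from.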
Re multilingual: you can get decent results from short finetunes on a few thousand samples, but matching English-level quality requires more data. We will offer better pretrained models for multilingual finetunes very soon.