Fine-tuning on a new language
I would love to try fine-tuning on a new language, but it is not clear how to extend the vocab and tokenizer for that language. Also, when running a basic eval, we keep getting this error:
bad operand type for unary -: 'NoneType'
I found this discussion, which may be the same issue: https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/46
It would be great if you could provide some guidance on:
- Extending vocab
- Customizing tokenizer
- And the NoneType error
I looked at the Korean fine-tuning example and also saw notebooks from people attempting to fine-tune for languages such as Turkish, but there seems to be no common, straightforward way to fine-tune on downstream tasks or new languages.
@amaraa
If you are seeing inference errors, please double-check your environment and dependencies. This is what we suggest:
https://huggingface.co/microsoft/Phi-4-multimodal-instruct#requirements
Alternatively, you can look at this Dockerfile:
https://github.com/anastasiosyal/phi4-multimodal-instruct-server/blob/main/dockerfile
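As a quick way to do that double check, here is a minimal sketch that compares installed package versions against a set of pins. The package names and the "X.Y.Z" versions in REQUIRED are placeholders, not the actual requirements; copy the real pins from the requirements section linked above. The helper name check_versions is also just made up for this example.

```python
from importlib import metadata

# Placeholder pins for illustration only (assumption): replace the names and
# versions with the ones listed in the model card's requirements section.
REQUIRED = {
    "torch": "X.Y.Z",
    "transformers": "X.Y.Z",
}


def check_versions(required, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in required.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore a local build suffix such as "+cu124" when comparing.
        base = installed.split("+", 1)[0] if installed is not None else None
        if base != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_versions(REQUIRED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

If this prints nothing once the real pins are filled in, the environment matches them. Any line it does print points at a package worth reinstalling before debugging the model code itself.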
This is the write-up on Korean fine-tuning contributed by the community:
https://huggingface.co/microsoft/Phi-4-multimodal-instruct#appendix-b-fine-tuning-korean-speech