Fine-tuning on a new language

#47
by amaraa - opened

I would love to try fine-tuning this model on a new language, but it is not clear how to extend the vocabulary and tokenizer for that language. Also, when running a basic evaluation, we keep getting this error:

bad operand type for unary -: 'NoneType'

I found this discussion, which may be the same issue: https://huggingface.co/microsoft/Phi-4-multimodal-instruct/discussions/46

It would be great if you could provide some guidance on:

  • Extending the vocabulary
  • Customizing the tokenizer
  • The NoneType error above

I looked at the Korean fine-tuning example and also saw notebooks from people attempting to fine-tune for languages such as Turkish, but there seems to be no common, easy way to fine-tune on downstream tasks or new languages.

Microsoft org

@amaraa
If you are seeing inference errors, please double-check your environment and dependencies. This is what we suggest:
https://huggingface.co/microsoft/Phi-4-multimodal-instruct#requirements

Alternatively, you can look at this Dockerfile:
https://github.com/anastasiosyal/phi4-multimodal-instruct-server/blob/main/dockerfile
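
To compare a local environment against the requirements list linked above, a quick version dump can help. The following is only a minimal sketch: the package names in `PACKAGES` are assumptions and should be replaced with whatever the requirements section actually lists, and `installed_versions` is a hypothetical helper, not part of any library.

```python
from importlib import metadata

# Assumed package names for illustration only; replace them with the
# entries from the requirements section of the model card.
PACKAGES = ["torch", "transformers", "accelerate", "soundfile", "pillow", "scipy", "peft"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the output can be compared with the requirements list.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

Any package reported as missing, or at a version different from the one in the requirements list, is a reasonable first thing to fix before debugging further.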

This is the community-contributed write-up on Korean fine-tuning:
https://huggingface.co/microsoft/Phi-4-multimodal-instruct#appendix-b-fine-tuning-korean-speech
