Can I fine-tune this model if I'm GPU poor?
#3 · opened by CCRss
If I only have, say, 8 or 64 H100 GPUs, will that be enough to fine-tune this model if I fine-tune only the MLP and vision parts? Also, would it be beneficial to fine-tune using LoRA?
Really great multilingual model — I tested it on Kazakh (kk) and Russian (ru), and it works well for OCR and writing in general. Note that I tested the online demo at https://www.hailuo.ai/, not the model itself, so they may be serving a different one.
Also, are there any examples of how to fine-tune this model?
- MiniMax-VL-01 training updates all parameters. So while applying LoRA fine-tuning to the vision part and MLP might offer advantages in specific scenarios, it could degrade the model's overall performance.
- The open-sourced model is the actual model deployed on https://www.hailuo.ai. Any inconsistency in experience may come from factors such as the system prompt, as well as the logic for switching between MiniMax-Text-01 and MiniMax-VL-01.
- We do not currently provide direct fine-tuning support. Technically, fine-tuning the MLP and vision parts on 64 H100 GPUs should be feasible: by not creating optimizer states for the LLM part, you save a significant amount of HBM. With 8 H100 GPUs, HBM might be tight, and offloading strategies could help; it may be worth looking at DeepSpeed (e.g. ZeRO-Offload) or other open-source frameworks for reference.
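To illustrate the memory-saving idea above, here is a minimal PyTorch sketch of freezing the LLM so the optimizer only allocates state for the vision and MLP parameters. The module names (`vision_tower`, `mlp_projector`, `language_model`) are illustrative stand-ins, not MiniMax-VL-01's actual attribute names.

```python
import torch
import torch.nn as nn

# Toy stand-in for a vision-language model; each Linear represents a
# whole component (ViT, projector, LLM) of the real architecture.
class ToyVLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(16, 16)    # stands in for the vision encoder
        self.mlp_projector = nn.Linear(16, 32)   # vision -> LLM bridge
        self.language_model = nn.Linear(32, 32)  # stands in for the LLM

model = ToyVLM()

# Freeze the LLM: no gradients, and no optimizer moment buffers either.
for p in model.language_model.parameters():
    p.requires_grad_(False)

# Build the optimizer over trainable params only. AdamW keeps two fp32
# moment tensors per parameter, so excluding the LLM avoids that HBM cost
# for the vast majority of the model's weights.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

n_total = sum(p.numel() for p in model.parameters())
n_train = sum(p.numel() for p in trainable)
print(f"training {n_train}/{n_total} parameters")
```

On the real model the same pattern applies: freeze the language-model submodule, then pass only `requires_grad` parameters to the optimizer (and to any DeepSpeed config's parameter groups).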
Thank you for your interest in our project!
CCRss changed discussion status to closed