Will the fine-tuning code be provided?

#6
by AXCXEPT - opened

I truly believe that BitNet is an outstanding achievement that opens up new possibilities for the future.
Thank you very much for releasing such an impressive model.
While the paper is available, I was wondering whether you also plan to release code for properly fine-tuning the BitNet model?

Sincerely,
Axcxept Inc.

@shumingma, can you provide any information and code for fine-tuning this model?

Thank you for your reply.
After reading the paper, I was able to resolve the issue by using SFT and DPO, setting the learning rate to 2e-7, and testing with a longer context length.
I truly appreciate your outstanding work.
If there’s anything else we should be mindful of during training, please let me know.
From our experiments, we concluded that no special training code is required.

@AXCXEPT, could you please post your fine-tuning code? Could you also train a binary text classifier with fine-tuning?

@robustdev

We were actually able to perform training with SFT yesterday. However, upon reflection, this shouldn’t have been possible due to the specifications of TRL.

Now our training code does not work.

Therefore, we are now investigating what actually happened in the following thread:

https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/discussions/12

Can anyone answer that question?
