Will the fine-tuning code be provided?

by AXCXEPT - opened Apr 17

Apr 17

I truly believe that BitNet is an outstanding achievement that opens up new possibilities for the future.
Thank you very much for releasing such an impressive model.
The paper is available, but do you also plan to release code for properly fine-tuning the BitNet model?

Sincerely,
Axcxept Inc.

Gerald001

Apr 17 (edited Apr 17)

@shumingma Can you provide any information and code for fine-tuning this model?

AXCXEPT

Apr 17

Thank you for your reply.
After reading the paper, I was able to resolve the issue by using SFT and DPO, setting the learning rate to 2e-7, and testing with a longer context length.
I truly appreciate your outstanding work.
If there’s anything else we should be mindful of during training, please let me know.
From our experiments, we concluded that no special training code is required.
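
For readers looking for a starting point, the settings described above might be collected roughly as follows. This is only a sketch under stated assumptions: the one value taken from this post is the learning rate of 2e-7. The `TrainConfig` class, the `build_stages` helper, the `max_seq_length` value, and the SFT-then-DPO ordering are hypothetical placeholders, and the snippet deliberately does not call TRL or any other training library.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters mentioned in this post."""

    stage: str                    # "sft" or "dpo"
    learning_rate: float = 2e-7   # the value reported in this post
    max_seq_length: int = 4096    # assumption: the post only says "longer context length"

    def __post_init__(self) -> None:
        # Reject anything other than the two stages named in the post.
        if self.stage not in ("sft", "dpo"):
            raise ValueError(f"unknown stage: {self.stage!r}")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.max_seq_length <= 0:
            raise ValueError("max_seq_length must be positive")


def build_stages() -> list[TrainConfig]:
    """Return one config per stage; running SFT before DPO is an assumption."""
    return [TrainConfig(stage="sft"), TrainConfig(stage="dpo")]


if __name__ == "__main__":
    for cfg in build_stages():
        # A real run would pass these values to whatever trainer is in use.
        print(asdict(cfg))
```

Keeping the values in one place makes it easier to compare runs if the trainer or library version changes.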

robustdev

Apr 18 (edited Apr 18)

@AXCXEPT Could you please post your fine-tuning code? Could you also train a binary text classifier by fine-tuning this model?

AXCXEPT

Apr 18

@robustdev

We were actually able to perform training with SFT yesterday. However, upon reflection, this shouldn’t have been possible due to the specifications of TRL.

Our training code no longer works.

We are therefore investigating the cause in the following thread.

https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/discussions/12

Can anyone answer that question?
