Great model, Seeking Advice on Fine-Tuning for Domain Reasoning Tasks

#4
by aaditya - opened

Thanks for open-sourcing this, it's one of the best reasoning models I've tried in this category so far. I'm currently working on adapting it for domain-specific reasoning. Could you advise on what the dataset should look like for this purpose?

Is it sufficient to use an Alpaca-format dataset ({"instruction": "", "input": "", "output": ""}), or would I need reasoning traces for effective fine-tuning? Also, would you recommend QLoRA or SFT for this task?

Any tips or best practices would be greatly appreciated!

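Thinking can be "on" or "off", and it is set through the system prompt:

thinking = "on"  # or "off"

# `pipeline` is assumed to be a text-generation pipeline already loaded with this model
print(pipeline([{"role": "system", "content": f"detailed thinking {thinking}"}, {"role": "user", "content": "Solve x*(sin(x)+2)=0"}]))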
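
Not an official answer, but here is a minimal sketch of how an Alpaca-style record could be mapped onto the same system/user/assistant layout used in the snippet above. The helper name `alpaca_to_chat`, the `reasoning` argument, and the `<think>...</think>` wrapper are assumptions for illustration only; the actual trace delimiter should be checked against the model card and chat template.

```python
import json


def alpaca_to_chat(record, thinking="off", reasoning=None):
    """Map one Alpaca-style record onto a system/user/assistant message list."""
    # Alpaca keeps the task in "instruction" and optional context in "input".
    user_text = record["instruction"]
    if record.get("input"):
        user_text += "\n\n" + record["input"]

    answer = record["output"]
    # ASSUMPTION: reasoning traces are wrapped in <think>...</think>.
    # Verify the real delimiter in the model card before training.
    if thinking == "on" and reasoning:
        answer = f"<think>\n{reasoning}\n</think>\n\n{answer}"

    return [
        # Same system prompt convention as the inference snippet in this thread.
        {"role": "system", "content": f"detailed thinking {thinking}"},
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": answer},
    ]


example = {"instruction": "Solve x*(sin(x)+2)=0", "input": "", "output": "x = 0"}
messages = alpaca_to_chat(
    example,
    thinking="on",
    reasoning="sin(x)+2 is at least 1, so it is never zero; the product vanishes only when x = 0.",
)
print(json.dumps(messages, indent=2))
```

With `thinking="off"` and no `reasoning`, the assistant turn is just the Alpaca `output`, so a plain Alpaca dataset can be converted as-is; records with traces would be paired with `thinking="on"`.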

It's a good model, but I need the training data format, please.
