Great model, Seeking Advice on Fine-Tuning for Domain Reasoning Tasks
#4
by
aaditya
- opened
Thanks for open-sourcing this, it's one of the best reasoning models I've tried in this category so far. I'm currently working on adapting it for domain-specific reasoning. Could you advise on what the dataset should look like for this purpose?
Is it sufficient to use an Alpaca-format dataset ({"instruction": "", "input": "", "output": ""}),
or would I need reasoning traces for effective fine-tuning? Also, would you recommend QLoRA or full SFT for this task?
Any tips or best practices would be greatly appreciated!
Thinking can be "on" or "off"
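thinking = "on"
# Assumes `pipeline` is an already-created transformers text-generation pipeline for this model.
messages = [
    {"role": "system", "content": f"detailed thinking {thinking}"},
    {"role": "user", "content": "Solve x*(sin(x)+2)=0"},
]
print(pipeline(messages))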
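For the dataset question, one possible way to map an Alpaca-style record onto the same chat layout as the snippet above is sketched here. This is not a confirmed training format for this model: the helper name `alpaca_to_chat`, the optional `reasoning` argument, and the `<think>`/`</think>` delimiters are all assumptions made for illustration, so check them against the model's chat template before training on anything shaped like this.

```python
import json

def alpaca_to_chat(record, reasoning=None, think_tags=("<think>", "</think>")):
    """Convert one Alpaca-style record into a chat-style example.

    Hypothetical helper: the trace delimiters in `think_tags` are an
    assumption, not something confirmed for this model.
    """
    # Use the "detailed thinking on/off" system prompt from the thread:
    # "on" when a reasoning trace is supplied, "off" otherwise.
    thinking = "on" if reasoning else "off"

    # Alpaca keeps the task and its optional context in separate fields.
    user_content = record["instruction"]
    if record.get("input"):
        user_content += "\n\n" + record["input"]

    # Put the trace (if any) in front of the final answer.
    answer = record["output"]
    if reasoning:
        open_tag, close_tag = think_tags
        answer = f"{open_tag}\n{reasoning}\n{close_tag}\n\n{answer}"

    return {
        "messages": [
            {"role": "system", "content": f"detailed thinking {thinking}"},
            {"role": "user", "content": user_content},
            {"role": "assistant", "content": answer},
        ]
    }

record = {"instruction": "Solve x*(sin(x)+2)=0", "input": "", "output": "x = 0"}
example = alpaca_to_chat(
    record,
    reasoning="sin(x)+2 is always at least 1, so the only root is x = 0.",
)
print(json.dumps(example, indent=2))
```

Records without a trace fall back to "detailed thinking off", so a plain Alpaca dataset and a trace-annotated one can go through the same function.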
It's a good model, but I need the training data format, please.