Mar 22

Thanks for open-sourcing this. It's one of the best reasoning models I've tried in this category so far. I'm currently working on adapting it for domain-specific reasoning. Could you advise on what the dataset should look like for this purpose?

Is it sufficient to use an Alpaca-format dataset ({"instruction": "", "input": "", "output": ""}), or would I need reasoning traces for effective fine-tuning? Also, would you recommend QLoRA or SFT for this task?

Any tips or best practices would be greatly appreciated!

aifeifei798

Mar 23

Thinking can be "on" or "off"

thinking = "on"

print(pipeline([{"role": "system", "content": f"detailed thinking {thinking}"}, {"role": "user", "content": "Solve x*(sin(x)+2)=0"}]))

It's a good model, but I need the training data format, please.

aifeifei798

Mar 23

https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset-v1
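For anyone starting from Alpaca-style data, here is a rough sketch of how a record could be turned into chat messages that use the same "detailed thinking on/off" system prompt as the snippet above. The helper name `alpaca_to_chat` is hypothetical, and wrapping the reasoning trace in `<think>` tags is an assumption on my part, so please check it against the actual records in the linked dataset before training on it.

```python
import json


def alpaca_to_chat(record, reasoning=None):
    """Convert one Alpaca-style record into a list of chat messages.

    `reasoning` is an optional reasoning trace. When it is given, the system
    prompt turns thinking "on" and the trace is placed before the answer.
    The <think> tag wrapping is an ASSUMPTION, not a confirmed format.
    """
    thinking = "on" if reasoning else "off"

    # Alpaca keeps the task and its optional context in separate fields.
    user_content = record["instruction"]
    if record.get("input"):
        user_content += "\n\n" + record["input"]

    answer = record["output"]
    if reasoning:
        answer = f"<think>\n{reasoning}\n</think>\n\n{answer}"

    return [
        {"role": "system", "content": f"detailed thinking {thinking}"},
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": answer},
    ]


example = {
    "instruction": "Solve for x.",
    "input": "x*(sin(x)+2)=0",
    "output": "x = 0",
}
messages = alpaca_to_chat(
    example,
    reasoning="sin(x)+2 is always at least 1, so only x = 0 works.",
)
print(json.dumps(messages, indent=2))
```

Calling `alpaca_to_chat(example)` without a trace gives a "detailed thinking off" sample with the plain answer, so one source file could in principle feed both modes.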

nvidia
/

Llama-3.1-Nemotron-Nano-8B-v1

Great model, Seeking Advice on Fine-Tuning for Domain Reasoning Tasks
