Stops working after first two messages
Had the same issue with QwQ 32b, to be honest, but this seems to work better. For the first two messages it thinks and outputs a response, but from the third message on it either doesn't think anymore, or it thinks but never outputs a final answer.
I'm using the Q5 variant in LM Studio with temp 0.8, repeat penalty 1.1, and 32k context. Tried with top-k on and off, top-p on and off, min-p sampling on and off. Same result.
I'm using the system prompt:
You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem.
As for the prompt template, I don't know what it's supposed to be, but this is the one LM Studio loaded automatically:
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>
What am I doing wrong? Or is it like QwQ 32b, where there is some kind of error?
I'm seeing the same issue: I can only get it to use reasoning for 1 or 2 responses at most.
Just make sure it's the last thing in the prompt rather than the first, that helped a lot for me.
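For anyone building the messages themselves, here is a rough sketch of one way to read that advice. It assumes "it" means the thinking instruction, and that "last" means appending it to the newest user message instead of sending it as a leading system message. `build_messages` and its parameters are made-up names for illustration, not part of LM Studio or any library, and I can't promise this is what the poster above actually did.

```python
def build_messages(history, user_message, instruction):
    """Return chat messages with the instruction placed last instead of first.

    Hypothetical helper: assumes OpenAI-style {"role", "content"} dicts.
    """
    # Drop system messages so the instruction is not also sent up front.
    messages = [m for m in history if m["role"] != "system"]
    # Append the instruction to the end of the newest user turn.
    messages.append(
        {"role": "user", "content": f"{user_message}\n\n{instruction}"}
    )
    return messages


history = [
    {"role": "system", "content": "old system prompt"},
    {"role": "user", "content": "first question"},
    {"role": "assistant", "content": "first answer"},
]
messages = build_messages(history, "third question", "Think step by step first.")
```

The resulting list has no system entry, and the instruction is the final text the model sees before it starts generating.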