garbage output?

#1
by jacek2024 - opened

I can't get any useful output from Q8. Does it work for you?
(Maybe this model requires special handling?)

DevQuasar org

It's working for me.
Maybe update your inference backend.

Proof:
Screenshot 2025-05-31 at 2.59.23 PM.png

Just realized this problem is not limited to this model; it also affects the original AM-Thinking. However, Qwen2.5 32B works correctly.
I recompiled llama.cpp from git, no change. I wonder what I am doing incorrectly (I run many other models without issues).

am_problem_1.png

am_problem_2.png

OK, solved it: the system prompt can't be empty.
I compared llama-cli to llama-server. In llama-cli the reply was OK because it supplied a system prompt automatically; in llama-server I had left it empty or edited it.
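For anyone hitting the same thing via llama-server's OpenAI-compatible `/v1/chat/completions` endpoint, a minimal sketch of the workaround is to always include a non-empty system message in the request. The model name, system text, and port below are placeholders, not values from this thread:

```python
import json

def build_chat_request(user_text: str,
                       system_text: str = "You are a helpful assistant.") -> dict:
    """Build a chat-completions payload with a guaranteed non-empty system prompt."""
    # Guard against the failure mode described above: an empty system
    # prompt produced garbage output with this model.
    if not system_text.strip():
        raise ValueError("system prompt must not be empty for this model")
    return {
        "model": "AM-Thinking-v1-Q8_0",  # placeholder, match your server's model name
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
    }

payload = build_chat_request("Hello, who are you?")
body = json.dumps(payload)
# POST `body` to http://localhost:8080/v1/chat/completions
# (8080 is llama-server's default port)
```

The same applies to the built-in web UI: make sure the system prompt field is filled in rather than blank.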

jacek2024 changed discussion status to closed
