Doesn't work... it just repeats all user messages.
Tried it in LM Studio and GPT4All, and also with some other templates.
Hello,
Our model is tested using Transformers and vLLM. You can find all the evaluations on the model card. Judging by the behavior you describe, it seems to be a problem with the LM Studio integration.
In the meantime, in order to reproduce the error, we will need more details about the prompt you used and how you applied the ChatML template. Can you send us some example prompts?
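For reference, a ChatML prompt wraps each turn in <|im_start|> / <|im_end|> markers, so a correctly templated request should look like this (the message contents here are just placeholders):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Translate this sentence into Spanish: The weather is nice today.<|im_end|>
<|im_start|>assistant
```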
We will check the model and report back to you as soon as possible.
Best regards,
Carlos
Thanks for checking...
Typed the question "what are your main features?" and got a useless answer.
Attached an English document (3 pages, raw text file),
asked for specific content; the answer was very general.
All with the standard template, with or without a system prompt.
System: Win11, LM Studio v0.3.15.
The most unstable model I've ever tried so far, and I've had hundreds.
All in comparison to Granite 2B or 8B.
OK, one difference: I didn't load your q4 GGUF version...
I used the Hugging Face repo and created a q8 version with
https://huggingface.co/spaces/ggml-org/gguf-my-repo
but that should work, right?
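For reference, that space is roughly equivalent to running llama.cpp's own conversion tooling locally. A sketch of what I mean (the paths are placeholders, and the script/binary names vary between llama.cpp versions):

```python
import subprocess

# Convert a local copy of the Hugging Face repo to a full-precision GGUF
# (convert_hf_to_gguf.py ships with llama.cpp; the paths are placeholders).
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", "./salamandraTA-7b-instruct",
     "--outfile", "salamandrata-7b-f16.gguf"],
    check=True,
)

# Quantize the result to q8_0 with llama.cpp's quantize tool.
subprocess.run(
    ["llama.cpp/llama-quantize", "salamandrata-7b-f16.gguf",
     "salamandrata-7b-q8_0.gguf", "Q8_0"],
    check=True,
)
```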
Hello,
Thank you for your reply.
As we state in the model card, this model is a machine-translation-only version of Salamandra 7B:
DISCLAIMER: This version of Salamandra is tailored exclusively for translation tasks. It lacks chat capabilities and has not been trained with any chat instructions.
For chat prompts or non-translation tasks, the model won't generate a correct answer. For those tasks we recommend using the general LLM version of Salamandra:
https://huggingface.co/BSC-LT/salamandra-7b-instruct
Regarding the example in your screenshot: LM Studio assumes that the LLM is a conversational model, which SalamandraTA-7B-instruct is not. To obtain the best results, consider the following settings:
- Deactivate the context from previous conversation turns, or test in a fresh conversation.
- Unlike most LLMs, which use sampling as the decoding method, this model is designed to use temperature = 0. This ensures consistency when the model translates similar inputs and prevents hallucinations.
- Setting the EOS token to <|im_end|> can also help with consistency. (Both settings appear in the sketch after this list.)
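In Transformers terms, these settings correspond roughly to the following sketch (the translation instruction below is only an example wording; see the model card for the recommended prompt format):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo name inferred from the discussion above.
model_id = "BSC-LT/salamandraTA-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{
    "role": "user",
    "content": "Translate the following text from English into Spanish: The weather is nice today.",
}]
# apply_chat_template wraps the message in the ChatML markers for us.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=False,  # greedy decoding, i.e. the temperature = 0 behavior above
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```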
Here is a generation example and the settings I used in LM Studio:
Hope this helps and that you're able to run the model without issues!
Best regards,
Carlos
I see...
OK, got it...
But even with the instruct version I don't get stable translations... unlike Granite, where it always works. Yours is either too specialized or in a very early ALPHA state ^^