Differences in output from the original model
Hi, thank you for providing the GGUF version. I am the author of the original model.
In this GGUF model, it seems that newline characters are displayed literally as "<0x0A>", or the stop token is lost. A friend pointed this problem out to me; it does not occur with the original model.
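For reference, the symptom should be reproducible with something like the following snippet (the GGUF file name is a placeholder, and the llama-cpp-python runtime is an assumption on my part):

```python
from llama_cpp import Llama

# Load the affected GGUF (placeholder file name).
llm = Llama(model_path="./model-q4_K_M.gguf")

# Ask for multi-line output; with the broken tokenizer, newlines
# come back as the literal token text "<0x0A>".
out = llm("Please answer in two lines.\n", max_tokens=64)
print(out["choices"][0]["text"])
```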
I have no idea what is causing this problem. Is there anything I can do to help?
Thank you for the report.
I have confirmed the issue, and corrected versions of both the 7B and 13B models are uploading now.
When merging the Swallow tokenizer (an extended Llama tokenizer) with the Llama tokenizer from tulu-2,
the original SentencePiece tokenizer (tokenizer.model) was converted to Hugging Face's FastTokenizer (tokenizer.json).
That conversion caused the issue, because the converted tokenizer behaves differently from the SPM tokenizer when run through llama.cpp.
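To illustrate the kind of divergence involved, one way to compare the original SPM-backed (slow) tokenizer against the converted FastTokenizer is via transformers; the model directory is a placeholder, and which tokens actually differ depends on the merge:

```python
from transformers import AutoTokenizer

model_dir = "./merged-model"  # placeholder path

# use_fast=False loads the original SentencePiece tokenizer.model;
# use_fast=True loads the converted tokenizer.json (FastTokenizer).
slow = AutoTokenizer.from_pretrained(model_dir, use_fast=False)
fast = AutoTokenizer.from_pretrained(model_dir, use_fast=True)

text = "line one\nline two"
print(slow.tokenize(text))
print(fast.tokenize(text))
# Any mismatch here (for example in how "\n" is tokenized)
# carries over into the GGUF conversion.
```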
Since the extended SPM tokenizer used in Swallow covers the full range of tulu-2's Llama tokenizer,
I brought over the tokenizer.model from Swallow and converted from that instead.
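In case it helps anyone redoing the conversion, fetching that file and placing it next to the merged weights could look like this (local paths are placeholders); with tokenizer.model present, llama.cpp's convert script can use the SPM vocab directly:

```python
import shutil
from huggingface_hub import hf_hub_download

# Download Swallow's original SentencePiece model (paths are placeholders).
src = hf_hub_download(
    repo_id="tokyotech-llm/Swallow-7b-instruct-hf",
    filename="tokenizer.model",
)
shutil.copy(src, "./merged-model/tokenizer.model")
# llama.cpp's convert script can now read the SPM tokenizer.model
# instead of the FastTokenizer-derived tokenizer.json.
```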
Converting a tokenizer.json that was derived from SPM into GGUF without problems is quite a hassle,
so it would be helpful if you could include Swallow's tokenizer.model in the model repository;
that would make GGUF conversion easier for others in the future.
(Since Swallow's tokenizer.model covers the range of tulu-2's vocab, it should be fine to include as-is.)
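If you want to double-check that containment claim, a rough sketch (both repo ids are assumptions for illustration) would be:

```python
from transformers import AutoTokenizer

# Repo ids assumed for illustration; use_fast=False loads the SPM tokenizers.
swallow = AutoTokenizer.from_pretrained("tokyotech-llm/Swallow-7b-instruct-hf", use_fast=False)
tulu = AutoTokenizer.from_pretrained("allenai/tulu-2-7b", use_fast=False)

# If Swallow is a strict extension, every tulu-2 token should exist
# in Swallow's vocab with the same id.
swallow_vocab = swallow.get_vocab()
mismatched = {t for t, i in tulu.get_vocab().items() if swallow_vocab.get(t) != i}
print(f"mismatched tokens: {len(mismatched)}")  # expected: 0
```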
https://huggingface.co/tokyotech-llm/Swallow-7b-instruct-hf/tree/main
Thank you in advance.
I understand the details of this problem. Thank you so much!