help with quant


I have just cloned https://github.com/fbuciuni90/llama.cpp and tried to quantize Velvet 2B, but I am still getting:

WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:**          There are 2 possible reasons for this:
WARNING:hf-to-gguf:**          - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:**          - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:**          Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref:     https://github.com/ggml-org/llama.cpp/pull/6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh:  a3df2b8943e01cfd7d68c9f8446b294f3d8706d1d6853df65df7fda5d4fcb19f
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:

Traceback (most recent call last):
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 1572, in set_vocab
    self._set_vocab_sentencepiece()
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 795, in _set_vocab_sentencepiece
    tokens, scores, toktypes = self._create_vocab_sentencepiece()
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 812, in _create_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: neuro-sama-velvet-2b-ita/orig/tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 1575, in set_vocab
    self._set_vocab_llama_hf()
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 887, in _set_vocab_llama_hf
    vocab = gguf.LlamaHfVocab(self.dir_model)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Data/llama.cpp/gguf-py/gguf/vocab.py", line 384, in __init__
    raise TypeError('Llama 3 must be converted with BpeVocab')
TypeError: Llama 3 must be converted with BpeVocab

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 5117, in <module>
    main()
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 5111, in main
    model_instance.write()
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 1578, in set_vocab
    self._set_vocab_gpt2()
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 731, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
                               ^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 526, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py", line 719, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()

I double-checked, and all the changes made in #11716 are already included. Shouldn't it work with the 2B even though the fork targets the 14B?

Cheers,
Wasa

Hi @WasamiKirua, thanks for reaching out.
I just tested the procedure again with the latest commits, and it's working fine. Are you sure you are using the fbuciuni90/llama.cpp fork?
Please check the convert_hf_to_gguf.py script, specifically around line 706, to verify that Velvet support is present, like this:

        if chkhsh == "a3df2b8943e01cfd7d68c9f8446b294f3d8706d1d6853df65df7fda5d4fcb19f":
            # ref: https://huggingface.co/Almawave/Velvet-14B
            res = "velvet"
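
If it helps, here is a minimal sketch of how the converter derives that chkhsh value, so you can check whether your local tokenizer matches the Velvet entry. It assumes the model directory name taken from your traceback and that transformers is installed; note that chktxt below is only a placeholder and must be copied verbatim from get_vocab_base_pre() in convert_hf_to_gguf.py, otherwise the hash will not match.

    # Sketch only: reproduce the pre-tokenizer hash the converter computes.
    # "neuro-sama-velvet-2b-ita" is the model directory from the traceback above;
    # chktxt must be pasted verbatim from get_vocab_base_pre() in convert_hf_to_gguf.py.
    from hashlib import sha256
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("neuro-sama-velvet-2b-ita")
    chktxt = "..."  # placeholder: copy the probe string from the converter script
    chkhsh = sha256(str(tokenizer.encode(chktxt)).encode()).hexdigest()
    print(chkhsh)  # expect a3df2b8943e01cfd7d68c9f8446b294f3d8706d1d6853df65df7fda5d4fcb19f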

After that, I suggest checking your launch commands again to ensure you are running the script from the correct folder, especially if you have multiple llama.cpp checkouts or Python virtual environments (venvs); a quick sanity check is sketched below.
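
For example, a quick (hypothetical) check is to confirm that the converter script you are actually invoking contains the Velvet hash, using the script path shown in your traceback:

    # Sketch: verify that the converter you are about to run knows the Velvet chkhsh.
    # The path below is taken from the traceback; adjust it to your setup.
    script = "/Volumes/Data/llama.cpp/convert_hf_to_gguf.py"
    with open(script) as f:
        src = f.read()
    print("Velvet chkhsh present:",
          "a3df2b8943e01cfd7d68c9f8446b294f3d8706d1d6853df65df7fda5d4fcb19f" in src)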

Good luck :D
