Runtime error
Exit code: 1. Reason:

tokenizer.json: 100%|██████████| 1.84M/1.84M [00:00<00:00, 55.5MB/s]
special_tokens_map.json: 100%|██████████| 411/411 [00:00<00:00, 2.87MB/s]

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.

config.json: 100%|██████████| 821/821 [00:00<00:00, 5.51MB/s]

Traceback (most recent call last):
  File "/home/user/app/app.py", line 9, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_name)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 317, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4884, in from_pretrained
    hf_quantizer = AutoHfQuantizer.from_config(
  File "/usr/local/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 185, in from_config
    return target_cls(quantization_config, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 49, in __init__
    raise ImportError("Loading a GPTQ quantized model requires optimum (`pip install optimum`)")
ImportError: Loading a GPTQ quantized model requires optimum (`pip install optimum`)
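The failure is an environment problem, not a bug in app.py: the checkpoint being loaded is GPTQ-quantized, and transformers delegates GPTQ loading to the optimum package, which is not installed in the container. A minimal sketch of the fix for a Hugging Face Space, assuming dependencies are declared in a requirements.txt; only optimum is named by the traceback, and the auto-gptq entry is an assumption (GPTQ kernels are commonly required alongside optimum on recent transformers versions):

    torch
    transformers
    optimum      # required by transformers to load GPTQ checkpoints (per the ImportError above)
    auto-gptq    # assumption: GPTQ kernel package usually needed alongside optimum

After committing the updated requirements.txt, restart the Space so the container reinstalls its dependencies (locally, the equivalent is `pip install optimum auto-gptq`); the AutoModelForCausalLM.from_pretrained(model_name) call at app.py line 9 should then construct the GPTQ quantizer instead of raising.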