runtime error
Exit code: 1. Reason:

/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py:716: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
config.json: 100%|██████████| 608/608 [00:00<00:00, 2.71MB/s]
model.safetensors: 100%|█████████▉| 2.20G/2.20G [00:02<00:00, 813MB/s]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 13, in <module>
    model = LlamaForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype=torch.bfloat16, device_map="auto")
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4090, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 906, in __init__
    self.model = LlamaModel(config)
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 640, in __init__
    [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 640, in <listcomp>
    [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 424, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
KeyError: 'sdpa'
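The traceback shows the failure is not in transformers itself but in lxt's patched Llama model: its `LLAMA_ATTENTION_CLASSES` dict has no entry for `'sdpa'`, the attention implementation that recent transformers versions select by default via `config._attn_implementation`. A minimal, self-contained sketch of that failure mode and the usual workaround is below; the dict contents here are illustrative assumptions, not lxt's actual registry, and the real-world fix (passing `attn_implementation="eager"` to `from_pretrained`, assuming lxt registers an `"eager"` implementation) is hedged accordingly.

```python
# Illustrative stand-in for lxt's registry: only some implementations are
# registered, and "sdpa" is not among them (assumed for this sketch).
LLAMA_ATTENTION_CLASSES = {"eager": object}

def build_attention(attn_implementation: str):
    # Mirrors lxt/models/llama.py line 424: a plain dict lookup, so an
    # unregistered implementation name raises KeyError.
    return LLAMA_ATTENTION_CLASSES[attn_implementation]

# Failing path, as in the traceback: transformers defaults to "sdpa".
try:
    build_attention("sdpa")
except KeyError as e:
    failed = str(e)  # "'sdpa'", the same KeyError as in the log

# Workaround: request an implementation the patched model actually
# registers, e.g. model = LlamaForCausalLM.from_pretrained(...,
# attn_implementation="eager") in the real app.py.
ok = build_attention("eager")
```

If lxt only supports eager attention, an alternative is pinning transformers to a version whose default attention implementation lxt was written against; the `attn_implementation` keyword route avoids touching the dependency pins.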