runtime error
Exit code: 1.
model-00003-of-00004.safetensors: 100%|██████████| 4.92G/4.92G [00:03<00:00, 1.28GB/s]
model-00004-of-00004.safetensors: 100%|██████████| 1.17G/1.17G [00:01<00:00, 862MB/s]
Downloading shards: 100%|██████████| 4/4 [00:14<00:00, 3.61s/it]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 29, in <module>
    model = LlamaForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="cuda", use_safetensors=True)
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 262, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4185, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 906, in __init__
    self.model = LlamaModel(config)
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 640, in __init__
    [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 640, in <listcomp>
    [LlamaDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
  File "/usr/local/lib/python3.10/site-packages/lxt/models/llama.py", line 424, in __init__
    self.self_attn = LLAMA_ATTENTION_CLASSES[config._attn_implementation](config=config, layer_idx=layer_idx)
KeyError: 'sdpa'
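What the traceback shows: transformers resolved config._attn_implementation to "sdpa" (its usual default on recent versions), but the patched Llama classes in lxt/models/llama.py look the implementation up in their own LLAMA_ATTENTION_CLASSES dict, which only registers the implementations the library has rewritten, so the lookup raises KeyError: 'sdpa'. A minimal sketch of the failure mode, assuming (hypothetically) that "eager" is among the registered keys; the stand-in dict and helper below are illustrative, not lxt's actual code:

```python
# Stand-in for lxt's LLAMA_ATTENTION_CLASSES: only the patched
# implementations are registered, "sdpa" is not among them.
ATTENTION_CLASSES = {"eager": object}

def pick_attention(impl: str):
    # Reproduces the crash: a direct lookup with "sdpa" raises KeyError,
    # exactly as in lxt/models/llama.py line 424.
    try:
        return ATTENTION_CLASSES[impl]
    except KeyError:
        # Falling back to a registered key mirrors the usual workaround:
        # request it explicitly at load time instead of taking the default.
        return ATTENTION_CLASSES["eager"]

print(pick_attention("sdpa") is ATTENTION_CLASSES["eager"])  # True
```

In the real app the equivalent workaround would be to pass attn_implementation="eager" (a genuine from_pretrained keyword in transformers) in the app.py call at line 29, so the config never resolves to "sdpa" in the first place; whether lxt registers "eager" should be verified against the installed lxt version.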