Support via the `transformers` library
#3 · by pevogam
Do you think it is likely that support for loading this model via the `transformers` library will land soon? Alternatively, is there a way to load it with some additional Python tooling? I mean something like the original sample Meta provided for the unquantized versions:
```python
from transformers import AutoProcessor, Llama4ForConditionalGeneration
import torch

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"

processor = AutoProcessor.from_pretrained(model_id)
model = Llama4ForConditionalGeneration.from_pretrained(
    model_id,
    attn_implementation="flex_attention",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
```
which for `unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF` will result in

```
OSError: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF/tree/main' for available files.
```
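For reference, `transformers` does document a generic GGUF loading path via the `gguf_file` argument to `from_pretrained` (it dequantizes the weights on load), so I would imagine something along these lines might work once the `llama4` architecture is covered by that integration. This is only a sketch; the GGUF filename below is a guess on my part and would need to match a file actually present in this repo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF"
# Hypothetical filename -- check the repo's file list for the real
# quantization/shard names.
gguf_file = "Llama-4-Scout-17B-16E-Instruct-Q4_K_M.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)
```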
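In the meantime, the closest alternative Python tooling I can think of is `llama-cpp-python`, which can pull a GGUF straight from the Hub. Again a rough sketch, with the filename glob being an assumption on my side:

```python
from llama_cpp import Llama

# Load the GGUF through the llama.cpp bindings instead of transformers.
llm = Llama.from_pretrained(
    repo_id="unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF",
    filename="*Q4_K_M*",  # glob matching the desired quantization (assumed)
    n_gpu_layers=-1,      # offload all layers to GPU if available
    n_ctx=8192,
)

output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(output["choices"][0]["message"]["content"])
```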