Support via the `transformers` library

#3
by pevogam

Do you think it is likely that support for loading this model in the `transformers` library will land soon? Alternatively, is there a way to load it with additional Python tooling? I mean something like the original sample Meta provides for the unquantized versions:

```python
from transformers import AutoProcessor, Llama4ForConditionalGeneration
import torch

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"

processor = AutoProcessor.from_pretrained(model_id)
model = Llama4ForConditionalGeneration.from_pretrained(
    model_id,
    attn_implementation="flex_attention",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
```

which, for `unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF`, results in:

```
OSError: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF/tree/main' for available files.
```
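
For reference, `transformers` does expose a generic path for loading GGUF checkpoints directly, via the `gguf_file` argument to `from_pretrained` (it requires the `gguf` package, and only works for architectures that have a GGUF-to-transformers tensor mapping, which may not yet include Llama 4). A minimal sketch, where the filename is a placeholder you would replace with an actual file from the repo:

```python
# Hedged sketch: transformers' generic GGUF loading path. Whether the
# Llama 4 architecture is covered by this mapping needs to be verified
# against your installed transformers version.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF"
# Placeholder filename -- pick a real .gguf file from the repo's file list.
gguf_file = "<some-quant>.gguf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
# Note: the GGUF weights are dequantized into a regular torch model on
# load, so memory use reflects the dequantized size, not the GGUF size.
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```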
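
Outside of `transformers`, GGUF files are usually consumed through llama.cpp. Below is a minimal sketch using the llama-cpp-python bindings, assuming your llama.cpp build supports the Llama 4 architecture; the filename glob is a placeholder to match one of the quantizations actually present in the repo:

```python
from llama_cpp import Llama

# Downloads a matching GGUF file from the Hub and loads it with llama.cpp.
llm = Llama.from_pretrained(
    repo_id="unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF",
    filename="*Q4_K_M*.gguf",  # placeholder pattern; check the repo for real files
    n_ctx=8192,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

This sidesteps the `preprocessor_config.json` error entirely, since llama.cpp reads everything it needs from the GGUF file itself rather than from the usual transformers config files.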
