Image input

#4
by Ju-Seung - opened

Thanks for sharing all the models. By the way, are these models available for both image and text input? If so, could you share a snippet of code for it?

Yes, you can run it with the llama.cpp CLI like this:

./build/bin/llama-gemma3-cli -m google_gemma-3-12b-it-Q8_0.gguf --mmproj mmproj-google_gemma-3-12b-it-f16.gguf

Thank you for the quick response. Yes, I was able to run it like this:
./build/bin/llama-gemma3-cli -m google_gemma-3-12b-it-Q8_0.gguf --mmproj mmproj-google_gemma-3-12b-it-f16.gguf --image path -p "Describe this image in detail."
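
For scripting, the same CLI call can be driven from Python by shelling out to the binary. The following is only a minimal sketch, not a native binding: the function names `build_gemma3_cli_command` and `describe_image` are hypothetical, and the flags are just the ones from the command above (`-m`, `--mmproj`, `--image`, `-p`). The captured output may contain extra text besides the description, depending on what the CLI writes to standard output.

```python
import subprocess


def build_gemma3_cli_command(binary, model, mmproj, image, prompt):
    """Build the argument list for the llama-gemma3-cli call shown above."""
    return [binary, "-m", model, "--mmproj", mmproj, "--image", image, "-p", prompt]


def describe_image(binary, model, mmproj, image, prompt):
    """Run the CLI once and return whatever it wrote to standard output."""
    cmd = build_gemma3_cli_command(binary, model, mmproj, image, prompt)
    # check=True raises CalledProcessError if the CLI exits with an error code.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.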

I wonder whether this is also available from Python code like the snippet below. Could you let me know whether it is possible, or whether it is not supported yet?

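from llama_cpp import Llama

# model_path and mmproj_path point at the GGUF files from the commands above
model = Llama(
    model_path=model_path,
    n_gpu_layers=0,  # keep all layers on the CPU
    mmproj=mmproj_path,  # hoped-for argument; whether it is supported is the question
)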

# Message format used in the Gemma 3 tutorial
messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image_url},
            {"type": "text", "text": "Describe this image in detail."}
        ]
    }
]

# Generate a response
response = model.create_chat_completion(
    messages=messages,
)

It seems Llava 1.5 supports this through a chat handler, but I haven't found one for Gemma 3, or any other way of doing it.
(https://github.com/abetlen/llama-cpp-python/blob/main/llama_cpp/llama_chat_format.py)

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler
chat_handler = Llava15ChatHandler(clip_model_path="path/to/llava/mmproj.bin")
llm = Llama(
  model_path="./path/to/llava/llama-model.gguf",
  chat_handler=chat_handler,
  n_ctx=2048, # n_ctx should be increased to accommodate the image embedding
)
llm.create_chat_completion(
    messages = [
        {"role": "system", "content": "You are an assistant who perfectly describes images."},
        {
            "role": "user",
            "content": [
                {"type" : "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" } }
            ]
        }
    ]
)
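
Note that the two snippets in this thread use different content-part shapes: the Gemma 3 tutorial style uses {"type": "image", "image": ...}, while the Llava example uses {"type": "image_url", "image_url": {"url": ...}}. Below is a minimal sketch of reshaping the first form into the second, plus encoding a local file as a base64 data URI (the llama-cpp-python README describes passing images that way). The helper names `to_data_uri` and `to_image_url_parts` are hypothetical, and this only converts the message structure; it does not establish that any chat handler supports Gemma 3.

```python
import base64
import mimetypes


def to_data_uri(file_path):
    """Encode a local image file as a base64 data URI."""
    mime, _ = mimetypes.guess_type(file_path)
    mime = mime or "application/octet-stream"  # fallback for unknown extensions
    with open(file_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


def to_image_url_parts(messages):
    """Return a copy of messages with {"type": "image"} parts rewritten
    as {"type": "image_url"} parts; other parts are passed through."""
    converted = []
    for message in messages:
        content = message["content"]
        if isinstance(content, list):
            parts = []
            for part in content:
                if part.get("type") == "image":
                    parts.append(
                        {"type": "image_url", "image_url": {"url": part["image"]}}
                    )
                else:
                    parts.append(part)
            content = parts
        converted.append({**message, "content": content})
    return converted
```

With these, a local file could be passed as {"type": "image", "image": to_data_uri("path/to/image.png")} and the whole list run through to_image_url_parts before calling create_chat_completion.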
