OOM error on Google Colab

#2
by sudhir2016 - opened

I am trying to use the model in Colab with this code after installing bitsandbytes:
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="unsloth/Qwen2.5-VL-3B-Instruct-bnb-4bit")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)

But I am getting an OOM error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 7.21 GiB. GPU 0 has a total capacity of 14.74 GiB of which 6.68 GiB is free. Process 3862 has 8.06 GiB memory in use. Of the allocated memory 7.20 GiB is allocated by PyTorch, and 750.05 MiB is reserved by PyTorch but unallocated

Unsloth AI org

Did you use our Qwen2.5 VL notebook?
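
Not from the thread itself, but a minimal sketch of one thing to try, assuming accelerate is installed and the Colab runtime has been restarted first (so the ~8 GiB held by process 3862 from the earlier run is released). device_map="auto" and torch_dtype are standard pipeline keyword arguments, not something taken from this discussion:

import torch
from transformers import pipeline

# Assumes a freshly restarted runtime so no old process is still holding GPU memory.
# device_map="auto" lets accelerate decide the placement of the model;
# torch.float16 keeps any non-quantized weights in half precision instead of fp32.
pipe = pipeline(
    "image-text-to-text",
    model="unsloth/Qwen2.5-VL-3B-Instruct-bnb-4bit",
    device_map="auto",
    torch_dtype=torch.float16,
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
print(pipe(text=messages))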