SOLO Model Card

Model details

Model type: SOLO is a 7B large vision-language model with a single Transformer architecture for unified vision-language modeling. SOLO accepts both raw image patches (in pixels) and texts as inputs, without using a separate pre-trained vision encoder.
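To make "raw image patches (in pixels)" concrete, the following is a minimal sketch of splitting an image into non-overlapping patches and flattening each patch into a vector of pixel values. The function name `patchify`, the patch size, and the row-major patch ordering are illustrative assumptions, not SOLO's actual preprocessing; how the patches are projected into the Transformer is described in the paper and repository.

```python
from typing import List, Sequence

Pixel = Sequence[int]               # one pixel, e.g. (R, G, B)
Image = Sequence[Sequence[Pixel]]   # rows of pixels


def patchify(image: Image, patch_size: int) -> List[List[int]]:
    """Split an image into square patches and flatten each one.

    Hypothetical helper for illustration only; patches are returned in
    row-major order (left to right, then top to bottom).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")

    patches: List[List[int]] = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat: List[int] = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    flat.extend(pixel)  # append all channel values
            patches.append(flat)
    return patches


# Toy 4x4 RGB image; patch_size=2 is an arbitrary choice for the demo.
image = [[(r, c, 0) for c in range(4)] for r in range(4)]
patches = patchify(image, patch_size=2)
print(len(patches), len(patches[0]))
```

Each flattened patch has `patch_size * patch_size * channels` values. In a single-Transformer design such vectors would be fed to the model alongside text tokens instead of passing through a separate pre-trained vision encoder.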

Model date: SOLO-7B was trained in June 2024.

Paper or resources for more information: Paper & GitHub

Where to send questions or comments about the model: https://github.com/Yangyi-Chen/SOLO/issues

Inference with Hugging Face: Please check this script for an example of performing inference on the model.

Model size: 7.26B params (Safetensors)

Tensor type: BF16
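As a rough sizing aid, the memory needed just to hold the weights follows from the parameter count and tensor type listed above, since BF16 uses 2 bytes per parameter. The sketch below is back-of-the-envelope arithmetic only; the function name is hypothetical, and activations, the KV cache, and framework overhead are not included.

```python
PARAMS = 7.26e9          # parameter count from the model card
BYTES_PER_BF16 = 2       # bfloat16 is a 16-bit format


def weight_footprint_bytes(params: float, bytes_per_param: int) -> float:
    """Return the approximate size of the weights in bytes."""
    return params * bytes_per_param


weight_bytes = weight_footprint_bytes(PARAMS, BYTES_PER_BF16)
weight_gb = weight_bytes / 1e9        # decimal gigabytes
weight_gib = weight_bytes / 2**30     # binary gibibytes
print(f"{weight_gb:.2f} GB ({weight_gib:.2f} GiB)")
```

This gives roughly 14.5 GB for the weights alone, so actual inference will need more than that.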