---
library_name: keras-hub
---
### Model Overview
A SigLIP model pre-trained on WebLI at resolution 224x224. It was introduced in the paper [Sigmoid Loss for Language Image Pre-Training](https://arxiv.org/abs/2303.15343) by Zhai et al. and first released in this [repository](https://github.com/google-research/big_vision).

SigLIP is a multimodal model like [CLIP](https://huggingface.co/docs/transformers/model_doc/clip), but trained with a better loss function. The sigmoid loss operates solely on image-text pairs and does not require a global view of the pairwise similarities for normalization. This allows the batch size to be scaled up further, while also performing better at smaller batch sizes. A TLDR of SigLIP by one of the authors can be found [here](https://twitter.com/giffmana/status/1692641733459267713).

Weights and Keras model code are released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).

## Links

* [SigLIP Quickstart Notebook](https://www.kaggle.com/code/laxmareddypatlolla/siglip-quickstart-notebook-with-hub)
* SigLIP API Documentation (coming soon)
* [SigLIP Model Card](https://arxiv.org/abs/2303.15343)
* [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
* [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)

## Installation

Keras and KerasHub can be installed with:

```
pip install -U -q keras-hub
pip install -U -q keras
```

JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.
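KerasHub models run on any Keras 3 backend (JAX, TensorFlow, or PyTorch). As an optional sketch, the snippet below selects the backend with the `KERAS_BACKEND` environment variable, which must be set before Keras is imported; JAX is only an illustrative choice here.

```Python
import os

# Choose the Keras backend before importing keras or keras_hub.
# "jax" is just an example; "tensorflow" and "torch" also work.
os.environ["KERAS_BACKEND"] = "jax"

import keras

print(keras.backend.backend())  # prints the active backend name
```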
## Presets

The following model checkpoints are provided by the Keras team. Full code examples for each are available below.

| Preset name | Parameters | Description |
|-------------|------------|-------------|
|             |            |             |

## Example Usage

```Python
import keras
import numpy as np
from keras_hub.models import SigLIPBackbone, SigLIPTokenizer
from keras_hub.layers import SigLIPImageConverter

# Instantiate the model and preprocessing tools.
siglip = SigLIPBackbone.from_preset("siglip_large_patch16_256")
tokenizer = SigLIPTokenizer.from_preset(
    "siglip_large_patch16_256", sequence_length=64
)
image_converter = SigLIPImageConverter.from_preset("siglip_large_patch16_256")

# Obtain tokens for some input text.
tokens = tokenizer.tokenize(["mountains", "cat on tortoise", "house"])

# Preprocess the image.
image = keras.utils.load_img("cat.jpg")
image = image_converter(np.array([image]).astype(float))

# Query the model for image-text similarities.
siglip({
    "images": image,
    "token_ids": tokens,
})
```

## Example Usage with Hugging Face URI

```Python
import keras
import numpy as np
from keras_hub.models import SigLIPBackbone, SigLIPTokenizer
from keras_hub.layers import SigLIPImageConverter

# Instantiate the model and preprocessing tools.
siglip = SigLIPBackbone.from_preset("hf://keras/siglip_large_patch16_256")
tokenizer = SigLIPTokenizer.from_preset(
    "hf://keras/siglip_large_patch16_256", sequence_length=64
)
image_converter = SigLIPImageConverter.from_preset(
    "hf://keras/siglip_large_patch16_256"
)

# Obtain tokens for some input text.
tokens = tokenizer.tokenize(["mountains", "cat on tortoise", "house"])

# Preprocess the image.
image = keras.utils.load_img("cat.jpg")
image = image_converter(np.array([image]).astype(float))

# Query the model for image-text similarities.
siglip({
    "images": image,
    "token_ids": tokens,
})
```
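Because SigLIP is trained with a sigmoid rather than a softmax loss, each image-text logit can be read as an independent match score. The sketch below converts the backbone output from the example above into per-pair probabilities; the output key name `"image_logits"` is an assumption and may differ between KerasHub versions, so inspect the returned dict first.

```Python
import keras

# Re-query the model and keep the output dict.
outputs = siglip({
    "images": image,
    "token_ids": tokens,
})

# Assumption: the backbone returns pairwise logits under "image_logits"
# with shape (num_images, num_texts). Check outputs.keys() to confirm.
logits = outputs["image_logits"]

# SigLIP match probabilities come from an element-wise sigmoid, not a softmax.
probs = keras.ops.sigmoid(logits)
print(keras.ops.convert_to_numpy(probs))
```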