---
license: mit
datasets:
  - ILSVRC/imagenet-1k
  - mlfoundations/datacomp_small
base_model:
  - laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
---

[Paper]   [Code]

This model is initialized from `laion/CLIP-ViT-bigG-14-laion2B-39B-b160k`. The text encoder is fine-tuned with LEAF at $k=1$ and $\rho=50$, with semantic constraints.

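To load this model, use:

```python
from transformers import CLIPProcessor, CLIPModel

# Fine-tuned weights come from the LEAF-CLIP repository.
model_name = "LEAF-CLIP/OpenCLIP-ViT-bigG-rho50-k1-constrained"
# The processor (tokenizer and image preprocessing) is loaded from the base model.
processor_name = "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k"

model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(processor_name)
```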
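
Once loaded, the model can be used like any other `CLIPModel`. The following is a minimal sketch of zero-shot image classification, not an official usage recipe. The helper names `build_prompts` and `classify` and the prompt template `"a photo of a {}"` are assumptions introduced here for illustration; only the `CLIPProcessor` call and the `logits_per_image` output come from the standard `transformers` CLIP API.

```python
def build_prompts(labels, template="a photo of a {}"):
    """Turn class names into text prompts (the template is an illustrative assumption)."""
    return [template.format(label) for label in labels]


def classify(image, labels, model, processor):
    """Return a {label: probability} dict for one PIL image.

    `model` and `processor` are expected to be the CLIPModel and
    CLIPProcessor objects loaded earlier.
    """
    # Imported lazily so that build_prompts can be used without torch installed.
    import torch

    prompts = build_prompts(labels)
    # Tokenize the prompts and preprocess the image in a single call.
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)

    with torch.no_grad():
        outputs = model(**inputs)

    # logits_per_image has shape (num_images, num_prompts); softmax over prompts.
    probs = outputs.logits_per_image.softmax(dim=-1)[0]
    return dict(zip(labels, probs.tolist()))
```

For example, `classify(Image.open("cat.jpg"), ["cat", "dog"], model, processor)` would return one probability per label, where `Image` is `PIL.Image` and `cat.jpg` is a hypothetical local file.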