---
license: mit
datasets:
- ILSVRC/imagenet-1k
- mlfoundations/datacomp_small
base_model:
- laion/CLIP-ViT-g-14-laion2B-s12B-b42K
---

This model is initialized from `laion/CLIP-ViT-g-14-laion2B-s12B-b42K`. The image encoder is fine-tuned with FARE at $\epsilon=2/255$, and the text encoder is fine-tuned with LEAF at $k=1$ and $\rho=50$.

To load this model, use:

```python
from transformers import CLIPModel, CLIPProcessor

model_name = "LEAF-CLIP/OpenCLIP-ViT-g-rho50-k1-FARE2"
processor_name = "laion/CLIP-ViT-g-14-laion2B-s12B-b42K"

# Load the fine-tuned weights; the processor (tokenizer and image
# preprocessing) is unchanged, so it comes from the original base checkpoint.
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(processor_name)
```
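
Once loaded, the model behaves like any other Hugging Face `CLIPModel`. Below is a minimal zero-shot classification sketch using the `model` and `processor` from the snippet above; the image path and candidate labels are illustrative placeholders, not part of the original card.

```python
import torch
from PIL import Image

# Hypothetical example image; substitute any RGB image on disk.
image = Image.open("example.jpg")

labels = ["a photo of a cat", "a photo of a dog"]
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns
# them into probabilities over the candidate labels.
probs = outputs.logits_per_image.softmax(dim=1)
print(dict(zip(labels, probs[0].tolist())))
```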