This is NOT a fine-tune.

This is the original OpenAI CLIP ViT-L/14@336 text encoder, converted to the HuggingFace `transformers` format. All credit goes to the original authors.

Why?

  • It's a standard "CLIP-L" text encoder and can be used as such.
  • See the example images below (Flux.1-dev, CLIP-only guidance, CFG 3.5, Heun sampler).
  • For my fine-tuned KO-CLIP ViT-L/14@336, see here.

[Example images: Flux.1-dev outputs generated with CLIP-only guidance]
