Commit 7ca96c3
Parent(s): 089b117
Update README.md
README.md CHANGED
@@ -23,7 +23,7 @@ DistillCLIP is a distilled version of CLIP. Specficially, the teacher model was
 
 The knowledge distillation scheme of CLIP is presented below:
 
-<img src="https://huggingface.co/Ramos-Ramos/distillclip/resolve/main/distillclip_overview.svg" width="
+<img src="https://huggingface.co/Ramos-Ramos/distillclip/resolve/main/distillclip_overview.svg" width="75%" height="75%">
 
 CLIP is distilled with two losses: $L_{inter}$ and $L_{intra}$. These losses respectively distill the inter-modal (image-text) and intra-modal (image-image, text-text) similarity maps with MSE losses. The final distillation loss is the sum of the two losses, or $L = L_{inter} + L_{intra}$.
 
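
For reference, a minimal PyTorch sketch of the similarity-map distillation described in the README paragraph above. The function name, tensor names, and embedding shapes are illustrative assumptions, not the repository's actual training code.

import torch
import torch.nn.functional as F

def distillation_loss(t_img, t_txt, s_img, s_txt):
    """Sketch of the similarity-map distillation loss (assumed implementation).

    t_img, t_txt: teacher image/text embeddings, shape (batch, dim_t)
    s_img, s_txt: student image/text embeddings, shape (batch, dim_s)
    """
    # L2-normalize so dot products give cosine similarity maps
    t_img, t_txt = F.normalize(t_img, dim=-1), F.normalize(t_txt, dim=-1)
    s_img, s_txt = F.normalize(s_img, dim=-1), F.normalize(s_txt, dim=-1)

    # L_inter: inter-modal (image-text) similarity map, shape (batch, batch)
    l_inter = F.mse_loss(s_img @ s_txt.T, t_img @ t_txt.T)

    # L_intra: intra-modal (image-image and text-text) similarity maps
    l_intra = F.mse_loss(s_img @ s_img.T, t_img @ t_img.T) \
            + F.mse_loss(s_txt @ s_txt.T, t_txt @ t_txt.T)

    # Final distillation loss: L = L_inter + L_intra
    return l_inter + l_intra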