exdysa and ai-forever committed

Commit f0e1b07 · verified · 0 parent(s)

Duplicate from ai-forever/ruclip-vit-large-patch14-336

Co-authored-by: ai-forever <[email protected]>
Files changed (5)
  1. .gitattributes +27 -0
  2. README.md +63 -0
  3. bpe.model +3 -0
  4. config.json +14 -0
  5. pytorch_model.bin +3 -0
.gitattributes ADDED
@@ -0,0 +1,27 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bin.* filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zstandard filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,65 @@
+ # ruclip-vit-large-patch14-336
+
+ **RuCLIP** (**Ru**ssian **C**ontrastive **L**anguage–**I**mage **P**retraining) is a multimodal model
+ for computing image–text similarity and for ranking captions against images (and vice versa).
+ RuCLIP builds on a large body of work on zero-shot transfer, computer vision, natural language processing, and
+ multimodal learning.
+
+ The model was trained by the [Sber AI](https://github.com/sberbank-ai) and [SberDevices](https://sberdevices.ru/) teams.
+ * Task: `text ranking`; `image ranking`; `zero-shot image classification`
+ * Type: `encoder`
+ * Num Parameters: `430M`
+ * Training Data Volume: `240 million text-image pairs`
+ * Language: `Russian`
+ * Context Length: `77`
+ * Transformer Layers: `12`
+ * Transformer Width: `768`
+ * Transformer Heads: `12`
+ * Image Size: `336`
+ * Vision Layers: `24`
+ * Vision Width: `1024`
+ * Vision Patch Size: `14`
+
+ ## Usage [Github](https://github.com/sberbank-ai/ru-clip)
+
+ ```
+ pip install ruclip
+ ```
+
+ ```python
+ import ruclip
+
+ clip, processor = ruclip.load("ruclip-vit-large-patch14-336", device="cuda")
+ ```
+
+ ## Performance
+ We evaluated the model on the following datasets:
+
+ | Dataset       | Metric Name    | Metric Result |
+ |:--------------|:---------------|:--------------|
+ | Food101       | acc            | 0.712         |
+ | CIFAR10       | acc            | 0.906         |
+ | CIFAR100      | acc            | 0.591         |
+ | Birdsnap      | acc            | 0.213         |
+ | SUN397        | acc            | 0.523         |
+ | Stanford Cars | acc            | 0.659         |
+ | DTD           | acc            | 0.408         |
+ | MNIST         | acc            | 0.242         |
+ | STL10         | acc            | 0.956         |
+ | PCam          | acc            | 0.554         |
+ | CLEVR         | acc            | 0.142         |
+ | Rendered SST2 | acc            | 0.539         |
+ | ImageNet      | acc            | 0.488         |
+ | FGVC Aircraft | mean-per-class | 0.075         |
+ | Oxford Pets   | mean-per-class | 0.546         |
+ | Caltech101    | mean-per-class | 0.835         |
+ | Flowers102    | mean-per-class | 0.517         |
+ | HatefulMemes  | roc-auc        | 0.519         |
+
+
+ # Authors
+
+ + Alex Shonenkov: [Github](https://github.com/shonenkov), [Kaggle GM](https://www.kaggle.com/shonenkov)
+ + Daniil Chesakov: [Github](https://github.com/Danyache)
+ + Denis Dimitrov: [Github](https://github.com/denndimitrov)
+ + Igor Pavlov: [Github](https://github.com/boomb0om)
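The zero-shot image classification task listed in the card reduces to cosine similarity between L2-normalized image and text embeddings, followed by a softmax over candidate captions. Below is a minimal sketch of that scoring step using random stand-in vectors (in practice the embeddings would come from the RuCLIP image and text encoders; `embed_dim` of 768 matches config.json, and the candidate count of 5 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim = 768  # matches "embed_dim" in config.json

# Stand-in embeddings; real ones come from the RuCLIP encoders
image_emb = rng.standard_normal(embed_dim)
text_embs = rng.standard_normal((5, embed_dim))  # 5 candidate captions

def l2_normalize(x, axis=-1):
    """Scale vectors to unit L2 norm along the given axis."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Scaled cosine similarities (100.0 is the usual CLIP logit scale)
logits = 100.0 * l2_normalize(text_embs) @ l2_normalize(image_emb)

# Numerically stable softmax over the candidates
probs = np.exp(logits - logits.max())
probs /= probs.sum()
best = int(np.argmax(probs))  # index of the best-matching caption
```

The same scoring applies to text ranking and image ranking; only which side holds multiple candidates changes.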
bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:26db7928d1a022215fc5a1948c46d17c8e39e471b4d0f8b3d1edfd91c7c62571
+ size 747907
config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "embed_dim": 768,
+   "image_resolution": 336,
+   "vision_layers": 24,
+   "vision_width": 1024,
+   "vision_patch_size": 14,
+   "context_length": 77,
+   "vocab_size": 49408,
+   "transformer_width": 768,
+   "transformer_heads": 12,
+   "transformer_layers": 12,
+   "mean": [0.48145466, 0.4578275, 0.40821073],
+   "std": [0.26862954, 0.26130258, 0.27577711]
+ }
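The `mean` and `std` fields in config.json are the per-channel normalization constants applied to input images (the standard CLIP values), and `image_resolution` fixes the expected input size. A minimal sketch of that preprocessing step, assuming an HxWx3 float image already scaled to [0, 1]:

```python
import numpy as np

# Per-channel normalization constants from config.json
MEAN = np.array([0.48145466, 0.4578275, 0.40821073])
STD = np.array([0.26862954, 0.26130258, 0.27577711])

def normalize(image):
    """Channel-wise normalize an (H, W, 3) float image in [0, 1]."""
    return (image - MEAN) / STD

# 336x336 matches the model's "image_resolution"
img = np.full((336, 336, 3), 0.5)
out = normalize(img)
```

In practice the `processor` returned by `ruclip.load` handles this (plus resizing and tensor conversion); the sketch only shows what the config constants mean.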
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:39edca30dbb7421989cc78e8787bd3b8ad6829ac6f1279f9c29a1535fe86bb9f
+ size 1711937797