---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{}
---

# Model Card: ClipMD

## Model Details
ClipMD is a medical image-text matching model based on OpenAI's CLIP, with a sliding-window text encoder.

### Model Description

The model uses a ViT-B/32 Transformer architecture as its image encoder and a masked sliding-window self-attention Transformer as its text encoder. The two encoders are trained to maximize the similarity of matched (image, text) pairs via a contrastive loss.

The model was fine-tuned on the ROCO dataset.

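The contrastive objective can be sketched as follows. This is a minimal NumPy illustration of a CLIP-style symmetric contrastive loss, not ClipMD's actual training code; the fixed `temperature` value and tensor shapes are assumptions (CLIP itself uses a learnable logit scale).

```python
import numpy as np

def clip_contrastive_loss(image_embeds, text_embeds, temperature=0.07):
    """Symmetric contrastive loss over a batch where row i of each
    array is a matched (image, text) pair. Shapes: (batch, dim)."""
    # L2-normalize rows so dot products become cosine similarities
    img = image_embeds / np.linalg.norm(image_embeds, axis=1, keepdims=True)
    txt = text_embeds / np.linalg.norm(text_embeds, axis=1, keepdims=True)

    # (batch, batch) similarity matrix: entry [i, j] compares image i with text j
    logits = img @ txt.T / temperature

    def cross_entropy_diag(l):
        # Cross-entropy with the matched pair (the diagonal) as the target class
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image directions
    return (cross_entropy_diag(logits) + cross_entropy_diag(logits.T)) / 2
```

Identical, perfectly matched embeddings drive the loss toward zero, while mismatched pairs are penalized in both retrieval directions.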
## Use with Transformers
```python
from PIL import Image

from transformers import AutoProcessor, AutoModel

model = AutoModel.from_pretrained("Idan0405/ClipMD")
processor = AutoProcessor.from_pretrained("Idan0405/ClipMD")

image = Image.open("your image path")

inputs = processor(text=["chest x-ray", "head MRI"], images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs[0]  # image-text similarity scores
probs = logits_per_image.softmax(dim=1)  # softmax over texts gives label probabilities
```