neginb committed on
Commit 7fe339b · verified · 1 Parent(s): 14c1af8

Create README.md

  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ license: mit
+ datasets:
+ - vector-institute/open-pmc
+ metrics:
+ - accuracy
+ - f1
+ - recall
+ ---
+ <div align="center">
+ <img src="https://github.com/VectorInstitute/pmc-data-extraction/blob/0a969136344a07267bb558d01f3fe76b36b93e1a/media/open-pmc-pipeline.png?raw=true"
+ alt="Open-PMC Pipeline"
+ width="1000" />
+ </div>
+
+ <p align="center">
+ <strong>arXiv:</strong> <a href="http://arxiv.org/abs/2503.14377" target="_blank">2503.14377</a>
+ &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;
+ <strong>Code:</strong> <a href="https://github.com/VectorInstitute/pmc-data-extraction" target="_blank">Open-PMC GitHub</a>
+ &nbsp;&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;&nbsp;
+ <strong>Dataset:</strong> <a href="https://huggingface.co/datasets/vector-institute/open-pmc" target="_blank">Hugging Face</a>
+ </p>
+
+
+ ## Model Overview
+
+ This model is a checkpoint trained on the **Open-PMC** dataset. It uses a **Vision Transformer (ViT-B/16)** as the image encoder and **PubMedBERT** as the text encoder. The two encoders are trained with **contrastive learning** using the **vanilla InfoNCE loss**, so that matching image-text pairs are mapped close together in a shared embedding space.
+
+ ## Model Architecture
+
+ - **Vision Backbone**: ViT-B/16 (pretrained on ImageNet)
+ - **Text Backbone**: PubMedBERT (pretrained on PubMed abstracts)
+ - **Training Objective**: Contrastive learning with **InfoNCE loss**
+
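As a rough illustration of the training objective listed above, the sketch below computes an InfoNCE loss over a small batch of paired image and text embeddings using only the Python standard library. It is not the mmlearn implementation: the function names, the symmetric (image-to-text plus text-to-image) form, and the temperature of 0.07 are all assumptions made for this example, and the toy vectors stand in for real encoder outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def info_nce_loss(image_embeddings, text_embeddings, temperature=0.07):
    """Symmetric InfoNCE loss; temperature=0.07 is an illustrative assumption."""
    images = [l2_normalize(v) for v in image_embeddings]
    texts = [l2_normalize(v) for v in text_embeddings]
    # Similarity of every image with every text, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Transposing gives the text-to-image direction.
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy batch: pair i of the aligned batch matches; the shuffled batch swaps the texts.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

In this toy setup the aligned batch should score a much lower loss than the shuffled one, which is the behaviour the objective rewards: each image is pulled toward its own caption and pushed away from the other captions in the batch.
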
+ ## Training Framework
+
+ The model was trained using the **mmlearn** framework, which is designed for multimodal learning. More information and the framework itself are available in the [mmlearn repository](https://github.com/vectorInstitute/mmlearn).
+
+ ## How to Use
+
+ Please visit our [GitHub repository](https://github.com/VectorInstitute/pmc-data-extraction) for information on how to run benchmarking with this checkpoint.