Zeyu077 committed (verified)
Commit 0bacb78 · Parent(s): 04706dc

Update README.md

Files changed (1):
  1. README.md +103 -23

README.md CHANGED
@@ -12,29 +12,109 @@ We invite you to explore its capabilities and welcome inquiries or collaboration
 We evaluated our model on [MedEvalKit](https://github.com/alibaba-damo-academy/MedEvalKit), using Qwen2.5-72B as the judge model.
 The results are as follows.
 
- | Model | Size | MMMU-Health&Medicine | VQA-RAD | SLAKE | PathVQA | PMC-VQA | OmniMedVQA | MedXpertQA-MM | Avg. |
- |---------------------|------|----------|---------|-------|---------|---------|-------|---------|-------|
- | **Proprietary Models** |
- | GPT-5 | - | 83.6 | 67.8 | 78.1 | 52.8 | 60.0 | 76.4 | 71.0 | 70.0 |
- | GPT-5-mini | - | 80.5 | 66.3 | 76.1 | 52.4 | 57.6 | 70.9 | 60.1 | 66.3 |
- | GPT-5-nano | - | 74.1 | 55.4 | 69.3 | 45.4 | 51.3 | 66.5 | 45.1 | 58.2 |
- | GPT-4.1 | - | 75.2 | 65.0 | 72.2 | 55.5 | 55.2 | 75.5 | 45.2 | 63.4 |
- | Claude Sonnet 4 | - | 74.6 | 67.6 | 70.6 | 54.2 | 54.4 | 65.5 | 43.3 | 61.5 |
- | Gemini-2.5-Flash | - | 76.9 | 68.5 | 75.8 | 55.4 | 55.4 | 71.0 | 52.8 | 65.1 |
- | **General Open-source Models** |
- | Qwen2.5VL-3B | 3B | 51.3 | 56.8 | 63.2 | 37.1 | 50.6 | 64.5 | 20.7 | 49.2 |
- | Qwen2.5VL-7B | 7B | 50.6 | 64.5 | 67.2 | 44.1 | 51.9 | 63.6 | 22.3 | 52.0 |
- | InternVL2.5-8B | 8B | 53.5 | 59.4 | 69.0 | 42.1 | 51.3 | 81.3 | 21.7 | 54 |
- | InternVL3-8B | 8B | 59.2 | 65.4 | 72.8 | 48.6 | 53.8 | 79.1 | 22.4 | 57.3 |
- | **Medical Open-source Models** |
- | MedGemma-4B-IT | 4B | 43.7 | 49.9 | 76.4 | 48.8 | 49.9 | 69.8 | 22.3 | 51.54 |
- | LLaVA-Med-7B | 7B | 29.3 | 53.7 | 48.0 | 38.8 | 30.5 | 44.3 | 20.3 | 37.8 |
- | HuatuoGPT-V-7B | 7B | 47.3 | 67.0 | 67.8 | 48.0 | 53.3 | 74.2 | 21.6 | 54.2 |
- | Lingshu-7B | 7B | 54.0 | 67.9 | 83.1 | 61.9 | 56.3 | 82.9 | 26.7 | 61.8 |
- | BioMediX2-8B | 8B | 39.8 | 49.2 | 57.7 | 37.0 | 43.5 | 63.3 | 21.8 | 44.6 |
- |**InfiMed-Series Model**|
- | [InfiMed-SFT-3B](https://huggingface.co/InfiX-ai/InfiMed-SFT-3B) | 3B | 54.67 | 58.09 | 82 | 60.59 | 53.22 | 67.01 | 23.55 | 57.02 |
- | [InfiMed-RL-3B](https://huggingface.co/InfiX-ai/InfiMed-RL-3B) | 3B | 55.33 | 60.53 | 82.38 | 61.97 | 58.74 | 71.71 | 23.6 | 59.18 |
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>Model Comparison Table</title>
+ <style>
+ table {
+ width: 100%;
+ border-collapse: collapse;
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Arial, sans-serif;
+ font-size: 14px;
+ }
+ th, td {
+ border: 1px solid #e0e0e0;
+ padding: 10px;
+ text-align: right;
+ }
+ th {
+ background-color: #f5f5f5;
+ cursor: pointer;
+ font-weight: 600;
+ }
+ th:first-child, td:first-child {
+ text-align: left;
+ }
+ tr {
+ background-color: #fafafa;
+ }
+ .category-row {
+ background-color: #e0e0e0;
+ font-weight: bold;
+ text-align: left;
+ }
+ .infimed {
+ background-color: #e6f3ff;
+ }
+ .avg {
+ font-weight: bold;
+ }
+ a {
+ color: #0066cc;
+ text-decoration: none;
+ }
+ a:hover {
+ text-decoration: underline;
+ }
+ /* Responsive design */
+ @media (max-width: 600px) {
+ table, th, td {
+ font-size: 12px;
+ padding: 6px;
+ }
+ th, td {
+ min-width: 60px;
+ }
+ }
+ </style>
+ </head>
+ <body>
+ <table id="modelTable">
+ <thead>
+ <tr>
+ <th>Model</th>
+ <th>Size</th>
+ <th>MMMU-H&M</th>
+ <th>VQA-RAD</th>
+ <th>SLAKE</th>
+ <th>PathVQA</th>
+ <th>PMC-VQA</th>
+ <th>OmniMedVQA</th>
+ <th>MedXpertQA</th>
+ <th>Avg.</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr class="category-row"><td colspan="10">Proprietary Models</td></tr>
+ <tr><td>GPT-5</td><td>-</td><td>83.60</td><td>67.80</td><td>78.10</td><td>52.80</td><td>60.00</td><td>76.40</td><td>71.00</td><td class="avg">70.00</td></tr>
+ <tr><td>GPT-5-mini</td><td>-</td><td>80.50</td><td>66.30</td><td>76.10</td><td>52.40</td><td>57.60</td><td>70.90</td><td>60.10</td><td class="avg">66.30</td></tr>
+ <tr><td>GPT-5-nano</td><td>-</td><td>74.10</td><td>55.40</td><td>69.30</td><td>45.40</td><td>51.30</td><td>66.50</td><td>45.10</td><td class="avg">58.20</td></tr>
+ <tr><td>GPT-4.1</td><td>-</td><td>75.20</td><td>65.00</td><td>72.20</td><td>55.50</td><td>55.20</td><td>75.50</td><td>45.20</td><td class="avg">63.40</td></tr>
+ <tr><td>Claude Sonnet 4</td><td>-</td><td>74.60</td><td>67.60</td><td>70.60</td><td>54.20</td><td>54.40</td><td>65.50</td><td>43.30</td><td class="avg">61.50</td></tr>
+ <tr><td>Gemini-2.5-Flash</td><td>-</td><td>76.90</td><td>68.50</td><td>75.80</td><td>55.40</td><td>55.40</td><td>71.00</td><td>52.80</td><td class="avg">65.10</td></tr>
+ <tr class="category-row"><td colspan="10">General Open-source Models</td></tr>
+ <tr><td>Qwen2.5VL-3B</td><td>3B</td><td>51.30</td><td>56.80</td><td>63.20</td><td>37.10</td><td>50.60</td><td>64.50</td><td>20.70</td><td class="avg">49.20</td></tr>
+ <tr><td>Qwen2.5VL-7B</td><td>7B</td><td>50.60</td><td>64.50</td><td>67.20</td><td>44.10</td><td>51.90</td><td>63.60</td><td>22.30</td><td class="avg">52.00</td></tr>
+ <tr><td>InternVL2.5-8B</td><td>8B</td><td>53.50</td><td>59.40</td><td>69.00</td><td>42.10</td><td>51.30</td><td>81.30</td><td>21.70</td><td class="avg">54.00</td></tr>
+ <tr><td>InternVL3-8B</td><td>8B</td><td>59.20</td><td>65.40</td><td>72.80</td><td>48.60</td><td>53.80</td><td>79.10</td><td>22.40</td><td class="avg">57.30</td></tr>
+ <tr class="category-row"><td colspan="10">Medical Open-source Models</td></tr>
+ <tr><td>MedGemma-4B-IT</td><td>4B</td><td>43.70</td><td>49.90</td><td>76.40</td><td>48.80</td><td>49.90</td><td>69.80</td><td>22.30</td><td class="avg">51.54</td></tr>
+ <tr><td>LLaVA-Med-7B</td><td>7B</td><td>29.30</td><td>53.70</td><td>48.00</td><td>38.80</td><td>30.50</td><td>44.30</td><td>20.30</td><td class="avg">37.80</td></tr>
+ <tr><td>HuatuoGPT-V-7B</td><td>7B</td><td>47.30</td><td>67.00</td><td>67.80</td><td>48.00</td><td>53.30</td><td>74.20</td><td>21.60</td><td class="avg">54.20</td></tr>
+ <tr><td>Lingshu-7B</td><td>7B</td><td>54.00</td><td>67.90</td><td>83.10</td><td>61.90</td><td>56.30</td><td>82.90</td><td>26.70</td><td class="avg">61.80</td></tr>
+ <tr><td>BioMediX2-8B</td><td>8B</td><td>39.80</td><td>49.20</td><td>57.70</td><td>37.00</td><td>43.50</td><td>63.30</td><td>21.80</td><td class="avg">44.60</td></tr>
+ <tr class="category-row"><td colspan="10">InfiMed-Series Model</td></tr>
+ <tr class="infimed"><td><a href="https://huggingface.co/InfiX-ai/InfiMed-SFT-3B">InfiMed-SFT-3B</a></td><td>3B</td><td>54.67</td><td>58.09</td><td>82.00</td><td>60.59</td><td>53.22</td><td>67.01</td><td>23.55</td><td class="avg">57.02</td></tr>
+ <tr class="infimed"><td><a href="https://huggingface.co/InfiX-ai/InfiMed-RL-3B">InfiMed-RL-3B</a></td><td>3B</td><td>55.33</td><td>60.53</td><td>82.38</td><td>61.97</td><td>58.74</td><td>71.71</td><td>23.60</td><td class="avg">59.18</td></tr>
+ </tbody>
+ </table>
+
+
+ </body>
+ </html>
 
 ## Model Download
 Download the InfiMed models from the Hugging Face Hub into the `./models` directory.
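
The "Model Download" step in the README can be scripted with the `huggingface_hub` library. The snippet below is a minimal sketch, not part of the commit; it assumes the two repository IDs linked in the table above (`InfiX-ai/InfiMed-SFT-3B` and `InfiX-ai/InfiMed-RL-3B`) and places each snapshot under `./models/<model-name>`.

```python
# Minimal sketch: fetch the InfiMed checkpoints into ./models using huggingface_hub.
# Repo IDs are taken from the links in the results table; adjust as needed.
from huggingface_hub import snapshot_download

REPO_IDS = [
    "InfiX-ai/InfiMed-SFT-3B",
    "InfiX-ai/InfiMed-RL-3B",
]

for repo_id in REPO_IDS:
    local_dir = f"./models/{repo_id.split('/')[-1]}"
    # Downloads the full model snapshot (or reuses the local cache) into local_dir.
    snapshot_download(repo_id=repo_id, local_dir=local_dir)
    print(f"Downloaded {repo_id} -> {local_dir}")
```

For reference, the Avg. column in both versions of the table appears to be the unweighted mean of the seven benchmark scores; for example, for InfiMed-RL-3B, (55.33 + 60.53 + 82.38 + 61.97 + 58.74 + 71.71 + 23.60) / 7 ≈ 59.18.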