Committed by mcintoc
Commit b5350db · verified · 1 Parent(s): 7f0d8da

add documentation and model weights (#1)


- initial updates to repo and licensing (3b89e4303872073829a482c66fb4961c0af4da5c)
- preprocessing code, more in github (0f1419cf71677187f3962000806f6cbcf2c507b5)
- readme with image (1f07f22e4df48da1ef9656f5c3186a915ac3a3c3)
- model weights (6a2b66469aa57e99de4a73f3e752dcb3c4453683)
- error in readme (f66778411dca6f51f409a6038d8887695a526e88)

Files changed (8)
  1. .gitattributes +1 -0
  2. .gitignore +15 -0
  3. LICENSE +402 -0
  4. README.md +52 -1
  5. assets/github_highlevel.png +3 -0
  6. mixinhelpers.py +221 -0
  7. preprocessor.py +69 -0
  8. probmed_weights.pth +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,15 @@
+ .log/
+ .vscode/
+ __pycache__/
+ assets_local/
+ */__pycache__/
+ .venv/
+ _t*.py
+ _*.yaml
+ _develop_*.py
+ outputs/
+ jupyter/*
+ *.csv
+ third_party/*
+ dev_*.py
+ .venv
LICENSE ADDED
@@ -0,0 +1,402 @@
1
+ Attribution-NonCommercial-NoDerivatives 4.0 International
2
+
3
+ =======================================================================
4
+
5
+ Creative Commons Corporation ("Creative Commons") is not a law firm and
6
+ does not provide legal services or legal advice. Distribution of
7
+ Creative Commons public licenses does not create a lawyer-client or
8
+ other relationship. Creative Commons makes its licenses and related
9
+ information available on an "as-is" basis. Creative Commons gives no
10
+ warranties regarding its licenses, any material licensed under their
11
+ terms and conditions, or any related information. Creative Commons
12
+ disclaims all liability for damages resulting from their use to the
13
+ fullest extent possible.
14
+
15
+ Using Creative Commons Public Licenses
16
+
17
+ Creative Commons public licenses provide a standard set of terms and
18
+ conditions that creators and other rights holders may use to share
19
+ original works of authorship and other material subject to copyright
20
+ and certain other rights specified in the public license below. The
21
+ following considerations are for informational purposes only, are not
22
+ exhaustive, and do not form part of our licenses.
23
+
24
+ Considerations for licensors: Our public licenses are
25
+ intended for use by those authorized to give the public
26
+ permission to use material in ways otherwise restricted by
27
+ copyright and certain other rights. Our licenses are
28
+ irrevocable. Licensors should read and understand the terms
29
+ and conditions of the license they choose before applying it.
30
+ Licensors should also secure all rights necessary before
31
+ applying our licenses so that the public can reuse the
32
+ material as expected. Licensors should clearly mark any
33
+ material not subject to the license. This includes other CC-
34
+ licensed material, or material used under an exception or
35
+ limitation to copyright. More considerations for licensors:
36
+ wiki.creativecommons.org/Considerations_for_licensors
37
+
38
+ Considerations for the public: By using one of our public
39
+ licenses, a licensor grants the public permission to use the
40
+ licensed material under specified terms and conditions. If
41
+ the licensor's permission is not necessary for any reason--for
42
+ example, because of any applicable exception or limitation to
43
+ copyright--then that use is not regulated by the license. Our
44
+ licenses grant only permissions under copyright and certain
45
+ other rights that a licensor has authority to grant. Use of
46
+ the licensed material may still be restricted for other
47
+ reasons, including because others have copyright or other
48
+ rights in the material. A licensor may make special requests,
49
+ such as asking that all changes be marked or described.
50
+ Although not required by our licenses, you are encouraged to
51
+ respect those requests where reasonable. More considerations
52
+ for the public:
53
+ wiki.creativecommons.org/Considerations_for_licensees
54
+
55
+ =======================================================================
56
+
57
+ Creative Commons Attribution-NonCommercial-NoDerivatives 4.0
58
+ International Public License
59
+
60
+ By exercising the Licensed Rights (defined below), You accept and agree
61
+ to be bound by the terms and conditions of this Creative Commons
62
+ Attribution-NonCommercial-NoDerivatives 4.0 International Public
63
+ License ("Public License"). To the extent this Public License may be
64
+ interpreted as a contract, You are granted the Licensed Rights in
65
+ consideration of Your acceptance of these terms and conditions, and the
66
+ Licensor grants You such rights in consideration of benefits the
67
+ Licensor receives from making the Licensed Material available under
68
+ these terms and conditions.
69
+
70
+
71
+ Section 1 -- Definitions.
72
+
73
+ a. Adapted Material means material subject to Copyright and Similar
74
+ Rights that is derived from or based upon the Licensed Material
75
+ and in which the Licensed Material is translated, altered,
76
+ arranged, transformed, or otherwise modified in a manner requiring
77
+ permission under the Copyright and Similar Rights held by the
78
+ Licensor. For purposes of this Public License, where the Licensed
79
+ Material is a musical work, performance, or sound recording,
80
+ Adapted Material is always produced where the Licensed Material is
81
+ synched in timed relation with a moving image.
82
+
83
+ b. Copyright and Similar Rights means copyright and/or similar rights
84
+ closely related to copyright including, without limitation,
85
+ performance, broadcast, sound recording, and Sui Generis Database
86
+ Rights, without regard to how the rights are labeled or
87
+ categorized. For purposes of this Public License, the rights
88
+ specified in Section 2(b)(1)-(2) are not Copyright and Similar
89
+ Rights.
90
+
91
+ c. Effective Technological Measures means those measures that, in the
92
+ absence of proper authority, may not be circumvented under laws
93
+ fulfilling obligations under Article 11 of the WIPO Copyright
94
+ Treaty adopted on December 20, 1996, and/or similar international
95
+ agreements.
96
+
97
+ d. Exceptions and Limitations means fair use, fair dealing, and/or
98
+ any other exception or limitation to Copyright and Similar Rights
99
+ that applies to Your use of the Licensed Material.
100
+
101
+ e. Licensed Material means the artistic or literary work, database,
102
+ or other material to which the Licensor applied this Public
103
+ License.
104
+
105
+ f. Licensed Rights means the rights granted to You subject to the
106
+ terms and conditions of this Public License, which are limited to
107
+ all Copyright and Similar Rights that apply to Your use of the
108
+ Licensed Material and that the Licensor has authority to license.
109
+
110
+ g. Licensor means the individual(s) or entity(ies) granting rights
111
+ under this Public License.
112
+
113
+ h. NonCommercial means not primarily intended for or directed towards
114
+ commercial advantage or monetary compensation. For purposes of
115
+ this Public License, the exchange of the Licensed Material for
116
+ other material subject to Copyright and Similar Rights by digital
117
+ file-sharing or similar means is NonCommercial provided there is
118
+ no payment of monetary compensation in connection with the
119
+ exchange.
120
+
121
+ i. Share means to provide material to the public by any means or
122
+ process that requires permission under the Licensed Rights, such
123
+ as reproduction, public display, public performance, distribution,
124
+ dissemination, communication, or importation, and to make material
125
+ available to the public including in ways that members of the
126
+ public may access the material from a place and at a time
127
+ individually chosen by them.
128
+
129
+ j. Sui Generis Database Rights means rights other than copyright
130
+ resulting from Directive 96/9/EC of the European Parliament and of
131
+ the Council of 11 March 1996 on the legal protection of databases,
132
+ as amended and/or succeeded, as well as other essentially
133
+ equivalent rights anywhere in the world.
134
+
135
+ k. You means the individual or entity exercising the Licensed Rights
136
+ under this Public License. Your has a corresponding meaning.
137
+
138
+
139
+ Section 2 -- Scope.
140
+
141
+ a. License grant.
142
+
143
+ 1. Subject to the terms and conditions of this Public License,
144
+ the Licensor hereby grants You a worldwide, royalty-free,
145
+ non-sublicensable, non-exclusive, irrevocable license to
146
+ exercise the Licensed Rights in the Licensed Material to:
147
+
148
+ a. reproduce and Share the Licensed Material, in whole or
149
+ in part, for NonCommercial purposes only; and
150
+
151
+ b. produce and reproduce, but not Share, Adapted Material
152
+ for NonCommercial purposes only.
153
+
154
+ 2. Exceptions and Limitations. For the avoidance of doubt, where
155
+ Exceptions and Limitations apply to Your use, this Public
156
+ License does not apply, and You do not need to comply with
157
+ its terms and conditions.
158
+
159
+ 3. Term. The term of this Public License is specified in Section
160
+ 6(a).
161
+
162
+ 4. Media and formats; technical modifications allowed. The
163
+ Licensor authorizes You to exercise the Licensed Rights in
164
+ all media and formats whether now known or hereafter created,
165
+ and to make technical modifications necessary to do so. The
166
+ Licensor waives and/or agrees not to assert any right or
167
+ authority to forbid You from making technical modifications
168
+ necessary to exercise the Licensed Rights, including
169
+ technical modifications necessary to circumvent Effective
170
+ Technological Measures. For purposes of this Public License,
171
+ simply making modifications authorized by this Section 2(a)
172
+ (4) never produces Adapted Material.
173
+
174
+ 5. Downstream recipients.
175
+
176
+ a. Offer from the Licensor -- Licensed Material. Every
177
+ recipient of the Licensed Material automatically
178
+ receives an offer from the Licensor to exercise the
179
+ Licensed Rights under the terms and conditions of this
180
+ Public License.
181
+
182
+ b. No downstream restrictions. You may not offer or impose
183
+ any additional or different terms or conditions on, or
184
+ apply any Effective Technological Measures to, the
185
+ Licensed Material if doing so restricts exercise of the
186
+ Licensed Rights by any recipient of the Licensed
187
+ Material.
188
+
189
+ 6. No endorsement. Nothing in this Public License constitutes or
190
+ may be construed as permission to assert or imply that You
191
+ are, or that Your use of the Licensed Material is, connected
192
+ with, or sponsored, endorsed, or granted official status by,
193
+ the Licensor or others designated to receive attribution as
194
+ provided in Section 3(a)(1)(A)(i).
195
+
196
+ b. Other rights.
197
+
198
+ 1. Moral rights, such as the right of integrity, are not
199
+ licensed under this Public License, nor are publicity,
200
+ privacy, and/or other similar personality rights; however, to
201
+ the extent possible, the Licensor waives and/or agrees not to
202
+ assert any such rights held by the Licensor to the limited
203
+ extent necessary to allow You to exercise the Licensed
204
+ Rights, but not otherwise.
205
+
206
+ 2. Patent and trademark rights are not licensed under this
207
+ Public License.
208
+
209
+ 3. To the extent possible, the Licensor waives any right to
210
+ collect royalties from You for the exercise of the Licensed
211
+ Rights, whether directly or through a collecting society
212
+ under any voluntary or waivable statutory or compulsory
213
+ licensing scheme. In all other cases the Licensor expressly
214
+ reserves any right to collect such royalties, including when
215
+ the Licensed Material is used other than for NonCommercial
216
+ purposes.
217
+
218
+
219
+ Section 3 -- License Conditions.
220
+
221
+ Your exercise of the Licensed Rights is expressly made subject to the
222
+ following conditions.
223
+
224
+ a. Attribution.
225
+
226
+ 1. If You Share the Licensed Material, You must:
227
+
228
+ a. retain the following if it is supplied by the Licensor
229
+ with the Licensed Material:
230
+
231
+ i. identification of the creator(s) of the Licensed
232
+ Material and any others designated to receive
233
+ attribution, in any reasonable manner requested by
234
+ the Licensor (including by pseudonym if
235
+ designated);
236
+
237
+ ii. a copyright notice;
238
+
239
+ iii. a notice that refers to this Public License;
240
+
241
+ iv. a notice that refers to the disclaimer of
242
+ warranties;
243
+
244
+ v. a URI or hyperlink to the Licensed Material to the
245
+ extent reasonably practicable;
246
+
247
+ b. indicate if You modified the Licensed Material and
248
+ retain an indication of any previous modifications; and
249
+
250
+ c. indicate the Licensed Material is licensed under this
251
+ Public License, and include the text of, or the URI or
252
+ hyperlink to, this Public License.
253
+
254
+ For the avoidance of doubt, You do not have permission under
255
+ this Public License to Share Adapted Material.
256
+
257
+ 2. You may satisfy the conditions in Section 3(a)(1) in any
258
+ reasonable manner based on the medium, means, and context in
259
+ which You Share the Licensed Material. For example, it may be
260
+ reasonable to satisfy the conditions by providing a URI or
261
+ hyperlink to a resource that includes the required
262
+ information.
263
+
264
+ 3. If requested by the Licensor, You must remove any of the
265
+ information required by Section 3(a)(1)(A) to the extent
266
+ reasonably practicable.
267
+
268
+
269
+ Section 4 -- Sui Generis Database Rights.
270
+
271
+ Where the Licensed Rights include Sui Generis Database Rights that
272
+ apply to Your use of the Licensed Material:
273
+
274
+ a. for the avoidance of doubt, Section 2(a)(1) grants You the right
275
+ to extract, reuse, reproduce, and Share all or a substantial
276
+ portion of the contents of the database for NonCommercial purposes
277
+ only and provided You do not Share Adapted Material;
278
+
279
+ b. if You include all or a substantial portion of the database
280
+ contents in a database in which You have Sui Generis Database
281
+ Rights, then the database in which You have Sui Generis Database
282
+ Rights (but not its individual contents) is Adapted Material; and
283
+
284
+ c. You must comply with the conditions in Section 3(a) if You Share
285
+ all or a substantial portion of the contents of the database.
286
+
287
+ For the avoidance of doubt, this Section 4 supplements and does not
288
+ replace Your obligations under this Public License where the Licensed
289
+ Rights include other Copyright and Similar Rights.
290
+
291
+
292
+ Section 5 -- Disclaimer of Warranties and Limitation of Liability.
293
+
294
+ a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
295
+ EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
296
+ AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
297
+ ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
298
+ IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
299
+ WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
300
+ PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
301
+ ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
302
+ KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
303
+ ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
304
+
305
+ b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
306
+ TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
307
+ NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
308
+ INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
309
+ COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
310
+ USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
311
+ ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
312
+ DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
313
+ IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
314
+
315
+ c. The disclaimer of warranties and limitation of liability provided
316
+ above shall be interpreted in a manner that, to the extent
317
+ possible, most closely approximates an absolute disclaimer and
318
+ waiver of all liability.
319
+
320
+
321
+ Section 6 -- Term and Termination.
322
+
323
+ a. This Public License applies for the term of the Copyright and
324
+ Similar Rights licensed here. However, if You fail to comply with
325
+ this Public License, then Your rights under this Public License
326
+ terminate automatically.
327
+
328
+ b. Where Your right to use the Licensed Material has terminated under
329
+ Section 6(a), it reinstates:
330
+
331
+ 1. automatically as of the date the violation is cured, provided
332
+ it is cured within 30 days of Your discovery of the
333
+ violation; or
334
+
335
+ 2. upon express reinstatement by the Licensor.
336
+
337
+ For the avoidance of doubt, this Section 6(b) does not affect any
338
+ right the Licensor may have to seek remedies for Your violations
339
+ of this Public License.
340
+
341
+ c. For the avoidance of doubt, the Licensor may also offer the
342
+ Licensed Material under separate terms or conditions or stop
343
+ distributing the Licensed Material at any time; however, doing so
344
+ will not terminate this Public License.
345
+
346
+ d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
347
+ License.
348
+
349
+
350
+ Section 7 -- Other Terms and Conditions.
351
+
352
+ a. The Licensor shall not be bound by any additional or different
353
+ terms or conditions communicated by You unless expressly agreed.
354
+
355
+ b. Any arrangements, understandings, or agreements regarding the
356
+ Licensed Material not stated herein are separate from and
357
+ independent of the terms and conditions of this Public License.
358
+
359
+
360
+ Section 8 -- Interpretation.
361
+
362
+ a. For the avoidance of doubt, this Public License does not, and
363
+ shall not be interpreted to, reduce, limit, restrict, or impose
364
+ conditions on any use of the Licensed Material that could lawfully
365
+ be made without permission under this Public License.
366
+
367
+ b. To the extent possible, if any provision of this Public License is
368
+ deemed unenforceable, it shall be automatically reformed to the
369
+ minimum extent necessary to make it enforceable. If the provision
370
+ cannot be reformed, it shall be severed from this Public License
371
+ without affecting the enforceability of the remaining terms and
372
+ conditions.
373
+
374
+ c. No term or condition of this Public License will be waived and no
375
+ failure to comply consented to unless expressly agreed to by the
376
+ Licensor.
377
+
378
+ d. Nothing in this Public License constitutes or may be interpreted
379
+ as a limitation upon, or waiver of, any privileges and immunities
380
+ that apply to the Licensor or You, including from the legal
381
+ processes of any jurisdiction or authority.
382
+
383
+ =======================================================================
384
+
385
+ Creative Commons is not a party to its public
386
+ licenses. Notwithstanding, Creative Commons may elect to apply one of
387
+ its public licenses to material it publishes and in those instances
388
+ will be considered the “Licensor.” The text of the Creative Commons
389
+ public licenses is dedicated to the public domain under the CC0 Public
390
+ Domain Dedication. Except for the limited purpose of indicating that
391
+ material is shared under a Creative Commons public license or as
392
+ otherwise permitted by the Creative Commons policies published at
393
+ creativecommons.org/policies, Creative Commons does not authorize the
394
+ use of the trademark "Creative Commons" or any other trademark or logo
395
+ of Creative Commons without its prior written consent including,
396
+ without limitation, in connection with any unauthorized modifications
397
+ to any of its public licenses or any other arrangements,
398
+ understandings, or agreements concerning use of licensed material. For
399
+ the avoidance of doubt, this paragraph does not form part of the
400
+ public licenses.
401
+
402
+ Creative Commons may be contacted at creativecommons.org.
README.md CHANGED
@@ -1,3 +1,54 @@
  ---
- license: cc-by-nc-4.0
+ language:
+ - en
+ license: cc-by-nc-nd-4.0
+ tags:
+ - cxr
+ - ecg
+ - echocardiogram
+ - probabilistic modelling
+ - multimodal
+ - medical
+ pipeline_tag: other
  ---
+
+ # ProbMED: A Probabilistic Framework for Medical Multimodal Binding (ICCV 2025)
+ Probabilistic Modality-Enhanced Diagnosis (ProbMED) is a multimodal Med-VLPM that employs probabilistic contrastive learning to model distributions over embeddings rather than deterministic point estimates. ProbMED aligns four distinct modalities (chest X-rays, electrocardiograms, echocardiograms, and clinical text) in a unified probabilistic embedding space.
+
+ <p align="center">
+   <img src="assets/github_highlevel.png" width="40%">
+ </p>
+
+
+ ## Installation
+ Clone the GitHub repository and install the dependencies. Further instructions are in the repository:
+ ```bash
+ git clone git@github.com:mcintoshML/probMED.git
+ cd probMED
+ pip install -r requirements.txt
+ ```
+
+ ## Full Code Release
+ The model weights and inference code are available with this codebase.
+
+ **We plan to release the full training and evaluation codebase upon submission to a clinical journal to facilitate reproducibility. Please stay tuned!**
+
+ ## License
+ This work is licensed under the **Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0)**.
+
+ You may share this work for non-commercial purposes, with proper attribution, but you may not modify it or use it commercially.
+
+ [![Creative Commons License](https://i.creativecommons.org/l/by-nc-nd/4.0/88x31.png)](https://creativecommons.org/licenses/by-nc-nd/4.0/)
+
+ [View Full License Details](https://creativecommons.org/licenses/by-nc-nd/4.0/)
+
+ ## Citation
+ If you use ProbMED in your research (ICCV 2025), please cite:
+ ```
+ @article{gao2025probmed,
+   title={ProbMed: A Probabilistic Framework for Medical Multimodal Binding},
+   author={Gao, Yuan and Kim, Sangwook and You, Jianzhong and McIntosh, Chris},
+   journal={arXiv preprint arXiv:2509.25711},
+   year={2025}
+ }
+ ```
assets/github_highlevel.png ADDED

Git LFS Details

  • SHA256: 3e10e5cc30e35945b2d0f4e0301b90e54ea7fad05a7a15e7865ba7b51e638384
  • Pointer size: 131 Bytes
  • Size of remote file: 193 kB
mixinhelpers.py ADDED
@@ -0,0 +1,221 @@
+ # Preprocessing mixins for echo, CXR, ECG, and text inputs
+ import random
+
+ import cv2
+ import numpy as np
+ import torch
+ from PIL import Image
+ from torchvision import transforms
+ from transformers import BatchEncoding, PreTrainedTokenizer
+
+ """
+ Mixins for all modalities. Each mixin has:
+ - preprocess function that takes in path or data and returns tensor
+ - construct_input function that takes in tensor and returns dict with batch
+   dimension for model input
+ - key string for model input dict
+ """
+
+
+ class ECHO_Mixin:
+     LOWER_YELLOW: list[int] = [20, 50, 50]
+     UPPER_YELLOW: list[int] = [100, 255, 255]
+     IMAGE_SIZE: tuple[int, int] = (224, 224)
+     NORM_MEAN: tuple[float, float, float] = (0.48145466, 0.4578275, 0.40821073)
+     NORM_STD: tuple[float, float, float] = (0.26862954, 0.26130258, 0.27577711)
+
+     ECHO_TRANSFORMS = transforms.Compose(
+         [
+             transforms.ToTensor(),  # Scaling into [0, 1]
+             transforms.Resize(IMAGE_SIZE),
+             transforms.Normalize(
+                 mean=NORM_MEAN,
+                 std=NORM_STD,
+             ),
+         ]
+     )
+     ECHO_KEY: str = "echo"
+
+     def grabimage(self, split: str, data: dict[str, np.ndarray]) -> np.ndarray:
+         """Pick a random case; use a random frame for "train", else the first frame."""
+         if split == "train":
+             caseofinterest = random.choice(list(data.keys()))
+             imageindice = random.choice(list(range(data[caseofinterest].shape[0])))
+
+         else:
+             caseofinterest = random.choice(list(data.keys()))  # listofcases[0]
+             imageindice = 0
+         video = data[caseofinterest]
+         return self.extract_echoframe(imageindice, video)
+
+     def extract_echoframe(self, imageindice: int, video: np.ndarray) -> np.ndarray:
+         image = video[imageindice]
+         hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
+         lower_yellow = np.array(self.LOWER_YELLOW)  # Lower bound of yellow hue
+         upper_yellow = np.array(self.UPPER_YELLOW)  # Upper bound of yellow hue
+         mask = cv2.inRange(hsv_image, lower_yellow, upper_yellow)
+         image[mask > 0] = [0, 0, 0]
+         image = np.array(image, dtype=np.float32)
+         image -= image.min()
+         image /= image.max()  # assumes a non-constant frame; the max is 0 otherwise
+         image *= 255
+
+         image = image
+         image = image[:, :, :]
+         image = image.astype(np.uint8)
+         return image
+
+     def preprocess_echoseries(
+         self, video_dict: dict[str, np.ndarray], split: str = "valid"
+     ) -> torch.Tensor:
+         """assumes inference mode"""
+         image = self.grabimage(split, video_dict)
+         if not isinstance(image, np.ndarray):
+             raise TypeError("Expected image to be a numpy ndarray")
+         pil_image = Image.fromarray(image)
+         transformed = self.ECHO_TRANSFORMS(pil_image)
+         if not isinstance(transformed, torch.Tensor):
+             transformed = transforms.ToTensor()(pil_image)
+         return transformed
+
+     def preprocess_single_echo(self, avi_path: str) -> torch.Tensor:
+         """assumes inference mode, opens AVI file and processes first frame
+         Output: image: torch.Tensor of shape (C, H, W)
+         """
+         cap = cv2.VideoCapture(avi_path)
+         success, frame = cap.read()
+         cap.release()
+         if not success or frame is None:
+             raise ValueError(f"Could not read frame from AVI file: {avi_path}")
+         image = self.extract_echoframe(0, np.array([frame]))  # process first frame
+         image = self.ECHO_TRANSFORMS(Image.fromarray(image))
+         if not isinstance(image, torch.Tensor):
+             image = torch.from_numpy(image)
+         return image
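Note on `extract_echoframe`: the frame is min-max rescaled to [0, 255] before the cast to `uint8`. The sketch below reproduces only that rescaling step with NumPy. The helper name `rescale_to_uint8` and the zero-range guard are illustrative assumptions and are not part of this commit, which divides by the maximum unconditionally.

```python
import numpy as np


def rescale_to_uint8(frame: np.ndarray) -> np.ndarray:
    """Min-max rescale to [0, 255]. Hypothetical helper, not in the commit."""
    out = frame.astype(np.float32)
    out -= out.min()
    peak = out.max()
    if peak > 0:  # guard added for this sketch; a constant frame stays all zeros
        out /= peak
    out *= 255
    return out.astype(np.uint8)  # the cast truncates toward zero


frame = np.array([[10, 20], [30, 50]], dtype=np.uint8)
scaled = rescale_to_uint8(frame)
flat = rescale_to_uint8(np.full((2, 2), 7, dtype=np.uint8))
```

Whether a constant (for example all-black) frame can occur in practice depends on the echo data, so the guard may or may not be needed.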
+
+
+ # CXR
+ class CXR_Mixin:
+     RESIZE: tuple[int, int] = (256, 256)
+     IMAGE_SIZE: tuple[int, int] = (224, 224)
+     NORM_MEAN: list[float] = [0.5862785803043838]
+     NORM_STD: list[float] = [0.27950088968644304]
+     VISION_KEY: str = "vision"
+     CXR_TRANSFORMS = transforms.Compose(
+         [
+             transforms.ToTensor(),  # Scaling into [0, 1]
+             transforms.Resize(RESIZE),
+             transforms.CenterCrop(IMAGE_SIZE),
+             transforms.Normalize(
+                 mean=NORM_MEAN,
+                 std=NORM_STD,
+             ),
+         ]
+     )
+
+     @staticmethod
+     def remove_border(pixel_array: np.ndarray) -> np.ndarray:
+         # Find where the image is not just background (0s)
+         coords = np.column_stack(np.where(pixel_array > 0))
+         x_min, y_min = coords.min(axis=0)
+         x_max, y_max = coords.max(axis=0)
+         # Crop the image (slice ends are exclusive, so the last non-zero row and column are dropped)
+         cropped_image = pixel_array[x_min:x_max, y_min:y_max]
+         return cropped_image
+
+     def preprocess_loaded_cxr(self, img: np.ndarray) -> torch.Tensor:
+         cxr = self.remove_border(img)
+         # Convert grayscale image to 3-channel RGB
+         cxr = np.repeat(cxr[..., np.newaxis], 3, axis=-1)
+
+         cxr = Image.fromarray(cxr)
+         transformed = self.CXR_TRANSFORMS(cxr)
+         if not isinstance(transformed, torch.Tensor):
+             transformed = transforms.ToTensor()(cxr)
+         return transformed
+
+     def preprocess_single_cxr(self, image_path: str) -> torch.Tensor:
+         """assumes inference mode"""
+         with open(image_path, "rb") as fopen:
+             image = Image.open(fopen).convert("RGB")
+             image = np.array(image)[:, :, 0]  # take the first channel as grayscale
+
+         cxr = self.preprocess_loaded_cxr(image)
+         return cxr
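Note on `remove_border`: the crop is the bounding box of the non-zero pixels, sliced with exclusive upper bounds. The sketch below shows the effect on a small array. `crop_to_content` and its `inclusive` flag are hypothetical names introduced only for this illustration; `inclusive=False` is meant to mirror the slicing in the diff, and `inclusive=True` keeps the last non-zero row and column.

```python
import numpy as np


def crop_to_content(pixels: np.ndarray, inclusive: bool = True) -> np.ndarray:
    """Crop a 2-D array to the bounding box of its non-zero pixels.

    Hypothetical helper for illustration, not part of the commit.
    """
    rows, cols = np.nonzero(pixels)
    if rows.size == 0:
        raise ValueError("image has no non-zero pixels")
    pad = 1 if inclusive else 0  # 0 reproduces the exclusive slice ends
    return pixels[rows.min() : rows.max() + pad, cols.min() : cols.max() + pad]


image = np.zeros((5, 6), dtype=np.uint8)
image[1:4, 2:5] = 9  # 3x3 block of content surrounded by a zero border
full = crop_to_content(image)
trimmed = crop_to_content(image, inclusive=False)
```

Since the released weights were presumably trained with the committed preprocessing, changing the crop at inference time could shift inputs slightly; the sketch is only meant to make the behaviour visible.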
+
+
+ class ECG_Mixin:
+     LENGTH: int = 1000
+     FREQUENCY: int = 100  # we assume 100Hz sampling rate
+     CHANNELS: int = 12
+     NORM_MEAN: float = 0.02547506
+     NORM_SCALE: float = 0.16486814
+     NORM_VAR: float = 0.0271815
+     ECG_KEY: str = "ecg"
+
+     def manual_standardize(self, x: np.ndarray) -> torch.Tensor:
+         """
+         Apply manual standardization to ECG or other data.
+         Equivalent to sklearn's StandardScaler with given constants.
+
+         Args:
+             x (np.ndarray): Input array of shape (12, 1000)
+         Returns:
+             torch.Tensor: Scaled array of the same shape
+         """
+         return torch.from_numpy((x - self.NORM_MEAN) / self.NORM_SCALE).float()
+
+     def check_ecg(self, ecg: np.ndarray) -> np.ndarray:
+         # Reject NaN or Inf samples before truncating the signal
+         if np.isnan(ecg).any() or np.isinf(ecg).any():
+             raise ValueError("ECG contains NaN or Inf values")
+         return ecg[:, : self.LENGTH]  # Truncate to the first 1000 samples (10 seconds at 100 Hz)
+
+     def preprocess_single_ecg(self, ecg_path: str) -> torch.Tensor:
+         """assumes inference mode"""
+         # ecg_path points to a saved NumPy array; 12 channels are assumed
+         ecg = np.load(ecg_path)
+         if ecg.ndim == 2 and ecg.shape[0] != self.CHANNELS:
+             raise ValueError(f"Expected ECG with {self.CHANNELS} channels, got {ecg.shape[0]}")
+
+         ecg = self.check_ecg(ecg)
+         transformed = self.manual_standardize(ecg)
+
+         return transformed
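Note on `ECG_Mixin`: the standardization is a fixed `(x - mean) / scale` with constants stored on the class, and `NORM_SCALE` appears to be the square root of `NORM_VAR`. The sketch below checks that relationship and applies the same truncate-then-scale steps to a synthetic signal. The constants are copied from the diff; the function name `standardize` and the synthetic input are assumptions made for illustration.

```python
import math

import numpy as np

# Constants copied from ECG_Mixin in the diff.
NORM_MEAN = 0.02547506
NORM_SCALE = 0.16486814
NORM_VAR = 0.0271815
LENGTH = 1000  # 10 seconds at the assumed 100 Hz sampling rate


def standardize(ecg: np.ndarray) -> np.ndarray:
    """Truncate to LENGTH samples, then apply (x - mean) / scale."""
    if not np.isfinite(ecg).all():
        raise ValueError("ECG contains NaN or Inf values")
    return (ecg[:, :LENGTH] - NORM_MEAN) / NORM_SCALE


# Synthetic 12-lead signal, 1200 samples long, filled with the mean value.
ecg = np.full((12, 1200), NORM_MEAN)
out = standardize(ecg)
scale_matches_var = math.isclose(NORM_SCALE**2, NORM_VAR, rel_tol=1e-5)
```

A signal sitting exactly at `NORM_MEAN` maps to zeros, which is a quick sanity check when wiring up a new ECG source.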
185
+
186
+
187
+ class Text_Mixin:
188
+ MODALITY_LIST: dict[str, str] = {"echo": "echocardiogram", "ecg": "ecg", "vision": "cxr"}
189
+ MAX_LENGTH: int = 120 # longer length to accomodate longer reports
190
+ TEXT_LENGTH: int = 100 # 100 words
191
+
192
+ def get_first_n_words(self, text: str, n: int = 100) -> str:
193
+ """97.5 percentile of text is less than 35 words"""
194
+ words = text.split() # Split the text into words
195
+ return " ".join(words[:n]) # Join the first n words back into a string
196
+
197
+ def createCaption(self, caption: str, modality: str = "") -> str:
198
+ assert modality in set(self.MODALITY_LIST.keys()) or modality == "", (
199
+ f"modality should be in {self.MODALITY_LIST} or empty"
200
+ )
201
+ return f"text : {caption}, {modality} looks like : "
202
+
203
+ def createTokenizedCaption(self, caption: str, tokenizer: PreTrainedTokenizer) -> BatchEncoding:
204
+ encoding = tokenizer(
205
+ caption,
206
+ padding="max_length",
207
+ truncation=True,
208
+ max_length=self.MAX_LENGTH,
209
+ return_tensors="pt",
210
+ )
211
+ return encoding
212
+
213
+ def construct_caption(
214
+ self, caption: str, tokenizer: PreTrainedTokenizer, modality: str = ""
215
+ ) -> BatchEncoding:
216
+ """Given a caption string, return the tokenized caption for model input.
217
+ Output: BatchEncoding with keys including 'input_ids' and 'attention_mask', each of shape (1, MAX_LENGTH)
218
+ """
219
+ caption_str = self.createCaption(caption, modality)
220
+ tokenized = self.createTokenizedCaption(caption_str, tokenizer)
221
+ return tokenized
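The string handling in `Text_Mixin` can be tried without a tokenizer. The block below is a minimal sketch, not the repository's code: `create_caption` is a hypothetical snake_case stand-in for `createCaption`, and it raises `ValueError` where the original uses `assert`. It shows that the prompt embeds the modality key as given (for example `ecg`), not the longer name stored in `MODALITY_LIST`.

```python
# Sketch of the Text_Mixin string handling, without the Hugging Face tokenizer.
# Function names here are illustrative stand-ins, not the repository's API.
MODALITY_LIST = {"echo": "echocardiogram", "ecg": "ecg", "vision": "cxr"}


def get_first_n_words(text: str, n: int = 100) -> str:
    """Keep at most the first n whitespace-separated words."""
    return " ".join(text.split()[:n])


def create_caption(caption: str, modality: str = "") -> str:
    """Build the prompt string in the same format as createCaption."""
    if modality and modality not in MODALITY_LIST:
        raise ValueError(f"modality should be one of {sorted(MODALITY_LIST)} or empty")
    return f"text : {caption}, {modality} looks like : "


report = "Sinus rhythm.   Normal ECG."
prompt = create_caption(get_first_n_words(report, n=3), modality="ecg")
print(prompt)
```

Splitting on whitespace also collapses repeated spaces, so the truncated text is normalized before it reaches the tokenizer.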
preprocessor.py ADDED
@@ -0,0 +1,69 @@
1
+ import torch
2
+ from transformers import AutoTokenizer, BatchEncoding
3
+
4
+ from mixinhelpers import CXR_Mixin, ECG_Mixin, ECHO_Mixin, Text_Mixin
5
+
6
+ """
7
+ Preprocessor classes for different modalities and their combinations.
8
+ You can combine different mixins to create preprocessors for multi-modal inputs.
9
+ Examples below are provided for ECHO+Text, ECG+Text, and CXR+Text.
10
+ """
11
+
12
+
13
+ class BasePreprocessor:
14
+ def __init__(self, model_name: str = "dmis-lab/biobert-v1.1") -> None:
15
+ self.tokenizer = AutoTokenizer.from_pretrained(model_name)
16
+
17
+
18
+ # dual-modality preprocessors
19
+ class ECHOText_Preprocessor(BasePreprocessor, ECHO_Mixin, Text_Mixin):
20
+ def __init__(self, model_name: str = "dmis-lab/biobert-v1.1") -> None:
21
+ super().__init__(model_name=model_name)
22
+
23
+ def preprocess_echo_text(self, echo_path: str, text: str) -> tuple[torch.Tensor, BatchEncoding]:
24
+ """Preprocess one echo/text pair. Can be used in a dataloader to collate batches;
25
+ use the string keys to identify the modalities.
26
+ echo_path: path to echo npy file
27
+ text: string of text report
28
+ returns: (echo tensor, tokenized text dict)"""
29
+ echo = self.preprocess_single_echo(echo_path) # (C, H, W)
30
+ text_inputs = self.construct_caption(
31
+ caption=text, tokenizer=self.tokenizer, modality=self.ECHO_KEY
32
+ )
33
+ return echo, text_inputs
34
+
35
+
36
+ class ECGText_Preprocessor(BasePreprocessor, ECG_Mixin, Text_Mixin):
37
+ def __init__(self, model_name: str = "dmis-lab/biobert-v1.1") -> None:
38
+ super().__init__(model_name=model_name)
39
+
40
+ def preprocess_ecg_text(self, ecg_path: str, text: str) -> tuple[torch.Tensor, BatchEncoding]:
41
+ """Preprocess one ECG/text pair. Can be used in a dataloader to collate batches;
42
+ use the string keys to identify the modalities.
43
+ ecg_path: path to ecg npy file
44
+ text: string of text report
45
+ returns: (ecg tensor, tokenized text dict)"""
46
+ ecg = self.preprocess_single_ecg(ecg_path) # (C, L)
47
+ text_inputs = self.construct_caption(
48
+ caption=text, tokenizer=self.tokenizer, modality=self.ECG_KEY
49
+ )
50
+
51
+ return ecg, text_inputs
52
+
53
+
54
+ class CXRText_Preprocessor(BasePreprocessor, CXR_Mixin, Text_Mixin):
55
+ def __init__(self, model_name: str = "dmis-lab/biobert-v1.1") -> None:
56
+ super().__init__(model_name=model_name)
57
+
58
+ def preprocess_cxr_text(self, cxr_path: str, text: str) -> tuple[torch.Tensor, BatchEncoding]:
59
+ """Preprocess one CXR/text pair. Can be used in a dataloader to collate batches;
60
+ use the string keys to identify the modalities.
61
+ cxr_path: path to cxr image file
62
+ text: string of text report
63
+ returns: (cxr tensor, tokenized text dict)"""
64
+ cxr = self.preprocess_single_cxr(cxr_path) # (C, H, W)
65
+ text_inputs = self.construct_caption(
66
+ caption=text, tokenizer=self.tokenizer, modality=self.VISION_KEY
67
+ )
68
+
69
+ return cxr, text_inputs
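The mixin composition used by the preprocessors above can be sketched without torch or a downloaded tokenizer. Everything named `*Sketch` below is hypothetical. The sketch makes three assumptions: `LENGTH = 1000` (taken from the truncation comment in `check_ecg`), a NumPy array as input in place of a `.npy` path, and `str.split` as a stand-in for the BioBERT tokenizer. The normalization constants are the ones from `ECG_Mixin`.

```python
import numpy as np


class EcgMixinSketch:
    """NumPy-only stand-in for ECG_Mixin (returns arrays, not torch tensors)."""

    LENGTH = 1000  # assumed: 10 s at 100 Hz, per the comment in check_ecg
    CHANNELS = 12
    NORM_MEAN = 0.02547506
    NORM_SCALE = 0.16486814
    ECG_KEY = "ecg"

    def preprocess_ecg_array(self, ecg: np.ndarray) -> np.ndarray:
        if ecg.ndim != 2 or ecg.shape[0] != self.CHANNELS:
            raise ValueError(f"Expected ({self.CHANNELS}, L) array, got {ecg.shape}")
        if not np.isfinite(ecg).all():
            raise ValueError("ECG contains NaN or Inf values")
        ecg = ecg[:, : self.LENGTH]  # truncate to the first LENGTH samples
        return ((ecg - self.NORM_MEAN) / self.NORM_SCALE).astype(np.float32)


class TextMixinSketch:
    def construct_caption(self, caption, tokenizer, modality=""):
        # Same prompt format as Text_Mixin.createCaption
        return tokenizer(f"text : {caption}, {modality} looks like : ")


class BaseSketch:
    def __init__(self, tokenizer=str.split):
        # Hypothetical stand-in: the real class calls AutoTokenizer.from_pretrained(model_name)
        self.tokenizer = tokenizer


class EcgTextSketch(BaseSketch, EcgMixinSketch, TextMixinSketch):
    def preprocess_ecg_text(self, ecg, text):
        ecg_out = self.preprocess_ecg_array(ecg)
        tokens = self.construct_caption(text, self.tokenizer, self.ECG_KEY)
        return ecg_out, tokens


pre = EcgTextSketch()
signal = np.full((12, 1200), EcgMixinSketch.NORM_MEAN)  # constant signal at the mean
ecg_out, tokens = pre.preprocess_ecg_text(signal, "normal sinus rhythm")
print(ecg_out.shape, tokens)
```

Because the base class owns the tokenizer and each mixin contributes only methods and constants, a new modality pairing needs nothing more than another subclass that lists the relevant mixins.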
probmed_weights.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e706413f29b3dd7ba3ef197e1178192f92ffc2b75a0a3bd38eff37ce5f7059b0
3
+ size 1008833750