trinhvg/ViDRiP_LLaVA_image


trinhvg committed on May 20

Commit b29a222 (verified) · Parent: 4436a91

Upload README.md with huggingface_hub


README.md +38 -3


@@ -1,5 +1,5 @@
-# 🧬 ViDRiP-LLaVA: Multimodal Diagnostic Reasoning in Pathology
 **ViDRiP-LLaVA** is a vision-language framework designed for instruction-based diagnostic reasoning using both image patches and video clips from pathology slides. It builds on LLaVA and extends it to the medical domain with domain-specific datasets and fine-tuned models.
@@ -19,10 +19,12 @@ Our method leverages chain-of-thought (CoT) prompting to distill the reasoning c
 </p>
-## 📚 Datasets
 ### 🔹 [ViDRiP_Instruct_Train](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train)
-### 🔹 [ViDRiP_Instruct_Train_Video_GoogleDrive](https://drive.google.com/drive/folders/1oxZlaJpE7PGDYt32LeoGgIzwEvWdnupY?usp=sharing)
 ### 🔹 [ViDRiP_Instruct_Train_Video_Hugging Face](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train) (There are 10 zip files)
 - 4,000+ instruction-style samples
@@ -39,6 +41,11 @@ Our method leverages chain-of-thought (CoT) prompting to distill the reasoning c
 - Held-out test set of diagnostic Q&A pairs
 - Used for benchmarking reasoning performance
 ---
@@ -101,3 +108,31 @@ license: cc-by-nc-nd-3.0
 ### Citation:
 Coming soon

+# 🧬 ViDRiP-LLaVA: A Dataset and Benchmark for Diagnostic Reasoning from Pathology Videos
 **ViDRiP-LLaVA** is a vision-language framework designed for instruction-based diagnostic reasoning using both image patches and video clips from pathology slides. It builds on LLaVA and extends it to the medical domain with domain-specific datasets and fine-tuned models.
 </p>
+## 📚 Video Datasets
 ### 🔹 [ViDRiP_Instruct_Train](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train)
+The video data is approximately 100 GB:
+[//]: # (### 🔹 [ViDRiP_Instruct_Train_Video_GoogleDrive]&#40;https://drive.google.com/drive/folders/1oxZlaJpE7PGDYt32LeoGgIzwEvWdnupY?usp=sharing&#41;)
 ### 🔹 [ViDRiP_Instruct_Train_Video_Hugging Face](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train) (There are 10 zip files)
 - 4,000+ instruction-style samples
 - Held-out test set of diagnostic Q&A pairs
 - Used for benchmarking reasoning performance
+## 📚 Image Datasets
+We use publicly available datasets: Quilt-LLaVA and PathAsst.
+Please refer to their respective repositories for download instructions.
+- [**Quilt-LLaVA**](https://github.com/aldraus/quilt-llava): A vision-language dataset for pathology adapted from LLaVA.
+- [**PathAsst**](https://github.com/superjamessyx/Generative-Foundation-AI-Assistant-for-Pathology): A generative assistant for pathology with curated image-text pairs.
 ---
 ### Citation:
 Coming soon
+## 📄 Usage and License Notices
+**ViDRiP-LLaVA** (Vision-language Diagnostic Reasoning in Pathology), including its dataset, code, and model checkpoints, is released strictly for **non-commercial research purposes only**.
+### 📁 Licenses
+* **Dataset:**
+  Licensed under [**CC BY-NC-ND 3.0**](https://creativecommons.org/licenses/by-nc-nd/3.0/) (Attribution–NonCommercial–NoDerivatives)
+* **Code and pretrained models:**
+  Licensed under [**CC BY-NC 3.0**](https://creativecommons.org/licenses/by-nc/3.0/) (Attribution–NonCommercial)
+### ⚙️ Dependencies and Components
+This project may incorporate or build upon resources such as **LLaVA-Next**, **QUILT-1M**, **LLaMA**, **PathAsst**, and **GPT-4**, each subject to its own license and **Terms of Use**.
+### 🎥 Source Acknowledgment
+ViDRiP-LLaVA includes data derived from **public educational pathology videos hosted on YouTube**.
+All content usage complies with [**YouTube’s Terms of Service**](https://www.youtube.com/t/terms), and the **intellectual property rights of the original pathologist creators are fully acknowledged and respected**.
+### 🚫 Restrictions
+* Not for **commercial use**
+* Not to be used in **clinical care** or **medical decision-making**
+* For **academic research, development, and evaluation only**
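
The README points to the video data as zip archives in the `trinhvg/ViDRiP_Instruct_Train` dataset repository. The following is a minimal sketch of how one might fetch and unpack them with `huggingface_hub`; it is not part of the commit. It assumes the archives end in `.zip`, and the helper names `download_video_archives` and `extract_archives` are hypothetical. Since the download is large, make sure enough disk space is available for both the archives and the extracted files.

```python
import zipfile
from pathlib import Path

# Dataset repository linked in the README.
REPO_ID = "trinhvg/ViDRiP_Instruct_Train"


def download_video_archives(local_dir: str) -> Path:
    """Download only the zip archives from the dataset repo.

    Requires `pip install huggingface_hub`; imported lazily so the
    extraction helper below works without it.
    """
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=["*.zip"],  # assumption: archives use a .zip suffix
        local_dir=local_dir,
    )
    return Path(path)


def extract_archives(archive_dir: Path, out_dir: Path) -> list[str]:
    """Extract every zip found under archive_dir into out_dir/<archive stem>.

    Returns the names of the archives processed, in sorted order.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    processed = []
    for archive in sorted(archive_dir.rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir / archive.stem)
        processed.append(archive.name)
    return processed
```

A typical call would be `extract_archives(download_video_archives("vidrip_zips"), Path("vidrip_videos"))`, where both directory names are placeholders.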