Upload README.md with huggingface_hub
README.md CHANGED
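The commit title refers to the `huggingface_hub` client; for context, a README upload like this one can be reproduced with the snippet below (a minimal sketch: the target repo id is an assumption inferred from the README title, and a write token must already be configured):

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up the token from `huggingface-cli login` or HF_TOKEN

# Create a commit that adds/overwrites README.md, matching this page's title.
# NOTE: the repo id below is an assumption; substitute the actual target repo.
api.upload_file(
    path_or_fileobj="README.md",
    path_in_repo="README.md",
    repo_id="trinhvg/ViDRiP-LLaVA",
    repo_type="model",
    commit_message="Upload README.md with huggingface_hub",
)
```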
@@ -1,8 +1,3 @@
----
-datasets:
-- trinhvg/ViDRiP_Instruct_Test
-- trinhvg/ViDRiP_Instruct_Train
----
 
 # 🧬 ViDRiP-LLaVA: A Dataset and Benchmark for Diagnostic Reasoning from Pathology Videos
 
@@ -26,11 +21,28 @@ Our method leverages chain-of-thought (CoT) prompting to distill the reasoning c
 
 ## 📚 Video Datasets
 
+### 🎥 Released Video Format
+
+All clips are:
+- **Cleaned** using a Visual Data Refinement pipeline (temporal trimming + YoloPath filtering + OCR exclusion + inpainting)
+- **Downsampled** to **1–5 FPS** to reduce file size and support fair-use compliance
+- **Muted** to avoid redistribution of original YouTube audio
+
+These steps preserve diagnostic signal while respecting the rights of YouTube creators and complying with [YouTube’s Terms of Service](https://www.youtube.com/t/terms).
+
+### 🔍 Training vs. Public Release Notice
+The ViDRiP-LLaVA models were trained on an internal dataset version that included:
+- Full-frame-rate video clips
+- Visual content **prior to OCR filtering**
+
+All **evaluations** (including those in our benchmark) are conducted using the **publicly released test set**, ensuring full reproducibility.
+
+
 ### 🔹 [ViDRiP_Instruct_Train](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train)
-The videos data is ~ 100 GB:
+The video data is ~60 GB:
 
 [//]: # (### 🔹 [ViDRiP_Instruct_Train_Video_GoogleDrive](https://drive.google.com/drive/folders/1oxZlaJpE7PGDYt32LeoGgIzwEvWdnupY?usp=sharing))
-### 🔹 [ViDRiP_Instruct_Train_Video_Hugging Face](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train) (There is
+### 🔹 [ViDRiP_Instruct_Train_Video_Hugging Face](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train) (There are 6 zip files)
 
 - 4,000+ instruction-style samples
 - Each sample includes:
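The "Released Video Format" block added above describes downsampling and muting, which map onto standard ffmpeg options; below is a minimal sketch of those two steps only, not the authors' actual pipeline (file names and the 3 FPS value are assumptions; trimming, YoloPath filtering, OCR exclusion, and inpainting are model-based steps not shown):

```python
import subprocess

def downsample_and_mute(src: str, dst: str, fps: int = 3) -> None:
    """Re-encode a clip at a low frame rate with the audio track removed.

    Covers only the "Downsampled" and "Muted" steps; the 3 FPS default is
    an assumed value inside the stated 1-5 FPS range.
    """
    subprocess.run(
        [
            "ffmpeg",
            "-y",                 # overwrite the output if it already exists
            "-i", src,            # input clip
            "-vf", f"fps={fps}",  # resample the video stream to `fps`
            "-an",                # drop the audio stream (mute)
            dst,
        ],
        check=True,
    )

downsample_and_mute("clip_raw.mp4", "clip_released.mp4")
```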
@@ -46,6 +58,8 @@ The videos data is ~ 100 GB:
 - Held-out test set of diagnostic Q&A pairs
 - Used for benchmarking reasoning performance
 
+
+
 ## 📚 Image Datasets
 We use publicly available datasets: Quilt-LLaVA and PathAsst.
 Please refer to their respective repositories for download instructions.
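Since both repos above are ordinary Hugging Face dataset repos, they can be fetched with `huggingface_hub`; a minimal sketch (the `*.zip` filter follows the "6 zip files" note, and the assumption that the archives sit at the repo root is untested):

```python
from huggingface_hub import snapshot_download

# Training videos: download only the zip archives (per the "6 zip files" note).
snapshot_download(
    repo_id="trinhvg/ViDRiP_Instruct_Train",
    repo_type="dataset",
    local_dir="ViDRiP_Instruct_Train",
    allow_patterns=["*.zip"],  # assumes archives are stored at the repo root
)

# Held-out test set of diagnostic Q&A pairs used for benchmarking.
snapshot_download(
    repo_id="trinhvg/ViDRiP_Instruct_Test",
    repo_type="dataset",
    local_dir="ViDRiP_Instruct_Test",
)
```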
@@ -139,4 +153,4 @@ All content usage complies with [**YouTube’s Terms of Service**](https://www.y
 
 * Not for **commercial use**
 * Not to be used in **clinical care** or **medical decision-making**
-* For **academic research, development, and evaluation only**
+* For **academic research, development, and evaluation only**