trinhvg committed
Commit 6da947e · verified · 1 Parent(s): 43f5306

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +22 -8
README.md CHANGED
@@ -1,8 +1,3 @@
- ---
- datasets:
- - trinhvg/ViDRiP_Instruct_Test
- - trinhvg/ViDRiP_Instruct_Train
- ---

  # 🧬 ViDRiP-LLaVA: A Dataset and Benchmark for Diagnostic Reasoning from Pathology Videos

@@ -26,11 +21,28 @@ Our method leverages chain-of-thought (CoT) prompting to distill the reasoning c

  ## 📚 Video Datasets

+ ### 🎥 Released Video Format
+
+ All clips are:
+ - **Cleaned** with a Visual Data Refinement pipeline (temporal trimming + YoloPath filtering + OCR exclusion + inpainting)
+ - **Downsampled** to **1–5 FPS** to reduce file size and support fair-use compliance
+ - **Muted** to avoid redistributing the original YouTube audio
+
+ These steps preserve the diagnostic signal while respecting the rights of YouTube creators and complying with [YouTube’s Terms of Service](https://www.youtube.com/t/terms).
+
+ ### 🔍 Training vs. Public Release Notice
+ The ViDRiP-LLaVA models were trained on an internal version of the dataset that included:
+ - Full-frame-rate video clips
+ - Visual content **prior to OCR filtering**
+
+ All **evaluations** (including those in our benchmark) use the **publicly released test set**, ensuring full reproducibility.
+
+
  ### 🔹 [ViDRiP_Instruct_Train](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train)
- The videos data is ~ 100 GB:
+ The video data is ~60 GB:

  [//]: # (### 🔹 [ViDRiP_Instruct_Train_Video_GoogleDrive](https://drive.google.com/drive/folders/1oxZlaJpE7PGDYt32LeoGgIzwEvWdnupY?usp=sharing))
- ### 🔹 [ViDRiP_Instruct_Train_Video_Hugging Face](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train) (There is 10 zip files)
+ ### 🔹 [ViDRiP_Instruct_Train_Video (Hugging Face)](https://huggingface.co/datasets/trinhvg/ViDRiP_Instruct_Train) (6 zip files)

  - 4,000+ instruction-style samples
  - Each sample includes:
@@ -46,6 +58,8 @@ The videos data is ~ 100 GB:
  - Held-out test set of diagnostic Q&A pairs
  - Used for benchmarking reasoning performance

+
+
  ## 📚 Image Datasets
  We use publicly available datasets: Quilt-LLaVA and PathAsst.
  Please refer to their respective repositories for download instructions.
@@ -139,4 +153,4 @@ All content usage complies with [**YouTube’s Terms of Service**](https://www.y

  * Not for **commercial use**
  * Not to be used in **clinical care** or **medical decision-making**
- * For **academic research, development, and evaluation only**
+ * For **academic research, development, and evaluation only**
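
For reference, the 6 training zip archives named in the updated card can be fetched with `huggingface_hub`, the same library used for this upload. A minimal sketch, not an official download script from the dataset card; the `allow_patterns` filter and the extraction directory are assumptions:

```python
# Sketch: fetch the ViDRiP_Instruct_Train zip archives from the Hugging Face Hub
# and unpack them locally. Requires `pip install huggingface_hub`.
import zipfile
from pathlib import Path

from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="trinhvg/ViDRiP_Instruct_Train",
    repo_type="dataset",
    allow_patterns=["*.zip"],  # only the video archives (~60 GB total)
)

out_dir = Path("ViDRiP_videos")  # arbitrary local target directory
out_dir.mkdir(exist_ok=True)
for zip_path in Path(local_dir).glob("*.zip"):
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
```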
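The card's "Released Video Format" section says clips are downsampled to 1–5 FPS. A minimal sketch of that kind of frame-rate reduction, assuming OpenCV (`pip install opencv-python`); it is illustrative only, not the authors' Visual Data Refinement pipeline, and the function name and default rate are hypothetical:

```python
# Sketch: downsample a clip to a fixed frame rate by keeping every N-th frame,
# mirroring the 1-5 FPS release format described in the dataset card.
import cv2

def downsample(src: str, dst: str, target_fps: float = 2.0) -> None:
    cap = cv2.VideoCapture(src)
    src_fps = cap.get(cv2.CAP_PROP_FPS)
    step = max(1, round(src_fps / target_fps))  # keep every `step`-th frame
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*"mp4v"),
                          target_fps, (width, height))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:  # drop intermediate frames
            out.write(frame)
        idx += 1
    cap.release()
    out.release()
```

Frame dropping of this kind is what the 1–5 FPS format implies; the exact tooling the authors used is not specified in the card.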