nielsr HF Staff committed on
Commit 643dedb · verified · 1 Parent(s): b5ffed2

Update pipeline tag to video-text-to-text

This PR updates the `pipeline_tag` in the model card metadata from `image-text-to-text` to `video-text-to-text` to accurately reflect the model's ability to process both video and text. This ensures the model is correctly discoverable through Hugging Face's model search functionality.
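
For illustration only, the metadata edit described above can be sketched with the Python standard library. This is not the tooling used to produce this PR; the helper name `set_pipeline_tag` and the sample README text are hypothetical, and the sketch assumes the front matter is the block between the first two `---` lines.

```python
import re


def set_pipeline_tag(readme_text: str, new_tag: str) -> str:
    """Return readme_text with pipeline_tag in the YAML front matter set to new_tag.

    Hypothetical helper: it only edits text and never contacts the Hub.
    """
    # Assumption: the front matter is the block between the first two '---' lines.
    match = re.match(r"\A---\r?\n(.*?\r?\n)---\r?\n", readme_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("README has no YAML front matter")
    front = match.group(1)
    # Replace an existing top-level pipeline_tag line, keeping its line ending.
    updated, count = re.subn(
        r"^pipeline_tag:[^\r\n]*",
        lambda _m: f"pipeline_tag: {new_tag}",
        front,
        flags=re.MULTILINE,
    )
    if count == 0:
        # No existing key: append one at the end of the front matter.
        updated = front + f"pipeline_tag: {new_tag}\n"
    return readme_text[: match.start(1)] + updated + readme_text[match.end(1):]


# Made-up, shortened model card used only to exercise the helper.
sample = (
    "---\n"
    "license: apache-2.0\n"
    "pipeline_tag: image-text-to-text\n"
    "tags:\n"
    "- Sa2VA\n"
    "---\n"
    "\n"
    "# Sa2VA\n"
)
updated_readme = set_pipeline_tag(sample, "video-text-to-text")
```

Unlike the actual change in this PR, the sketch leaves the key order of the front matter untouched. For edits made directly against the Hub, the `huggingface_hub` library also offers a `metadata_update` function that can open a pull request with the new metadata.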

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -1,15 +1,15 @@
  ---
- license: apache-2.0
- pipeline_tag: image-text-to-text
- library_name: transformers
  base_model:
- - OpenGVLab/InternVL2.5-4B
- base_model_relation: merge
+ - OpenGVLab/InternVL2.5-4B
  language:
- - multilingual
+ - multilingual
+ library_name: transformers
+ license: apache-2.0
+ pipeline_tag: video-text-to-text
  tags:
- - Sa2VA
- - custom_code
+ - Sa2VA
+ - custom_code
+ base_model_relation: merge
  ---
 
  # Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
@@ -156,4 +156,4 @@ If you find this project useful in your research, please consider citing:
  journal={arXiv preprint},
  year={2025}
  }
- ```
+ ```