Improve model card: Update pipeline tag, add dataset, and HF paper link
This PR aims to improve the model card by:
-   Updating the `pipeline_tag` from `visual-question-answering` to `image-text-to-text` to better reflect the model's capabilities as a Multimodal Large Language Model.
-   Adding `MLLM-CL/MLLM-CL-ReplayData` to the `datasets` metadata, as referenced in the project's GitHub README.
-   Including the Hugging Face paper link alongside the existing arXiv link for improved accessibility to the paper.
These changes enhance the model's discoverability and provide more comprehensive information for users on the Hugging Face Hub.
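Because the `pipeline_tag` now advertises the model as `image-text-to-text`, it can be exercised through the matching `transformers` pipeline. Below is a minimal sketch, assuming a recent `transformers` release that ships the `image-text-to-text` pipeline task; the image URL and prompt are placeholders, and the base checkpoint `llava-hf/llava-1.5-7b-hf` is used for illustration since this repository hosts adapter weights rather than a standalone model.

```python
from transformers import pipeline

# Minimal sketch: query a LLaVA-style checkpoint through the image-text-to-text pipeline.
# The checkpoint, image URL, and prompt are illustrative placeholders, not part of this PR.
pipe = pipeline("image-text-to-text", model="llava-hf/llava-1.5-7b-hf")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},
            {"type": "text", "text": "What does this chart show?"},
        ],
    }
]

# The pipeline accepts chat-style messages and returns the generated continuation.
outputs = pipe(text=messages, max_new_tokens=64)
print(outputs[0]["generated_text"])
```

How the adapter weights in this repository are applied on top of the base checkpoint is described in the project's GitHub README linked below; the snippet above only illustrates the pipeline task implied by the new tag.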
    	
**README.md** (CHANGED):

```diff
@@ -1,13 +1,17 @@
 ---
-license: apache-2.0
+base_model:
+- llava-hf/llava-1.5-7b-hf
+- OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
+datasets:
+- MLLM-CL/MLLM-CL
+- MLLM-CL/MLLM-CL-ReplayData
 language:
 - en
+library_name: transformers
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- llava-hf/llava-1.5-7b-hf
-- OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
-base_model_relation: adapter
+pipeline_tag: image-text-to-text
 tags:
 - finance
 - medical
@@ -25,10 +29,7 @@ tags:
 - multimodal
 - image-to-text
 - text-generation
-pipeline_tag: visual-question-answering
-library_name: transformers
-datasets:
-- MLLM-CL/MLLM-CL
+base_model_relation: adapter
 ---
 
 ## MLLM-CL Benchmark Description
@@ -36,7 +37,7 @@ MLLM-CL is a novel benchmark encompassing domain and ability continual learning,
 whereas the latter evaluates on non-IID scenarios with emerging model ability.
 For more details, please refer to: 
 
-**MLLM-CL: Continual Learning for Multimodal Large Language Models** [[paper](https://arxiv.org/abs/2506.05453)], [[code](https://github.com/bjzhb666/MLLM-CL/)].
+**MLLM-CL: Continual Learning for Multimodal Large Language Models** [[paper](https://arxiv.org/abs/2506.05453)], [[HF paper](https://huggingface.co/papers/2506.05453)], [[code](https://github.com/bjzhb666/MLLM-CL/)].
 
 [Hongbo Zhao](https://scholar.google.com/citations?user=Gs22F0UAAAAJ&hl=zh-CN), [Fei Zhu](https://impression2805.github.io/), [Haiyang Guo](https://ghy0501.github.io/guohaiyang0501.github.io/), [Meng Wang](https://moenupa.github.io/), Rundong Wang, [Gaofeng Meng](https://scholar.google.com/citations?hl=zh-CN&user=5hti_r0AAAAJ), [Zhaoxiang Zhang](https://scholar.google.com/citations?hl=zh-CN&user=qxWfV6cAAAAJ)
 
```