---
language:
- zh
- en
pipeline_tag: visual-question-answering
datasets:
- Lin-Chen/ShareGPT4V
- liuhaotian/LLaVA-Pretrain
---
# Model
<!-- Provide a quick summary of what the model is/does. -->
llava-qwen1.5-4b-chat is a lightweight multimodal model based on the [LLaVA architecture](https://llava-vl.github.io/).
- Language Model: [Qwen/Qwen1.5-4B-Chat](https://huggingface.co/Qwen/Qwen1.5-4B-Chat)
- Vision Encoder: [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384)
- Total Parameters: 4,388,102,720
## Evaluation
### MMBench
Model | MMBench Test (EN) | MMBench Dev (EN) | MMBench Test (CN) | MMBench Dev (CN) | CCBench Dev
------------- | ------------- | ------------- | ------------- | ------------- | -------------
LLaVA-v1.5-7B | 67.7 | 69.2 | 61.0 | 59.7 | 28.4
LLaVA-InternLM-7B | 69.0 | 68.5 | 66.7 | 63.8 | 37.3
LLaVA-InternLM2-7B | 73.3 | 74.6 | 71.7 | 72.0 | 42.5
Bunny-3B | 69.2 | 68.6 | - | - | -
MiniCPM-V | 64.1 | 67.9 | 62.6 | 65.3 | 41.4
llava-qwen1.5-4b-chat | 69.6 | 69.2 | 68.6 | 68.3 | 41.0
## Uses
TBD
## Training Details
TBD