|
|
|
# LLaVA-NeXT: Tackling Multi-image, Video, and 3D in Large Multimodal Models |
|
|
|
## Contents |
|
- [Demo](#demo) |
|
- [Evaluation](#evaluation) |
|
|
|
## Demo |
|
|
|
> Make sure you have installed the LLaVA-NeXT model files by following the top-level README.md first.
|
|
|
1. **Example model:** `lmms-lab/llava-next-interleave-7b` |
|
|
|
|
|
To run a demo, execute: |
|
```bash
# If the demo fails to run, make sure the checkpoint path contains 'qwen',
# e.g.: mv llava-next-interleave-7b llava-next-interleave-qwen-7b
python playground/demo/interleave_demo.py --model_path path/to/ckpt
```
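As a minimal sketch of the rename tip above (the assumption, taken from the comment in the demo command, is that the demo only requires the substring `qwen` in the checkpoint path; the `mkdir -p` line stands in for the actual downloaded checkpoint directory):

```shell
# Sketch: ensure the checkpoint directory name contains 'qwen'
ckpt=llava-next-interleave-7b
mkdir -p "$ckpt"            # stand-in for the downloaded checkpoint directory
case "$ckpt" in
  *qwen*) ;;                # name already contains 'qwen', nothing to do
  *) mv "$ckpt" "llava-next-interleave-qwen-7b"
     ckpt="llava-next-interleave-qwen-7b" ;;
esac
echo "$ckpt"
```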
|
|
|
## Evaluation |
|
|
|
### Preparation |
|
|
|
Please download the evaluation data and its metadata from the following links: |
|
|
|
1. **llava-interleave-bench:** [here](https://huggingface.co/datasets/lmms-lab/llava-interleave-bench). |
|
|
|
Unzip `eval_images.zip`; it contains two folders, `Split1` and `Split2`.

Organize the downloaded data into the following structure:
|
```
interleave_data
├── Split1
│   ├── ...
│   └── ...
├── Split2
│   ├── ...
│   └── ...
├── multi_image_in_domain.json
├── multi_image_out_domain.json
└── multi_view_in_domain.json
```
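The layout above can be sanity-checked with a short loop (a self-contained sketch: the `mkdir`/`touch` lines stand in for the real downloaded data and should be dropped when checking an actual `interleave_data` directory):

```shell
# Stand-in for the downloaded and unzipped data (remove for real use)
mkdir -p interleave_data/Split1 interleave_data/Split2
touch interleave_data/multi_image_in_domain.json \
      interleave_data/multi_image_out_domain.json \
      interleave_data/multi_view_in_domain.json

# Report any entry missing from the expected layout
for entry in Split1 Split2 multi_image_in_domain.json \
             multi_image_out_domain.json multi_view_in_domain.json; do
  [ -e "interleave_data/$entry" ] || echo "missing: interleave_data/$entry"
done
```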
|
|
|
### Inference and Evaluation |
|
Example: first, in `scripts/interleave/eval_all.sh`, replace `/path/to/ckpt` with the path to your checkpoint and `/path/to/images` with the path to `interleave_data`, then run:
|
```bash
bash scripts/interleave/eval_all.sh
```
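The two path edits can also be made non-interactively with `sed` (a sketch: it assumes the script contains the literal placeholder strings `/path/to/ckpt` and `/path/to/images`, and the destination paths below are illustrative; the function here applies the substitutions to stdin, while `sed -i ... scripts/interleave/eval_all.sh` would edit the file in place):

```shell
# Sketch: the two placeholder substitutions as sed expressions
edit_paths() {
  sed -e "s|/path/to/ckpt|$HOME/checkpoints/llava-next-interleave-qwen-7b|g" \
      -e "s|/path/to/images|$HOME/data/interleave_data|g"
}

# Demonstrate on a line that uses one of the placeholders
echo 'CKPT=/path/to/ckpt' | edit_paths
```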
|
|
|
|