File size: 1,397 Bytes
a9e1e1a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 |
# LLaVA-NeXT: Tackling Multi-image, Video, and 3D in Large Multimodal Models
## Contents
- [Demo](#demo)
- [Evaluation](#evaluation)
## Demo
> make sure you installed the LLaVA-NeXT model files via outside REAME.md
1. **Example model:** `lmms-lab/llava-next-interleave-7b`
To run a demo, execute:
```bash
# If you find any bug when running the demo, please make sure checkpoint path contains 'qwen'.
# You can try command like 'mv llava-next-interleave-7b llava-next-interleave-qwen-7b'
python playground/demo/interleave_demo.py --model_path path/to/ckpt
```
## Evaluation
### Preparation
Please download the evaluation data and its metadata from the following links:
1. **llava-interleave-bench:** [here](https://huggingface.co/datasets/lmms-lab/llava-interleave-bench).
Unzip eval_images.zip and there are Split1 and Split2 in it.
Organize the downloaded data into the following structure:
```
interleave_data
βββ Split1
β βββ ...
β βββ ...
|
βββ Split2
| βββ ...
β βββ ...
βββ multi_image_in_domain.json
βββ multi_image_out_domain.json
βββ multi_view_in_domain.json
```
### Inference and Evaluation
Example:
Please first edit /path/to/ckpt to the path of checkpoint, /path/to/images to the path of "interleave_data" in scripts/interleave/eval_all.sh and then run
```bash
bash scripts/interleave/eval_all.sh
```
|