|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-VL-7B-Instruct |
|
|
tags: |
|
|
- vision |
|
|
- llm |
|
|
- critical |
|
|
- sft |
|
|
- d3.js |
|
|
- visualization |
|
|
--- |
|
|
|
|
|
# VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation |
|
|
|
|
|
[GitHub Repo](https://github.com/bopan3/VIS-Shepherd) |
|
|
|
|
|
 |
|
|
|
|
|
This repository is the official implementation of **VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation**. |
|
|
|
|
|
|
|
|
## Requirements |
|
|
|
|
|
### Common Dependencies |
|
|
|
|
|
#### Python Environment Setup
|
|
To install the required Python packages (we recommend Python 3.10):
|
|
|
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
We recommend installing the dependencies inside a virtual environment, e.g. conda or venv.
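For example, a typical conda-based setup might look like the following; the environment name `vis-shepherd` is just an illustration:

```bash
# Create and activate an isolated environment (conda shown; venv works similarly)
conda create -n vis-shepherd python=3.10 -y
conda activate vis-shepherd
pip install -r requirements.txt
```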
|
|
|
|
|
#### LLaMA-Factory |
|
|
|
|
|
We use [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) for training and model inference. To reproduce our training experiments, install it by following the instructions in that repository:
|
|
|
|
|
```bash |
|
|
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git |
|
|
cd LLaMA-Factory |
|
|
# More optional dependencies can be found at https://llamafactory.readthedocs.io/en/latest/getting_started/installation.html |
|
|
pip install -e ".[torch,metrics,deepspeed]" |
|
|
``` |
|
|
|
|
|
## Training |
|
|
The training dataset is available at *train/data/viscrafter_20250521.json*, in the following format:
|
|
```json |
|
|
[
  {
    "input": "the input instruction",
    "output": "the output response",
    "images": [
      "the image path"
    ]
  }
]
|
|
``` |
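As a sanity check before training, the format above can be validated with a short script. This is a minimal sketch: the required field names (`input`, `output`, `images`) come from the example above, and the `validate_dataset` helper is hypothetical, not part of the repository.

```python
import json

REQUIRED_KEYS = {"input", "output", "images"}

def validate_dataset(records):
    """Check that each record carries the expected SFT fields."""
    for i, rec in enumerate(records):
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
        if not isinstance(rec["images"], list):
            raise TypeError(f"record {i}: 'images' must be a list of paths")
    return len(records)

# Usage:
#   with open("train/data/viscrafter_20250521.json", encoding="utf-8") as f:
#       print(validate_dataset(json.load(f)), "records OK")
```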
|
|
To train the model(s) in the paper, run this command at the project root:
|
|
|
|
|
```bash |
|
|
llamafactory-cli train train/configs/train-sft-full-viscrafter-20250521.yml |
|
|
``` |
|
|
|
|
|
We trained the model on 8 A800 GPUs (80 GB memory) using DeepSpeed. See the [LLaMA-Factory documentation](https://llamafactory.readthedocs.io/en/latest/advanced/arguments.html) for additional arguments to adapt the training parameters to your environment.
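For reference, a full-parameter SFT config for this setup might look roughly like the sketch below. The field names are standard LLaMA-Factory arguments, but the concrete values are illustrative, not the exact settings used in our experiments; consult *train/configs/train-sft-full-viscrafter-20250521.yml* for the real configuration.

```yaml
### model
model_name_or_path: Qwen/Qwen2.5-VL-7B-Instruct

### method
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json  # ZeRO-3 for multi-GPU training

### dataset
dataset: viscrafter_20250521  # must be registered in LLaMA-Factory's dataset_info.json
template: qwen2_vl
cutoff_len: 4096

### output
output_dir: saves/qwen2.5-vl-7b/full/sft

### train (illustrative values)
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
bf16: true
```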
|
|
|
|
|
## Setup Local Inference Server |
|
|
|
|
|
You can start an OpenAI-API-compatible inference server with the following command and use it to test your model:
|
|
|
|
|
```bash |
|
|
llamafactory-cli api train/configs/infer-sft-full-viscrafter-20250521.yml |
|
|
``` |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
First, move to the evaluation folder and fill in your API base, API key, and the list of model names in *evaluation/config/config.yaml*. Note that we use Azure's API for GPT-4o, the local inference server for locally trained models, and OpenRouter for other models (e.g. llama-4-maverick).
|
|
```bash |
|
|
cd evaluation |
|
|
``` |
|
|
```yaml |
|
|
## config for OpenAI-compatible API access
|
|
OPENAI_API_BASE: "put your api base here" |
|
|
OPENAI_API_KEY: "put your api key here" |
|
|
OPENAI_API_MODEL_LIST: ["gpt-4o", "qwen/qwen-2.5-vl-7b-instruct", "qwen/qwen2.5-vl-72b-instruct", "meta-llama/llama-4-maverick"] |
|
|
OPENAI_TEMPERATURE: 0.01 |
|
|
OPENAI_TOP_P: 0.1 |
|
|
``` |
|
|
|
|
|
To run inference on the test dataset with a given model, execute the following command (set `--model_used` to the name of the model used as critic); the inference results are saved automatically under *critic_outputs*:
|
|
```bash |
|
|
python run_parallel_autoCritic.py --input_base_path test_set --output_base_path critic_outputs --model_used "The name of the LLM used as critic" |
|
|
``` |
|
|
|
|
|
To run automatic evaluation on all inference results under *critic_outputs*, execute:
|
|
```bash |
|
|
./run_all_autoEvaluate.sh |
|
|
``` |
|
|
|
|
|
The evaluation results will be saved to *evaluation/result.md*.
|
|
|
|
|
|
|
|
## Results |
|
|
|
|
|
| Model | Mean Score | % Scores 3-5 | |
|
|
|-------|------------|-------------| |
|
|
| GPT-4o | 3.41 | 72.0% | |
|
|
| VIS-Shepherd | 2.98 | 67.1% | |
|
|
| Llama-4-Maverick | 2.94 | 52.8% | |
|
|
| Qwen-2.5-VL-72B | 2.78 | 49.1% | |
|
|
| Qwen-2.5-VL-7B_1.2k | 2.5 | 52.2% |
| Qwen-2.5-VL-7B_0.3k | 2.4 | 44.1% |
| Qwen-2.5-VL-7B | 2.2 | 44.1% |