# TheoremQA

```bash
python3 run.py --models hf_internlm2_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug
python3 run.py --models hf_internlm2_chat_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug
```

## Base Models

| model | TheoremQA |
|:------------------------:|------------:|
| llama-7b-turbomind | 10.25 |
| llama-13b-turbomind | 11.25 |
| llama-30b-turbomind | 14.25 |
| llama-65b-turbomind | 15.62 |
| llama-2-7b-turbomind | 12.62 |
| llama-2-13b-turbomind | 11.88 |
| llama-2-70b-turbomind | 15.62 |
| llama-3-8b-turbomind | 20.25 |
| llama-3-70b-turbomind | 33.62 |
| internlm2-1.8b-turbomind | 10.50 |
| internlm2-7b-turbomind | 21.88 |
| internlm2-20b-turbomind | 26.00 |
| qwen-1.8b-turbomind | 9.38 |
| qwen-7b-turbomind | 15.00 |
| qwen-14b-turbomind | 21.62 |
| qwen-72b-turbomind | 27.12 |
| qwen1.5-0.5b-hf | 5.88 |
| qwen1.5-1.8b-hf | 12.00 |
| qwen1.5-4b-hf | 13.75 |
| qwen1.5-7b-hf | 4.25 |
| qwen1.5-14b-hf | 12.62 |
| qwen1.5-32b-hf | 26.62 |
| qwen1.5-72b-hf | 26.62 |
| qwen1.5-moe-a2-7b-hf | 7.50 |
| mistral-7b-v0.1-hf | 17.00 |
| mistral-7b-v0.2-hf | 16.25 |
| mixtral-8x7b-v0.1-hf | 24.12 |
| mixtral-8x22b-v0.1-hf | 36.75 |
| yi-6b-hf | 13.88 |
| yi-34b-hf | 24.75 |
| deepseek-7b-base-hf | 12.38 |
| deepseek-67b-base-hf | 21.25 |

## Chat Models

| model | TheoremQA |
|:-----------------------------:|------------:|
| qwen1.5-0.5b-chat-hf | 9.00 |
| qwen1.5-1.8b-chat-hf | 9.25 |
| qwen1.5-4b-chat-hf | 13.88 |
| qwen1.5-7b-chat-hf | 12.25 |
| qwen1.5-14b-chat-hf | 13.63 |
| qwen1.5-32b-chat-hf | 19.25 |
| qwen1.5-72b-chat-hf | 22.75 |
| qwen1.5-110b-chat-hf | 17.50 |
| internlm2-chat-1.8b-hf | 13.63 |
| internlm2-chat-1.8b-sft-hf | 12.88 |
| internlm2-chat-7b-hf | 18.50 |
| internlm2-chat-7b-sft-hf | 18.75 |
| internlm2-chat-20b-hf | 23.00 |
| internlm2-chat-20b-sft-hf | 25.12 |
| llama-3-8b-instruct-hf | 19.38 |
| llama-3-70b-instruct-hf | 36.25 |
| llama-3-8b-instruct-lmdeploy | 19.62 |
| llama-3-70b-instruct-lmdeploy | 34.50 |
| mistral-7b-instruct-v0.1-hf | 12.62 |
| mistral-7b-instruct-v0.2-hf | 11.38 |
| mixtral-8x7b-instruct-v0.1-hf | 26.00 |
