This is our script to evaluate 32k long context task You need to first install dependencies from requirement.txt ``` pip install -r requirement.txt ``` Then you need to configure the model_path and data_home in *eval_retro_vllm.sh* and then run the following command ``` bash run_eval_vllm.sh ``` to get the corresponding score