More detailed usage examples are needed.

by lycfight - opened May 23

I couldn’t find any documentation explaining how to use this model. I used OpenHands to generate an output.jsonl file for SWE-bench_Verified, and now I’d like to evaluate it using all-hands/openhands-critic-32b-exp-20250417.

It seems that this model requires a specific version of vLLM, but I’m not sure which version or how to run it. Also, should the evaluation be performed on the final patch only, or on the entire trajectory?

Could you provide more detailed documentation and examples?
