Batch size 'auto' leads to hanging jobs
#1110 by gcamp - opened
Hello,
I am trying to reproduce the leaderboard evaluation for Qwen2.5-72B on 8 H100 GPUs.
I am noticing some differences when running with an explicit batch size value (as expected), but when running with 'auto' the evaluation job hangs while computing the batch size. How can I overcome this problem?
Also, for a model like this, how do you set your parallelization in accelerate (num_processes, etc.)?
Thanks