@ajinkya-tejankar in our private experimentation, we tried hacking FSDP2 into accelerate and tested it with colocate. I believe a few issues remain:

1. TRL's weight loading code only works with FSDP1, I believe.
2. FSDP1 has a NaN problem. I filed a bug report a while back: https://github.com/vllm-project/vllm/issues/14443

See the previous discussion here:
https://github.com/huggingface/trl/pull/3317#issuecomment-2842576427