The generation will depend on your GPU architecture. Let's compare the execution of a openai-community/gpt2 language model training over a small sample of wikitext. The results are: | NVlink | Time | | ----- | ---: | | Y | 101s | | N | 131s | You can see that NVLink completes the training ~23% faster.