Query expansion
This model does not do query expansion using mask tokens: https://huggingface.co/lightonai/GTE-ModernColBERT-v1/blob/main/config_sentence_transformers.json#L49
Is this intended? In previous versions of pylate the model did do query expansion (I'm not sure which versions exactly)
Hey,
Sorry for the delayed answer, did not get the notification!
Essentially, ColBERT models using Flash Attention do not perform query expansion: with FA, the embeddings of padding tokens are zeros, so they never add anything to the MaxSim computation, see here. This was not a big issue per se, it only limited the potential of these models. It became one when people started using the FA models on CPU, which re-enabled query expansion even though the model was not trained for it, leading to poor results, see this issue.
I thus decided to properly add a parameter to choose whether or not to use query expansion, so that models trained with FA can also be used on CPU, here.
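To make the zero-padding point concrete, here is a minimal pure-Python sketch of a MaxSim score. It is not PyLate's implementation: the `maxsim` helper and all embedding values are made up for illustration. It only shows that zero vectors in the padded query positions leave the score unchanged, while non-zero vectors in those positions (as you would get for mask tokens without FA) change it.

```python
def maxsim(query_embs, doc_embs):
    """Sum, over query tokens, of the max dot product with any document token."""
    score = 0.0
    for q in query_embs:
        score += max(sum(a * b for a, b in zip(q, d)) for d in doc_embs)
    return score


# Toy 2-d embeddings (hypothetical values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.8, 0.6], [0.6, 0.8]]

# Padded query positions: zeros (the FA behaviour described above)
# versus non-zero vectors (hypothetical mask-token embeddings).
zero_pad = [[0.0, 0.0], [0.0, 0.0]]
mask_pad = [[0.6, 0.8], [0.6, 0.8]]

base = maxsim(query, doc)
with_zero_pad = maxsim(query + zero_pad, doc)
with_mask_pad = maxsim(query + mask_pad, doc)

print(base == with_zero_pad)  # zero vectors contribute nothing to the score
print(with_mask_pad > base)   # non-zero expansion tokens shift the score
```

A model trained with zero padding never saw those extra terms in its scores, which is why turning them back on at inference time can hurt results.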
TL;DR: any ColBERT model trained with FA was not using query expansion when run on GPU, but was using it when run on CPU.
If you ran evaluations with FA models on CPU before this fix, the results are probably worse than they should have been.
Got it, cheers for the clarification!