Query expansion

#3
by fschlatt - opened

This model does not do query expansion using mask tokens: https://huggingface.co/lightonai/GTE-ModernColBERT-v1/blob/main/config_sentence_transformers.json#L49

Is this intended? In previous versions of pylate, the model did perform query expansion (I'm not sure which versions exactly).

LightOn AI org

Hey,
Sorry for the delayed answer, I did not get the notification!
Essentially, ColBERT models using Flash Attention (FA) do not perform query expansion: with FA, the embeddings of the padding tokens are zeros, so they never add anything to the MaxSim computation, see here. This was not a big issue per se, as it only limited the potential of these models. It became one when people started using FA models on CPU, which re-activated query expansion even though the model was not trained for it, leading to poor results, see this issue.
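
To make the effect concrete, here is a minimal pure-Python sketch of MaxSim scoring. The embedding values are made up for illustration and the helper names (`dot`, `maxsim`) are not taken from PyLate; the only point is that all-zero query positions add nothing to the score, while non-zero ones do.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_embs, doc_embs):
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)


# Toy 2-dimensional embeddings (illustrative values, not real model outputs).
doc = [[1.0, 0.0], [0.0, 1.0]]
query = [[1.0, 0.0]]            # one real query token

zero_pad = [[0.0, 0.0]] * 3     # padding positions zeroed out, as described for FA
live_pad = [[0.5, 0.5]] * 3     # padding positions with non-zero embeddings (assumed values)

score_real = maxsim(query, doc)             # real token only
score_fa = maxsim(query + zero_pad, doc)    # zero padding: same score as score_real
score_cpu = maxsim(query + live_pad, doc)   # non-zero padding: score changes
```

With zeroed padding the score equals that of the real tokens alone. With non-zero padding every extra position contributes its own maximum, so a model that never saw those contributions during training gets scored differently at inference.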

I therefore added a proper parameter to choose whether or not to use query expansion, so that models trained with FA can be used on CPU, here.
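
As a rough illustration of what such a switch controls, the sketch below pads a query to a fixed length with mask tokens and either attends to them (expansion on) or masks them out (expansion off). The function `prepare_query`, its `expand` flag, and the token ids are hypothetical and are not PyLate's actual API.

```python
def prepare_query(token_ids, query_length, mask_id, expand):
    """Pad a query to `query_length` with `mask_id` and build its attention mask.

    Hypothetical helper for illustration only:
    expand=True  -> mask tokens are attended to (query expansion).
    expand=False -> mask tokens are treated as plain padding and ignored.
    """
    real = token_ids[:query_length]
    n_pad = query_length - len(real)
    padded = real + [mask_id] * n_pad
    if expand:
        attention = [1] * query_length
    else:
        attention = [1] * len(real) + [0] * n_pad
    return padded, attention


# Assumed token ids; 103 stands in for a mask token id.
ids_on, attn_on = prepare_query([5, 6], 4, 103, expand=True)
ids_off, attn_off = prepare_query([5, 6], 4, 103, expand=False)
```

Both calls produce the same padded ids; only the attention mask differs, and that difference decides whether the padded positions can influence the final score.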

TL;DR: any ColBERT model trained with FA was not using query expansion when run on GPU, but was using it when run on CPU.
If you ran evaluations with FA models on CPU before this fix, the results are likely worse than they should have been.

Got it, cheers for the clarification!

fschlatt changed discussion status to closed
