Robustness in Both Domains: CLIP Needs a Robust Text Encoder • Paper • 2506.03355 • Published 7 days ago
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens • Paper • 2506.03096 • Published 7 days ago