Some questions about the model

by hookzeng - opened 5 days ago

5 days ago

I checked and found negative scores in my rerank list. Are the scores not normalized to 0-1? If I want a threshold to filter out completely irrelevant chunks, what score do you suggest?
There is no special token in the first Q and D. If the query is too long, will it interfere with the first doc_emb?
Can I change the prompt if I only use it for Chinese?

Jina AI org 4 days ago

Glad to hear from you. Regarding your concerns:

The returned score is the cosine similarity (in [-1, 1]) between the query and document embeddings, so negative values are expected. For the threshold, I would suggest choosing a positive value based on your own data (> 0.2 is a usual default).
I could not understand your question. What do you mean by "no special token in first Q and D"?
The current prompt works across languages, so you don't need to change it.
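
To make the threshold suggestion concrete, here is a minimal sketch in plain Python. It assumes you already have one cosine score per chunk from the reranker; the names `cosine_similarity` and `filter_chunks` and the sample scores are hypothetical and only for illustration, not part of the model's API. The 0.2 default is the starting point mentioned above and should be tuned on your own data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; the result lies in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filter_chunks(chunks, scores, threshold=0.2):
    """Keep chunks whose score is strictly above the threshold, best first.

    The 0.2 default is only a starting point (assumption); tune it on your data.
    """
    kept = [(chunk, score) for chunk, score in zip(chunks, scores) if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Made-up scores for illustration: negative values are valid cosine scores.
chunks = ["chunk A", "chunk B", "chunk C"]
scores = [0.63, -0.11, 0.18]
print(filter_chunks(chunks, scores))  # [('chunk A', 0.63)]
```

If you need scores in [0, 1] for display, `(score + 1) / 2` is a simple linear rescaling, but the threshold then has to be rescaled in the same way (0.2 becomes 0.6).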
