Some questions about the model
#2 opened by hookzeng
- I checked and found negative scores in my rerank list. Are the scores not normalized to 0-1? If I want a threshold to filter out completely irrelevant chunks, which score would you suggest?
- There is no special token in the first Q and D. If the query is too long, will it interfere with the first doc_emb?
- Can I change the prompt if I only use the model for Chinese?
Glad to hear from you. Regarding your concerns:
- The returned score is the cosine similarity between the query embedding and the document embedding, so it lies in [-1, 1] rather than [0, 1], and negative values are expected. For the threshold, I would suggest choosing a positive value (you can usually start with > 0.2 as a default) and tuning it on your own data.
- I could not follow your question. What do you mean by "no special token in first Q and D"?
- The current prompt works for multilingual input, so you don't need to change it.
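To make the first point concrete, here is a minimal sketch of why cosine scores can be negative and how a positive cutoff would be applied. It does not call the model: the two-dimensional vectors are toy values, and `cosine_similarity` and `filter_by_threshold` are hypothetical helper names chosen for illustration. The 0.2 default is only the starting point suggested above and should be tuned on your own data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; always in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filter_by_threshold(scored_chunks, threshold=0.2):
    """Keep only (chunk, score) pairs whose score exceeds the threshold."""
    return [(chunk, score) for chunk, score in scored_chunks if score > threshold]


# Toy embeddings (assumption: real ones come from the model, not from here).
query_emb = [1.0, 0.0]
doc_embs = {
    "relevant": [0.9, 0.1],     # points almost the same way as the query
    "orthogonal": [0.0, 1.0],   # unrelated direction, score 0
    "opposite": [-1.0, 0.0],    # opposite direction, score -1
}

scored = [(name, cosine_similarity(query_emb, emb)) for name, emb in doc_embs.items()]
kept = filter_by_threshold(scored)

for name, score in scored:
    print(f"{name}: {score:.3f}")
print("kept:", [name for name, _ in kept])
```

A score near 0 or below means the chunk points in an unrelated or opposite direction from the query, which is why a small positive cutoff removes clearly irrelevant chunks without rescaling the scores to 0-1.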