query_instruction and doc_instruction
In the model card example, these two instructions are empty.
However, in the BRIGHT evaluation code, query_instruction is "<|user|>\nGiven a {task} post, retrieve relevant passages that help answer the post\n<|embed|>\n" and doc_instruction is "<|embed|>\n".
Is the <|embed|> token in the instruction important (I guess it is if training always uses this token)? Or should empty instructions work just fine?
Thank you!
Hi Arvin,
Thanks for the question! We provided empty instructions for query and doc in the model card as a simple example. We haven't ablated the impact of these special tokens and instructions.
Our model's training was based on GritLM's representation learning, so it naturally inherits these formatting tokens (https://github.com/ContextualAI/gritlm/blob/main/gritlm/training/run.py#L23). We therefore also used them during evaluation for consistency, as GRIT did.
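For anyone who wants to reproduce the evaluation-style formatting, here is a minimal sketch of how the two instruction strings can be assembled. The templates are copied verbatim from the question above; the helper names (build_query_instruction, build_doc_instruction) and the "biology" task value are hypothetical placeholders for illustration, not part of the evaluation code.

```python
# Templates copied from the question in this thread.
# The helper names below are hypothetical, chosen only for this sketch.
QUERY_TEMPLATE = (
    "<|user|>\n"
    "Given a {task} post, retrieve relevant passages that help answer the post\n"
    "<|embed|>\n"
)
DOC_INSTRUCTION = "<|embed|>\n"


def build_query_instruction(task: str) -> str:
    """Fill the task name into the query template."""
    return QUERY_TEMPLATE.format(task=task)


def build_doc_instruction() -> str:
    """Documents only receive the embed marker, with no task description."""
    return DOC_INSTRUCTION


# "biology" is an assumed example task name; substitute your own split.
query_instruction = build_query_instruction("biology")
doc_instruction = build_doc_instruction()
print(repr(query_instruction))
print(repr(doc_instruction))
```

Both strings end with "<|embed|>\n", so they should mark the start of the embedded text in the same way the GritLM-style training format does. They would be passed wherever the model card example currently passes empty instructions.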
Hope it helps!
Hi @qiaoruiyt
Thanks for the clarification.
Just FYI, I tried both an empty instruction and your BRIGHT instruction myself. The empty instruction gives much worse results.
Arvin