query_instruction and doc_instruction

by ArvinZhuang - opened 29 days ago

In the model card example, these two instructions are empty.
However, in the BRIGHT evaluation code, query_instruction is "<|user|>\nGiven a {task} post, retrieve relevant passages that help answer the post\n<|embed|>\n" and doc_instruction is "<|embed|>\n".
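
To make the two strings concrete, here is a minimal sketch that builds them exactly as quoted above. The constant names, the function name, and the example task value "biology" are placeholders for illustration and are not taken from the evaluation code.

```python
# Strings copied from the evaluation code as quoted in this thread.
# The names below are illustrative placeholders, not identifiers from the repo.
QUERY_TEMPLATE = (
    "<|user|>\n"
    "Given a {task} post, retrieve relevant passages that help answer the post\n"
    "<|embed|>\n"
)
DOC_INSTRUCTION = "<|embed|>\n"


def build_query_instruction(task: str) -> str:
    """Fill the {task} slot of the query instruction template."""
    return QUERY_TEMPLATE.format(task=task)


if __name__ == "__main__":
    # "biology" is only an example value for the {task} slot.
    print(repr(build_query_instruction("biology")))
    print(repr(DOC_INSTRUCTION))
```

Note that even the document side is not truly empty here: it still carries the <|embed|> marker, which differs from the empty strings in the model card example.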

Is the <|embed|> token in the instruction important (I guess it is, if training always uses this token)? Or should empty instructions work just fine?

Thank you!

ArvinZhuang changed discussion title from query_instruction and document_instruction to query_instruction and doc_instruction 29 days ago

qiaoruiyt

ReasonIR org 29 days ago

Hi Arvin,

Thanks for the question! We provided empty instructions for query and doc in the model card as a simple example. We haven't ablated the impact of these special tokens and instructions.

Our model training was based on GritLM's representation learning, so it naturally comes with these formatting tokens (https://github.com/ContextualAI/gritlm/blob/main/gritlm/training/run.py#L23). Therefore, we also used them during evaluation for consistency, the same as what GRIT did.
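
As a rough illustration of that formatting, the sketch below wraps an optional instruction in the <|user|> and <|embed|> markers and prepends the result to the input text. The helper names are hypothetical and the code is not copied from the GritLM or ReasonIR repositories; it only reproduces the string layout quoted earlier in this thread, and it assumes the instruction is applied as a plain string prefix.

```python
# Marker strings as they appear in the instructions quoted in this thread.
USER_PREFIX = "<|user|>\n"
EMBED_MARKER = "<|embed|>\n"


def wrap_instruction(instruction: str = "") -> str:
    """Return the formatted prefix for a (possibly empty) instruction.

    Hypothetical helper: with no instruction only the embed marker is
    emitted, which matches the doc_instruction quoted above.
    """
    if instruction:
        return f"{USER_PREFIX}{instruction}\n{EMBED_MARKER}"
    return EMBED_MARKER


def format_input(text: str, instruction: str = "") -> str:
    """Prepend the formatted prefix to the text (assumed: simple prefixing)."""
    return wrap_instruction(instruction) + text


if __name__ == "__main__":
    query_task = (
        "Given a biology post, retrieve relevant passages "
        "that help answer the post"
    )
    print(repr(format_input("Why do leaves change color?", query_task)))
    print(repr(format_input("Chlorophyll breaks down in autumn.")))
```

With this layout, passing no instruction still yields the "<|embed|>\n" prefix, so it is not the same as passing a truly empty string to the encoder.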

Hope it helps!

ArvinZhuang

28 days ago

Hi @qiaoruiyt

Thanks for the clarification.
Just FYI, I tried both the empty instructions and your BRIGHT instructions myself. The empty instructions give much worse results.

Arvin

ArvinZhuang changed discussion status to closed 28 days ago
