how to set context in multi-turn QA?
From the model card, multi-turn QA is performed on the single context which is provided in {system}....
.
In real use cases, new context might be available for a new question.
Hi,
We use "\n\n" to connect multiple context. In other words, the format would be {context_1}\n\n{context_2}\n\n{context_3} ...
I mean, how to provide a new context when asking a new question?
i see. need to just simply replace the old context with the new context when asking a new question.
So, I need to do this, which is not intuitive:
System: {System}
{Context1}
{Context2}
User: {Question1 relevant to Context1 }
Assistant: {Response}
User: {Question2 relevant to Context2}
Assistant:
It would be much better if I can use it like this:
System: {System}
User: {Question1}
{Context1}
Assistant: {Response}
User: {Question2}
{Context2}
Assistant:
oh not like this. you should remove {Context1} and do this instead:
System: {System}
{Context2}
User: {Question1 relevant to Context1}
Assistant: {Response}
User: {Question2 relevant to Context2}
Assistant:
The retriever will consider both question1 and question2 in the conversation when retrieving the relevant context for question2 (i.e., {Context2})
It is clear after the Dragon-multiturn retriever is involved.
Anyway, this looks like weird to me. My assumption is that LLM generates output incrementally: when a new question and its context are appended, the generation continues, which is compute-friendly. While with ChatQA, when a new question is raised, LLM needs to re-process a whole new chat history before generating answer to the latest question, which is suitable for a LLM server but not suitable for single user, local interference.
I suggest updating the model card to clarify multi-turn conversations.