Possible contamination of question data

#8
by wowthecoder - opened

The vector database stores embeddings of the GAIA validation set questions together with their answers. However, this set includes the questions used to calculate the score for the final hands-on of the agents course, which means the agent is not actually using any of its tools to solve the problems. In effect, this functions as an LLM with RAG rather than an AI agent: it simply retrieves the answer from the database instead of browsing the web, executing code, etc.
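One rough way to check for this kind of overlap is to compare the questions in the store against the evaluation questions. The snippet below is only a minimal sketch: the function names and the sample questions are hypothetical placeholders, not taken from this repo or from GAIA, and it only catches near-verbatim copies (exact match after normalization), not paraphrases.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivial edits don't hide a match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_leaked_questions(stored_questions, eval_questions):
    """Return the evaluation questions whose normalized text also appears in the store."""
    stored = {normalize(q) for q in stored_questions}
    return [q for q in eval_questions if normalize(q) in stored]


# Hypothetical placeholder data; in practice these would be the texts indexed
# in the vector database and the questions used for scoring.
stored_questions = [
    "What is the capital of France?",
    "How many moons does Mars have?",
]
eval_questions = [
    "what is the capital of France",
    "Who wrote the novel Dune?",
]

leaked = find_leaked_questions(stored_questions, eval_questions)
print(f"{len(leaked)} of {len(eval_questions)} evaluation questions are already in the store")
```

If most or all of the scoring questions show up in the store, the score mostly reflects retrieval of stored answers rather than tool use.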
