RAG system configuration question

#57
by Day1Kim - opened

Hi.

I'm trying to configure a RAG system. Is there an example of loading a PDF document, extracting its text and images, and then embedding them as vectors with the corresponding embedding model?

Jina AI org

Hi @Day1Kim, unfortunately we don't have any RAG examples for now.

> Is there an example of loading a pdf document, extracting text and images

One unique feature of jina-embeddings-v4 is that you can encode PDFs without extracting text or images first. You could convert each PDF page to an image using pdf2image or a similar tool, and encode each page image directly. Remember to use the "retrieval" adapter for this task.
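
In case it helps, here is a minimal sketch of that approach, not an official example. It assumes pdf2image (which needs poppler installed) and that the model is loaded through transformers with trust_remote_code=True, exposing encode_image and encode_text with a task argument as described on the model card; please check the card for the exact signatures. The file name report.pdf, the query string, and the helper names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pdf_to_page_images(pdf_path: str, dpi: int = 150):
    """Render every PDF page to a PIL image (pdf2image requires poppler)."""
    from pdf2image import convert_from_path
    return convert_from_path(pdf_path, dpi=dpi)


def load_model():
    """Load jina-embeddings-v4; its custom code needs trust_remote_code=True."""
    from transformers import AutoModel
    return AutoModel.from_pretrained(
        "jinaai/jina-embeddings-v4", trust_remote_code=True
    )


def to_floats(vec) -> List[float]:
    """Convert a tensor/array (or plain sequence) to a list of Python floats."""
    values = vec.tolist() if hasattr(vec, "tolist") else vec
    return [float(x) for x in values]


def embed_pages(model, pages) -> List[List[float]]:
    # Assumption: encode_image(images=..., task=...) as shown on the model card.
    return [to_floats(v) for v in model.encode_image(images=pages, task="retrieval")]


def embed_query(model, query: str) -> List[float]:
    # Assumption: encode_text(texts=..., task=..., prompt_name=...) per the model card.
    out = model.encode_text(texts=[query], task="retrieval", prompt_name="query")
    return to_floats(out[0])


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_pages(
    query_vec: Sequence[float],
    page_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (1-based page number, score) pairs, best match first."""
    scored = [(i + 1, cosine(query_vec, v)) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    model = load_model()
    pages = pdf_to_page_images("report.pdf")  # hypothetical file name
    page_vecs = embed_pages(model, pages)
    query_vec = embed_query(model, "What was the revenue in Q3?")  # hypothetical query
    for page_no, score in rank_pages(query_vec, page_vecs):
        print(f"page {page_no}: {score:.3f}")
```

For a real RAG setup you would likely store the page vectors in a vector database rather than ranking them in memory, and pass the top-ranked page images (or their text) to the generator model.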
