RAG system configuration question

#57

by Day1Kim - opened 3 days ago


Hi.

I'm trying to set up a RAG system. Is there an example that loads a PDF document, extracts its text and images, and then embeds them with the corresponding embedding model?

jupyterjazz

Jina AI org 3 days ago

Hi @Day1Kim, unfortunately we don't have any RAG examples for now.

> Is there an example of loading a pdf document, extracting text and images

One unique feature of jina-embeddings-v4 is that it can encode PDFs without a separate text/image extraction step. You could convert each PDF page to an image using pdf2image or a similar tool, then encode each page image directly. Remember to use the "retrieval" adapter for this task.
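For anyone landing here later, a rough sketch of the page-as-image approach described above might look like the following. It is not an official example. It assumes pdf2image (with Poppler installed) for rendering, and it assumes the model is loaded through transformers with `trust_remote_code=True` and exposes `encode_image(images=..., task="retrieval")` and `encode_text(texts=..., task="retrieval", prompt_name="query")` as I understand the model card to describe; please check the model card for the current signatures. The helper names (`pdf_to_page_images`, `embed_pages`, `embed_query`, `top_pages`, `search_pdf`) are made up for illustration, and the pure-Python cosine ranking is only a stand-in for a real vector store.

```python
import math
from typing import List, Sequence, Tuple


def pdf_to_page_images(pdf_path: str, dpi: int = 150):
    """Render every PDF page to a PIL image (requires pdf2image and Poppler)."""
    from pdf2image import convert_from_path  # lazy import: optional dependency
    return convert_from_path(pdf_path, dpi=dpi)


def load_model(name: str = "jinaai/jina-embeddings-v4"):
    """Load the embedding model; its encode helpers come from remote code."""
    from transformers import AutoModel  # lazy import: heavy dependency
    return AutoModel.from_pretrained(name, trust_remote_code=True)


def _to_floats(vec) -> List[float]:
    """Convert a tensor, array, or plain sequence into a list of floats."""
    values = vec.tolist() if hasattr(vec, "tolist") else list(vec)
    return [float(x) for x in values]


def embed_pages(model, pages) -> List[List[float]]:
    """Embed page images with the "retrieval" adapter (assumed method name)."""
    vectors = model.encode_image(images=pages, task="retrieval")
    return [_to_floats(v) for v in vectors]


def embed_query(model, query: str) -> List[float]:
    """Embed a text query with the same adapter (assumed method name)."""
    vectors = model.encode_text(texts=[query], task="retrieval", prompt_name="query")
    return _to_floats(vectors[0])


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_pages(query_vec, page_vecs, k: int = 3) -> List[Tuple[int, float]]:
    """Return up to k (zero-based page index, score) pairs, best match first."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(page_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def search_pdf(pdf_path: str, query: str, k: int = 3) -> List[Tuple[int, float]]:
    """End-to-end: render pages, embed them, embed the query, rank pages."""
    model = load_model()
    page_vecs = embed_pages(model, pdf_to_page_images(pdf_path))
    return top_pages(embed_query(model, query), page_vecs, k=k)
```

Calling something like `search_pdf("document.pdf", "What is the warranty period?")` (both arguments are placeholders) would return the best-matching page indices, which you could then pass to your generator as page images or as text you extract separately. For more than a handful of documents you would likely store the page vectors in a vector database rather than ranking in Python.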
