VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Paper • 2505.22019 • Published • 10
Creators of models with the most cumulative new downloads each month (users only, no orgs)
author_model-name
./llama-perplexity -m /models/Mistral-Nemo-Instruct-2407-f32.gguf -f /training_data/wikitext-2-raw/wiki.test.raw --kl-divergence-base /training_data/mistral-nemo-f32.kld -ngl 35 -fa -sm row
./llama-perplexity -m /models/Mistral-Nemo-Instruct-2407-Q8_0.gguf -f /training_data/wikitext-2-raw/wiki.test.raw --kl-divergence-base /training_data/mistral-nemo-f32.kld --kl-divergence -ngl 50 -fa -sm row
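The two commands above appear to follow the usual two-pass pattern: the first run stores the f32 model's logits in the `.kld` file, and the second run compares the Q8_0 model against them. As a rough illustration of the metric being reported, here is a minimal sketch of per-token KL divergence computed from two logit vectors. It is not llama.cpp's implementation; the function names and the toy inputs are hypothetical, and it assumes both models share the same vocabulary ordering.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(base_logits, test_logits):
    """KL(P_base || P_test) in nats for a single token position."""
    log_p = log_softmax(base_logits)   # reference (e.g. f32) distribution
    log_q = log_softmax(test_logits)   # quantized model distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kl(base_rows, test_rows):
    """Average the per-token KL divergence over all evaluated positions."""
    values = [kl_divergence(b, t) for b, t in zip(base_rows, test_rows)]
    return sum(values) / len(values)
```

A value near zero means the quantized model's next-token distribution closely tracks the reference model's; larger values indicate more distortion from quantization.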
Llama-3-8B-instruct) to generate synthetic instructions and then fine-tune the base version (Llama-3-8B) on this dataset, you can improve even the it-tuned version
ollama models (initially phi and llama3) automatically and upload it to the Hugging Face Hub!