The Meta Llama-3.1 model series can be used for distillation and fine-tuning, but this requires annotated preference data, so I created a Human Feedback Collector based on Gradio that logs data directly to the Hugging Face Hub (a minimal sketch follows the list below).
- Model: meta-llama/Meta-Llama-3.1-8B-Instruct
- Data: SFT, KTO, and DPO data
- Runs on free ZeroGPU hardware in Hugging Face Spaces
- Might need some human curation in Argilla
- Or provide some AI feedback with distilabel
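
The actual collector isn't reproduced here, but a minimal sketch of the general pattern could look like the following: a Gradio app that generates two candidate completions with meta-llama/Meta-Llama-3.1-8B-Instruct inside a ZeroGPU-decorated function, asks the user which one is better, and persists DPO-style (prompt, chosen, rejected) records to a dataset repo on the Hub via huggingface_hub.CommitScheduler. The repo id my-username/llama-human-feedback, the UI layout, and the generation settings are placeholders, not the implementation of the Space itself.

```python
import json
import uuid
from pathlib import Path

import gradio as gr
import spaces  # ZeroGPU helper, available inside Hugging Face Spaces
import torch
from huggingface_hub import CommitScheduler
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# On ZeroGPU the model is loaded at startup; the GPU is attached when a
# @spaces.GPU-decorated function runs.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16).to("cuda")

# Local folder that CommitScheduler periodically pushes to a dataset repo on the Hub.
feedback_dir = Path("feedback")
feedback_dir.mkdir(exist_ok=True)
feedback_file = feedback_dir / f"{uuid.uuid4()}.jsonl"
scheduler = CommitScheduler(
    repo_id="my-username/llama-human-feedback",  # hypothetical dataset repo
    repo_type="dataset",
    folder_path=feedback_dir,
    every=5,  # commit every 5 minutes
)


@spaces.GPU  # allocate a ZeroGPU for the duration of the call
def generate(prompt: str) -> tuple[str, str]:
    """Generate two candidate completions so the user can pick the preferred one."""
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(
        inputs, max_new_tokens=256, do_sample=True, temperature=0.8, num_return_sequences=2
    )
    texts = tokenizer.batch_decode(outputs[:, inputs.shape[-1]:], skip_special_tokens=True)
    return texts[0], texts[1]


def log_preference(prompt: str, completion_a: str, completion_b: str, choice: str) -> str:
    """Append one DPO-style record (prompt, chosen, rejected) to the JSONL file."""
    chosen, rejected = (completion_a, completion_b) if choice == "A" else (completion_b, completion_a)
    with scheduler.lock:  # avoid writing while a commit is in progress
        with feedback_file.open("a") as f:
            f.write(json.dumps({"prompt": prompt, "chosen": chosen, "rejected": rejected}) + "\n")
    return "Feedback saved, thanks!"


with gr.Blocks() as demo:
    prompt = gr.Textbox(label="Prompt")
    out_a = gr.Textbox(label="Completion A")
    out_b = gr.Textbox(label="Completion B")
    gr.Button("Generate").click(generate, inputs=prompt, outputs=[out_a, out_b])
    choice = gr.Radio(["A", "B"], label="Which completion is better?")
    status = gr.Textbox(label="Status")
    gr.Button("Submit preference").click(
        log_preference, inputs=[prompt, out_a, out_b, choice], outputs=status
    )

demo.launch()
```

The resulting JSONL records can be loaded as a preference dataset for DPO (or collapsed into per-completion labels for KTO), and the same Hub dataset can be pulled into Argilla for human curation or extended with AI feedback via distilabel.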