Kyara: Knowledge Yielding Adaptive Retrieval Augmentation for LLM Fine-tuning


🤗 Hugging Face | 🚀 GitHub | 📑 Paper | 📖 English | 📖 Chinese | 💻 Kaggle Notebook


Kyara (Knowledge Yielding Adaptive Retrieval Augmentation) is an experimental project that aims to improve language models through knowledge-retrieval processes. The project seeks to enhance the model's ability to adapt knowledge and to improve language comprehension, particularly in underrepresented languages such as Traditional Chinese. Given the relative scarcity of Traditional Chinese data compared to the vast English corpora used for model training, Kyara addresses this gap by expanding the limited corpus available for the language.

This release is a preview of the Kyara-2.5 series. Compared to Kyara-1.5, this iteration incorporates a significantly larger volume of high-quality STEM content and challenging reasoning datasets. Additionally, it employs online reinforcement techniques for preference optimization to refine the model's performance.
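The model can be used like any other chat model on the Hub. Below is a minimal usage sketch assuming the standard `transformers` `AutoModelForCausalLM` / chat-template API; the helper function name and the example prompt are illustrative, not part of the official release.

```python
def chat(prompt: str, model_id: str = "zake7749/gemma-2-9b-it-chinese-kyara") -> str:
    """Generate a reply from the Kyara model for a single user prompt.

    Note: calling this downloads the full 9B-parameter checkpoint,
    so a GPU (or ample CPU memory) is strongly recommended.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # model weights are stored in BF16
        device_map="auto",
    )

    # Wrap the prompt in the chat format expected by the tokenizer.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True
    )
```

For example, `chat("請用繁體中文介紹台灣的夜市文化。")` asks the model (in Traditional Chinese) to describe Taiwan's night-market culture.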

Benchmark

All evaluations are conducted in a zero-shot setting.

| Metric | Kyara-9b-it | Gemma-2-9b-it |
|---|---:|---:|
| TMMLUPlus | 60.74 | 54.77 |
| &nbsp;&nbsp;- STEM | 69.54 | 58.11 |
| &nbsp;&nbsp;- Humanities | 52.64 | 48.71 |
| &nbsp;&nbsp;- Other | 57.10 | 51.43 |
| &nbsp;&nbsp;- Social-Science | 63.69 | 60.84 |
| MMLU-Redux | 73.04 | 72.82 |
| GSM8K | 90.37 | 87.41 |
| MATH-L5 | 31.35 | 19.42 |
| CRUX | 49.25 | 46.00 |
| MT-Bench | 8.81 | 8.53 |
| MT-Bench-TW | 8.36 | 7.80 |
| Chatbot-Arena-Hard | 43.90 | 33.60 |
Model size: 9.24B parameters · Tensor type: BF16 (Safetensors)

Model tree for zake7749/gemma-2-9b-it-chinese-kyara

Base model: google/gemma-2-9b
