Low-resource Vocabulary Expansion
Collection of models for "How Can We Effectively Expand the Vocabulary of LLMs with 0.01GB of Target Language Text?"
This model is built on top of atsuki-yamaguchi/gemma-2-9b-si-30K-align and uses the ElChat approach to mitigate catastrophic forgetting of the original capabilities of the source Gemma 2 model.
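ElChat mitigates forgetting by merging the vocabulary-adapted model back with the source model, which is what the "-merge" suffix in the model name reflects. The released checkpoint is already merged; purely as a rough sketch, the snippet below shows one simple merging scheme, linear interpolation of shape-matching parameters. The interpolation weight alpha, the shape-matching rule, and the output path are illustrative assumptions, not the published ElChat recipe.

from transformers import AutoModelForCausalLM

# Hypothetical sketch only: linear interpolation of parameters that are
# shared (same name and shape) between the source and adapted checkpoints.
src = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b")
tgt = AutoModelForCausalLM.from_pretrained(
    "atsuki-yamaguchi/gemma-2-9b-si-30K-align"
)

alpha = 0.5  # assumed interpolation weight, not from the paper
src_state = src.state_dict()
merged = tgt.state_dict()
for name, param in merged.items():
    # Embedding and output matrices change shape after vocabulary expansion,
    # so only shape-matching parameters are interpolated.
    if name in src_state and src_state[name].shape == param.shape:
        merged[name] = alpha * src_state[name] + (1 - alpha) * param
tgt.load_state_dict(merged)
tgt.save_pretrained("gemma-2-9b-si-30K-merge-sketch")  # assumed output path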
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model and its expanded tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "atsuki-yamaguchi/gemma-2-9b-si-30K-align-merge"
)
tokenizer = AutoTokenizer.from_pretrained(
    "atsuki-yamaguchi/gemma-2-9b-si-30K-align-merge"
)
Citation
@article{yamaguchi-etal-2024-effectively,
  title={How Can We Effectively Expand the Vocabulary of LLMs with 0.01GB of Target Language Text?},
  author={Atsuki Yamaguchi and Aline Villavicencio and Nikolaos Aletras},
  journal={ArXiv},
  volume={abs/2406.11477},
  year={2024},
  url={https://arxiv.org/abs/2406.11477}
}
Base model
google/gemma-2-9b