Summary

This is a Gemma 2 Baku LoRA, created using the VNTL 3.1 dataset. Its purpose is to improve Gemma's performance at translating Japanese visual novels into English.

Notes

Recently, rinna released the Gemma2 Baku 2B model, pretrained on a substantial 80 billion tokens(!). After testing, I found its performance quite impressive for a 2B model, so I decided to create this fine-tune (it only took 30 minutes, which is nice). However, I opted to remove the chat mode from this model, as I wasn't sure whether a 2B model could handle both chat and translation effectively.

Training Details

This model was trained using the same hyperparameters as the VNTL LLaMA3 8B QLoRA:

Rank: 128
Alpha: 32
Effective Batch Size: 30
Warmup Ratio: 0.02
Learning Rate: 6.5e-5
Embedding Learning Rate: 1.5e-5
LR Schedule: cosine
Weight Decay: 0.01
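
To make the hyperparameters above a bit more concrete, here is a minimal Python sketch of what they imply. It assumes the standard LoRA scaling factor (alpha / rank) and a linear-warmup-then-cosine-decay schedule of the kind commonly used by training libraries; the exact trainer and schedule implementation are not stated in this card. The total step count of 1000 is purely hypothetical and chosen only for illustration.

```python
import math

# Values taken from the list above.
RANK = 128
ALPHA = 32
BASE_LR = 6.5e-5
EMBEDDING_LR = 1.5e-5
WARMUP_RATIO = 0.02

# Hypothetical: the real number of optimizer steps is not given in this card.
TOTAL_STEPS = 1000


def lora_scale(rank: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (assumed, not confirmed)."""
    return alpha / rank


def cosine_lr(step: int, total_steps: int, peak_lr: float,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("LoRA scale:", lora_scale(RANK, ALPHA))
    for step in (0, 10, 20, 500, 1000):
        print(step,
              cosine_lr(step, TOTAL_STEPS, BASE_LR),
              cosine_lr(step, TOTAL_STEPS, EMBEDDING_LR))
```

With rank 128 and alpha 32, the assumed scaling factor is 0.25, so the adapter's update is damped relative to a configuration where alpha equals the rank. The embedding learning rate follows the same schedule shape here, just with a lower peak.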

Translation Prompt

This is an example prompt for translation:

<<METADATA>>
[character] Name: Uryuu Shingo (瓜生 新吾) | Gender: Male | Aliases: Onii-chan (お兄ちゃん)
[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female
<<TRANSLATE>>
<<JAPANESE>>
[桜乃]: 『……ごめん』
<<ENGLISH>>
[Sakuno]: 『... Sorry.』<eos>
<<JAPANESE>>
[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」
<<ENGLISH>>

The generated translation for that prompt, with temperature 0, is:

[Shingo]: 「No, I'm glad you got lost. You were so cute that it made me worry.」
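
The prompt format shown above can also be assembled programmatically. The following is a minimal sketch in Python; the `Character` class and `build_prompt` function are hypothetical helpers written for illustration and are not part of any VNTL tooling. It assumes that each previously translated English line is terminated with `<eos>` and that the prompt ends with a newline after the final `<<ENGLISH>>` tag, as the example suggests.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Character:
    """Hypothetical container for one [character] metadata line."""
    name: str
    japanese_name: str
    gender: str
    aliases: Tuple[str, ...] = ()

    def to_line(self) -> str:
        parts = [f"Name: {self.name} ({self.japanese_name})",
                 f"Gender: {self.gender}"]
        if self.aliases:
            parts.append("Aliases: " + ", ".join(self.aliases))
        return "[character] " + " | ".join(parts)


def build_prompt(characters: Sequence[Character],
                 history: Sequence[Tuple[str, str]],
                 japanese_line: str,
                 eos: str = "<eos>") -> str:
    """Build a translation prompt from metadata, prior (ja, en) pairs, and a new line."""
    lines: List[str] = ["<<METADATA>>"]
    lines += [c.to_line() for c in characters]
    lines.append("<<TRANSLATE>>")
    for ja, en in history:
        # Earlier translations are included as context, each closed with the EOS marker.
        lines += ["<<JAPANESE>>", ja, "<<ENGLISH>>", en + eos]
    # The line to translate comes last; the model continues after <<ENGLISH>>.
    lines += ["<<JAPANESE>>", japanese_line, "<<ENGLISH>>"]
    return "\n".join(lines) + "\n"


prompt = build_prompt(
    characters=[
        Character("Uryuu Shingo", "瓜生 新吾", "Male", ("Onii-chan (お兄ちゃん)",)),
        Character("Uryuu Sakuno", "瓜生 桜乃", "Female"),
    ],
    history=[("[桜乃]: 『……ごめん』", "[Sakuno]: 『... Sorry.』")],
    japanese_line="[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」",
)
print(prompt)
```

Keeping earlier Japanese/English pairs in `history` gives the model context from preceding lines, which is how the example prompt carries Sakuno's line into the translation of Shingo's reply.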
