Summary

This is a Gemma 2 Baku LoRA, created using the VNTL 3.1 dataset. The purpose of this LoRA is to improve Gemma's performance at translating Japanese visual novels to English.

Notes

Recently, rinna released the Gemma2 Baku 2B model, pretrained on a substantial 80 billion tokens (!). After testing, I found its performance quite impressive for a 2B model, so I decided to create this fine-tune (it only took 30 minutes, which is nice). However, I opted to remove the chat mode from this model, as I wasn't sure whether a 2B model could effectively handle both translation and chat.

Training Details

This model was trained using the same hyperparameters as the VNTL LLaMA3 8B QLoRA.

  • Rank: 128
  • Alpha: 32
  • Effective Batch Size: 30
  • Warmup Ratio: 0.02
  • Learning Rate: 6.5e-5
  • Embedding Learning Rate: 1.5e-5
  • LR Schedule: cosine
  • Weight Decay: 0.01
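To make the numbers above a little more concrete, here is a minimal sketch that stores them in a plain Python dataclass and derives a few quantities from them. It assumes standard LoRA scaling (alpha / rank) rather than a rank-stabilized variant, and the per-device batch size of 6 and the total of 1000 steps are hypothetical values chosen only for illustration; the actual training setup is not stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    # Values copied from the list above.
    rank: int = 128
    alpha: int = 32
    effective_batch_size: int = 30
    warmup_ratio: float = 0.02
    learning_rate: float = 6.5e-5
    embedding_learning_rate: float = 1.5e-5
    lr_schedule: str = "cosine"
    weight_decay: float = 0.01

    def lora_scale(self) -> float:
        # Assumption: plain LoRA scaling (alpha / rank),
        # not rank-stabilized scaling (alpha / sqrt(rank)).
        return self.alpha / self.rank

    def grad_accum_steps(self, per_device_batch: int, num_devices: int = 1) -> int:
        # effective batch = per-device batch * devices * accumulation steps
        per_step = per_device_batch * num_devices
        if self.effective_batch_size % per_step != 0:
            raise ValueError("effective batch size is not divisible by the per-step batch")
        return self.effective_batch_size // per_step

    def warmup_steps(self, total_steps: int) -> int:
        # Warmup length as a fraction of the total number of optimizer steps.
        return math.ceil(self.warmup_ratio * total_steps)


hp = LoraHyperparams()
print(hp.lora_scale())         # 0.25
print(hp.grad_accum_steps(6))  # 5 (hypothetical per-device batch of 6)
print(hp.warmup_steps(1000))   # 20 (hypothetical 1000 total steps)
```

With rank 128 and alpha 32, the low-rank update would be scaled by 0.25 under that assumption.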

Translation Prompt

This is an example prompt for translation:

<<METADATA>>
[character] Name: Uryuu Shingo (瓜生 新吾) | Gender: Male | Aliases: Onii-chan (お兄ちゃん)
[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female
<<TRANSLATE>>
<<JAPANESE>>
[桜乃]: 『……ごめん』
<<ENGLISH>>
[Sakuno]: 『... Sorry.』<eos>
<<JAPANESE>>
[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」
<<ENGLISH>>
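
A prompt in this layout could be assembled programmatically along the following lines. This is only a sketch based on the single example above: the `Character` class and `build_prompt` function are hypothetical helpers, and both the comma separator for multiple aliases and the trailing newline after the final `<<ENGLISH>>` are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    name: str            # romanized name, e.g. "Uryuu Shingo"
    japanese_name: str   # original name, e.g. "瓜生 新吾"
    gender: str
    # (romanized, Japanese) pairs; joining several with ", " is an assumption
    aliases: list[tuple[str, str]] = field(default_factory=list)

    def to_metadata_line(self) -> str:
        parts = [f"Name: {self.name} ({self.japanese_name})", f"Gender: {self.gender}"]
        if self.aliases:
            joined = ", ".join(f"{en} ({ja})" for en, ja in self.aliases)
            parts.append(f"Aliases: {joined}")
        return "[character] " + " | ".join(parts)


def build_prompt(characters, history, japanese_line):
    """history is a list of (japanese, english) pairs that were already translated."""
    lines = ["<<METADATA>>"]
    lines += [c.to_metadata_line() for c in characters]
    lines.append("<<TRANSLATE>>")
    for ja, en in history:
        # Each finished translation is closed with <eos>, as in the example.
        lines += ["<<JAPANESE>>", ja, "<<ENGLISH>>", f"{en}<eos>"]
    # The line to translate ends with an open <<ENGLISH>> block for the model to fill.
    lines += ["<<JAPANESE>>", japanese_line, "<<ENGLISH>>"]
    return "\n".join(lines) + "\n"


prompt = build_prompt(
    characters=[
        Character("Uryuu Shingo", "瓜生 新吾", "Male", [("Onii-chan", "お兄ちゃん")]),
        Character("Uryuu Sakuno", "瓜生 桜乃", "Female"),
    ],
    history=[("[桜乃]: 『……ごめん』", "[Sakuno]: 『... Sorry.』")],
    japanese_line="[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」",
)
print(prompt)
```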

The generated translation for that prompt, with temperature 0, is:

[Shingo]: 「No, I'm glad you got lost. You were so cute that it made me worry.」
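
For reference, a generation call along these lines might look like the sketch below, using `transformers` and `peft`. Several things here are assumptions rather than the author's setup: the adapter is loaded from the `lmg-anon/vntl-gemma2-2b-lora` repository, the base model and tokenizer are resolved from the adapter's own config, temperature 0 is approximated with greedy decoding (`do_sample=False`), and `first_line` and `generate_translation` are hypothetical helper names.

```python
def first_line(text: str) -> str:
    """Return the first non-empty line of the generated text, stripped."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def generate_translation(
    prompt: str,
    adapter_id: str = "lmg-anon/vntl-gemma2-2b-lora",
    max_new_tokens: int = 128,
) -> str:
    # Heavy imports stay local so first_line() is usable without them installed.
    import torch
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # Loads the base model named in the adapter config and applies the adapter.
    model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, torch_dtype=torch.bfloat16)
    # Assumption: the adapter config records a resolvable base model id.
    base_id = model.peft_config["default"].base_model_name_or_path
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Greedy decoding stands in for "temperature 0".
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return first_line(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Calling `generate_translation(prompt)` with a prompt in the format above would download the model weights on first use, so the exact output depends on the installed library versions and hardware.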

Model tree for lmg-anon/vntl-gemma2-2b-lora

Base model

google/gemma-2-2b