We continually pre-train meta-llama/Llama-2-7b-hf on the monolingual WURA corpus, covering 20 languages. All languages are sampled uniformly.
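Uniform sampling over languages can be expressed with `datasets.interleave_datasets` by giving every language the same probability. The sketch below is a minimal, hedged example: the `castorini/wura` dataset ID and per-language configs are assumptions not confirmed by this card, and `langs` is a placeholder to be filled with the 20 WURA language configs.

```python
# Minimal sketch: uniform sampling across WURA languages.
# Assumptions: the corpus is hosted as "castorini/wura" on the Hub and each
# language is a dataset config; extend `langs` to all 20 languages used here.
from datasets import load_dataset, interleave_datasets

langs = ["hau", "ibo", "swa", "yor"]  # placeholder subset, not the full list

streams = [
    load_dataset("castorini/wura", lang, split="train", streaming=True)
    for lang in langs
]

# Equal probability per language => uniform sampling over languages.
mixed = interleave_datasets(
    streams,
    probabilities=[1.0 / len(langs)] * len(langs),
    seed=42,
    stopping_strategy="all_exhausted",  # draw until every language is exhausted
)

for example in mixed.take(3):
    print(example)
```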

Important Parameters

  • num_gpus: 8
  • max_steps: 8000 # see here
  • gradient_accumulation_steps: 16
  • per_device_batch_size: 2
  • learning_rate: 2e-5
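Taken together, these values give an effective batch size of 8 GPUs × 2 sequences per device × 16 accumulation steps = 256 sequences per optimizer step. A hedged sketch of the corresponding `transformers.TrainingArguments` setup follows; only the five listed values come from this card, while the output path, precision, and logging/saving options are illustrative assumptions.

```python
# Sketch of a Trainer setup mirroring the parameters listed above.
# Only max_steps, gradient_accumulation_steps, per_device_train_batch_size,
# and learning_rate come from the card; everything else is assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

args = TrainingArguments(
    output_dir="pretrain-wura",       # assumed output path
    max_steps=8000,
    gradient_accumulation_steps=16,
    per_device_train_batch_size=2,
    learning_rate=2e-5,
    bf16=True,                        # assumption: bf16 mixed precision
    logging_steps=50,                 # assumption
    save_steps=1000,                  # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=mixed,  # the uniformly interleaved dataset from the sketch above
)
trainer.train()
```

The `num_gpus: 8` setting corresponds to launching this script across eight devices, e.g. `torchrun --nproc_per_node 8 train.py`; `per_device_batch_size` then maps to `per_device_train_batch_size` on each rank.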