Poro 2 70B SFT Model Card

Note for most users: This is an intermediate checkpoint from our post-training pipeline. Most users should use Poro 2 70B Instruct instead, which includes an additional round of Direct Preference Optimization (DPO) for improved response quality and alignment. This SFT-only model is primarily intended for researchers interested in studying the effects of different post-training techniques.

Poro 2 70B SFT is a supervised fine-tuned model created from the Poro 2 70B Base model. This model has been trained for instruction following and conversational AI applications in both Finnish and English, but has not undergone preference tuning. It represents the intermediate step before Direct Preference Optimization (DPO) in our post-training pipeline.

Poro 2 was created in a collaboration between AMD Silo AI, the TurkuNLP group of the University of Turku, and High Performance Language Technologies (HPLT). Training was conducted on the LUMI supercomputer, using compute resources generously provided by CSC - IT Center for Science, Finland.

For more details on our training and data generation pipeline, check out our Continued Pretraining Playbook.

Poro 2 Model Family

The Poro 2 model family includes 8B and 70B models, each released in three versions: a base model, a post-training SFT-only checkpoint, and the final instruct model, which is the SFT model plus a round of DPO.

| Model | Based on | Base Model | SFT | Instruct |
|---|---|---|---|---|
| Poro 2 8B | Llama 3.1 8B | Poro 2 8B Base | Poro 2 8B SFT | Poro 2 8B Instruct |
| Poro 2 70B | Llama 3.1 70B | Poro 2 70B Base | Poro 2 70B SFT | Poro 2 70B Instruct |

What does Poro mean? Poro is the Finnish word for reindeer! 🦌 These animals are native to Finland and play a significant historical role in Finnish culture.

Model Overview

Poro 2 70B SFT is based on the Llama 3.1 70B architecture and has been supervised fine-tuned for instruction following. The model supports both English and Finnish conversations but has not undergone preference tuning for response quality optimization.

| Hyperparameter | Value |
|---|---|
| n_parameters | 70.55B |
| n_layers | 80 |
| n_heads | 64 |
| n_kv_heads | 8 |
| d_model | 8192 |
| vocab_size | 128256 |
| max_sequence_length | 8192 |
| base_model | Llama-3.1-70B |
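
These values can be cross-checked against the released checkpoint without downloading the weights; a minimal sketch using the Transformers config API (attribute names follow the standard Llama config):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("LumiOpen/Llama-Poro-2-70B-SFT")
print(config.num_hidden_layers)    # n_layers: 80
print(config.num_attention_heads)  # n_heads: 64
print(config.num_key_value_heads)  # n_kv_heads: 8
print(config.hidden_size)          # d_model: 8192
print(config.vocab_size)           # vocab_size: 128256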

Training Process

Continued Pretraining

The base Poro 2 70B model was created through continued pretraining on 165B tokens of Finnish, English, code, and math data.

Supervised Fine-Tuning (SFT)

This model represents the SFT phase of post-training, using 1.4M instruction-following examples in English and Finnish, including:

  • English and Finnish Tulu 3 prompts with Llama-3.3-70B-Instruct responses
  • Multi-turn conversations generated using the Magpie method (see the sketch below)
  • Top-rated conversations from OASST2 and Avoin Avustaja datasets
  • Translation samples from EuroParl

We release the Poro 2 instruction collection.
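
For illustration, Magpie-style generation gives an aligned instruct model only the part of its chat template that precedes a user turn, so the model itself completes a plausible user query. A minimal sketch using the Llama 3 template; the generator model and sampling settings here are assumptions, not the exact Poro 2 pipeline:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

gen_model = "meta-llama/Llama-3.3-70B-Instruct"  # assumed query generator
tokenizer = AutoTokenizer.from_pretrained(gen_model)
model = AutoModelForCausalLM.from_pretrained(
    gen_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Pre-query template only: everything up to where a user message would start.
# The string already contains the BOS token, so skip adding special tokens.
pre_query = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
inputs = tokenizer(pre_query, return_tensors="pt", add_special_tokens=False).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=1.0,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the continuation: a synthetic user query the model invented,
# which a second pass would then answer to build a training example.
synthetic_query = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(synthetic_query)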

SFT Hyperparameters

| Hyperparameter | Value |
|---|---|
| Epochs | 2 |
| Global batch size | 128 |
| Learning rate | 5e-6 |
| LR scheduler | linear |
| Warmup ratio | 0.03 |
| Max sequence length | 4,096 |
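
As a rough reference, these settings map onto a Transformers training configuration like the sketch below. Only the global batch size of 128 is given above, so the per-device/accumulation/GPU split is an assumption, and the 4,096-token limit is applied when tokenizing the data rather than here:

from transformers import TrainingArguments

# Sketch of the SFT hyperparameters above; not the released training script.
# Global batch 128 = 4 per device x 4 accumulation steps x 8 GPUs
# (an assumed split; training actually ran on the LUMI supercomputer).
training_args = TrainingArguments(
    output_dir="poro2-70b-sft",
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=5e-6,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    bf16=True,
)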

Evaluation Results

Poro 2 70B SFT shows substantial improvements in Finnish instruction-following capabilities compared to Llama 3.1 70B Instruct and is on par with Llama 3.3 70B Instruct, while maintaining excellent English performance. Note that the final Instruct model (with DPO) performs better.

Finnish Instruction Following

| Benchmark | Poro 2 70B SFT | Llama 3.1 70B Instruct | Llama 3.3 70B Instruct | Poro 2 70B Instruct |
|---|---|---|---|---|
| IFEval Finnish | 70.05 | 63.95 | 71.71 | 70.79 |
| MTBench Finnish | 7.2 | 7.06 | 7.4 | 7.77 |
| AlpacaEval 2 Finnish | 30.74 | 21.06 | 25.73 | 41.96 |

English Instruction Following

| Benchmark | Poro 2 70B SFT | Llama 3.1 70B Instruct | Llama 3.3 70B Instruct | Poro 2 70B Instruct |
|---|---|---|---|---|
| IFEval | 89.46 | 86.69 | 90.38 | 85.95 |
| MTBench | 8.03 | 8.33 | 8.35 | 8.41 |
| AlpacaEval 2 | 43.18 | 43.87 | 45.12 | 49.77 |

Overall: Notable improvement over Llama 3.1 70B Instruct and competitive with Llama 3.3 70B Instruct in Finnish, while maintaining strong English performance. The additional DPO step in the Instruct model provides further improvements.
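
The English IFEval numbers can in principle be reproduced with lm-evaluation-harness, which ships an ifeval task; a hedged sketch (the Finnish benchmarks above are not assumed to be available in the stock harness, and the exact evaluation setup used for this card may differ):

import lm_eval

# Illustrative harness invocation, not the card's exact evaluation config.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=LumiOpen/Llama-Poro-2-70B-SFT,dtype=bfloat16",
    tasks=["ifeval"],
    batch_size=8,
)
print(results["results"]["ifeval"])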

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "LumiOpen/Llama-Poro-2-70B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Finnish conversation example
messages = [
    {"role": "user", "content": "Kerro minulle Suomen historiasta."}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)  # move the prompt tokens onto the model's device

outputs = model.generate(
    inputs,
    max_new_tokens=500,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)

Research Applications

This SFT-only model is particularly useful for researchers studying:

  • The effects of supervised fine-tuning vs. preference tuning
  • Comparative analysis of different post-training techniques
  • Ablation studies on instruction-following capabilities
  • Cross-lingual transfer in instruction-following tasks
  • The impact of DPO on model behavior and alignment

Intended Use

Poro 2 70B SFT is primarily intended for:

  • Research purposes: Studying post-training techniques and their effects
  • Comparative analysis: Understanding the contribution of different training phases
  • Educational applications: Learning about instruction-following model development
  • Development: As a starting point for further preference tuning experiments

For production use cases, we recommend using Poro 2 70B Instruct instead.

Ethical Considerations and Limitations

Poro 2 70B SFT is an advanced language model optimized for English and Finnish instruction following. As this model has not undergone preference tuning, it may be more prone to generating responses that are misaligned with user expectations compared to the final Instruct model.

Key limitations:

  • Limited proficiency in languages other than English and Finnish
  • Potential for generating biased or inappropriate content
  • May produce factually incorrect information

License

Built with Llama

Poro 2 70B SFT is released under the Llama 3.3 Community License. Please review the license terms before use.

Citation

@misc{poro2_2025,
    title={Poro 2: Continued Pretraining for Language Acquisition},
    author={Elaine Zosa and Jouni Luoma and Kai Hakala and Antti Virtanen and Mika Koistinen and Risto Luukkonen and Akseli Reunamo and Sampo Pyysalo and Jonathan Burdge},
    year={2025},
    howpublished={LumiOpen}
}

Acknowledgments

We thank CSC - IT Center for Science, Finland for providing access to the LUMI supercomputer. This work was supported by the High Performance Language Technologies (HPLT) project and conducted in collaboration with TurkuNLP from the University of Turku. This project has received funding from the European Union's Horizon Europe research and innovation programme under grant agreement No 101070350.
