# Model Card for llama-estllm-protype-0825
llama-estllm-protype-0825 is the first artifact produced by the EstLLM project. The purpose of this release is to evaluate the first prototype in a conversational, ChatbotArena-style setting on baromeeter.ai, and thereby establish a baseline for future improvements.

The model underwent continuous pre-training from Llama-3.1-8B on approximately 35B tokens, followed by supervised fine-tuning and direct preference optimization.
## Model Details

### Model Description
- Developed by: TartuNLP and TalTechNLP research groups
- Funded by: Estonian Ministry of Education and Research, “Estonian Language Technology Program 2018–2027”
- Model type: Causal Language Model, Instruction-following
- Language(s) (NLP): Estonian, English
- License: Llama 3.1 Community License Agreement
- Finetuned from model: meta-llama/Llama-3.1-8B
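A minimal usage sketch with Hugging Face transformers is shown below. The dtype, device placement, generation settings, and example prompt are illustrative assumptions, not prescribed by this card:

```python
# Minimal usage sketch (assumptions: bf16 weights fit the hardware, the
# tokenizer ships a chat template; prompt and settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tartuNLP/llama-estllm-protype-0825"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Single-turn only: multi-turn conversations are not supported (see Limitations).
messages = [{"role": "user", "content": "Tere! Mis on Eesti pealinn?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```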
## Evaluation

### Instruction-following
Every benchmark in this category is treated as a generative task: evaluation is performed on model responses generated with temperature 0 (greedy decoding), not by scoring logits.
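For concreteness, here is a sketch of this generative protocol; the helper below is hypothetical and not the actual evaluation harness:

```python
# Hypothetical sketch of the generative protocol: answers are generated
# greedily (temperature 0) and scored by exact match, not by comparing logits.
def evaluate_generative(model, tokenizer, examples, max_new_tokens=64):
    correct = 0
    for ex in examples:  # each ex is a dict: {"prompt": str, "answer": str}
        inputs = tokenizer(ex["prompt"], return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
        completion = tokenizer.decode(
            out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
        )
        correct += completion.strip().lower() == ex["answer"].strip().lower()
    return correct / len(examples)
```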
| Model (# parameters ↓) | IFEval-et* | Winogrande-et** | Trivia-et*** | Grammar-et**** |
|---|---|---|---|---|
| moonshotai/Kimi-K2-Instruct | 0.7891 | 0.8138 | 0.4225 | 0.916 |
| deepseek-ai/DeepSeek-V3-0324 | 0.7171 | 0.8042 | 0.27 | 0.364 |
| meta-llama/Llama-3.1-405B-Instruct | 0.7159 | 0.7878 | 0.4713 | 0.818 |
| meta-llama/Llama-3.3-70B-Instruct | 0.7705 | 0.7397 | 0.3875 | 0.797 |
| Qwen/Qwen2.5-72B-Instruct | 0.7407 | 0.7227 | 0.315 | 0.694 |
| google/gemma-3-27b-it | 0.7655 | 0.7510 | 0.325 | 0.817 |
| utter-project/EuroLLM-9B-Instruct | 0.5397 | 0.5846 | 0.3738 | 0.764 |
| meta-llama/Llama-3.1-8B-Instruct | 0.3797 | 0.5399 | 0.2888 | 0.657 |
| tartuNLP/llama-estllm-protype-0825 | 0.5174 | 0.5812 | 0.425 | 0.692 |
| BSC-LT/salamandra-7b-instruct | 0.5195 | 0.2878 | 0.2875 | 0.594 |
| tartuNLP/Llammas | 0.3524 | 0.5037 | 0.2838 | 0.529 |
| Qwen/Qwen2.5-7B-Instruct | 0.4988 | 0.5473 | 0.2938 | 0.598 |
* inst_level_strict_acc
** 3-shot, accuracy
*** 0-shot, accuracy
**** 0-shot, accuracy, formatted as multiple-choice
### Translation

#### English to Estonian
| Model | wmt24pp (BLEU ↑) |
|---|---|
| tartuNLP/llama-estllm-protype-0825 | 0.264 |
| utter-project/EuroLLM-9B-Instruct | 0.2602 |
| tartuNLP/Llammas | 0.1472 |
| meta-llama/Llama-3.1-8B-Instruct | 0.1406 |
| BSC-LT/salamandra-7b-instruct | 0.1201 |
| Qwen/Qwen2.5-7B-Instruct | 0.0476 |
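The scores above appear to be corpus BLEU rescaled to a 0-1 range. Here is a sketch of computing such a score with sacrebleu; the example sentences are illustrative, and the rescaling is an assumption about the table:

```python
# Sketch: corpus BLEU with sacrebleu, rescaled to 0-1 to match the table
# (assumption: the table reports score / 100).
import sacrebleu

hypotheses = ["Tallinn on Eesti pealinn."]    # model translations
references = [["Tallinn on Eesti pealinn."]]  # one reference stream
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score / 100)  # 1.0 for a perfect match
```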
## Limitations

This is an early prototype. Accordingly, it has the following limitations in addition to those of the base Llama model:

- Relatively short context of 4096 tokens; the model is not expected to perform well beyond that length (see the sketch after this list).
- Multi-turn conversations are not supported in this version.
- Trained with the original Llama 3.1 system prompt, which has a hard-coded date cut-off.
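A small sketch of guarding against the context limit before generation; the constant and helper below are assumptions for illustration, not an official API:

```python
# Sketch: check that a prompt plus its generation budget fits the
# 4096-token context (MAX_CONTEXT and fits_context are hypothetical).
MAX_CONTEXT = 4096

def fits_context(tokenizer, prompt: str, max_new_tokens: int = 256) -> bool:
    n_prompt = len(tokenizer(prompt)["input_ids"])
    return n_prompt + max_new_tokens <= MAX_CONTEXT
```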
## Citation
TBA