# LLMic: Romanian Foundation Language Model

## Model Summary
LLMic is a bilingual Romanian-English foundation model: a 3B-parameter dense decoder-only Transformer based on the Llama 2 architecture.
## Architecture

| Parameter | Value |
|---|---|
| Sequence Length | 2048 |
| Number of Layers | 24 |
| Embedding Size | 2,560 |
| FFN Hidden Size | 10,240 |
| Number of Heads | 20 |
| Number of KV Heads | 5 |
| Activation Function | SiLU |
| Position Encodings | RoPE (Θ=500,000) |
| Layer Norm | RMSNorm (ε=10⁻⁵) |
| Tied Embeddings | No |
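With 20 attention heads but only 5 KV heads, the model uses grouped-query attention (each KV head is shared by 4 query heads). Since LLMic follows the Llama 2 architecture, these hyperparameters should be visible in the published checkpoint's config. A minimal sketch for verifying them, assuming the model ships a Llama-style config (the attribute names below are standard `LlamaConfig` fields, not confirmed by this card):

```python
from transformers import AutoConfig

# Load only the config (no weights) to inspect architecture hyperparameters.
config = AutoConfig.from_pretrained("faur-ai/LLMic")

print(config.num_hidden_layers)    # number of layers (24)
print(config.hidden_size)          # embedding size (2,560)
print(config.intermediate_size)    # FFN hidden size (10,240)
print(config.num_attention_heads)  # attention heads (20)
print(config.num_key_value_heads)  # KV heads for grouped-query attention (5)
print(config.rope_theta)           # RoPE base (500,000)
print(config.rms_norm_eps)         # RMSNorm epsilon (1e-5)
```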
## Intended Use

Our model is designed to accelerate research on Romanian language models, serving as a building block for generative AI applications.
### Use with transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

device = "cuda"
model_id = "faur-ai/LLMic"
prompt = "Capitala României este"  # "The capital of Romania is"

# Load the model and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer)

inputs = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt",
).to(device)

outputs = model.generate(
    streamer=streamer,
    input_ids=inputs,
    temperature=0.8,
    do_sample=True,
)
```
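For quick experiments, the same model can also be driven through the high-level `pipeline` API. A minimal sketch; the generation settings here are illustrative, not values recommended by the authors:

```python
from transformers import pipeline

# text-generation pipeline; device=0 selects the first CUDA GPU.
generator = pipeline("text-generation", model="faur-ai/LLMic", device=0)

result = generator(
    "Capitala României este",  # "The capital of Romania is"
    max_new_tokens=32,
    do_sample=True,
    temperature=0.8,
)
print(result[0]["generated_text"])
```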
## Data Overview

### Training Datasets

| Source | Size |
|---|---|
| **Romanian (300B)** | |
| Web Sources | 621 GB |
| Discussions, Curated & Parallel | 10 GB |
| **English (700B)** | |
| FineWebEdu | -- |
| Dolma Subset | 109 GB |
### Benchmark Datasets

We evaluated LLMic on the WMT16 English-to-Romanian machine translation benchmark.

| Model | Score |
|---|---|
| LLMic | 41.01 |
| mBART | 38.50 |
| Llama-3.1-8B-Instruct | 29.02 |
| RoMistral-7b-Instruct | 27.70 |
| RoLlama3-8b-Instruct | 27.31 |
| Mistral-7B-Instruct-v0.2 | 26.19 |
| RoGemma-7b-Instruct | 25.96 |
| Gemma-1.1-7b-it | 25.48 |
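To score a model on this benchmark, generated translations can be compared against the WMT16 references with sacrebleu. A minimal sketch, assuming the reported scores are corpus-level BLEU (the metric is not named on this card) and using the `wmt16` `ro-en` configuration from the Hugging Face `datasets` hub; `translate` is a hypothetical helper standing in for your decoding loop:

```python
from datasets import load_dataset
import sacrebleu

# WMT16 Romanian-English test set; each example holds {"en": ..., "ro": ...}.
test = load_dataset("wmt16", "ro-en", split="test")
sources = [ex["translation"]["en"] for ex in test]
references = [ex["translation"]["ro"] for ex in test]

# Hypothetical helper: produce a Romanian hypothesis for each English source.
hypotheses = [translate(src) for src in sources]

# Corpus-level BLEU; sacrebleu expects a list of reference streams.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```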
## Citation

BibTeX:

```bibtex
@misc{bădoiu2025llmicromanianfoundationlanguage,
      title={LLMic: Romanian Foundation Language Model},
      author={Vlad-Andrei Bădoiu and Mihai-Valentin Dumitru and Alexandru M. Gherghescu and Alexandru Agache and Costin Raiciu},
      year={2025},
      eprint={2501.07721},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.07721},
}
```