Found this resource helpful? Creating and maintaining open source AI models and datasets requires significant computational resources. If this work has been valuable to you, consider supporting my research to help me continue building tools that benefit the entire AI community. Every contribution directly funds more open source innovation!
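import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
MODEL_NAME = "DeepMount00/Llama-3.1-8b-Ita"

# Load the weights in bfloat16 and switch the model to inference mode.
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16).eval()
model.to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def generate_answer(prompt):
    messages = [
        {"role": "user", "content": prompt},
    ]
    # add_generation_prompt=True appends the assistant header so the model answers the user turn.
    model_inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(device)
    # temperature=0.001 makes sampling nearly deterministic.
    generated_ids = model.generate(
        model_inputs, max_new_tokens=200, do_sample=True, temperature=0.001
    )
    # The decoded text contains the prompt as well as the generated answer.
    decoded = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    return decoded[0]

prompt = "Come si apre un file json in python?"  # "How do you open a JSON file in Python?"
answer = generate_answer(prompt)
print(answer)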
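For decoder-only models, `generate` returns the prompt tokens followed by the newly generated tokens, so the string returned by `generate_answer` includes the chat-formatted prompt as well as the answer. The sketch below shows one way to keep only the new tokens. It is not part of the original model card: the helper name `completion_only` is hypothetical, and the token ids are toy values standing in for real model output.

```python
from typing import List, Sequence


def completion_only(generated_ids: Sequence[Sequence[int]], prompt_length: int) -> List[Sequence[int]]:
    """Drop the first `prompt_length` token ids from every generated sequence."""
    return [sequence[prompt_length:] for sequence in generated_ids]


# Toy token ids standing in for real model output (hypothetical values).
prompt_ids = [101, 102, 103]
generated = [prompt_ids + [7, 8, 9]]

new_tokens = completion_only(generated, len(prompt_ids))
print(new_tokens)
```

In `generate_answer`, the prompt length would presumably be `model_inputs.shape[1]`, and the trimmed ids would be passed to `tokenizer.batch_decode` in place of `generated_ids`.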
Michele Montebovi
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 28.23 |
| IFEval (0-shot) | 79.17 |
| BBH (3-shot) | 30.93 |
| MATH Lvl 5 (4-shot) | 10.88 |
| GPQA (0-shot) | 5.03 |
| MuSR (0-shot) | 11.40 |
| MMLU-PRO (5-shot) | 31.96 |
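The `Avg.` row appears to be the unweighted mean of the six benchmark scores. The following quick check, using only the values from the table, is consistent with that reading. The dictionary and variable names are illustrative.

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval": 79.17,
    "BBH": 30.93,
    "MATH Lvl 5": 10.88,
    "GPQA": 5.03,
    "MuSR": 11.40,
    "MMLU-PRO": 31.96,
}

# Unweighted arithmetic mean, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 28.23
```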
Base model: meta-llama/Llama-3.1-8B