
ALIA-40b in GGUF format and quantized to Q3_K

ALIA-40B is a 40B parameter base language model developed by the Barcelona Supercomputing Center (BSC).

Original model and details here: https://huggingface.co/BSC-LT/ALIA-40b

This model is released under a permissive Apache 2.0 license. Along with the open weights, all training scripts and configuration files are made publicly available in this GitHub repository.

This repository contains the model converted to GGUF format and quantized to the Q3_K level using llama.cpp.


Model Details

Description

Transformer-based decoder-only language model that has been pre-trained from scratch on 9.37 trillion tokens of highly curated data. The pre-training corpus contains text in 35 European languages and code.

Hyperparameters

The full list of hyperparameters can be found here.

Architecture

| Parameter | Value |
|---|---|
| Total Parameters | 40,433,885,184 |
| Embedding Parameters | 2,097,152,000 |
| Layers | 48 |
| Hidden size | 8,192 |
| Attention heads | 64 |
| Context length | 32,768 |
| Vocabulary size | 256,000 |
| Precision | bfloat16 |
| Embedding type | RoPE |
| Activation function | SwiGLU |
| Layer normalization | RMS Norm |
| Flash attention | ✅ |
| Grouped Query Attention | ✅ |
| Num. query groups | 8 |

Conversion Process

These are the steps that were followed to convert the weights to GGUF format and quantize them.

1. Download from Hugging Face

Requirement: huggingface_hub

huggingface-cli download --cache-dir . BSC-LT/ALIA-40b

This command downloads the model into the directory ./models--BSC-LT--ALIA-40b/

The safetensors files end up inside ./models--BSC-LT--ALIA-40b/snapshots/aa8a4ac7f9e18f3c2ea8ec0cc84e7783cd751ac7/.
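
To verify that the download completed, the snapshot directory can be listed (the snapshot hash shown is the one from this download; a newer revision would produce a different hash):

ls ./models--BSC-LT--ALIA-40b/snapshots/aa8a4ac7f9e18f3c2ea8ec0cc84e7783cd751ac7/

The listing should show the model's safetensors shards along with config.json and the tokenizer files.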

2. Convert the safetensors weights to GGUF without quantization using llama.cpp

Requirement: the llama.cpp repository cloned and its Python requirements installed.

cd $LLAMA_PATH
python convert_hf_to_gguf.py $ALIA_PATH/models--BSC-LT--ALIA-40b/snapshots/aa8a4ac7f9e18f3c2ea8ec0cc84e7783cd751ac7/ --outfile $ALIA_PATH/ALIA-40B.gguf

LLAMA_PATH is the root of the llama.cpp directory. ALIA_PATH is the directory where the safetensors weights were downloaded and where the ALIA-40B GGUF file will be stored.
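
For example, assuming llama.cpp was cloned to $HOME/llama.cpp and the weights were downloaded to $HOME/alia (both paths are illustrative), the variables would be set as follows:

export LLAMA_PATH=$HOME/llama.cpp
export ALIA_PATH=$HOME/alia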

This creates the file $ALIA_PATH/ALIA-40B.gguf.

3. Quantize the model

Requirement: llama.cpp built and installed.

cd $ALIA_PATH
llama-quantize ALIA-40B.gguf ALIA-40B.Q3_K.gguf Q3_K

This generates the file ALIA-40B.Q3_K.gguf within the same directory.
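
As a quick sanity check, the quantized model can be run with llama-cli from the same llama.cpp build (the prompt and token count below are arbitrary examples):

llama-cli -m ALIA-40B.Q3_K.gguf -p "The capital of Spain is" -n 32

If the model loads correctly, llama-cli prints the GGUF metadata during loading and then generates a continuation of the prompt.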
