---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
# llama-3.1-8b-mixed-slerp-0.90

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [SLERP](https://en.wikipedia.org/wiki/Slerp) merge method.

### Models Merged

The following models were included in the merge:
* /scratch/gpfs/vv7118/models/hub/models--meta-llama--Llama-3.1-8B/snapshots/d04e592bb4f6aa9cfee91e2e20afa771667e1d4b
* /scratch/gpfs/vv7118/models/hub/models--deepseek-ai--DeepSeek-R1-Distill-Llama-8B/snapshots/ebf7e8d03db3d86a442d22d30d499abb7ec27bea

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: /scratch/gpfs/vv7118/models/hub/models--deepseek-ai--DeepSeek-R1-Distill-Llama-8B/snapshots/ebf7e8d03db3d86a442d22d30d499abb7ec27bea
merge_method: slerp
base_model: /scratch/gpfs/vv7118/models/hub/models--meta-llama--Llama-3.1-8B/snapshots/d04e592bb4f6aa9cfee91e2e20afa771667e1d4b
parameters:
  t:
    - value: 0.9 # fallback for rest of tensors
dtype: float16
chat_template: "llama3"
tokenizer:
  source: "union" # or "base" or a specific model path
```
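### How SLERP combines the weights

For intuition, the sketch below applies the spherical linear interpolation formula to a single pair of weight tensors. This is an illustrative approximation, not mergekit's implementation: mergekit applies the interpolation per tensor and handles degenerate cases internally. Under mergekit's convention, `t = 0` corresponds to the base model and `t = 1` to the other model, so `t: 0.9` places the merge close to the DeepSeek-R1-Distill endpoint along the interpolation arc.

```python
# Minimal SLERP sketch for one pair of weight tensors (illustrative only).
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Angle between the two weight vectors, measured on the unit sphere.
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    omega = torch.arccos(torch.clamp(a_unit @ b_unit, -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Near-colinear vectors: fall back to plain linear interpolation.
        out = (1.0 - t) * a_flat + t * b_flat
    else:
        out = (torch.sin((1.0 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)
```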
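### Reproducing and using the merge

The merge can be reproduced by saving the configuration above to a file and running mergekit's CLI, e.g. `mergekit-yaml config.yaml ./llama-3.1-8b-mixed-slerp-0.90`. The resulting directory loads like any other `transformers` causal LM; the local path below is an assumption about where the merged output was written.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical path to the merged output directory; adjust to your setup.
model_path = "./llama-3.1-8b-mixed-slerp-0.90"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# float16 matches the `dtype` in the merge config; device_map requires accelerate.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```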