---
base_model:
- TheDrummer/Anubis-70B-v1
- EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
- meta-llama/Llama-3.3-70B-Instruct
- Sao10K/70B-L3.3-Cirrus-x1
- SicariusSicariiStuff/Negative_LLAMA_70B
- Sao10K/L3.1-70B-Hanami-x1
library_name: transformers
tags:
- mergekit
- merge
license: llama3.3
---

**Upon further testing I found some logic issues! The weights are broken.**

> User y-ryan discovered that the model had an invalid tensor shape (`[1, 8192]`) for some weights, which raised errors when loading with transformers, and published a fixed version here: [tmfi-us/Progenitor-V5-Final-LLaMa-70B](https://huggingface.co/tmfi-us/Progenitor-V5-Final-LLaMa-70B). I have no clue what the reason is, but despite that I was still able to use and even quantize this model?! Testing the fixed version gave me different outputs too, with this version's output being the good one. If anyone understands this, I would love to hear about it.

This marks the culmination of my experiments with the Progenitor series. I fixed the typo I had earlier where it wasn't computing in float32, but merging six models in float32 is taxing on resources and time, so I reserved it for the configuration I thought was the best (it's not something I can afford to do with every model I make, just the worthwhile ones). This one also uses Sicarius's tokenizer, which I find the best.

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [Linear DELLA](https://arxiv.org/abs/2406.11617) merge method using [meta-llama/Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) as a base.
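To give a rough intuition for the method (this is a toy sketch, not mergekit's actual implementation): each model's delta from the base is sparsified according to `density`, the survivors are rescaled, and the weighted, lambda-scaled sum of deltas is added back to the base. Real DELLA drops delta entries stochastically with magnitude-based probabilities; the sketch below substitutes a deterministic keep-top-`density`-by-magnitude rule for simplicity, and all names in it are illustrative.

```python
import numpy as np

def della_linear_toy(base, models, weights, density=0.7, lam=1.1):
    """Toy, deterministic stand-in for a linear DELLA-style merge.

    Real DELLA samples which delta entries to drop using
    magnitude-based probabilities; here we simply keep the
    top-`density` fraction by magnitude.
    """
    merged_delta = np.zeros_like(base)
    keep = int(round(density * base.size))
    for model, w in zip(models, weights):
        delta = model - base
        # indices of the (size - keep) smallest-magnitude entries to drop
        drop_idx = np.argsort(np.abs(delta))[: base.size - keep]
        pruned = delta.copy()
        pruned[drop_idx] = 0.0
        pruned /= density  # rescale survivors to preserve expected magnitude
        merged_delta += w * pruned
    return base + lam * merged_delta

base = np.zeros(10)
m1 = base + np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 5], dtype=float)
m2 = base + np.array([0, 2, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)
merged = della_linear_toy(base, [m1, m2], [0.5, 0.5], density=0.2, lam=1.0)
```

With `density: 0.2` only the two largest-magnitude delta entries per model survive and are scaled up by `1/density`, which is why a low density keeps each donor model's strongest changes while discarding noise.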
### Models Merged

The following models were included in the merge:

* [TheDrummer/Anubis-70B-v1](https://huggingface.co/TheDrummer/Anubis-70B-v1)
* [EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1](https://huggingface.co/EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1)
* [Sao10K/70B-L3.3-Cirrus-x1](https://huggingface.co/Sao10K/70B-L3.3-Cirrus-x1)
* [SicariusSicariiStuff/Negative_LLAMA_70B](https://huggingface.co/SicariusSicariiStuff/Negative_LLAMA_70B)
* [Sao10K/L3.1-70B-Hanami-x1](https://huggingface.co/Sao10K/L3.1-70B-Hanami-x1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Sao10K/L3.1-70B-Hanami-x1
    parameters:
      weight: 0.20
      density: 0.7
  - model: Sao10K/70B-L3.3-Cirrus-x1
    parameters:
      weight: 0.20
      density: 0.7
  - model: SicariusSicariiStuff/Negative_LLAMA_70B
    parameters:
      weight: 0.20
      density: 0.7
  - model: TheDrummer/Anubis-70B-v1
    parameters:
      weight: 0.20
      density: 0.7
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
    parameters:
      weight: 0.20
      density: 0.7
merge_method: della_linear
base_model: meta-llama/Llama-3.3-70B-Instruct
parameters:
  epsilon: 0.2
  lambda: 1.1
  int8_mask: true
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: SicariusSicariiStuff/Negative_LLAMA_70B
```
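On the `dtype: float32` / `out_dtype: bfloat16` choice: accumulating many small per-model contributions in a low-precision type can silently lose them, which is why merging in float32 and casting down only at the end matters. The snippet below is a generic illustration of that effect, not anything mergekit-specific; NumPy has no bfloat16, so float16 stands in for the low-precision case.

```python
import numpy as np

# Repeatedly add a small increment to 1.0 in float16 vs float32.
# Near 1.0, float16's spacing is ~0.001, so an increment of 1e-4
# rounds away on every addition and the accumulator never moves.
acc16 = np.float16(1.0)
acc32 = np.float32(1.0)
for _ in range(1000):
    acc16 = np.float16(acc16 + np.float16(1e-4))  # each add rounds back to 1.0
    acc32 = np.float32(acc32 + np.float32(1e-4))

print(float(acc16))  # stuck at 1.0
print(float(acc32))  # ~1.1, as expected
```

The same logic applies to a weighted merge of six 70B checkpoints: the per-model deltas scaled by `weight: 0.20` are exactly the kind of small contributions that low-precision accumulation can swallow.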