This model merges the instruction-tuned model with the base model by combining three merge methods: della, ties, and model_stock.
The goal is to counter the decline in instruction-following and mathematical ability that occurs when only the ties method or only the della method is used.
models:
  - model: Qwen/Qwen2.5-14B-instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-14B-della
models:
  - model: Qwen/Qwen2.5-14B-instruct
    parameters:
      density: 1
      weight: 1
merge_method: ties
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-14B-ties
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B-instruct
models:
  - model: Qwen2.5-14B-della
  - model: Qwen2.5-14B-ties
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: Qwen2.5-14B-it-restore
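The three configs above depend on each other: the model_stock stage takes the della and ties outputs as inputs, so those two must be built first. A minimal sketch of that ordering follows, assuming each config is saved to its own file and run with the `mergekit-yaml` command. The `configs/` directory, the file names, and the output directories are hypothetical, and whether `mergekit-yaml` accepts the `name:` key as written is not confirmed here (it may need to be removed from each per-stage file).

```python
import shlex

# Stage names taken from the `name:` fields of the three configs,
# listed in dependency order: della and ties first, model_stock last.
STAGES = [
    "Qwen2.5-14B-della",
    "Qwen2.5-14B-ties",
    "Qwen2.5-14B-it-restore",
]


def build_commands(stages, config_dir="configs"):
    """Return one mergekit-yaml command (as an argument list) per stage."""
    commands = []
    for name in stages:
        # Hypothetical layout: one YAML file per stage, named after the stage.
        config_path = f"{config_dir}/{name}.yml"
        # Writing each output to ./<name> lets the final stage resolve
        # `Qwen2.5-14B-della` and `Qwen2.5-14B-ties` as local paths.
        output_dir = f"./{name}"
        commands.append(["mergekit-yaml", config_path, output_dir])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_commands(STAGES):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch free of side effects; each printed line can be run by hand once the per-stage files exist.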