This model combines three merge methods (DELLA, TIES, and Model Stock) to merge the instruction model with the base model.

The aim is to avoid the decline in instruction-following and mathematical ability that occurs when only the TIES merge method or only the DELLA merge method is used.

models:
  - model: Qwen/Qwen2.5-14B-instruct
    parameters:
      density: 1 
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-14B-della
---
models:
  - model: Qwen/Qwen2.5-14B-instruct
    parameters:
      density: 1 
      weight: 1
merge_method: ties
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-14B-ties
---
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B-instruct
models:
  - model: Qwen2.5-14B-della
  - model: Qwen2.5-14B-ties
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: Qwen2.5-14B-it-restore
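
The three configurations above depend on one another: the model_stock step takes the outputs of the della and ties steps as its inputs, so those two must be built first. The following is a minimal sketch of that ordering using only the Python standard library. The `STEPS` table and the `merge_order` helper are hypothetical names introduced for illustration; they only restate the model names and methods from the configurations and do not call any merge tooling.

```python
from graphlib import TopologicalSorter

# Hypothetical summary of the configurations above:
# output name -> (merge method, input models).
STEPS = {
    "Qwen2.5-14B-della": (
        "della",
        ["Qwen/Qwen2.5-14B", "Qwen/Qwen2.5-14B-instruct"],
    ),
    "Qwen2.5-14B-ties": (
        "ties",
        ["Qwen/Qwen2.5-14B", "Qwen/Qwen2.5-14B-instruct"],
    ),
    "Qwen2.5-14B-it-restore": (
        "model_stock",
        ["Qwen/Qwen2.5-14B-instruct", "Qwen2.5-14B-della", "Qwen2.5-14B-ties"],
    ),
}


def merge_order(steps):
    """Return step names so that every intermediate input is built first."""
    # Only inputs that are themselves outputs of another step count as
    # dependencies; upstream models such as Qwen/Qwen2.5-14B are ignored.
    graph = {
        name: {src for src in inputs if src in steps}
        for name, (_method, inputs) in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in merge_order(STEPS):
        method, inputs = STEPS[name]
        print(f"{name}: {method} <- {', '.join(inputs)}")
```

The della and ties steps have no dependency on each other, so their relative order is not fixed; only the final model_stock step is guaranteed to come last.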