YOYO-AI/Qwen2.5-14B-it-restore

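This model merges the instruction-tuned model with the base model by combining three merge methods: della, ties, and model stock. The della and ties merges are produced first, and model stock then combines the two results.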

The aim is to counter the decline in instruction-following and mathematical ability that occurs when only the ties method or only the della method is used.

models:
  - model: Qwen/Qwen2.5-14B-instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-14B-della

models:
  - model: Qwen/Qwen2.5-14B-instruct
    parameters:
      density: 1
      weight: 1
merge_method: ties
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Qwen2.5-14B-ties

merge_method: model_stock
base_model: Qwen/Qwen2.5-14B-instruct
models:
  - model: Qwen2.5-14B-della
  - model: Qwen2.5-14B-ties
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: Qwen2.5-14B-it-restore
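
The three configurations above depend on each other: the model_stock stage refers to the outputs of the della and ties stages by name. As a minimal sketch of one way to run them in order, the script below builds one `mergekit-yaml <config> <output_dir>` command per stage. The file names `della.yml`, `ties.yml`, and `model_stock.yml` are hypothetical (each is assumed to hold one of the configs above), and the sketch assumes that writing each output to a directory named after its `name` field lets the final stage find `Qwen2.5-14B-della` and `Qwen2.5-14B-ties` as local paths. It is not necessarily how the author ran the merge.

```python
import subprocess
from pathlib import Path

# Hypothetical file names: each is assumed to hold one of the three configs above.
# The second item is the output directory, chosen to match the config's `name` field.
STAGES = [
    ("della.yml", "Qwen2.5-14B-della"),
    ("ties.yml", "Qwen2.5-14B-ties"),
    ("model_stock.yml", "Qwen2.5-14B-it-restore"),
]


def build_commands(config_dir=".", out_root="."):
    """Return one `mergekit-yaml <config> <output_dir>` command per stage, in order."""
    commands = []
    for config_name, output_name in STAGES:
        config_path = Path(config_dir) / config_name
        output_path = Path(out_root) / output_name
        commands.append(["mergekit-yaml", str(config_path), str(output_path)])
    return commands


def run_stages(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline if a stage fails, so the
            # model_stock stage never runs on missing inputs.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_stages(build_commands(), dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to check the stage order before starting a merge of this size. Setting `dry_run=False` requires mergekit to be installed and enough disk space for the two intermediate models.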
