YOYO-AI
/

Qwen2.5-32B-YOYO-MIX

Text Generation

text-generation-inference

Model card Files Files and versions

Qwen2.5-32B-YOYO-MIX / README.md

YOYO-AI's picture

Update README.md

7fbf1bc verified 6 months ago

|

history blame contribute delete

1.58 kB

	---
	base_model:
	- Qwen/Qwen2.5-32B
	- Qwen/Qwen2.5-32B-Instruct
	- Qwen/Qwen2.5-Coder-32B
	- Qwen/Qwen2.5-Coder-32B-Instruct
	library_name: transformers
	tags:
	- mergekit
	- merge
	license: apache-2.0
	language:
	- en
	- zh
	pipeline_tag: text-generation
	---
	![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64e174e202fa032de4143324/9sSGARM4_J0ZUbm8jSQkD.jpeg)
	This series aims to unify the official models* of Qwen.*

	The unified model* obtained by merging the code model and the instruction model through the SCE method*
	### Configuration

	The following YAML configuration was used to produce this model:
	```yaml
	models:
	- model: Qwen/Qwen2.5-32B-instruct
	parameters:
	density: 1
	weight: 1
	lambda: 0.9
	merge_method: della
	base_model: Qwen/Qwen2.5-32B
	parameters:
	density: 1
	weight: 1
	lambda: 0.9
	normalize: true
	int8_mask: true
	dtype: bfloat16
	name: Qwen2.5-32B-YOYO
	```
	```yaml
	models:
	- model: Qwen/Qwen2.5-Coder-32B-instruct
	parameters:
	density: 1
	weight: 1
	lambda: 0.9
	merge_method: della
	base_model: Qwen/Qwen2.5-Coder-32B
	parameters:
	density: 1
	weight: 1
	lambda: 0.9
	normalize: true
	int8_mask: true
	dtype: bfloat16
	name: Qwen2.5-Coder-32B-YOYO
	```
	```yaml
	merge_method: sce
	models:
	# Pivot model
	- model: Qwen/Qwen2.5-Coder-32B
	# Target models
	- model: YOYO-AI/Qwen2.5-32B-YOYO
	- model: YOYO-AI/Qwen2.5-Coder-32B-YOYO
	base_model: Qwen/Qwen2.5-Coder-32B
	parameters:
	select_topk: 1
	dtype: bfloat16
	tokenizer_source: base
	normalize: true
	int8_mask: true
	```