base_model:
- ValiantLabs/Qwen3-8B-ShiningValiant3
- ValiantLabs/Qwen3-8B-Esper3
- Qwen/Qwen3-8B
library_name: transformers
tags:
- mergekit
- merge
- qwen
- qwen-3
- qwen-3-8b
- 8b
- reasoning
- code
- code-reasoning
- code-instruct
- python
- javascript
- dev-ops
- jenkins
- terraform
- scripting
- powershell
- azure
- aws
- gcp
- cloud
- science
- science-reasoning
- physics
- biology
- chemistry
- earth-science
- astronomy
- machine-learning
- artificial-intelligence
- compsci
- computer-science
- information-theory
- ML-Ops
- math
- cuda
- deep-learning
- transformers
- agentic
- LLM
- neuromorphic
- self-improvement
- complex-systems
- cognition
- linguistics
- philosophy
- logic
- epistemology
- simulation
- game-theory
- knowledge-management
- creativity
- problem-solving
- architect
- engineer
- developer
- creative
- analytical
- expert
- rationality
- conversational
- chat
- instruct
datasets:
- sequelbox/Celestia3-DeepSeek-R1-0528
- sequelbox/Mitakihara-DeepSeek-R1-0528
- sequelbox/Titanium2.1-DeepSeek-R1
- sequelbox/Tachibana2-DeepSeek-R1
- sequelbox/Raiden-DeepSeek-R1
PlumEsper
This is a merge of pre-trained language models created using mergekit. It combines the specialty and general reasoning skills of Esper 3 8B and Shining Valiant 3 8B.
Merge Details
Merge Method
This model was merged with the DELLA merge method, using Qwen/Qwen3-8B as the base.
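For readers unfamiliar with the method: DELLA works on each fine-tuned model's delta from the base, randomly dropping entries and rescaling the survivors before fusing. The toy sketch below illustrates only that drop-and-rescale step on plain Python lists. It is a simplification, not mergekit's implementation: it uses one uniform keep probability for every entry (closer to DARE), whereas DELLA ranks entries by magnitude and varies the probability. The function name `drop_and_rescale` and all numbers are hypothetical.

```python
import random


def drop_and_rescale(delta, density, seed=0):
    """Keep each delta entry with probability `density`, rescale kept entries.

    Simplifying assumption: every entry shares the same keep probability.
    DELLA proper assigns keep probabilities by magnitude rank.
    """
    rng = random.Random(seed)
    # Dividing survivors by `density` keeps the expected value of each entry unchanged.
    return [d / density if rng.random() < density else 0.0 for d in delta]


# Toy "delta": fine-tuned weights minus base weights.
base = [1.0, 2.0, 3.0, 4.0]
finetuned = [1.5, 1.0, 3.0, 5.0]
delta = [f - b for f, b in zip(finetuned, base)]

pruned = drop_and_rescale(delta, density=0.5)
```

With `density: 0.5`, as in the configuration used for this model, roughly half of the entries survive, and each survivor is doubled.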
Models Merged
The following models were included in the merge:
- ValiantLabs/Qwen3-8B-Esper3
- ValiantLabs/Qwen3-8B-ShiningValiant3
Configuration
The following YAML configuration was used to produce this model:
merge_method: della
dtype: bfloat16
parameters:
  normalize: true
models:
  - model: ValiantLabs/Qwen3-8B-Esper3
    parameters:
      density: 0.5
      weight: 0.3
  - model: ValiantLabs/Qwen3-8B-ShiningValiant3
    parameters:
      density: 0.5
      weight: 0.3
base_model: Qwen/Qwen3-8B
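As a rough illustration of how `normalize: true` interacts with the two `weight: 0.3` entries, the sketch below assumes that normalization divides the weighted sum of deltas by the total weight, so two equal weights act like 0.5 each. It is a dense, simplified model of the fusion step that ignores DELLA's pruning and sign election. The helper `merge_deltas` and the toy numbers are hypothetical and not part of mergekit.

```python
def merge_deltas(base, deltas, weights, normalize=True):
    """Add a weighted sum of per-model deltas to the base weights.

    Assumption: with normalize=True the weighted sum is divided by the
    sum of the weights; with normalize=False it is used as-is.
    """
    total = sum(weights) if normalize else 1.0
    merged = []
    for i, b in enumerate(base):
        mixed = sum(w * d[i] for w, d in zip(weights, deltas))
        merged.append(b + mixed / total)
    return merged


base = [1.0, 2.0]
esper_delta = [0.5, -1.0]    # toy stand-in for Esper3 minus base
valiant_delta = [-0.5, 2.0]  # toy stand-in for ShiningValiant3 minus base

merged = merge_deltas(base, [esper_delta, valiant_delta], weights=[0.3, 0.3])
unnormalized = merge_deltas(
    base, [esper_delta, valiant_delta], weights=[0.3, 0.3], normalize=False
)
```

Under this assumption the normalized result is the plain average of the two deltas added to the base, while the unnormalized result applies only 30% of each delta.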