Upload folder using huggingface_hub
- README.md +16 -3
- config.json +1 -1
README.md
CHANGED
|
@@ -15,6 +15,19 @@ datasets:
|
|
| 15 |
metrics:
|
| 16 |
- pass@1
|
| 17 |
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
|
| 18 |
---
|
| 19 |
|
| 20 |
# Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
|
|
@@ -26,7 +39,7 @@ This is an enhanced version of Qwen2.5-Coder-7B-Instruct that integrates **Contr
|
|
| 26 |
### Key Features
|
| 27 |
|
| 28 |
- 🚀 **Automatic CAA Steering**: No manual hook management required
|
| 29 |
-
- 🎯 **Optimized Parameters**: Layer 24, α=
|
| 30 |
- 🗂️ **Trait-Based Organization**: Steering vectors organized by traits
|
| 31 |
- 🔧 **Runtime Configurable**: Adjust or disable steering on the fly
|
| 32 |
- 🤗 **HuggingFace Compatible**: Works with standard transformers API
|
|
@@ -131,7 +144,7 @@ To switch traits, simply update the configuration:
|
|
| 131 |
|
| 132 |
- **Steering Method**: Contrastive Activation Addition (CAA)
|
| 133 |
- **Optimal Layer**: 24 (out of 28 transformer layers)
|
| 134 |
-
- **Steering Strength (α)**:
|
| 135 |
- **Vector Format**: Safetensors format for efficient loading and HuggingFace compatibility
|
| 136 |
- **Vector Dimension**: 3584 (pre-normalized during training)
|
| 137 |
- **Storage Path**: `./vectors/mbpp_plus/steering_vector.safetensors`
|
|
@@ -151,7 +164,7 @@ The CAA parameters were optimized using:
|
|
| 151 |
- **Framework**: Optuna with TPE sampler
|
| 152 |
- **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
|
| 153 |
- **Objective**: Maximize accuracy on MBPP Plus validation set
|
| 154 |
-
- **
|
| 155 |
|
| 156 |
## Model Architecture
|
| 157 |
|
|
|
|
| 15 |
metrics:
|
| 16 |
- pass@1
|
| 17 |
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
|
| 18 |
+
model-index:
|
| 19 |
+
- name: wisent-ai/qwen2.5-coder-7b-wisent-caa
|
| 20 |
+
results:
|
| 21 |
+
- task:
|
| 22 |
+
type: code-generation
|
| 23 |
+
name: Code Generation
|
| 24 |
+
dataset:
|
| 25 |
+
type: mbppplus
|
| 26 |
+
name: MBPP Plus
|
| 27 |
+
metrics:
|
| 28 |
+
- type: pass@1
|
| 29 |
+
value: 0.521
|
| 30 |
+
name: Pass@1
|
| 31 |
---
|
| 32 |
|
| 33 |
# Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
|
|
|
|
| 39 |
### Key Features
|
| 40 |
|
| 41 |
- 🚀 **Automatic CAA Steering**: No manual hook management required
|
| 42 |
+
- 🎯 **Optimized Parameters**: Layer 24, α=1.4
|
| 43 |
- 🗂️ **Trait-Based Organization**: Steering vectors organized by traits
|
| 44 |
- 🔧 **Runtime Configurable**: Adjust or disable steering on the fly
|
| 45 |
- 🤗 **HuggingFace Compatible**: Works with standard transformers API
|
|
|
|
| 144 |
|
| 145 |
- **Steering Method**: Contrastive Activation Addition (CAA)
|
| 146 |
- **Optimal Layer**: 24 (out of 28 transformer layers)
|
| 147 |
+
- **Steering Strength (α)**: 1.4
|
| 148 |
- **Vector Format**: Safetensors format for efficient loading and HuggingFace compatibility
|
| 149 |
- **Vector Dimension**: 3584 (pre-normalized during training)
|
| 150 |
- **Storage Path**: `./vectors/mbpp_plus/steering_vector.safetensors`
|
|
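Reviewer note: the hunk above fixes the steering strength at α=1.4 for layer 24. As a rough illustration of what Contrastive Activation Addition does with those two numbers, the sketch below adds `alpha * vector` to each token's hidden state. It is a toy, standard-library version under stated assumptions: `apply_caa` is a hypothetical helper name, the 4-dimensional vectors stand in for the 3584-dimensional one, and steering every token position is an assumption this diff does not confirm. It is not the repository's implementation.

```python
from typing import List


def apply_caa(
    hidden: List[List[float]], vector: List[float], alpha: float
) -> List[List[float]]:
    """Return hidden states shifted by alpha * vector (one row per token).

    Hypothetical helper for illustration; the real model presumably does
    this inside the forward pass of the chosen layer.
    """
    if any(len(row) != len(vector) for row in hidden):
        raise ValueError("hidden size and steering vector size differ")
    return [[h + alpha * v for h, v in zip(row, vector)] for row in hidden]


# Toy data: 2 tokens, hidden size 4 (the README states 3584 for the real vector).
hidden_states = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
steering_vector = [0.5, -0.5, 0.0, 1.0]

# alpha=1.4 is the value this commit writes into the README and config.json.
steered = apply_caa(hidden_states, steering_vector, alpha=1.4)
```

Setting `alpha=0.0` returns the hidden states unchanged, which is one way to read the README's "adjust or disable steering on the fly".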
|
|
| 164 |
- **Framework**: Optuna with TPE sampler
|
| 165 |
- **Search Space**: Layers 15-28, Ξ± β [0.1, 5.0]
|
| 166 |
- **Objective**: Maximize accuracy on MBPP Plus validation set
|
| 167 |
+
- **Best Performance**: 52.1% accuracy on MBPP Plus (378 problems)
|
| 168 |
|
| 169 |
## Model Architecture
|
| 170 |
|
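Reviewer note: the README diff above reports both `value: 0.521` in the model-index block and "52.1% accuracy on MBPP Plus (378 problems)". A quick consistency check, assuming pass@1 here is simply solved problems divided by total problems (the diff does not spell this out):

```python
total_problems = 378  # from the README line "MBPP Plus (378 problems)"
pass_at_1 = 0.521     # from the model-index metadata added in this commit

# Nearest whole number of solved problems implied by the reported rate.
solved = round(pass_at_1 * total_problems)

# Recompute the rate from that count to see whether it rounds back to 52.1%.
recomputed = solved / total_problems
print(solved, f"{recomputed:.1%}")  # prints: 197 52.1%
```

Under that assumption the two figures agree: 197 of 378 is the only whole count that rounds to 52.1%.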
config.json
CHANGED
|
@@ -116,7 +116,7 @@
|
|
| 116 |
},
|
| 117 |
"caa_enabled": true,
|
| 118 |
"caa_layer_id": 24,
|
| 119 |
-
"caa_alpha":
|
| 120 |
"steering_method": "caa",
|
| 121 |
"wisent_optimization": {
|
| 122 |
"best_value": 0.64,
|
|
|
|
| 116 |
},
|
| 117 |
"caa_enabled": true,
|
| 118 |
"caa_layer_id": 24,
|
| 119 |
+
"caa_alpha": 1.4,
|
| 120 |
"steering_method": "caa",
|
| 121 |
"wisent_optimization": {
|
| 122 |
"best_value": 0.64,
|
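Reviewer note: the config.json hunk sets `caa_alpha` to 1.4 alongside `caa_enabled` and `caa_layer_id`. The sketch below shows one way a loader might read those keys. The JSON fragment contains only the keys visible in this diff (the real file has more), `steering_settings` is a hypothetical helper, and treating a missing `caa_enabled` as "disabled" is an assumption rather than documented behavior.

```python
import json
from typing import Optional

# Fragment built from the keys shown in the diff; not the full config.json.
config_text = """
{
  "caa_enabled": true,
  "caa_layer_id": 24,
  "caa_alpha": 1.4,
  "steering_method": "caa",
  "wisent_optimization": {"best_value": 0.64}
}
"""


def steering_settings(cfg: dict) -> Optional[dict]:
    """Return layer and alpha if steering is enabled, else None (hypothetical helper)."""
    if not cfg.get("caa_enabled", False):
        return None
    # A KeyError here means steering is enabled but a required key is absent.
    return {"layer": int(cfg["caa_layer_id"]), "alpha": float(cfg["caa_alpha"])}


config = json.loads(config_text)
settings = steering_settings(config)
```

Note that `wisent_optimization.best_value` (0.64) is a different number from the 0.521 pass@1 in the README; this diff does not say which dataset split or metric it refers to.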