Commit: Upload folder using huggingface_hub
Browse files
Files changed:
- README.md: +3 −16
- modeling_wisent_qwen.py: +6 −8
README.md
CHANGED
|
@@ -11,23 +11,10 @@ tags:
|
|
| 11 |
- wisent
|
| 12 |
library_name: transformers
|
| 13 |
datasets:
|
| 14 |
-
-
|
| 15 |
metrics:
|
| 16 |
- pass@1
|
| 17 |
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
|
| 18 |
-
model-index:
|
| 19 |
-
- name: wisent-ai/qwen2.5-coder-7b-wisent-caa
|
| 20 |
-
results:
|
| 21 |
-
- task:
|
| 22 |
-
type: code-generation
|
| 23 |
-
name: Code Generation
|
| 24 |
-
dataset:
|
| 25 |
-
type: mbpp
|
| 26 |
-
name: MBPP Plus
|
| 27 |
-
metrics:
|
| 28 |
-
- type: pass@1
|
| 29 |
-
value: 0.67
|
| 30 |
-
name: Pass@1
|
| 31 |
---
|
| 32 |
|
| 33 |
# Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
|
|
@@ -164,7 +151,7 @@ The CAA parameters were optimized using:
|
|
| 164 |
- **Framework**: Optuna with TPE sampler
|
| 165 |
- **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
|
| 166 |
- **Objective**: Maximize accuracy on MBPP Plus validation set
|
| 167 |
-
- **
|
| 168 |
|
| 169 |
## Model Architecture
|
| 170 |
|
|
@@ -193,7 +180,7 @@ huggingface_qwen25-7b-coder-caa/
|
|
| 193 |
|
| 194 |
### MBPP Plus Benchmark
|
| 195 |
|
| 196 |
-
The model should be
|
| 197 |
|
| 198 |
### Running Evaluation
|
| 199 |
|
|
|
|
| 11 |
- wisent
|
| 12 |
library_name: transformers
|
| 13 |
datasets:
|
| 14 |
+
- evalplus/mbppplus
|
| 15 |
metrics:
|
| 16 |
- pass@1
|
| 17 |
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 18 |
---
|
| 19 |
|
| 20 |
# Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
|
|
|
|
| 151 |
- **Framework**: Optuna with TPE sampler
|
| 152 |
- **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
|
| 153 |
- **Objective**: Maximize accuracy on MBPP Plus validation set
|
| 154 |
+
- **Validation Results**: Optimized for improved performance on MBPP Plus tasks
|
| 155 |
|
| 156 |
## Model Architecture
|
| 157 |
|
|
|
|
| 180 |
|
| 181 |
### MBPP Plus Benchmark
|
| 182 |
|
| 183 |
+
The model has been optimized using Optuna on MBPP Plus tasks. For reliable performance metrics, evaluation should be conducted on the complete MBPP Plus dataset (378 problems) using the [evalplus/mbppplus](https://huggingface.co/datasets/evalplus/mbppplus) dataset.
|
| 184 |
|
| 185 |
### Running Evaluation
|
| 186 |
|
modeling_wisent_qwen.py
CHANGED
|
@@ -5,12 +5,11 @@ This model automatically applies CAA steering during generation without requirin
|
|
| 5 |
The steering parameters are optimized using Optuna and stored in the model configuration.
|
| 6 |
"""
|
| 7 |
|
| 8 |
-
from typing import Optional, Tuple, Union
|
|
|
|
| 9 |
import torch
|
| 10 |
-
import
|
| 11 |
-
from transformers import Qwen2ForCausalLM, Qwen2Config
|
| 12 |
from transformers.modeling_outputs import CausalLMOutputWithPast
|
| 13 |
-
from transformers.cache_utils import Cache
|
| 14 |
|
| 15 |
|
| 16 |
class WisentQwen2Config(Qwen2Config):
|
|
@@ -150,8 +149,7 @@ class WisentQwen2ForCausalLM(Qwen2ForCausalLM):
|
|
| 150 |
# Return modified output
|
| 151 |
if isinstance(output, tuple):
|
| 152 |
return (hidden_states,) + output[1:]
|
| 153 |
-
|
| 154 |
-
return hidden_states
|
| 155 |
|
| 156 |
def forward(
|
| 157 |
self,
|
|
@@ -254,7 +252,7 @@ class WisentQwen2ForCausalLM(Qwen2ForCausalLM):
|
|
| 254 |
|
| 255 |
if not has_weights and local_path.exists() and (local_path / "config.json").exists():
|
| 256 |
# We have config but no weights - load from base model
|
| 257 |
-
print(
|
| 258 |
|
| 259 |
# First, load config from local path
|
| 260 |
from transformers import AutoConfig
|
|
@@ -301,7 +299,7 @@ class WisentQwen2ForCausalLM(Qwen2ForCausalLM):
|
|
| 301 |
|
| 302 |
|
| 303 |
# Register the model
|
| 304 |
-
from transformers import
|
| 305 |
|
| 306 |
AutoConfig.register("wisent_qwen2", WisentQwen2Config)
|
| 307 |
AutoModelForCausalLM.register(WisentQwen2Config, WisentQwen2ForCausalLM)
|
|
|
|
| 5 |
The steering parameters are optimized using Optuna and stored in the model configuration.
|
| 6 |
"""
|
| 7 |
|
| 8 |
+
from typing import List, Optional, Tuple, Union
|
| 9 |
+
|
| 10 |
import torch
|
| 11 |
+
from transformers import Qwen2Config, Qwen2ForCausalLM
|
|
|
|
| 12 |
from transformers.modeling_outputs import CausalLMOutputWithPast
|
|
|
|
| 13 |
|
| 14 |
|
| 15 |
class WisentQwen2Config(Qwen2Config):
|
|
|
|
| 149 |
# Return modified output
|
| 150 |
if isinstance(output, tuple):
|
| 151 |
return (hidden_states,) + output[1:]
|
| 152 |
+
return hidden_states
|
|
|
|
| 153 |
|
| 154 |
def forward(
|
| 155 |
self,
|
|
|
|
| 252 |
|
| 253 |
if not has_weights and local_path.exists() and (local_path / "config.json").exists():
|
| 254 |
# We have config but no weights - load from base model
|
| 255 |
+
print("Loading weights from base model: Qwen/Qwen2.5-Coder-7B-Instruct")
|
| 256 |
|
| 257 |
# First, load config from local path
|
| 258 |
from transformers import AutoConfig
|
|
|
|
| 299 |
|
| 300 |
|
| 301 |
# Register the model
|
| 302 |
+
from transformers import AutoConfig, AutoModelForCausalLM
|
| 303 |
|
| 304 |
AutoConfig.register("wisent_qwen2", WisentQwen2Config)
|
| 305 |
AutoModelForCausalLM.register(WisentQwen2Config, WisentQwen2ForCausalLM)
|