jfpio's picture
Upload folder using huggingface_hub
481a266 verified
|
raw
history blame
6.88 kB
---
language: en
license: apache-2.0
tags:
- code-generation
- causal-lm
- steering
- contrastive-activation-addition
- caa
- qwen
- wisent
library_name: transformers
datasets:
- mbpp
metrics:
- pass@1
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
model-index:
- name: wisent-ai/qwen2.5-coder-7b-wisent-caa
results:
- task:
type: code-generation
name: Code Generation
dataset:
type: mbpp
name: MBPP Plus
metrics:
- type: pass@1
value: 0.67
name: Pass@1
---
# Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
## Model Description
This is an enhanced version of Qwen2.5-Coder-7B-Instruct that integrates **Contrastive Activation Addition (CAA)** steering directly into the model architecture. The steering parameters have been optimized using Optuna to improve code generation quality on the MBPP Plus benchmark.
### Key Features
- πŸš€ **Automatic CAA Steering**: No manual hook management required
- 🎯 **Optimized Parameters**: Layer 24, α=0.9
- πŸ—‚οΈ **Trait-Based Organization**: Steering vectors organized by traits
- πŸ”§ **Runtime Configurable**: Adjust or disable steering on the fly
- πŸ€— **HuggingFace Compatible**: Works with standard transformers API
## Installation
```bash
pip install transformers torch safetensors
# Or install from requirements.txt
pip install -r requirements.txt
```
## Hardware Requirements
### Minimum Requirements:
- **GPU Memory**: 16GB VRAM (for inference with bfloat16)
- **System RAM**: 32GB recommended
- **Storage**: 15MB (model configuration + steering vectors)
### Recommended Setup:
- **GPU**: NVIDIA RTX 4090, A100, or similar
- **CUDA**: 11.8 or newer
- **Python**: 3.8-3.11
### Performance Notes:
- Model automatically loads base Qwen2.5-Coder weights (7B parameters)
- CAA steering adds minimal computational overhead (~1-2% inference time)
- Supports CPU inference but GPU recommended for practical use
- Memory usage: ~14GB GPU memory for bfloat16 inference
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model - CAA steering is automatically applied!
model = AutoModelForCausalLM.from_pretrained("./huggingface_qwen_generated", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("./huggingface_qwen_generated")
# Generate code
prompt = "Write a Python function to calculate the factorial of a number"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.2)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Advanced Usage
### Adjusting Steering Strength
```python
# Increase steering strength for stronger safety alignment
model.set_caa_alpha(1.2)
# Decrease for more creative outputs
model.set_caa_alpha(0.5)
```
### Disabling CAA Steering
```python
# Disable CAA to get baseline model behavior
model.set_caa_enabled(False)
# Re-enable CAA
model.set_caa_enabled(True)
```
### Accessing Steering Configuration
```python
print(f"CAA Layer: {model.caa_layer_id}")
print(f"CAA Alpha: {model.caa_alpha}")
print(f"Steering Method: {model.steering_method}")
```
### Trait-Based Vector Organization
The model uses a trait-based organization for steering vectors:
```
vectors/
β”œβ”€β”€ coding/ # Current: Optimized for code generation
β”œβ”€β”€ safety/ # Future: Safety-aligned behavior
β”œβ”€β”€ creativity/ # Future: Enhanced creative outputs
β”œβ”€β”€ helpfulness/ # Future: Improved helpfulness
└── reasoning/ # Future: Enhanced logical reasoning
```
To switch traits, simply update the configuration:
```json
{
"steering_vector_path": "./vectors/safety/steering_vector.safetensors"
}
```
## Technical Details
### CAA Steering Parameters
- **Steering Method**: Contrastive Activation Addition (CAA)
- **Optimal Layer**: 24 (out of 28 transformer layers)
- **Steering Strength (Ξ±)**: 0.9
- **Vector Format**: Safetensors format for efficient loading and HuggingFace compatibility
- **Vector Dimension**: 3584 (pre-normalized during training)
- **Storage Path**: `./vectors/coding/steering_vector.safetensors`
### How It Works
1. **Trait-based Organization**: Steering vectors are organized by behavioral traits (`vectors/{trait}/`)
2. **Dynamic Loading**: The model loads the specified steering vector from the configured path
3. **Layer Application**: Steering is applied to hidden states at layer 24 during forward pass
4. **Generation Integration**: Steering affects the last token position during generation
5. **Configurable Strength**: The Ξ± parameter (default: 0.9) controls steering intensity
6. **Pre-optimized Vectors**: Steering vectors are pre-normalized and ready for immediate use
### Optimization Process
The CAA parameters were optimized using:
- **Framework**: Optuna with TPE sampler
- **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
- **Objective**: Maximize accuracy on MBPP Plus validation set
- **Best Validation Score**: 64% accuracy
## Model Architecture
```
WisentQwen2ForCausalLM
β”œβ”€β”€ Base: Qwen2.5-Coder-7B-Instruct
β”œβ”€β”€ CAA Integration: Layer 24
β”œβ”€β”€ Steering Vector: ./vectors/coding/steering_vector.safetensors
└── Auto-applied during generation
```
## File Structure
```
huggingface_qwen_generated/
β”œβ”€β”€ config.json # Model configuration with CAA params
β”œβ”€β”€ modeling_wisent_qwen.py # Custom model class
β”œβ”€β”€ tokenizer files # Standard Qwen tokenizer
β”œβ”€β”€ wisent_config.json # Optimization results
└── vectors/ # Trait-based steering vectors
└── coding/
└── steering_vector.safetensors # Optimized coding steering vector
```
## Evaluation
### MBPP Plus Benchmark
The model should be evaluated on the complete MBPP Plus dataset (378 problems) to measure improvement over the baseline. Expected improvements based on validation results.
### Running Evaluation
```python
# Use with bigcode-evaluation-harness
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
"./huggingface_qwen_generated",
trust_remote_code=True
)
# CAA steering is automatically applied during evaluation!
# No manual hooks or modifications needed
```
## Citation
If you use this model, please cite:
```bibtex
@software{wisent_qwen_caa_2025,
title={Wisent-Qwen2.5-Coder with CAA Steering},
author={Wisent AI},
year={2025},
url={https://github.com/wisent-ai/wisent-guard}
}
```
## License
This model inherits the license from the base Qwen2.5-Coder-7B-Instruct model. Please refer to the original model's license for usage terms.
## Acknowledgments
- Base model: Qwen2.5-Coder-7B-Instruct by Alibaba
- CAA method: Contrastive Activation Addition
- Optimization: Optuna framework
- Implementation: Wisent Guard framework