# Combined Task Vector Model

This model was created by combining task vectors computed from three fine-tuned variants of `google/gemma-2b-it`.

## Task Vector Computation

```python
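# Each task vector is built from the base model and one fine-tuned checkpoint.
t_1 = TaskVector("google/gemma-2b-it", "coastalcph/gemma-2b-it-gcd_sycophancy_2e-04")
t_2 = TaskVector("google/gemma-2b-it", "coastalcph/gemma-2b-it-personality-non-sycophancy")
t_3 = TaskVector("google/gemma-2b-it", "coastalcph/gemma-2b-it-personality-sycophancy")
# Weighted combination (scale_t1=1.0, scale_t2=1.5, scale_t3=1.5).
t_combined = 1.0 * t_1 + 1.5 * t_2 - 1.5 * t_3
new_model = t_combined.apply_to("google/gemma-2b-it", scaling_coef=1.0)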
```
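The `TaskVector` class used above belongs to the creation script and is not reproduced in this card. As a rough illustration of the arithmetic only, the sketch below assumes a task vector is the per-parameter difference between fine-tuned and base weights. `ToyTaskVector`, the parameter names, and all numeric values are hypothetical, and the sketch does not model the `combine_diff_projecting_out` option listed in the args.

```python
class ToyTaskVector:
    """Hypothetical stand-in: stores (finetuned - base) for each named parameter."""

    def __init__(self, base=None, finetuned=None, delta=None):
        if delta is not None:
            self.delta = dict(delta)
        else:
            self.delta = {k: finetuned[k] - base[k] for k in base}

    def __add__(self, other):
        return ToyTaskVector(delta={k: v + other.delta[k] for k, v in self.delta.items()})

    def __sub__(self, other):
        return ToyTaskVector(delta={k: v - other.delta[k] for k, v in self.delta.items()})

    def __rmul__(self, scale):
        # Supports `scale * vector`, as in the combination formula.
        return ToyTaskVector(delta={k: scale * v for k, v in self.delta.items()})

    def apply_to(self, base, scaling_coef=1.0):
        # Add the (scaled) task vector back onto the base weights.
        return {k: base[k] + scaling_coef * self.delta[k] for k in base}


# Toy "weights": one scalar per parameter name (made-up values).
base = {"w": 1.0, "b": 0.0}
ft_1 = {"w": 1.5, "b": 0.25}
ft_2 = {"w": 2.0, "b": -0.5}
ft_3 = {"w": 0.5, "b": 0.5}

t_1 = ToyTaskVector(base, ft_1)
t_2 = ToyTaskVector(base, ft_2)
t_3 = ToyTaskVector(base, ft_3)

t_combined = 1.0 * t_1 + 1.5 * t_2 - 1.5 * t_3
new_weights = t_combined.apply_to(base, scaling_coef=1.0)
print(new_weights)
```

In a real model each entry would be a tensor from the checkpoint's state dict rather than a scalar, but the element-wise arithmetic has the same shape.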

## Models Used

- Base Model: https://huggingface.co/google/gemma-2b-it
- Fine-tuned Model 1: https://huggingface.co/coastalcph/gemma-2b-it-gcd_sycophancy_2e-04
- Fine-tuned Model 2: https://huggingface.co/coastalcph/gemma-2b-it-personality-non-sycophancy
- Fine-tuned Model 3: https://huggingface.co/coastalcph/gemma-2b-it-personality-sycophancy

## Technical Details

- Creation Script Git Hash: d0db42d73be516ec04f0ecdc8003189e98b5f722
- Task Vector Method: Additive combination
- Args: {
  "pretrained_model": "google/gemma-2b-it",
  "finetuned_model1": "coastalcph/gemma-2b-it-gcd_sycophancy_2e-04",
  "finetuned_model2": "coastalcph/gemma-2b-it-personality-non-sycophancy",
  "finetuned_model3": "coastalcph/gemma-2b-it-personality-sycophancy",
  "output_model_name": "coastalcph/gemma-2b-it-1t_gcd_sycophancy_pout_1.5t_diff_sycophant",
  "output_dir": "/projects/nlp/data/constanzam/weight-interp/task-vectors/math_non_sycophant_12Aug",
  "scaling_coef": 1.0,
  "apply_line_scaling_t1": false,
  "apply_line_scaling_t2": false,
  "apply_line_scaling_t3": false,
  "combine_diff_projecting_out": true,
  "scale_t1": 1.0,
  "scale_t2": 1.5,
  "scale_t3": 1.5
}