---
license: mit
---

# SWE-Swiss-32B-SFT

*[Image: SWE-Swiss Performance Chart]*

*Figure 1: Performance and model size comparison on SWE-bench Verified. Our 32B model achieves a top-tier score of 60.2%.*

**SWE-Swiss-32B-SFT** is a 32B-parameter model fine-tuned for high-performance software issue resolution. It is the product of multi-task fine-tuning. This repository contains the model weights for `SWE-Swiss-32B-SFT`. The model was developed by researchers from Peking University, ByteDance Seed, and The University of Hong Kong.

- **Base Model:** [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct)
- **Blog:** [SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution](https://www.notion.so/SWE-Swiss-A-Multi-Task-Fine-Tuning-and-RL-Recipe-for-High-Performance-Issue-Resolution-21e174dedd4880ea829ed4c861c44f88)
- **GitHub:** https://github.com/zhenyuhe00/SWE-Swiss

## Model Description

`SWE-Swiss-32B` is trained to be a versatile software engineering model. The core idea behind its training recipe is the explicit modeling of three key skills:

* **Localization:** Pinpointing the exact files that need to be modified.
* **Repair:** Generating the correct code patch to resolve an issue.
* **Unit Test Generation:** Creating new tests to validate the proposed fix.

The model demonstrates how a principled training methodology can enable smaller models to reach performance levels previously seen only in much larger models, highlighting a path toward more efficient and accessible AI for software engineering.

## How to Get Started

### Transformers

You can use the `transformers` library to load and run `SWE-Swiss-32B-SFT`:

```python
import torch
import torch.nn as nn
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

# We set the o_proj bias in the attention module to True to be compatible
# with our code base. The patch must be applied before loading the model so
# the o_proj bias parameters are created.
def apply_qwen2_bias_patch():
    Qwen2Attention = transformers.models.qwen2.modeling_qwen2.Qwen2Attention
    original_qwen2_attention_init = Qwen2Attention.__init__

    def patched_qwen2_attention_init(self, config, layer_idx):
        original_qwen2_attention_init(self, config, layer_idx)
        self.o_proj = nn.Linear(
            config.num_attention_heads * self.head_dim,
            config.hidden_size,
            bias=True,
        )

    Qwen2Attention.__init__ = patched_qwen2_attention_init

apply_qwen2_bias_patch()

model_id = "SWE-Swiss/SWE-Swiss-32B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```
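Once the model is loaded, you can run a quick generation check. The snippet below is a minimal sketch, not part of the official recipe: the prompt is a placeholder (real use would feed the model localization, repair, or test-generation prompts), and `max_new_tokens=512` is an arbitrary illustrative choice. The sampling settings mirror the vLLM example further down.

```python
# Minimal generation sketch (illustrative prompt; assumed settings noted above).
messages = [
    {"role": "user", "content": "How are you?"}
]

# Build the chat-formatted input ids and move them to the model's device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        inputs,
        max_new_tokens=512,  # arbitrary choice for illustration
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```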
### vLLM

You can also use the [`vLLM`](https://github.com/vllm-project/vllm) library to load and run `SWE-Swiss-32B-SFT`. First, clone the vLLM repository:

```
git clone https://github.com/vllm-project/vllm
cd vllm
git checkout v0.8.4  # or another version compatible with Qwen2
```

Then, change the [o_proj bias in the attention module](https://github.com/vllm-project/vllm/blob/v0.8.4/vllm/model_executor/models/qwen2.py#L148) to `True` and install vLLM:

```
# Please remember to change "bias=False" to "bias=True" before installing vLLM.
pip3 install -e .
```

Finally, use vLLM as usual:

```python
from vllm import LLM, SamplingParams

prompts = [
    "How are you?",
]
sampling_params = SamplingParams(temperature=0.6, top_p=0.95)

llm = LLM(
    model="SWE-Swiss/SWE-Swiss-32B-SFT",
    tensor_parallel_size=8,
    max_model_len=102400,
)

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```

## Citation

```bibtex
@misc{SWESwiss2025,
  title  = {SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution},
  url    = {https://www.notion.so/SWE-Swiss-A-Multi-Task-Fine-Tuning-and-RL-Recipe-for-High-Performance-Issue-Resolution-21e174dedd4880ea829ed4c861c44f88},
  author = {He, Zhenyu and Yang, Qingping and Sheng, Wei and Zhong, Xiaojian and Zhang, Kechi and An, Chenxin and Shi, Wenlei and Cai, Tianle and He, Di and Chen, Jiaze and Xu, Jingjing and Wang, Mingxuan},
  year   = {2025}
}
```