# R2E-TestgenAgent

## Overview

R2E-TestgenAgent is an execution-based testing agent that generates targeted unit tests for software engineering tasks. It is part of the R2E-Gym framework, an environment for training and evaluating software engineering agents.

## Model Description

The R2E-TestgenAgent is an execution-based testing agent that specializes in:
- **Targeted Unit Test Generation**: Creates specific unit tests to validate code patches and implementations
- **Execution-Based Verification**: Generates tests that can be executed to verify the correctness of code changes
- **Corner Case Detection**: Identifies and tests edge cases and corner cases
- **Patch Disambiguation**: Creates tests that can differentiate between correct and incorrect patches

## Architecture

The agent is built on top of the Qwen2.5-Coder-32B-Instruct model and fine-tuned with supervised fine-tuning (SFT) on R2E-Gym trajectories collected for testing tasks.

## Training Data

The model was trained on the `R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories` dataset, which contains:
- High-quality testing trajectories collected from Claude-3.5-Sonnet
- Execution-based testing scenarios
- Diverse software engineering problems across 13 repositories
- Real-world testing patterns and methodologies

## Usage

### Basic Usage

```python
from r2egym.agenthub.environment.env import EnvArgs, RepoEnv
from r2egym.agenthub.agent.agent import AgentArgs, Agent
from pathlib import Path
from datasets import load_dataset

# Load dataset
ds = load_dataset("R2E-Gym/R2E-Gym-Lite")
env_args = EnvArgs(ds=ds['train'][0])
env = RepoEnv(env_args)

# Load testing agent configuration
agent_args = AgentArgs.from_yaml(Path('./config/testing_agent.yaml'))
agent_args.llm_name = 'r2e-gym/R2E-TestgenAgent'
agent = Agent(name="TestingAgent", args=agent_args)

# Run the testing agent
output = agent.run(env, max_steps=30, use_fn_calling=True)
```

### Configuration

The agent uses specific prompts and configurations optimized for test generation:

```yaml
system_prompt: |
  You are a specialized testing agent designed to generate targeted unit tests 
  for software engineering tasks. Your goal is to create comprehensive tests 
  that can validate code patches and identify potential issues.

instance_prompt: |
  Given the following problem and potential patches, create targeted unit tests
  that can effectively validate the correctness of the implementation.
```

## Training Configuration

The model was trained using the following configuration:

- **Base Model**: Qwen/Qwen2.5-Coder-32B-Instruct
- **Training Method**: Full fine-tuning with DeepSpeed ZeRO-3
- **Learning Rate**: 1.0e-5
- **Epochs**: 2.0
- **Batch Size**: 1 (per device)
- **Context Length**: 20,480 tokens
- **Optimizer**: AdamW with cosine learning rate scheduling
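
As a rough illustration of the cosine learning rate scheduling listed above, the sketch below computes the learning rate at a given step for the stated peak of 1.0e-5. It assumes no warmup and a decay to zero, neither of which is stated in this card, and the function name `cosine_lr` and the step counts are hypothetical.

```python
import math

def cosine_lr(step: int, total_steps: int,
              peak_lr: float = 1.0e-5, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr to min_lr over total_steps.

    Assumptions (not from the model card): no warmup phase, min_lr = 0.
    """
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Hypothetical run of 1,000 optimizer steps.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, 1000):.3e}")
```

The rate starts at the peak, passes through half the peak at the midpoint, and reaches the minimum at the final step.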

## Performance

The R2E-TestgenAgent is designed to work alongside other R2E-Gym components:
- **Code Editing Agent**: For generating and fixing code
- **Execution-free Verifier**: For reranking patches
- **Hybrid Test-time Scaling**: Combines execution-based and execution-free verification
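
One way to picture how execution-based and execution-free signals might be combined is sketched below. This is an assumed scheme for illustration, not necessarily the procedure used by R2E-Gym: candidates are first shortlisted by an execution-free verifier score, then ordered by how many generated tests they pass. The `Candidate` class, its fields, and `hybrid_rerank` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    patch_id: str
    verifier_score: float  # execution-free score (hypothetical scale 0..1)
    tests_passed: int      # execution-based signal from generated tests

def hybrid_rerank(candidates: List[Candidate], top_n: int = 2) -> List[Candidate]:
    """Shortlist by verifier score, then rank the shortlist by tests passed.

    Assumed combination rule for illustration only; ties on tests_passed
    are broken by the verifier score.
    """
    shortlisted = sorted(
        candidates, key=lambda c: c.verifier_score, reverse=True
    )[:top_n]
    return sorted(
        shortlisted,
        key=lambda c: (c.tests_passed, c.verifier_score),
        reverse=True,
    )

candidates = [
    Candidate("p1", verifier_score=0.9, tests_passed=2),
    Candidate("p2", verifier_score=0.8, tests_passed=5),
    Candidate("p3", verifier_score=0.2, tests_passed=6),
]
print([c.patch_id for c in hybrid_rerank(candidates)])
```

In this toy example `p3` passes the most tests but is dropped by the execution-free shortlist, while `p2` overtakes `p1` on execution results. The design choice being illustrated is that each signal can veto candidates the other would over-rank.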

## Integration with R2E-Gym

This agent is part of the larger R2E-Gym ecosystem:

1. **Environment**: Works with R2E-Gym's 8.1K+ procedurally curated environments
2. **Evaluation**: Can be evaluated on SWE-Bench Verified and other benchmarks
3. **Training**: Supports continued training on additional trajectories

## Citation

If you use R2E-TestgenAgent in your research, please cite:

```bibtex
@article{jain2025r2e,
  title={{R2E-Gym}: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights {SWE} Agents},
  author={Jain, Naman and Singh, Jaskirat and Shetty, Manish and Zheng, Liang and Sen, Koushik and Stoica, Ion},
  journal={arXiv preprint arXiv:2504.07164},
  year={2025}
}
```

## License

This model is released under the same license as the base Qwen2.5-Coder model.

## Links

- **Paper**: [R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents](https://arxiv.org/abs/2504.07164)
- **GitHub**: [R2E-Gym](https://github.com/R2E-Gym/R2E-Gym)
- **Dataset**: [R2EGym-TestingAgent-SFT-Trajectories](https://huggingface.co/datasets/R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories)
- **Related Models**: 
  - [R2EGym-32B-Agent](https://huggingface.co/R2E-Gym/R2EGym-32B-Agent)
  - [R2EGym-Verifier](https://huggingface.co/R2E-Gym/R2EGym-Verifier)