# R2E-TestgenAgent
## Overview
R2E-TestgenAgent is an execution-based testing agent that generates targeted unit tests for software engineering tasks. It is part of the R2E-Gym framework, which provides environments for training and evaluating software engineering agents.
## Model Description
The agent specializes in:
- **Targeted Unit Test Generation**: Creates specific unit tests to validate code patches and implementations
- **Execution-Based Verification**: Generates tests that can be executed to verify the correctness of code changes
- **Corner Case Detection**: Identifies and tests potential edge cases and corner scenarios
- **Patch Disambiguation**: Creates tests that can differentiate between correct and incorrect patches
## Architecture
The agent is built on the Qwen2.5-Coder-32B-Instruct model and fine-tuned on R2E-Gym's supervised fine-tuning (SFT) trajectories collected for testing tasks.
## Training Data
The model was trained on the `R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories` dataset, which contains:
- High-quality testing trajectories collected from Claude-3.5-Sonnet
- Execution-based testing scenarios
- Diverse software engineering problems across 13 repositories
- Real-world testing patterns and methodologies
## Usage
### Basic Usage
```python
from pathlib import Path

from datasets import load_dataset
from r2egym.agenthub.agent.agent import Agent, AgentArgs
from r2egym.agenthub.environment.env import EnvArgs, RepoEnv

# Load the dataset and build an environment from its first training example
ds = load_dataset("R2E-Gym/R2E-Gym-Lite")
env_args = EnvArgs(ds=ds['train'][0])
env = RepoEnv(env_args)

# Load the testing agent configuration and point it at this model
agent_args = AgentArgs.from_yaml(Path('./config/testing_agent.yaml'))
agent_args.llm_name = 'r2e-gym/R2E-TestgenAgent'
agent = Agent(name="TestingAgent", args=agent_args)

# Run the testing agent
output = agent.run(env, max_steps=30, use_fn_calling=True)
```
### Configuration
The agent uses specific prompts and configurations optimized for test generation:
```yaml
system_prompt: |
  You are a specialized testing agent designed to generate targeted unit tests
  for software engineering tasks. Your goal is to create comprehensive tests
  that can validate code patches and identify potential issues.
instance_prompt: |
  Given the following problem and potential patches, create targeted unit tests
  that can effectively validate the correctness of the implementation.
```
## Training Configuration
The model was trained using the following configuration:
- **Base Model**: Qwen/Qwen2.5-Coder-32B-Instruct
- **Training Method**: Full fine-tuning with DeepSpeed ZeRO-3
- **Learning Rate**: 1.0e-5
- **Epochs**: 2.0
- **Batch Size**: 1 (per device)
- **Context Length**: 20,480 tokens
- **Optimizer**: AdamW with cosine learning rate scheduling
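As a rough illustration of what cosine learning rate scheduling means for the peak rate listed above, the sketch below evaluates a plain cosine decay in pure Python. Only the peak value `1.0e-5` comes from this card. The absence of warmup, the decay to zero, and the `total_steps` value of 1000 are assumptions made for the example, and the actual training run may have differed.

```python
import math

PEAK_LR = 1.0e-5  # learning rate from the configuration above


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps.

    Assumes no warmup and a final learning rate of zero; both are
    illustrative choices, not confirmed training settings.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; the real step count is not stated here
for step in (0, 250, 500, 750, 1000):
    print(f"step {step:4d}: lr = {cosine_lr(step, total_steps):.3e}")
```

Under these assumptions the rate starts at the peak, reaches half the peak at the midpoint, and falls to zero at the final step.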
## Performance
The R2E-TestgenAgent is designed to work alongside other R2E-Gym components:
- **Code Editing Agent**: For generating and fixing code
- **Execution-free Verifier**: For reranking patches
- **Hybrid Test-time Scaling**: Combines execution-based and execution-free verification
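The sketch below shows one possible way to combine the two signals when reranking candidate patches. It is an illustration only: the `Candidate` fields, the `hybrid_rank` function, and the specific rule (shortlist the top `top_n` patches by verifier score, then order them by test pass rate plus verifier score) are assumptions made for this example and are not taken from the R2E-Gym codebase. Refer to the paper for the actual formulation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    patch_id: str
    tests_passed: int      # execution-based signal: generated tests the patch passes
    tests_total: int       # number of generated tests that were run
    verifier_score: float  # execution-free signal, assumed to lie in [0, 1]

    @property
    def pass_rate(self) -> float:
        # Treat "no tests run" as zero evidence rather than dividing by zero.
        return self.tests_passed / self.tests_total if self.tests_total else 0.0


def hybrid_rank(candidates: List[Candidate], top_n: int) -> List[Candidate]:
    """Rerank patches using both signals (hypothetical combination rule)."""
    # Step 1: shortlist the top_n candidates by execution-free verifier score.
    shortlist = sorted(candidates, key=lambda c: c.verifier_score, reverse=True)[:top_n]
    # Step 2: order the shortlist by pass rate plus verifier score.
    return sorted(shortlist, key=lambda c: c.pass_rate + c.verifier_score, reverse=True)


candidates = [
    Candidate("p1", tests_passed=3, tests_total=4, verifier_score=0.60),
    Candidate("p2", tests_passed=4, tests_total=4, verifier_score=0.55),
    Candidate("p3", tests_passed=4, tests_total=4, verifier_score=0.10),
]
ranked = hybrid_rank(candidates, top_n=2)
print([c.patch_id for c in ranked])
```

Here `p3` passes every generated test but is dropped in the shortlist step because of its low verifier score, while `p2` overtakes `p1` once the test results are taken into account.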
## Integration with R2E-Gym
This agent is part of the larger R2E-Gym ecosystem:
1. **Environment**: Works with R2E-Gym's 8.1K+ procedurally curated environments
2. **Evaluation**: Can be evaluated on SWE-Bench Verified and other benchmarks
3. **Training**: Supports continued training on additional trajectories
## Citation
If you use R2E-TestgenAgent in your research, please cite:
```bibtex
@article{jain2025r2e,
  title={{R2E-Gym}: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights {SWE} Agents},
  author={Jain, Naman and Singh, Jaskirat and Shetty, Manish and Zheng, Liang and Sen, Koushik and Stoica, Ion},
  journal={arXiv preprint arXiv:2504.07164},
  year={2025}
}
```
## License
This model is released under the same license as the base Qwen2.5-Coder model.
## Links
- **Paper**: [R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents](https://arxiv.org/abs/2504.07164)
- **GitHub**: [R2E-Gym](https://github.com/R2E-Gym/R2E-Gym)
- **Dataset**: [R2EGym-TestingAgent-SFT-Trajectories](https://huggingface.co/datasets/R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories)
- **Related Models**:
  - [R2EGym-32B-Agent](https://huggingface.co/R2E-Gym/R2EGym-32B-Agent)
  - [R2EGym-Verifier](https://huggingface.co/R2E-Gym/R2EGym-Verifier)