	# R2E-TestgenAgent

	## Overview

	R2E-TestgenAgent is a specialized execution-based testing agent designed for generating targeted unit tests for software engineering tasks. This agent is part of the R2E-Gym framework, which provides a comprehensive environment for training and evaluating software engineering agents.

	## Model Description

	The R2E-TestgenAgent specializes in:
	- Targeted Unit Test Generation: Creates specific unit tests to validate code patches and implementations
	- Execution-Based Verification: Generates tests that can be executed to verify the correctness of code changes
	- Corner Case Detection: Identifies and tests potential edge cases and corner scenarios
	- Patch Disambiguation: Creates tests that can differentiate between correct and incorrect patches
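To make the patch-disambiguation idea concrete, here is a minimal sketch in plain Python. The `clamp`-style candidate patches and the test cases are hypothetical illustrations, not output from the agent and not part of the R2E-Gym API. Both candidates agree on common inputs, so only a test aimed at the corner case (a value below the lower bound) separates them.

```python
from typing import Callable, List, Tuple

# Two hypothetical candidate patches for the same issue: clamp a value to [lo, hi].
def patch_a(value: int, lo: int, hi: int) -> int:
    """Correct candidate: handles values below lo and above hi."""
    return max(lo, min(value, hi))

def patch_b(value: int, lo: int, hi: int) -> int:
    """Incorrect candidate: forgets the lower bound."""
    return min(value, hi)

# Targeted test cases as (args, expected) pairs; the last one is the corner case.
TEST_CASES: List[Tuple[Tuple[int, int, int], int]] = [
    ((5, 0, 10), 5),    # in range: both patches agree
    ((15, 0, 10), 10),  # above range: both patches agree
    ((-3, 0, 10), 0),   # below range: only the correct patch passes
]

def run_tests(patch: Callable[[int, int, int], int]) -> List[bool]:
    """Execute every test case against a patch and record pass/fail."""
    return [patch(*args) == expected for args, expected in TEST_CASES]

results_a = run_tests(patch_a)
results_b = run_tests(patch_b)
print("patch_a:", results_a)  # [True, True, True]
print("patch_b:", results_b)  # [True, True, False]
```

A test suite made only of the first two cases would pass on both candidates and give no signal for choosing between them; the third case is what makes the test discriminative.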

	## Architecture

	The agent is built on the Qwen2.5-Coder-32B-Instruct model and fine-tuned on R2E-Gym's supervised fine-tuning (SFT) trajectories collected for testing tasks.

	## Training Data

	The model was trained on the `R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories` dataset, which contains:
	- High-quality testing trajectories collected from Claude-3.5-Sonnet
	- Execution-based testing scenarios
	- Diverse software engineering problems across 13 repositories
	- Real-world testing patterns and methodologies

	## Usage

	### Basic Usage

	```python
	from r2egym.agenthub.environment.env import EnvArgs, RepoEnv
	from r2egym.agenthub.agent.agent import AgentArgs, Agent
	from pathlib import Path
	from datasets import load_dataset

	# Load dataset
	ds = load_dataset("R2E-Gym/R2E-Gym-Lite")
	env_args = EnvArgs(ds=ds['train'][0])
	env = RepoEnv(env_args)

	# Load testing agent configuration
	agent_args = AgentArgs.from_yaml(Path('./config/testing_agent.yaml'))
	agent_args.llm_name = 'r2e-gym/R2E-TestgenAgent'
	agent = Agent(name="TestingAgent", args=agent_args)

	# Run the testing agent
	output = agent.run(env, max_steps=30, use_fn_calling=True)
	```

	### Configuration

	The agent uses specific prompts and configurations optimized for test generation:

	```yaml
	system_prompt: |
	  You are a specialized testing agent designed to generate targeted unit tests
	  for software engineering tasks. Your goal is to create comprehensive tests
	  that can validate code patches and identify potential issues.

	instance_prompt: |
	  Given the following problem and potential patches, create targeted unit tests
	  that can effectively validate the correctness of the implementation.
	```
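As a rough illustration of how an instance prompt like the one above might be combined with a concrete problem and its candidate patches, consider the sketch below. The helper `build_instance_prompt` and the section labels (`Problem:`, `Patch N:`) are assumptions made for this example; R2E-Gym's actual templating may differ.

```python
from typing import List

# Instance prompt text taken from the configuration above.
INSTANCE_PROMPT = (
    "Given the following problem and potential patches, create targeted unit tests\n"
    "that can effectively validate the correctness of the implementation."
)

def build_instance_prompt(problem: str, patches: List[str]) -> str:
    """Hypothetical helper: append the problem and numbered patches to the prompt."""
    sections = [INSTANCE_PROMPT, "", "Problem:", problem.strip()]
    for index, patch in enumerate(patches, start=1):
        sections += ["", f"Patch {index}:", patch.strip()]
    return "\n".join(sections)

prompt = build_instance_prompt(
    "clamp() returns values below the lower bound.",
    ["return max(lo, min(value, hi))", "return min(value, hi)"],
)
print(prompt)
```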

	## Training Configuration

	The model was trained using the following configuration:

	- Base Model: Qwen/Qwen2.5-Coder-32B-Instruct
	- Training Method: Full fine-tuning with DeepSpeed ZeRO-3
	- Learning Rate: 1.0e-5
	- Epochs: 2.0
	- Batch Size: 1 (per device)
	- Context Length: 20,480 tokens
	- Optimizer: AdamW with cosine learning rate scheduling
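The sketch below illustrates two quantities implied by this configuration: the shape of a cosine learning-rate schedule starting from the listed peak of 1.0e-5, and how a per-device batch size of 1 translates into an effective batch size. The README does not state warmup, minimum learning rate, total steps, device count, or gradient accumulation, so the values used for those are hypothetical and the schedule assumes a plain decay to zero with no warmup.

```python
import math

PEAK_LR = 1.0e-5  # learning rate listed in the training configuration

def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

def effective_batch_size(per_device: int, num_devices: int, grad_accum: int) -> int:
    """Number of samples contributing to one optimizer step."""
    return per_device * num_devices * grad_accum

TOTAL_STEPS = 1000  # hypothetical; the real step count is not stated in this README
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")

# Per-device batch size is 1; 8 devices and 4 accumulation steps are hypothetical.
print(effective_batch_size(per_device=1, num_devices=8, grad_accum=4))  # 32
```

With a per-device batch size of 1, the effective batch size depends entirely on the number of devices and accumulation steps, which is why those values matter when reproducing the run.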

	## Performance

	The R2E-TestgenAgent is designed to work alongside other R2E-Gym components:
	- Code Editing Agent: For generating and fixing code
	- Execution-free Verifier: For reranking patches
	- Hybrid Test-time Scaling: Combines execution-based and execution-free verification
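One plausible way to combine the two verification signals when reranking candidate patches is sketched below. This is an illustrative assumption, not the exact scoring rule used by R2E-Gym: the `Candidate` fields, the `hybrid_rerank` function, and the scores are all hypothetical. The idea is to shortlist candidates by execution-free score, then order the shortlist by how many generated tests each candidate passes.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    ef_score: float    # execution-free verifier score (higher is better)
    tests_passed: int  # generated tests passed (execution-based signal)

def hybrid_rerank(candidates: List[Candidate], top_n: int) -> List[Candidate]:
    """Keep the top_n candidates by execution-free score, then order them by
    tests passed, breaking ties with the execution-free score."""
    shortlist = sorted(candidates, key=lambda c: c.ef_score, reverse=True)[:top_n]
    return sorted(shortlist, key=lambda c: (c.tests_passed, c.ef_score), reverse=True)

candidates = [
    Candidate("patch_1", ef_score=0.91, tests_passed=3),
    Candidate("patch_2", ef_score=0.85, tests_passed=5),
    Candidate("patch_3", ef_score=0.40, tests_passed=5),
    Candidate("patch_4", ef_score=0.78, tests_passed=5),
]
ranked = hybrid_rerank(candidates, top_n=3)
print([c.name for c in ranked])  # ['patch_2', 'patch_4', 'patch_1']
```

In this example `patch_3` passes as many tests as the winner but is filtered out by its low execution-free score, while `patch_1` has the highest execution-free score but drops to last because it fails more tests.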

	## Integration with R2E-Gym

	This agent is part of the larger R2E-Gym ecosystem:

	1. Environment: Works with R2E-Gym's 8.1K+ procedurally curated environments
	2. Evaluation: Can be evaluated on SWE-Bench Verified and other benchmarks
	3. Training: Supports continued training on additional trajectories

	## Citation

	If you use R2E-TestgenAgent in your research, please cite:

	```bibtex
	@article{jain2025r2e,
	title={{R2E-Gym}: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights {SWE} Agents},
	author={Jain, Naman and Singh, Jaskirat and Shetty, Manish and Zheng, Liang and Sen, Koushik and Stoica, Ion},
	journal={arXiv preprint arXiv:2504.07164},
	year={2025}
	}
	```

	## License

	This model is released under the same license as the base Qwen2.5-Coder-32B-Instruct model.

	## Links

	- Paper: [R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents](https://arxiv.org/abs/2504.07164)
	- GitHub: [R2E-Gym](https://github.com/R2E-Gym/R2E-Gym)
	- Dataset: [R2EGym-TestingAgent-SFT-Trajectories](https://huggingface.co/datasets/R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories)
	- Related Models:
	  - [R2EGym-32B-Agent](https://huggingface.co/R2E-Gym/R2EGym-32B-Agent)
	  - [R2EGym-Verifier](https://huggingface.co/R2E-Gym/R2EGym-Verifier)