# R2E-TestgenAgent

## Overview

R2E-TestgenAgent is an execution-based testing agent that generates targeted unit tests for software engineering tasks. It is part of the R2E-Gym framework, an environment for training and evaluating software engineering agents.
## Model Description

The agent specializes in:

- **Targeted Unit Test Generation**: Creates specific unit tests to validate code patches and implementations
- **Execution-Based Verification**: Generates tests that can be executed to verify the correctness of code changes
- **Corner Case Detection**: Identifies and tests potential edge and corner cases
- **Patch Disambiguation**: Creates tests that can differentiate between correct and incorrect patches
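
To make the patch disambiguation idea concrete, the sketch below takes a table of test outcomes per candidate patch and picks out the tests whose result differs between patches. The function names, the `results` layout, and the test and patch names are all hypothetical and are not part of the R2E-Gym API; this is only one way such outcomes might be summarized.

```python
from typing import Dict, List

# Hypothetical layout: results[test_name][patch_id] is True when the
# generated test passes on that candidate patch.
Results = Dict[str, Dict[str, bool]]


def distinguishing_tests(results: Results) -> List[str]:
    """Return the tests whose outcome is not the same for every patch."""
    return [
        test
        for test, by_patch in results.items()
        if len(set(by_patch.values())) > 1
    ]


def rank_patches(results: Results) -> List[str]:
    """Order patches by number of passing tests (ties broken by patch id)."""
    patches = sorted({p for by_patch in results.values() for p in by_patch})
    passes = {
        p: sum(by_patch.get(p, False) for by_patch in results.values())
        for p in patches
    }
    return sorted(patches, key=lambda p: (-passes[p], p))


# Illustrative data only.
results = {
    "test_empty_input": {"patch_a": True, "patch_b": True},
    "test_negative_index": {"patch_a": True, "patch_b": False},
    "test_unicode_name": {"patch_a": False, "patch_b": False},
}

print(distinguishing_tests(results))
print(rank_patches(results))
```

A test that passes (or fails) on every candidate carries no ranking signal, which is why only tests with mixed outcomes are reported as distinguishing.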
## Architecture

The agent is Qwen2.5-Coder-32B-Instruct fine-tuned with supervised fine-tuning (SFT) on R2E-Gym trajectories collected for testing tasks.
## Training Data

The model was trained on the `R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories` dataset, which contains:

- High-quality testing trajectories collected from Claude-3.5-Sonnet
- Execution-based testing scenarios
- Diverse software engineering problems across 13 repositories
- Real-world testing patterns and methodologies
## Usage

### Basic Usage

```python
from r2egym.agenthub.environment.env import EnvArgs, RepoEnv
from r2egym.agenthub.agent.agent import AgentArgs, Agent
from pathlib import Path
from datasets import load_dataset

# Load dataset and build an environment from its first training instance
ds = load_dataset("R2E-Gym/R2E-Gym-Lite")
env_args = EnvArgs(ds=ds['train'][0])
env = RepoEnv(env_args)

# Load testing agent configuration
agent_args = AgentArgs.from_yaml(Path('./config/testing_agent.yaml'))
agent_args.llm_name = 'r2e-gym/R2E-TestgenAgent'
agent = Agent(name="TestingAgent", args=agent_args)

# Run the testing agent
output = agent.run(env, max_steps=30, use_fn_calling=True)
```
### Configuration

The agent uses prompts and settings tuned for test generation:

```yaml
system_prompt: |
  You are a specialized testing agent designed to generate targeted unit tests
  for software engineering tasks. Your goal is to create comprehensive tests
  that can validate code patches and identify potential issues.
instance_prompt: |
  Given the following problem and potential patches, create targeted unit tests
  that can effectively validate the correctness of the implementation.
```
## Training Configuration

The model was trained using the following configuration:

- **Base Model**: Qwen/Qwen2.5-Coder-32B-Instruct
- **Training Method**: Full fine-tuning with DeepSpeed ZeRO-3
- **Learning Rate**: 1.0e-5
- **Epochs**: 2
- **Batch Size**: 1 (per device)
- **Context Length**: 20,480 tokens
- **Optimizer**: AdamW with cosine learning rate scheduling
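
As a rough illustration of what cosine learning rate scheduling means for the peak rate listed above, the sketch below evaluates the standard cosine decay formula `lr(t) = 0.5 * lr_max * (1 + cos(pi * t / T))`. The total step count is a made-up placeholder, and the absence of a warmup phase and the decay to zero are assumptions; the actual training run may have differed on both points.

```python
import math

BASE_LR = 1.0e-5  # peak learning rate from the configuration list


def cosine_lr(step: int, total_steps: int, base_lr: float = BASE_LR) -> float:
    """Cosine decay from base_lr at step 0 to 0.0 at total_steps (no warmup assumed)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so out-of-range steps do not push the cosine past its endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical step count; the real value depends on dataset size,
# number of devices, and gradient accumulation, none of which are stated here.
TOTAL_STEPS = 1000

for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")
```

Halfway through training the rate is half of its peak, and it falls fastest around that midpoint.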
## Performance

The R2E-TestgenAgent is designed to work in conjunction with other R2E-Gym agents:

- **Code Editing Agent**: For generating and fixing code
- **Execution-free Verifier**: For reranking patches
- **Hybrid Test-time Scaling**: Combines execution-based and execution-free verification
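
The sketch below shows one simple way that execution-based and execution-free signals could be combined to rerank candidate patches. The weighted sum, the `Candidate` fields, the score scale, and the sample values are illustrative assumptions; this is not the combination rule defined in the R2E-Gym paper, and nothing here is part of the R2E-Gym API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    patch_id: str
    tests_passed: int      # execution-based signal: generated tests that pass
    tests_total: int
    verifier_score: float  # execution-free signal, assumed to lie in [0, 1]


def hybrid_score(c: Candidate, weight: float = 0.5) -> float:
    """Weighted sum of test pass rate and verifier score (illustrative rule)."""
    pass_rate = c.tests_passed / c.tests_total if c.tests_total else 0.0
    return weight * pass_rate + (1.0 - weight) * c.verifier_score


def rerank(candidates: List[Candidate], weight: float = 0.5) -> List[Candidate]:
    """Return candidates sorted from highest to lowest hybrid score."""
    return sorted(candidates, key=lambda c: hybrid_score(c, weight), reverse=True)


# Made-up candidates for demonstration.
candidates = [
    Candidate("patch_a", tests_passed=4, tests_total=5, verifier_score=0.50),
    Candidate("patch_b", tests_passed=5, tests_total=5, verifier_score=0.25),
    Candidate("patch_c", tests_passed=2, tests_total=5, verifier_score=0.75),
]

ranked = rerank(candidates)
print([c.patch_id for c in ranked])
```

In this toy example neither signal alone would pick the same winner: the pass rate favors `patch_b`, the verifier favors `patch_c`, and the combined score favors `patch_a`.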
## Integration with R2E-Gym

This agent is part of the larger R2E-Gym ecosystem:

1. **Environment**: Works with R2E-Gym's 8.1K+ procedurally curated environments
2. **Evaluation**: Can be evaluated on SWE-Bench Verified and other benchmarks
3. **Training**: Supports continued training on additional trajectories
## Citation

If you use R2E-TestgenAgent in your research, please cite:

```bibtex
@article{jain2025r2e,
  title={{R2E-Gym}: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights {SWE} Agents},
  author={Jain, Naman and Singh, Jaskirat and Shetty, Manish and Zheng, Liang and Sen, Koushik and Stoica, Ion},
  journal={arXiv preprint arXiv:2504.07164},
  year={2025}
}
```
## License

This model is released under the same license as the base Qwen2.5-Coder model.
## Links

- **Paper**: [R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents](https://arxiv.org/abs/2504.07164)
- **GitHub**: [R2E-Gym](https://github.com/R2E-Gym/R2E-Gym)
- **Dataset**: [R2EGym-TestingAgent-SFT-Trajectories](https://huggingface.co/datasets/R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories)
- **Related Models**:
  - [R2EGym-32B-Agent](https://huggingface.co/R2E-Gym/R2EGym-32B-Agent)
  - [R2EGym-Verifier](https://huggingface.co/R2E-Gym/R2EGym-Verifier)