arXiv:2509.21834

RobustFlow: Towards Robust Agentic Workflow Generation

Published on Sep 26, 2025
AI-generated summary

A novel training framework, RobustFlow, improves the robustness of agentic workflows generated by large language models by teaching invariance to instruction variations.

Abstract

The automated generation of agentic workflows is a promising frontier for enabling large language models (LLMs) to solve complex tasks. However, our investigation reveals that the robustness of agentic workflows remains a critical, unaddressed challenge. Current methods often generate wildly inconsistent workflows when given instructions that are semantically identical but phrased differently. This brittleness severely undermines their reliability and trustworthiness in real-world applications. To quantitatively diagnose this instability, we propose metrics based on nodal and topological similarity that evaluate workflow consistency under common semantic variations such as paraphrasing and noise injection. We then propose RobustFlow, a novel training framework that leverages preference optimization to teach models invariance to instruction variations. By training on sets of synonymous task descriptions, RobustFlow boosts workflow robustness scores to 70%–90%, a substantial improvement over existing approaches. The code is publicly available at https://github.com/DEFENSE-SEU/RobustFlow.
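The abstract does not spell out how nodal and topological similarity are defined. A minimal sketch of one plausible instantiation, assuming a workflow is modeled as a directed graph of operator nodes and using Jaccard overlap as the similarity measure (the paper's actual metrics may differ):

# Hypothetical illustration only: the paper's exact definitions are not
# given in the abstract. A workflow here is a set of operator nodes plus
# a set of directed edges between them.

def nodal_similarity(nodes_a: set, nodes_b: set) -> float:
    """Jaccard overlap between the operator (node) sets of two workflows."""
    if not nodes_a and not nodes_b:
        return 1.0
    return len(nodes_a & nodes_b) / len(nodes_a | nodes_b)

def topological_similarity(edges_a: set, edges_b: set) -> float:
    """Jaccard overlap between directed edge sets, comparing structure."""
    if not edges_a and not edges_b:
        return 1.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)

# Workflows generated from two paraphrases of the same task:
wf1_nodes = {"plan", "search", "summarize", "answer"}
wf1_edges = {("plan", "search"), ("search", "summarize"), ("summarize", "answer")}
wf2_nodes = {"plan", "search", "answer"}
wf2_edges = {("plan", "search"), ("search", "answer")}

print(nodal_similarity(wf1_nodes, wf2_nodes))        # 0.75
print(topological_similarity(wf1_edges, wf2_edges))  # 0.25

Under such metrics, a robust generator should score near 1.0 across paraphrases of the same task, which is what the 70%–90% robustness scores quantify.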
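Likewise, "training on sets of synonymous task descriptions" with preference optimization could be set up roughly as below. The pair-construction heuristic, the Jaccard scoring, and all names are illustrative assumptions, not the paper's pipeline: workflows consistent with those generated for the other phrasings are treated as preferred ("chosen") responses and inconsistent ones as "rejected", in the format expected by DPO-style trainers.

# Hypothetical sketch of building preference pairs from synonymous task
# descriptions; RobustFlow's actual data construction and objective may differ.

def jaccard(a: set, b: set) -> float:
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def build_preference_pairs(paraphrases, workflows):
    """For each paraphrase, mark the candidate workflow most consistent with
    the workflows generated for the other phrasings as 'chosen' and the
    least consistent one as 'rejected'."""
    pairs = []
    for i, prompt in enumerate(paraphrases):
        others = [w for j, w in enumerate(workflows) if j != i]
        ranked = sorted(workflows,
                        key=lambda w: sum(jaccard(w, o) for o in others))
        pairs.append({"prompt": prompt,
                      "chosen": ranked[-1],    # most consistent with peers
                      "rejected": ranked[0]})  # least consistent with peers
    return pairs

# Toy example: workflows as sets of step names, two paraphrases of one task.
paraphrases = ["Summarize the quarterly report",
               "Write a summary of the quarterly report"]
workflows = [{"read", "extract", "summarize"},
             {"read", "translate"}]
print(build_preference_pairs(paraphrases, workflows))

Optimizing on such pairs pushes the model toward the same workflow regardless of phrasing, which is the invariance the abstract describes.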
