Agents
Smolagents is an experimental API which is subject to change at any time. Results returned by the agents can vary as the APIs or underlying models are prone to change.
To learn more about agents and tools make sure to read the introductory guide. This page contains the API docs for the underlying classes.
Agents
Our agents inherit from MultiStepAgent, which means they can act in multiple steps, each step consisting of one thought, then one tool call and execution. Read more in this conceptual guide.
We provide two types of agents, based on the main Agent
class.
- CodeAgent is the default agent, it writes its tool calls in Python code.
- ToolCallingAgent writes its tool calls in JSON.
Both require arguments model
and list of tools tools
at initialization.
Classes of agents
class smolagents.MultiStepAgent
< source >( tools: typing.List[smolagents.tools.Tool] model: typing.Callable[[typing.List[typing.Dict[str, str]]], smolagents.models.ChatMessage] system_prompt: typing.Optional[str] = None tool_description_template: typing.Optional[str] = None max_steps: int = 6 tool_parser: typing.Optional[typing.Callable] = None add_base_tools: bool = False verbosity_level: int = 1 grammar: typing.Optional[typing.Dict[str, str]] = None managed_agents: typing.Optional[typing.List] = None step_callbacks: typing.Optional[typing.List[typing.Callable]] = None planning_interval: typing.Optional[int] = None )
Parameters
- tools (
list[Tool]
) — Tools that the agent can use. - model (
Callable[[list[dict[str, str]]], ChatMessage]
) — Model that will generate the agent’s actions. - system_prompt (
str
, optional) — System prompt that will be used to generate the agent’s actions. - tool_description_template (
str
, optional) — Template used to describe the tools in the system prompt. - max_steps (
int
, default6
) — Maximum number of steps the agent can take to solve the task. - tool_parser (
Callable
, optional) — Function used to parse the tool calls from the LLM output. - add_base_tools (
bool
, defaultFalse
) — Whether to add the base tools to the agent’s tools. - verbosity_level (
int
, default1
) — Level of verbosity of the agent’s logs. - grammar (
dict[str, str]
, optional) — Grammar used to parse the LLM output. - managed_agents (
list
, optional) — Managed agents that the agent can call. - step_callbacks (
list[Callable]
, optional) — Callbacks that will be called at each step. - planning_interval (
int
, optional) — Interval at which the agent will run a planning step.
Agent class that solves the given task step by step, using the ReAct framework: While the objective is not reached, the agent will perform a cycle of action (given by the LLM) and observation (obtained from the environment).
execute_tool_call
< source >( tool_name: str arguments: typing.Union[typing.Dict[str, str], str] )
Execute tool with the provided input and returns the result. This method replaces arguments with the actual values from the state if they refer to state variables.
extract_action
< source >( llm_output: str split_token: str )
Parse action from the LLM output
planning_step
< source >( task is_first_step: bool step: int )
Used periodically by the agent to plan the next steps to reach the objective.
provide_final_answer
< source >( task: str images: typing.Optional[list[str]] ) → str
Provide the final answer to the task, based on the logs of the agent’s interactions.
run
< source >( task: str stream: bool = False reset: bool = True single_step: bool = False images: typing.Optional[typing.List[str]] = None additional_args: typing.Optional[typing.Dict] = None )
Parameters
- task (
str
) — Task to perform. - stream (
bool
) — Whether to run in a streaming way. - reset (
bool
) — Whether to reset the conversation or keep it going from previous run. - single_step (
bool
) — Whether to run the agent in one-shot fashion. - images (
list[str]
, optional) — Paths to image(s). - additional_args (
dict
) — Any other variables that you want to pass to the agent run, for instance images or dataframes. Give them clear names!
Run the agent for the given task.
To be implemented in children classes. Should return either None if the step is not final.
write_inner_memory_from_logs
< source >( summary_mode: bool = False )
Reads past llm_outputs, actions, and observations or errors from the logs into a series of messages that can be used as input to the LLM.
class smolagents.CodeAgent
< source >( tools: typing.List[smolagents.tools.Tool] model: typing.Callable[[typing.List[typing.Dict[str, str]]], smolagents.models.ChatMessage] system_prompt: typing.Optional[str] = None grammar: typing.Optional[typing.Dict[str, str]] = None additional_authorized_imports: typing.Optional[typing.List[str]] = None planning_interval: typing.Optional[int] = None use_e2b_executor: bool = False max_print_outputs_length: typing.Optional[int] = None **kwargs )
Parameters
- tools (
list[Tool]
) — Tools that the agent can use. - model (
Callable[[list[dict[str, str]]], ChatMessage]
) — Model that will generate the agent’s actions. - system_prompt (
str
, optional) — System prompt that will be used to generate the agent’s actions. - grammar (
dict[str, str]
, optional) — Grammar used to parse the LLM output. - additional_authorized_imports (
list[str]
, optional) — Additional authorized imports for the agent. - planning_interval (
int
, optional) — Interval at which the agent will run a planning step. - use_e2b_executor (
bool
, defaultFalse
) — Whether to use the E2B executor for remote code execution. - max_print_outputs_length (
int
, optional) — Maximum length of the print outputs. - **kwargs — Additional keyword arguments.
In this agent, the tool calls will be formulated by the LLM in code format, then parsed and executed.
Perform one step in the ReAct framework: the agent thinks, acts, and observes the result. Returns None if the step is not final.
class smolagents.ToolCallingAgent
< source >( tools: typing.List[smolagents.tools.Tool] model: typing.Callable[[typing.List[typing.Dict[str, str]]], smolagents.models.ChatMessage] system_prompt: typing.Optional[str] = None planning_interval: typing.Optional[int] = None **kwargs )
Parameters
- tools (
list[Tool]
) — Tools that the agent can use. - model (
Callable[[list[dict[str, str]]], ChatMessage]
) — Model that will generate the agent’s actions. - system_prompt (
str
, optional) — System prompt that will be used to generate the agent’s actions. - planning_interval (
int
, optional) — Interval at which the agent will run a planning step. - **kwargs — Additional keyword arguments.
This agent uses JSON-like tool calls, using method model.get_tool_call
to leverage the LLM engine’s tool calling capabilities.
Perform one step in the ReAct framework: the agent thinks, acts, and observes the result. Returns None if the step is not final.
ManagedAgent
class smolagents.ManagedAgent
< source >( agent name description additional_prompting: typing.Optional[str] = None provide_run_summary: bool = False managed_agent_prompt: typing.Optional[str] = None )
Parameters
- agent (
object
) — The agent to be managed. - name (
str
) — The name of the managed agent. - description (
str
) — A description of the managed agent. - additional_prompting (
Optional[str]
, optional) — Additional prompting for the managed agent. Defaults to None. - provide_run_summary (
bool
, optional) — Whether to provide a run summary after the agent completes its task. Defaults to False. - managed_agent_prompt (
Optional[str]
, optional) — Custom prompt for the managed agent. Defaults to None.
ManagedAgent class that manages an agent and provides additional prompting and run summaries.
Adds additional prompting for the managed agent, like ‘add more detail in your answer’.
stream_to_gradio
smolagents.stream_to_gradio
< source >( agent task: str reset_agent_memory: bool = False additional_args: typing.Optional[dict] = None )
Runs an agent with the given task and streams the messages from the agent as gradio ChatMessages.
GradioUI
You must have gradio
installed to use the UI. Please run pip install smolagents[gradio]
if it’s not the case.
class smolagents.GradioUI
< source >( agent: MultiStepAgent file_upload_folder: str | None = None )
A one-line interface to launch your agent in Gradio
upload_file
< source >( file file_uploads_log allowed_file_types = ['application/pdf', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', 'text/plain'] )
Handle file uploads, default allowed types are .pdf, .docx, and .txt