Running Agents

Learn how to execute agents and handle results.

Overview

The Runner class provides static methods for executing agents synchronously, asynchronously, or with streaming. Each execution returns a RunResult containing the output, usage statistics, and execution metadata.

Basic Execution

Execute an agent with a simple text prompt:

Agent<UnknownContext, TextOutput> agent =
    Agent.<UnknownContext, TextOutput>builder()
        .name("Assistant")
        .instructions("You are a helpful assistant.")
        .build();

RunResult<UnknownContext, ?> result =
    Runner.run(agent, "Explain what an AI agent is in one sentence.");

System.out.println(result.getFinalOutput());
// Output: "An AI agent is a software system..."

View complete example →

Runner Methods

`Runner.run()`

Synchronous execution that blocks until completion:

RunResult<UnknownContext, ?> result = Runner.run(agent, "Your prompt");

With configuration:

RunConfig config = RunConfig.builder()
    .maxTurns(10)
    .build();

RunResult<UnknownContext, ?> result = Runner.run(agent, "Your prompt", config);

`Runner.runAsync()`

Asynchronous execution returning a CompletableFuture:

CompletableFuture<RunResult<UnknownContext, ?>> futureResult =
    Runner.runAsync(agent, "Your prompt");

// Process result when ready
futureResult.thenAccept(result -> {
    System.out.println(result.getFinalOutput());
});

`Runner.runStreamed()`

Streaming execution for real-time updates:

Flux<RunStreamEvent> stream = Runner.runStreamed(agent, "Your prompt");

stream.subscribe(event -> {
    if (event.getEventType().equals("output")) {
        System.out.print(event.getData());
    }
});

See the Streaming guide for detailed streaming patterns.

Run Configuration

Configure execution behavior with RunConfig:

RunConfig config = RunConfig.builder()
    .maxTurns(5)       // Maximum conversation turns (default: 20)
    .timeout(30000)     // Timeout in milliseconds
    .build();

RunResult<UnknownContext, ?> result = Runner.run(agent, "Your prompt", config);

Configuration Options

Option	Type	Default	Description
`maxTurns`	`int`	`20`	Maximum number of conversation turns before stopping
`timeout`	`long`	None	Maximum execution time in milliseconds

Max Turns Limit

When the agent reaches maxTurns, execution stops and may throw MaxTurnsExceededError. Set this value based on your task complexity and cost tolerance.

Understanding RunResult

RunResult contains comprehensive information about the agent execution:

RunResult<UnknownContext, ?> result = Runner.run(agent, "Your question");

// Final output
Object output = result.getFinalOutput();

// Usage statistics
Usage usage = result.getUsage();
System.out.println("Total tokens: " + usage.getTotalTokens());
System.out.println("Input tokens: " + usage.getInputTokens());
System.out.println("Output tokens: " + usage.getOutputTokens());

// Execution metadata
List<ModelResponse> responses = result.getRawResponses();
System.out.println("Turns taken: " + responses.size());

List<RunOutputItem> items = result.getNewItems();
String lastId = result.getLastResponseId();

RunResult Methods

Method	Return Type	Description
`getFinalOutput()`	`Object`	The agent's final output (text or structured)
`getUsage()`	`Usage`	Token usage statistics
`getRawResponses()`	`List<ModelResponse>`	All model responses (one per turn)
`getNewItems()`	`List<RunOutputItem>`	Generated conversation items
`getLastResponseId()`	`String`	ID of the last response

Multi-Turn Execution

Agents may require multiple turns to complete complex tasks, especially when using tools:

Agent<UnknownContext, TextOutput> agent =
    Agent.<UnknownContext, TextOutput>builder()
        .name("ThinkingAssistant")
        .instructions("Think step by step to provide thorough answers.")
        .build();

RunConfig config = RunConfig.builder()
    .maxTurns(5)  // Allow up to 5 turns
    .build();

RunResult<UnknownContext, ?> result = Runner.run(
    agent,
    "What are the key differences between OOP and functional programming?",
    config
);

// Track execution
System.out.println("Turns taken: " + result.getRawResponses().size());
System.out.println("Total tokens: " + result.getUsage().getTotalTokens());

Each turn represents one request-response cycle with the model. Multi-turn execution occurs when:

The agent calls tools and processes their results
The agent uses handoffs to delegate to other agents
The agent performs multi-step reasoning

View complete example →

Per-Turn Usage Tracking

Track token usage for each individual turn:

RunResult<UnknownContext, ?> result = Runner.run(agent, "Your question");

// Per-turn breakdown
int turnNumber = 1;
for (ModelResponse response : result.getRawResponses()) {
    System.out.printf(
        "Turn %d: %.0f tokens (in: %.0f, out: %.0f)%n",
        turnNumber,
        response.getUsage().getTotalTokens(),
        response.getUsage().getInputTokens(),
        response.getUsage().getOutputTokens()
    );
    turnNumber++;
}

Error Handling

Handle common execution errors:

try {
    RunResult<UnknownContext, ?> result = Runner.run(agent, "Your prompt");
    System.out.println(result.getFinalOutput());

} catch (MaxTurnsExceededError e) {
    // Agent hit the max turns limit
    System.err.println("Agent exceeded maximum turns: " + e.getMessage());

} catch (AuthenticationException e) {
    // Invalid or missing API key
    System.err.println("Authentication failed: " + e.getMessage());
    System.err.println("Check your OPENAI_API_KEY environment variable");

} catch (RateLimitException e) {
    // Hit OpenAI rate limits
    System.err.println("Rate limit exceeded: " + e.getMessage());
    System.err.println("Retry after: " + e.getRetryAfter());

} catch (Exception e) {
    // Other errors (network, model errors, etc.)
    System.err.println("Execution failed: " + e.getMessage());
}

Common Error Types

Exception	Cause	Resolution
`MaxTurnsExceededError`	Agent hit `maxTurns` limit	Increase `maxTurns` or simplify the task
`AuthenticationException`	Invalid/missing API key	Set `OPENAI_API_KEY` environment variable
`RateLimitException`	Hit OpenAI rate limits	Implement retry logic with backoff
`TimeoutException`	Execution exceeded timeout	Increase timeout or simplify task

Production Error Handling

Always implement retry logic with exponential backoff for rate limit and network errors. Use structured logging to track execution failures.

Running with Sessions

Add conversation memory using sessions:

// Create a session for conversation memory
Session session = new MemorySession();

Agent<UnknownContext, TextOutput> agent =
    Agent.<UnknownContext, TextOutput>builder()
        .name("Assistant")
        .instructions("You are a helpful assistant.")
        .build();

// First message
RunResult<UnknownContext, ?> result1 = Runner.run(agent, "My name is Alice", session);

// Agent remembers previous context
RunResult<UnknownContext, ?> result2 = Runner.run(agent, "What's my name?", session);
System.out.println(result2.getFinalOutput());
// Output: "Your name is Alice."

See the Sessions guide for detailed session management.

Running with Context

Pass custom context for tool approval or usage tracking:

public class MyContext {
    private Set<String> approvedActions = new HashSet<>();

    public void approve(String action) {
        approvedActions.add(action);
    }

    public boolean isApproved(String action) {
        return approvedActions.contains(action);
    }
}

Agent<MyContext, TextOutput> agent = /* ... */;
MyContext context = new MyContext();

RunResult<MyContext, ?> result = Runner.run(agent, "Your prompt", context);

See the Run Context guide for advanced context patterns.

Best Practices

Optimize Token Usage

Monitor result.getUsage() to track costs
Use gpt-4.1-mini for simple tasks to reduce costs
Set appropriate maxTurns to prevent runaway executions
Use sessions to maintain context without repeating information

Error Recovery

Implement exponential backoff for rate limit errors
Log lastResponseId for debugging partial failures
Set reasonable timeouts for production workloads
Validate inputs before execution to fail fast

Performance

Use runAsync() for non-blocking operations
Use streaming for real-time user feedback
Cache frequently used agents (they're immutable and thread-safe)
Pool session objects for concurrent executions

Next Steps

Streaming - Real-time output streaming
Tools - Add custom functions for agents to call
Sessions - Add conversation memory
Run Context - Custom context and tool approval

Additional Resources

BasicTextOutput.java - Simple execution
MultiTurnExecution.java - Multi-turn tracking
API Reference - Complete Javadoc documentation