Agent Runtimes

RemoteClaw runs CLI agents as subprocesses. Each agent has a runtime that handles subprocess lifecycle, parses the CLI’s native output format, and translates events into a unified stream.

The AgentRuntime Interface

Every runtime implements a single method:

interface AgentRuntime {
  execute(params: AgentExecuteParams): AsyncIterable<AgentEvent>;
}

execute() spawns the CLI process and returns an async iterable of events. Callers consume events as they arrive — there is no buffering of the full response.
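
A minimal sketch of how a caller might consume the stream. The event and parameter shapes below are simplified assumptions (including the `type` discriminant), and `FakeRuntime` is a hypothetical stand-in that yields canned events instead of spawning a CLI.

```typescript
// Simplified, assumed shapes; the real types carry more fields.
type AgentEvent =
  | { type: "text"; text: string }
  | { type: "error"; message: string; code?: string }
  | { type: "done"; result: { text: string; sessionId?: string } };

interface AgentExecuteParams {
  prompt: string;
  sessionId?: string;
}

interface AgentRuntime {
  execute(params: AgentExecuteParams): AsyncIterable<AgentEvent>;
}

// Hypothetical stand-in for a real runtime: yields two deltas, then done.
class FakeRuntime implements AgentRuntime {
  async *execute(_params: AgentExecuteParams): AsyncIterable<AgentEvent> {
    yield { type: "text", text: "Hello, " };
    yield { type: "text", text: "world" };
    yield { type: "done", result: { text: "Hello, world", sessionId: "s-1" } };
  }
}

// Handles each event as it arrives and returns the session ID for the next call.
async function runOnce(
  runtime: AgentRuntime,
  prompt: string,
  onText: (delta: string) => void,
): Promise<string | undefined> {
  for await (const event of runtime.execute({ prompt })) {
    if (event.type === "text") onText(event.text);
    else if (event.type === "error") console.error(event.message);
    else if (event.type === "done") return event.result.sessionId;
  }
  return undefined;
}
```

Because `onText` runs per delta, a caller can render output immediately rather than waiting for the done event.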

Execute Parameters

| Parameter | Type | Purpose |
| --- | --- | --- |
| prompt | string | Full prompt text (system + user message) |
| sessionId | string? | CLI session ID for conversation resumption |
| mcpServers | Record<string, McpServerConfig>? | MCP tool servers to expose |
| abortSignal | AbortSignal? | Cancellation support |
| workingDirectory | string? | Subprocess working directory |
| env | Record<string, string>? | Extra environment variables |

The Event Stream

Events form a discriminated union. Every execution ends with exactly one done event:

| Event | Key Fields | Purpose |
| --- | --- | --- |
| text | text | Streaming text delta from the agent |
| tool_use | toolName, toolId, input | Agent invoked an MCP tool |
| tool_result | toolId, output, isError? | Tool returned a result |
| error | message, code? | Runtime or subprocess error |
| done | result: AgentRunResult | Terminal event with run summary |

The done event carries an AgentRunResult with accumulated text, the session ID for the next invocation, duration, token usage, cost, and stop reason.
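
One way the union could be typed, as a sketch. The `type` discriminant and every field name inside `AgentRunResult` are assumptions inferred from the table and the sentence above; the actual definitions may differ.

```typescript
// Field names here are assumptions based on the prose, not the real definition.
interface AgentRunResult {
  text: string;
  sessionId?: string;
  durationMs: number;
  usage?: { inputTokens: number; outputTokens: number };
  costUsd?: number;
  stopReason?: string;
}

type AgentEvent =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolName: string; toolId: string; input: unknown }
  | { type: "tool_result"; toolId: string; output: string; isError?: boolean }
  | { type: "error"; message: string; code?: string }
  | { type: "done"; result: AgentRunResult };

// Exhaustive switch: the `never` assignment stops compiling if a variant
// is added to the union without a matching case.
function describeEvent(event: AgentEvent): string {
  switch (event.type) {
    case "text":
      return `text: ${event.text}`;
    case "tool_use":
      return `tool_use: ${event.toolName} (${event.toolId})`;
    case "tool_result":
      return `tool_result: ${event.toolId}${event.isError ? " [error]" : ""}`;
    case "error":
      return `error: ${event.message}`;
    case "done":
      return `done after ${event.result.durationMs} ms`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```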

CLIRuntimeBase — Subprocess Machinery

All four runtimes (Claude, Gemini, Codex, and OpenCode) extend CLIRuntimeBase, which handles subprocess lifecycle through a template method pattern. Subclasses implement three methods:

| Method | Purpose |
| --- | --- |
| buildArgs(params) | Construct CLI command-line arguments |
| buildEnv(params) | Construct environment variables |
| extractEvent(line) | Parse one NDJSON line into an AgentEvent |

Two properties control I/O behavior:

| Property | Default | Purpose |
| --- | --- | --- |
| supportsStdinPrompt | true | Whether to deliver large prompts via stdin |
| ndjsonStream | "stdout" | Which file descriptor carries structured events |
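
The template method pattern can be sketched as follows. `RuntimeBaseSketch` is a simplified stand-in for CLIRuntimeBase that returns a plan instead of spawning anything, and `EchoRuntime`, the `echo-agent` command, and its flags are hypothetical. Whether extractEvent receives the raw line or an already parsed value is also an assumption here.

```typescript
type AgentEvent = { type: "text"; text: string };

interface AgentExecuteParams {
  prompt: string;
  sessionId?: string;
  env?: Record<string, string>;
}

// Simplified stand-in for CLIRuntimeBase: only the hooks and properties named above.
abstract class RuntimeBaseSketch {
  protected readonly supportsStdinPrompt: boolean = true;
  protected readonly ndjsonStream: "stdout" | "stderr" = "stdout";

  protected abstract buildArgs(params: AgentExecuteParams): string[];
  protected abstract buildEnv(params: AgentExecuteParams): Record<string, string>;
  protected abstract extractEvent(line: string): AgentEvent | null;

  // Template method: the skeleton is fixed here; subclasses fill in the steps.
  // A real base class would spawn the process instead of returning a plan.
  plan(params: AgentExecuteParams, outputLines: string[]) {
    const events = outputLines
      .map((line) => this.extractEvent(line))
      .filter((e): e is AgentEvent => e !== null);
    return {
      args: this.buildArgs(params),
      env: this.buildEnv(params),
      stream: this.ndjsonStream,
      stdinPrompt: this.supportsStdinPrompt,
      events,
    };
  }
}

// "echo-agent" and its flags are hypothetical, for illustration only.
class EchoRuntime extends RuntimeBaseSketch {
  protected readonly supportsStdinPrompt = false; // prompt goes on the command line

  protected buildArgs(params: AgentExecuteParams): string[] {
    const resume = params.sessionId ? ["--resume", params.sessionId] : [];
    return ["echo-agent", "--json", ...resume, params.prompt];
  }

  protected buildEnv(params: AgentExecuteParams): Record<string, string> {
    return { ...params.env, NO_COLOR: "1" };
  }

  protected extractEvent(line: string): AgentEvent | null {
    const data = JSON.parse(line) as { kind?: string; text?: string };
    return data.kind === "message" && data.text !== undefined
      ? { type: "text", text: data.text }
      : null; // skip lines this sketch does not understand
  }
}
```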

Subprocess Lifecycle

When execute() is called:

  1. Spawn: The CLI process starts with full stdio pipes (stdin, stdout, stderr).

  2. Stream selection: The ndjsonStream property determines which file descriptor carries NDJSON events. The other stream is captured as diagnostic output.

  3. NDJSON parsing: Lines from the selected stream are parsed as JSON. Valid JSON lines are passed to extractEvent(), which returns an AgentEvent or null (to skip the line).

  4. Stdin prompt delivery: If supportsStdinPrompt is true and the prompt exceeds 10 KB, it is written to stdin. stdin.end() is always called so CLIs that block on stdin receive EOF.

  5. Event yielding: Events are pushed into a queue and yielded to the caller via async iteration.

  6. Termination: After the process exits, any final events (watchdog errors, abort markers) are emitted, followed by the done event.
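
Step 3 can be sketched as a small line buffer. Stream chunks rarely align with line boundaries, so this version keeps the trailing partial line until the next chunk arrives. `NdjsonBuffer` is a hypothetical name, and skipping non-JSON lines silently is an assumption about how invalid lines are handled.

```typescript
// Buffers partial chunks and returns the parsed JSON value of each complete line.
class NdjsonBuffer {
  private pending = "";

  feed(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.pending = lines.pop() ?? "";

    const values: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        values.push(JSON.parse(trimmed));
      } catch {
        // Not valid JSON (for example a stray log line): skip it.
      }
    }
    return values;
  }
}
```

Each returned value would then be handed to extractEvent(), which may still return null to drop it.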

Watchdog Timer

A 5-minute inactivity watchdog resets on every NDJSON line received. If no line arrives within that window, the runtime triggers process termination and emits an error event with code WATCHDOG_TIMEOUT.
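
A resettable timer is enough to sketch this behavior. The `Watchdog` class and its method names are hypothetical; only the 5-minute window and the WATCHDOG_TIMEOUT code come from the text.

```typescript
const WATCHDOG_TIMEOUT_MS = 5 * 60 * 1000;

class Watchdog {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly onTimeout: () => void;
  private readonly timeoutMs: number;

  constructor(onTimeout: () => void, timeoutMs: number = WATCHDOG_TIMEOUT_MS) {
    this.onTimeout = onTimeout;
    this.timeoutMs = timeoutMs;
  }

  // Call on every NDJSON line: cancels the pending timeout and starts a new one.
  touch(): void {
    this.stop();
    this.timer = setTimeout(this.onTimeout, this.timeoutMs);
  }

  // Call when the process exits so the timer cannot fire afterwards.
  stop(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }
}

// Example wiring: the error shape mirrors the error row of the event table.
const watchdogErrors: { type: "error"; message: string; code: string }[] = [];
const watchdog = new Watchdog(() => {
  watchdogErrors.push({
    type: "error",
    message: "No output for 5 minutes",
    code: "WATCHDOG_TIMEOUT",
  });
  // A real runtime would also begin process termination here.
});
watchdog.touch(); // armed at spawn, then again on each line
watchdog.stop(); // stopped here so the example exits immediately
```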

Signal Escalation

Process termination (from watchdog, abort signal, or errors) follows a two-step escalation:

  1. SIGTERM — gives the CLI process a chance to clean up
  2. SIGKILL (after 1.5 seconds) — forces termination if SIGTERM is ignored
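
The escalation can be sketched with a single timer. `Killable` is a hypothetical minimal interface (Node's ChildProcess has a compatible kill method), and returning a cancel function for the exit handler is one possible design, not necessarily what the runtime does.

```typescript
// Minimal view of a child process: only the method this sketch needs.
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
}

const KILL_GRACE_MS = 1500;

// Sends SIGTERM now and schedules SIGKILL after the grace period.
// Call the returned function from the process's exit handler to cancel
// the pending SIGKILL once the process has actually gone away.
function terminate(proc: Killable, graceMs: number = KILL_GRACE_MS): () => void {
  proc.kill("SIGTERM");
  const timer = setTimeout(() => {
    proc.kill("SIGKILL");
  }, graceMs);
  return () => clearTimeout(timer);
}
```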

Per-Execution State Reset

Each concrete runtime resets its internal state before every execute() call. This makes runtime instances reusable across multiple invocations without reconstruction.

CLI Runtimes

Claude

| Aspect | Detail |
| --- | --- |
| Command | claude --output-format stream-json --verbose |
| Structured output | Stream JSON events on stdout |
| Session resumption | --resume <sessionId> |
| MCP config | Inline via --mcp-config '{"mcpServers":{...}}' |
| Stdin prompt | Supported (prompts over 10 KB) |

Claude emits a content block streaming protocol. Text arrives as content_block_delta events with text_delta payloads. Tool use is assembled across multiple events: content_block_start begins a tool buffer, content_block_delta events with input_json_delta accumulate the input JSON, and content_block_stop triggers parsing and emission of the complete tool_use event.
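
The assembly described above can be sketched as a small state machine keyed by block index. The field names (`index`, `content_block`, `partial_json`) follow Anthropic's Messages streaming format and are an assumption about what the runtime sees after any CLI-specific wrapping is removed; `ToolUseAssembler` is a hypothetical name.

```typescript
type StreamEvent =
  | { type: "content_block_start"; index: number; content_block: { type: string; id?: string; name?: string } }
  | {
      type: "content_block_delta";
      index: number;
      delta: { type: "text_delta"; text: string } | { type: "input_json_delta"; partial_json: string };
    }
  | { type: "content_block_stop"; index: number };

type Emitted =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolName: string; toolId: string; input: unknown };

class ToolUseAssembler {
  private buffers = new Map<number, { id: string; name: string; json: string }>();

  handle(ev: StreamEvent): Emitted | null {
    switch (ev.type) {
      case "content_block_start":
        if (ev.content_block.type === "tool_use") {
          // Open a buffer for this block; input JSON arrives in later deltas.
          this.buffers.set(ev.index, {
            id: ev.content_block.id ?? "",
            name: ev.content_block.name ?? "",
            json: "",
          });
        }
        return null;
      case "content_block_delta": {
        const delta = ev.delta;
        if (delta.type === "text_delta") return { type: "text", text: delta.text };
        const buf = this.buffers.get(ev.index);
        if (buf) buf.json += delta.partial_json;
        return null;
      }
      case "content_block_stop": {
        const buf = this.buffers.get(ev.index);
        if (!buf) return null; // a text block ended; nothing to emit
        this.buffers.delete(ev.index);
        // An empty buffer is treated as a tool call with no input.
        const input: unknown = buf.json === "" ? {} : JSON.parse(buf.json);
        return { type: "tool_use", toolName: buf.name, toolId: buf.id, input };
      }
    }
  }
}
```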

Token usage and cost are captured from result events at the end of a run.

Gemini

| Aspect | Detail |
| --- | --- |
| Command | gemini --output-format stream-json --prompt <prompt> |
| Structured output | Flat NDJSON events on stdout |
| Session resumption | --resume <sessionId> |
| MCP config | File-based merge-restore of .gemini/settings.json |
| Stdin prompt | Not supported |

Gemini emits flat events: message (with text content), tool_use, and tool_result as complete, self-contained events. No streaming assembly is needed.

Since the Gemini CLI lacks a flag for MCP server configuration, the runtime uses a merge-restore pattern: it reads the existing .gemini/settings.json, backs up the original content, merges in the MCP server entries, runs the CLI, and restores the original file in a finally block.

Codex

| Aspect | Detail |
| --- | --- |
| Command | codex exec --json --color never <prompt> |
| Structured output | Two-level event hierarchy on stdout |
| Session resumption | codex exec resume --json <sessionId> <prompt> |
| MCP config | File-based merge-restore of ~/.codex/config.toml |
| Stdin prompt | Not supported |

Codex has the most complex event model. Top-level events (thread.started, item.started, item.updated, item.completed, turn.completed) contain nested item types (agent_message, command_execution, mcp_tool_call, etc.).

Text arrives as cumulative snapshots: each item.updated event carries the full text so far, so the runtime tracks the length it has already emitted and yields only the new suffix as a delta.
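
A minimal sketch of that delta computation. `DeltaTracker` is a hypothetical name, and tracking the emitted length per item ID (rather than one global counter) is an assumption.

```typescript
class DeltaTracker {
  // Length of text already emitted, keyed by item ID.
  private emitted = new Map<string, number>();

  // Takes the cumulative text for an item and returns only the unseen suffix.
  next(itemId: string, cumulative: string): string {
    const previous = this.emitted.get(itemId) ?? 0;
    this.emitted.set(itemId, cumulative.length);
    return cumulative.slice(previous);
  }
}
```

For example, successive snapshots "Hel" and "Hello" for the same item yield "Hel" and then "lo"; an unchanged snapshot yields an empty string, which a runtime would presumably skip.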

The MCP config merge-restore targets the global ~/.codex/config.toml file, using a custom TOML serializer (no external TOML dependency).

OpenCode

| Aspect | Detail |
| --- | --- |
| Command | opencode run --format json <prompt> |
| Structured output | Envelope events with part field on stdout |
| Session resumption | --session <sessionId> |
| MCP config | File-based merge-restore of .opencode/config.json |
| Stdin prompt | Supported (default) |

OpenCode wraps events in an envelope containing type, timestamp, and sessionID. The actual event data is in a part field.

A notable pattern: OpenCode emits tool use and tool result in the same NDJSON line. The runtime yields the tool_use event first, then buffers the tool_result in a pending queue that drains after each base event yield. This preserves the expected tool_use followed by tool_result ordering.
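
The pending-queue idea can be sketched like this. The envelope fields (type, timestamp, sessionID, part) come from the text; the fields inside `part` and the "tool_use" envelope type are assumptions, and `PendingQueueSketch` is a hypothetical name.

```typescript
type AgentEvent =
  | { type: "tool_use"; toolName: string; toolId: string; input: unknown }
  | { type: "tool_result"; toolId: string; output: string };

// Envelope fields come from the text; the fields inside `part` are assumptions.
interface Envelope {
  type: string;
  timestamp: number;
  sessionID: string;
  part: { tool: string; callID: string; input: unknown; output: string };
}

class PendingQueueSketch {
  private pending: AgentEvent[] = [];

  // Returns the base event and parks the paired result in the pending queue.
  extractEvent(envelope: Envelope): AgentEvent | null {
    if (envelope.type !== "tool_use") return null;
    const { tool, callID, input, output } = envelope.part;
    this.pending.push({ type: "tool_result", toolId: callID, output });
    return { type: "tool_use", toolName: tool, toolId: callID, input };
  }

  // Emission order for a batch of envelopes: base event, then everything pending.
  process(envelopes: Envelope[]): AgentEvent[] {
    const out: AgentEvent[] = [];
    for (const envelope of envelopes) {
      const base = this.extractEvent(envelope);
      if (base) out.push(base);
      out.push(...this.pending.splice(0)); // drain after each base event
    }
    return out;
  }
}
```

Keeping extractEvent() to a single return value preserves the base-class contract of one event per line, while the queue carries the extra event.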

MCP Configuration Patterns

CLI agents receive MCP server configuration so they can access RemoteClaw’s gateway tools (messaging, cron, etc.). Each CLI has a different mechanism:

| CLI | Method | File Modified | Cleanup |
| --- | --- | --- | --- |
| Claude | CLI flag | None (inline argument) | None needed |
| Gemini | Merge-restore | .gemini/settings.json (workdir) | Restore original |
| Codex | Merge-restore | ~/.codex/config.toml (global) | Restore original |
| OpenCode | Merge-restore | .opencode/config.json (workdir) | Restore original |

The merge-restore pattern (used by Gemini, Codex, and OpenCode) follows a consistent lifecycle:

  1. Read the existing config file (if any)
  2. Save the original content
  3. Merge MCP server entries into the config
  4. Write the modified config
  5. Run the CLI
  6. Restore the original content in a finally block

Each merge-restore manager tracks whether it created the config file or its parent directory, so cleanup removes only what it created and never deletes pre-existing user configuration.
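
The six steps can be sketched for a JSON config as below. File access goes through a hypothetical `FileStore` interface so the example runs without touching disk, the `mcpServers` key is an assumption (each CLI uses its own key and format, and Codex uses TOML), and directory tracking is omitted.

```typescript
// Hypothetical file abstraction; a real manager would use the filesystem.
interface FileStore {
  read(path: string): string | undefined; // undefined when the file does not exist
  write(path: string, content: string): void;
  remove(path: string): void;
}

interface JsonConfig {
  mcpServers?: Record<string, unknown>;
  [key: string]: unknown;
}

async function withMcpConfig<T>(
  store: FileStore,
  path: string,
  servers: Record<string, unknown>,
  run: () => Promise<T>,
): Promise<T> {
  // Steps 1-2: read the existing file (if any) and keep the original content.
  const original = store.read(path);
  const config: JsonConfig = original === undefined ? {} : JSON.parse(original);

  // Steps 3-4: merge the MCP entries and write the modified config.
  const merged = { ...config, mcpServers: { ...config.mcpServers, ...servers } };
  store.write(path, JSON.stringify(merged, null, 2));

  try {
    // Step 5: run the CLI while the merged config is in place.
    return await run();
  } finally {
    // Step 6: restore the original, or remove the file if this call created it.
    if (original === undefined) store.remove(path);
    else store.write(path, original);
  }
}
```

Because restoration sits in the finally block, the user's file is put back even when the CLI run throws.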

Runtime Selection

createCliRuntime(provider) maps a provider string ("claude", "gemini", "codex", or "opencode") to the corresponding runtime class. The provider is normalized to lowercase before matching.
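
A sketch of such a factory. The runtime class names are hypothetical placeholders, and throwing on an unknown provider is an assumption; the source does not say how unmatched values are handled.

```typescript
interface AgentRuntime {
  readonly provider: string;
}

// Placeholder classes; the real runtimes extend CLIRuntimeBase.
class ClaudeRuntime implements AgentRuntime { readonly provider = "claude"; }
class GeminiRuntime implements AgentRuntime { readonly provider = "gemini"; }
class CodexRuntime implements AgentRuntime { readonly provider = "codex"; }
class OpenCodeRuntime implements AgentRuntime { readonly provider = "opencode"; }

// A Map avoids accidental matches on inherited object keys such as "constructor".
const RUNTIME_FACTORIES = new Map<string, () => AgentRuntime>([
  ["claude", () => new ClaudeRuntime()],
  ["gemini", () => new GeminiRuntime()],
  ["codex", () => new CodexRuntime()],
  ["opencode", () => new OpenCodeRuntime()],
]);

function createCliRuntime(provider: string): AgentRuntime {
  const factory = RUNTIME_FACTORIES.get(provider.toLowerCase());
  if (!factory) throw new Error(`Unknown provider: ${provider}`);
  return factory();
}
```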