Where to Run the LLM Agent Harness: Sandbox vs. Local Architecture Tradeoffs
By
Andrea Luzzardi
Fresh out the oven, still warm. Top of the tray.
Summary
This article explores the architectural decision of where to run an LLM agent harness — the core loop that drives an agent by sending prompts, executing tool calls, and feeding results back. It compares two approaches: running the harness inside the sandbox (alongside the LLM) versus outside the sandbox (on the user's local machine or server). The piece details the security implications, failure modes, and capability tradeoffs of each architecture, particularly contrasting single-user and multi-user agent scenarios. It also discusses how skills and memories work when the harness runs outside the sandbox, offering practical guidance for engineers building production agent systems.
Key quotes
· 3 pulledAn agent harness is the loop that drives an LLM. It sends a prompt, gets a response, executes the tool calls the model requested, feeds the results back, and repeats until the model says it's done.
There are two answers. They have different security properties, different failure modes, and different implications for what the agent can do.
The tradeoffs also look different depending on whether you're building a single-user agent (one engineer on a laptop) or a multi-user one (dozens of engineers in the same organization sharing the...)
You might also wanna read
Agent Sandbox: A Tool for AI Agents to Run Code and Generate Files Locally
Agent Sandbox is a tool that provides AI agents with sandboxed computing capabilities, allowing them to run Python/Bash scripts, install pac
OpenAI Updates Agents SDK with Codex-Style Harness and Enhanced Sandboxing
OpenAI's Build Hour session, led by engineer Steve Corley, introduced key updates to the Agents SDK, including a new "Codex-style harness" t
Secure AI Agent Deployment: Sandboxed Execution with relaxAI
This article promotes a webinar/presentation by Ben Norris, AI Engineer at relaxAI, focused on deploying AI agents within secure, sandboxed
