Why I Use gRPC for the Agent-to-Sandbox Bridge (and JSON-RPC Inside It)
Most teams pick one wire protocol and use it everywhere. The right answer is to pick by trust boundary: gRPC + Protobuf for the API-to-pod hop where you control both ends, JSON-RPC over a subprocess pipe inside the sandbox where the network surface has to be zero. Here's the two-tier design and the math behind it.
If you read most agent-sandbox posts in 2026 you'll come away with the impression that the protocol question is settled: REST/JSON over HTTPS, maybe SSE for streaming, done. The two reasons it looks settled are: (1) the demo always works that way because it's the easiest to wire up, and (2) the people writing about it usually have one trust boundary to defend, so one protocol is enough. Production agent platforms have at least two trust boundaries. Picking one protocol for both is a category error.
The version I ship in production looks like this: gRPC + Protobuf at the API-to-pod boundary, JSON-RPC over a stdio pipe inside the sandbox. Two protocols, two reasons, one architectural diagram you have to actually look at:
┌──────────────┐      gRPC + Protobuf          ┌──────────────────┐
│  API server  │ ──HTTP/2 unary + stream─────► │ agent pod (slim) │
│  (FastAPI)   │ ◄──── typed responses ─────── │ one container,   │
└──────────────┘                               │ no Python code   │
                                               │ except orchestr. │
                                               └────────┬─────────┘
                                                        │
                                            fork() + bwrap + seccomp
                                                        │
                                                        ▼
                                               ┌──────────────────┐
                                               │  sandbox proc    │
                                               │  (untrusted      │
                             JSON-RPC over     │  LLM-generated   │
                             stdin/stdout      │  Python)         │
                            ◄───────────────── │                  │
                                               └──────────────────┘

Why gRPC for the API-to-pod boundary
Both ends of this hop are mine. The API server is mine, the agent pod is mine, the network between them is a private VPC. There is no public client, there is no third-party integration. This is exactly the situation gRPC was built for: internal service-to-service traffic where you control the schema, the deploy, and the network.
- Type-safe schema. The Protobuf .proto file is the contract. It compiles to Python types on both sides, so drift between the API and the pod becomes a build error, not a runtime 500. I've spent enough Saturdays debugging silent JSON shape mismatches to never want to spend another one. (A sketch of the contract follows this list.)
- Wire-format efficiency. Public benchmarks put Protobuf at 60-80% smaller payloads than equivalent JSON, and Python's stdlib json parser is roughly 8x slower than Protobuf at decoding equivalent payloads. At 1 RPS this doesn't matter. At 50 RPS in a multi-tenant agent it costs you a full pod replica and eats the latency budget on every call.
- HTTP/2 multiplexing. The agent makes many small RPCs (load tool, run tool, write artifact, stream stdout). HTTP/1.1 head-of-line blocking serializes those onto one connection; HTTP/2 fires them in parallel on the same connection. The cumulative latency saving is non-trivial: I measured ~120 ms shaved off an end-to-end agent turn that fans out 4-6 tool calls.
- Native server streaming. The agent's stdout streams from the pod to the API and onward to the client. gRPC server-streaming is a one-line .proto annotation. Doing the same thing on top of REST means SSE on the client, chunked transfer encoding on the server, and a custom heartbeat to detect drops, which is more code than the rest of the bridge combined.
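For concreteness, here is a minimal sketch of what that contract can look like. Every name in it (SandboxBridge, RunRequest, and so on) is illustrative, not my production schema:

```proto
syntax = "proto3";

package bridge.v1;

// Hypothetical bridge contract: a sketch of the shape, not the real schema.
service SandboxBridge {
  // Unary: run one tool call in the pod, get a typed result back.
  rpc RunTool(RunRequest) returns (RunResult);
  // Server streaming: the one-line annotation that replaces
  // SSE + chunked encoding + custom heartbeats on a REST equivalent.
  rpc StreamStdout(RunRequest) returns (stream StdoutChunk);
  // Unary: persist an artifact produced by a run.
  rpc WriteArtifact(Artifact) returns (RunResult);
}

message RunRequest {
  string tool_name = 1;
  bytes  args_json = 2;  // opaque args; validated pod-side
}

message RunResult {
  int32  exit_code = 1;
  string error     = 2;
}

message StdoutChunk {
  bytes  data   = 1;
  uint64 offset = 2;  // byte offset, so a reconnect can resume
}

message Artifact {
  string path = 1;
  bytes  body = 2;
}
```

Compiled in CI with something like python -m grpc_tools.protoc -I . --python_out=. --grpc_python_out=. bridge.proto (the file name is as hypothetical as the rest), this yields the bridge_pb2 and bridge_pb2_grpc modules referenced later in the post.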
It's worth flagging that E2B's architecture lands on the same conclusion: their public docs explicitly say the data plane uses gRPC, while only the control plane (lifecycle, billing, auth) uses REST. I take that as confirmation that the people running the largest production sandbox fleet in 2026 reached the same answer for the same reasons.
Why JSON-RPC inside the sandbox
Inside the bwrap'd, seccomp-locked sandbox process, the picture inverts. There is no network. The whole point of the sandbox is that the executed code can't reach a network. The only IPC surface is the parent process's stdin and stdout: file descriptors that exist before the seccomp filter clamps down on the syscall set.
This is where JSON-RPC is exactly right and gRPC would be silly:
- The transport is a Unix pipe, not a network socket. gRPC's HTTP/2 transport assumes a network. JSON-RPC is transport-agnostic, and the LSP-style framing (a Content-Length header plus a JSON body) runs over stdio with about 30 lines of Python on each side (sketched after this list).
- The sandbox process is short-lived. It boots, runs one task, dies. The Protobuf toolchain (protoc, generated code, schema compilation) is overhead that pays off only when the process lives long enough to amortize the setup. A sandbox that lives for 800 ms doesn't.
- The schema rarely changes inside the sandbox boundary. The same six tool primitives (read file, write file, exec, list, status, done) cover almost everything the agent generates. JSON's looseness is a feature here, not a bug: I can add a new optional field to a message and old sandboxes ignore it.
- Debuggability matters more than wire size. When a sandboxed task fails, I want to read the request and response in plain text in a log file. JSON-RPC gives me that. Protobuf needs a decoder.
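Here's a minimal sketch of that framing in Python, assuming the LSP-style Content-Length convention; the method name and result shape are made up for illustration:

```python
import json
import sys


def write_message(pipe, payload: dict) -> None:
    """Frame one JSON-RPC message: Content-Length header, blank line, body."""
    body = json.dumps(payload).encode("utf-8")
    pipe.write(b"Content-Length: %d\r\n\r\n" % len(body))
    pipe.write(body)
    pipe.flush()


def read_message(pipe) -> dict:
    """Read one framed message and decode the JSON-RPC payload."""
    length = None
    while True:
        line = pipe.readline()
        if not line:
            raise EOFError("peer closed the pipe")
        if line in (b"\r\n", b"\n"):  # blank line terminates the headers
            break
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":", 1)[1])
    return json.loads(pipe.read(length))


# Sandbox side: stdin/stdout are the only file descriptors that survive
# the seccomp clamp-down, and they're all this loop needs.
if __name__ == "__main__":
    request = read_message(sys.stdin.buffer)   # e.g. {"method": "exec", ...}
    write_message(sys.stdout.buffer, {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"status": "ok"},            # illustrative result shape
    })
```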
There's a deeper reason JSON-RPC fits here too: this is the same protocol shape the Model Context Protocol (MCP) standardized for tool definition. JSON-RPC over stdio is the path of least resistance for any tool that wants to live inside an agent's sandboxed runtime later, and matching MCP's framing now means the sandbox boundary speaks the same dialect the rest of the agent ecosystem already does.
The decision rule, in one sentence
Match the protocol to the trust boundary: gRPC + Protobuf across any hop where you control both ends and have a real network between them, JSON-RPC over stdio across any boundary whose whole job is to have no network at all. That sentence is the whole post. Everything above is the supporting math.
Common objections, briefly answered
- ▸"gRPC is too complex for an internal hop." The whole gRPC + Protobuf setup for a single .proto file (3 service methods, 4 message types) is about 120 lines of generated Python on each side, plus a buf or protoc invocation in CI. The complexity budget gRPC consumes is real but it's a one-time cost, not a per-call cost.
- ▸"Why not gRPC inside the sandbox too?" Because the sandbox doesn't have a network and you don't want to give it one. Adding HTTP/2 over a Unix pipe is technically possible (gRPC supports the unix domain socket transport) but the seccomp filter has to allow connect() and bind() to make it work, which weakens the sandbox to save zero meaningful CPU.
- ▸"Just use REST + SSE." That's the answer for the public-facing API. It's a fine answer at the boundary where you need browser clients, OpenAPI tooling, and Stripe-style debuggability. It's the wrong answer in a private VPC between two services you wrote.
- ▸"What about MCP everywhere?" MCP is JSON-RPC framing with specific verbs and discovery semantics. It's perfect inside the sandbox. At the API-to-pod boundary, where you have one client and one server and you control both, MCP's discovery layer is overhead you don't need.
What I'd change if I rebuilt this from scratch
- Generate the Python types from the .proto file in CI, not in a manual make protos step. Schema drift between dev and prod is the failure mode you want to make impossible, not just unlikely.
- Define a single shared Protobuf file for every typed envelope in the system, not separate files per service. The cost of a slightly larger generated file is much lower than the cost of a duplicated message type that drifts.
- Treat the JSON-RPC schema inside the sandbox as a documented, versioned contract (using JSON Schema, not just hope) so a new contributor can read what the sandbox boundary accepts without reading the source code. A sketch follows this list.
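A minimal sketch of that schema-as-documentation, using the jsonschema package; the six method names come from the primitives listed earlier, everything else is illustrative:

```python
import jsonschema  # pip install jsonschema

# Illustrative JSON Schema for the sandbox-side JSON-RPC envelope.
SANDBOX_RPC_SCHEMA = {
    "type": "object",
    "required": ["jsonrpc", "id", "method"],
    "properties": {
        "jsonrpc": {"const": "2.0"},
        "id": {"type": ["integer", "string"]},
        "method": {
            # The six tool primitives described above.
            "enum": ["read_file", "write_file", "exec", "list", "status", "done"],
        },
        "params": {"type": "object"},
    },
    # Permissive on purpose: old sandboxes ignoring new optional fields
    # is the feature described earlier, so don't forbid them.
    "additionalProperties": True,
}


def validate_request(message: dict) -> None:
    """Raise jsonschema.ValidationError if the envelope is malformed."""
    jsonschema.validate(instance=message, schema=SANDBOX_RPC_SCHEMA)
```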
The bigger point
The protocol question is not 'gRPC or REST.' It's 'where are my trust boundaries, and what is each one's transport actually capable of?' If you ask the second question, the first one answers itself in two or three places per architecture, with different answers in each, and the design just falls out. I'd argue this is what 'principled engineering choice' looks like in 2026 for production agent infrastructure: not picking the protocol that everyone else picked, but knowing why the boundaries are where they are and matching the protocol to the boundary.
References
- E2B docs — REST/OpenAPI for control plane, gRPC for data plane (deepwiki.com/e2b-dev/E2B)
- Cloudflare Sandbox SDK — gRPC support in network settings (developers.cloudflare.com/network/grpc-connections)
- Glama — "How JSON-RPC Helps AI Agents Talk to Tools" (glama.ai)
- MCP specification — JSON-RPC framing for tool definition (modelcontextprotocol.io)
- HackerNoon — "Cut Inter-Agent Latency by 80% With gRPC Streaming" (2026)
- Northflank — code execution sandbox comparison 2026 (northflank.com/blog)