Structured AI · Founding Engineer.
Fourteen weeks of agent infrastructure, sandboxed runtimes, an iOS app, and durable chat — all in production.
Joined at the founding-engineering stage with a clear mandate: ship the agent infrastructure, the iOS client, and the deploy pipeline that production needed. Most of the engineering decisions had compound constraints — enterprise compliance, cross-platform parity, deploy speed — that rewarded engineers who could move fast across the stack.
Fourteen weeks of velocity: 1,035 commits, ~74/week average, peak week of 145. The work spanned backend (FastAPI + Postgres + Redis + gRPC bridge), agent loop (LangGraph + Claude with on-demand tool loading), code execution (an in-house sandbox built for Zero Data Retention compliance), the iOS / iPadOS client from zero, and CI work that cut staging deploys to under seven minutes.
Most architectural detail beyond the public timeline is NDA-bound. The deeper version is available in 1:1 conversation under appropriate scope.
In-house sandbox over hosted PTC
Anthropic's hosted Programmatic Tool Calling doesn't support Zero Data Retention, and enterprise clients required ZDR. Built a request-interception layer that handles PTC traffic locally: untrusted code runs in a hardened in-house pod (JSON-RPC over a child subprocess, AST validation, bubblewrap + seccomp), and customer data is wiped immediately and never reaches the model provider.
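The interception layer itself is NDA-bound, but the pre-execution gate is a standard pattern. A minimal sketch of the kind of AST check that runs before intercepted code reaches the sandboxed subprocess; the allowlist and rules here are illustrative, not the production policy:

```python
import ast

# Hypothetical allowlist for illustration; the real policy is NDA-bound.
ALLOWED_IMPORTS = {"json", "math", "re", "datetime"}

class CodePolicyError(Exception):
    """Raised when intercepted PTC code violates the static policy."""

def validate_tool_code(source: str) -> ast.Module:
    """Statically vet intercepted tool code before it is handed to the sandbox."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Block imports outside the allowlist.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            if isinstance(node, ast.Import):
                roots = [alias.name.split(".")[0] for alias in node.names]
            else:
                roots = [(node.module or "").split(".")[0]]
            for root in roots:
                if root not in ALLOWED_IMPORTS:
                    raise CodePolicyError(f"import of {root!r} is not allowed")
        # Block dunder attribute access, a common sandbox-escape vector.
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise CodePolicyError(f"access to {node.attr!r} is not allowed")
    return tree
```

Anything that passes the static gate still runs inside the sandboxed subprocess; the AST check is a cheap first filter, not the isolation boundary.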
Multi-vendor agent (Claude reasoning + Gemini vision)
Anthropic Claude drives the reasoning loop; high-resolution document inspection is delegated to Google Gemini's vision capability through a multi-tile rendering pipeline that yields significantly higher effective resolution than naive full-page rendering. Different providers, different strengths: the agent uses each where it wins.
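The production pipeline is NDA-bound; a minimal sketch of the tiling idea, with an illustrative grid size and overlap:

```python
from PIL import Image

def tile_sheet(page: Image.Image, grid: tuple[int, int] = (3, 2), overlap_px: int = 64):
    """Split a large drawing sheet into overlapping tiles so each tile lands
    near the vision model's native input resolution instead of one downscaled page."""
    cols, rows = grid
    width, height = page.size
    tile_w, tile_h = width // cols, height // rows
    tiles = []
    for row in range(rows):
        for col in range(cols):
            # Overlap keeps details near tile boundaries from being cut in half.
            left = max(col * tile_w - overlap_px, 0)
            top = max(row * tile_h - overlap_px, 0)
            right = min((col + 1) * tile_w + overlap_px, width)
            bottom = min((row + 1) * tile_h + overlap_px, height)
            tiles.append(((row, col), page.crop((left, top, right, bottom))))
    return tiles
```

Each tile is sent with its grid coordinates, so whatever the vision model finds can be mapped back to a location on the original sheet.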
Synthesis over framework adoption
Built three prior agents on LangGraph Deep Agents, raw LangGraph nodes, and Anthropic's Claude Code SDK loop. None hit the bar. The shipped agent is a synthesis: it takes the strongest patterns from each (prefetch-backed tools and server-side compaction, the workspace + execute model, a minimal-tool philosophy, adaptive thinking) and strips out every line of framework overhead.
Modular agent loop with on-demand tool loading and persistent skill / memory injection. Cache-friendly via Anthropic's ephemeral cache_control directive — the right primitive for cost-sensitive long-context agents.
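A minimal sketch of the caching pattern against the public Anthropic SDK; the prompt, skills, and model choice are placeholders, not the production values:

```python
import anthropic

client = anthropic.Anthropic()

# Illustrative only: the real system prompt and injected skills are NDA-bound.
SYSTEM_PROMPT = "You are the document agent."
SKILL_SNIPPETS = "Persistent skill / memory text injected each turn."

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # model choice is illustrative
    max_tokens=2048,
    system=[
        {
            # Stable prefix (system prompt + skills) is marked with the
            # ephemeral cache_control block, so long-context turns reuse it
            # instead of paying for the full prefix every call.
            "type": "text",
            "text": SYSTEM_PROMPT + "\n\n" + SKILL_SNIPPETS,
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": "Summarize sheet A-101."}],
)
```

The same idea extends to the tool definitions: keep the stable, bulky parts of the context in the cached prefix and let only the per-turn messages vary.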
Construction drawings are 36×24-inch sheets — naive full-page rendering loses critical detail. Multi-tile rendering yields significantly higher effective resolution at the same per-tile cost. Multi-vendor design where each provider plays its strength.
Linux user-namespace isolation + syscall filtering + pre-execution AST static analysis. Same-class hardening as production code-execution platforms (Replit / E2B), built in-house for ZDR-compliant enterprise deployment.
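The production sandbox profile and seccomp filter are NDA-bound; a minimal sketch of a bubblewrap launch with illustrative flags:

```python
import subprocess

def run_in_sandbox(entrypoint: str, workdir: str, timeout_s: int = 30):
    """Launch untrusted code under bubblewrap with user-namespace isolation.
    Flags are illustrative; the production profile and seccomp policy differ."""
    argv = [
        "bwrap",
        "--unshare-all",        # fresh user / pid / net / ipc namespaces
        "--die-with-parent",    # kill the sandbox if the supervisor dies
        "--ro-bind", "/usr", "/usr",
        "--ro-bind", "/lib", "/lib",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--bind", workdir, "/workspace",
        "--chdir", "/workspace",
        "python3", entrypoint,
    ]
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
```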
Anthropic's hosted PTC doesn't support Zero Data Retention. Custom interception layer keeps execution local while preserving Claude's PTC code-quality. ZDR compliance without sacrificing model capability.
Cross-process typed RPC at scale. Protobuf schemas double as the API contract — no drift between client and server.
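A minimal sketch of what the bridge looks like from the Python side, assuming hypothetical protoc-generated modules (chat_pb2, chat_pb2_grpc) standing in for the real NDA-bound schema:

```python
import grpc

# Hypothetical modules generated by protoc from the shared .proto contract.
import chat_pb2
import chat_pb2_grpc

def send_message(conversation_id: str, text: str):
    with grpc.insecure_channel("localhost:50051") as channel:
        stub = chat_pb2_grpc.ChatServiceStub(channel)
        request = chat_pb2.SendRequest(conversation_id=conversation_id, text=text)
        # The generated stub enforces field names and types on both sides,
        # so the client and server cannot drift on the contract.
        return stub.SendMessage(request, timeout=5.0)
```

The same .proto file drives the Swift client's generated types, which is what makes the schema the single source of truth.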
Chat survives network blips, pod restarts, and deploys without lost messages. Replay-on-reconnect is the modern primitive for production conversational UI.
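The production mechanism is NDA-bound; a minimal sketch of replay-on-reconnect using Redis Streams (Redis is already in the stack, but treat the specifics here as illustrative):

```python
import redis

r = redis.Redis()

def append_message(conversation_id: str, role: str, text: str) -> str:
    """Persist each chat event to a per-conversation stream; the returned
    stream ID doubles as the client's resume cursor."""
    return r.xadd(f"chat:{conversation_id}", {"role": role, "text": text}).decode()

def replay_since(conversation_id: str, last_seen_id: str = "0-0", block_ms: int = 5000):
    """On reconnect the client sends its last acked ID; everything after it
    is replayed, so a dropped socket or pod restart loses nothing."""
    streams = r.xread({f"chat:{conversation_id}": last_seen_id}, block=block_ms)
    for _stream, entries in streams or []:
        for entry_id, fields in entries:
            yield entry_id.decode(), {k.decode(): v.decode() for k, v in fields.items()}
```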
Container-native serverless compute for the agent pod. ACR Tasks build with Blacksmith BuildKit cache. CPU-only torch + blobless wipe-guard clone cut staging deploys from 10+ min to under 7.
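Two of the concrete levers, sketched as an illustrative deploy-script excerpt (the repo URL and paths are placeholders, and the rest of the pipeline is NDA-bound):

```python
import subprocess

REPO = "https://example.com/org/agent.git"  # placeholder

def fast_checkout(dest: str = "/build/src") -> None:
    """Blobless clone: fetches commits and trees up front but pulls file
    contents lazily, keeping the checkout step small."""
    subprocess.run(["git", "clone", "--filter=blob:none", REPO, dest], check=True)

def install_cpu_torch() -> None:
    """CPU-only torch wheel: skips the multi-GB CUDA payload the agent pod
    never uses, which is most of the image-build savings."""
    subprocess.run(
        ["pip", "install", "torch", "--index-url", "https://download.pytorch.org/whl/cpu"],
        check=True,
    )
```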
- Founding-engineering velocity isn't typing speed; it's the ability to make 'good enough now, refactor next sprint' calls without stopping the line. Every week I shipped, I also paid down a known piece of debt I'd taken the prior week. Net: forward motion every Monday.
- Multi-vendor agent design is increasingly important: single-provider lock-in is a liability for both compliance and capability. The pattern of 'reasoning model + delegated vision model' will keep generalizing.
- The most valuable infrastructure I shipped was the CI work. Cutting staging deploys from 10+ minutes to under 7 changed how the whole team iterated.