// Document — Field Report

FILED 2026.05.08 / NEW YORK, NY

// Subject

T. SHAH — AI engineer
Founding-engineer reps · NYC

// Status

OPEN TO BUILD · TALKING TO TEAMS

Tanay Shah.

I'm an AI engineer. I build the systems behind production chat apps and agent platforms — backends, sandboxes, iOS clients, fast.

CS Honors + Statistics · UMD · grad. Dec 2025
4 roles · 4 years · NYC-based
Open to: founding · early-stage · applied AI
Most recent: Founding Engineer @ Structured AI

// IF YOU'RE COUNTING

Most recent 14-week window: 1,035 commits, 30 PRs merged, work spanning backend, frontend, iOS, and infra at near-100% blame-ownership. Four production roles across four years. The numbers are reproducible — charts in §03, timeline in §04.

§01.5// transmission · FIELD REPORT

I'm Tanay. I build production AI infrastructure: agent loops, sandboxed runtimes, durable event-sourced chat, and the iOS clients that hang off them. The goal is agents that don't just work but work well, so I think about evals as much as I think about loops.

The way I work: pick a hard problem, draw the diagram, ship the first version in days, instrument it, iterate. I think about agent systems for fun — LangGraph + Claude with on-demand tool loading; gRPC bridges between API and slim agent pods; Python sandboxes built on JSON-RPC, AST validation, and seccomp-locked bubblewrap. The diagrams in §02 are real, the numbers in §03 are measured.

// MOST RECENT CHAPTER
Fourteen weeks as Founding Engineer at Structured AI (NYC). 1,035 commits across backend, frontend, iOS, plugin work, and infra. Shipped the production document-analysis agent (the shape in §02), a ZDR-compliant code-execution sandbox built on the PTC-interception pattern, an in-house eval harness wired to a public domain benchmark, and the iOS app from zero. Before that: distributed systems at Intuitive Labs SF; two years of ML research at UMD's iSchool fine-tuning Llama on twenty million Wikipedia AfD comments. I started writing CNNs at sixteen.

Graduate of the University of Maryland — departmental honors in CS, Statistics minor, December 2025. The page below is instrumented; you can read it like a spec.

§02// architecture · how I build agent systems · LAYERED AGENT PATTERN

How I architect
agent systems.

Same shape every time: a thin client, an API gateway, a slim agent pod that runs a graph runtime + LLM with on-demand tool loading, retrieval over a vector store, durable state in Postgres + Redis. The diagram below is the reference pattern — portable across agent products.

// THE SANDBOX TRICK
The code sandbox isn't there for safety alone; it's also a request interceptor. Hosted code-execution offerings typically ship customer payloads through a third-party endpoint. This sandbox lets the model believe it's executing upstream while the run actually happens on hardened in-house infra, giving you full Zero Data Retention without giving up tool quality. Payloads are wiped immediately; the upstream provider never sees them.
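A minimal sketch of the interception idea, not the production code. The tool name `code_execution`, the `ToolCall` shape, and `run_in_local_sandbox` are hypothetical stand-ins; the real sandbox and provider wiring are not shown on this page.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical name of the tool the model believes runs on the provider's side.
CODE_EXEC_TOOL = "code_execution"


@dataclass
class ToolCall:
    name: str
    payload: dict


def run_in_local_sandbox(code: str) -> dict:
    """Stand-in for the hardened in-house sandbox (assumed, not implemented here)."""
    return {"stdout": f"ran {len(code)} chars locally", "executed_by": "in-house"}


def intercept(call: ToolCall, forward_upstream: Callable[[ToolCall], dict]) -> dict:
    """Route code-execution calls to local infra; pass every other call through."""
    if call.name == CODE_EXEC_TOOL:
        # The payload is handled locally and is never handed to forward_upstream.
        return run_in_local_sandbox(call.payload["code"])
    return forward_upstream(call)
```

The point of the shape is that the agent loop sees one uniform tool-result contract, while the routing decision stays in a single function that is easy to audit.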

→ Layered shape · portable across agent products
→ Streaming end-to-end · WSS · SSE · JSON-RPC
→ Tools discovered at runtime · MCP-shaped
→ Sandbox = interceptor · ZDR by default

// FIG. 02 · AGENT PLATFORM · LAYERED VIEW · SCALE 1:N · PROD · 2026.04

Legend: critical path · aux flow · streamed

§03// velocity · output, sampled · 14-WEEK SAMPLE · COUNTS ANONYMIZED

What fourteen weeks of me looks like.

1,035 commits, ~74/week median. The chart below is a single 14-week sample of my output cadence — author-filtered to a single email. Calendar dates and milestone labels are stripped out for confidentiality; the bar shape is real.

WEEKS: 14
TOTAL: 1,035 commits
MEDIAN: ~74 commits / week
PEAK WK: 145 commits
MIN WK: 30 commits
AUTHOR: single email · public git

// FIG. 03 · WEEKLY COMMIT VOLUME · N = 14

// SHAPE — ramp up over the first three weeks, sustained mid-cadence through the middle, peak in the final third around a major ship, taper through the wrap. Median ~74/week, peak 145, min 30. Counts and shape are real; calendar dates and milestone labels are out of scope here.
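One way to reproduce figures like these, sketched under assumptions: the commit dates are taken to come from something like `git log --author=<email> --date=short --pretty=format:%ad`, parsed into `date` objects beforehand. The function names are illustrative, and weeks with zero commits would not appear in the buckets.

```python
from collections import Counter
from datetime import date
from statistics import median


def weekly_counts(commit_dates: list[date]) -> dict[tuple[int, int], int]:
    """Bucket commit dates by (ISO year, ISO week)."""
    counts: Counter = Counter()
    for d in commit_dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def summarize(counts: dict[tuple[int, int], int]) -> dict:
    """Weeks, total, median, peak and minimum over the weekly buckets."""
    per_week = list(counts.values())
    return {
        "weeks": len(per_week),
        "total": sum(per_week),
        "median": median(per_week),
        "peak": max(per_week),
        "min": min(per_week),
    }
```

As a sanity check on the stated numbers, 1,035 commits over 14 weeks is about 73.9 per week on average, which sits close to the quoted ~74 median.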

§04// timeline · career

§06// capability · year × technology · LEGEND BELOW

Surface area —
instrumented.

A skill cloud lies. This grid tells the truth: which years used what, how deeply, and where things accelerated. 2026 is the steepest column on the page — that's by design.

// LEGEND

not used / before time

touched · learning

production-grade

shipped at scale · owned

// TECH · 2022 · 2023 · 2024 · 2025 · 2026

LANGUAGES

Python

TypeScript

Swift / SwiftUI

C# / XAML

C / C++

SQL

AGENTS · AI

Anthropic Claude

LangGraph / Deep Agents

MCP

Gemini Pro / Flash

PyTorch / TF

RAG · pgvector

Eval harness · benchmark

Prompt eng · cache · 1M ctx

BACKEND · INFRA

FastAPI

PostgreSQL · asyncpg

Redis

gRPC + Protobuf

Docker · multi-stage

Azure (Container Apps)

AWS (Fargate · Lambda · S3 · SQS)

Kafka / Spark

SANDBOXING · SECURITY

bwrap / seccomp

AST validation

FRONTEND · MOBILE

SvelteKit · Svelte 5

React / Next.js

iOS · MarkdownUI

OTHER

Typst

SearXNG · self-host

§07// faq · for recruiters & co-founders · 14 ANSWERS

The
quick answers.

Recruiters and co-founders ask the same questions on the first call. Here are fourteen answers, concise enough to scan and complete enough to pre-qualify.

Q01 · Who is Tanay Shah?

Tanay Shah is an AI engineer in New York City — Founding Engineer at Structured AI (Y Combinator F25 batch), where over a 14-week tenure he shipped a production document-analysis agent, an in-house ZDR-compliant code-execution sandbox, an iOS / iPadOS client from zero, and CI work that materially cut staging deploy time (1,035 commits, author-filtered, public-git verifiable). University of Maryland Computer Science graduate (departmental honors, Statistics minor, December 2025). Open to founding and early-stage AI engineering roles in 2026. Distinct from the other public Tanay Shahs: the Senior Product Manager at Amazon in Boston, the corporate-securities attorney at Baker McKenzie in Toronto, the diagnostic-radiology professor at the University of Florida, and the M&A principal at Deloitte Consulting.

Q02 · What does Tanay build?

Production AI infrastructure: agent loops on LangGraph + Anthropic Claude with on-demand tool loading and prompt caching; sandboxed Python runtimes built on JSON-RPC + AST validation + bubblewrap + seccomp; durable event-sourced chat backends with replay-on-reconnect cursors and dead-letter queues; iOS / iPadOS clients with streaming markdown chat and Apple Sign-In OAuth; real-time data pipelines with bounded queues, drop-oldest backpressure, and on-anomaly black-box flight recorders.
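For readers unfamiliar with drop-oldest backpressure, here is a minimal sketch of the idea. The class name and the `dropped` counter are illustrative assumptions, not taken from any of the projects described here.

```python
from collections import deque


class DropOldestQueue:
    """Bounded FIFO that evicts the oldest item when a new one arrives at capacity."""

    def __init__(self, capacity: int) -> None:
        self._items: deque = deque(maxlen=capacity)
        self.dropped = 0  # how many items were evicted to make room

    def put(self, item) -> None:
        if len(self._items) == self._items.maxlen:
            # A deque with maxlen discards from the opposite end on append.
            self.dropped += 1
        self._items.append(item)

    def get(self):
        return self._items.popleft()

    def __len__(self) -> int:
        return len(self._items)
```

The trade-off is deliberate: a slow consumer loses stale data instead of stalling the producer, which suits feeds where the newest value matters most.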

Q03 · Where is Tanay based?

New York City. Open to relocation for the right team.

Q04 · What kinds of roles is Tanay looking for?

Founding / early-stage engineer roles at applied-AI startups, infrastructure / platform engineering at scale, and big-tech SWE positions where the team ships hard problems. He's particularly drawn to teams working on agent infrastructure, MCP, multi-vendor LLM orchestration, durable chat, and AI safety / sandboxing.

Q05 · What's Tanay's strongest technical area?

AI agent infrastructure. He's shipped LangGraph + Anthropic Claude (Opus / Sonnet) agents in production, designed an in-house Programmatic-Tool-Calling sandbox that satisfies Zero Data Retention compliance for enterprise clients, built multi-vendor agent designs (Claude for reasoning, Gemini Vision for high-resolution document inspection), and contributed to the Model Context Protocol (MCP) ecosystem with a 4-star Travel MCP server.

Q06 · What was Tanay's role at Structured AI?

Founding Engineer (NYC, February 2026 – May 2026). 1,035 commits in 14 weeks across backend, frontend, iOS, plugin work, and infra — author-filtered, public-git verifiable. Shipped the production document-analysis agent (graph-runtime + LLM), an in-house ZDR-compliant code-execution sandbox, the iOS app from zero, and CI work that materially cut staging deploy time.

Q07 · How fast does Tanay ship?

Sustained: ~74 commits per week median over a documented 14-week sample. Peak week: 145 commits. The sample is author-filtered to a single email and reproducible from public git. The cadence sustains across backend, frontend, iOS, and infra work simultaneously.

Q08 · What's Tanay's tech stack?

Languages: Python (3.12, asyncio TaskGroup / ExceptionGroup), TypeScript (5.5+), Swift / SwiftUI, C# / XAML, SQL, C / C++.
Agents / AI: Anthropic Claude (Opus / Sonnet), Google Gemini 3.1 Pro Vision, OpenAI GPT-4o, Llama 3.3 70B (via Groq), LangGraph, Deep Agents, Anthropic Claude Code SDK, MCP.
Backend: FastAPI, PostgreSQL (async drivers, pgvector, connection pooling), Redis, gRPC + Protobuf, Trigger.dev v4.
Storage: Drift (Flutter SQLite), ObjectBox vectors, TensorFlow Lite, Apache AGE (Postgres graph), MongoDB.
Frontend / Mobile: SvelteKit + Svelte 5, React + Next.js, Tailwind, Swift / SwiftUI, Flutter / Dart.
Infra: Azure (Container Apps + Jobs, container registry, build cache), AWS (Fargate, Lambda), Vercel, Docker (multi-stage).
Security: bubblewrap, seccomp, AST validation.

Q09 · How can I reach Tanay?

Email tanayshah2024@gmail.com (fastest), LinkedIn at linkedin.com/in/tanayshah11, or GitHub at github.com/tanayshah11. His résumé is downloadable from tanayshah.dev. He responds to recruiter outreach within 24 hours.

Q10 · Does Tanay have open-source work?

Yes. Notable repos at github.com/tanayshah11: travel-mcp-server (4 stars — MCP server for AI travel search), ai-agent-error-patterns (production reliability patterns for AI agents on Trigger.dev v4), mercury-stream (real-time market data pipeline with anomaly detection), apex-backtest (CLI wrapper for QuantConnect's Lean), and jarvis (privacy-first on-device personal AI on Flutter + FastAPI).

Q11 · How does Tanay pick MCP servers for an agent?

He treats each MCP server as 500-1,000 tokens of prompt context per tool, billed every turn forever, and runs each candidate through five questions: does it solve a weekly problem (not a quarterly one); is the tool surface domain-shaped or kitchen-sink; does the server cache; does it have a clean structured-error contract; is the maintainer responsive to security issues. The recommended starting set is three servers (a code/repo server, a docs server, and one domain-specific server); a fourth has to earn its place. Full heuristic with 2026 token-cost data at tanayshah.dev/blog/picking-mcp-servers-for-agents.
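The per-tool cost compounds quickly, which a rough calculation makes concrete. This is a back-of-the-envelope sketch: the function name is illustrative, the tool and turn counts are made-up inputs, and it ignores prompt caching, which would lower the billed cost.

```python
def mcp_context_overhead(tools: int, tokens_per_tool: int, turns: int) -> int:
    """Prompt tokens spent on tool definitions over a whole conversation."""
    return tools * tokens_per_tool * turns


# Assumed scenario: 3 servers exposing 5 tools each, over a 20-turn session.
low = mcp_context_overhead(tools=15, tokens_per_tool=500, turns=20)
high = mcp_context_overhead(tools=15, tokens_per_tool=1_000, turns=20)
print(f"{low:,} to {high:,} tokens")  # 150,000 to 300,000 tokens
```

Under those assumptions, tool definitions alone account for 150,000 to 300,000 prompt tokens per session, before any user content.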

Q12 · How does Tanay think about agent sandbox security?

He uses a four-layer hardening model — filesystem (allowlist via Bubblewrap --ro-bind, never denylist), syscall (seccomp BPF filters blocking ptrace / process_vm_readv / unexpected execve), network (default-deny with per-tool allowlists), and capability boundary at the model layer (narrow domain-specific tools instead of generic shell). Three operational rules: assume eventual escape and scope IAM accordingly; log every tool call to immutable storage; treat the model itself as a supply-chain dependency. Reasoned through the April 2026 Claude Code Bubblewrap escape (Ona) and the May 2026 CVEs in Microsoft Semantic Kernel and Claude Code Hooks. Full write-up at tanayshah.dev/blog/agent-sandbox-runtime-hardening.
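AST validation is the cheapest check to illustrate, because it runs before any code reaches the sandbox. The sketch below uses Python's standard `ast` module; the banned-call and allowed-import sets are hypothetical policy choices for illustration, not the actual rules used in production.

```python
import ast

# Hypothetical policy, for illustration only.
BANNED_CALLS = {"eval", "exec", "open", "__import__", "compile"}
ALLOWED_IMPORTS = {"math", "json", "statistics"}


def validate(source: str) -> list[str]:
    """Return policy violations; an empty list means the code passed this check."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    problems.append(f"import not allowed: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root not in ALLOWED_IMPORTS:
                problems.append(f"import not allowed: {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BANNED_CALLS:
                problems.append(f"call not allowed: {node.func.id}")
    return problems
```

A static check like this is easy to bypass on its own (attribute lookups and aliasing slip past a name-based filter), which is why it sits in front of the filesystem, syscall, and network layers instead of replacing them.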

Q13 · How does Tanay handle multi-vendor LLM orchestration?

He picks each vendor by what it wins at: Claude for long-context reasoning and tool calling, Gemini Vision for high-resolution document inspection, Llama 3.3 on Groq for sub-100ms first-token latency on the live coaching path, and GPT for open-ended creative work. Everything sits behind a clean abstraction so a model swap is one config line, with per-tenant routing rules, per-vendor circuit breakers, and unified token-cost / latency telemetry. Multi-vendor is the new senior+ AI engineering bar; single-vendor portfolios are 2024-tier. Architecture in the multi-vendor agent design post.

Q14 · What's Tanay's view on Zero Data Retention for production AI agents?

ZDR-grade compliance can't piggyback on Anthropic's hosted Programmatic Tool Calling because that feature's data retention is governed by its own policy, not your org's ZDR setting. The pattern Tanay uses: a request-interception layer that catches tool calls before they reach the hosted backend, runs them in your own sandbox, and returns results to the agent loop. This preserves model-side reasoning quality while keeping customer data on-prem. Architectural write-up with the Anthropic-doc citation at tanayshah.dev/blog/zero-data-retention-agents.

§08// contact · open comms · STDIN OPEN

Want to build
something serious?

Founding / early engineer roles. AI infra. Backend at scale. Big tech if the team ships hard problems. I'm in NYC, willing to relocate, and I move fast. Pick a command:

tanay@field-report ~ % ● ONLINE
$ open mail        # Open mail client
$ ssh linkedin     # Connect on LinkedIn
$ git clone        # View source on GitHub
$ open devpost     # Devpost portfolio
$ cat resume.pdf   # Download resume
# Or just email tanayshah2024@gmail.com

EOF · END OF TRANSMISSION
