---
id: "claim-80-percent-plumbing"
type: "claim"
source_timestamps: ["00:25:27"]
tags: ["engineering-philosophy"]
related: ["concept-dual-logging-system-events", "quote-80-percent-plumbing"]
confidence: "high"
testable: true
speakers: ["Nate B. Jones"]
sources: ["s46-anthropic-25b-leak"]
sourceVaultSlug: "s46-anthropic-25b-leak"
originDay: 46
---
# Agent Building is 80% Plumbing

## The Claim
Building successful, production-grade AI agents is **80% non-glamorous backend engineering** (plumbing) and only **20% AI prompting**. The majority of the work involves:

- tool registries (see [[concept-metadata-first-tool-registry]])
- permission systems (see [[concept-risk-segmentation-permissions]], [[concept-contextual-permission-handlers]])
- state persistence (see [[concept-complete-session-persistence]], [[concept-workflow-state-separation]])
- logging infrastructure (see [[concept-dual-logging-system-events]])
- token budgeting (see [[concept-predictive-token-budgeting]], [[concept-transcript-compaction]])
- streaming and observability (see [[concept-structured-streaming-events]])

## Confidence: High
This is the central organizing claim of the entire video. Captured directly in [[quote-80-percent-plumbing]].

## Testability
**Testable in spirit, not in exact ratio.** A team could measure engineering hours spent on infrastructure vs. prompt iteration, but the precise 80/20 split is rhetorical.

## Validation (Enrichment)
- Partially supported as opinion; no direct empirical study quantifies an 80/20 split.
- Industry analyses cite **70–90% non-LLM work** in production agents, broadly aligning with the directional claim.
- Counter-perspective: with stronger reasoning models like o1, OpenAI evals suggest infra-vs-prompting may shift toward 50/50 in some domains.

## How a Downstream Agent Should Treat This
Use the *direction* (engineering effort dominates prompting effort) as well-supported. Avoid quoting the exact 80/20 number as if it were measured. Cite the supporting primitives when defending the claim.