---
id: "framework-agent-evaluation"
type: "framework"
source_timestamps: ["00:17:48", "00:17:55"]
tags: ["evaluation", "metrics", "adoption"]
related: ["concept-negative-lift", "action-measure-review-burden"]
steps_count: 4
sources: ["s06-openai-free-employee"]
sourceVaultSlug: "s06-openai-free-employee"
originDay: 6
---
# Agent Viability Evaluation (Time vs. Review)

## Purpose

A simple but ruthless framework for deciding whether an AI agent stays in production or gets killed. **It focuses entirely on the net time saved for the human operator, ignoring the novelty or 'impressiveness' of the AI's output.**

## Steps

1. **Measure Time Saved** — Calculate the time the human previously spent executing the workflow manually.
2. **Measure Review Burden** — Calculate the time the human now spends reading, second-guessing, and correcting the agent's draft.
3. **Calculate Net Signal** — Subtract Review Burden from Time Saved, measured over the same workflow and time window.
4. **Decision** — If Net Signal is positive, keep the agent. If Review Burden exceeds Time Saved ([[concept-negative-lift|Negative Lift]]), kill the agent immediately.
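
The steps above can be sketched as a small calculation. This is a minimal illustration only; the class and field names are hypothetical, not from the source:

```python
from dataclasses import dataclass

@dataclass
class AgentEval:
    """Net-time evaluation for one agent workflow (hypothetical helper)."""
    time_saved_min: float      # time previously spent executing the workflow manually
    review_burden_min: float   # time now spent reading, second-guessing, correcting drafts

    @property
    def net_signal_min(self) -> float:
        # Step 3: Net Signal = Time Saved - Review Burden
        return self.time_saved_min - self.review_burden_min

    def decision(self) -> str:
        # Step 4: positive net signal -> keep; negative lift -> kill
        return "keep" if self.net_signal_min > 0 else "kill"

# Agent saves 30 min of drafting but costs 45 min of review: negative lift.
print(AgentEval(time_saved_min=30, review_burden_min=45).decision())  # kill
```

Note that the decision rule is deliberately binary; the framework offers no middle ground for agents whose output is impressive but net-negative on time.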

## Operationalization

See [[action-measure-review-burden]] for the operational instruction. Couple this evaluation with [[framework-ideal-agent-target]] selection criteria — bad use-case selection produces guaranteed negative lift.

## Enrichment Validation

Mirrors McKinsey's 'net productivity gain' formula: `(time saved) − (review/correction time) > 0`. Validators rate this framework as fully supported and aligned with mainstream enterprise AI ROI guidance.
## Related across days
- [[concept-negative-lift]]
- [[framework-hex-eval]]
- [[framework-private-bench-suite]]
- [[concept-scenario-testing]]
