---
id: "quote-oversell-undersell"
type: "quote"
source_timestamps: ["00:00:00"]
tags: ["evaluations", "model-behavior"]
related: ["concept-model-self-review-bias"]
speaker: "Nate B. Jones"
speakers: ["Nate B. Jones"]
sources: ["s12-opus-47"]
sourceVaultSlug: "s12-opus-47"
originDay: 12
---
# Model self-review biases (oversell vs. undersell)

## Quote

> *"Opus oversells itself and GPT-5.4 undersells itself."*
>
> — [[entity-nate-b-jones|Nate B. Jones]]

## Why It Matters

This quote is a succinct summary of how two frontier models differ when evaluating their own performance. That difference is **crucial for teams building LLM-as-a-judge pipelines**.

If you rely on a single model as evaluator, you embed its self-review bias into your eval pipeline:
- Use [[entity-claude-opus-4-7-d12|Opus]] as judge → results skew optimistic.
- Use [[entity-chatgpt-5-4|GPT-5.4]] as judge → results skew pessimistic.
- Use both and triangulate → the opposing biases partly offset, giving a better-calibrated result.
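
The "use both and triangulate" option can be sketched in a few lines. This is a minimal illustration, not the method described in the source: it assumes each judge has already produced a score per item on a 0 to 1 scale, combines the two with a plain average, and flags items where the judges disagree by more than a threshold. The names `triangulate`, `Triangulated`, and `review_gap`, and the 0.3 default, are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Triangulated:
    item_id: str
    optimistic: float    # score from the judge assumed to skew high
    pessimistic: float   # score from the judge assumed to skew low
    combined: float      # simple average of the two scores
    disagreement: float  # absolute gap between the two judges
    needs_review: bool   # True when the gap exceeds review_gap


def triangulate(optimistic_scores, pessimistic_scores, review_gap=0.3):
    """Combine two judges' per-item scores (assumed to be on a 0-1 scale).

    Only items scored by both judges are kept. Averaging is one simple way
    to triangulate; it is an assumption of this sketch, not a claim about
    how any particular eval framework does it.
    """
    shared_ids = sorted(optimistic_scores.keys() & pessimistic_scores.keys())
    results = []
    for item_id in shared_ids:
        high = optimistic_scores[item_id]
        low = pessimistic_scores[item_id]
        gap = abs(high - low)
        results.append(
            Triangulated(
                item_id=item_id,
                optimistic=high,
                pessimistic=low,
                combined=mean([high, low]),
                disagreement=gap,
                needs_review=gap > review_gap,
            )
        )
    return results


if __name__ == "__main__":
    # Illustrative scores only; not real eval data.
    judge_a = {"task-1": 0.9, "task-2": 0.8}
    judge_b = {"task-1": 0.7, "task-2": 0.3}
    for row in triangulate(judge_a, judge_b):
        print(row)
```

Items flagged with `needs_review` are the ones where the two judges diverge most, which makes them natural candidates for a human spot check or a third judge.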

## Maps To

- [[concept-model-self-review-bias]] — the underlying bias dynamic.
- [[framework-hex-eval]] — peer-review step that surfaces this.

## Cross-References

- Concept: [[concept-model-self-review-bias]]
- Entity: [[entity-claude-opus-4-7-d12]], [[entity-chatgpt-5-4]]
- Framework: [[framework-hex-eval]]
