Evidence, Oracles, And Admission


Agents are good at producing candidates.

They can write a patch, explain why it should work, summarize a trace, or say a test passed.

The system should not confuse any of that with admission. Admission needs a stronger object: a candidate, a claim, a trusted surface that can check the claim, and a boundary that says what proof is required before shared state changes.

[Diagram: PATCH, TEST, TRACE, BUNDLE, ADMISSION, APPLY, REVIEW, RERUN, BLOCK]

A candidate becomes useful when it carries evidence into an admission decision.

Claims Need Surfaces

This is a claim:

I fixed the issue and tests pass.

This is an admissible shape:

Parser Recovery Candidate

Candidate: Changed parser recovery
  The patch is the proposed value. It is not the decision.

Claim: Malformed input now reports a span
  The statement the candidate wants the system to believe.

Oracle: Parser fixture malformed-03 passed
  A trusted surface checks the claim in a bounded case.

Boundary: Private parser behavior
  The proof requirement is smaller than a public API change.

Limit: Does not cover streaming parser
  A visible non-claim prevents accidental widening.

Route: Apply or review
  The output is a route, not just a pass/fail label.

A semantic record makes the candidate, claim, oracle, boundary, limit, and route inspectable as one review object.

The second form can be inspected. It can be compared with other attempts. It can be invalidated if the code moves.
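One way to picture such a record is as a small immutable value bound to the source state it was checked against. The sketch below is illustrative only: `AdmissionRecord`, `source_hash`, `bound_hash`, `is_stale`, and the sample source text are hypothetical names and values, not part of any described system.

```python
import hashlib
from dataclasses import dataclass


def source_hash(text: str) -> str:
    """Hash the exact source text the evidence was collected against."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AdmissionRecord:
    candidate: str   # the proposed change
    claim: str       # what the candidate wants the system to believe
    oracle: str      # the trusted surface that checked the claim
    boundary: str    # where the change lands
    limit: str       # an explicit non-claim
    route: str       # what happens next
    bound_hash: str  # hash of the source state the oracle ran against

    def is_stale(self, current_source: str) -> bool:
        """The record is invalidated once the code it was bound to has moved."""
        return source_hash(current_source) != self.bound_hash


# Hypothetical source text, used only to bind the record to a file state.
original = "fn recover(input) { report_span(input) }"

record = AdmissionRecord(
    candidate="Changed parser recovery",
    claim="Malformed input now reports a span",
    oracle="Parser fixture malformed-03 passed",
    boundary="Private parser behavior",
    limit="Does not cover streaming parser",
    route="Apply or review",
    bound_hash=source_hash(original),
)
```

Because the record is a plain value, two attempts can be compared field by field, and `record.is_stale(new_source)` gives a cheap invalidation check when the file changes.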

Evidence is not a longer explanation. It is the material that lets the system decide what the explanation is allowed to prove.

Oracles Decide Trust

An oracle is a surface the system agrees to trust more than the agent's confidence.

It might be a compiler, a test, a trace, a golden output, a reference implementation, a product invariant, or a human decision.

[Diagram: CLAIM, COMPILER, TESTS, TRACE, HUMAN, COMPARE, ADMIT, REJECT]

A claim becomes useful when it is checked against oracles the system has agreed to trust.

Different oracles have different jurisdictions:

type checker -> static compatibility
unit fixture -> focused behavior
browser trace -> observed workflow
declaration diff -> exported shape
human answer -> chosen intent
reference implementation -> expected semantics

No oracle proves everything. That is the point. A good system remembers what each oracle can and cannot decide.
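The jurisdiction list above can be kept as data, so the system can ask which oracle is allowed to decide a given question. This is a minimal sketch: the table copies the pairs listed above, while `JURISDICTION`, `can_decide`, and `oracles_for` are hypothetical names, and the one-jurisdiction-per-oracle shape is a simplifying assumption.

```python
# Each oracle is trusted for exactly one kind of question (simplifying assumption).
JURISDICTION = {
    "type checker": "static compatibility",
    "unit fixture": "focused behavior",
    "browser trace": "observed workflow",
    "declaration diff": "exported shape",
    "human answer": "chosen intent",
    "reference implementation": "expected semantics",
}


def can_decide(oracle, question):
    """True only if the question falls inside the oracle's jurisdiction."""
    return JURISDICTION.get(oracle) == question


def oracles_for(question):
    """List the oracles the system would accept for this kind of question."""
    return sorted(o for o, j in JURISDICTION.items() if j == question)
```

An unknown source such as `"agent confidence"` has no entry, so `can_decide` returns `False` for it on every question.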

Admission Is A Type Check

A patch is not ready because it applies.

It is ready when its claim, region, and evidence match the boundary it wants to cross.

[Diagram: PATCH, CLAIMS, REGIONS, EVIDENCE, CHECKER, APPLY, TYPE ERROR]

Admission checks the candidate, claims, regions, and evidence before shared state can change.

The same text edit can require different evidence depending on where it lands:

private helper edit
  requires: local call-site proof
 
public type edit
  requires: declaration proof, type gate, consumer impact
 
runtime-order edit
  requires: effect graph, trace, replay
 
product behavior edit
  requires: interaction evidence, accepted decision

That is why admission feels like a type system. The patch is the value. The boundary is the expected type. The evidence is the proof that the value is assignable to that position.
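The analogy can be sketched as a lookup from boundary to required evidence, with admission succeeding only when nothing is missing. The requirement table mirrors the list above. The function names and the treatment of evidence as plain string labels are assumptions made for illustration.

```python
# Boundary -> evidence that must be present before the edit is admitted.
REQUIRED = {
    "private helper edit": {"local call-site proof"},
    "public type edit": {"declaration proof", "type gate", "consumer impact"},
    "runtime-order edit": {"effect graph", "trace", "replay"},
    "product behavior edit": {"interaction evidence", "accepted decision"},
}


def missing_evidence(boundary, evidence):
    """Return the required proofs the candidate did not bring."""
    return REQUIRED[boundary] - set(evidence)


def admit(boundary, evidence):
    """The 'type check': the patch is assignable only if no proof is missing."""
    return not missing_evidence(boundary, evidence)


# The same evidence is enough for one boundary and a type error at another.
evidence = {"local call-site proof"}
private_ok = admit("private helper edit", evidence)
public_ok = admit("public type edit", evidence)
```

Here `private_ok` is `True` and `public_ok` is `False`: the evidence did not change, only the position it was asked to fill.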

Evidence Has Types

Evidence is not one flat substance called proof.

A type check proves static compatibility. A browser trace proves one rendered workflow. A declaration diff proves exported shape. A screenshot proves one viewport. None of those should be silently widened into a stronger claim.

[Diagram: CLAIM, EXPECTED, PROJECT GATE, API PROOF, TRACE, SCREENSHOT, ADMIT, MISMATCH]

Stronger evidence can satisfy weaker proof requirements, but weak evidence should not be widened into stronger claims.

Agent note (weakest)
  Can prove: A hypothesis, summary, or proposed explanation.
  Stops at: Does not prove the patch is correct.
  Route: Ask for an oracle

Source binding (bound)
  Can prove: The evidence belongs to a specific file state, span, hash, or output.
  Stops at: Does not prove behavior by itself.
  Route: Attach to candidate

Static oracle (static)
  Can prove: Syntax, types, imports, declarations, or public shape under a static checker.
  Stops at: Does not prove layout, interaction, or runtime order.
  Route: Admit static claim

Focused fixture (focused)
  Can prove: One declared behavior path, fixture, viewport, or state transition.
  Stops at: Does not prove every product state.
  Route: Admit scoped behavior

Runtime trace (observed)
  Can prove: The output behaved under a replayed browser, effect, or workflow condition.
  Stops at: Only proves the traced conditions.
  Route: Admit traced workflow

Accepted decision (strongest)
  Can prove: The intended outcome was chosen for the named boundary.
  Stops at: Does not erase lower evidence requirements.
  Route: Cross boundary

An evidence ladder names what each proof level can claim, where the claim stops, and what route follows from that strength.

project type gate
  can satisfy: single-file type gate
 
single-file type gate
  cannot satisfy: project type gate
 
workflow trace
  can satisfy: rendered-state proof
 
rendered screenshot
  cannot satisfy: workflow proof

Agent confidence is not a subtype of evidence. It can be a reviewer note, a hypothesis, or a reason to run a check. It should not be assignable to a gate result, source hash, trace, or accepted decision.
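The assignability rules above behave like a small, explicit subtype relation. The sketch below encodes only the two "can satisfy" pairs named above and treats everything else as not assignable. `SATISFIES` and `can_satisfy` are hypothetical names, and the reflexive rule (evidence satisfies a requirement of its own kind) is an added assumption.

```python
# (offered evidence, required proof) pairs where widening is explicitly allowed.
SATISFIES = {
    ("project type gate", "single-file type gate"),
    ("workflow trace", "rendered-state proof"),
}


def can_satisfy(offered, required):
    """Evidence satisfies its own kind or a requirement it is listed against."""
    return offered == required or (offered, required) in SATISFIES


checks = {
    "project gate covers single file": can_satisfy("project type gate", "single-file type gate"),
    "single file covers project": can_satisfy("single-file type gate", "project type gate"),
    "screenshot covers workflow": can_satisfy("rendered screenshot", "workflow proof"),
    "confidence covers gate result": can_satisfy("agent confidence", "gate result"),
}
```

Only the first check comes out `True`. Because the relation is a closed list, agent confidence never becomes assignable to a gate result unless someone adds that pair on purpose.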

Disagreement Becomes Work

The useful moment is often when a claim and an oracle disagree.

[Diagram: OPINION, ORACLE, COMPARE, MATCH, MISMATCH, NEXT TASK]

When claim and oracle disagree, the mismatch should become a route instead of a debate over confidence.

claim: public API preserved
oracle: declaration changed
route: review or update claim
 
claim: behavior unchanged
oracle: trace diverged
route: rerun with narrower patch
 
claim: product choice is obvious
oracle: no accepted decision
route: ask

The candidate is not necessarily worthless. The system just learned which proof is missing or which claim was too broad.
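The three cases above can be read as a routing table keyed by the claim and what the oracle observed. This is a rough sketch: `ROUTES` and `next_route` are hypothetical names, and both the `"admit"` result for a match and the `"review"` fallback for an unlisted mismatch are assumptions rather than rules stated above.

```python
# (claim, oracle observation) -> route, taken from the three cases listed above.
ROUTES = {
    ("public API preserved", "declaration changed"): "review or update claim",
    ("behavior unchanged", "trace diverged"): "rerun with narrower patch",
    ("product choice is obvious", "no accepted decision"): "ask",
}


def next_route(claim, observation, agrees):
    """Turn a comparison into a route instead of a confidence debate."""
    if agrees:
        return "admit"  # assumption: a match needs no further work
    # Assumption: mismatches without a listed route fall back to review.
    return ROUTES.get((claim, observation), "review")
```

For example, `next_route("behavior unchanged", "trace diverged", agrees=False)` returns `"rerun with narrower patch"`, which is a task someone can pick up.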

Type Errors Become Tasks

A failed admission should produce a precise task, not a vague rejection.

[Diagram: CANDIDATE, CONTRACT, PROOF, ADMIT, STATE, ERROR RECORD, NEXT TASK]

A failed admission can become a narrower task with the missing contract made explicit.

Display Name Helper

Candidate: Add displayName helper
Claim: Safe public-type rebase
Expected: Symbol rename evidence, declaration proof, type gate
Received: Text patch, focused unit test
Route: Request stronger evidence

A failed record can still preserve progress by saying exactly which proof is missing.

That route preserves progress while keeping the boundary honest.
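A failed admission can be kept as a record that computes its own follow-up. The sketch below reuses the Display Name Helper values from above. `AdmissionError`, `missing`, and `next_task` are hypothetical names, and the wording of the generated task is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdmissionError:
    candidate: str
    claim: str
    expected: frozenset  # proofs the boundary requires
    received: frozenset  # proofs the candidate actually carried

    @property
    def missing(self):
        """Exactly which proofs were not supplied, in a stable order."""
        return sorted(self.expected - self.received)

    def next_task(self):
        """A narrower task that names the missing contract explicitly."""
        return (
            f"Request stronger evidence for '{self.candidate}': "
            + ", ".join(self.missing)
        )


error = AdmissionError(
    candidate="Add displayName helper",
    claim="Safe public-type rebase",
    expected=frozenset({"symbol rename evidence", "declaration proof", "type gate"}),
    received=frozenset({"text patch", "focused unit test"}),
)
task = error.next_task()
```

The received evidence stays on the record, so the text patch and the focused unit test are still available when the stronger proofs arrive.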

The Mental Model

Agents produce candidates.

Oracles produce trusted observations.

Admission decides whether the candidate's claim is proven enough for the boundary it wants to cross.

The system becomes more reliable when it stops asking whether a model sounds convincing and starts asking which oracle can prove the claim, which boundary the patch is crossing, and what route follows from the mismatch.
