root 4bb6548cbc proof harness Phase E: FINAL_REPORT.md answers the 9 mandated questions
Per docs/TEST_PROOF_SCOPE.md, this is the closing deliverable for the
proof harness: a single document that names what's proven, what's
partially proven, what failed, what was skipped and why, what evidence
exists for each, what bottlenecks were measured, what contract drift
was found, what refactor risks remain, and what to fix first.

Per-run report dirs (tests/proof/reports/proof-<ts>/) keep their
existing summary.md + summary.json + raw/ structure — they are the
replayable evidence chain. FINAL_REPORT.md is the stable, repo-tracked
synthesis pointing at them.

Headline findings (no surprises — harness behaves as designed):
  - 24 claims encoded; 22 fully proven, 1 informational (GOLAKE-085
    duplicate vector ID, contract not yet specified), 0 failed.
  - 4 contract-drift findings recorded as canonical: vectord add
    body field is `items` not `vectors`, search response is `results`
    not `hits`, index info is `length` not `count`, status codes
    201/204 not 200. All caught during Phase B; all now pinned by the
    harness.
  - Performance baseline shows queryd as the largest RSS (69 MiB,
    DuckDB process); single-sample noise floor is ~40% — tightening
    to multi-sample medians is a documented Sprint follow-up.
  - HIGH-risk audit findings (R-001 queryd /sql, R-002/R-003 untested
    shared+storeclient) are NOT closed by the harness — it's a
    multiplier, not a replacement for unit tests + auth posture.

The proof harness is complete. 11 cases · 3 modes · 168 assertions
peak across all tiers · ~22s total wall (contract+integration+perf).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 05:32:56 -05:00
..