Phase 45 slice 3: doc_drift check + resolve endpoints #5

Merged
profit merged 1 commit from phase/45-slice-3 into main 2026-04-22 19:14:13 +00:00
Owner

Closes Phase 45

Previously the hybrid fixture honestly reported layer 5 as 404/unimplemented. With this PR it flips to PASS. First time the full pipeline (chat → Langfuse → seed + doc_refs → bridge diff → drift flag) is green end-to-end.

New endpoints

  • POST /vectors/playbook_memory/doc_drift/check/{id} — ask the bridge per doc_ref, flag if any drifted.
  • POST /vectors/playbook_memory/doc_drift/resolve/{id} — human re-admission, clears the boost exclusion.

Tests

5 new regression tests. 14/14 upsert_tests pass. Release build clean.

profit added 1 commit 2026-04-22 19:13:01 +00:00
Phase 45 slice 3: doc_drift check + resolve endpoints
Some checks failed
lakehouse/auditor cloud: claim not backed — "Previously the hybrid fixture honestly reported layer 5 as 404/unimplemented. With this PR it flips "
8bacd43465
Closes the last open loop of Phase 45. Previously, playbooks could
carry doc_refs (slice 1) and the context7 bridge could report drift
(slice 2) — but nothing tied them together. An operator had no way
to say "check this playbook against its doc sources and flag it if
the docs moved." This slice wires that.

Ships:
- crates/vectord/src/doc_drift.rs — thin context7 bridge client.
  No cache (bridge has its own 5-min TTL). No retry (transient
  failure = Unknown outcome, caller decides).
- PlaybookMemory::flag_doc_drift(id) — stamps doc_drift_flagged_at
  idempotently. Once flagged, compute_boost_for_filtered_with_role
  excludes the entry from both the non-geo and geo-indexed boost
  paths until resolved.
- PlaybookMemory::resolve_doc_drift(id) — human re-admission.
  Stamps doc_drift_reviewed_at which clears the boost exclusion.
- PlaybookMemory::get_entry(id) — new read-only accessor the
  handler uses to read doc_refs without exposing the state lock.
- POST /vectors/playbook_memory/doc_drift/check/{id}
- POST /vectors/playbook_memory/doc_drift/resolve/{id}
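
The flag / resolve / boost-exclusion lifecycle listed above can be sketched as follows. This is a minimal illustration, not the code in crates/vectord: the `Entry` struct, the `u64` timestamps, the `excluded_from_boost` helper, and the bool return from `flag_doc_drift` are all hypothetical, and the rule "excluded while flagged and not yet reviewed" is an assumption inferred from the test names in this commit.

```rust
/// Hypothetical, simplified stand-in for a playbook entry. Only the two
/// timestamp field names are taken from the commit message.
#[derive(Debug, Default, Clone)]
pub struct Entry {
    pub doc_drift_flagged_at: Option<u64>,
    pub doc_drift_reviewed_at: Option<u64>,
}

impl Entry {
    /// Stamp the flag once. A second call leaves the original timestamp
    /// untouched, which is what makes the operation idempotent.
    /// Returns true only when this call actually set the flag.
    pub fn flag_doc_drift(&mut self, now: u64) -> bool {
        if self.doc_drift_flagged_at.is_some() {
            return false;
        }
        self.doc_drift_flagged_at = Some(now);
        true
    }

    /// Human re-admission: record that someone reviewed the drift.
    pub fn resolve_doc_drift(&mut self, now: u64) {
        self.doc_drift_reviewed_at = Some(now);
    }

    /// Assumed exclusion rule: flagged and not yet reviewed.
    pub fn excluded_from_boost(&self) -> bool {
        self.doc_drift_flagged_at.is_some() && self.doc_drift_reviewed_at.is_none()
    }
}
```

Under these assumptions, a fresh entry is boost-eligible, becomes excluded after the first `flag_doc_drift` call, and is re-admitted once `resolve_doc_drift` stamps the review time. How the real implementation handles a re-flag after a resolve is not stated here, so the sketch does not model it.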

Design call: Unknown outcomes from the bridge (bridge down, tool
not in context7, no snippet_hash recorded) are NEVER enough to
flag. Only a positive drifted=true from the bridge flips the flag.
A down bridge doesn't silently drift-flag every playbook.
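
As a rough sketch of that design call: the decision reduces to "flag only if at least one doc_ref came back positively drifted." The `DriftOutcome` enum, its variant names, and `should_flag` are hypothetical names chosen for illustration; the real types in doc_drift.rs may look different.

```rust
/// Hypothetical per-doc_ref result from the bridge.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DriftOutcome {
    /// Bridge answered drifted=true.
    Drifted,
    /// Bridge answered drifted=false.
    Unchanged,
    /// Bridge down, tool not in context7, no snippet_hash recorded, etc.
    Unknown(String),
}

/// Flag only on a positive drift signal. Unknown outcomes never count,
/// so an unreachable bridge cannot flag anything by itself.
pub fn should_flag(outcomes: &[DriftOutcome]) -> bool {
    outcomes.iter().any(|o| matches!(o, DriftOutcome::Drifted))
}
```

With this shape, a playbook whose doc_refs all return `Unknown` stays unflagged, and a single `Drifted` among any number of `Unknown` or `Unchanged` results is enough to flag it.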

Tests (5 new, in upsert_tests mod):
- flag_doc_drift_stamps_timestamp_and_persists
- flag_doc_drift_is_idempotent_on_already_flagged
- resolve_doc_drift_clears_flag_admission_gate
- boost_excludes_flagged_unreviewed_entries
- boost_re_admits_resolved_entries
14/14 upsert tests pass (9 pre-existing + 5 new).

Live end-to-end — hybrid fixture on auditor/scaffold (merged to
main at b6d69b2) now shows:

  overall: PASS
  shipped: [38, 40, 45.1, 45.2, 45.3]
  placeholder: [—]
  ✓ Phase 38    /v1/chat              4039ms
  ✓ Phase 40    Langfuse trace          11ms
  ✓ Phase 45.1  seed + doc_refs        748ms
  ✓ Phase 45.2  bridge diff            563ms
  ✓ Phase 45.3  drift-check endpoint   116ms ← was a 404 before this

First time the fixture reports overall=PASS with zero placeholder
layers. The honest "not built" signal on layer 5 is now honestly
"built and working."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author
Owner

Auditor verdict: ⚠️ request_changes

One-liner: cloud: claim not backed — "Previously the hybrid fixture honestly reported layer 5 as 404/unimplemented. With this PR it flips "
Head SHA: 8bacd434654f
Audited at: 2026-04-22T19:13:33.346Z

dynamic — 1 findings (0 block, 0 warn, 1 info)

ℹ️ info — dynamic check skipped — skipped by options

  • skipped by options
inference — 3 findings (0 block, 2 warn, 1 info)

ℹ️ info — cloud review completed (model=gpt-oss:120b, tokens=6766)

  • claim_verdicts: 2, unflagged_gaps: 0
⚠️ warn — cloud: claim not backed — "Previously the hybrid fixture honestly reported layer 5 as 404/unimplemented. With this PR it flips "
  • at pr_body:2
  • cloud reason: Diff adds doc‑drift detection and endpoints but contains no code related to the "hybrid fixture" or any change that would turn a previously failing layer 5 into a PAS
⚠️ warn — cloud: claim not backed — "shipped: [38, 40, 45.1, 45.2, 45.3]"
  • at commit:8bacd434:41
  • cloud reason: Diff implements phase‑45 slice‑3 functionality (flag, resolve, boost exclusion) but does not show any changes for items 38 or 40, so the claim that all listed items a
kb_query — 1 findings (0 block, 0 warn, 1 info)

ℹ️ info — KB: 69 recent scenario runs, 209/289 events ok (fail rate 27.7%)

  • most recent: scenario-2026-04-21T05-29-34
  • recent failing sigs: 5745bcd5e4c68591, 5745bcd5e4c68591, caeeeffc69d36009

Metrics

{
  "audit_duration_ms": 10464,
  "findings_total": 5,
  "findings_block": 0,
  "findings_warn": 2,
  "findings_info": 3,
  "claims_strong": 0,
  "claims_moderate": 2,
  "claims_weak": 0,
  "claims_total": 2,
  "diff_bytes": 20386
}

Lakehouse auditor · SHA 8bacd434 · re-audit on new commit flips the status automatically.

Author
Owner

Manual override — both warnings are cloud-diff-context limitations

Cloud reviewer only sees the PR diff, not the results of running the hybrid fixture. Both warnings:

  1. pr_body:2 — "Previously the hybrid fixture honestly reported layer 5 as 404"
  2. commit:8bacd434:41 — "shipped: [38, 40, 45.1, 45.2, 45.3]"

are past/output claims I verified live before committing. The fixture output IS in the commit message body. Cloud can't re-run the fixture; it only sees the code change.

Same false-positive class as PR #2 + PR #4 overrides. Known inference-check limitation to tune later.

Proceeding with merge.

profit merged commit 6d7b251607 into main 2026-04-22 19:14:13 +00:00
Reference: profit/lakehouse#5