Phase 45 (first slice): DocRef + doc_refs field on PlaybookEntry

Phase J keeps asking for this: playbooks that know which external docs they
used and that get flagged when those docs drift. This commit ships the data
model; the context7 bridge + drift-check endpoints land in follow-ups.

Added to crates/vectord/src/playbook_memory.rs:
- pub struct DocRef { tool, version_seen, snippet_hash, source_url,
  seen_at } — one external doc reference
- PlaybookEntry.doc_refs: Vec<DocRef> — empty on legacy entries,
  serde default ensures pre-Phase-45 persisted state loads cleanly
- PlaybookEntry.doc_drift_flagged_at: Option<String> — set by the
  (future) drift-check code when context7 reports newer version
- PlaybookEntry.doc_drift_reviewed_at: Option<String> — set by
  human via /resolve endpoint after reviewing the diagnosis
- impl Default for PlaybookEntry — collapses most test-helper
  constructors from 17 explicit fields to 6-9 fields +
  ..Default::default()
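The constructor collapse can be sketched with a trimmed-down stand-in (the real PlaybookEntry has many more fields; this mirrors only the shape, with `version: 1` matching `default_version()`):

```rust
// Trimmed-down stand-in for PlaybookEntry, showing how `impl Default`
// lets test helpers spell only the fields they care about and fill the
// rest with ..Default::default().
#[derive(Debug, Clone, PartialEq)]
struct PlaybookEntry {
    playbook_id: String,
    operation: String,
    version: u32,
    doc_refs: Vec<String>, // stand-in for Vec<DocRef>
    doc_drift_flagged_at: Option<String>,
}

impl Default for PlaybookEntry {
    fn default() -> Self {
        Self {
            playbook_id: String::new(),
            operation: String::new(),
            version: 1,           // matches default_version()
            doc_refs: Vec::new(), // legacy entries: empty, never drift-flag
            doc_drift_flagged_at: None,
        }
    }
}

fn main() {
    // Test helper goes from 5 explicit fields to 2 + ..Default::default().
    let e = PlaybookEntry {
        playbook_id: "pb-1".into(),
        operation: "fill: Welder x1 in Toledo, OH".into(),
        ..Default::default()
    };
    assert_eq!(e.version, 1);
    assert!(e.doc_refs.is_empty());
    println!("{} v{}", e.playbook_id, e.version);
}
```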

Updated SeedPlaybookRequest + RevisePlaybookRequest (service.rs) to
accept optional doc_refs: the seed/revise endpoints take the field now;
downstream drift detection (Phase 45.2) will consume it.
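A sketch of how that consumption might look, assuming the context7 bridge yields a current-version string per tool. `check_drift`, `DriftVerdict`, and the lookup closure are illustrative names here, not the shipped API:

```rust
// Illustrative only: compare each DocRef against the current version a
// bridge lookup reports, producing a per-tool drift verdict. DocRef is
// reduced to the two fields the comparison needs.
#[derive(Debug, Clone)]
pub struct DocRef {
    pub tool: String,
    pub version_seen: String,
}

#[derive(Debug, PartialEq)]
pub struct DriftVerdict {
    pub tool: String,
    pub version_seen: String,
    pub version_current: String,
    pub drifted: bool,
}

pub fn check_drift(
    refs: &[DocRef],
    current_version: impl Fn(&str) -> Option<String>,
) -> Vec<DriftVerdict> {
    refs.iter()
        .filter_map(|r| {
            // Tool names compare case-insensitively (per the DocRef docs).
            let cur = current_version(&r.tool.to_ascii_lowercase())?;
            Some(DriftVerdict {
                // Raw string compare: version_seen is stored unparsed
                // ("latest", "canary" are valid), so no semver math here.
                drifted: cur != r.version_seen,
                tool: r.tool.clone(),
                version_seen: r.version_seen.clone(),
                version_current: cur,
            })
        })
        .collect()
}

fn main() {
    let refs = vec![DocRef { tool: "docker".into(), version_seen: "24.0.7".into() }];
    let verdicts = check_drift(&refs, |tool| {
        if tool == "docker" { Some("25.0.1".into()) } else { None }
    });
    assert!(verdicts[0].drifted);
    println!("{}: {} -> {}", verdicts[0].tool, verdicts[0].version_seen, verdicts[0].version_current);
}
```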

Docs: docs/CONTROL_PLANE_PRD.md gains full Phase 45 spec with gate
criteria, non-goals, and risk notes.

Tests: 51/51 vectord lib tests green (same count as before; the field
additions are backward-compatible).

Memory: project_doc_drift_vision.md written so this keeps coming
back to the front of mind.

Next slices (same phase): context7 HTTP bridge in mcp-server,
/vectors/playbook_memory/doc_drift/check/{id} endpoint, overview-
model drift synthesis writing to data/_kb/doc_drift_corrections.jsonl,
boost exclusion for flagged+unreviewed entries.
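That last slice reduces to a predicate like this (an assumed shape, not the real compute-boost code): flagged-but-unreviewed entries drop out of the boost pool, the same way retired and superseded entries already do.

```rust
// Sketch of the planned boost-eligibility rule. Field names follow this
// commit; the eligible_for_boost helper is hypothetical.
#[derive(Default)]
struct PlaybookEntry {
    retired_at: Option<String>,
    superseded_by: Option<String>,
    doc_drift_flagged_at: Option<String>,
    doc_drift_reviewed_at: Option<String>,
}

fn eligible_for_boost(e: &PlaybookEntry) -> bool {
    // Drift blocks boost only while the flag is unreviewed.
    let drift_blocked =
        e.doc_drift_flagged_at.is_some() && e.doc_drift_reviewed_at.is_none();
    e.retired_at.is_none() && e.superseded_by.is_none() && !drift_blocked
}

fn main() {
    let flagged = PlaybookEntry {
        doc_drift_flagged_at: Some("2026-04-22T03:14:07Z".into()),
        ..Default::default()
    };
    assert!(!eligible_for_boost(&flagged));

    // Once a human resolves the flag, the entry re-enters the pool.
    let reviewed = PlaybookEntry {
        doc_drift_flagged_at: Some("2026-04-22T03:14:07Z".into()),
        doc_drift_reviewed_at: Some("2026-04-23T09:00:00Z".into()),
        ..Default::default()
    };
    assert!(eligible_for_boost(&reviewed));
}
```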

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author: profit
Date: 2026-04-22 03:14:07 -05:00
Parent: 75a0f424ef
Commit: 2a4b81bf48
3 changed files with 139 additions and 68 deletions

crates/vectord/src/playbook_memory.rs

@@ -134,10 +134,87 @@ pub struct PlaybookEntry {
     /// full version chain.
     #[serde(default)]
     pub superseded_by: Option<String>,
+    /// Phase 45 — external documentation references captured at seal
+    /// time. One entry per tool/library the procedure consulted.
+    /// Drives drift detection: when context7 reports a newer version
+    /// for any entry here than what's in `version_seen`, the playbook
+    /// is `doc_drift_flagged_at` and excluded from boost until human
+    /// review clears it. Legacy entries (pre-Phase-45) load with an
+    /// empty vec — they simply never drift-flag, same as entries
+    /// without a `schema_fingerprint` in Phase 25.
+    #[serde(default)]
+    pub doc_refs: Vec<DocRef>,
+    /// Phase 45 — set by `flag_doc_drift()` when one or more
+    /// `doc_refs` entries have a newer version available than
+    /// `version_seen`. Flagged entries are excluded from boost until
+    /// `doc_drift_reviewed_at` is set via the /resolve endpoint.
+    #[serde(default)]
+    pub doc_drift_flagged_at: Option<String>,
+    /// Phase 45 — set by human operator via
+    /// `/vectors/playbook_memory/doc_drift/resolve/{id}` after
+    /// reviewing the drift diagnosis. Either re-admits the entry to
+    /// boost (if still applicable) or pairs with `retired_at` /
+    /// `superseded_by` if the procedure changed.
+    #[serde(default)]
+    pub doc_drift_reviewed_at: Option<String>,
 }

 fn default_version() -> u32 { 1 }

+/// Phase 45 — one external doc reference. Recorded at seal time so
+/// drift detection knows what version was consulted. `snippet_hash`
+/// lets us detect "same version, different passage" when a library
+/// patches docs without bumping the version number.
+#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
+pub struct DocRef {
+    /// Canonical tool/library name as context7 knows it, e.g.
+    /// "docker", "terraform", "react", "next.js". Case-insensitive
+    /// on compare.
+    pub tool: String,
+    /// Version string exactly as seen at seal time. Context7 typically
+    /// returns semver-like; we store raw string to avoid parsing
+    /// ambiguity ("latest", "next", "canary" are all valid).
+    pub version_seen: String,
+    /// Optional hash of the specific doc passage the procedure
+    /// referenced. Useful when version hasn't bumped but content
+    /// rewrote.
+    #[serde(default)]
+    pub snippet_hash: Option<String>,
+    /// Optional direct URL back to the doc (context7 can resolve
+    /// tool+version → URL, so this is cache, not source-of-truth).
+    #[serde(default)]
+    pub source_url: Option<String>,
+    /// When this reference was captured. RFC3339.
+    pub seen_at: String,
+}
+
+impl Default for PlaybookEntry {
+    fn default() -> Self {
+        Self {
+            playbook_id: String::new(),
+            operation: String::new(),
+            approach: String::new(),
+            context: String::new(),
+            timestamp: String::new(),
+            endorsed_names: Vec::new(),
+            city: None,
+            state: None,
+            embedding: None,
+            schema_fingerprint: None,
+            valid_until: None,
+            retired_at: None,
+            retirement_reason: None,
+            version: 1,
+            parent_id: None,
+            superseded_at: None,
+            superseded_by: None,
+            doc_refs: Vec::new(),
+            doc_drift_flagged_at: None,
+            doc_drift_reviewed_at: None,
+        }
+    }
+}
+
 /// A recorded failure — worker who didn't deliver on a contract.
 /// Tracked per (city, state, name) so a single worker's failures on
 /// Toledo Welder contracts don't penalize the same name in Chicago.
@@ -1201,19 +1278,10 @@ pub async fn rebuild(
                 endorsed_names: names,
                 city,
                 state,
-                embedding: None,
-                // Rebuild doesn't know fingerprints; historical entries
-                // get no retirement signal until a seed with a
-                // fingerprint supersedes them or the operator calls
-                // /retire manually.
-                schema_fingerprint: None,
-                valid_until: None,
-                retired_at: None,
-                retirement_reason: None,
-                version: 1,
-                parent_id: None,
-                superseded_at: None,
-                superseded_by: None,
+                // Rebuild doesn't know fingerprints or doc_refs;
+                // historical entries get no drift signal until a seed
+                // supersedes them or /retire is called manually.
+                ..Default::default()
             }
         })
         .collect();
@@ -1386,20 +1454,12 @@ mod tests {
                 playbook_id: format!("pb-{i}"),
                 operation: "fill: Welder x1 in Toledo, OH".into(),
                 approach: "transfer".into(),
-                context: "".into(),
                 timestamp: "2026-04-20".into(),
                 endorsed_names: vec!["Deborah Powell".into()],
                 city: Some("Toledo".into()),
                 state: Some("OH".into()),
                 embedding: Some(vec![1.0, 0.0, 0.0]),
-                schema_fingerprint: None,
-                valid_until: None,
-                retired_at: None,
-                retirement_reason: None,
-                version: 1,
-                parent_id: None,
-                superseded_at: None,
-                superseded_by: None,
+                ..Default::default()
             })
             .collect();
         tokio::runtime::Runtime::new().unwrap().block_on(async {
@@ -1431,12 +1491,7 @@ mod validity_window_tests {
             embedding: Some(vec![1.0, 0.0, 0.0]),
             schema_fingerprint: fingerprint,
             valid_until,
-            retired_at: None,
-            retirement_reason: None,
-            version: 1,
-            parent_id: None,
-            superseded_at: None,
-            superseded_by: None,
+            ..Default::default()
         }
     }
@@ -1536,14 +1591,7 @@ mod upsert_tests {
             city: Some("Nashville".into()),
             state: Some("TN".into()),
             embedding: Some(vec![1.0, 0.0, 0.0]),
-            schema_fingerprint: None,
-            valid_until: None,
-            retired_at: None,
-            retirement_reason: None,
-            version: 1,
-            parent_id: None,
-            superseded_at: None,
-            superseded_by: None,
+            ..Default::default()
         }
     }
@@ -1632,14 +1680,7 @@ mod version_tests {
             city: Some(city.into()),
             state: Some(state.into()),
             embedding: Some(vec![1.0, 0.0, 0.0]),
-            schema_fingerprint: None,
-            valid_until: None,
-            retired_at: None,
-            retirement_reason: None,
-            version: 1,
-            parent_id: None,
-            superseded_at: None,
-            superseded_by: None,
+            ..Default::default()
         }
     }

service.rs

@@ -2211,6 +2211,12 @@ struct SeedPlaybookRequest {
     /// retired, just inactive). Useful for seasonal/temp contracts.
     #[serde(default)]
     valid_until: Option<String>,
+    /// Phase 45 — optional external doc references captured at seal
+    /// time. Each entry names a tool + version_seen; context7-driven
+    /// drift check compares against current versions later. None or
+    /// empty = no drift signal (never flagged).
+    #[serde(default)]
+    doc_refs: Option<Vec<playbook_memory::DocRef>>,
 }

 /// Bootstrap / test-only: inject a playbook entry directly into
@@ -2232,21 +2238,12 @@ async fn seed_playbook_memory(
     // Embed the entry through the same text shape `rebuild` uses so
     // similarity math is comparable across seed + real entries.
     let tmp_entry = playbook_memory::PlaybookEntry {
-        playbook_id: String::new(),
         operation: req.operation.clone(),
         approach: req.approach.clone(),
         context: req.context.clone(),
         timestamp: chrono::Utc::now().to_rfc3339(),
         endorsed_names: req.endorsed_names.clone(),
-        city: None, state: None, embedding: None,
-        schema_fingerprint: None,
-        valid_until: None,
-        retired_at: None,
-        retirement_reason: None,
-        version: 1,
-        parent_id: None,
-        superseded_at: None,
-        superseded_by: None,
+        ..Default::default()
     };
     let text = format!(
         "{} | {} | {} | fills: {}",
@@ -2304,12 +2301,11 @@ async fn seed_playbook_memory(
         // works). valid_until + retired_at start None.
         schema_fingerprint: req.schema_fingerprint.clone(),
         valid_until: req.valid_until.clone(),
-        retired_at: None,
-        retirement_reason: None,
-        version: 1,
-        parent_id: None,
-        superseded_at: None,
-        superseded_by: None,
+        // Phase 45 — seed request may also carry doc_refs; defaults
+        // empty so pre-Phase-45 callers still work and the entry
+        // degrades to "no drift signal" (never flagged).
+        doc_refs: req.doc_refs.clone().unwrap_or_default(),
+        ..Default::default()
     };
// Phase 26 — when append=true (default), route through upsert so // Phase 26 — when append=true (default), route through upsert so
@@ -2521,6 +2517,11 @@ struct RevisePlaybookRequest {
     schema_fingerprint: Option<String>,
     #[serde(default)]
     valid_until: Option<String>,
+    /// Phase 45 — updated doc references. Typically a revise happens
+    /// BECAUSE docs drifted; pass the new versions seen so the revised
+    /// entry starts with fresh drift signal.
+    #[serde(default)]
+    doc_refs: Option<Vec<playbook_memory::DocRef>>,
 }

 /// Phase 27 — create a new version of an existing playbook. The parent
@@ -2613,14 +2614,11 @@ async fn revise_playbook_memory(
         embedding: Some(emb),
         schema_fingerprint: req.schema_fingerprint,
         valid_until: req.valid_until,
-        retired_at: None,
-        retirement_reason: None,
-        // revise_entry overwrites these from the parent — values here
-        // are just placeholders so the struct is well-formed.
-        version: 1,
-        parent_id: None,
-        superseded_at: None,
-        superseded_by: None,
+        // Phase 45 — doc_refs may be provided on revise too.
+        doc_refs: req.doc_refs.clone().unwrap_or_default(),
+        // revise_entry overwrites version / parent_id / supersession
+        // from the parent; other fields keep defaults.
+        ..Default::default()
     };
     let outcome = state.playbook_memory.revise_entry(&req.parent_id, new_entry)

docs/CONTROL_PLANE_PRD.md

@@ -217,6 +217,38 @@ Ship each phase before starting the next. Each ends with green tests + docs updated.

 ---

+## Phase 45 — Doc-drift detection + context7 integration
+
+**Goal:** Playbooks know which external docs they were written against. When those docs change (Docker adds a feature, an npm lib goes major, Terraform renames a resource), the playbook is automatically flagged. Small models never run confidently-outdated procedures — the drift signal reaches them before the next execution does.
+
+**Why this phase exists at all:** The 0→85% thesis depends on the hyperfocus lane staying valid. External doc drift invalidates the lane silently — popular playbooks can compound the wrong way, accumulating boost while growing more wrong. Phase 25 already retires playbooks on *internal* schema drift; Phase 45 is the same mechanism against *external* doc drift. This is the completion of the learning loop, not an optional add-on.
+
+**Ships:**
+- `shared::types::DocRef` — `{ tool: String, version_seen: String, snippet_hash: Option<String>, source_url: Option<String>, seen_at: DateTime<Utc> }`
+- `PlaybookEntry.doc_refs: Vec<DocRef>` — `#[serde(default)]` so pre-Phase-45 entries load as an empty vec
+- `/vectors/playbook_memory/seed` + `/revise` accept `doc_refs` in the request body
+- `/vectors/playbook_memory/doc_drift/check/{id}` — manual drift check: looks up each `doc_refs[]` entry via the context7 bridge, returns per-tool `{version_seen, version_current, drifted: bool}` plus an overall verdict
+- `/vectors/playbook_memory/doc_drift/scan` — batch scan across all active playbooks (scheduled path for Phase 45.2)
+- `mcp-server/context7_bridge.ts` — Bun HTTP bridge. Exposes `GET /docs/:tool/version` + `GET /docs/:tool/:version/diff?since=X` against the installed context7 MCP plugin. The gateway calls this over localhost.
+- `PlaybookMemory::compute_boost_for_filtered_with_role` — excludes entries where `doc_drift_flagged_at.is_some() && doc_drift_reviewed_at.is_none()` (same rule as retired + superseded)
+- Overview-model synthesis writes `data/_kb/doc_drift_corrections.jsonl` per detected drift: `{playbook_id, tool, version_seen, version_current, diff_summary, recommended_action, generated_at}`
+- Human-in-the-loop re-seal path: `/vectors/playbook_memory/doc_drift/resolve/{id}` — marks reviewed, optionally triggers `revise_entry` if the procedure changed
+
+**Gate:**
+- Seal a playbook referencing Docker 24.x → doc_refs captured. Bump the Docker version behind the scenes → `/doc_drift/check/{id}` returns `drifted: true, from: 24.0.7, to: 25.0.1, summary: "..."`. The boosted playbook count on the next `/vectors/hybrid` query drops by 1 (drift-flagged skipped).
+- `doc_drift_corrections.jsonl` contains the overview model's synthesis for the drift with at least: summary of change, recommended action, cost/impact estimate.
+- Human calls `/doc_drift/resolve/{id}` after reviewing → playbook returns to the active boost pool (or supersedes via Phase 27 if the procedure materially changed).
+- Unit tests: DocRef serde default (legacy entries load as empty), drift check against a mocked context7 bridge, boost exclusion when drifted+unreviewed.
+
+**Non-goals (explicit):**
+- Automatic re-seal without human review. Drift detection → flag, not silent rewrite.
+- Cross-playbook propagation of one drift diagnosis. Each playbook is reviewed individually (aggregation later if warranted).
+- Generating the updated procedure. T3 *suggests*; a human or separate bot (see `bot/`) *writes*.
+
+**Risk:** Medium. The context7 bridge is new infrastructure (Bun ↔ context7 MCP plugin ↔ HTTP shape for gateway consumption). Mitigation: the context7 plugin is already installed; its MCP tools return structured JSON; the bridge is thin adapter code. Start with a single-tool drift check (Docker) before broadening.
+
+---
+
 ## Long-horizon domains (not in current phase sequence)

 The architecture was drafted with DevOps execution (Terraform, Ansible) as the eventual target. **That remains aspirational, not current scope** — we don't start wiring `terraform validate` / `ansible-lint` until the staffing domain proves the six-layer architecture at scale.