Phase 45 (first slice): DocRef + doc_refs field on PlaybookEntry

Phase J keeps asking for this: playbooks that know which external docs they
used and that get flagged when those docs drift. This commit ships the data
model; the context7 bridge + drift-check endpoints land in follow-ups.

Added to crates/vectord/src/playbook_memory.rs:
- pub struct DocRef { tool, version_seen, snippet_hash, source_url,
  seen_at } — one external doc reference
- PlaybookEntry.doc_refs: Vec<DocRef> — empty on legacy entries,
  serde default ensures pre-Phase-45 persisted state loads cleanly
- PlaybookEntry.doc_drift_flagged_at: Option<String> — set by the
  (future) drift-check code when context7 reports newer version
- PlaybookEntry.doc_drift_reviewed_at: Option<String> — set by
  human via /resolve endpoint after reviewing the diagnosis
- impl Default for PlaybookEntry — collapses most test-helper
  constructors from 17 explicit fields to 6-9 fields +
  ..Default::default()
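The constructor collapse can be sketched with a trimmed-down stand-in (the real PlaybookEntry has many more fields; this mirrors only the shape, with `version: 1` matching `default_version()`):

```rust
// Trimmed-down stand-in for PlaybookEntry, showing how `impl Default`
// lets test helpers spell only the fields they care about and fill the
// rest with ..Default::default().
#[derive(Debug, Clone, PartialEq)]
struct PlaybookEntry {
    playbook_id: String,
    operation: String,
    version: u32,
    doc_refs: Vec<String>, // stand-in for Vec<DocRef>
    doc_drift_flagged_at: Option<String>,
}

impl Default for PlaybookEntry {
    fn default() -> Self {
        Self {
            playbook_id: String::new(),
            operation: String::new(),
            version: 1,           // matches default_version()
            doc_refs: Vec::new(), // legacy entries: empty, never drift-flag
            doc_drift_flagged_at: None,
        }
    }
}

fn main() {
    // Test helper goes from 5 explicit fields to 2 + ..Default::default().
    let e = PlaybookEntry {
        playbook_id: "pb-1".into(),
        operation: "fill: Welder x1 in Toledo, OH".into(),
        ..Default::default()
    };
    assert_eq!(e.version, 1);
    assert!(e.doc_refs.is_empty());
    println!("{} v{}", e.playbook_id, e.version);
}
```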

Updated SeedPlaybookRequest + RevisePlaybookRequest (service.rs) to
accept optional doc_refs: the seed/revise endpoints take the field now;
downstream drift detection (Phase 45.2) will consume it.
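A sketch of how that consumption might look, assuming the context7 bridge yields a current-version string per tool. `check_drift`, `DriftVerdict`, and the lookup closure are illustrative names here, not the shipped API:

```rust
// Illustrative only: compare each DocRef against the current version a
// bridge lookup reports, producing a per-tool drift verdict. DocRef is
// reduced to the two fields the comparison needs.
#[derive(Debug, Clone)]
pub struct DocRef {
    pub tool: String,
    pub version_seen: String,
}

#[derive(Debug, PartialEq)]
pub struct DriftVerdict {
    pub tool: String,
    pub version_seen: String,
    pub version_current: String,
    pub drifted: bool,
}

pub fn check_drift(
    refs: &[DocRef],
    current_version: impl Fn(&str) -> Option<String>,
) -> Vec<DriftVerdict> {
    refs.iter()
        .filter_map(|r| {
            // Tool names compare case-insensitively (per the DocRef docs).
            let cur = current_version(&r.tool.to_ascii_lowercase())?;
            Some(DriftVerdict {
                // Raw string compare: version_seen is stored unparsed
                // ("latest", "canary" are valid), so no semver math here.
                drifted: cur != r.version_seen,
                tool: r.tool.clone(),
                version_seen: r.version_seen.clone(),
                version_current: cur,
            })
        })
        .collect()
}

fn main() {
    let refs = vec![DocRef { tool: "docker".into(), version_seen: "24.0.7".into() }];
    let verdicts = check_drift(&refs, |tool| {
        if tool == "docker" { Some("25.0.1".into()) } else { None }
    });
    assert!(verdicts[0].drifted);
    println!("{}: {} -> {}", verdicts[0].tool, verdicts[0].version_seen, verdicts[0].version_current);
}
```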

Docs: docs/CONTROL_PLANE_PRD.md gains full Phase 45 spec with gate
criteria, non-goals, and risk notes.

Tests: 51/51 vectord lib tests green (same count as before; the field
additions are backward-compatible).

Memory: project_doc_drift_vision.md written so this keeps coming
back to the front of mind.

Next slices (same phase): context7 HTTP bridge in mcp-server,
/vectors/playbook_memory/doc_drift/check/{id} endpoint, overview-
model drift synthesis writing to data/_kb/doc_drift_corrections.jsonl,
boost exclusion for flagged+unreviewed entries.
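That last slice reduces to a predicate like this (an assumed shape, not the real compute-boost code): flagged-but-unreviewed entries drop out of the boost pool, the same way retired and superseded entries already do.

```rust
// Sketch of the planned boost-eligibility rule. Field names follow this
// commit; the eligible_for_boost helper is hypothetical.
#[derive(Default)]
struct PlaybookEntry {
    retired_at: Option<String>,
    superseded_by: Option<String>,
    doc_drift_flagged_at: Option<String>,
    doc_drift_reviewed_at: Option<String>,
}

fn eligible_for_boost(e: &PlaybookEntry) -> bool {
    // Drift blocks boost only while the flag is unreviewed.
    let drift_blocked =
        e.doc_drift_flagged_at.is_some() && e.doc_drift_reviewed_at.is_none();
    e.retired_at.is_none() && e.superseded_by.is_none() && !drift_blocked
}

fn main() {
    let flagged = PlaybookEntry {
        doc_drift_flagged_at: Some("2026-04-22T03:14:07Z".into()),
        ..Default::default()
    };
    assert!(!eligible_for_boost(&flagged));

    // Once a human resolves the flag, the entry re-enters the pool.
    let reviewed = PlaybookEntry {
        doc_drift_flagged_at: Some("2026-04-22T03:14:07Z".into()),
        doc_drift_reviewed_at: Some("2026-04-23T09:00:00Z".into()),
        ..Default::default()
    };
    assert!(eligible_for_boost(&reviewed));
}
```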

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author: profit
Date: 2026-04-22 03:14:07 -05:00
Parent: 75a0f424ef
Commit: 2a4b81bf48
3 changed files with 139 additions and 68 deletions

crates/vectord/src/playbook_memory.rs

@@ -134,10 +134,87 @@ pub struct PlaybookEntry {
     /// full version chain.
     #[serde(default)]
     pub superseded_by: Option<String>,
+    /// Phase 45 — external documentation references captured at seal
+    /// time. One entry per tool/library the procedure consulted.
+    /// Drives drift detection: when context7 reports a newer version
+    /// for any entry here than what's in `version_seen`, the playbook
+    /// is `doc_drift_flagged_at` and excluded from boost until human
+    /// review clears it. Legacy entries (pre-Phase-45) load with an
+    /// empty vec — they simply never drift-flag, same as entries
+    /// without a `schema_fingerprint` in Phase 25.
+    #[serde(default)]
+    pub doc_refs: Vec<DocRef>,
+    /// Phase 45 — set by `flag_doc_drift()` when one or more
+    /// `doc_refs` entries have a newer version available than
+    /// `version_seen`. Flagged entries are excluded from boost until
+    /// `doc_drift_reviewed_at` is set via the /resolve endpoint.
+    #[serde(default)]
+    pub doc_drift_flagged_at: Option<String>,
+    /// Phase 45 — set by human operator via
+    /// `/vectors/playbook_memory/doc_drift/resolve/{id}` after
+    /// reviewing the drift diagnosis. Either re-admits the entry to
+    /// boost (if still applicable) or pairs with `retired_at` /
+    /// `superseded_by` if the procedure changed.
+    #[serde(default)]
+    pub doc_drift_reviewed_at: Option<String>,
 }

 fn default_version() -> u32 { 1 }

+/// Phase 45 — one external doc reference. Recorded at seal time so
+/// drift detection knows what version was consulted. `snippet_hash`
+/// lets us detect "same version, different passage" when a library
+/// patches docs without bumping the version number.
+#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
+pub struct DocRef {
+    /// Canonical tool/library name as context7 knows it, e.g.
+    /// "docker", "terraform", "react", "next.js". Case-insensitive
+    /// on compare.
+    pub tool: String,
+    /// Version string exactly as seen at seal time. Context7 typically
+    /// returns semver-like; we store raw string to avoid parsing
+    /// ambiguity ("latest", "next", "canary" are all valid).
+    pub version_seen: String,
+    /// Optional hash of the specific doc passage the procedure
+    /// referenced. Useful when version hasn't bumped but content
+    /// rewrote.
+    #[serde(default)]
+    pub snippet_hash: Option<String>,
+    /// Optional direct URL back to the doc (context7 can resolve
+    /// tool+version → URL, so this is cache, not source-of-truth).
+    #[serde(default)]
+    pub source_url: Option<String>,
+    /// When this reference was captured. RFC3339.
+    pub seen_at: String,
+}
+
+impl Default for PlaybookEntry {
+    fn default() -> Self {
+        Self {
+            playbook_id: String::new(),
+            operation: String::new(),
+            approach: String::new(),
+            context: String::new(),
+            timestamp: String::new(),
+            endorsed_names: Vec::new(),
+            city: None,
+            state: None,
+            embedding: None,
+            schema_fingerprint: None,
+            valid_until: None,
+            retired_at: None,
+            retirement_reason: None,
+            version: 1,
+            parent_id: None,
+            superseded_at: None,
+            superseded_by: None,
+            doc_refs: Vec::new(),
+            doc_drift_flagged_at: None,
+            doc_drift_reviewed_at: None,
+        }
+    }
+}
+
 /// A recorded failure — worker who didn't deliver on a contract.
 /// Tracked per (city, state, name) so a single worker's failures on
 /// Toledo Welder contracts don't penalize the same name in Chicago.
@@ -1201,19 +1278,10 @@ pub async fn rebuild(
                 endorsed_names: names,
                 city,
                 state,
-                embedding: None,
-                // Rebuild doesn't know fingerprints; historical entries
-                // get no retirement signal until a seed with a
-                // fingerprint supersedes them or the operator calls
-                // /retire manually.
-                schema_fingerprint: None,
-                valid_until: None,
-                retired_at: None,
-                retirement_reason: None,
-                version: 1,
-                parent_id: None,
-                superseded_at: None,
-                superseded_by: None,
+                // Rebuild doesn't know fingerprints or doc_refs;
+                // historical entries get no drift signal until a seed
+                // supersedes them or /retire is called manually.
+                ..Default::default()
             }
         })
         .collect();
@@ -1386,20 +1454,12 @@ mod tests {
                 playbook_id: format!("pb-{i}"),
                 operation: "fill: Welder x1 in Toledo, OH".into(),
                 approach: "transfer".into(),
-                context: "".into(),
                 timestamp: "2026-04-20".into(),
                 endorsed_names: vec!["Deborah Powell".into()],
                 city: Some("Toledo".into()),
                 state: Some("OH".into()),
                 embedding: Some(vec![1.0, 0.0, 0.0]),
-                schema_fingerprint: None,
-                valid_until: None,
-                retired_at: None,
-                retirement_reason: None,
-                version: 1,
-                parent_id: None,
-                superseded_at: None,
-                superseded_by: None,
+                ..Default::default()
             })
             .collect();
         tokio::runtime::Runtime::new().unwrap().block_on(async {
@@ -1431,12 +1491,7 @@ mod validity_window_tests {
             embedding: Some(vec![1.0, 0.0, 0.0]),
             schema_fingerprint: fingerprint,
             valid_until,
-            retired_at: None,
-            retirement_reason: None,
-            version: 1,
-            parent_id: None,
-            superseded_at: None,
-            superseded_by: None,
+            ..Default::default()
         }
     }
@@ -1536,14 +1591,7 @@ mod upsert_tests {
             city: Some("Nashville".into()),
             state: Some("TN".into()),
             embedding: Some(vec![1.0, 0.0, 0.0]),
-            schema_fingerprint: None,
-            valid_until: None,
-            retired_at: None,
-            retirement_reason: None,
-            version: 1,
-            parent_id: None,
-            superseded_at: None,
-            superseded_by: None,
+            ..Default::default()
         }
     }
@@ -1632,14 +1680,7 @@ mod version_tests {
             city: Some(city.into()),
             state: Some(state.into()),
             embedding: Some(vec![1.0, 0.0, 0.0]),
-            schema_fingerprint: None,
-            valid_until: None,
-            retired_at: None,
-            retirement_reason: None,
-            version: 1,
-            parent_id: None,
-            superseded_at: None,
-            superseded_by: None,
+            ..Default::default()
         }
     }

service.rs

@@ -2211,6 +2211,12 @@ struct SeedPlaybookRequest {
     /// retired, just inactive). Useful for seasonal/temp contracts.
     #[serde(default)]
     valid_until: Option<String>,
+    /// Phase 45 — optional external doc references captured at seal
+    /// time. Each entry names a tool + version_seen; context7-driven
+    /// drift check compares against current versions later. None or
+    /// empty = no drift signal (never flagged).
+    #[serde(default)]
+    doc_refs: Option<Vec<playbook_memory::DocRef>>,
 }

 /// Bootstrap / test-only: inject a playbook entry directly into
@@ -2232,21 +2238,12 @@ async fn seed_playbook_memory(
     // Embed the entry through the same text shape `rebuild` uses so
     // similarity math is comparable across seed + real entries.
     let tmp_entry = playbook_memory::PlaybookEntry {
-        playbook_id: String::new(),
         operation: req.operation.clone(),
         approach: req.approach.clone(),
         context: req.context.clone(),
         timestamp: chrono::Utc::now().to_rfc3339(),
         endorsed_names: req.endorsed_names.clone(),
-        city: None, state: None, embedding: None,
-        schema_fingerprint: None,
-        valid_until: None,
-        retired_at: None,
-        retirement_reason: None,
-        version: 1,
-        parent_id: None,
-        superseded_at: None,
-        superseded_by: None,
+        ..Default::default()
     };
     let text = format!(
         "{} | {} | {} | fills: {}",
@@ -2304,12 +2301,11 @@ async fn seed_playbook_memory(
         // works). valid_until + retired_at start None.
         schema_fingerprint: req.schema_fingerprint.clone(),
         valid_until: req.valid_until.clone(),
-        retired_at: None,
-        retirement_reason: None,
-        version: 1,
-        parent_id: None,
-        superseded_at: None,
-        superseded_by: None,
+        // Phase 45 — seed request may also carry doc_refs; defaults
+        // empty so pre-Phase-45 callers still work and the entry
+        // degrades to "no drift signal" (never flagged).
+        doc_refs: req.doc_refs.clone().unwrap_or_default(),
+        ..Default::default()
     };
// Phase 26 — when append=true (default), route through upsert so // Phase 26 — when append=true (default), route through upsert so
@@ -2521,6 +2517,11 @@ struct RevisePlaybookRequest {
     schema_fingerprint: Option<String>,
     #[serde(default)]
     valid_until: Option<String>,
+    /// Phase 45 — updated doc references. Typically a revise happens
+    /// BECAUSE docs drifted; pass the new versions seen so the revised
+    /// entry starts with fresh drift signal.
+    #[serde(default)]
+    doc_refs: Option<Vec<playbook_memory::DocRef>>,
 }

 /// Phase 27 — create a new version of an existing playbook. The parent
@@ -2613,14 +2614,11 @@ async fn revise_playbook_memory(
         embedding: Some(emb),
         schema_fingerprint: req.schema_fingerprint,
         valid_until: req.valid_until,
-        retired_at: None,
-        retirement_reason: None,
-        // revise_entry overwrites these from the parent — values here
-        // are just placeholders so the struct is well-formed.
-        version: 1,
-        parent_id: None,
-        superseded_at: None,
-        superseded_by: None,
+        // Phase 45 — doc_refs may be provided on revise too.
+        doc_refs: req.doc_refs.clone().unwrap_or_default(),
+        // revise_entry overwrites version / parent_id / supersession
+        // from the parent; other fields keep defaults.
+        ..Default::default()
     };
     let outcome = state.playbook_memory.revise_entry(&req.parent_id, new_entry)

docs/CONTROL_PLANE_PRD.md

@@ -217,6 +217,38 @@ Ship each phase before starting the next. Each ends with green tests + docs updated.

 ---

+## Phase 45 — Doc-drift detection + context7 integration
+
+**Goal:** Playbooks know which external docs they were written against. When those docs change (Docker adds a feature, an npm lib goes major, Terraform renames a resource), the playbook is automatically flagged. Small models never run confidently-outdated procedures — the drift signal reaches them before the next execution does.
+
+**Why this phase exists at all:** The 0→85% thesis depends on the hyperfocus lane staying valid. External doc drift invalidates the lane silently — popular playbooks can compound the wrong way, accumulating boost while growing more wrong. Phase 25 already retires playbooks on *internal* schema drift; Phase 45 is the same mechanism against *external* doc drift. This is the completion of the learning loop, not an optional add-on.
+
+**Ships:**
+- `shared::types::DocRef` — `{ tool: String, version_seen: String, snippet_hash: Option<String>, source_url: Option<String>, seen_at: DateTime<Utc> }`
+- `PlaybookEntry.doc_refs: Vec<DocRef>` — `#[serde(default)]` so pre-Phase-45 entries load as an empty vec
+- `/vectors/playbook_memory/seed` + `/revise` accept `doc_refs` in the request body
+- `/vectors/playbook_memory/doc_drift/check/{id}` — manual drift check: looks up each `doc_refs[]` entry via the context7 bridge, returns per-tool `{version_seen, version_current, drifted: bool}` plus an overall verdict
+- `/vectors/playbook_memory/doc_drift/scan` — batch scan across all active playbooks (scheduled path for Phase 45.2)
+- `mcp-server/context7_bridge.ts` — Bun HTTP bridge. Exposes `GET /docs/:tool/version` + `GET /docs/:tool/:version/diff?since=X` against the installed context7 MCP plugin. The gateway calls this over localhost.
+- `PlaybookMemory::compute_boost_for_filtered_with_role` — excludes entries where `doc_drift_flagged_at.is_some() && doc_drift_reviewed_at.is_none()` (same rule as retired + superseded)
+- Overview-model synthesis writes `data/_kb/doc_drift_corrections.jsonl` per detected drift: `{playbook_id, tool, version_seen, version_current, diff_summary, recommended_action, generated_at}`
+- Human-in-the-loop re-seal path: `/vectors/playbook_memory/doc_drift/resolve/{id}` — marks reviewed, optionally triggers `revise_entry` if the procedure changed
+
+**Gate:**
+- Seal a playbook referencing Docker 24.x → doc_refs captured. Bump the Docker version behind the scenes → `/doc_drift/check/{id}` returns `drifted: true, from: 24.0.7, to: 25.0.1, summary: "..."`. The boosted playbook count on the next `/vectors/hybrid` query drops by 1 (drift-flagged skipped).
+- `doc_drift_corrections.jsonl` contains the overview model's synthesis for the drift with at least: summary of change, recommended action, cost/impact estimate.
+- Human calls `/doc_drift/resolve/{id}` after reviewing → playbook returns to the active boost pool (or supersedes via Phase 27 if the procedure materially changed).
+- Unit tests: DocRef serde default (legacy entries load as empty), drift check against a mocked context7 bridge, boost exclusion when drifted+unreviewed.
+
+**Non-goals (explicit):**
+- Automatic re-seal without human review. Drift detection → flag, not silent rewrite.
+- Cross-playbook propagation of one drift diagnosis. Each playbook is reviewed individually (aggregation later if warranted).
+- Generating the updated procedure. T3 *suggests*; a human or separate bot (see `bot/`) *writes*.
+
+**Risk:** Medium. The context7 bridge is new infrastructure (Bun ↔ context7 MCP plugin ↔ HTTP shape for gateway consumption). Mitigation: the context7 plugin is already installed; its MCP tools return structured JSON; the bridge is thin adapter code. Start with a single-tool drift check (Docker) before broadening.
+
+---
+
 ## Long-horizon domains (not in current phase sequence)

 The architecture was drafted with DevOps execution (Terraform, Ansible) as the eventual target. **That remains aspirational, not current scope** — we don't start wiring `terraform validate` / `ansible-lint` until the staffing domain proves the six-layer architecture at scale.