14 Commits

Author SHA1 Message Date
root
cdc24d8bd0 shared: build ModelMatrix — migrate 5 call sites off deprecated estimate_tokens
Some checks failed
lakehouse/auditor 1 blocking issue: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
The `aibridge::context::estimate_tokens` deprecation has been pointing
at `shared::model_matrix::ModelMatrix::estimate_tokens` for a while,
but that module didn't exist — so the deprecation was aspirational
noise, not actionable guidance.

Built the minimal target: `shared::model_matrix::ModelMatrix` with
an associated `estimate_tokens(text: &str) -> usize` method. Same
chars/4 ceiling heuristic as the deprecated helper. 6 tests cover
empty/3/4/5-char cases, multi-byte UTF-8 (emoji count as 1 char each),
and linear scaling to 400-char inputs.
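
For reference, a minimal sketch of that heuristic (struct and method names
are taken from this message; the body is an assumption, not the exact source):

  pub struct ModelMatrix;

  impl ModelMatrix {
      /// Rough token estimate: one token per 4 chars, rounded up.
      /// chars() counts Unicode scalar values, so an emoji is 1 char.
      pub fn estimate_tokens(text: &str) -> usize {
          text.chars().count().div_ceil(4)
      }
  }

  // Consistent with the tested cases: "" -> 0, "abc" -> 1, "abcd" -> 1, "abcde" -> 2.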

Migrated 5 call sites:
  - aibridge/context.rs:88 — opts.system token count
  - aibridge/context.rs:89 — prompt token count
  - aibridge/tree_split.rs:22 — import (now uses ModelMatrix)
  - aibridge/tree_split.rs:84, 89 — truncate_scratchpad budget loop
  - aibridge/tree_split.rs:282 — scratchpad post-truncation assertion
  - aibridge/context.rs:183 — system-prompt budget test

Also cleaned up two parallel test warnings:
  - aibridge/context.rs legacy estimate_tokens_ceiling_divides_by_four
    test deleted (ModelMatrix's tests cover the same behavior now).
  - vectord/playbook_memory.rs:1650 unused_mut on e_alive.

Net workspace warning count: 11 → 0 (including --tests build).

The deprecated `estimate_tokens` wrapper stays in aibridge/context.rs
for external callers. Future commits can remove it entirely once no
public API surface still references it.

The applier's warning-count gate now has a floor of 0 — any future
patch that introduces a single warning trips the gate automatically.
Previously a floor of 11 tolerated noise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:32:16 -05:00
profit
8bacd43465 Phase 45 slice 3: doc_drift check + resolve endpoints
Some checks failed
lakehouse/auditor cloud: claim not backed — "Previously the hybrid fixture honestly reported layer 5 as 404/unimplemented. With this PR it flips "
Closes the last open loop of Phase 45. Previously, playbooks could
carry doc_refs (slice 1) and the context7 bridge could report drift
(slice 2) — but nothing tied them together. An operator had no way
to say "check this playbook against its doc sources and flag it if
the docs moved." This slice wires that.

Ships:
- crates/vectord/src/doc_drift.rs — thin context7 bridge client.
  No cache (bridge has its own 5-min TTL). No retry (transient
  failure = Unknown outcome, caller decides).
- PlaybookMemory::flag_doc_drift(id) — stamps doc_drift_flagged_at
  idempotently. Once flagged, compute_boost_for_filtered_with_role
  excludes the entry from both the non-geo and geo-indexed boost
  paths until resolved.
- PlaybookMemory::resolve_doc_drift(id) — human re-admission.
  Stamps doc_drift_reviewed_at which clears the boost exclusion.
- PlaybookMemory::get_entry(id) — new read-only accessor the
  handler uses to read doc_refs without exposing the state lock.
- POST /vectors/playbook_memory/doc_drift/check/{id}
- POST /vectors/playbook_memory/doc_drift/resolve/{id}

Design call: Unknown outcomes from the bridge (bridge down, tool
not in context7, no snippet_hash recorded) are NEVER enough to
flag. Only a positive drifted=true from the bridge flips the flag.
A down bridge doesn't silently drift-flag every playbook.
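
Roughly, the decision reduces to something like this (names are illustrative,
not the crate's actual types):

  enum DriftOutcome {
      Drifted { tool: String, newer_version: String },
      Clean,
      Unknown { reason: String }, // bridge down, tool not indexed, no snippet_hash
  }

  fn should_flag(outcome: &DriftOutcome) -> bool {
      // Only a positive drift signal flips the flag; Unknown never does.
      matches!(outcome, DriftOutcome::Drifted { .. })
  }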

Tests (5 new, in upsert_tests mod):
- flag_doc_drift_stamps_timestamp_and_persists
- flag_doc_drift_is_idempotent_on_already_flagged
- resolve_doc_drift_clears_flag_admission_gate
- boost_excludes_flagged_unreviewed_entries
- boost_re_admits_resolved_entries
14/14 upsert tests pass (9 pre-existing + 5 new).

Live end-to-end — hybrid fixture on auditor/scaffold (merged to
main at b6d69b2) now shows:

  overall: PASS
  shipped: [38, 40, 45.1, 45.2, 45.3]
  placeholder: [—]
  ✓ Phase 38    /v1/chat              4039ms
  ✓ Phase 40    Langfuse trace          11ms
  ✓ Phase 45.1  seed + doc_refs        748ms
  ✓ Phase 45.2  bridge diff            563ms
  ✓ Phase 45.3  drift-check endpoint   116ms ← was a 404 before this

First time the fixture reports overall=PASS with zero placeholder
layers. The honest "not built" signal on layer 5 is now honestly
"built and working."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:12:57 -05:00
profit
1270e167fe Post-merge: update test pattern matches for struct-like UpsertOutcome
After merging main (with the UpsertOutcome struct-like enum shape
from PR #2), the 4 new upsert tests needed pattern-match updates:
  UpsertOutcome::Added(_) → UpsertOutcome::Added { .. }

9/9 upsert tests pass.
2026-04-22 04:11:13 -05:00
4dca2a6705 Merge branch 'main' of https://git.agentview.dev/profit/lakehouse into fix/upsert-outcome-update-merge 2026-04-22 04:10:27 -05:00
profit
320009ddf4 Fix: UPDATE branch of upsert_entry dropped doc_refs + valid_until
All checks were successful
lakehouse/auditor all checks passed (3 findings, all info)
The auditor's hybrid fixture (branch auditor/scaffold) surfaced this
on 2026-04-22. A re-seed of the same (operation, day) pair with new
endorsed_names merged the names but silently discarded the incoming
doc_refs and valid_until fields. schema_fingerprint was partially
handled (set-if-Some) but doc_refs and valid_until weren't touched.

Root cause: the UPDATE arm of upsert_entry at playbook_memory.rs:609
only covered:
  - endorsed_names (union-merge)
  - timestamp
  - embedding (if Some)
  - schema_fingerprint (if Some)

Fix (merge sketch below):
  - valid_until — refresh if caller provides one
  - doc_refs — merge by tool (case-insensitive). Same-tool new entry
    supersedes older one; different-tool refs are appended. Empty
    incoming doc_refs preserves existing (don't wipe on partial seed).
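
Merge sketch (assumes a DocRef with at least a tool field, per the Phase 45
slice-1 shape; the helper itself is illustrative, not the real UPDATE arm):

  #[derive(Clone)]
  struct DocRef { tool: String /* plus version_seen, snippet_hash, source_url, seen_at */ }

  fn merge_doc_refs(existing: &mut Vec<DocRef>, incoming: Vec<DocRef>) {
      if incoming.is_empty() {
          return; // partial seed: keep what's already recorded
      }
      for new_ref in incoming {
          match existing.iter_mut().find(|r| r.tool.eq_ignore_ascii_case(&new_ref.tool)) {
              Some(slot) => *slot = new_ref,  // same tool: newer ref supersedes
              None => existing.push(new_ref), // different tool: append
          }
      }
  }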

4 new regression tests under upsert_tests:
  - update_merges_doc_refs_with_existing_ones
  - update_same_tool_supersedes_older_version
  - update_preserves_existing_doc_refs_when_new_entry_has_none
  - update_refreshes_valid_until_when_caller_provides_one

Test result: 9/9 upsert tests pass (4 new + 5 pre-existing).

Branch basis note: this branch is off main, so the UpsertOutcome enum
here still has the newtype variants Added(String) / Noop(String). PR
#2 (fix/upsert-outcome-serde) changes that enum to struct-like. When
PR #2 merges first this branch needs a trivial rebase; the UPDATE
arm logic is untouched by that change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 04:06:54 -05:00
profit
f0a3ed6832 Fix: UpsertOutcome newtype variants panicked serde from Phase 26
Some checks failed
lakehouse/auditor 1 blocking issue: cloud: claim not backed — "Verified live after gateway restart:"
playbook_memory.rs:257 — UpsertOutcome had two newtype variants
carrying a bare String:
  Added(String)
  Noop(String)
under #[serde(tag = "mode")]. serde cannot tag newtype variants of
primitive types, so every serialization threw:
  "cannot serialize tagged newtype variant UpsertOutcome::Added
   containing a string"
This caused gateway /vectors/playbook_memory/seed to panic the
tokio worker on EVERY call that reached Added or Noop, returning
an empty socket close to the client. The bug was silent from commit
640db8c (Phase 26, 2026-04-21) until 2026-04-22 when the auditor's
hybrid fixture (auditor/fixtures/hybrid_38_40_45.ts on the
auditor/scaffold branch) exercised the endpoint live and gateway
logs showed the panic.

Fix — convert both newtype variants to struct-like:
  Added { playbook_id: String }
  Noop { playbook_id: String }
Updated all 7 construction + pattern-match sites. Updated rustdoc
on the enum explaining why the shape is what it is.

JSON wire format is now uniform across all three variants:
  {"mode":"added","playbook_id":"pb-..."}
  {"mode":"updated","playbook_id":"pb-...","merged_names":[...]}
  {"mode":"noop","playbook_id":"pb-..."}

Verified live after gateway restart:
  curl /seed new payload               → mode=added, playbook 860231f5
  curl /seed new payload + doc_refs    → mode=added, playbook 11d348d9
  curl /seed identical re-submit       → mode=noop,  same id 860231f5,
                                         entries_after unchanged (Mem0
                                         contract intact)

Tests: 51/51 vectord lib tests green. Release build clean.

This is a follow-up bug fix landed in its own branch
(fix/upsert-outcome-serde) rather than commingled with other work.
The auditor's hybrid fixture on the auditor/scaffold branch will
now light up layer 3 (phase45_seed_with_doc_refs) as a pass once
this merges — previously it failed here with an empty socket close.
2026-04-22 03:48:05 -05:00
profit
2a4b81bf48 Phase 45 (first slice): DocRef + doc_refs field on PlaybookEntry
Phase J keeps asking for the same thing: playbooks that know which
external docs they used and get flagged when those docs drift. This
commit ships the data model; the context7 bridge + drift-check
endpoints land in follow-ups.

Added to crates/vectord/src/playbook_memory.rs (sketched below):
- pub struct DocRef { tool, version_seen, snippet_hash, source_url,
  seen_at } — one external doc reference
- PlaybookEntry.doc_refs: Vec<DocRef> — empty on legacy entries,
  serde default ensures pre-Phase-45 persisted state loads cleanly
- PlaybookEntry.doc_drift_flagged_at: Option<String> — set by the
  (future) drift-check code when context7 reports newer version
- PlaybookEntry.doc_drift_reviewed_at: Option<String> — set by
  human via /resolve endpoint after reviewing the diagnosis
- impl Default for PlaybookEntry — collapses most test-helper
  constructors from 17 explicit fields to 6-9 fields +
  ..Default::default()
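
Sketch of the shapes (field list from above; the String types and derives
are assumptions):

  use serde::{Deserialize, Serialize};

  #[derive(Clone, Default, Serialize, Deserialize)]
  pub struct DocRef {
      pub tool: String,
      pub version_seen: String,
      pub snippet_hash: String,
      pub source_url: String,
      pub seen_at: String,
  }

  // On PlaybookEntry, the new fields default so pre-Phase-45 state loads:
  //   #[serde(default)] pub doc_refs: Vec<DocRef>,
  //   #[serde(default)] pub doc_drift_flagged_at: Option<String>,
  //   #[serde(default)] pub doc_drift_reviewed_at: Option<String>,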

Updated SeedPlaybookRequest + RevisePlaybookRequest (service.rs) to
accept optional doc_refs: the seed/revise endpoints carry the field
as of this commit; downstream drift detection (Phase 45.2) will
consume it.

Docs: docs/CONTROL_PLANE_PRD.md gains full Phase 45 spec with gate
criteria, non-goals, and risk notes.

Tests: 51/51 vectord lib tests green (same count as before, field
additions are backward-compat).

Memory: project_doc_drift_vision.md written so this keeps coming
back to the front of mind.

Next slices (same phase): context7 HTTP bridge in mcp-server,
/vectors/playbook_memory/doc_drift/check/{id} endpoint, overview-
model drift synthesis writing to data/_kb/doc_drift_corrections.jsonl,
boost exclusion for flagged+unreviewed entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 03:14:07 -05:00
profit
a6f12e2609 Phase 21 Rust port + Phase 27 playbook versioning + doc-sync
Phase 21 — Rust port of scratchpad + tree-split primitives (companion to
the 2026-04-21 TS shipment). New crates/aibridge modules:

  context.rs       — estimate_tokens (chars/4 ceil), context_window_for,
                     assert_context_budget returning a BudgetCheck with
                     numeric diagnostics on both success and overflow.
                     Windows table mirrors config/models.json.
  continuation.rs  — generate_continuable<G: TextGenerator>. Handles the
                     two failure modes: empty-response from thinking
                     models (geometric 2x budget backoff up to budget_cap)
                     and truncated-non-empty (continuation with partial
                     as scratchpad). is_structurally_complete balances
                     braces then JSON.parse-checks. Guards the degenerate case
                     "all retries empty, don't loop on empty partial".
  tree_split.rs    — generate_tree_split map->reduce with running
                     scratchpad. Per-shard + reduce-prompt go through
                     assert_context_budget first; loud-fails rather than
                     silently truncating. Oldest-digest-first scratchpad
                     truncation at scratchpad_budget (default 6000 tokens).
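
A hypothetical sketch of the continuation.rs completeness check (cheap brace
balancing first, then a real JSON parse; not the crate's exact code):

  fn is_structurally_complete(text: &str) -> bool {
      let mut depth: i64 = 0;
      for c in text.chars() {
          match c {
              '{' | '[' => depth += 1,
              '}' | ']' => depth -= 1,
              _ => {}
          }
          if depth < 0 {
              return false; // closed more than was opened
          }
      }
      // Balanced braces alone aren't enough; parse to confirm.
      depth == 0 && serde_json::from_str::<serde_json::Value>(text).is_ok()
  }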

TextGenerator trait (native async-fn-in-trait, edition 2024). AiClient
implements it; ScriptedGenerator test double lets tests inject canned
sequences without a live Ollama.
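
Rough shape of that seam (signatures are assumptions, not the crate's API):

  pub trait TextGenerator {
      async fn generate(&self, prompt: &str, budget_tokens: usize) -> anyhow::Result<String>;
  }

  // AiClient implements it against the live sidecar; ScriptedGenerator pops
  // canned responses from a queue so tests never touch Ollama.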

GenerateRequest gained think: Option<bool> — forwards to sidecar for
per-call hidden-reasoning opt-out on hot-path JSON emitters. Three
existing call sites updated (rag.rs x2, service.rs hybrid answer).

Phase 27 — Playbook versioning. PlaybookEntry gained four optional
fields (all #[serde(default)] so pre-Phase-27 state loads as roots):

  version           u32, default 1
  parent_id         Option<String>, previous version's playbook_id
  superseded_at     Option<String>, set when newer version replaces
  superseded_by     Option<String>, the playbook_id that replaced

New methods:

  revise_entry(parent_id, new_entry) — appends new version, stamps
    superseded_at+superseded_by on parent, inherits parent_id and sets
    version = parent + 1 on the new entry. Rejects revising a retired
    or already-superseded parent (tip-of-chain is the only valid
    revise target).
  history(playbook_id) — returns full chain root->tip from any node.
    Walks parent_id back to root, then superseded_by forward to tip.
    Cycle-safe; the walk is sketched below.
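
Hypothetical sketch of that walk (the map and node types are stand-ins for
the real entries):

  use std::collections::{HashMap, HashSet};

  struct Node { parent_id: Option<String>, superseded_by: Option<String> }

  fn history(entries: &HashMap<String, Node>, start: &str) -> Vec<String> {
      let mut seen: HashSet<String> = HashSet::new();
      // Walk parent_id back to the root.
      let mut root = start.to_string();
      while let Some(p) = entries.get(&root).and_then(|n| n.parent_id.clone()) {
          if !seen.insert(p.clone()) || !entries.contains_key(&p) {
              break; // cycle or dangling parent: stop at the last good node
          }
          root = p;
      }
      // Walk superseded_by forward to the tip, collecting the chain.
      let mut chain = vec![root.clone()];
      seen.clear();
      seen.insert(root.clone());
      let mut cur = root;
      while let Some(next) = entries.get(&cur).and_then(|n| n.superseded_by.clone()) {
          if !seen.insert(next.clone()) || !entries.contains_key(&next) {
              break;
          }
          chain.push(next.clone());
          cur = next;
      }
      chain
  }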

Superseded entries excluded from boost (same rule as retired): filter
in compute_boost_for_filtered_with_role (both active-entries prefilter
and geo-filtered path), rebuild_geo_index, and upsert_entry's existing-
idx search. status_counts returns (total, retired, superseded, failures);
/status JSON reports active = total - retired - superseded.

Endpoints:
  POST /vectors/playbook_memory/revise
  GET  /vectors/playbook_memory/history/{id}

Doc-sync — PHASES.md + PRD.md drifted from git after Phases 24-26
shipped. Fixes applied:

  - Phase 24 marked shipped (commit b95dd86) with detail of observer
    HTTP ingest + scenario outcome streaming. PRD "NOT YET WIRED"
    rewritten to reflect shipped state.
  - Phase 25 (validity windows, commit e0a843d) added to PHASES +
    PRD.
  - Phase 26 (Mem0 upsert + Letta hot cache, commit 640db8c) added.
  - Phase 27 entry added to both docs.
  - Phase 19.6 time decay corrected: was documented as "deferred",
    actually wired via BOOST_HALF_LIFE_DAYS = 30.0 in playbook_memory.rs.
  - Phase E/Phase 8 tombstone-at-compaction limit note updated —
    Phase E.2 closed it.

Tests: 8 new version_tests in vectord (chain-metadata stamping,
retired/superseded parent rejection, boost exclusion, history from
root/tip/middle, legacy default round-trip, status counts). 25 new
aibridge tests (context/continuation/tree_split). Workspace total
145 green (was 120).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 17:40:49 -05:00
root
640db8c63c Phase 26 — Mem0 upsert + Letta geo hot cache
Closes the two remaining 2026-era memory findings. Both are
optimizations per J's framing — not load-bearing, but good data
hygiene + future-proofing at scale.

MEM0 UPSERT (data hygiene):
Before: /seed always appended. A scenario re-running the same
operation on the same day wrote duplicate entries, inflating the
playbook corpus with near-identical rows.

Now: upsert_entry(new) inspects existing non-retired entries and
decides ADD / UPDATE / NOOP:
  ADD     → no matching (operation, day, city, state) tuple, append
  UPDATE  → match exists with different names → merge (union, stable
            order), refresh timestamp, keep original playbook_id so
            citations stay valid
  NOOP    → match exists with identical names → skip, return id

Day-granularity keying on timestamp YYYY-MM-DD means intraday
re-seeds dedup but tomorrow's same-operation is a fresh ADD. Retired
entries don't block new seeds — they're out of scope anyway.
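
Decision sketch (field names follow this message; the function is
illustrative, not the real upsert_entry):

  fn upsert_mode(
      existing_key: (&str, &str, &str, &str),   // (operation, day, city, state)
      existing_names: &[String],
      incoming_key: (&str, &str, &str, &str),
      incoming_names: &[String],
  ) -> &'static str {
      if existing_key != incoming_key {
          "add"      // no matching tuple: append a fresh entry
      } else if existing_names == incoming_names {
          "noop"     // identical names: skip, keep the original playbook_id
      } else {
          "update"   // union-merge names, refresh timestamp, keep the id
      }
  }

  // The day component is the YYYY-MM-DD prefix of the RFC3339 timestamp,
  // e.g. "2026-04-21T00:24:05-05:00" -> "2026-04-21".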

Seed endpoint returns {outcome: {mode, playbook_id, merged_names?},
entries_after}. Append=false retains old replace-all semantics.

5 unit tests pass: first_seed_is_add, identical_reseed_is_noop,
same_day_different_names_updates_and_merges, different_day_same_op_is_add,
retired_entry_doesnt_block_new_seed.

Live verified: three successive seeds with (Alice), (Alice),
(Alice, Bob) left entry count unchanged at 1936 with merged names
{Alejandro, Lauren, Alice, Bob}. Previously would have been 3
appends.

LETTA GEO HOT CACHE (scale primitive):
Added geo_index: HashMap<(city_lower, state_upper), Vec<usize>>
alongside PlaybookMemoryState. Rebuilt on every mutation: set_entries,
retire_one, retire_on_schema_drift, upsert_entry, load_from_storage.

compute_boost_for_filtered_with_role now uses the index for O(1) geo
lookup instead of scanning all entries. At current scale (1.9K) the
scan was sub-ms; at 100K+ the scan becomes the dominant cost. The
hot cache future-proofs without adding an LRU abstraction.
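
Index sketch (assumed; the rebuild below runs over (city, state, retired)
triples as stand-ins for the real entries):

  use std::collections::HashMap;

  type GeoIndex = HashMap<(String, String), Vec<usize>>; // (city_lower, state_upper) -> entry indices

  fn rebuild_geo_index(entries: &[(String, String, bool)]) -> GeoIndex {
      let mut index = GeoIndex::new();
      for (i, (city, state, retired)) in entries.iter().enumerate() {
          if *retired {
              continue; // retired entries never enter the index
          }
          index
              .entry((city.to_lowercase(), state.to_uppercase()))
              .or_default()
              .push(i);
      }
      index
  }

  // Lookup becomes one hash probe instead of a full scan; valid_until is
  // still checked on the hot path since it can elapse between rebuilds.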

Retired entries excluded from index; valid_until still checked on the
hot path since it can elapse between rebuilds.

The geo_filtered vector owns cloned PlaybookEntries, so the state
read-lock is released before cosine scoring, avoiding lock contention
on the scoring path.

Memory-findings progress: 5 of 5 shipped.
  ✓ Multi-strategy parallel retrieval (Phase 19 refinement)
  ✓ Input normalization + unified /memory/query (Phase 24 TS)
  ✓ Zep validity windows (Phase 25)
  ✓ Mem0 UPSERT (Phase 26 today)
  ✓ Letta geo hot cache (Phase 26 today)

All 18 playbook_memory tests pass.
2026-04-21 00:24:05 -05:00
root
e0a843d1a5 Phase 25 — validity windows + playbook retirement
Addresses the load-bearing memory gap J flagged: playbook entries
had timestamps but no retirement semantic. When a schema migration
changed a column or a seasonal contract ended, stale playbooks kept
boosting candidates silently. Zep 2026-era finding — temporal
validity is the single highest-value memory-hygiene primitive.

SCHEMA (PlaybookEntry gains four optional fields, serde default):
  schema_fingerprint  — SHA-256 over dataset (column, type) tuples at
                        seed time. Missing = legacy entry, never
                        auto-retired on drift.
  valid_until         — RFC3339 hard expiry. compute_boost skips
                        entries past this moment.
  retired_at          — Set by retire_one or retire_on_schema_drift.
                        Retired entries excluded from all boost
                        calculations but kept in journal.
  retirement_reason   — Human-readable: "schema_drift: ...",
                        "expired: ...", "manual: ..."

RETRIEVAL PATH (compute_boost_for_filtered_with_role):
  Before geo+cosine, active_entries filter removes anything retired
  OR past valid_until. Uses chrono::Utc::now() once per call, no per-
  entry clock queries.
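
Prefilter sketch (function shape is illustrative; the unparseable-expiry
fallback is an assumption):

  use chrono::{DateTime, Utc};

  fn is_active(retired_at: &Option<String>, valid_until: &Option<String>, now: DateTime<Utc>) -> bool {
      if retired_at.is_some() {
          return false; // retired entries never boost
      }
      match valid_until {
          Some(ts) => DateTime::parse_from_rfc3339(ts)
              .map(|t| t.with_timezone(&Utc) > now)
              .unwrap_or(true), // unparseable expiry: assume still valid
          None => true, // no expiry set
      }
  }

  // The caller computes `now` once per call (chrono::Utc::now()) and reuses
  // it for the whole filter pass.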

NEW METHODS on PlaybookMemory:
  retire_one(playbook_id, reason)
  retire_on_schema_drift(city, state, current_fp, reason) — idempotent,
    scopes by (city, state) so a Nashville migration doesn't touch
    Chicago. Skips legacy entries with no fingerprint.
  status_counts() -> (total, retired, failures)

HTTP ENDPOINTS:
  POST /vectors/playbook_memory/retire
    {playbook_id, reason}                        → retire by id
    {city, state, current_schema_fingerprint, reason} → schema drift
  GET  /vectors/playbook_memory/status
    {total, active, retired, failures}

SEED REQUEST extended with optional schema_fingerprint + valid_until
so the orchestrator (scenario.ts) can pass the current schema hash
when seeding, without a round trip through catalogd.

UNIT TESTS (5/5 pass): retire_one_marks_entry_and_persists,
retired_entries_do_not_boost, expired_valid_until_is_skipped,
schema_drift_retires_mismatched_fingerprints_only,
schema_drift_skips_other_cities.

LIVE VERIFIED: /status on current state = 1936 entries, 43 failures.
POST /retire with a sample playbook_id → "retired":1, /status now
reports active=1935, retired=1.

Memory-findings progress: 3 of 5 shipped.
  ✓ Multi-strategy parallel retrieval (Phase 19 refinement)
  ✓ Input normalization + unified /memory/query (Phase 24 TS)
  ✓ Zep-style validity windows (Phase 25, tonight)
  ✗ Mem0 UPDATE / DELETE / NOOP ops (dedup same-(op,date) seeds)
  ✗ Letta working-memory hot cache (not biting at 1.5K entries)
2026-04-21 00:11:02 -05:00
root
ad0edbe29c Cloud kimi-k2.5 executor for weak tiers + multi-strategy playbook retrieval
Two coupled changes from the 2026 agent-memory research + tool
asymmetry findings.

SCENARIO (weak-tier cloud substitute):
qwen2.5 collapsed to 0/14 across the basic/minimal tool_levels.
Replace with cloud kimi-k2.5 on Ollama Cloud — same family as k2.6
(pro-tier locked today, on J's upgrade path). Plumb cloud flag
through ACTIVE_EXECUTOR_CLOUD / ACTIVE_REVIEWER_CLOUD into
generateContinuable so executor/reviewer can route to cloud when
tool_level requires. think:false supported by Kimi family.

Tool level mapping (revised):
  full     — qwen3.5 local + qwen3 local + cloud gpt-oss:120b T3 + rescue
  local    — qwen3.5 local + qwen3 local + local gpt-oss:20b T3 + rescue
  basic    — kimi-k2.5 cloud + qwen3 local + local T3, no rescue
  minimal  — kimi-k2.5 cloud + qwen3 local, no T3, no rescue.
             Playbook inheritance alone on the decision path.

This is the honest version of J's "minimal tools still works via
inheritance" hypothesis — with the executor no longer broken at the
tokenizer level, we can actually measure whether playbook retrieval
substitutes for missing overseers.

PLAYBOOK_MEMORY (multi-strategy retrieval):
Zep / Mem0 research shows multi-strategy rerank (semantic + keyword +
graph + temporal) outperforms single-strategy cosine. Lakehouse now
has a two-tier:

  1. Exact (role, city, state) match: skip cosine, assign similarity=1.0,
     take up to top_k/2+1 slots. These are identity-class neighbors —
     the strongest possible signal.
  2. Cosine fallback within the same (city, state) but different role:
     fills remaining slots.

Exposed as compute_boost_for_filtered_with_role(target_geo, target_role).
Backwards-compatible: compute_boost_for_filtered forwards with role=None
so existing callers keep their current behavior.
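
Selection sketch (types and ordering details are illustrative; candidates
are assumed already filtered to the target (city, state)):

  fn select_playbooks(
      candidates: &[(usize, String, f32)], // (entry index, role, cosine similarity)
      target_role: Option<&str>,
      top_k: usize,
  ) -> Vec<(usize, f32)> {
      let mut picked: Vec<(usize, f32)> = Vec::new();
      // Tier 1: exact role matches are identity-class neighbors, similarity 1.0.
      if let Some(role) = target_role {
          for (idx, cand_role, _) in candidates {
              if picked.len() >= top_k / 2 + 1 {
                  break;
              }
              if cand_role.eq_ignore_ascii_case(role) {
                  picked.push((*idx, 1.0));
              }
          }
      }
      // Tier 2: cosine fallback fills the remaining slots.
      let mut rest: Vec<&(usize, String, f32)> = candidates
          .iter()
          .filter(|(idx, ..)| !picked.iter().any(|(p, _)| p == idx))
          .collect();
      rest.sort_by(|a, b| b.2.total_cmp(&a.2));
      for (idx, _, sim) in rest.into_iter().take(top_k.saturating_sub(picked.len())) {
          picked.push((*idx, *sim));
      }
      picked
  }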

Service.rs wires both: extract_target_geo and extract_target_role pull
from the executor's SQL filter. grab_eq_value is factored out of
extract_target_geo so both lookups share one parser. Diagnostic log
now prints target_role alongside target_geo for every hybrid_search:

  playbook_boost: boosts=88 sources=39 parsed=39 matched=5
    target_geo=Some(("Nashville", "TN")) target_role=Some("Welder")

Verified: Nashville Welder query returns 5/10 boosted workers in
top_k with clean role+geo provenance.

Research sources: atlan.com Agent Memory Frameworks 2026, Mem0 paper
(arxiv 2504.19413), Zep/Graphiti LongMemEval comparison, ossinsight
Agent Memory Race 2026.

kimi-k2.6 on current key returns 403 — pro-tier upgrade required.
kimi-k2.5 is the substitute today; swap to k2.6 by renaming one line
in applyToolLevel once the subscription lands.
2026-04-20 23:20:07 -05:00
root
a663698571 Item 3 — geo-filtered playbook boost; diagnostic logging
ROOT CAUSE (found via instrumentation, not hunch):
After a 20-scenario corpus batch, only 6/40 successful (role, city)
combos ever triggered playbook_memory citations on subsequent runs.
Added `playbook_boost:` tracing::info! line in vectord::service to log
boost map size vs candidate pool vs match count. One query revealed:

  boosts=170 sources=50 parsed=50 matched=0

170 endorsed workers came back from compute_boost_for — but zero were
in the 50-candidate Toledo pool. The boost map was pulling globally-
ranked semantic neighbors (top-100 playbooks across ALL cities),
dominated by Kansas City / Chicago / Detroit forklift playbooks the
Toledo SQL filter would never admit. The mechanism was correct at the
per-playbook level; the problem was pool intersection.

FIX (surgical, not cap-tuning):
- playbook_memory::compute_boost_for_filtered(): accepts optional
  (city, state) filter. When set, skips playbooks from other geos
  BEFORE cosine-ranking, so top-k is within the target city.
- Backwards-compatible: compute_boost_for() calls the filtered variant
  with None — existing callers unchanged.
- service::hybrid_search(): extracts target (city, state) from the
  executor's SQL filter via a small parser (extract_target_geo,
  sketched below), passes it to compute_boost_for_filtered.
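
Parser sketch (the real extract_target_geo / grab_eq_value may differ; this
assumes ASCII filters of the form column = 'Value'):

  fn grab_eq_value(filter: &str, column: &str) -> Option<String> {
      let needle = format!("{column} = '");
      let pos = filter.to_lowercase().find(&needle)?;
      let rest = &filter[pos + needle.len()..];
      rest.split('\'').next().map(|s| s.to_string())
  }

  fn extract_target_geo(filter: &str) -> Option<(String, String)> {
      Some((grab_eq_value(filter, "city")?, grab_eq_value(filter, "state")?))
  }

  // extract_target_geo("city = 'Toledo' AND state = 'OH'")
  //   -> Some(("Toledo", "OH"))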

VERIFIED:
  Before fix: boosts=170 sources=50 parsed=50 matched=0   (0% hit)
  After fix:  boosts=36  sources=50 parsed=50 matched=11  (22% hit)
Top-k=10 now has 7/10 boosted workers with 2-3 citations each.
Boost values 0.075-0.113 on cosine scores 0.67-0.74 — meaningful
reorder without saturation.

scripts/kb_measure.py:
Aggregator that reads data/_kb/*.jsonl and playbooks/*/results.json,
reports fill rate, citation density, recommender confidence trend,
and zero-citation-ok combos (item 3 target signal). Used to measure
before/after on bigger batches.

Diagnostic logging stays — the class of "boosts computed but not
matched" bug can recur if the SQL filter format ever drifts, and
without the counter it's invisible. Every hybrid_search with
use_playbook_memory=true now logs its boost stats.
2026-04-20 21:35:04 -05:00
root
95c26f04f8 Path 1 negative signal + Path 2 pattern discovery + name validation
New:
- /vectors/playbook_memory/patterns: meta-index pattern discovery
  (aggregation sketched after this list).
  Given a query, finds top-K similar playbooks, pulls each endorsed
  worker's full workers_500k profile, aggregates shared traits (cert
  frequencies, skill frequencies, modal archetype, reliability
  distribution), returns a human-readable discovered_pattern. Surfaces
  signals operators didn't explicitly query — the original PRD's
  "identify things we didn't know" dimension.
- /vectors/playbook_memory/mark_failed: records worker failures per
  (city, state, name). compute_boost_for applies 0.5^n penalty per
  recorded failure, so 3 failures quarter a worker's positive boost and
  5 effectively zero it. Path 1 negative signal — recruiter trust
  depends on the system NOT recommending people who no-showed.
- Bun /log_failure: validates failed_names against workers_500k
  (same ghost-guard as /log), forwards to /mark_failed.
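
Hypothetical aggregation sketch (profile fields are simplified to
(skills, archetype) pairs; the real code reads full workers_500k rows):

  use std::collections::HashMap;

  fn aggregate_traits(profiles: &[(Vec<String>, String)]) -> (HashMap<String, usize>, Option<String>) {
      let mut skill_freq: HashMap<String, usize> = HashMap::new();
      let mut archetype_freq: HashMap<String, usize> = HashMap::new();
      for (skills, archetype) in profiles {
          for s in skills {
              *skill_freq.entry(s.clone()).or_insert(0) += 1;
          }
          *archetype_freq.entry(archetype.clone()).or_insert(0) += 1;
      }
      // Modal archetype = the most frequent one across endorsed workers.
      let modal = archetype_freq.into_iter().max_by_key(|(_, n)| *n).map(|(a, _)| a);
      (skill_freq, modal)
  }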

Improved:
- /log now validates endorsed_names against workers_500k for the
  contract's city+state before seeding. Ghost names (names that don't
  correspond to real workers) are rejected in the response and excluded
  from the seed, preventing silent boost failures.
- Bun /search auto-appends `CAST(availability AS DOUBLE) > 0.5` to
  sql_filter when the caller didn't constrain availability. Opt out
  with `include_unavailable: true`. Recruiter trust bug: surfacing
  already-placed workers breaks the first call.
- DEFAULT_TOP_K_PLAYBOOKS 25 → 100. Direct cosine measurement showed
  similarities cluster 0.55-0.67 across all playbooks regardless of
  geo, so k=25 missed relevant geo-matched playbooks. Brute-force is
  still sub-ms at this size.

Verified end-to-end on live data:
- Ghost names rejected on /log + /log_failure
- Availability filter drops unavailable workers from candidate pool
- Pattern discovery on unseen Cleveland OH Welder query returned
  recurring skills (first aid 43%, grinder 43%, blueprint 43%) and
  modal archetype (specialist) across 20 semantically similar past
  playbooks in 0.24s
- Negative signal: Helen Sanchez boost dropped +0.250 → +0.163 after
  3 failures recorded via /log_failure (34% reduction)
2026-04-20 14:55:46 -05:00
root
25b7e6c3a7 Phase 19 wiring + Path 1/2 work + chain integrity fixes
Backend:
- crates/vectord/src/playbook_memory.rs (new): Phase 19 in-memory boost
  store with seed/rebuild/snapshot, plus temporal decay (e^-age/30 per
  playbook), persist_to_sql endpoint backing successful_playbooks_live,
  and discover_patterns endpoint for meta-index pattern aggregation
  (recurring certs/skills/archetype/reliability across similar past fills).
- DEFAULT_TOP_K_PLAYBOOKS bumped 5 → 25; old default silently missed
  most boosts when memory had > 25 entries.
- service.rs: new routes /vectors/playbook_memory/{seed,rebuild,stats,
  persist_sql,patterns}.

Bun staffing co-pilot (mcp-server/):
- /search, /match, /verify, /proof, /simulation/run, MCP tools all
  forward use_playbook_memory:true and playbook_memory_k:25 to the
  hybrid endpoint. Boost was previously dark across the entire app.
- /log no longer POSTs to /ingest/file — that endpoint REPLACES the
  dataset's object list, so single-row CSV writes were wiping all prior
  rows in successful_playbooks (sp_rows went 33→1 in one /log call).
  /log now seeds playbook_memory with canonical short text and calls
  /persist_sql to keep successful_playbooks_live in sync.
- /simulation/run cumulative end-of-week CSV write removed for the same
  reason. Per-day per-contract /seed (added in this session) is the
  accumulating feedback path now.
- search.html addWorkerInsight renders a green "Endorsed · N playbooks"
  chip with playbook citations when boost > 0.

Internal Dioxus UI (crates/ui/):
- Dashboard phase list rewritten through Phase 19 (was stuck at "Phase
  16: File Watcher" / "Phase 17: DB Connector" — both wrong).
- Removed fabricated "27ms" stat label.
- Ask tab examples + SQL default replaced with real staffing prompts
  against candidates/clients/job_orders (was referencing nonexistent
  employees/products/events).
- New Playbook tab exposes /vectors/playbook_memory/{stats,rebuild} and
  side-by-side hybrid search (boost OFF vs ON) with citations.

Tests (tests/multi-agent/):
- run_e2e_rated.ts: parallel two-agent (mistral + qwen2.5) build phase
  + verifier rating (geo, auth, persist, boost, speed → /10).
- network_proving.ts: continuous build → verify → repeat with
  staffing-recruiter profile hot-swap; geo-discrimination check.
- chain_of_custody.ts: single recruiter operation traced through every
  layer (Bun /search, direct /vectors/hybrid parity, /log, SQL,
  playbook_memory growth, profile activation, post-op boost lift).
2026-04-20 06:21:13 -05:00