diff --git a/STATE_OF_PLAY.md b/STATE_OF_PLAY.md
index 7572777..6f92e43 100644
--- a/STATE_OF_PLAY.md
+++ b/STATE_OF_PLAY.md
@@ -272,7 +272,8 @@ a steady state. Future items will land here as production triggers fire.
 | (scrum) | 3-lineage scrum on `434f466..0d4f033` (post_role_gate_v1). Convergent finding (Opus + Kimi): `DecodeIndex` lost nil-meta items across persistence. **Fixed** by bumping envelope version 1→2 with explicit `IDs []string` field; v1 envelopes still load via meta-key fallback. Opus-only real bugs also actioned: `handleMerge` non-`ErrIndexNotFound` nil-deref, `mathLog` dead wrapper removed, bubble sort → `sort.Slice`. False positives rejected after verification (Kimi rollback misreading + Opus stale-comment claim). 2 new regression tests lock the v2 round-trip + v1 backward-compat. Disposition: `reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md`. |
 | (audit-full port) | **Audit-FULL pipeline** (phases 0/3/4) ported from `scripts/distillation/audit_full.ts`. `internal/distillation/audit_full.go` + `cmd/audit_full` CLI. 6 ported required-check classes; 4 phases (1, 2, 5, 6, 7) deferred — depend on broader Rust pieces (materializer / replay / run-summaries) not yet ported. **Cross-runtime byte-equal verdict on live data**: Go-side audit-full against `/home/profit/lakehouse` produced p3_*/p4_* metrics IDENTICAL to the last Rust-emitted `audit_baselines.jsonl` entry (all 8 metrics match: p3_accepted=386, p3_partial=132, p3_rejected=57, p3_human=480, p4_sft_rows=353, p4_rag_rows=448, p4_pref_pairs=83, p4_total_quarantined=1325). 6 new tests + the live-data probe captured in `reports/cutover/audit_full_go_vs_rust.md`. |
 | (audit-full skips fixed) | **Phases 1/2/5/7 unskipped** (2026-05-01) — port reduced from 4 deferred phases to 1. **Phase 1**: invokes `go test ./internal/distillation/...` via exec.Command (Go equivalent of Rust's `bun test`). **Phase 2**: reads `data/evidence/` and tallies rows + tier-1 source hits as an observer (doesn't re-run the materializer; emits `p2_evidence_rows`/`p2_evidence_skips` metrics). **Phase 5**: reads `reports/distillation/{run_id}/summary.json` + 5 stage receipts; validates schema_version + run_hash sha256 + git_commit hex. **Phase 7**: reads `data/_kb/replay_runs.jsonl`; tail-row JSON parse check. Only **Phase 6** remains skipped (Rust `acceptance.ts` is a TS-only fixture harness; porting fixtures + invariant runner is its own ADR). Live-data probe: 12/12 required checks PASS, `p2_evidence_rows=1055` byte-equal to Rust `summary.json` `collect.records_out`. 6 new tests. |
-| (lets-go) | **Persistent Go stack live** (2026-05-01). All 11 daemons (storaged/catalogd/ingestd/queryd/embedd/vectord/pathwayd/observerd/matrixd/gateway/chatd) up as long-running processes on :3110+:3211-:3220, alongside the live Rust gateway on :3100 (no port conflict). First time the Go side runs as production-shape daemons rather than per-harness transient processes. Brought up via `scripts/cutover/start_go_stack.sh`. Gateway proxies `/v1/embed` correctly to embedd; all 5 chatd providers loaded. **First Go-side entry written to `data/_kb/audit_baselines.jsonl`** (entry #8, git_commit=`ee2a40c`, golangLAKEHOUSE SHA distinguishable from Rust's `ca7375ea`); the longitudinal log now mixes runtimes. |
+| (lets-go) | **Persistent Go stack live** (2026-05-01). All 11 daemons (storaged/catalogd/ingestd/queryd/embedd/vectord/pathwayd/observerd/matrixd/gateway/chatd) up as long-running processes on :3110+:3211-:3220 → later moved to :4110+:4211-:4219+:3220 for smoke isolation. First time the Go side runs as production-shape daemons rather than per-harness transient processes. Brought up via `scripts/cutover/start_go_stack.sh`. Gateway proxies `/v1/embed` correctly to embedd; all 5 chatd providers loaded. **First Go-side entry written to `data/_kb/audit_baselines.jsonl`** (entry #8, git_commit=`ee2a40c`, golangLAKEHOUSE SHA distinguishable from Rust's `ca7375ea`); the longitudinal log now mixes runtimes. |
+| (g5-slice) | **G5 cutover slice LIVE** (2026-05-01). First real Bun-frontend traffic reaching the Go substrate end-to-end. Bun mcp-server (`/home/profit/lakehouse/mcp-server/index.ts`) gains opt-in `/_go/*` pass-through to `$GO_LAKEHOUSE_URL` (set to `http://127.0.0.1:4110` via systemd drop-in). `/_go/v1/embed` returns nomic-embed-text-v2-moe vectors via Go embedd; `/_go/v1/matrix/search` returns 3/3 Forklift Operators against the persistent 200-worker corpus. Fully additive (no existing Bun tool modified) + fully reversible (unset env). `/api/*` (Rust gateway) path unchanged. See `reports/cutover/g5_first_slice_live.md`. |
 | (close-3) | **OPEN #3: distribution drift via PSI** — `internal/drift/drift.go`: `ComputeDistributionDrift` returns Population Stability Index + verdict tier (stable < 0.10, minor 0.10–0.25, major ≥ 0.25). Equal-width bucketing over combined min/max range, epsilon-clamping for empty buckets, per-bucket breakdown for drilldown. 7 new tests including identical-is-stable, hard-shift-is-major, moderate-detected-not-stable, empty-inputs-safe, all-identical-safe, bucket-counts-conserved, num-buckets-clamping. |
 | (close-4) | **OPEN #4: ops nice-to-haves** — (a) Real-time wall-clock for stress harness: per-phase elapsed time logged to stdout as it runs (`[stress] phase NAME starting (T+12.3s)` + `[stress] phase NAME done — 8.5s (T+20.8s)`); `Output.PhaseTimings` + `Output.TotalElapsedMs` written to JSON; (b) chatd fixture-mode S3 mock + (c) liberal-paraphrase calibration: not actioned — no fired trigger yet, would be speculative. Documented as deferred-until-need rather than ignored. |
 
diff --git a/reports/cutover/SUMMARY.md b/reports/cutover/SUMMARY.md
index ec1b505..3b92231 100644
--- a/reports/cutover/SUMMARY.md
+++ b/reports/cutover/SUMMARY.md
@@ -12,6 +12,7 @@ what's safe to flip. Append a row when a new endpoint clears parity.
 | `audit-FULL` (phases 0/1/2/3/4/5/7 — observer mode) | 2026-05-01 | `scripts/distillation/audit_full.ts` | `cmd/audit_full` + `internal/distillation` `RunAuditFull` | ✅ PASS 12/12 | Skips reduced from 4 → 1: phase 1 invokes `go test`, phases 2/5/7 read existing artifacts as observers (no live materializer/replay invocation). Only phase 6 (TS-only acceptance harness) remains skipped. `p2_evidence_rows=1055` matches Rust `summary.json` `collect.records_out=1055` byte-equal. Updated `audit_full_go_vs_rust.md`. |
 | `audit_baselines.jsonl` write side | 2026-05-01 | `data/_kb/audit_baselines.jsonl` (Rust-emitted, 7 entries) | Go-emitted entry #8 via `cmd/audit_full -append-baseline` | ✅ Mixed-runtime log | First Go-side entry written to the shared longitudinal log: `git_commit=ee2a40c5...` (golangLAKEHOUSE SHA, distinguishable from prior Rust SHAs like `ca7375ea`). All 10 metric fields match Rust shape exactly — drift comparator fires correctly across the runtime boundary. |
 | Full Go stack (persistent) | 2026-05-01 | per-binary on :31xx | 11 daemons (storaged/catalogd/ingestd/queryd/embedd/vectord/pathwayd/observerd/matrixd/gateway/chatd) | ✅ All 11 healthy | First time the Go stack runs as long-running daemons rather than per-harness transient processes. Brought up via `scripts/cutover/start_go_stack.sh`; gateway proxies `/v1/embed` correctly through to embedd; all 5 chatd providers loaded. Live alongside the Rust gateway on :3100 (no port conflict). |
+| **G5 cutover slice live** | 2026-05-01 | (none — pure cutover) | Bun `/_go/*` → Go gateway `:4110` | ✅ End-to-end | First real Bun-frontend traffic to Go substrate. Rust legacy `mcp-server/index.ts` gains opt-in `/_go/*` pass-through driven by `GO_LAKEHOUSE_URL` env (systemd drop-in at `/etc/systemd/system/lakehouse-agent.service.d/go-cutover.conf`). `/_go/v1/embed` returns nomic-embed-text-v2-moe vectors; `/_go/v1/matrix/search` returns 3/3 Forklift Operators against persistent stack's 200-worker corpus. Reversible (unset env or revert systemd unit). See `g5_first_slice_live.md`. |
 
 ## Wire-format drift catalog
 
diff --git a/reports/cutover/g5_first_slice_live.md b/reports/cutover/g5_first_slice_live.md
new file mode 100644
index 0000000..64037f9
--- /dev/null
+++ b/reports/cutover/g5_first_slice_live.md
@@ -0,0 +1,110 @@
+# G5 cutover slice live — first real Bun-frontend traffic to Go substrate
+
+2026-05-01: the Bun mcp-server (which serves devop.live/lakehouse/
+via nginx → :3700) now has a `/_go/*` pass-through that routes
+requests to the Go gateway. Real frontend traffic can flow through
+the Go substrate while the existing /api/* path (Rust gateway)
+stays untouched.
+
+## What changed
+
+**Rust legacy repo** (`/home/profit/lakehouse/`):
+
+- `mcp-server/index.ts`:
+  - New `GO_BASE = process.env.GO_LAKEHOUSE_URL || ""` (off-by-default)
+  - New `/_go/*` handler at the same shape as `/api/*` (forwarded
+    method+headers+body) but pointing at `${GO_BASE}${path}`
+  - Returns 503 with rationale when `GO_LAKEHOUSE_URL` is unset
+  - Returns 502 when Go gateway is unreachable (preserves operator
+    visibility into Go-side failures vs silent fallback)
+
+**System config**:
+
+- `/etc/systemd/system/lakehouse-agent.service.d/go-cutover.conf`:
+  ```
+  [Service]
+  Environment=GO_LAKEHOUSE_URL=http://127.0.0.1:4110
+  ```
+  Drop-in adds the env var to the systemd-managed Bun service.
+  Reversible via `systemctl revert lakehouse-agent.service` or by
+  removing the file.
+
+**Go rewrite repo** (this commit): documentation + cross-runtime
+parity log entry.
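Reviewer note: the handler behavior listed under "What changed" (off by default, 503 when `GO_LAKEHOUSE_URL` is unset, 502 when the Go gateway is unreachable, method+headers+body forwarded to `${GO_BASE}${path}`) can be sketched roughly as below. This is an illustration only, not the code in `mcp-server/index.ts`: the names `GO_PREFIX`, `goTargetUrl`, `jsonError`, and `handleGo`, the JSON error body, the 404 for non-`/_go` paths, and the header filtering are all assumptions.

```typescript
// Hypothetical sketch of a /_go/* pass-through; the real handler may differ.
const GO_PREFIX = "/_go";

// Map "/_go/v1/embed" to "<goBase>/v1/embed". Returns null for non-/_go paths.
function goTargetUrl(goBase: string, pathAndQuery: string): string | null {
  const underPrefix =
    pathAndQuery === GO_PREFIX || pathAndQuery.startsWith(GO_PREFIX + "/");
  if (!underPrefix) return null;
  const rest = pathAndQuery.slice(GO_PREFIX.length);
  // Strip trailing slashes from the base so the join never yields "//".
  return goBase.replace(/\/+$/, "") + (rest.startsWith("/") ? rest : "/" + rest);
}

// Assumed error shape: a small JSON object carrying the rationale.
function jsonError(status: number, message: string): Response {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: { "content-type": "application/json" },
  });
}

async function handleGo(req: Request, goBase: string): Promise<Response> {
  const url = new URL(req.url);
  const target = goTargetUrl(goBase, url.pathname + url.search);
  if (target === null) return jsonError(404, "not a /_go path");
  // Fail closed: with no base URL configured, nothing is forwarded.
  if (goBase === "") {
    return jsonError(503, "GO_LAKEHOUSE_URL is unset; Go pass-through is disabled");
  }

  // Forward the caller's headers, minus the ones fetch recomputes itself.
  const headers = new Headers(req.headers);
  headers.delete("host");
  headers.delete("content-length");
  const hasBody = req.method !== "GET" && req.method !== "HEAD";

  try {
    const upstream = await fetch(target, {
      method: req.method,
      headers,
      body: hasBody ? await req.arrayBuffer() : undefined,
    });
    // fetch has already decoded the body, so drop stale encoding/length headers.
    const respHeaders = new Headers(upstream.headers);
    respHeaders.delete("content-encoding");
    respHeaders.delete("content-length");
    return new Response(upstream.body, {
      status: upstream.status,
      headers: respHeaders,
    });
  } catch (err) {
    // Surface Go-side outages as 502 instead of silently falling back to Rust.
    const reason = err instanceof Error ? err.message : String(err);
    return jsonError(502, `Go gateway unreachable: ${reason}`);
  }
}
```

Under Bun, a handler of this shape could be called from the `fetch` callback of `Bun.serve` with `process.env.GO_LAKEHOUSE_URL || ""` as `goBase`. Keeping the path mapping in a pure function makes the prefix rules checkable without a running gateway.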
+
+## Verification (all live, against persistent Go stack on :4110)
+
+| Path | Routes to | Verdict |
+|---|---|---|
+| `GET /_go/health` | Go gateway `:4110/health` | `{"status":"ok","service":"gateway"}` |
+| `POST /_go/v1/embed` | Go embedd via gateway | nomic-embed-text-v2-moe vectors, dim=768 |
+| `POST /_go/v1/matrix/search` | Go matrixd via gateway | 3/3 Forklift Operators against the 200-worker persistent corpus |
+| `GET /api/health` (control) | Rust gateway `:3100/health` | `lakehouse ok` (unchanged) |
+
+Search results from the live cutover slice:
+
+```
+rank=0 id=w-43 dist=0.445 Brian Ramirez (Forklift Operator, Springfield, IL)
+rank=1 id=w-102 dist=0.448 Laura Long (Forklift Operator, Cleveland, OH)
+rank=2 id=w-101 dist=0.485 Terrence Gray (Forklift Operator, Champaign, IL)
+```
+
+3/3 role match. Top-1 in IL exactly. Real coordinator-shape query
+served by Go substrate via Bun frontend.
+
+## What this is
+
+This is the **first time** real Bun-frontend traffic has reached the
+Go substrate end-to-end. Production paths (devop.live/lakehouse/* →
+nginx → Bun :3700 → /api/*) still go through Rust on :3100; the
+new /_go/* path is a parallel slice that operators or external
+tools can hit to validate Go under realistic frontend conditions.
+
+The cutover unit is FULLY ADDITIVE:
+- No existing tool modified
+- /api/* unchanged
+- All Bun mcp-server tools that currently call BASE (Rust) unchanged
+- /_go/* fails-closed (503) when env var unset
+
+The cutover unit is FULLY REVERSIBLE:
+- Unset `GO_LAKEHOUSE_URL` → /_go/* returns 503
+- `systemctl revert lakehouse-agent` → removes the drop-in
+- `git revert` of the Bun source change → removes the handler
+
+## What this is NOT
+
+- **Not** an nginx flip. devop.live/lakehouse/* still goes
+  through Rust. Operators have to opt into /_go/*.
+- **Not** a wholesale routing change. Each path under /_go/ is a
+  manual choice; no automatic routing of Rust paths to Go
+  equivalents.
+- **Not** a transformation layer. /_go/v1/embed expects the Go
+  request shape (`{texts, model}`); /_go/* doesn't translate
+  Rust-shaped requests for the caller.
+
+## Operational discipline
+
+The systemd drop-in is at `/etc/systemd/system/lakehouse-agent.service.d/`.
+Operators inspecting the system should know to look there for
+cutover-related env overrides.
+
+The persistent Go stack must be up at :4110 for /_go/* to succeed.
+Tear down via `pkill -f 'bin/persistent-'`. Bring up via
+`scripts/cutover/start_go_stack.sh`.
+
+## What's next
+
+- **Pick a real coordinator-facing tool to flip.** The current
+  /_go/* pass-through is operator-facing (manual curl). The next
+  step would be modifying ONE specific tool in mcp-server/index.ts
+  (e.g. `/search` → /v1/matrix/search) to optionally route through
+  Go when GO_BASE is set, with response shape adapter. That's a
+  product-visible flip; current state is infrastructure-visible.
+- **Shadow-read mode.** A hybrid where /api/* ALSO calls /_go/*
+  in parallel and logs diffs to stderr — operators see Go-vs-Rust
+  divergences on real traffic without changing user-facing
+  behavior. Cleaner cutover discipline than direct flip.
+- **Production load test.** This slice has been exercised on 3
+  manual curl requests. A traffic generator hitting /_go/v1/matrix/search
+  at sustained rate would expose latency / stability under load
+  that single-shot tests can't.
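Reviewer note: the shadow-read idea in "What's next" might look something like the sketch below. Nothing here exists in either repo: `Json`, `diffPaths`, and `shadowRead` are hypothetical names, and the sketch assumes both endpoints accept the same request and return comparable JSON. The report itself says /_go/* does not translate Rust-shaped requests, so a real version would likely need a request/response adapter first.

```typescript
// Hypothetical shadow-read sketch: serve the primary response, compare the
// shadow response in the background, and log divergences to stderr.
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

function isObj(v: Json): v is { [k: string]: Json } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Collect the JSON paths at which two values differ ("$" is the root).
function diffPaths(a: Json, b: Json, path: string = "$"): string[] {
  if (a === b) return [];
  const out: string[] = [];
  if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) out.push(`${path}.length`);
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      out.push(...diffPaths(a[i], b[i], `${path}[${i}]`));
    }
    return out;
  }
  if (isObj(a) && isObj(b)) {
    const keys = Array.from(new Set(Object.keys(a).concat(Object.keys(b)))).sort();
    for (const k of keys) {
      if (!(k in a) || !(k in b)) out.push(`${path}.${k}`);
      else out.push(...diffPaths(a[k], b[k], `${path}.${k}`));
    }
    return out;
  }
  // Different types, or different primitive values.
  return [path];
}

// Assumes init.body (if any) is reusable, e.g. a string, not a stream.
async function shadowRead(
  primaryUrl: string,
  shadowUrl: string,
  init?: RequestInit,
): Promise<Response> {
  // Start the shadow call first so it runs in parallel with the primary.
  const shadow = fetch(shadowUrl, init).then((r) => r.json() as Promise<Json>);
  shadow.catch(() => {}); // avoid an unhandled rejection if the primary throws

  const primary = await fetch(primaryUrl, init);
  const primaryCopy = primary.clone(); // the caller still gets an unread body

  // Compare in the background; never block or fail the user-facing response.
  void Promise.all([primaryCopy.json() as Promise<Json>, shadow])
    .then(([p, s]) => {
      const diffs = diffPaths(p, s);
      if (diffs.length > 0) {
        console.error(`[shadow] ${shadowUrl} diverges at: ${diffs.join(", ")}`);
      }
    })
    .catch((err) => console.error(`[shadow] comparison failed: ${err}`));

  return primary;
}
```

A structural diff over paths, rather than a byte comparison, keeps the log readable when only one field diverges. Fields that legitimately differ between runtimes (timestamps, float distances) would need an allowlist or tolerance, which this sketch omits.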