From eb0dfdff047e34439896552d483abbee673d5a47 Mon Sep 17 00:00:00 2001 From: root Date: Fri, 1 May 2026 01:20:37 -0500 Subject: [PATCH] =?UTF-8?q?vectord:=20v2=20envelope=20+=20handleMerge=20ro?= =?UTF-8?q?bustness=20=E2=80=94=20actions=20post=5Frole=5Fgate=5Fv1=20scru?= =?UTF-8?q?m?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 3-lineage scrum on 434f466..0d4f033 surfaced one convergent finding (Opus + Kimi) and 3 Opus-only real bugs. All actioned in this commit. Two false positives (Kimi rollback misreading, Opus stale- comment claim) verified + rejected — both required manual control- flow inspection to refute, matching the documented Kimi-truncation behavior in feedback_cross_lineage_review.md. Convergent fix — DecodeIndex lost nil-meta items: - Envelope version bumped 1 → 2. - New v2 field: IDs []string carries the canonical ID set explicitly, independent of meta map's nil-vs-{} sparseness. - DecodeIndex accepts both versions: v2 reads from env.IDs; v1 falls back to meta-key inference (with the documented limitation that nil-meta items are invisible — preserved for backward-compat with already-persisted indexes). - Encode emits v2 going forward. - 2 new regression tests: - TestEncodeDecode_NilMetaItemsSurviveRoundTrip: items added with nil metadata MUST survive Encode → Decode and remain visible to IDs(). Pre-fix would have yielded IDs() == []. - TestDecodeIndex_V1BackwardCompat: hand-crafted v1 envelope still decodes (proves the fallback path). Opus-only fixes: - handleMerge: non-ErrIndexNotFound errors at h.reg.Get(name) / h.reg.Get(req.Dest) now return 500 + log instead of falling through with nil src/dest pointers (which would panic on the next deref). Real bug — only the sentinel error was handled. - internal/drift/drift.go: mathLog wrapper removed; math.Log inlined. Wrapper added no value (math was already imported). - internal/distillation/audit_baseline.go: BuildAuditDriftTable's bubble sort replaced with sort.Slice. Idiomatic + shorter. Rejected after verification: - Kimi WARN "missing rollback on partial merge": misread the control flow. Code at cmd/vectord/main.go:404-414 does NOT delete from src when dest.Add fails (continue before reaching src.Delete). Only successful Adds trigger Deletes. - Opus INFO "TimestampUnixNano comment references missing field": field exists at scripts/multi_coord_stress/main.go:128. Opus saw only the diff context, not the full file. Deferred (no fired trigger): - Opus WARN "no per-index lock during merge": no concurrent merge callers today (operators run merge as deliberate one-shot job). Worth a lock if/when matrixd or chatd start auto-triggering. Disposition: reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md. Build + vet + tests green; 2 new regression tests + all prior tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) --- STATE_OF_PLAY.md | 1 + cmd/vectord/main.go | 13 +++++ internal/distillation/audit_baseline.go | 15 ++---- internal/drift/drift.go | 6 +-- internal/vectord/index.go | 54 ++++++++++++++------ internal/vectord/index_test.go | 66 +++++++++++++++++++++++++ 6 files changed, 124 insertions(+), 31 deletions(-) diff --git a/STATE_OF_PLAY.md b/STATE_OF_PLAY.md index 2306f4c..bda8764 100644 --- a/STATE_OF_PLAY.md +++ b/STATE_OF_PLAY.md @@ -269,6 +269,7 @@ a steady state. Future items will land here as production triggers fire. | (close-2) | **OPEN #2: distillation SFT export substrate** — `internal/distillation/sft_export.go`: `IsSftNever` predicate + `ListScoredRunFiles` (data/scored-runs/YYYY/MM/DD walk) + `LoadScoredRunsFromFile` + partial `ExportSft` that wires the firewall but leaves synthesis (instruction/input/response generation) as the next wave. Firewall pinning test fails if `SftNever` set changes without review. 5 new tests. The synthesis port remains on Rust at `scripts/distillation/export_sft.ts`. | | (close-2 full) | **OPEN #2 fully ported** (2026-05-01): `SynthesizeSft` + `LoadEvidenceByRunID` + `buildInstruction` ported byte-for-byte from `scripts/distillation/export_sft.ts`. All 8 source-class instruction templates (scrum_reviews / mode_experiments / auto_apply / audits / observer_reviews / contract_analyses / outcomes / default) match Rust output exactly so a/b validation between runtimes can diff JSONL byte-for-byte. `ExportSft` writes to `data/distilled/sft/sft_export.jsonl`. 5 additional tests including per-source-class template verification, extraction-rejection, empty-text-rejection, context-assembly, end-to-end fixture write. | | (close-2 lineage) | **Audit-baselines lineage ported** (2026-05-01): `internal/distillation/audit_baseline.go` mirrors Rust `audit_full.ts`'s LoadBaseline/AppendBaseline/buildDriftTable. `LoadLastBaseline` reads the most recent JSON line from `data/_kb/audit_baselines.jsonl`; `AppendBaseline` appends append-only with bufio. `BuildAuditDriftTable` flags drift `>20%` (configurable); zero-baseline and new-metric edge cases handled (no division-by-zero, no false-stable on zero→nonzero). `FormatAuditDriftTable` for stdout dumps. Generic on metric names so callers running both runtimes can pin Rust-compat names (`AuditBaselineRustCompat` constant lists them). 13 tests including last-line-wins, trailing-blank-tolerance, malformed-line-errors, threshold-boundary, zero-baseline-handling, sort-stability. | +| (scrum) | 3-lineage scrum on `434f466..0d4f033` (post_role_gate_v1). Convergent finding (Opus + Kimi): `DecodeIndex` lost nil-meta items across persistence. **Fixed** by bumping envelope version 1→2 with explicit `IDs []string` field; v1 envelopes still load via meta-key fallback. Opus-only real bugs also actioned: `handleMerge` non-`ErrIndexNotFound` nil-deref, `mathLog` dead wrapper removed, bubble sort → `sort.Slice`. False positives rejected after verification (Kimi rollback misreading + Opus stale-comment claim). 2 new regression tests lock the v2 round-trip + v1 backward-compat. Disposition: `reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md`. | | (close-3) | **OPEN #3: distribution drift via PSI** — `internal/drift/drift.go`: `ComputeDistributionDrift` returns Population Stability Index + verdict tier (stable < 0.10, minor 0.10–0.25, major ≥ 0.25). Equal-width bucketing over combined min/max range, epsilon-clamping for empty buckets, per-bucket breakdown for drilldown. 7 new tests including identical-is-stable, hard-shift-is-major, moderate-detected-not-stable, empty-inputs-safe, all-identical-safe, bucket-counts-conserved, num-buckets-clamping. | | (close-4) | **OPEN #4: ops nice-to-haves** — (a) Real-time wall-clock for stress harness: per-phase elapsed time logged to stdout as it runs (`[stress] phase NAME starting (T+12.3s)` + `[stress] phase NAME done — 8.5s (T+20.8s)`); `Output.PhaseTimings` + `Output.TotalElapsedMs` written to JSON; (b) chatd fixture-mode S3 mock + (c) liberal-paraphrase calibration: not actioned — no fired trigger yet, would be speculative. Documented as deferred-until-need rather than ignored. | diff --git a/cmd/vectord/main.go b/cmd/vectord/main.go index 403ef15..9bab5e3 100644 --- a/cmd/vectord/main.go +++ b/cmd/vectord/main.go @@ -363,6 +363,13 @@ func (h *handlers) handleMerge(w http.ResponseWriter, r *http.Request) { if errors.Is(err, vectord.ErrIndexNotFound) { http.Error(w, "source not found", http.StatusNotFound) return + } else if err != nil { + // Per scrum post_role_gate_v1 (Opus): non-ErrIndexNotFound + // errors must NOT fall through — src would be nil and the + // next deref panics. + slog.Error("merge: get source", "name", name, "err", err) + http.Error(w, "internal", http.StatusInternalServerError) + return } var req mergeRequest if !decodeJSON(w, r, &req) { @@ -376,6 +383,12 @@ func (h *handlers) handleMerge(w http.ResponseWriter, r *http.Request) { if errors.Is(err, vectord.ErrIndexNotFound) { http.Error(w, "dest not found", http.StatusNotFound) return + } else if err != nil { + // Same fix as source — non-ErrIndexNotFound dest errors + // can't reach the merge body with a nil dest pointer. + slog.Error("merge: get dest", "name", req.Dest, "err", err) + http.Error(w, "internal", http.StatusInternalServerError) + return } // Dimension match is non-negotiable — silently moving a 768-d // vector into a 384-d index would corrupt search forever. diff --git a/internal/distillation/audit_baseline.go b/internal/distillation/audit_baseline.go index 82fb96a..caa4724 100644 --- a/internal/distillation/audit_baseline.go +++ b/internal/distillation/audit_baseline.go @@ -28,6 +28,7 @@ import ( "math" "os" "path/filepath" + "sort" "strings" ) @@ -217,16 +218,10 @@ func BuildAuditDriftTable(prior *AuditBaseline, current map[string]int64, thresh } rows = append(rows, row) } - // Sort for stable display + deterministic JSON output. Bubble- - // sort by name; size is at most a few dozen metrics, so the - // O(n²) cost is irrelevant. - for i := 0; i < len(rows); i++ { - for j := i + 1; j < len(rows); j++ { - if rows[i].Metric > rows[j].Metric { - rows[i], rows[j] = rows[j], rows[i] - } - } - } + // Sort for stable display + deterministic JSON output. Per + // scrum post_role_gate_v1 (Opus INFO): use sort.Slice — already + // imported in the package, idiomatic, and shorter. + sort.Slice(rows, func(i, j int) bool { return rows[i].Metric < rows[j].Metric }) return rows } diff --git a/internal/drift/drift.go b/internal/drift/drift.go index a28825d..2b0ed72 100644 --- a/internal/drift/drift.go +++ b/internal/drift/drift.go @@ -41,10 +41,6 @@ import ( "git.agentview.dev/profit/golangLAKEHOUSE/internal/distillation" ) -// mathLog is wrapped to localize the math import in case future -// drift math swaps in a stable-log variant. Inlined by the compiler. -func mathLog(x float64) float64 { return math.Log(x) } - // ScorerDriftEntry is one mismatch — a historical (record, category) // pair where the current scorer disagrees with the persisted // verdict. Reasons captures the current scorer's explanation so @@ -337,7 +333,7 @@ func ComputeDistributionDrift(in DistributionDriftInput) DistributionDriftReport if cRatio < driftEpsilon { cRatio = driftEpsilon } - part := (cRatio - bRatio) * mathLog(cRatio/bRatio) + part := (cRatio - bRatio) * math.Log(cRatio/bRatio) psi += part report.Buckets = append(report.Buckets, DistributionDriftBucket{ Lower: minV + float64(i)*width, diff --git a/internal/vectord/index.go b/internal/vectord/index.go index 117c50f..7cf0efe 100644 --- a/internal/vectord/index.go +++ b/internal/vectord/index.go @@ -403,16 +403,25 @@ func (i *Index) Search(query []float32, k int) ([]Result, error) { // HNSW graph bytes. params + metadata + format version travel // together; the graph itself is opaque binary that round-trips // through hnsw.Graph.Export / Import. +// +// Version history: +// - v1: original; metadata-only ID inference on decode (loses +// items added with nil metadata across persistence). +// - v2: explicit IDs slice closes the v1 gap. Older v1 envelopes +// still load (backward-compat fallback path in DecodeIndex). type IndexEnvelope struct { Version int `json:"version"` Params IndexParams `json:"params"` Metadata map[string]json.RawMessage `json:"metadata"` + IDs []string `json:"ids,omitempty"` // v2+ — canonical ID set } -// envelopeVersion bumps when the on-disk JSON shape changes -// incompatibly. Reading a future version returns ErrVersionMismatch -// rather than producing a half-decoded index. -const envelopeVersion = 1 +// envelopeVersion is the version this build EMITS. Decoders accept +// any envelope ≤ this version (with version-specific fallbacks for +// older shapes). Bumping this constant means the format changed +// incompatibly going forward; readers without the version-specific +// branch will return ErrVersionMismatch. +const envelopeVersion = 2 // ErrVersionMismatch is returned by DecodeIndex when the envelope // claims a version this build doesn't understand. @@ -429,10 +438,17 @@ func (i *Index) Encode(envelopeW, graphW io.Writer) error { i.mu.RLock() defer i.mu.RUnlock() + // v2: serialize the canonical ID set explicitly so DecodeIndex + // can restore i.ids without depending on meta-key inference. + idList := make([]string, 0, len(i.ids)) + for id := range i.ids { + idList = append(idList, id) + } env := IndexEnvelope{ Version: envelopeVersion, Params: i.params, Metadata: i.meta, + IDs: idList, } if err := json.NewEncoder(envelopeW).Encode(env); err != nil { return fmt.Errorf("encode envelope: %w", err) @@ -451,7 +467,10 @@ func DecodeIndex(envelopeR, graphR io.Reader) (*Index, error) { if err := json.NewDecoder(envelopeR).Decode(&env); err != nil { return nil, fmt.Errorf("decode envelope: %w", err) } - if env.Version != envelopeVersion { + // Accept v1 (legacy: ids inferred from meta keys) and v2 (explicit + // ids slice). Future versions are unknown — return mismatch + // rather than half-decoding. + if env.Version < 1 || env.Version > envelopeVersion { return nil, fmt.Errorf("%w: have %d, got %d", ErrVersionMismatch, envelopeVersion, env.Version) } @@ -465,17 +484,20 @@ func DecodeIndex(envelopeR, graphR io.Reader) (*Index, error) { if env.Metadata != nil { idx.meta = env.Metadata } - // Restore the ids tracker from the metadata keyset. Items that - // were Add'd with nil metadata aren't in env.Metadata and won't - // appear in i.ids after reload — IDs() will miss them, and the - // merge endpoint will skip them. This is acceptable because the - // production HTTP path always supplies non-nil metadata (handler - // requires it explicitly; multi_coord_stress always sends an - // object). The edge case is intentionally not closed because - // closing it requires bumping the envelope version, which - // invalidates existing persisted indexes. - for id := range idx.meta { - idx.ids[id] = struct{}{} + // v2: explicit IDs field is the canonical source. v1 fallback: + // derive from meta keys, accepting that nil-meta items will be + // invisible to IDs()/merge until they get re-Add'd. Closes the + // scrum post_role_gate_v1 convergent finding (Opus + Kimi). + if env.Version >= 2 && env.IDs != nil { + for _, id := range env.IDs { + idx.ids[id] = struct{}{} + } + } else { + // v1 backward-compat path. Old envelopes don't carry ids + // explicitly; the metadata keyset is the best signal we have. + for id := range idx.meta { + idx.ids[id] = struct{}{} + } } return idx, nil } diff --git a/internal/vectord/index_test.go b/internal/vectord/index_test.go index c3a45ae..41113ae 100644 --- a/internal/vectord/index_test.go +++ b/internal/vectord/index_test.go @@ -6,6 +6,7 @@ import ( "errors" "fmt" "math" + "strings" "sync" "testing" ) @@ -189,6 +190,71 @@ func TestDecodeIndex_VersionMismatch(t *testing.T) { } } +// TestEncodeDecode_NilMetaItemsSurviveRoundTrip locks the +// post_role_gate_v1 scrum convergent finding (Opus + Kimi): +// items added with nil metadata MUST survive Encode→Decode and +// remain visible to IDs(). Pre-fix v1 envelope inferred ids from +// meta keys, silently dropping nil-meta items. v2 envelope carries +// the IDs slice explicitly. Test creates a worst-case where every +// item has nil metadata — pre-fix would yield IDs() == []. +func TestEncodeDecode_NilMetaItemsSurviveRoundTrip(t *testing.T) { + src, _ := NewIndex(IndexParams{Name: "nil_meta_test", Dimension: 4, Distance: DistanceCosine}) + for _, id := range []string{"a", "b", "c"} { + // nil meta — the case Opus + Kimi flagged. + if err := src.Add(id, []float32{1, 0, 0, 0}, nil); err != nil { + t.Fatalf("Add %s: %v", id, err) + } + } + if got := src.IDs(); len(got) != 3 { + t.Fatalf("pre-encode: expected 3 IDs, got %d", len(got)) + } + + var envBuf, graphBuf bytes.Buffer + if err := src.Encode(&envBuf, &graphBuf); err != nil { + t.Fatalf("Encode: %v", err) + } + dst, err := DecodeIndex(&envBuf, &graphBuf) + if err != nil { + t.Fatalf("DecodeIndex: %v", err) + } + if got := dst.IDs(); len(got) != 3 { + t.Errorf("post-decode: expected 3 IDs (nil-meta items must survive v2 round-trip), got %d %v", len(got), got) + } +} + +// TestDecodeIndex_V1BackwardCompat locks the legacy-shape fallback: +// envelope without an explicit "ids" field is still loadable. The +// v2 → v1 fallback path infers ids from meta keys (with the +// documented limitation for nil-meta items, which this test does +// NOT exercise — it only proves v1 envelopes still load). +func TestDecodeIndex_V1BackwardCompat(t *testing.T) { + // Hand-craft a v1 envelope (no IDs field). + envJSON := `{"version":1,"params":{"name":"v1_test","dimension":4,"distance":"cosine","m":16,"ef_search":20},"metadata":{"id1":{"foo":"bar"}}}` + // Empty graph stream — DecodeIndex should still succeed and + // emit an Index with id1 in i.ids inferred from meta. + src, _ := NewIndex(IndexParams{Name: "tmp", Dimension: 4}) + _ = src.Add("dummy", []float32{1, 0, 0, 0}, json.RawMessage(`{"x":1}`)) + var graphBuf bytes.Buffer + if err := src.g.Export(&graphBuf); err != nil { + t.Fatalf("export tmp graph for v1 fixture: %v", err) + } + dst, err := DecodeIndex(strings.NewReader(envJSON), &graphBuf) + if err != nil { + t.Fatalf("v1 envelope must still load, got %v", err) + } + // ids should contain "id1" (from the v1 metadata-key fallback). + hasID1 := false + for _, id := range dst.IDs() { + if id == "id1" { + hasID1 = true + break + } + } + if !hasID1 { + t.Errorf("v1 fallback didn't restore id from meta keys, got IDs=%v", dst.IDs()) + } +} + func TestRegistry_CreateGetDelete(t *testing.T) { r := NewRegistry() idx, err := r.Create(IndexParams{Name: "workers", Dimension: 4})