vectord: v2 envelope + handleMerge robustness — actions post_role_gate_v1 scrum

3-lineage scrum on 434f466..0d4f033 surfaced one convergent finding (Opus + Kimi) and 3 Opus-only real bugs. All actioned in this commit. Two false positives (Kimi rollback misreading, Opus stale- comment claim) verified + rejected — both required manual control- flow inspection to refute, matching the documented Kimi-truncation behavior in feedback_cross_lineage_review.md. Convergent fix — DecodeIndex lost nil-meta items: - Envelope version bumped 1 → 2. - New v2 field: IDs []string carries the canonical ID set explicitly, independent of meta map's nil-vs-{} sparseness. - DecodeIndex accepts both versions: v2 reads from env.IDs; v1 falls back to meta-key inference (with the documented limitation that nil-meta items are invisible — preserved for backward-compat with already-persisted indexes). - Encode emits v2 going forward. - 2 new regression tests: - TestEncodeDecode_NilMetaItemsSurviveRoundTrip: items added with nil metadata MUST survive Encode → Decode and remain visible to IDs(). Pre-fix would have yielded IDs() == []. - TestDecodeIndex_V1BackwardCompat: hand-crafted v1 envelope still decodes (proves the fallback path). Opus-only fixes: - handleMerge: non-ErrIndexNotFound errors at h.reg.Get(name) / h.reg.Get(req.Dest) now return 500 + log instead of falling through with nil src/dest pointers (which would panic on the next deref). Real bug — only the sentinel error was handled. - internal/drift/drift.go: mathLog wrapper removed; math.Log inlined. Wrapper added no value (math was already imported). - internal/distillation/audit_baseline.go: BuildAuditDriftTable's bubble sort replaced with sort.Slice. Idiomatic + shorter. Rejected after verification: - Kimi WARN "missing rollback on partial merge": misread the control flow. Code at cmd/vectord/main.go:404-414 does NOT delete from src when dest.Add fails (continue before reaching src.Delete). Only successful Adds trigger Deletes. - Opus INFO "TimestampUnixNano comment references missing field": field exists at scripts/multi_coord_stress/main.go:128. Opus saw only the diff context, not the full file. Deferred (no fired trigger): - Opus WARN "no per-index lock during merge": no concurrent merge callers today (operators run merge as deliberate one-shot job). Worth a lock if/when matrixd or chatd start auto-triggering. Disposition: reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md. Build + vet + tests green; 2 new regression tests + all prior tests unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 01:20:37 -05:00 · 2026-05-01 01:20:37 -05:00 · eb0dfdff04
commit eb0dfdff04
parent 0d4f033b34
6 changed files with 124 additions and 31 deletions
--- a/STATE_OF_PLAY.md
+++ b/STATE_OF_PLAY.md
@ -269,6 +269,7 @@ a steady state. Future items will land here as production triggers fire.
 | (close-2) | **OPEN #2: distillation SFT export substrate** — `internal/distillation/sft_export.go`: `IsSftNever` predicate + `ListScoredRunFiles` (data/scored-runs/YYYY/MM/DD walk) + `LoadScoredRunsFromFile` + partial `ExportSft` that wires the firewall but leaves synthesis (instruction/input/response generation) as the next wave. Firewall pinning test fails if `SftNever` set changes without review. 5 new tests. The synthesis port remains on Rust at `scripts/distillation/export_sft.ts`. |
 | (close-2 full) | **OPEN #2 fully ported** (2026-05-01): `SynthesizeSft` + `LoadEvidenceByRunID` + `buildInstruction` ported byte-for-byte from `scripts/distillation/export_sft.ts`. All 8 source-class instruction templates (scrum_reviews / mode_experiments / auto_apply / audits / observer_reviews / contract_analyses / outcomes / default) match Rust output exactly so a/b validation between runtimes can diff JSONL byte-for-byte. `ExportSft` writes to `data/distilled/sft/sft_export.jsonl`. 5 additional tests including per-source-class template verification, extraction-rejection, empty-text-rejection, context-assembly, end-to-end fixture write. |
 | (close-2 lineage) | **Audit-baselines lineage ported** (2026-05-01): `internal/distillation/audit_baseline.go` mirrors Rust `audit_full.ts`'s LoadBaseline/AppendBaseline/buildDriftTable. `LoadLastBaseline` reads the most recent JSON line from `data/_kb/audit_baselines.jsonl`; `AppendBaseline` appends append-only with bufio. `BuildAuditDriftTable` flags drift `>20%` (configurable); zero-baseline and new-metric edge cases handled (no division-by-zero, no false-stable on zero→nonzero). `FormatAuditDriftTable` for stdout dumps. Generic on metric names so callers running both runtimes can pin Rust-compat names (`AuditBaselineRustCompat` constant lists them). 13 tests including last-line-wins, trailing-blank-tolerance, malformed-line-errors, threshold-boundary, zero-baseline-handling, sort-stability. |
+| (scrum) | 3-lineage scrum on `434f466..0d4f033` (post_role_gate_v1). Convergent finding (Opus + Kimi): `DecodeIndex` lost nil-meta items across persistence. **Fixed** by bumping envelope version 1→2 with explicit `IDs []string` field; v1 envelopes still load via meta-key fallback. Opus-only real bugs also actioned: `handleMerge` non-`ErrIndexNotFound` nil-deref, `mathLog` dead wrapper removed, bubble sort → `sort.Slice`. False positives rejected after verification (Kimi rollback misreading + Opus stale-comment claim). 2 new regression tests lock the v2 round-trip + v1 backward-compat. Disposition: `reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md`. |
 | (close-3) | **OPEN #3: distribution drift via PSI** — `internal/drift/drift.go`: `ComputeDistributionDrift` returns Population Stability Index + verdict tier (stable < 0.10, minor 0.10–0.25, major ≥ 0.25). Equal-width bucketing over combined min/max range, epsilon-clamping for empty buckets, per-bucket breakdown for drilldown. 7 new tests including identical-is-stable, hard-shift-is-major, moderate-detected-not-stable, empty-inputs-safe, all-identical-safe, bucket-counts-conserved, num-buckets-clamping. |
 | (close-4) | **OPEN #4: ops nice-to-haves** — (a) Real-time wall-clock for stress harness: per-phase elapsed time logged to stdout as it runs (`[stress] phase NAME starting (T+12.3s)` + `[stress] phase NAME done — 8.5s (T+20.8s)`); `Output.PhaseTimings` + `Output.TotalElapsedMs` written to JSON; (b) chatd fixture-mode S3 mock + (c) liberal-paraphrase calibration: not actioned — no fired trigger yet, would be speculative. Documented as deferred-until-need rather than ignored. |

--- a/cmd/vectord/main.go
+++ b/cmd/vectord/main.go
@ -363,6 +363,13 @@ func (h *handlers) handleMerge(w http.ResponseWriter, r *http.Request) {
 	if errors.Is(err, vectord.ErrIndexNotFound) {
 		http.Error(w, "source not found", http.StatusNotFound)
 		return
+	} else if err != nil {
+		// Per scrum post_role_gate_v1 (Opus): non-ErrIndexNotFound
+		// errors must NOT fall through — src would be nil and the
+		// next deref panics.
+		slog.Error("merge: get source", "name", name, "err", err)
+		http.Error(w, "internal", http.StatusInternalServerError)
+		return
 	}
 	var req mergeRequest
 	if !decodeJSON(w, r, &req) {
@ -376,6 +383,12 @@ func (h *handlers) handleMerge(w http.ResponseWriter, r *http.Request) {
 	if errors.Is(err, vectord.ErrIndexNotFound) {
 		http.Error(w, "dest not found", http.StatusNotFound)
 		return
+	} else if err != nil {
+		// Same fix as source — non-ErrIndexNotFound dest errors
+		// can't reach the merge body with a nil dest pointer.
+		slog.Error("merge: get dest", "name", req.Dest, "err", err)
+		http.Error(w, "internal", http.StatusInternalServerError)
+		return
 	}
 	// Dimension match is non-negotiable — silently moving a 768-d
 	// vector into a 384-d index would corrupt search forever.
--- a/internal/distillation/audit_baseline.go
+++ b/internal/distillation/audit_baseline.go
@ -28,6 +28,7 @@ import (
 	"math"
 	"os"
 	"path/filepath"
+	"sort"
 	"strings"
 )

@ -217,16 +218,10 @@ func BuildAuditDriftTable(prior *AuditBaseline, current map[string]int64, thresh
 		}
 		rows = append(rows, row)
 	}
-	// Sort for stable display + deterministic JSON output. Bubble-
-	// sort by name; size is at most a few dozen metrics, so the
-	// O(n²) cost is irrelevant.
-	for i := 0; i < len(rows); i++ {
-		for j := i + 1; j < len(rows); j++ {
-			if rows[i].Metric > rows[j].Metric {
-				rows[i], rows[j] = rows[j], rows[i]
-			}
-		}
-	}
+	// Sort for stable display + deterministic JSON output. Per
+	// scrum post_role_gate_v1 (Opus INFO): use sort.Slice — already
+	// imported in the package, idiomatic, and shorter.
+	sort.Slice(rows, func(i, j int) bool { return rows[i].Metric < rows[j].Metric })
 	return rows
 }

--- a/internal/drift/drift.go
+++ b/internal/drift/drift.go
@ -41,10 +41,6 @@ import (
 	"git.agentview.dev/profit/golangLAKEHOUSE/internal/distillation"
 )

-// mathLog is wrapped to localize the math import in case future
-// drift math swaps in a stable-log variant. Inlined by the compiler.
-func mathLog(x float64) float64 { return math.Log(x) }
-
 // ScorerDriftEntry is one mismatch — a historical (record, category)
 // pair where the current scorer disagrees with the persisted
 // verdict. Reasons captures the current scorer's explanation so
@ -337,7 +333,7 @@ func ComputeDistributionDrift(in DistributionDriftInput) DistributionDriftReport
 		if cRatio < driftEpsilon {
 			cRatio = driftEpsilon
 		}
-		part := (cRatio - bRatio) * mathLog(cRatio/bRatio)
+		part := (cRatio - bRatio) * math.Log(cRatio/bRatio)
 		psi += part
 		report.Buckets = append(report.Buckets, DistributionDriftBucket{
 			Lower:         minV + float64(i)*width,
--- a/internal/vectord/index.go
+++ b/internal/vectord/index.go
@ -403,16 +403,25 @@ func (i *Index) Search(query []float32, k int) ([]Result, error) {
 // HNSW graph bytes. params + metadata + format version travel
 // together; the graph itself is opaque binary that round-trips
 // through hnsw.Graph.Export / Import.
+//
+// Version history:
+//   - v1: original; metadata-only ID inference on decode (loses
+//     items added with nil metadata across persistence).
+//   - v2: explicit IDs slice closes the v1 gap. Older v1 envelopes
+//     still load (backward-compat fallback path in DecodeIndex).
 type IndexEnvelope struct {
 	Version  int                        `json:"version"`
 	Params   IndexParams                `json:"params"`
 	Metadata map[string]json.RawMessage `json:"metadata"`
+	IDs      []string                   `json:"ids,omitempty"` // v2+ — canonical ID set
 }

-// envelopeVersion bumps when the on-disk JSON shape changes
-// incompatibly. Reading a future version returns ErrVersionMismatch
-// rather than producing a half-decoded index.
-const envelopeVersion = 1
+// envelopeVersion is the version this build EMITS. Decoders accept
+// any envelope ≤ this version (with version-specific fallbacks for
+// older shapes). Bumping this constant means the format changed
+// incompatibly going forward; readers without the version-specific
+// branch will return ErrVersionMismatch.
+const envelopeVersion = 2

 // ErrVersionMismatch is returned by DecodeIndex when the envelope
 // claims a version this build doesn't understand.
@ -429,10 +438,17 @@ func (i *Index) Encode(envelopeW, graphW io.Writer) error {
 	i.mu.RLock()
 	defer i.mu.RUnlock()

+	// v2: serialize the canonical ID set explicitly so DecodeIndex
+	// can restore i.ids without depending on meta-key inference.
+	idList := make([]string, 0, len(i.ids))
+	for id := range i.ids {
+		idList = append(idList, id)
+	}
 	env := IndexEnvelope{
 		Version:  envelopeVersion,
 		Params:   i.params,
 		Metadata: i.meta,
+		IDs:      idList,
 	}
 	if err := json.NewEncoder(envelopeW).Encode(env); err != nil {
 		return fmt.Errorf("encode envelope: %w", err)
@ -451,7 +467,10 @@ func DecodeIndex(envelopeR, graphR io.Reader) (*Index, error) {
 	if err := json.NewDecoder(envelopeR).Decode(&env); err != nil {
 		return nil, fmt.Errorf("decode envelope: %w", err)
 	}
-	if env.Version != envelopeVersion {
+	// Accept v1 (legacy: ids inferred from meta keys) and v2 (explicit
+	// ids slice). Future versions are unknown — return mismatch
+	// rather than half-decoding.
+	if env.Version < 1 || env.Version > envelopeVersion {
 		return nil, fmt.Errorf("%w: have %d, got %d",
 			ErrVersionMismatch, envelopeVersion, env.Version)
 	}
@ -465,17 +484,20 @@ func DecodeIndex(envelopeR, graphR io.Reader) (*Index, error) {
 	if env.Metadata != nil {
 		idx.meta = env.Metadata
 	}
-	// Restore the ids tracker from the metadata keyset. Items that
-	// were Add'd with nil metadata aren't in env.Metadata and won't
-	// appear in i.ids after reload — IDs() will miss them, and the
-	// merge endpoint will skip them. This is acceptable because the
-	// production HTTP path always supplies non-nil metadata (handler
-	// requires it explicitly; multi_coord_stress always sends an
-	// object). The edge case is intentionally not closed because
-	// closing it requires bumping the envelope version, which
-	// invalidates existing persisted indexes.
-	for id := range idx.meta {
-		idx.ids[id] = struct{}{}
+	// v2: explicit IDs field is the canonical source. v1 fallback:
+	// derive from meta keys, accepting that nil-meta items will be
+	// invisible to IDs()/merge until they get re-Add'd. Closes the
+	// scrum post_role_gate_v1 convergent finding (Opus + Kimi).
+	if env.Version >= 2 && env.IDs != nil {
+		for _, id := range env.IDs {
+			idx.ids[id] = struct{}{}
+		}
+	} else {
+		// v1 backward-compat path. Old envelopes don't carry ids
+		// explicitly; the metadata keyset is the best signal we have.
+		for id := range idx.meta {
+			idx.ids[id] = struct{}{}
+		}
 	}
 	return idx, nil
 }
--- a/internal/vectord/index_test.go
+++ b/internal/vectord/index_test.go
@ -6,6 +6,7 @@ import (
 	"errors"
 	"fmt"
 	"math"
+	"strings"
 	"sync"
 	"testing"
 )
@ -189,6 +190,71 @@ func TestDecodeIndex_VersionMismatch(t *testing.T) {
 	}
 }

+// TestEncodeDecode_NilMetaItemsSurviveRoundTrip locks the
+// post_role_gate_v1 scrum convergent finding (Opus + Kimi):
+// items added with nil metadata MUST survive Encode→Decode and
+// remain visible to IDs(). Pre-fix v1 envelope inferred ids from
+// meta keys, silently dropping nil-meta items. v2 envelope carries
+// the IDs slice explicitly. Test creates a worst-case where every
+// item has nil metadata — pre-fix would yield IDs() == [].
+func TestEncodeDecode_NilMetaItemsSurviveRoundTrip(t *testing.T) {
+	src, _ := NewIndex(IndexParams{Name: "nil_meta_test", Dimension: 4, Distance: DistanceCosine})
+	for _, id := range []string{"a", "b", "c"} {
+		// nil meta — the case Opus + Kimi flagged.
+		if err := src.Add(id, []float32{1, 0, 0, 0}, nil); err != nil {
+			t.Fatalf("Add %s: %v", id, err)
+		}
+	}
+	if got := src.IDs(); len(got) != 3 {
+		t.Fatalf("pre-encode: expected 3 IDs, got %d", len(got))
+	}
+
+	var envBuf, graphBuf bytes.Buffer
+	if err := src.Encode(&envBuf, &graphBuf); err != nil {
+		t.Fatalf("Encode: %v", err)
+	}
+	dst, err := DecodeIndex(&envBuf, &graphBuf)
+	if err != nil {
+		t.Fatalf("DecodeIndex: %v", err)
+	}
+	if got := dst.IDs(); len(got) != 3 {
+		t.Errorf("post-decode: expected 3 IDs (nil-meta items must survive v2 round-trip), got %d %v", len(got), got)
+	}
+}
+
+// TestDecodeIndex_V1BackwardCompat locks the legacy-shape fallback:
+// envelope without an explicit "ids" field is still loadable. The
+// v2 → v1 fallback path infers ids from meta keys (with the
+// documented limitation for nil-meta items, which this test does
+// NOT exercise — it only proves v1 envelopes still load).
+func TestDecodeIndex_V1BackwardCompat(t *testing.T) {
+	// Hand-craft a v1 envelope (no IDs field).
+	envJSON := `{"version":1,"params":{"name":"v1_test","dimension":4,"distance":"cosine","m":16,"ef_search":20},"metadata":{"id1":{"foo":"bar"}}}`
+	// Empty graph stream — DecodeIndex should still succeed and
+	// emit an Index with id1 in i.ids inferred from meta.
+	src, _ := NewIndex(IndexParams{Name: "tmp", Dimension: 4})
+	_ = src.Add("dummy", []float32{1, 0, 0, 0}, json.RawMessage(`{"x":1}`))
+	var graphBuf bytes.Buffer
+	if err := src.g.Export(&graphBuf); err != nil {
+		t.Fatalf("export tmp graph for v1 fixture: %v", err)
+	}
+	dst, err := DecodeIndex(strings.NewReader(envJSON), &graphBuf)
+	if err != nil {
+		t.Fatalf("v1 envelope must still load, got %v", err)
+	}
+	// ids should contain "id1" (from the v1 metadata-key fallback).
+	hasID1 := false
+	for _, id := range dst.IDs() {
+		if id == "id1" {
+			hasID1 = true
+			break
+		}
+	}
+	if !hasID1 {
+		t.Errorf("v1 fallback didn't restore id from meta keys, got IDs=%v", dst.IDs())
+	}
+}
+
 func TestRegistry_CreateGetDelete(t *testing.T) {
 	r := NewRegistry()
 	idx, err := r.Create(IndexParams{Name: "workers", Dimension: 4})