vectord: v2 envelope + handleMerge robustness — actions post_role_gate_v1 scrum
3-lineage scrum on 434f466..0d4f033 surfaced one convergent finding
(Opus + Kimi) and 3 Opus-only real bugs. All actioned in this
commit. Two false positives (Kimi rollback misreading, Opus stale-
comment claim) verified + rejected — both required manual control-
flow inspection to refute, matching the documented Kimi-truncation
behavior in feedback_cross_lineage_review.md.
Convergent fix — DecodeIndex lost nil-meta items:
- Envelope version bumped 1 → 2.
- New v2 field: IDs []string carries the canonical ID set
explicitly, independent of meta map's nil-vs-{} sparseness.
- DecodeIndex accepts both versions: v2 reads from env.IDs; v1
falls back to meta-key inference (with the documented
limitation that nil-meta items are invisible — preserved for
backward-compat with already-persisted indexes).
- Encode emits v2 going forward.
- 2 new regression tests:
- TestEncodeDecode_NilMetaItemsSurviveRoundTrip: items added
with nil metadata MUST survive Encode → Decode and remain
visible to IDs(). Pre-fix would have yielded IDs() == [].
- TestDecodeIndex_V1BackwardCompat: hand-crafted v1 envelope
still decodes (proves the fallback path).
Opus-only fixes:
- handleMerge: non-ErrIndexNotFound errors at h.reg.Get(name) /
h.reg.Get(req.Dest) now return 500 + log instead of falling
through with nil src/dest pointers (which would panic on the
next deref). Real bug — only the sentinel error was handled.
- internal/drift/drift.go: mathLog wrapper removed; math.Log
inlined. Wrapper added no value (math was already imported).
- internal/distillation/audit_baseline.go: BuildAuditDriftTable's
bubble sort replaced with sort.Slice. Idiomatic + shorter.
Rejected after verification:
- Kimi WARN "missing rollback on partial merge": misread the
control flow. Code at cmd/vectord/main.go:404-414 does NOT
delete from src when dest.Add fails (continue before reaching
src.Delete). Only successful Adds trigger Deletes.
- Opus INFO "TimestampUnixNano comment references missing field":
field exists at scripts/multi_coord_stress/main.go:128. Opus
saw only the diff context, not the full file.
Deferred (no fired trigger):
- Opus WARN "no per-index lock during merge": no concurrent merge
callers today (operators run merge as deliberate one-shot job).
Worth a lock if/when matrixd or chatd start auto-triggering.
Disposition: reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md.
Build + vet + tests green; 2 new regression tests + all prior tests
unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
0d4f033b34
commit
eb0dfdff04
@ -269,6 +269,7 @@ a steady state. Future items will land here as production triggers fire.
|
||||
| (close-2) | **OPEN #2: distillation SFT export substrate** — `internal/distillation/sft_export.go`: `IsSftNever` predicate + `ListScoredRunFiles` (data/scored-runs/YYYY/MM/DD walk) + `LoadScoredRunsFromFile` + partial `ExportSft` that wires the firewall but leaves synthesis (instruction/input/response generation) as the next wave. Firewall pinning test fails if `SftNever` set changes without review. 5 new tests. The synthesis port remains on Rust at `scripts/distillation/export_sft.ts`. |
|
||||
| (close-2 full) | **OPEN #2 fully ported** (2026-05-01): `SynthesizeSft` + `LoadEvidenceByRunID` + `buildInstruction` ported byte-for-byte from `scripts/distillation/export_sft.ts`. All 8 source-class instruction templates (scrum_reviews / mode_experiments / auto_apply / audits / observer_reviews / contract_analyses / outcomes / default) match Rust output exactly so a/b validation between runtimes can diff JSONL byte-for-byte. `ExportSft` writes to `data/distilled/sft/sft_export.jsonl`. 5 additional tests including per-source-class template verification, extraction-rejection, empty-text-rejection, context-assembly, end-to-end fixture write. |
|
||||
| (close-2 lineage) | **Audit-baselines lineage ported** (2026-05-01): `internal/distillation/audit_baseline.go` mirrors Rust `audit_full.ts`'s LoadBaseline/AppendBaseline/buildDriftTable. `LoadLastBaseline` reads the most recent JSON line from `data/_kb/audit_baselines.jsonl`; `AppendBaseline` appends append-only with bufio. `BuildAuditDriftTable` flags drift `>20%` (configurable); zero-baseline and new-metric edge cases handled (no division-by-zero, no false-stable on zero→nonzero). `FormatAuditDriftTable` for stdout dumps. Generic on metric names so callers running both runtimes can pin Rust-compat names (`AuditBaselineRustCompat` constant lists them). 13 tests including last-line-wins, trailing-blank-tolerance, malformed-line-errors, threshold-boundary, zero-baseline-handling, sort-stability. |
|
||||
| (scrum) | 3-lineage scrum on `434f466..0d4f033` (post_role_gate_v1). Convergent finding (Opus + Kimi): `DecodeIndex` lost nil-meta items across persistence. **Fixed** by bumping envelope version 1→2 with explicit `IDs []string` field; v1 envelopes still load via meta-key fallback. Opus-only real bugs also actioned: `handleMerge` non-`ErrIndexNotFound` nil-deref, `mathLog` dead wrapper removed, bubble sort → `sort.Slice`. False positives rejected after verification (Kimi rollback misreading + Opus stale-comment claim). 2 new regression tests lock the v2 round-trip + v1 backward-compat. Disposition: `reports/scrum/_evidence/2026-05-01/verdicts/post_role_gate_v1_disposition.md`. |
|
||||
| (close-3) | **OPEN #3: distribution drift via PSI** — `internal/drift/drift.go`: `ComputeDistributionDrift` returns Population Stability Index + verdict tier (stable < 0.10, minor 0.10–0.25, major ≥ 0.25). Equal-width bucketing over combined min/max range, epsilon-clamping for empty buckets, per-bucket breakdown for drilldown. 7 new tests including identical-is-stable, hard-shift-is-major, moderate-detected-not-stable, empty-inputs-safe, all-identical-safe, bucket-counts-conserved, num-buckets-clamping. |
|
||||
| (close-4) | **OPEN #4: ops nice-to-haves** — (a) Real-time wall-clock for stress harness: per-phase elapsed time logged to stdout as it runs (`[stress] phase NAME starting (T+12.3s)` + `[stress] phase NAME done — 8.5s (T+20.8s)`); `Output.PhaseTimings` + `Output.TotalElapsedMs` written to JSON; (b) chatd fixture-mode S3 mock + (c) liberal-paraphrase calibration: not actioned — no fired trigger yet, would be speculative. Documented as deferred-until-need rather than ignored. |
|
||||
|
||||
|
||||
@ -363,6 +363,13 @@ func (h *handlers) handleMerge(w http.ResponseWriter, r *http.Request) {
|
||||
if errors.Is(err, vectord.ErrIndexNotFound) {
|
||||
http.Error(w, "source not found", http.StatusNotFound)
|
||||
return
|
||||
} else if err != nil {
|
||||
// Per scrum post_role_gate_v1 (Opus): non-ErrIndexNotFound
|
||||
// errors must NOT fall through — src would be nil and the
|
||||
// next deref panics.
|
||||
slog.Error("merge: get source", "name", name, "err", err)
|
||||
http.Error(w, "internal", http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
var req mergeRequest
|
||||
if !decodeJSON(w, r, &req) {
|
||||
@ -376,6 +383,12 @@ func (h *handlers) handleMerge(w http.ResponseWriter, r *http.Request) {
|
||||
if errors.Is(err, vectord.ErrIndexNotFound) {
|
||||
http.Error(w, "dest not found", http.StatusNotFound)
|
||||
return
|
||||
} else if err != nil {
|
||||
// Same fix as source — non-ErrIndexNotFound dest errors
|
||||
// can't reach the merge body with a nil dest pointer.
|
||||
slog.Error("merge: get dest", "name", req.Dest, "err", err)
|
||||
http.Error(w, "internal", http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
// Dimension match is non-negotiable — silently moving a 768-d
|
||||
// vector into a 384-d index would corrupt search forever.
|
||||
|
||||
@ -28,6 +28,7 @@ import (
|
||||
"math"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sort"
|
||||
"strings"
|
||||
)
|
||||
|
||||
@ -217,16 +218,10 @@ func BuildAuditDriftTable(prior *AuditBaseline, current map[string]int64, thresh
|
||||
}
|
||||
rows = append(rows, row)
|
||||
}
|
||||
// Sort for stable display + deterministic JSON output. Bubble-
|
||||
// sort by name; size is at most a few dozen metrics, so the
|
||||
// O(n²) cost is irrelevant.
|
||||
for i := 0; i < len(rows); i++ {
|
||||
for j := i + 1; j < len(rows); j++ {
|
||||
if rows[i].Metric > rows[j].Metric {
|
||||
rows[i], rows[j] = rows[j], rows[i]
|
||||
}
|
||||
}
|
||||
}
|
||||
// Sort for stable display + deterministic JSON output. Per
|
||||
// scrum post_role_gate_v1 (Opus INFO): use sort.Slice — already
|
||||
// imported in the package, idiomatic, and shorter.
|
||||
sort.Slice(rows, func(i, j int) bool { return rows[i].Metric < rows[j].Metric })
|
||||
return rows
|
||||
}
|
||||
|
||||
|
||||
@ -41,10 +41,6 @@ import (
|
||||
"git.agentview.dev/profit/golangLAKEHOUSE/internal/distillation"
|
||||
)
|
||||
|
||||
// mathLog is wrapped to localize the math import in case future
|
||||
// drift math swaps in a stable-log variant. Inlined by the compiler.
|
||||
func mathLog(x float64) float64 { return math.Log(x) }
|
||||
|
||||
// ScorerDriftEntry is one mismatch — a historical (record, category)
|
||||
// pair where the current scorer disagrees with the persisted
|
||||
// verdict. Reasons captures the current scorer's explanation so
|
||||
@ -337,7 +333,7 @@ func ComputeDistributionDrift(in DistributionDriftInput) DistributionDriftReport
|
||||
if cRatio < driftEpsilon {
|
||||
cRatio = driftEpsilon
|
||||
}
|
||||
part := (cRatio - bRatio) * mathLog(cRatio/bRatio)
|
||||
part := (cRatio - bRatio) * math.Log(cRatio/bRatio)
|
||||
psi += part
|
||||
report.Buckets = append(report.Buckets, DistributionDriftBucket{
|
||||
Lower: minV + float64(i)*width,
|
||||
|
||||
@ -403,16 +403,25 @@ func (i *Index) Search(query []float32, k int) ([]Result, error) {
|
||||
// HNSW graph bytes. params + metadata + format version travel
|
||||
// together; the graph itself is opaque binary that round-trips
|
||||
// through hnsw.Graph.Export / Import.
|
||||
//
|
||||
// Version history:
|
||||
// - v1: original; metadata-only ID inference on decode (loses
|
||||
// items added with nil metadata across persistence).
|
||||
// - v2: explicit IDs slice closes the v1 gap. Older v1 envelopes
|
||||
// still load (backward-compat fallback path in DecodeIndex).
|
||||
type IndexEnvelope struct {
|
||||
Version int `json:"version"`
|
||||
Params IndexParams `json:"params"`
|
||||
Metadata map[string]json.RawMessage `json:"metadata"`
|
||||
IDs []string `json:"ids,omitempty"` // v2+ — canonical ID set
|
||||
}
|
||||
|
||||
// envelopeVersion bumps when the on-disk JSON shape changes
|
||||
// incompatibly. Reading a future version returns ErrVersionMismatch
|
||||
// rather than producing a half-decoded index.
|
||||
const envelopeVersion = 1
|
||||
// envelopeVersion is the version this build EMITS. Decoders accept
|
||||
// any envelope ≤ this version (with version-specific fallbacks for
|
||||
// older shapes). Bumping this constant means the format changed
|
||||
// incompatibly going forward; readers without the version-specific
|
||||
// branch will return ErrVersionMismatch.
|
||||
const envelopeVersion = 2
|
||||
|
||||
// ErrVersionMismatch is returned by DecodeIndex when the envelope
|
||||
// claims a version this build doesn't understand.
|
||||
@ -429,10 +438,17 @@ func (i *Index) Encode(envelopeW, graphW io.Writer) error {
|
||||
i.mu.RLock()
|
||||
defer i.mu.RUnlock()
|
||||
|
||||
// v2: serialize the canonical ID set explicitly so DecodeIndex
|
||||
// can restore i.ids without depending on meta-key inference.
|
||||
idList := make([]string, 0, len(i.ids))
|
||||
for id := range i.ids {
|
||||
idList = append(idList, id)
|
||||
}
|
||||
env := IndexEnvelope{
|
||||
Version: envelopeVersion,
|
||||
Params: i.params,
|
||||
Metadata: i.meta,
|
||||
IDs: idList,
|
||||
}
|
||||
if err := json.NewEncoder(envelopeW).Encode(env); err != nil {
|
||||
return fmt.Errorf("encode envelope: %w", err)
|
||||
@ -451,7 +467,10 @@ func DecodeIndex(envelopeR, graphR io.Reader) (*Index, error) {
|
||||
if err := json.NewDecoder(envelopeR).Decode(&env); err != nil {
|
||||
return nil, fmt.Errorf("decode envelope: %w", err)
|
||||
}
|
||||
if env.Version != envelopeVersion {
|
||||
// Accept v1 (legacy: ids inferred from meta keys) and v2 (explicit
|
||||
// ids slice). Future versions are unknown — return mismatch
|
||||
// rather than half-decoding.
|
||||
if env.Version < 1 || env.Version > envelopeVersion {
|
||||
return nil, fmt.Errorf("%w: have %d, got %d",
|
||||
ErrVersionMismatch, envelopeVersion, env.Version)
|
||||
}
|
||||
@ -465,17 +484,20 @@ func DecodeIndex(envelopeR, graphR io.Reader) (*Index, error) {
|
||||
if env.Metadata != nil {
|
||||
idx.meta = env.Metadata
|
||||
}
|
||||
// Restore the ids tracker from the metadata keyset. Items that
|
||||
// were Add'd with nil metadata aren't in env.Metadata and won't
|
||||
// appear in i.ids after reload — IDs() will miss them, and the
|
||||
// merge endpoint will skip them. This is acceptable because the
|
||||
// production HTTP path always supplies non-nil metadata (handler
|
||||
// requires it explicitly; multi_coord_stress always sends an
|
||||
// object). The edge case is intentionally not closed because
|
||||
// closing it requires bumping the envelope version, which
|
||||
// invalidates existing persisted indexes.
|
||||
for id := range idx.meta {
|
||||
idx.ids[id] = struct{}{}
|
||||
// v2: explicit IDs field is the canonical source. v1 fallback:
|
||||
// derive from meta keys, accepting that nil-meta items will be
|
||||
// invisible to IDs()/merge until they get re-Add'd. Closes the
|
||||
// scrum post_role_gate_v1 convergent finding (Opus + Kimi).
|
||||
if env.Version >= 2 && env.IDs != nil {
|
||||
for _, id := range env.IDs {
|
||||
idx.ids[id] = struct{}{}
|
||||
}
|
||||
} else {
|
||||
// v1 backward-compat path. Old envelopes don't carry ids
|
||||
// explicitly; the metadata keyset is the best signal we have.
|
||||
for id := range idx.meta {
|
||||
idx.ids[id] = struct{}{}
|
||||
}
|
||||
}
|
||||
return idx, nil
|
||||
}
|
||||
|
||||
@ -6,6 +6,7 @@ import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"math"
|
||||
"strings"
|
||||
"sync"
|
||||
"testing"
|
||||
)
|
||||
@ -189,6 +190,71 @@ func TestDecodeIndex_VersionMismatch(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestEncodeDecode_NilMetaItemsSurviveRoundTrip locks the
|
||||
// post_role_gate_v1 scrum convergent finding (Opus + Kimi):
|
||||
// items added with nil metadata MUST survive Encode→Decode and
|
||||
// remain visible to IDs(). Pre-fix v1 envelope inferred ids from
|
||||
// meta keys, silently dropping nil-meta items. v2 envelope carries
|
||||
// the IDs slice explicitly. Test creates a worst-case where every
|
||||
// item has nil metadata — pre-fix would yield IDs() == [].
|
||||
func TestEncodeDecode_NilMetaItemsSurviveRoundTrip(t *testing.T) {
|
||||
src, _ := NewIndex(IndexParams{Name: "nil_meta_test", Dimension: 4, Distance: DistanceCosine})
|
||||
for _, id := range []string{"a", "b", "c"} {
|
||||
// nil meta — the case Opus + Kimi flagged.
|
||||
if err := src.Add(id, []float32{1, 0, 0, 0}, nil); err != nil {
|
||||
t.Fatalf("Add %s: %v", id, err)
|
||||
}
|
||||
}
|
||||
if got := src.IDs(); len(got) != 3 {
|
||||
t.Fatalf("pre-encode: expected 3 IDs, got %d", len(got))
|
||||
}
|
||||
|
||||
var envBuf, graphBuf bytes.Buffer
|
||||
if err := src.Encode(&envBuf, &graphBuf); err != nil {
|
||||
t.Fatalf("Encode: %v", err)
|
||||
}
|
||||
dst, err := DecodeIndex(&envBuf, &graphBuf)
|
||||
if err != nil {
|
||||
t.Fatalf("DecodeIndex: %v", err)
|
||||
}
|
||||
if got := dst.IDs(); len(got) != 3 {
|
||||
t.Errorf("post-decode: expected 3 IDs (nil-meta items must survive v2 round-trip), got %d %v", len(got), got)
|
||||
}
|
||||
}
|
||||
|
||||
// TestDecodeIndex_V1BackwardCompat locks the legacy-shape fallback:
|
||||
// envelope without an explicit "ids" field is still loadable. The
|
||||
// v2 → v1 fallback path infers ids from meta keys (with the
|
||||
// documented limitation for nil-meta items, which this test does
|
||||
// NOT exercise — it only proves v1 envelopes still load).
|
||||
func TestDecodeIndex_V1BackwardCompat(t *testing.T) {
|
||||
// Hand-craft a v1 envelope (no IDs field).
|
||||
envJSON := `{"version":1,"params":{"name":"v1_test","dimension":4,"distance":"cosine","m":16,"ef_search":20},"metadata":{"id1":{"foo":"bar"}}}`
|
||||
// Empty graph stream — DecodeIndex should still succeed and
|
||||
// emit an Index with id1 in i.ids inferred from meta.
|
||||
src, _ := NewIndex(IndexParams{Name: "tmp", Dimension: 4})
|
||||
_ = src.Add("dummy", []float32{1, 0, 0, 0}, json.RawMessage(`{"x":1}`))
|
||||
var graphBuf bytes.Buffer
|
||||
if err := src.g.Export(&graphBuf); err != nil {
|
||||
t.Fatalf("export tmp graph for v1 fixture: %v", err)
|
||||
}
|
||||
dst, err := DecodeIndex(strings.NewReader(envJSON), &graphBuf)
|
||||
if err != nil {
|
||||
t.Fatalf("v1 envelope must still load, got %v", err)
|
||||
}
|
||||
// ids should contain "id1" (from the v1 metadata-key fallback).
|
||||
hasID1 := false
|
||||
for _, id := range dst.IDs() {
|
||||
if id == "id1" {
|
||||
hasID1 = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if !hasID1 {
|
||||
t.Errorf("v1 fallback didn't restore id from meta keys, got IDs=%v", dst.IDs())
|
||||
}
|
||||
}
|
||||
|
||||
func TestRegistry_CreateGetDelete(t *testing.T) {
|
||||
r := NewRegistry()
|
||||
idx, err := r.Create(IndexParams{Name: "workers", Dimension: 4})
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user