distillation.score, drift.scorer)
Lands the workflow.Mode adapters for the §3.4 components + the
distillation scorer + drift quantifier. Workflows can now compose
real measurement capabilities; the substrate's parallel
capabilities become composable Lego bricks (per the prior commit's
closing insight).
Modes registered (in observerd's registerBuiltinModes):
  Pure-function wrappers (no I/O):
    - matrix.relevance   → matrix.FilterChunks
    - matrix.downgrade   → matrix.MaybeDowngrade
    - distillation.score → distillation.ScoreRecord
    - drift.scorer       → drift.ComputeScorerDrift
  HTTP-backed:
    - matrix.search      → POST matrixd /matrix/search
                           (registered only when matrixd_url is set)
  Fixture (kept from §3.8 first slice):
    - fixture.echo, fixture.upper
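The conditional registration of matrix.search could look roughly like the sketch below. The Registry type, the stub helper, and the registerBuiltinModes signature are assumptions made for illustration only; the real observerd code is not shown in this commit page.

```go
package main

import (
	"fmt"
	"sort"
)

// Mode is a hypothetical stand-in for workflow.Mode.
type Mode func(input map[string]any) (map[string]any, error)

// Registry is an illustrative mode catalog keyed by mode name.
type Registry struct{ modes map[string]Mode }

func NewRegistry() *Registry { return &Registry{modes: map[string]Mode{}} }

func (r *Registry) Register(name string, m Mode) { r.modes[name] = m }

// Names returns the registered mode names in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.modes))
	for n := range r.modes {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// stub builds a placeholder mode that only reports its own name.
func stub(name string) Mode {
	return func(map[string]any) (map[string]any, error) {
		return map[string]any{"mode": name}, nil
	}
}

// registerBuiltinModes always registers the pure-function and fixture
// modes, and adds the HTTP-backed matrix.search only when a matrixd URL
// is configured (assumed here to be a plain string parameter).
func registerBuiltinModes(r *Registry, matrixdURL string) {
	for _, name := range []string{
		"matrix.relevance", "matrix.downgrade",
		"distillation.score", "drift.scorer",
		"fixture.echo", "fixture.upper",
	} {
		r.Register(name, stub(name))
	}
	if matrixdURL != "" {
		r.Register("matrix.search", stub("matrix.search"))
	}
}

func main() {
	r := NewRegistry()
	registerBuiltinModes(r, "http://matrixd.example:8080") // placeholder URL
	fmt.Println(len(r.Names()), r.Names())
}
```

With a URL set the catalog holds seven modes; with an empty URL matrix.search is absent and a workflow that names it would fail at lookup rather than at request time.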
internal/workflow/modes.go:
Each mode follows the same glue pattern: marshal generic input
through a typed struct (free schema validation + clear error
messages), call the underlying capability, return a generic
output map. Roundtrip-via-JSON gives us schema validation
without writing custom field-by-field coercion.
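A minimal sketch of that glue pattern, assuming a hypothetical downgradeInput struct and a toy maybeDowngrade rule standing in for matrix.MaybeDowngrade (the real rule and field names are not shown here):

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// downgradeInput is a hypothetical typed view of the generic input map.
type downgradeInput struct {
	Mode  string `json:"mode"`
	Model string `json:"model"`
}

// decodeInto round-trips a generic map through JSON into a typed struct.
// Type mismatches and unknown keys surface as decode errors.
func decodeInto(in map[string]any, dst any) error {
	raw, err := json.Marshal(in)
	if err != nil {
		return fmt.Errorf("marshal input: %w", err)
	}
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(dst); err != nil {
		return fmt.Errorf("decode input: %w", err)
	}
	return nil
}

// maybeDowngrade is a toy stand-in for the underlying capability.
// The "strong model" set is invented for this example.
func maybeDowngrade(mode, model string) string {
	strong := map[string]bool{"x-ai/grok-4.1-fast": true}
	if mode == "codereview_lakehouse" && strong[model] {
		return "codereview_isolation"
	}
	return mode
}

// downgradeMode follows the three steps: typed decode, capability call,
// generic output map.
func downgradeMode(in map[string]any) (map[string]any, error) {
	var req downgradeInput
	if err := decodeInto(in, &req); err != nil {
		return nil, err
	}
	if req.Mode == "" || req.Model == "" {
		return nil, errors.New("matrix.downgrade: mode and model are required")
	}
	newMode := maybeDowngrade(req.Mode, req.Model)
	out := map[string]any{"mode": newMode}
	if newMode != req.Mode {
		out["downgraded_from"] = req.Mode
	}
	return out, nil
}

func main() {
	out, err := downgradeMode(map[string]any{
		"mode":  "codereview_lakehouse",
		"model": "x-ai/grok-4.1-fast",
	})
	fmt.Println(out, err)
}
```

One caveat of the roundtrip: numbers decoded into `any` come back as float64, so callers that type-assert on output values need to account for that.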
internal/workflow/modes_test.go (10 tests, all PASS):
  - matrix.relevance filters adjacency pollution (Connector kept,
    catalogd::Registry dropped; same headline as the relevance
    smoke, run through the workflow mode)
  - matrix.downgrade flips lakehouse→isolation on strong model;
    keeps lakehouse on weak (qwen3.5:latest); errors on missing
    fields
  - distillation.score rates scrum_review attempt_1 as accepted;
    rejects empty record
  - drift.scorer reports zero drift on matched inputs; errors on
    empty inputs slice
  - matrix.search HTTP flow round-trips through httptest fake
    matrixd; non-OK status surfaces a clear error
scripts/workflow_smoke.sh (5 assertions PASS, was 4):
  New assertion #5: real-mode chain
      matrix.downgrade (lakehouse + grok-4.1-fast → isolation)
    → distillation.score (scrum_review attempt_1 → accepted)
  Proves §3.4 components compose through the workflow runner with
  no fixture intermediation. Both nodes ran successfully, runner
  recorded provenance, status=succeeded.
  Mode listing assertion now expects 7 modes (5 real + 2 fixture)
  instead of just the fixtures.
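The shape of such a two-node chain can be sketched as follows. Everything here is hypothetical: the Node and NodeRecord types, the Run function, and the toy mode bodies are illustrative stand-ins, not the actual workflow runner.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is a hypothetical stand-in for workflow.Mode.
type Mode func(input map[string]any) (map[string]any, error)

// Node names the mode to run and the input to hand it.
type Node struct {
	ID    string
	Mode  string
	Input map[string]any
}

// NodeRecord is the provenance entry kept for each executed node.
type NodeRecord struct {
	NodeID string
	Mode   string
	Status string
	Output map[string]any
	Err    string
}

// Run executes nodes in order, records one entry per node, and stops
// at the first failure.
func Run(modes map[string]Mode, nodes []Node) (string, []NodeRecord) {
	records := make([]NodeRecord, 0, len(nodes))
	for _, n := range nodes {
		rec := NodeRecord{NodeID: n.ID, Mode: n.Mode}
		mode, ok := modes[n.Mode]
		if !ok {
			rec.Status, rec.Err = "failed", "unknown mode "+n.Mode
			return "failed", append(records, rec)
		}
		out, err := mode(n.Input)
		if err != nil {
			rec.Status, rec.Err = "failed", err.Error()
			return "failed", append(records, rec)
		}
		rec.Status, rec.Output = "succeeded", out
		records = append(records, rec)
	}
	return "succeeded", records
}

// demoModes returns toy modes whose behaviour is invented for the example.
func demoModes() map[string]Mode {
	return map[string]Mode{
		"matrix.downgrade": func(in map[string]any) (map[string]any, error) {
			if in["model"] == "x-ai/grok-4.1-fast" {
				return map[string]any{"mode": "codereview_isolation"}, nil
			}
			return map[string]any{"mode": in["mode"]}, nil
		},
		"distillation.score": func(in map[string]any) (map[string]any, error) {
			if in["record"] == nil {
				return nil, errors.New("record is required")
			}
			return map[string]any{"category": "accepted"}, nil
		},
	}
}

func main() {
	status, records := Run(demoModes(), []Node{
		{ID: "n1", Mode: "matrix.downgrade", Input: map[string]any{
			"mode": "codereview_lakehouse", "model": "x-ai/grok-4.1-fast"}},
		{ID: "n2", Mode: "distillation.score", Input: map[string]any{
			"record": map[string]any{"run_id": "r-1"}}},
	})
	fmt.Println(status, len(records))
}
```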
17-smoke regression all green. SPEC §3.8 acceptance gate G3.8.D
("Mode catalog dispatches matrix.search invocation to the matrixd
backend without going through HTTP") is still pending: the current
path goes through HTTP for matrix.search, which is the cleaner
service-mesh shape but slower than direct in-process dispatch.
In-process dispatch when matrixd is co-resident is a future
optimization.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
212 lines
5.9 KiB
Go
package workflow

import (
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
)

func TestMatrixRelevance_FiltersAdjacencyPollution(t *testing.T) {
	input := map[string]any{
		"focus": map[string]any{
			"Path":    "crates/queryd/src/db.go",
			"Content": "pub struct Connector {}\nuse catalogd::Registry;",
		},
		"chunks": []any{
			map[string]any{
				"source": "lakehouse_symbols_v1",
				"doc_id": "symbol:queryd::struct::Connector",
				"text":   "Connector wraps the DuckDB handle.",
				"score":  0.9,
			},
			map[string]any{
				"source": "lakehouse_symbols_v1",
				"doc_id": "symbol:catalogd::struct::Registry",
				"text":   "Registry stores manifests. Used by ingestd.",
				"score":  0.85,
			},
		},
		"threshold": 0.3,
	}
	out, err := MatrixRelevance(Context{}, input)
	if err != nil {
		t.Fatalf("MatrixRelevance: %v", err)
	}
	if out["total_in"].(int) != 2 {
		t.Errorf("total_in: want 2, got %v", out["total_in"])
	}
	// Connector should be in kept (path/symbol match), Registry in dropped (import-only).
	keptStr, _ := json.Marshal(out["kept"])
	if !strings.Contains(string(keptStr), "Connector") {
		t.Errorf("expected Connector in kept; kept=%s", keptStr)
	}
}

func TestMatrixDowngrade_StrongModelDowngrades(t *testing.T) {
	out, err := MatrixDowngrade(Context{}, map[string]any{
		"mode":  "codereview_lakehouse",
		"model": "x-ai/grok-4.1-fast",
	})
	if err != nil {
		t.Fatalf("MatrixDowngrade: %v", err)
	}
	if out["mode"] != "codereview_isolation" {
		t.Errorf("strong model should downgrade; got mode=%v", out["mode"])
	}
	if out["downgraded_from"] != "codereview_lakehouse" {
		t.Errorf("downgraded_from: %v", out["downgraded_from"])
	}
}

func TestMatrixDowngrade_WeakModelKept(t *testing.T) {
	out, err := MatrixDowngrade(Context{}, map[string]any{
		"mode":  "codereview_lakehouse",
		"model": "qwen3.5:latest",
	})
	if err != nil {
		t.Fatal(err)
	}
	if out["mode"] != "codereview_lakehouse" {
		t.Errorf("weak model should keep lakehouse; got %v", out["mode"])
	}
}

func TestMatrixDowngrade_MissingFieldsError(t *testing.T) {
	_, err := MatrixDowngrade(Context{}, map[string]any{"mode": "codereview_lakehouse"})
	if err == nil {
		t.Error("missing model should error")
	}
}

func TestDistillationScore_ScrumReviewAccepted(t *testing.T) {
	out, err := DistillationScore(Context{}, map[string]any{
		"record": map[string]any{
			"run_id":         "r-1",
			"task_id":        "t-1",
			"timestamp":      "2026-04-29T12:00:00Z",
			"schema_version": 1,
			"provenance": map[string]any{
				"source_file": "data/_kb/scrum_reviews.jsonl",
				"sig_hash":    "abc",
				"recorded_at": "2026-04-29T12:00:01Z",
			},
			"success_markers": []any{"accepted_on_attempt_1"},
		},
	})
	if err != nil {
		t.Fatal(err)
	}
	if out["category"] != "accepted" {
		t.Errorf("scrum_review attempt_1: want accepted, got %v", out["category"])
	}
	reasons, _ := out["reasons"].([]string)
	if len(reasons) == 0 || !strings.Contains(reasons[0], "first attempt") {
		t.Errorf("reasons missing 'first attempt': %v", reasons)
	}
}

func TestDistillationScore_RejectsEmptyRecord(t *testing.T) {
	_, err := DistillationScore(Context{}, map[string]any{
		"record": map[string]any{},
	})
	if err == nil {
		t.Error("empty record should error")
	}
}

func TestDriftScorer_AllMatchedReturnsZeroDrift(t *testing.T) {
	out, err := DriftScorer(Context{}, map[string]any{
		"inputs": []any{
			map[string]any{
				"Record": map[string]any{
					"run_id": "r-1", "task_id": "t-1",
					"timestamp": "2026-04-29T12:00:00Z", "schema_version": 1,
					"provenance": map[string]any{
						"source_file": "data/_kb/scrum_reviews.jsonl",
						"sig_hash":    "x", "recorded_at": "2026-04-29T12:00:01Z",
					},
					"success_markers": []any{"accepted_on_attempt_1"},
				},
				"PersistedCategory": "accepted",
			},
		},
	})
	if err != nil {
		t.Fatal(err)
	}
	if out["drifted"].(float64) != 0 {
		t.Errorf("no-drift case: drifted=%v", out["drifted"])
	}
	if out["matched"].(float64) != 1 {
		t.Errorf("matched: want 1, got %v", out["matched"])
	}
}

func TestDriftScorer_RequiresInputs(t *testing.T) {
	_, err := DriftScorer(Context{}, map[string]any{"inputs": []any{}})
	if err == nil {
		t.Error("empty inputs should error")
	}
}

func TestMatrixSearch_HTTPFlow(t *testing.T) {
	// Fake matrixd that echoes a canned SearchResponse.
	mux := http.NewServeMux()
	mux.HandleFunc("/matrix/search", func(w http.ResponseWriter, r *http.Request) {
		var body map[string]any
		_ = json.NewDecoder(r.Body).Decode(&body)
		w.Header().Set("Content-Type", "application/json")
		// Echo back deterministically with a synthesized result list.
		_ = json.NewEncoder(w).Encode(map[string]any{
			"results": []any{
				map[string]any{"id": "w-1", "distance": 0.1, "corpus": "workers"},
			},
			"per_corpus_counts": map[string]any{"workers": 1},
			"received_corpora":  body["corpora"], // for round-trip verification
		})
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	mode := MatrixSearch(srv.URL, srv.Client())
	out, err := mode(
		Context{Ctx: context.Background()},
		map[string]any{
			"query_text": "forklift",
			"corpora":    []any{"workers"},
			"k":          5,
		},
	)
	if err != nil {
		t.Fatalf("MatrixSearch: %v", err)
	}
	results, ok := out["results"].([]any)
	if !ok || len(results) != 1 {
		// Fatal, not Error: indexing results[0] below would panic on an empty slice.
		t.Fatalf("results: %v", out["results"])
	}
	if first, ok := results[0].(map[string]any); ok {
		if first["id"] != "w-1" {
			t.Errorf("id: %v", first["id"])
		}
	}
}

func TestMatrixSearch_NonOKStatusErrors(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "matrixd is down", http.StatusBadGateway)
	}))
	defer srv.Close()

	mode := MatrixSearch(srv.URL, srv.Client())
	_, err := mode(Context{Ctx: context.Background()}, map[string]any{})
	if err == nil {
		// Fatal, not Error: err.Error() below would panic on a nil error.
		t.Fatal("502 should error")
	}
	if !strings.Contains(err.Error(), "502") {
		t.Errorf("error should mention 502: %v", err)
	}
}