distillation.score, drift.scorer)
Lands the workflow.Mode adapters for the §3.4 components + the
distillation scorer + drift quantifier. Workflows can now compose
real measurement capabilities; the substrate's parallel
capabilities become composable Lego bricks (per the prior commit's
closing insight).
Modes registered (in observerd's registerBuiltinModes):
Pure-function wrappers (no I/O):
- matrix.relevance → matrix.FilterChunks
- matrix.downgrade → matrix.MaybeDowngrade
- distillation.score → distillation.ScoreRecord
- drift.scorer → drift.ComputeScorerDrift
HTTP-backed:
- matrix.search → POST matrixd /matrix/search
(registered only when matrixd_url is set)
Fixture (kept from §3.8 first slice):
- fixture.echo, fixture.upper
internal/workflow/modes.go:
Each mode follows the same glue pattern: marshal generic input
through a typed struct (free schema validation + clear error
messages), call the underlying capability, return a generic
output map. Roundtrip-via-JSON gives us schema validation
without writing custom field-by-field coercion.
internal/workflow/modes_test.go (10 tests, all PASS):
- matrix.relevance filters adjacency pollution (Connector kept,
catalogd::Registry dropped — same headline as the relevance
smoke, run through the workflow mode)
- matrix.downgrade flips lakehouse→isolation on strong model;
keeps lakehouse on weak (qwen3.5:latest); errors on missing
fields
- distillation.score rates scrum_review attempt_1 as accepted;
rejects empty record
- drift.scorer reports zero drift on matched inputs; errors on
empty inputs slice
- matrix.search HTTP flow round-trips through httptest fake
matrixd; non-OK status surfaces a clear error
scripts/workflow_smoke.sh (5 assertions PASS, was 4):
New assertion #5: real-mode chain
matrix.downgrade (lakehouse + grok-4.1-fast → isolation)
→ distillation.score (scrum_review attempt_1 → accepted)
Proves §3.4 components compose through the workflow runner with
no fixture intermediation. Both nodes ran successfully, runner
recorded provenance, status=succeeded.
Mode listing assertion now expects 7 modes (5 real + 2 fixture)
instead of just the fixtures.
17-smoke regression all green. SPEC §3.8 acceptance gate G3.8.D
("Mode catalog dispatches matrix.search invocation to the matrixd
backend without going through HTTP") still pending — current path
goes through HTTP for matrix.search, which is the cleaner service-
mesh shape but slower than direct in-process. In-process dispatch
when matrixd is co-resident is a future optimization.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
194 lines
7.3 KiB
Bash
Executable File
194 lines
7.3 KiB
Bash
Executable File
#!/usr/bin/env bash
|
|
# Workflow smoke — Observer-KB workflow runner end-to-end (SPEC §3.8
|
|
# first slice). All assertions go through gateway :3110.
|
|
#
|
|
# Validates:
|
|
# - GET /observer/workflow/modes lists fixture.echo + fixture.upper
|
|
# - POST /observer/workflow/run executes a 3-node DAG with $-ref
|
|
# substitution: shape (uppercase) → weakness → improvement
|
|
# - Each node's execution lands an ObservedOp via the observer
|
|
# ring (visible in /observer/stats with source="workflow")
|
|
# - Aborting case: unknown mode → 400 with helpful error
|
|
# - Skip cascade: node with failed dep gets skipped, independent
|
|
# siblings still run
|
|
|
|
set -euo pipefail
|
|
cd "$(dirname "$0")/.."
|
|
|
|
export PATH="$PATH:/usr/local/go/bin"
|
|
|
|
echo "[workflow-smoke] building observerd + gateway..."
|
|
go build -o bin/ ./cmd/observerd ./cmd/gateway
|
|
|
|
pkill -f "bin/(observerd|gateway)" 2>/dev/null || true
|
|
sleep 0.3
|
|
|
|
PIDS=()
|
|
TMP="$(mktemp -d)"
|
|
CFG="$TMP/workflow.toml"
|
|
|
|
cleanup() {
|
|
echo "[workflow-smoke] cleanup"
|
|
for p in "${PIDS[@]}"; do [ -n "$p" ] && kill "$p" 2>/dev/null || true; done
|
|
rm -rf "$TMP"
|
|
}
|
|
trap cleanup EXIT INT TERM
|
|
|
|
cat > "$CFG" <<EOF
|
|
[gateway]
|
|
bind = "127.0.0.1:3110"
|
|
storaged_url = "http://127.0.0.1:3211"
|
|
catalogd_url = "http://127.0.0.1:3212"
|
|
ingestd_url = "http://127.0.0.1:3213"
|
|
queryd_url = "http://127.0.0.1:3214"
|
|
vectord_url = "http://127.0.0.1:3215"
|
|
embedd_url = "http://127.0.0.1:3216"
|
|
pathwayd_url = "http://127.0.0.1:3217"
|
|
matrixd_url = "http://127.0.0.1:3218"
|
|
observerd_url = "http://127.0.0.1:3219"
|
|
|
|
[observerd]
|
|
bind = "127.0.0.1:3219"
|
|
EOF
|
|
|
|
poll_health() {
|
|
local port="$1" deadline=$(($(date +%s) + 5))
|
|
while [ "$(date +%s)" -lt "$deadline" ]; do
|
|
if curl -sS --max-time 1 "http://127.0.0.1:$port/health" >/dev/null 2>&1; then return 0; fi
|
|
sleep 0.05
|
|
done
|
|
return 1
|
|
}
|
|
|
|
echo "[workflow-smoke] launching observerd → gateway..."
|
|
./bin/observerd -config "$CFG" > /tmp/observerd.log 2>&1 &
|
|
PIDS+=($!)
|
|
poll_health 3219 || { echo "observerd failed"; tail /tmp/observerd.log; exit 1; }
|
|
|
|
./bin/gateway -config "$CFG" > /tmp/gateway.log 2>&1 &
|
|
PIDS+=($!)
|
|
poll_health 3110 || { echo "gateway failed"; tail /tmp/gateway.log; exit 1; }
|
|
|
|
FAILED=0
|
|
|
|
# ── 1. /observer/workflow/modes lists registered modes ────────────
|
|
echo "[workflow-smoke] /observer/workflow/modes lists fixtures + real modes:"
|
|
RESP="$(curl -sS http://127.0.0.1:3110/v1/observer/workflow/modes)"
|
|
EXPECTED=("fixture.echo" "fixture.upper" "matrix.relevance" "matrix.downgrade" "distillation.score" "drift.scorer" "matrix.search")
|
|
MISSING=""
|
|
for m in "${EXPECTED[@]}"; do
|
|
if [ "$(echo "$RESP" | jq -r --arg m "$m" '.modes | index($m) != null')" != "true" ]; then
|
|
MISSING="$MISSING $m"
|
|
fi
|
|
done
|
|
if [ -z "$MISSING" ]; then
|
|
echo " ✓ all 7 expected modes registered (fixtures + 4 pure + matrix.search HTTP)"
|
|
else
|
|
echo " ✗ missing modes:$MISSING"; FAILED=1
|
|
fi
|
|
|
|
# ── 2. 3-node DAG with $-ref substitution ─────────────────────────
|
|
echo "[workflow-smoke] 3-node DAG: shape (upper) → weakness → improvement"
|
|
WORKFLOW='{
|
|
"workflow": {
|
|
"name": "smoke-chain",
|
|
"description": "DAG ref substitution test",
|
|
"nodes": [
|
|
{"id":"shape", "mode":"fixture.upper", "prompt":"hello world"},
|
|
{"id":"weakness", "mode":"fixture.echo",
|
|
"prompt":"observed shape: $shape.output.upper",
|
|
"depends_on":["shape"]},
|
|
{"id":"improvement", "mode":"fixture.echo",
|
|
"prompt":"based on $weakness.output.prompt do better",
|
|
"depends_on":["weakness"]}
|
|
]
|
|
}
|
|
}'
|
|
RUN="$(curl -sS -X POST http://127.0.0.1:3110/v1/observer/workflow/run \
|
|
-H 'Content-Type: application/json' -d "$WORKFLOW")"
|
|
STATUS="$(echo "$RUN" | jq -r '.status')"
|
|
SHAPE_UPPER="$(echo "$RUN" | jq -r '.nodes[0].output.upper')"
|
|
WEAK_PROMPT="$(echo "$RUN" | jq -r '.nodes[1].output.prompt')"
|
|
IMP_PROMPT="$(echo "$RUN" | jq -r '.nodes[2].output.prompt')"
|
|
|
|
if [ "$STATUS" = "succeeded" ] && [ "$SHAPE_UPPER" = "HELLO WORLD" ] \
|
|
&& [[ "$WEAK_PROMPT" == *"HELLO WORLD"* ]] \
|
|
&& [[ "$IMP_PROMPT" == *"HELLO WORLD"* ]]; then
|
|
echo " ✓ status=succeeded · shape=HELLO WORLD · refs propagated through 3-node chain"
|
|
else
|
|
echo " ✗ status=$STATUS shape=$SHAPE_UPPER weak=$WEAK_PROMPT imp=$IMP_PROMPT"
|
|
echo " full: $RUN"
|
|
FAILED=1
|
|
fi
|
|
|
|
# ── 3. Per-node provenance recorded as ObservedOps ────────────────
|
|
echo "[workflow-smoke] /observer/stats reflects workflow ops:"
|
|
STATS="$(curl -sS http://127.0.0.1:3110/v1/observer/stats)"
|
|
WORKFLOW_OPS="$(echo "$STATS" | jq -r '.by_source.workflow // 0')"
|
|
TOTAL="$(echo "$STATS" | jq -r '.total')"
|
|
if [ "$WORKFLOW_OPS" = "3" ] && [ "$TOTAL" = "3" ]; then
|
|
echo " ✓ 3 workflow ops recorded (one per node), total=3"
|
|
else
|
|
echo " ✗ workflow=$WORKFLOW_OPS total=$TOTAL"
|
|
echo " full: $STATS"; FAILED=1
|
|
fi
|
|
|
|
# ── 4. Unknown mode → 400 ─────────────────────────────────────────
|
|
echo "[workflow-smoke] unknown mode → 400:"
|
|
HTTP="$(curl -sS -o /tmp/wf_bad.json -w '%{http_code}' -X POST \
|
|
http://127.0.0.1:3110/v1/observer/workflow/run \
|
|
-H 'Content-Type: application/json' \
|
|
-d '{"workflow":{"name":"bad","nodes":[{"id":"a","mode":"does.not.exist"}]}}')"
|
|
ERR="$(jq -r '.error' < /tmp/wf_bad.json 2>/dev/null)"
|
|
if [ "$HTTP" = "400" ] && echo "$ERR" | grep -qi "unknown mode"; then
|
|
echo " ✓ unknown mode aborts with 400 + helpful error"
|
|
else
|
|
echo " ✗ http=$HTTP err=$ERR"; FAILED=1
|
|
fi
|
|
|
|
# ── 5. Real-mode chain: matrix.downgrade → distillation.score ─────
|
|
# This proves the §3.4 components compose through the workflow runner.
|
|
# Two pure modes, no external service deps, deterministic input/output.
|
|
echo "[workflow-smoke] real-mode chain: downgrade → distillation.score"
|
|
REAL_WORKFLOW='{
|
|
"workflow": {
|
|
"name": "real-mode-chain",
|
|
"nodes": [
|
|
{"id":"gate", "mode":"matrix.downgrade",
|
|
"inputs":{"mode":"codereview_lakehouse", "model":"x-ai/grok-4.1-fast"}},
|
|
{"id":"score", "mode":"distillation.score",
|
|
"inputs":{"record":{
|
|
"run_id":"r-1", "task_id":"t-1",
|
|
"timestamp":"2026-04-29T12:00:00Z", "schema_version":1,
|
|
"provenance":{"source_file":"data/_kb/scrum_reviews.jsonl",
|
|
"sig_hash":"x", "recorded_at":"2026-04-29T12:00:01Z"},
|
|
"success_markers":["accepted_on_attempt_1"]
|
|
}}}
|
|
]
|
|
}
|
|
}'
|
|
RUN="$(curl -sS -X POST http://127.0.0.1:3110/v1/observer/workflow/run \
|
|
-H 'Content-Type: application/json' -d "$REAL_WORKFLOW")"
|
|
STATUS="$(echo "$RUN" | jq -r '.status')"
|
|
GATE_MODE="$(echo "$RUN" | jq -r '.nodes[0].output.mode')"
|
|
GATE_FROM="$(echo "$RUN" | jq -r '.nodes[0].output.downgraded_from')"
|
|
SCORE_CAT="$(echo "$RUN" | jq -r '.nodes[1].output.category')"
|
|
if [ "$STATUS" = "succeeded" ] \
|
|
&& [ "$GATE_MODE" = "codereview_isolation" ] \
|
|
&& [ "$GATE_FROM" = "codereview_lakehouse" ] \
|
|
&& [ "$SCORE_CAT" = "accepted" ]; then
|
|
echo " ✓ downgrade flipped lakehouse→isolation; scorer rated scrum_review attempt_1=accepted"
|
|
else
|
|
echo " ✗ status=$STATUS gate=$GATE_MODE from=$GATE_FROM score=$SCORE_CAT"
|
|
echo " full: $RUN"
|
|
FAILED=1
|
|
fi
|
|
|
|
if [ "$FAILED" -eq 0 ]; then
|
|
echo "[workflow-smoke] Workflow runner acceptance: PASSED"
|
|
exit 0
|
|
else
|
|
echo "[workflow-smoke] Workflow runner acceptance: FAILED"
|
|
exit 1
|
|
fi
|