4 Commits

Author SHA1 Message Date
root
622e124b8f phase 2: matrix.downgrade reads WeakModels from config
migrate the strong-model auto-downgrade gate from a hardcoded weak
list to cfg.Models.WeakModels. backward compatible: existing API
preserved, callers that don't migrate keep using DefaultWeakModels.

changes:
- internal/matrix/downgrade.go: split IsWeakModel into rule-based
  base (`:free` suffix/infix) + literal-list lookup. New
  IsWeakModelInList(model, list) takes the config-supplied list.
  DowngradeInput grows a WeakModels field; nil falls back to
  DefaultWeakModels (preserves pre-phase-2 behavior).
- internal/workflow/modes.go: add MatrixDowngradeWithWeakList(list)
  factory mirroring MatrixSearch's pattern. Plain MatrixDowngrade
  kept for backward compat.
- cmd/matrixd/main.go: handlers struct holds weakModels populated
  from cfg.Models.WeakModels at startup; handleDowngrade threads it
  into every DowngradeInput.
- cmd/observerd/main.go: registerBuiltinModes accepts weakModels
  and uses the factory variant. observerd reads cfg.Models.WeakModels
  in main().
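
Illustrative sketch of the rule + literal-list split described above. This is not the code in internal/matrix/downgrade.go: the contents of DefaultWeakModels, the helper names isFreeTier and resolveWeakList, and the "tiny-local" config entry are assumptions, as is treating an empty non-nil list as "no literal entries".

```go
package main

import (
	"fmt"
	"strings"
)

// DefaultWeakModels stands in for the package's built-in list.
// The single entry here is illustrative only.
var DefaultWeakModels = []string{"qwen3.5:latest"}

// isFreeTier is the rule-based half: ":free" anywhere in the name
// covers both the suffix and the infix case.
func isFreeTier(model string) bool {
	return strings.Contains(model, ":free")
}

// IsWeakModelInList applies the rule first, then an exact match
// against the supplied list. A nil or empty list leaves only the rule.
func IsWeakModelInList(model string, list []string) bool {
	if isFreeTier(model) {
		return true
	}
	for _, m := range list {
		if m == model {
			return true
		}
	}
	return false
}

// resolveWeakList mirrors the DowngradeInput fallback: nil means
// "not configured", so the default list applies.
func resolveWeakList(configured []string) []string {
	if configured == nil {
		return DefaultWeakModels
	}
	return configured
}

func main() {
	cfgList := []string{"tiny-local"} // hypothetical cfg.Models.WeakModels value

	fmt.Println(IsWeakModelInList("vendor/model:free", nil))                   // true (rule)
	fmt.Println(IsWeakModelInList("tiny-local", resolveWeakList(cfgList)))     // true (literal)
	fmt.Println(IsWeakModelInList("qwen3.5:latest", resolveWeakList(cfgList))) // false (config overrides default)
	fmt.Println(IsWeakModelInList("qwen3.5:latest", resolveWeakList(nil)))     // true (nil falls back)
}
```

Keeping the fallback in one small resolver is one way to let callers that never set WeakModels keep the pre-phase-2 behavior.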

end-to-end verified: downgrade + matrix + observer + workflow smokes
all pass. Existing TestMaybeDowngrade_TruthTable + TestIsWeakModel
unchanged (backward compat). Two new tests cover the config path:
- TestIsWeakModelInList — covers rule + literal + empty + nil
- TestMaybeDowngrade_WithConfigList — verifies cfg list overrides
  default

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:52:18 -05:00
root
8278eb9a87 scrum2 cleanup: JSON-marshal in stringifyValue, drop dead detectCycle, name SourceWorkflow
5 small fixes from the §3.8 scrum2 review wave:

- workflow.stringifyValue now JSON-marshals maps/slices instead of
  using fmt.Sprint's default %v formatting (Opus+Kimi convergent: LLM
  modes were getting Go's map[k:v] syntax, which is unparseable as
  JSON context).
- workflow.detectCycle removed — duplicate of topoSort that discarded
  the useful node ID. Validate() now calls topoSort directly and
  returns its wrapped ErrCycle.
- observer.SourceWorkflow named constant — was an implicit string
  cast (observer.Source("workflow")) at the cmd/observerd handler.
- Unused context imports + dead silencer comments removed across
  workflow/modes.go and observerd/main.go.
- Unused store parameter dropped from registerBuiltinModes (reserved
  comment removed; can be re-added when a mode actually needs it).
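
Sketch of the stringifyValue change in the first bullet above. The function body is an assumption about the shape of the fix, not the repository code; it only shows why JSON-marshalling maps and slices matters for LLM-facing context.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stringifyValue renders a node output for substitution into a prompt.
// Strings pass through unchanged; maps and slices are JSON-encoded so
// a downstream consumer sees valid JSON rather than Go's debug syntax.
func stringifyValue(v any) string {
	switch t := v.(type) {
	case string:
		return t
	case map[string]any, []any:
		b, err := json.Marshal(t)
		if err != nil {
			// Fall back to the old behavior rather than dropping the value.
			return fmt.Sprint(v)
		}
		return string(b)
	default:
		return fmt.Sprint(v)
	}
}

func main() {
	v := map[string]any{"k": "v"}
	fmt.Println(fmt.Sprint(v))     // map[k:v]
	fmt.Println(stringifyValue(v)) // {"k":"v"}
}
```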

`just verify` still PASSes; this is pure cleanup with no behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:16:07 -05:00
root
c7e3124208 §3.8 second slice: real modes wired (matrix.relevance/downgrade/search,
distillation.score, drift.scorer)

Lands the workflow.Mode adapters for the §3.4 components + the
distillation scorer + drift quantifier. Workflows can now compose
real measurement capabilities; the substrate's parallel
capabilities become composable Lego bricks (per the prior commit's
closing insight).

Modes registered (in observerd's registerBuiltinModes):

  Pure-function wrappers (no I/O):
    - matrix.relevance    → matrix.FilterChunks
    - matrix.downgrade    → matrix.MaybeDowngrade
    - distillation.score  → distillation.ScoreRecord
    - drift.scorer        → drift.ComputeScorerDrift

  HTTP-backed:
    - matrix.search       → POST matrixd /matrix/search
                             (registered only when matrixd_url is set)

  Fixture (kept from §3.8 first slice):
    - fixture.echo, fixture.upper

internal/workflow/modes.go:
  Each mode follows the same glue pattern: marshal generic input
  through a typed struct (free schema validation + clear error
  messages), call the underlying capability, return a generic
  output map. Roundtrip-via-JSON gives us schema validation
  without writing custom field-by-field coercion.
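
A minimal sketch of the glue pattern described above, assuming Context in the Mode signature is context.Context. The decode helper, the downgradeInput struct and its field names are hypothetical, and the capability call is replaced by a placeholder.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
)

// Mode follows the signature from the first slice; context.Context is assumed.
type Mode func(ctx context.Context, in map[string]any) (map[string]any, error)

// decode round-trips a generic map through JSON into a typed struct.
// Type mismatches surface as json.Unmarshal errors naming the field.
func decode[T any](in map[string]any) (T, error) {
	var out T
	raw, err := json.Marshal(in)
	if err != nil {
		return out, fmt.Errorf("encode input: %w", err)
	}
	if err := json.Unmarshal(raw, &out); err != nil {
		return out, fmt.Errorf("decode input: %w", err)
	}
	return out, nil
}

// downgradeInput is a hypothetical typed input for matrix.downgrade.
type downgradeInput struct {
	Mode  string `json:"mode"`
	Model string `json:"model"`
}

// downgradeMode shows the three steps: typed decode, capability call,
// generic output map.
func downgradeMode(_ context.Context, in map[string]any) (map[string]any, error) {
	typed, err := decode[downgradeInput](in)
	if err != nil {
		return nil, err
	}
	if typed.Mode == "" || typed.Model == "" {
		return nil, errors.New("matrix.downgrade: mode and model are required")
	}
	// Placeholder for the real capability (matrix.MaybeDowngrade in this commit).
	return map[string]any{"mode": typed.Mode, "model": typed.Model}, nil
}

func main() {
	var m Mode = downgradeMode
	out, err := m(context.Background(), map[string]any{"mode": "lakehouse", "model": "grok-4.1-fast"})
	fmt.Println(out, err)

	_, err = m(context.Background(), map[string]any{"mode": "lakehouse", "model": 42})
	fmt.Println(err) // a decode error naming the offending field
}
```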

internal/workflow/modes_test.go (10 tests, all PASS):
  - matrix.relevance filters adjacency pollution (Connector kept,
    catalogd::Registry dropped — same headline as the relevance
    smoke, run through the workflow mode)
  - matrix.downgrade flips lakehouse→isolation on strong model;
    keeps lakehouse on weak (qwen3.5:latest); errors on missing
    fields
  - distillation.score rates scrum_review attempt_1 as accepted;
    rejects empty record
  - drift.scorer reports zero drift on matched inputs; errors on
    empty inputs slice
  - matrix.search HTTP flow round-trips through httptest fake
    matrixd; non-OK status surfaces a clear error
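
Sketch of the HTTP-backed matrix.search flow exercised by the last test above, run against an httptest fake. Only the POST to /matrix/search comes from this commit; the request and response bodies ("query", "hits"), the helper names, and the error wording are assumptions.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type Mode func(ctx context.Context, in map[string]any) (map[string]any, error)

// newSearchMode returns a mode that POSTs its input to baseURL/matrix/search.
func newSearchMode(baseURL string, client *http.Client) Mode {
	return func(ctx context.Context, in map[string]any) (map[string]any, error) {
		body, err := json.Marshal(in)
		if err != nil {
			return nil, fmt.Errorf("matrix.search: encode request: %w", err)
		}
		req, err := http.NewRequestWithContext(ctx, http.MethodPost,
			baseURL+"/matrix/search", bytes.NewReader(body))
		if err != nil {
			return nil, fmt.Errorf("matrix.search: build request: %w", err)
		}
		req.Header.Set("Content-Type", "application/json")
		resp, err := client.Do(req)
		if err != nil {
			return nil, fmt.Errorf("matrix.search: %w", err)
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("matrix.search: matrixd returned %s", resp.Status)
		}
		var out map[string]any
		if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
			return nil, fmt.Errorf("matrix.search: decode response: %w", err)
		}
		return out, nil
	}
}

// searchAgainstFake runs the mode against a fake matrixd that always
// answers with the given status and a hypothetical JSON body.
func searchAgainstFake(status int) (map[string]any, error) {
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		fmt.Fprint(w, `{"hits":[]}`)
	}))
	defer fake.Close()
	mode := newSearchMode(fake.URL, fake.Client())
	return mode(context.Background(), map[string]any{"query": "connector"})
}

func main() {
	out, err := searchAgainstFake(http.StatusOK)
	fmt.Println(out, err)
	_, err = searchAgainstFake(http.StatusBadGateway)
	fmt.Println(err)
}
```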

scripts/workflow_smoke.sh (5 assertions PASS, was 4):
  New assertion #5: real-mode chain
    matrix.downgrade (lakehouse + grok-4.1-fast → isolation)
    → distillation.score (scrum_review attempt_1 → accepted)
  Proves §3.4 components compose through the workflow runner with
  no fixture intermediation. Both nodes ran successfully, runner
  recorded provenance, status=succeeded.

  Mode listing assertion now expects 7 modes (5 real + 2 fixture)
  instead of just the fixtures.

17-smoke regression all green. SPEC §3.8 acceptance gate G3.8.D
("Mode catalog dispatches matrix.search invocation to the matrixd
backend without going through HTTP") still pending — current path
goes through HTTP for matrix.search, which is the cleaner service-
mesh shape but slower than direct in-process. In-process dispatch
when matrixd is co-resident is a future optimization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 20:39:26 -05:00
root
e30da6e5aa §3.8 first slice: workflow runner skeleton + DAG executor + observerd integration
Lands the structural piece of SPEC §3.8 (Observer-KB workflow runner)
documented in 97dd3f8: types + DAG runner + reference substitution +
provenance recording into observerd. Real-mode integrations
(matrix.search, distillation.score, drift.scorer, llm.chat) come in
follow-up commits — this commit proves the mechanics.

internal/workflow/types.go:
  - Workflow / Node / NodeResult / RunResult types matching Archon's
    YAML shape so existing workflows (e.g. lakehouse-architect-review.yaml)
    load directly. Optional `mode` field added — implicit fall-back is
    "llm.chat" matching Archon's convention.
  - Mode signature: func(Context, map[string]any) (map[string]any, error)
  - 5 sentinel errors: ErrCycle, ErrMissingDep, ErrUnknownMode,
    ErrDuplicateNodeID, ErrUnresolvedRef
  - Validate enforces structural invariants: unique IDs, every
    depends_on resolves, no cycles
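
Sketch of the structural checks Validate performs, using wrapped sentinel errors so callers can match with errors.Is. The Node fields, error texts and function name are assumptions; the cycle check is left out to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
)

// Two of the sentinel errors named above; the message texts are illustrative.
var (
	ErrDuplicateNodeID = errors.New("duplicate node id")
	ErrMissingDep      = errors.New("missing dependency")
)

// Node is a reduced, hypothetical version of the workflow node type.
type Node struct {
	ID        string
	DependsOn []string
}

// validate checks unique IDs first, then that every depends_on entry
// names a declared node.
func validate(nodes []Node) error {
	seen := make(map[string]bool, len(nodes))
	for _, n := range nodes {
		if seen[n.ID] {
			return fmt.Errorf("%w: %q", ErrDuplicateNodeID, n.ID)
		}
		seen[n.ID] = true
	}
	for _, n := range nodes {
		for _, dep := range n.DependsOn {
			if !seen[dep] {
				return fmt.Errorf("%w: node %q depends on %q", ErrMissingDep, n.ID, dep)
			}
		}
	}
	return nil
}

func main() {
	err := validate([]Node{{ID: "shape"}, {ID: "weakness", DependsOn: []string{"shap"}}})
	fmt.Println(err, errors.Is(err, ErrMissingDep))
}
```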

internal/workflow/runner.go:
  - Kahn's-algorithm topological sort, stable for declaration-order
    ties (deterministic execution + JSON output across runs)
  - Reference substitution: $node_id.output.key.path resolves through
    nested maps; $node_id alone resolves to the whole output map
  - Skip cascade: a node whose dependency failed/skipped is skipped
    with explicit "upstream node X failed" error in NodeResult, never
    silently dropped
  - Per-node provenance: NodeResult.StartedAt + DurationMs captured
    for every execution
  - Mode pre-validation: every node's mode is checked against the
    registry BEFORE any node runs, so a typo is caught in 5ms rather
    than after 6 nodes have already run
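
Sketch of a Kahn-style topological sort that breaks ties by declaration order, as the first bullet above describes. It is a simplified O(n^2) variant under assumed types, not the runner's implementation, and it expects dependencies to have been validated already (an undeclared dependency would be reported as a cycle here).

```go
package main

import (
	"errors"
	"fmt"
)

var ErrCycle = errors.New("workflow contains a cycle")

// Node is a reduced, hypothetical version of the workflow node type.
type Node struct {
	ID        string
	DependsOn []string
}

// topoSort repeatedly picks the earliest-declared node whose
// dependencies have all been emitted, which makes the order
// deterministic across runs.
func topoSort(nodes []Node) ([]string, error) {
	indegree := make(map[string]int, len(nodes))
	dependents := make(map[string][]string, len(nodes))
	for _, n := range nodes {
		indegree[n.ID] = len(n.DependsOn)
		for _, dep := range n.DependsOn {
			dependents[dep] = append(dependents[dep], n.ID)
		}
	}
	done := make(map[string]bool, len(nodes))
	order := make([]string, 0, len(nodes))
	for len(order) < len(nodes) {
		picked := false
		for _, n := range nodes { // declaration order breaks ties
			if done[n.ID] || indegree[n.ID] > 0 {
				continue
			}
			done[n.ID] = true
			order = append(order, n.ID)
			for _, d := range dependents[n.ID] {
				indegree[d]--
			}
			picked = true
			break
		}
		if !picked {
			// No node is ready: report the first one still waiting.
			stuck := ""
			for _, n := range nodes {
				if !done[n.ID] {
					stuck = n.ID
					break
				}
			}
			return nil, fmt.Errorf("%w: node %q never became ready", ErrCycle, stuck)
		}
	}
	return order, nil
}

func main() {
	order, err := topoSort([]Node{{ID: "c", DependsOn: []string{"a"}}, {ID: "b"}, {ID: "a"}})
	fmt.Println(order, err) // [b a c] <nil>
}
```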

internal/workflow/runner_test.go (14 tests, all PASS):
  - Validate: missing name, no nodes, duplicate IDs, missing deps, cycles
  - Run: single node, 3-node DAG with chained $-refs (shape→weakness→improvement),
    failed-node skip cascade with independent siblings still running,
    unknown-mode abort, unresolved-reference error, implicit
    llm.chat fallback, provenance fields populated, inputs (not just
    prompt) honor $-refs, topological-sort stability for ties
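
Sketch of the skip-cascade behavior covered by the tests above: a node downstream of a failed or skipped node is skipped with an explicit error, while independent siblings still run. The status strings other than "succeeded", the NodeResult fields and the function names are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeResult is a reduced, hypothetical version of the runner's result type.
type NodeResult struct {
	ID     string
	Status string // "succeeded", "failed" or "skipped"
	Error  string
}

var errBoom = errors.New("boom")

// blockedBy returns the first dependency that failed or was skipped,
// or "" when every dependency succeeded.
func blockedBy(deps []string, status map[string]string) string {
	for _, d := range deps {
		if s := status[d]; s == "failed" || s == "skipped" {
			return d
		}
	}
	return ""
}

// runInOrder executes nodes in an already-sorted order and cascades skips.
func runInOrder(order []string, deps map[string][]string, exec func(id string) error) []NodeResult {
	status := make(map[string]string, len(order))
	results := make([]NodeResult, 0, len(order))
	for _, id := range order {
		res := NodeResult{ID: id, Status: "succeeded"}
		if bad := blockedBy(deps[id], status); bad != "" {
			res.Status = "skipped"
			res.Error = fmt.Sprintf("upstream node %s failed", bad)
		} else if err := exec(id); err != nil {
			res.Status = "failed"
			res.Error = err.Error()
		}
		status[id] = res.Status
		results = append(results, res)
	}
	return results
}

func main() {
	deps := map[string][]string{"b": {"a"}, "c": {"b"}}
	exec := func(id string) error {
		if id == "a" {
			return errBoom
		}
		return nil
	}
	for _, r := range runInOrder([]string{"a", "b", "c", "d"}, deps, exec) {
		fmt.Printf("%s: %s %s\n", r.ID, r.Status, r.Error)
	}
}
```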

cmd/observerd extended:
  - POST /observer/workflow/run executes a workflow, records each
    node's execution as an ObservedOp (source="workflow"), returns
    the full RunResult
  - GET /observer/workflow/modes lists the registered mode names
  - registerBuiltinModes wires fixture.echo + fixture.upper for v0;
    real modes register here in follow-up commits
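
Sketch of a mode registry and the listing endpoint described above, exercised with an httptest recorder. The fixture behaviors, the simplified Mode signature (context omitted for brevity), and the {"modes": [...]} response shape are assumptions; only the route and the two fixture names come from this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"strings"
)

// Mode is simplified here; the real signature also takes a context.
type Mode func(in map[string]any) (map[string]any, error)

// registerBuiltinModes wires the two v0 fixtures (guessed behavior).
func registerBuiltinModes(reg map[string]Mode) {
	reg["fixture.echo"] = func(in map[string]any) (map[string]any, error) {
		return in, nil
	}
	reg["fixture.upper"] = func(in map[string]any) (map[string]any, error) {
		s, _ := in["prompt"].(string)
		return map[string]any{"upper": strings.ToUpper(s)}, nil
	}
}

// modeNames returns registered names sorted, so the listing is stable.
func modeNames(reg map[string]Mode) []string {
	names := make([]string, 0, len(reg))
	for name := range reg {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// modesHandler serves the registered mode names as JSON.
func modesHandler(reg map[string]Mode) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(map[string]any{"modes": modeNames(reg)}); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	}
}

func main() {
	reg := map[string]Mode{}
	registerBuiltinModes(reg)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/observer/workflow/modes", nil)
	modesHandler(reg)(rec, req)
	fmt.Print(rec.Code, " ", rec.Body.String())
}
```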

scripts/workflow_smoke.sh (4 assertions PASS):
  - GET /modes lists fixture.echo + fixture.upper
  - 3-node DAG executes: shape (uppercase "hello world") → weakness
    (sees "HELLO WORLD" via $shape.output.upper ref) → improvement
    (sees "HELLO WORLD" propagated through 2-hop $weakness.output.prompt)
  - /observer/stats shows by_source.workflow == 3 (one per node) and
    total == 3 — provenance lands as expected
  - Unknown mode → 400 with "unknown mode" in error body
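
Sketch of how a single reference such as $shape.output.upper from the DAG assertion above could resolve against prior node outputs. The function name, error texts and the outputs map shape are assumptions; substituting references embedded inside longer prompt strings and node IDs containing dots are not covered.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var ErrUnresolvedRef = errors.New("unresolved reference")

// resolveRef handles "$node_id" (whole output map) and
// "$node_id.output.key.path" (walk through nested maps).
func resolveRef(ref string, outputs map[string]map[string]any) (any, error) {
	if !strings.HasPrefix(ref, "$") {
		return nil, fmt.Errorf("%w: %q is not a reference", ErrUnresolvedRef, ref)
	}
	parts := strings.Split(strings.TrimPrefix(ref, "$"), ".")
	out, ok := outputs[parts[0]]
	if !ok {
		return nil, fmt.Errorf("%w: no output for node %q", ErrUnresolvedRef, parts[0])
	}
	if len(parts) == 1 {
		return out, nil
	}
	if parts[1] != "output" {
		return nil, fmt.Errorf("%w: expected \"output\" after node id in %q", ErrUnresolvedRef, ref)
	}
	var cur any = out
	for _, key := range parts[2:] {
		m, isMap := cur.(map[string]any)
		if !isMap {
			return nil, fmt.Errorf("%w: %q is not a map in %q", ErrUnresolvedRef, key, ref)
		}
		next, found := m[key]
		if !found {
			return nil, fmt.Errorf("%w: key %q missing in %q", ErrUnresolvedRef, key, ref)
		}
		cur = next
	}
	return cur, nil
}

func main() {
	outputs := map[string]map[string]any{
		"shape": {"upper": "HELLO WORLD"},
	}
	v, err := resolveRef("$shape.output.upper", outputs)
	fmt.Println(v, err) // HELLO WORLD <nil>

	_, err = resolveRef("$shape.output.lower", outputs)
	fmt.Println(err)
}
```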

17-smoke regression all green. Acceptance gates G3.8.A (Archon-shape
workflow loads + executes topologically) + G3.8.B (per-node ObservedOps)
+ G3.8.C ($prior_node.output ref resolves, error on missing ref) all
satisfied. G3.8.D (in-process matrix.search dispatch) deferred until
a real mode is wired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 20:34:30 -05:00