4 Commits

Author SHA1 Message Date
root
2c71d1c637 ADR-005: observer fail-safe semantics
Closes the OPEN item from STATE_OF_PLAY. Required because observerd is
now on the prod-realistic data path via the lift harness boot (b2e45f7),
so the next consumer (scrum runner / distillation rebuild / production
workflow) needs the fail-safe rationale locked, not implicit.

The Rust "verdict:accept on crash" anti-pattern doesn't translate
one-to-one to the Go observer, which is a witness, not a gate. But four
adjacent fail-safe decisions are real and live:

5.1 Persist failure is logged-not-fatal; ring is in-flight source of
    truth. Persist-required mode deferred to a future opt-in ADR.

5.2 Mode failure → Success=false, no panic-swallow path. The runner
    catches mode errors and surfaces them via node.Error; downstream
    consumers see failures explicitly rather than as fake successes
    (the Rust anti-pattern surface).

5.3 One row per node, recorded post-run. A workflow with N nodes
    produces N audit rows, never a per-workflow catch-all that
    survives partial crashes. Known gap: recording happens after
    runner.Run returns (acceptable for short workflows; streaming
    callback is the right shape when workflows get longer).

5.4 /observer/event accepts on full ring (oldest evicted). Refusing
    to write would translate every burst into client errors — wrong
    direction for an audit witness.
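
A minimal sketch of the 5.4 accept-on-full behavior, assuming a
fixed-capacity ring. The Event and Ring types and their methods are
hypothetical names for illustration, not the observerd code, and the
sketch leaves out locking:

```go
package main

import "fmt"

// Event is a hypothetical audit event; the real observer type is not shown here.
type Event struct {
	Seq int
}

// Ring is a fixed-capacity buffer that evicts the oldest event when full.
// It is not safe for concurrent use; a real service would add a mutex.
type Ring struct {
	buf   []Event
	start int // index of the oldest stored event
	size  int // number of stored events
}

// NewRing returns a ring holding at most capacity events (minimum 1).
func NewRing(capacity int) *Ring {
	if capacity < 1 {
		capacity = 1
	}
	return &Ring{buf: make([]Event, capacity)}
}

// Push always accepts the event. It reports whether an older event
// had to be evicted to make room.
func (r *Ring) Push(e Event) (evicted bool) {
	if r.size < len(r.buf) {
		r.buf[(r.start+r.size)%len(r.buf)] = e
		r.size++
		return false
	}
	// Full: overwrite the oldest slot and advance the start index.
	r.buf[r.start] = e
	r.start = (r.start + 1) % len(r.buf)
	return true
}

// Snapshot returns the stored events, oldest first.
func (r *Ring) Snapshot() []Event {
	out := make([]Event, 0, r.size)
	for i := 0; i < r.size; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	return out
}

func main() {
	r := NewRing(2)
	for i := 1; i <= 3; i++ {
		r.Push(Event{Seq: i})
	}
	fmt.Println(r.Snapshot()) // prints [{2} {3}]
}
```

Push never returns an error, so a burst costs the oldest entries
rather than surfacing as client failures.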

Mostly ratifies existing behavior; cross-checked claims against
actual code (caught one error in Decision 5.3 draft — recording is
post-run-batched, not per-node-as-it-completes — and the ADR now
states reality).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 06:32:12 -05:00
root
2a6234ff82 ADR-004 + internal/pathway: Mem0 versioned trace substrate
Closes Sprint 2 design-bar work (audit reports/scrum/sprint-backlog.md):
  S2.1 — ADR-004 documents the pathway-memory data model
  S2.2 — pathway port lands with deterministic fixture corpus
         and full test coverage on day one
  S2.3 — retired traces are excluded from retrieval (test
         passes; would fail without the filter)

Mem0-style operations: Add / AddIdempotent / Update / Revise /
Retire / Get / History / Search. Each operation is a method on
Store; persistence is JSONL append-only with corruption recovery
on Replay.
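
A rough sketch of replay over an append-only JSONL log with
corrupted-line and apply-error skipping, as described above. Replay,
ReplayString, errRejected and the 16 MiB line cap are assumptions made
for illustration; the actual persistor API is not shown in this commit:

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errRejected is a hypothetical sentinel used only by the demo apply func.
var errRejected = errors.New("rejected by apply")

// Replay reads one JSON document per line and hands each valid line to
// apply. Lines that are not valid JSON, or that apply rejects, are
// counted as skipped instead of aborting the whole replay.
func Replay(r io.Reader, apply func(line []byte) error) (applied, skipped int, err error) {
	sc := bufio.NewScanner(r)
	// Raise the default 64 KiB token limit so long lines do not stop the scan.
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue // blank lines are neither applied nor skipped
		}
		if !json.Valid(line) {
			skipped++
			continue
		}
		// Copy the line: the scanner reuses its buffer between calls.
		rec := append([]byte(nil), line...)
		if applyErr := apply(rec); applyErr != nil {
			skipped++
			continue
		}
		applied++
	}
	return applied, skipped, sc.Err()
}

// ReplayString is a convenience wrapper for in-memory logs.
func ReplayString(log string, apply func(line []byte) error) (int, int, error) {
	return Replay(strings.NewReader(log), apply)
}

func main() {
	log := "{\"uid\":\"a\"}\nnot json\n\n{\"uid\":\"b\"}\n{\"reject\":true}\n"
	applied, skipped, err := ReplayString(log, func(line []byte) error {
		if bytes.Contains(line, []byte("reject")) {
			return errRejected
		}
		return nil
	})
	fmt.Println(applied, skipped, err) // prints 2 2 <nil>
}
```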

internal/pathway/types.go     Trace + event + SearchFilter + sentinel errors
internal/pathway/store.go     in-memory state + RWMutex + ops
internal/pathway/persistor.go JSONL append-only log with replay
internal/pathway/store_test.go  20 test funcs covering all 7
                                Sprint 2 claim rows + concurrency
internal/pathway/persistor_test.go  6 test funcs covering missing-
                                file, corruption recovery, long-line
                                handling, parent-dir auto-create,
                                apply-error skip behavior

Sprint 2 claim coverage row-by-row:
  ADD          TestAdd_AssignsUIDAndTimestamps + TestAdd_RejectsInvalidJSON
  UPDATE       TestUpdate_ReplacesContentSameUID + TestUpdate_MissingUID_Errors
  REVISE       TestRevise_LinksToPredecessorViaHistory +
               TestRevise_PredecessorMissing_Errors +
               TestRevise_ChainOfThree_BackwardWalk
  RETIRE       TestRetire_ExcludedFromSearch +
               TestRetire_StillAccessibleViaGet +
               TestRetire_StillAccessibleViaHistory
  HISTORY/cycle TestHistory_CycleDetected (injected via internal map),
                TestHistory_PredecessorMissing_TruncatesChain,
                TestHistory_UnknownUID_ErrorsClean
  REPLAY/dup   TestAddIdempotent_IncrementsReplayCount (locks the
               "replay preserves original content" rule per ADR-004)
  CORRUPTION   TestPersistor_CorruptedLines_Skipped +
               TestPersistor_ApplyError_Skipped
  ROUND-TRIP   TestPersistor_RoundTrip locks the full Save → fresh
               Store → Load → Stats-match contract

Two real bugs caught during testing:
  - Add returned the same *Trace stored in the map, so callers
    holding a reference saw later mutations. Fixed: clone before
    return (matches Get's contract). Same fix in AddIdempotent
    + Revise.
  - Test typo: {"v":different} isn't valid JSON; AddIdempotent's
    json.Valid rejected it as ErrInvalidContent. Test fixed to
    use {"v":"different"}; the validation behavior is correct.
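
The aliasing bug and its fix can be sketched as follows. This is an
illustration under assumptions, not the internal/pathway code: the
Trace fields, the counter-based UID, and the clone helper are
hypothetical; only ErrInvalidContent and the json.Valid check are
named in this commit:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// ErrInvalidContent is returned when content is not valid JSON.
var ErrInvalidContent = errors.New("invalid content")

// Trace is a reduced, hypothetical version of the stored record.
type Trace struct {
	UID     string
	Content json.RawMessage
}

// clone returns a deep copy so callers never share memory with the store.
func (t *Trace) clone() *Trace {
	c := *t
	c.Content = append(json.RawMessage(nil), t.Content...)
	return &c
}

// Store keeps traces in memory behind an RWMutex.
type Store struct {
	mu     sync.RWMutex
	traces map[string]*Trace
	next   int // hypothetical UID counter
}

func NewStore() *Store {
	return &Store{traces: make(map[string]*Trace)}
}

// Add validates and stores content, then returns a clone rather than
// the pointer held in the map.
func (s *Store) Add(content []byte) (*Trace, error) {
	if !json.Valid(content) {
		return nil, ErrInvalidContent
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.next++
	tr := &Trace{
		UID:     fmt.Sprintf("t%d", s.next),
		Content: append(json.RawMessage(nil), content...),
	}
	s.traces[tr.UID] = tr
	return tr.clone(), nil
}

// Get returns a clone of the stored trace, if present.
func (s *Store) Get(uid string) (*Trace, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	tr, ok := s.traces[uid]
	if !ok {
		return nil, false
	}
	return tr.clone(), true
}

func main() {
	s := NewStore()
	tr, _ := s.Add([]byte(`{"v":"a"}`))
	tr.Content[6] = 'z' // mutate the caller's copy only
	got, _ := s.Get(tr.UID)
	fmt.Println(string(got.Content)) // prints {"v":"a"}
}
```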

Skipped this commit (next):
  - cmd/pathwayd HTTP binary
  - gateway routing for /v1/pathway/*
  - end-to-end smoke
  These add the wire surface; the substrate ships first so the
  wire layer can be a pure proxy in the next commit.

Verified:
  go test -count=1 ./internal/pathway/ — 26 tests green
  just verify                          — vet + test + 9 smokes 34s

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:23:30 -05:00
root
0d18ffa780 ADR-003: inter-service auth posture — Bearer + IP allowlist
Locks in the auth model that R-001 + R-007 will be retrofitted
against. Doc-only — wiring deferred to Sprint 1 when the first
non-loopback binding is needed.

Decision: Bearer token (from secrets-go.toml [auth] section) + IP
allowlist (CIDR list). Both layers required when auth is on; empty
token = G0 dev no-op. /health exempt.

Implementation shape (when it lands):
  - internal/shared/auth.go middleware: one chi r.Use line per binary
  - shared.Run gates: refuses non-loopback bind without configured token
  - subtle.ConstantTimeCompare for token equality (timing-safe)
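
A hedged sketch of what that middleware could look like, since the
wiring has not landed. AuthConfig, ParseCIDRs, Authorized and the 401
response are assumptions for illustration. It uses only net/http; the
func(http.Handler) http.Handler shape is what chi's r.Use accepts:

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net"
	"net/http"
	"strings"
)

// AuthConfig is a hypothetical holder for the token and CIDR allowlist.
type AuthConfig struct {
	Token string
	Allow []*net.IPNet
}

// ParseCIDRs converts CIDR strings into networks, failing on the first bad entry.
func ParseCIDRs(cidrs []string) ([]*net.IPNet, error) {
	out := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			return nil, fmt.Errorf("bad CIDR %q: %w", c, err)
		}
		out = append(out, n)
	}
	return out, nil
}

// Authorized applies both layers: the caller IP must be in the allowlist
// and the Bearer token must match. An empty token disables auth (dev
// no-op) and /health is always exempt.
func (c AuthConfig) Authorized(path, remoteAddr, authHeader string) bool {
	if c.Token == "" || path == "/health" {
		return true
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false
	}
	allowed := false
	for _, n := range c.Allow {
		if n.Contains(ip) {
			allowed = true
			break
		}
	}
	if !allowed {
		return false
	}
	got, ok := strings.CutPrefix(authHeader, "Bearer ")
	if !ok {
		return false
	}
	// Constant-time comparison avoids leaking token contents via timing.
	return subtle.ConstantTimeCompare([]byte(got), []byte(c.Token)) == 1
}

// Middleware rejects unauthorized requests with 401 before they reach next.
func (c AuthConfig) Middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !c.Authorized(r.URL.Path, r.RemoteAddr, r.Header.Get("Authorization")) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	nets, err := ParseCIDRs([]string{"127.0.0.0/8"})
	if err != nil {
		panic(err)
	}
	c := AuthConfig{Token: "s3cret", Allow: nets}
	fmt.Println(c.Authorized("/v1/x", "127.0.0.1:5000", "Bearer s3cret")) // prints true
	fmt.Println(c.Authorized("/v1/x", "10.0.0.1:5000", "Bearer s3cret"))  // prints false
}
```

Keeping the decision in a pure Authorized method makes both layers
testable without spinning up an HTTP server.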

Alternatives considered + rejected:
  mTLS         — too heavy for single-machine inter-service traffic
  JWT          — buys nothing over Bearer without external IdP
  IP-only      — one stolen IP entry = full access; no defense depth
  OAuth2       — no external IdP commitment in G0-G3 timeline

What this doesn't do:
  - Doesn't implement anything (code lands in Sprint 1)
  - Doesn't break G0 dev (empty token = middleware no-op)
  - Doesn't address gateway→end-user auth (different ADR shape)

Closes the design-decision blocker for R-001 and R-007. Wiring
ticket: Sprint 1 backlog story S1.2.

Also lifts ADR-002 (storaged per-prefix PUT cap) into the doc —
it was implemented in 423a381 but not yet recorded as an ADR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:05:59 -05:00
Claw
f07668064e docs: seed PRD + SPEC for the Go-direction rewrite
Two documents only — no Go code yet. PRD restates the problem and
preserves the Rust PRD's invariants verbatim, then maps the locked
stack to Go libraries and surfaces four hard problems (DuckDB-via-cgo
for the query engine, Lance dropped, Dioxus → HTMX, arrow-go maturity).
SPEC walks each Rust crate + TS surface and tags the port with library
choice / effort estimate / risk + a 5-phase migration plan from
skeleton (Phase G0) to demo parity (Phase G5).

Six open questions remain that gate Phase G0:
- DuckDB cgo OK?
- HTMX vs React for the UI?
- Repo location?
- Distillation v1.0.0 port verbatim or rebuild?
- Pathway memory data — port 88 traces or start clean?
- Auditor lineage — port audit_baselines.jsonl or restart?

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:35:23 -05:00