From 92df0e930aac01c09a35a8bc270504a2ae62453e Mon Sep 17 00:00:00 2001
From: root <root@island37.com>
Date: Fri, 24 Apr 2026 05:40:59 -0500
Subject: [PATCH] ADR-021: semantic-correctness layer on pathway_memory

Spec for the compounding-bug-grammar insight from J's feedback on the
queryd/delta.rs unit-mismatch fix (86901f8). Adds three proposed fields
to PathwayTrace (semantic_flags, type_hints_used, bug_fingerprints),
9 initial SemanticFlag variants, and the truth::evaluate review-time
task_class pattern that reuses existing primitives instead of building
a type-inference engine. Implementation pending approval on the flag
set and fingerprint shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
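Reviewer note (not part of the commit message): a minimal Rust sketch of
the shape ADR-021 proposes in the diff below, to make the flag set and
fingerprint shape easier to discuss. It is an assumption-laden
illustration, not the pending implementation. Only the three field names,
the nine `SemanticFlag` variants, and the `{lhs_unit, rhs_unit, op}`
fingerprint example come from the ADR. The `TypeHint` fields, `digest`,
`field_contains_any`, and `example_trace` are hypothetical, and FNV-1a
stands in for the SHA the ADR names so the sketch needs no dependencies.

```rust
/// The nine initial variants listed in ADR-021.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SemanticFlag {
    UnitMismatch,
    TypeConfusion,
    NullableConfusion,
    OffByOne,
    StaleReference,
    PseudoImpl,
    DeadCode,
    WarningNoise,
    BoundaryViolation,
}

/// Schema/type context handed to the reviewer (field layout is an assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TypeHint {
    pub source: String, // e.g. "catalogd", "arrow", "rust_struct"
    pub name: String,
    pub ty: String,
}

/// Structural pattern of a bug, e.g. `rows - files`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BugFingerprint {
    pub lhs_unit: String,
    pub rhs_unit: String,
    pub op: String,
}

impl BugFingerprint {
    /// Stable digest over the canonical form "lhs|op|rhs".
    /// Assumption: 64-bit FNV-1a replaces the SHA named in the ADR.
    pub fn digest(&self) -> u64 {
        let canonical = format!("{}|{}|{}", self.lhs_unit, self.op, self.rhs_unit);
        let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
        for byte in canonical.bytes() {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
        hash
    }
}

/// Only the three proposed fields; the existing PathwayTrace fields are omitted.
#[derive(Debug, Default)]
pub struct PathwayTrace {
    pub semantic_flags: Vec<SemanticFlag>,
    pub type_hints_used: Vec<TypeHint>,
    pub bug_fingerprints: Vec<BugFingerprint>,
}

/// Hypothetical stand-in for a `FieldContainsAny` rule: true if the field
/// value contains at least one needle as a substring.
pub fn field_contains_any(field_value: &str, needles: &[&str]) -> bool {
    needles.iter().any(|needle| field_value.contains(needle))
}

/// How the delta.rs row/file mismatch (86901f8) might be recorded.
pub fn example_trace() -> PathwayTrace {
    PathwayTrace {
        semantic_flags: vec![SemanticFlag::UnitMismatch],
        type_hints_used: vec![TypeHint {
            source: "rust_struct".to_string(),
            name: "delta_count".to_string(),
            ty: "usize".to_string(),
        }],
        bug_fingerprints: vec![BugFingerprint {
            lhs_unit: "rows".to_string(),
            rhs_unit: "files".to_string(),
            op: "-".to_string(),
        }],
    }
}
```

Hashing a canonical string rather than the struct's derived `Hash` keeps
the digest independent of the compiler and standard-library version, which
is the property "stable" needs for retrieval across iterations.
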
 docs/DECISIONS.md | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/docs/DECISIONS.md b/docs/DECISIONS.md
index e8ef0de..3cafa35 100644
--- a/docs/DECISIONS.md
+++ b/docs/DECISIONS.md
@@ -99,3 +99,8 @@
 **Date:** 2026-04-19
 **Decision:** `catalogd::Registry::register(name, fingerprint, objects)` is idempotent on `name`. If no manifest for `name` exists, create one. If one exists with the same `schema_fingerprint`, reuse its `DatasetId`, replace `objects`, bump `updated_at`, and write through. If one exists with a different `schema_fingerprint`, reject with `409 Conflict` (HTTP) / `FAILED_PRECONDITION` (gRPC). A one-shot operator endpoint `POST /catalog/dedupe` collapses any pre-existing duplicates (preferring the manifest with a non-null `row_count`, then the most recently updated).
 **Rationale:** Registry was keyed by surrogate `DatasetId` with no uniqueness constraint on `name`, so every caller that re-registered (re-ingest, external cron, gRPC retry) silently created a parallel manifest pointing at the same parquet — accumulating 308× `successful_playbooks` in live state before detection. The fingerprint gate turns re-ingest into an explicit no-op (matching PRD invariant #5 "ingestd is idempotent — re-ingesting the same file is a no-op") while forcing schema drift to be visible instead of silently clobbering. 409 status separates policy rejections from server errors, which matters for the Phase 12 tool-consumer ecosystem. Concurrency: the write lock is held across the storage write to close the check→insert TOCTOU window; serializing registers is acceptable because registers-per-second is low on the ingest path. Audit: idempotent-register events are visible as bumps to the stored manifest's `updated_at` field and in `catalogd` tracing output (tracing is non-durable, operator view only); `DedupeReport` is the return-value audit for cleanup runs. No event-journal entries are emitted — ADR-012 scopes the journal to row-level mutations, not catalog-manifest operations.
+
+## ADR-021: Semantic-correctness layer on pathway_memory — matrix-indexed bug grammar
+**Date:** 2026-04-24
+**Decision:** Extend `pathway_memory::PathwayTrace` (ADR added 2026-04-24 in the same commit as this one) with a semantic-correctness layer so the matrix index compounds recognition of unit/type/shape bugs across iterations. Three new fields: `semantic_flags: Vec<SemanticFlag>` (enum: `UnitMismatch`, `TypeConfusion`, `NullableConfusion`, `OffByOne`, `StaleReference`, `PseudoImpl`, `DeadCode`, `WarningNoise`, `BoundaryViolation`), `type_hints_used: Vec<TypeHint>` (the schema/type context the reviewer was given: catalogd column types for SQL-touching code, Arrow `RecordBatch.schema()` accessors for Rust, Rust struct field types for everything else), and `bug_fingerprints: Vec<BugFingerprint>` (a structural pattern, e.g. `{lhs_unit: "rows", rhs_unit: "files", op: "-"}`, hashed to a stable SHA for similarity retrieval). Before review, the scrum pipeline queries the matrix index for bug fingerprints recorded against this file's narrow fingerprint (the same `task_class + file_prefix + signal_class` key as hot-swap) and prepends them to the reviewer prompt as "watch for these patterns historically found here." The reviewer prompt requires each finding to be tagged with a `semantic_flag`. `truth::evaluate()` gets a review-time task_class (`code_review.unit_check`) that consumes parsed-fact rules like `FieldContainsAny { field: "code_expression", needles: ["row_count - file_count", "bytes_read - row_count"] }`, the same primitive we use for the SQL guard in P42-002.
+**Rationale:** The 2026-04-24 `queryd/src/delta.rs` `base_rows = pre_filter_rows - delta_count` bug (86901f8) was found by a human reading the code and noticing that the units didn't match. The hardened mechanical applier *cannot* catch this: its gates are syntactic (warning count, patch size, rationale-token alignment), not semantic. At 100 bugs this deep, no human catches them all; the signal→commit loop is capped by what humans can notice per iteration. We already ship the primitives: `catalogd` knows column types per dataset, Arrow `RecordBatch.schema()` is on every hot-path call, `truth::evaluate()` runs arbitrary field conditions at runtime, and `shared/arrow_helpers` has typed row/byte/file accessors. All of this is used at RUNTIME; none of it is fed into the REVIEW pipeline. Semantic flags + bug fingerprints turn the matrix index from "what review happened" (current) into "what category of bug appeared where" (compounding), so the iter-20 scrum on `crates/queryd/src/` prefaces its review prompts with "this crate had a row/file unit mismatch in iter 7 (delta.rs:189); check every arithmetic on `*_count` variables." Non-goals: (1) we are NOT building a full type-inference engine (reuse Rust's `rustdoc`-level type info for structs, Arrow's schema for RecordBatch, catalogd's column types for SQL; everything beyond that is Phase 3); (2) this is not a linter: clippy/rustc already catch syntactic issues, whereas this layer catches SEMANTIC ones (same type, wrong units/role). Bootstrap path: start with the 9 `SemanticFlag` variants above; add a new variant only when a bug is found that doesn't fit an existing one. Gate alignment with hot-swap: a pathway that repeatedly produces bugs of the same `SemanticFlag` variant on the same narrow fingerprint is more valuable as a "watch this file for X" signal than as a hot-swap candidate, so retirement logic needs to consider both replay success_rate AND whether the pathway is serving as a bug-pattern beacon.