root ee31424d0c
Some checks failed
lakehouse/auditor 4 blocking issues: todo!() macro call in tests/real-world/scrum_master_pipeline.ts
ADR-021 Phase D: bug_fingerprint pattern extraction from reviewer output
Fills the gap between Phase B (flags tagged) and Phase C (preamble
quotes past fingerprints): parses each reviewer line that mentions a
Flag variant, collects backtick-quoted identifiers, canonicalizes them
(sorted alphabetically, top 3), and emits a stable pattern_key of
shape `{Flag}:{tok1}-{tok2}-{tok3}`.

Stability by design: canonical sort means "row_count + QueryResponse"
and "QueryResponse + row_count" produce the same key, so variation in
reviewer prose doesn't fragment the index. Top-3 cap keeps keys short
while retaining enough signal to separate different bugs of the same
category.

Dry-run validation on iter-8 delta.rs output (crates/queryd prefix)
extracted 10 semantically meaningful fingerprints including:
  - UnitMismatch:base_rows-checked_add-checked_sub
  - DeadCode:queryd::delta::write_delta (P9-001 dead-function finding)
  - BoundaryViolation:can_access-log_query-masked_columns (P13-001 gap)
  - NullableConfusion:CompactResult-DeltaError-IntegerOverflow

Cross-cutting signal: kimi-k2:1t's finding #5 explicitly quoted the
seeded pathway memory preamble ("Pathway memory flags row_count-
file_count unit mismatch") and proposed overflow-checked arithmetic as
the fix. That is the compounding loop in action — prior bug context
shifted the reviewer's attention toward a specific instance of the
same class, which produces a specific pattern_key that will compound
further on the next iter.

Filter: identifier-shaped tokens only (A-Za-z_ / :: paths / snake_case
/ CamelCase). Skips punctuation, prose quotes, and tokens <3 chars so
generic nouns and partial words don't pollute the index.

What's still queued (Phase E):
  - type_hints_used population from catalogd column types + Arrow schema
  - auditor → pathway audit_consensus update wire (strict-audit gate
    activation)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:02:07 -05:00
..