lakehouse

Go to file

root f4cff660aa ADR-021 Phase D fix: strip flag names + Rust keywords from pattern_keys

Iter 9 revealed two quality bugs in the extractor:

1. Kimi wraps the Flag column in backticks (\`DeadCode\`), so the flag
   name itself was captured as a code token. Result: pattern_keys like
   "DeadCode:DeadCode" that match nothing and add noise to the index.
   Fix: filter FLAG_VARIANTS out of token candidates.

2. Complex backtick content like \`Foo::bar(&self) -> u64\` was rejected
   wholesale by the identifier regex. Fallback now scans for identifier
   substrings and ranks by ::-qualified paths first, then length.
   Bonus: filter Rust keywords (self, mut, async, etc) since they're
   grammar, not bug-shape signal.

Dry-run on iter 9 delta.rs output produces semantically meaningful keys:
  DeadCode:DeltaStats::tombstones_applied
  NullableConfusion:DeltaError-DeltaStats-apply_delta
  BoundaryViolation:apply_delta-journald::emit-rows_dropped_by_tombstones
  PseudoImpl:apply_delta-delta_ops-validate_schema

These are stable under reviewer prose variation (canonical sort + top-3
slice) and precise enough to separate different bugs within the same
Flag category.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-24 06:05:50 -05:00

auditor

Audit pipeline PR #9 : determinism + fact extraction + verifier gate + KB stats + context injection (PR #9 )

2026-04-23 05:29:38 +00:00

bot

Control-plane pivot: Phase 38-44 plan + bot scaffold

2026-04-22 02:43:31 -05:00

config

Phase 40: Routing Engine + Policy

2026-04-23 02:36:45 -05:00

crates

ADR-021: semantic-correctness layer lands in pathway_memory (A+B+C)