profit 181c35b829 scrum_master fact extraction + verifier gate + schema_version bump
Three bundled changes that round out the KB enrichment pipeline
(PR #9 commits B/C/D compressed into one — they all touch the same
persist surfaces so splitting them would just add noise):

B. scrum_master reviews now route accepted review bodies through
   fact_extractor (same llm_team extract pipeline as inference) and
   append to data/_kb/audit_facts.jsonl tagged source:"scrum_review".
   One KB, two producers — downstream consumers can filter by source
   when they care about provenance. Skips reviews <120 chars
   (one-liners / LGTM-type comments with no extractable knowledge).

C. Verifier-gated fact persistence. fact_extractor now parses the
   verifier's free-form prose into per-fact verdicts (CORRECT /
   INCORRECT / UNVERIFIABLE / UNCHECKED). Facts marked INCORRECT are
   dropped on write; CORRECT + UNVERIFIABLE + UNCHECKED are kept
   (dropping UNVERIFIABLE would lose ~90% of real signal — the
   verifier's prior-knowledge base doesn't know Lakehouse internals,
   so domain-specific facts read as UNVERIFIABLE by default).

   verifier_verdicts array is persisted alongside facts so downstream
   queries can surface high-confidence facts (CORRECT) separately
   from provisional ones (UNVERIFIABLE).

   schema_version:2 added to both scrum_reviews.jsonl and
   audit_facts.jsonl writes. Old (v1) rows remain readable; new rows
   get the field so the forward-compat reader in kb_query can
   differentiate.

D. scrum_master_reviewed:true flag added to scrum_reviews.jsonl
   rows on accept. Future kb_query surfacing can filter by this
   (e.g., "show me PRs where a scrum review exists vs only inference"
   as governance signal). Also carried into audit_facts.jsonl when
   the scrum_review source path writes there.
2026-04-22 23:40:21 -05:00
..
2026-04-22 03:54:18 -05:00