Reset gateway audit substrate after /tmp wipe disabled it on reboot:
- LH_SUBJECT_AUDIT_KEY moved /tmp/lakehouse_audit/ → /etc/lakehouse/
(canonical persistent path per spec line 112; /tmp wipes on reboot
and silently disabled /audit + /biometric endpoints)
- Fresh 32-byte HMAC key + 44-char legal token at /etc/lakehouse/, mode 0400
- Systemd drop-in updated; gateway restarted; both endpoints 200
- Pre-rotation chains for WORKER-{1..5} (backfill data) will now
fail tamper verification under the new key; this is expected and correct after rotation
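Why pre-rotation chains trip tamper detection can be seen in a minimal sketch. It assumes each audit entry is HMAC-SHA256'd together with the previous tag; the real chain layout is not shown in this commit, and the function names, genesis value, and entry strings below are hypothetical.

```python
import hashlib
import hmac
import secrets


def chain_tags(key: bytes, entries: list[bytes]) -> list[bytes]:
    """Tag each entry with HMAC(key, previous_tag + entry)."""
    tags = []
    prev = b"\x00" * 32  # assumed fixed genesis value
    for entry in entries:
        prev = hmac.new(key, prev + entry, hashlib.sha256).digest()
        tags.append(prev)
    return tags


def verify_chain(key: bytes, entries: list[bytes], tags: list[bytes]) -> bool:
    """Recompute the chain under `key` and compare in constant time."""
    expected = chain_tags(key, entries)
    return len(expected) == len(tags) and all(
        hmac.compare_digest(a, b) for a, b in zip(expected, tags)
    )


old_key = secrets.token_bytes(32)  # stands in for the key lost with /tmp
new_key = secrets.token_bytes(32)  # stands in for the fresh 32-byte key
entries = [b"WORKER-1:read", b"WORKER-1:export"]  # made-up audit entries

old_tags = chain_tags(old_key, entries)
ok_under_old = verify_chain(old_key, entries, old_tags)
ok_under_new = verify_chain(new_key, entries, old_tags)
```

Under these assumptions the chain verifies with the key that wrote it and fails with the rotated key, which is the behavior the last bullet describes as expected.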
Anchor wave-table backfilled with 3 commits that landed after the
last STATE_OF_PLAY refresh on 2026-05-03 evening:
- 7e0112b: retention_sweep stray indent fix
- 848a458: Phase 1.6 Gate 5 erasure endpoint POST /biometric/.../erase
- 8ec43e0: Phase 1.6 Gate 3b deepface integration design doc
Phase 1.6 status table: Gate 5 → eng-DONE; Gate 3b → design-doc-shipped
(recommends Option C defer). Calendar bottleneck text updated.
.gitignore extended for runtime ephemera that surfaced this session:
- data/biometric/ (BIPA-quarantined photos, regulated data)
- reports/scrum/ (local-only review forensics per feedback_audit_findings_log.md)
- experiments/ (per "experiments stay out of tracked tree" policy)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents the 13-commit wave end-to-end:
- SUBJECT_MANIFESTS_ON_CATALOGD spec Steps 1-8 SHIPPED
- 5/7 Phase 1.6 BIPA gates engineering-complete or eng-staged
- 6th cross-runtime parity probe (subject_audit, 6/6 byte-identical)
Status table for Phase 1.6 with evidence pointers per item. Operational
state captures the LH_SUBJECT_AUDIT_KEY + LH_LEGAL_AUDIT_TOKEN_FILE
systemd configuration so next session knows what's in place.
Cataloged the three runtime-divergence classes the parity probe loop
caught + structurally killed (omitempty stripping, time.RFC3339Nano
truncation, json.Marshal HTML-escape). Future Go↔Rust work can
reference these patterns instead of rediscovering them.
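For reference, the three classes correspond to documented Go defaults: `omitempty` drops zero-valued fields, `time.RFC3339Nano` trims trailing zeros from fractional seconds, and `json.Marshal` escapes `<`, `>`, and `&` as `\u003c`-style sequences. A language-neutral way to sidestep all three is to hash a canonical encoding instead of the runtime's default output. The sketch below is in Python for brevity; the helper names and record fields are hypothetical and not taken from the parity probe.

```python
import json


def fixed_nanos(ts: str) -> str:
    """Pad a UTC RFC 3339 timestamp to exactly nine fractional digits."""
    if not ts.endswith("Z"):
        raise ValueError("sketch only handles UTC timestamps ending in 'Z'")
    whole, _, frac = ts[:-1].partition(".")
    return f"{whole}.{frac.ljust(9, '0')}Z"


def canonical_bytes(record: dict, required_keys: tuple) -> bytes:
    """Encode with every key present, sorted keys, and no HTML escaping."""
    # Missing keys are filled explicitly so "omitted" and "empty" match.
    filled = {key: record.get(key, "") for key in required_keys}
    text = json.dumps(
        filled, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


KEYS = ("note", "reason", "ts")

# One side omitted an empty field and trimmed the fractional zeros.
side_a = canonical_bytes(
    {"ts": fixed_nanos("2026-05-03T10:00:00.5Z"), "note": "a<b"}, KEYS
)
# The other side kept the empty field and wrote milliseconds.
side_b = canonical_bytes(
    {"ts": fixed_nanos("2026-05-03T10:00:00.500Z"), "note": "a<b", "reason": ""},
    KEYS,
)
```

With these inputs both sides produce the same bytes, so a digest over them would agree across runtimes.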
Calendar bottleneck is now counsel review of items 1/2/5/6 — engineering
has staged everything it can without legal sign-off. Engineering long
pole is Gate 3b (deepface), deferred for design conversation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five new entries to prevent today's cleanup from being undone by future
sessions or future PRs that don't read the full context:
- PRD line 70 load-bearing — local-only on customer hot path. PR #13's
cloud-routing defaults reverted (d054c0b). Cloud is opt-in dev-only.
- /v1/usage by_provider=ollama is the canary. Anything else for
customer-shape traffic = regression.
- ./scrum is a TOOL, not architecture. Outputs to data/_kb/
scrum_findings.jsonl. Findings inform dev, do NOT auto-fold into
design docs.
- Test code in main is actively being cleaned. Today: 12 files / ~2900
LOC removed (commits 6aafd41 + f4ebd22). Surface more candidates,
don't auto-delete unless clearly orphaned.
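The by_provider canary could be checked mechanically. This is a sketch only: the response shape (`{"by_provider": {name: count}}`) and the helper name are assumptions, since the /v1/usage payload is not shown here.

```python
def non_local_providers(usage: dict, allowed: tuple = ("ollama",)) -> list:
    """Return providers with traffic that are not on the local allow-list."""
    by_provider = usage.get("by_provider", {})  # assumed response field
    return sorted(
        name
        for name, count in by_provider.items()
        if name not in allowed and count > 0
    )


# Hypothetical payloads: the first is healthy, the second is a regression.
healthy = non_local_providers({"by_provider": {"ollama": 120}})
regressed = non_local_providers({"by_provider": {"ollama": 5, "cloud_x": 2}})
```

An empty list means customer-shape traffic stayed local; any name in the list is the regression signal described above.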
The intent: future me (or a future Claude session) reads STATE_OF_PLAY
on cold-start, sees these entries, and doesn't repeat the
mistakes that drifted scope today.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
J flagged that smoke + parity tests prove the surface compiles, NOT
that an audit response can be produced for a specific person — and the
staffing client won't sign without defensible discrimination-claim
response capability.
New docs/AUDIT_TRAIL_PRD.md captures:
- worked example: John Martinez at Warehouse B requests audit
- subject audit response output format (per-decision row schema)
- surface map: where decisions happen today, where the gaps are
- PII handling rules (tokenization, protected-attribute exclusion,
inferred-attribute risk)
- identity service design intent (separate daemon, audited reads)
- retention + right-to-be-forgotten policy intent
- 9-phase implementation sequence with explicit per-phase exit criteria
- cross-runtime requirement (both Rust + Go must satisfy)
- 7 open questions blocking phase 2+ that need J's call
STATE_OF_PLAY + PRD updated with explicit "production-ready blocker"
section pointing at the new doc. The "substrate is shipped" framing
gets a caveat: substrate ≠ production-ready until audit phase 9 exits.
No code changes. This is the planning artifact J asked for before we
start building.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Anchor was 5 days stale. Adds the 12-commit wave (Lance backend hardening,
sidecar drop, observability parity, gitignore cleanup, gray-zone content
add) with verification status for each. Updates DO NOT RELITIGATE with
the 4 new things this wave makes load-bearing:
- python sidecar dropped from hot path (don't wire it back)
- lance gauntlet shipped (don't re-discover the bugs we just fixed)
- 32/32 cross-runtime parity (don't build a 6th probe for already-covered surface)
- ARCHITECTURE_COMPARISON.md is the single source of truth for cross-runtime decisions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Worker cards now ship a real photo per person instead of monogram tiles:
- fetch_face_pool.py pulls 1000 faces from thispersondoesnotexist.com
- tag_face_pool.py runs deepface for gender/race/age, excludes <22yo
- manifest.jsonl: 952 servable, gender/race buckets populated
- /headshots/_thumbs/ pre-resized to 384px webp (587KB -> 11KB,
~53x smaller; without this Chrome's parallel-connection budget
drops ~75% of tiles in a 40-card grid)
- /headshots/:key gender x race x age intersection bucketing with
gender-only fallback when intersection is sparse
- /headshots/generate/:key ComfyUI on-demand for the contractor
profile spotlight (cold ~1.5s, cached ~1ms; worker-derived
djb2 seed makes faces deterministic-per-worker but unique
across workers sharing the same prompt)
- serve_imagegen.py _cache_key() now includes seed (was caching
by prompt only -> 3 different worker seeds collapsed to 1
cached image; verified fix produces 3 distinct md5s)
- confidence-default name resolution: Xavier->man+hispanic,
Aisha->woman+black, etc. Every worker resolves to a bucket.
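The intersection bucketing with gender-only fallback can be sketched as follows. The manifest field names (`file`, `gender`, `race`, `age_band`) and the `min_size` sparsity threshold are assumptions for illustration, not the actual manifest.jsonl schema.

```python
from collections import defaultdict


def build_buckets(manifest: list) -> tuple:
    """Index faces by (gender, race, age_band) and by gender alone."""
    exact = defaultdict(list)
    by_gender = defaultdict(list)
    for row in manifest:
        exact[(row["gender"], row["race"], row["age_band"])].append(row["file"])
        by_gender[row["gender"]].append(row["file"])
    return exact, by_gender


def pick_face(exact, by_gender, gender, race, age_band, index, min_size=2):
    """Use the intersection bucket, or the gender bucket when it is sparse."""
    pool = exact.get((gender, race, age_band), [])
    if len(pool) < min_size:  # assumed definition of "sparse"
        pool = by_gender.get(gender, [])
    if not pool:
        return None
    return pool[index % len(pool)]


# Made-up manifest rows.
manifest = [
    {"file": "a.webp", "gender": "man", "race": "hispanic", "age_band": "30s"},
    {"file": "b.webp", "gender": "man", "race": "hispanic", "age_band": "30s"},
    {"file": "c.webp", "gender": "man", "race": "white", "age_band": "40s"},
]
exact, by_gender = build_buckets(manifest)
dense_pick = pick_face(exact, by_gender, "man", "hispanic", "30s", index=3)
sparse_pick = pick_face(exact, by_gender, "man", "white", "40s", index=0)
```

Here the first lookup stays inside its two-face intersection bucket, while the second falls back to the gender-only pool because its intersection holds a single face.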
End-to-end: playwright run on /?q=forklift+operators+IL -> 21/21
cards loaded, 0 broken, all 384px webp.
Cache + binary pool gitignored; manifest tracked.
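A minimal sketch of the seed and cache-key behavior described above, assuming a 32-bit djb2 variant and an md5 over prompt plus seed. The actual `_cache_key()` body in serve_imagegen.py is not shown in this commit, so `cache_key` and the worker IDs below are hypothetical stand-ins.

```python
import hashlib


def djb2(text: str) -> int:
    """Classic djb2 string hash (h = h * 33 + byte), masked to 32 bits."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF  # 32-bit mask is an assumption
    return h


def cache_key(prompt: str, seed: int) -> str:
    """Key on prompt AND seed so different seeds never share an entry."""
    return hashlib.md5(f"{prompt}\x00{seed}".encode("utf-8")).hexdigest()


prompt = "contractor headshot"  # same prompt shared by every worker
workers = ("WORKER-1", "WORKER-2", "WORKER-3")
seeds = [djb2(worker) for worker in workers]
keys = {cache_key(prompt, seed) for seed in seeds}
prompt_only_keys = {hashlib.md5(prompt.encode("utf-8")).hexdigest() for _ in seeds}
```

Keying on the prompt alone collapses the three workers to one cache entry, which matches the bug described; adding the seed yields three distinct keys while each worker's seed stays stable across calls.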
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>