lakehouse

Author	SHA1	Message	Date
root	87b034f5f9	phase 1.6: ops dashboard + consent_versions allowlist + subject timeline tool Closes the afternoon's "all four" wave (per J's request to do all the items in one pass instead of pick-one-of-options): (1) Live demo on WORKER-100 — full lifecycle exercised end-to-end against the running gateway. 3 audit rows landed in correct order (consent_grant → biometric_collection → consent_withdrawal), chain_verified=true, photo on disk at data/biometric/uploads/WORKER-100/1778011967957907731_027b6bb1.jpg (180 bytes JFIF). retention_until=2026-06-04 (30d from withdrawal per consent template v1 §2). (2) GET /biometric/stats — read-only aggregate over all subjects. Returns counts by biometric.status + subject.status, photo count, oldest_active_retention_until, and the last 20 state-change events (consent_grant / collection / withdrawal / erasure — validator_lookup and other noise filtered out). Walks per-subject audit logs via the existing writer; cheap for 100 subjects, would want an event-stream index at 100k. Legal-tier auth (same posture as /audit). 4 unit tests. (3) /biometric/dashboard mcp-server frontend. Auto-refreshes /biometric/stats every 15s, neo-brutalist tile layout for the per-status counts + retention horizon block + recent events table with kind badges + event-kind breakdown pills. sessionStorage-backed token; logout button clears state. DOM-built throughout (textContent + createElement) — never innerHTML on audit-row values, since trace_id et al. could in theory carry operator-supplied strings. (4) consent_versions allowlist. BiometricEndpointState gains `allowed_consent_versions: Option<Arc<HashSet<String>>>`, loaded at startup from /etc/lakehouse/consent_versions.json (override via LH_CONSENT_VERSIONS_FILE). process_consent refuses unknown hashes with HTTP 400 consent_version_unknown when configured. 
Resolution semantics: - Missing file → permissive (v1 compat, warn-log) - Parse error → permissive (error-log; broken config silently going strict would be worse) - Empty array → strict, refuse all (deliberate freeze mode for "counsel hasn't signed v1 yet") - Populated → strict, lowercase-normalized comparison 5 unit tests (known/unknown/case/empty/none-permissive). Example template at ops/consent_versions.example.json with a counsel-tier deployment note. (5) scripts/staffing/subject_timeline.sh — operator one-shot pretty-print of any subject's full BIPA lifecycle. Curls /audit/subject/{id} with legal token; renders manifest summary + on-disk photo state + chronological audit chain with kind badges + chain verification status. Smoke-tested on WORKER-100 (3 rows verified). (6) STATE_OF_PLAY.md refresh. New section "afternoon wave" captures all four commits (76cb5ac, 7f0f500, 68d226c, this one) + the live demo evidence + the v1 endpoint matrix + UI/CLI inventory + the production-cutover blocking set (counsel calendar only — eng substrate is done). Verified live post-restart: - /audit/health + /biometric/health both 200 - /biometric/stats returns 100 subjects, 2 withdrawn (WORKER-2 from earlier scrum + WORKER-100 from today's demo), 1 photo on record, 6 recent state-change events - /biometric/intake + /biometric/withdraw + /biometric/dashboard all 200 on mcp-server :3700 - subject_timeline.sh on WORKER-100: chain_verified=true, chain_root=a47563ff937d50de… - 88/88 catalogd lib tests + 55/55 biometric_endpoint tests green Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:27:52 -05:00
root	b2c34b80b3	phase 1.6: lock Gate 3b = C, reconcile docs to shipped state, fix double-upload file leak Four threads landing together — all driven by the audit J asked for before production cutover. (1) Gate 3b DECIDED: Option C (defer classifications). `BiometricCollection.classifications` stays `Option<JSON> = None` in v1. `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` status flipped from "draft / awaits product" to DECIDED. Consent template + retention schedule revised to remove all "automated facial-classification" / "deepface" language so disclosed scope matches implemented scope. (2) Endpoint-path drift reconciled across 3 docs. `PHASE_1_6_BIPA_GATES.md`, `BIPA_DESTRUCTION_RUNBOOK.md`, and `biometric_retention_schedule_v1.md` had references to legacy `/v1/identity/subjects/` paths (proposed under a separate identityd daemon, never shipped) — corrected to actual shipped routes `/biometric/subject/` (catalogd-local). Schema block in PHASE_1_6_BIPA_GATES rewritten to reflect JSON `SubjectManifest.biometric_collection` substrate (not the proposed Postgres `subjects` table). (3) New operational artifacts: - `scripts/staffing/verify_biometric_erasure.sh` — checks 4 things post-erasure (manifest cleared, uploads dir empty, audit row matches, chain verified). Smoke-tested live against WORKER-2. - `scripts/staffing/biometric_destruction_report.sh` — monthly anonymized destruction-event aggregation. Smoke-tested clean. - `scripts/staffing/bundle_counsel_packet.sh` — tarballs the counsel-review packet with per-file SHA-256 manifest. - `docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md` — formal rotation procedure operationalized after the 2026-05-05 /tmp wipe incident. - `docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md` — cover note bundling all eng-staged BIPA docs for counsel review with per-doc questions, sign-off checklist, recommended review sequence. (4) Double-upload file leak fixed in `crates/catalogd/src/biometric_endpoint.rs`. 
`verify_biometric_erasure.sh` smoked WORKER-2 and surfaced a stranded photo file. Investigation showed the file was 13-byte test-fixture bytes (zero PII, no biometric content); audit timeline showed two consecutive uploads followed by one erasure — the second upload had silently overwritten manifest.data_path, orphaning the first file. Patched `process_upload` to refuse a second upload with HTTP 409 + `error: "biometric_already_collected"` when `biometric_collection.is_some()` on the manifest. Operator must explicitly POST `/biometric/subject/{id}/erase` first. Tests: new `second_upload_without_erase_returns_409` (asserts 409 + manifest pointer unchanged + first file untouched on disk). Replaced `repeated_uploads_grow_the_chain` with `upload_erase_upload_grows_the_chain_cleanly` (covers the legitimate re-collection cycle: chain grows to 3 rows). Updated `content_type_with_parameters_accepted` to use 2 distinct subjects (was using 1 subject with 2 uploads to test ct parsing — would now 409). 22/22 biometric_endpoint tests + 59/59 catalogd lib tests green post-patch. Production posture: gateway needs `cargo build --release -p gateway` + `systemctl restart lakehouse.service` to pick up the new 409 in live traffic. Counsel calendar is now the only remaining blocker for first real-photo intake. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 06:19:40 -05:00
root	03e8a91d97	STATE_OF_PLAY: 2026-05-05 — audit endpoint recovery + anchor refresh Reset gateway audit substrate after /tmp wipe disabled it on reboot: - LH_SUBJECT_AUDIT_KEY moved /tmp/lakehouse_audit/ → /etc/lakehouse/ (canonical persistent path per spec line 112; /tmp wipes on reboot and silently disabled /audit + /biometric endpoints) - Fresh 32B HMAC + 44-char legal token at /etc/lakehouse/, mode 0400 - Systemd drop-in updated; gateway restarted; both endpoints 200 - Pre-rotation chains for WORKER-{1..5} (backfill data) will now tamper-detect under the new key — expected and correct on rotation Anchor wave-table backfilled with 3 commits that landed after the last STATE_OF_PLAY refresh on 2026-05-03 evening: - 7e0112b: retention_sweep stray indent fix - 848a458: Phase 1.6 Gate 5 erasure endpoint POST /biometric/.../erase - 8ec43e0: Phase 1.6 Gate 3b deepface integration design doc Phase 1.6 status table: Gate 5 → eng-DONE; Gate 3b → design-doc-shipped (recommends Option C defer). Calendar bottleneck text updated. .gitignore extended for runtime ephemera that surfaced this session: - data/biometric/ (BIPA-quarantined photos, regulated data) - reports/scrum/ (local-only review forensics per feedback_audit_findings_log.md) - experiments/ (per "experiments stay out of tracked tree" policy) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 03:30:53 -05:00
root	47a26fdaa8	STATE_OF_PLAY: 2026-05-03 evening wave (subject-manifest substrate + Phase 1.6 BIPA) Documents the 13-commit wave end-to-end: - SUBJECT_MANIFESTS_ON_CATALOGD spec Steps 1-8 SHIPPED - 5/7 Phase 1.6 BIPA gates engineering-complete or eng-staged - 6th cross-runtime parity probe (subject_audit, 6/6 byte-identical) Status table for Phase 1.6 with evidence pointers per item. Operational state captures the LH_SUBJECT_AUDIT_KEY + LH_LEGAL_AUDIT_TOKEN_FILE systemd configuration so next session knows what's in place. Cataloged the three runtime-divergence classes the parity probe loop caught + structurally killed (omitempty stripping, time.RFC3339Nano truncation, json.Marshal HTML-escape). Future Go↔Rust work can reference these patterns instead of rediscovering them. Calendar bottleneck is now counsel review of items 1/2/5/6 — engineering has staged everything it can without legal sign-off. Engineering long pole is Gate 3b (deepface), deferred for design conversation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:11:52 -05:00
root	5f40b7a312	STATE_OF_PLAY: lock in today's reverts as DO NOT RELITIGATE Five new entries to prevent today's cleanup from being undone by future sessions or future PRs that don't read the full context: - PRD line 70 load-bearing — local-only on customer hot path. PR #13's cloud-routing defaults reverted (d054c0b). Cloud is opt-in dev-only. - /v1/usage by_provider=ollama is the canary. Anything else for customer-shape traffic = regression. - ./scrum is a TOOL, not architecture. Outputs to data/_kb/ scrum_findings.jsonl. Findings inform dev, do NOT auto-fold into design docs. - Test code in main is actively being cleaned. Today: 12 files / ~2900 LOC removed (commits 6aafd41 + f4ebd22). Surface more candidates, don't auto-delete unless clearly orphaned. The intent: future me (or future Claude session) reads STATE_OF_PLAY on cold-start, sees these entries, and doesn't re-make the same mistakes that drifted scope today. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 02:07:57 -05:00
root	c170ebc86e	docs: AUDIT_TRAIL_PRD — production-readiness gate for staffing client J flagged that smoke + parity tests prove the surface compiles, NOT that an audit response can be produced for a specific person — and the staffing client won't sign without defensible discrimination-claim response capability. New docs/AUDIT_TRAIL_PRD.md captures: - worked example: John Martinez at Warehouse B requests audit - subject audit response output format (per-decision row schema) - surface map: where decisions happen today, where the gaps are - PII handling rules (tokenization, protected-attribute exclusion, inferred-attribute risk) - identity service design intent (separate daemon, audited reads) - retention + right-to-be-forgotten policy intent - 9-phase implementation sequence with explicit per-phase exit criteria - cross-runtime requirement (both Rust + Go must satisfy) - 7 open questions blocking phase 2+ that need J's call STATE_OF_PLAY + PRD updated with explicit "production-ready blocker" section pointing at the new doc. The "substrate is shipped" framing gets a caveat: substrate ≠ production-ready until audit phase 9 exits. No code changes. This is the planning artifact J asked for before we start building. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 00:54:46 -05:00
root	0af62861d2	STATE_OF_PLAY: refresh for 2026-05-02 wave (Lance gauntlet + parity + housekeeping) Anchor was 5 days stale. Adds the 12-commit wave (Lance backend hardening, sidecar drop, observability parity, gitignore cleanup, gray-zone content add) with verification status for each. Updates DO NOT RELITIGATE with the 4 new things this wave makes load-bearing: - python sidecar dropped from hot path (don't wire it back) - lance gauntlet shipped (don't re-discover the bugs we just fixed) - 32/32 cross-runtime parity (don't build a 6th probe for already-covered surface) - ARCHITECTURE_COMPARISON.md is the single source of truth for cross-runtime decisions Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 22:23:36 -05:00
root	a3b65f314e	Synthetic face pool — 1000 StyleGAN headshots, ComfyUI hot-swap, 60x smaller thumbs Worker cards now ship a real photo per person instead of monogram tiles: - fetch_face_pool.py pulls 1000 faces from thispersondoesnotexist.com - tag_face_pool.py runs deepface for gender/race/age, excludes <22yo - manifest.jsonl: 952 servable, gender/race buckets populated - /headshots/_thumbs/ pre-resized to 384px webp (587KB -> 11KB, 60x smaller; without this Chrome's parallel-connection budget drops ~75% of tiles in a 40-card grid) - /headshots/:key gender x race x age intersection bucketing with gender-only fallback when intersection is sparse - /headshots/generate/:key ComfyUI on-demand for the contractor profile spotlight (cold ~1.5s, cached ~1ms; worker-derived djb2 seed makes faces deterministic-per-worker but unique across workers sharing the same prompt) - serve_imagegen.py _cache_key() now includes seed (was caching by prompt only -> 3 different worker seeds collapsed to 1 cached image; verified fix produces 3 distinct md5s) - confidence-default name resolution: Xavier->man+hispanic, Aisha->woman+black, etc. Every worker resolves to a bucket. End-to-end: playwright run on /?q=forklift+operators+IL -> 21/21 cards loaded, 0 broken, all 384px webp. Cache + binary pool gitignored; manifest tracked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 06:01:04 -05:00

8 Commits
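The consent_versions resolution semantics listed in commit 87b034f5f9 (missing file, parse error, empty array, populated array) can be sketched roughly as follows. This is an illustration only, not the code in `crates/catalogd`: `LoadOutcome`, `resolve_allowlist`, and `is_consent_version_allowed` are hypothetical names, file reading and JSON parsing are left out, and the `Arc` wrapper from the commit's `Option<Arc<HashSet<String>>>` field is dropped for brevity.

```rust
use std::collections::HashSet;

/// Hypothetical summary of what happened when the allowlist file was read.
/// (The real loader, its logging, and its JSON shape are not shown in the log.)
pub enum LoadOutcome {
    /// File not present: permissive for v1 compatibility.
    Missing,
    /// File present but unparseable: permissive, so a broken config
    /// does not silently turn strict.
    ParseError,
    /// File parsed: the listed consent-version hashes (possibly empty).
    Parsed(Vec<String>),
}

/// `None` means permissive (no allowlist configured).
/// `Some(set)` means strict; an empty set refuses everything (freeze mode).
pub fn resolve_allowlist(outcome: LoadOutcome) -> Option<HashSet<String>> {
    match outcome {
        LoadOutcome::Missing | LoadOutcome::ParseError => None,
        LoadOutcome::Parsed(hashes) => Some(
            hashes
                .iter()
                // Lowercase-normalize entries so comparison is case-insensitive.
                .map(|h| h.to_ascii_lowercase())
                .collect(),
        ),
    }
}

/// Returns true when a consent request carrying `hash` should be accepted.
/// A caller would map `false` to HTTP 400 `consent_version_unknown`.
pub fn is_consent_version_allowed(allow: &Option<HashSet<String>>, hash: &str) -> bool {
    match allow {
        None => true,
        Some(set) => set.contains(&hash.to_ascii_lowercase()),
    }
}
```

Under these assumptions, `resolve_allowlist(LoadOutcome::Parsed(vec![]))` yields a strict, empty set that refuses every hash, while `LoadOutcome::Missing` and `LoadOutcome::ParseError` both yield `None` and accept any hash.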