From 03e8a91d97a6d77909e70caa1fda3bd2513600af Mon Sep 17 00:00:00 2001
From: root
Date: Tue, 5 May 2026 03:30:53 -0500
Subject: [PATCH] =?UTF-8?q?STATE=5FOF=5FPLAY:=202026-05-05=20=E2=80=94=20a?=
 =?UTF-8?q?udit=20endpoint=20recovery=20+=20anchor=20refresh?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reset gateway audit substrate after /tmp wipe disabled it on reboot:
- LH_SUBJECT_AUDIT_KEY moved /tmp/lakehouse_audit/ → /etc/lakehouse/
  (canonical persistent path per spec line 112; /tmp wipes on reboot
  and silently disabled /audit + /biometric endpoints)
- Fresh 32B HMAC + 44-char legal token at /etc/lakehouse/, mode 0400
- Systemd drop-in updated; gateway restarted; both endpoints 200
- Pre-rotation chains for WORKER-{1..5} (backfill data) will now
  tamper-detect under the new key — expected and correct on rotation

Anchor wave-table backfilled with 3 commits that landed after the
last STATE_OF_PLAY refresh on 2026-05-03 evening:
- 7e0112b: retention_sweep stray indent fix
- 848a458: Phase 1.6 Gate 5 erasure endpoint POST /biometric/.../erase
- 8ec43e0: Phase 1.6 Gate 3b deepface integration design doc

Phase 1.6 status table: Gate 5 → eng-DONE; Gate 3b → design-doc-shipped
(recommends Option C defer). Calendar bottleneck text updated.
.gitignore extended for runtime ephemera that surfaced this session:
- data/biometric/ (BIPA-quarantined photos, regulated data)
- reports/scrum/ (local-only review forensics per feedback_audit_findings_log.md)
- experiments/ (per "experiments stay out of tracked tree" policy)

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .gitignore       | 15 +++++++++++++++
 STATE_OF_PLAY.md | 16 ++++++++++------
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/.gitignore b/.gitignore
index 2a9ebf2..93ce997 100644
--- a/.gitignore
+++ b/.gitignore
@@ -50,3 +50,18 @@ reports/distillation/*-*-*-*-*/
 tests/agent_test/_*
 tests/agent_test/sessions/
 tests/real-world/runs/
+
+# BIPA-quarantined photo uploads — Phase 1.6 Gate 3a writes to
+# data/biometric/uploads//_. with mode 0700/0600.
+# This is regulated subject-of-record data and must NEVER enter git.
+data/biometric/
+
+# Local-only scrum review evidence. Per `feedback_audit_findings_log.md`
+# scrum runs fold fixes into a batch commit; the verdict / disposition
+# files stay local for forensics.
+reports/scrum/
+
+# Local experiments scratchpad — per the "Test code in main is ACTIVELY
+# being cleaned out" policy (commits 6aafd41 + f4ebd22), one-off
+# experiments stay out of the tracked tree.
+experiments/
diff --git a/STATE_OF_PLAY.md b/STATE_OF_PLAY.md
index 0923e72..33d0361 100644
--- a/STATE_OF_PLAY.md
+++ b/STATE_OF_PLAY.md
@@ -7,7 +7,7 @@
 
 ---
 
-## WHAT LANDED 2026-05-03 (13 commits this wave — local-first audit substrate + Phase 1.6 BIPA gates)
+## WHAT LANDED 2026-05-03 (16 commits this wave — local-first audit substrate + Phase 1.6 BIPA gates)
 
 The dominant work today: **`docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md` Steps 1-8 SHIPPED end-to-end** + **5 of 7 Phase 1.6 BIPA pre-launch gates** + **6th cross-runtime parity probe**. Wave was structured as eight ship-then-scrum cycles — every wave caught real bugs, every fix wave landed within the same session.
 
@@ -28,6 +28,9 @@ The dominant work today: **`docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md` Steps 1
 | `c7aa607` | Phase 1.6 scrum fixes — schema fingerprint hashes name+type+nullable, Gate 4 catches object-literal + class-field bypasses, pyarrow dep gate, item 7 deferral rationale | 4/4 + 3/3 still pass |
 | `f1fa6e4` | **Phase 1.6 Gate 3a** — `crates/catalogd/src/biometric_endpoint.rs`: `POST /biometric/subject/{id}/photo` with consent gate, quarantined storage (mode 0700/0600), audit chain link, `BiometricCollection` field on SubjectManifest | 11 unit tests PASS, live roundtrip 200 |
 | `3708e6a` | Gate 3a scrum fixes — transactional rollback on audit failure (BIPA convergent BLOCK), Content-Type parameter handling, relative data_path, ts+uuid filename, dead code removed | 11 tests + cross-runtime parity 6/6 |
+| `7e0112b` | retention_sweep: stray indent fix on biometric_collection field | sweep tests still 8/8 |
+| `848a458` | **Phase 1.6 Gate 5** — `POST /biometric/subject/{id}/erase` per BIPA destruction runbook. Two scopes (biometric_only / full); audit row appended BEFORE photo unlink so the chain has legal proof of intent even if file delete fails; manifest rolled back on audit failure. Trigger taxonomy: retention_expiry / consent_withdrawal / rtbf / court_order. | 21 unit tests (10 erasure-specific) PASS |
+| `8ec43e0` | **Phase 1.6 Gate 3b** — deepface integration design doc (Option A subprocess / Option B ONNX-in-Rust / **Option C defer**). Recommends C: BIPA-safest, classifications field stays None, all load-bearing surfaces (consent + audit + retention + erasure) ship without it. Forces "do we actually need classifications" to be answered by product, not spec inertia. | doc-only |
 
 **Cross-runtime parity (post-this-wave):** 6 probes, 38/38 byte-identical assertions — `validator(6/6) + extract_json(12/12) + session_log(4/4) + materializer(2/2) + embed(8/8) + subject_audit(6/6)`.
 
@@ -47,17 +50,18 @@ All three have regression-locked tests; structural impossibility going forward.
 | Gate 1 — public retention schedule | eng-staged, ⚖ counsel pending | `docs/policies/consent/biometric_retention_schedule_v1.md` |
 | Gate 2 — informed consent template | eng-staged, ⚖ counsel pending | `docs/policies/consent/biometric_consent_template_v1.md` |
 | Gate 3a — photo-upload endpoint | **DONE** | 11 unit tests + live `POST /biometric/subject/{id}/photo` |
-| Gate 3b — deepface classification | deferred | needs Python subprocess design (sidecar dropped 2026-05-02) |
+| Gate 3b — deepface classification | **design doc shipped** (`8ec43e0`) — recommends Option C (defer); awaits product confirmation | `docs/PHASE_1_6_BIPA_GATES.md` Gate 3b section |
 | Gate 4 — name→ethnicity removal | **DONE** | `mcp-server/phase_1_6_gate_4.test.ts` 4/4 with bypass coverage |
-| Gate 5 — destruction runbook | eng-staged, ⚖ counsel pending | `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md` |
+| Gate 5 — destruction runbook + erasure endpoint | **eng-DONE** (`848a458`); ⚖ counsel review of runbook still pending | `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md` + `POST /biometric/subject/{id}/erase` (21 tests) |
 | §2 cryptographic attestation | eng-DONE, signature pending | `docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md` (SHA-256 evidence hash, 3/3 checks pass on live data) |
 | §3 employee training | deferred | conditional on operator population size |
 
-**Calendar bottleneck:** counsel review of items 1/2/5/6. Engineering long pole is Gate 3b (deepface) — needs design conversation before engineering starts.
+**Calendar bottleneck:** counsel review of items 1/2/5-runbook/§2 attestation. Engineering long pole is Gate 3b (deepface) — design doc landed (`8ec43e0`); needs product confirmation that classifications are required before engineering starts. Recommendation in doc is Option C (defer) on BIPA-safety grounds.
 
 **Operational state:**
-- `LH_SUBJECT_AUDIT_KEY=/tmp/lakehouse_audit/subject_audit.key` (32-byte HMAC signing key) loaded into systemd unit
-- `LH_LEGAL_AUDIT_TOKEN_FILE=/tmp/lakehouse_audit/legal_audit.token` (44-char legal-tier token) loaded into systemd unit
+- `LH_SUBJECT_AUDIT_KEY=/etc/lakehouse/subject_audit.key` (32-byte HMAC signing key, mode 0400) loaded into systemd unit. **Moved off /tmp 2026-05-05** — /tmp wipes on reboot, which on May 5 disabled `/audit` + `/biometric` endpoints (gateway fails-closed at `crates/gateway/src/main.rs:459` if signing key is absent). Persistent path is per spec line 112.
+- `LH_LEGAL_AUDIT_TOKEN_FILE=/etc/lakehouse/legal_audit.token` (44-char legal-tier token, mode 0400) loaded into systemd unit
+- **Key rotation 2026-05-05:** prior key was lost when /tmp wiped on reboot. New key generated at canonical path. The 5 pre-rotation audit chains for `WORKER-{1..5}` (backfill data with `consent=pending_backfill_review`) will tamper-detect under the new key — expected and correct behavior on key rotation, not a bug. New chain entries from 2026-05-05 forward verify cleanly.
 - `data/_catalog/subjects/` holds 100 backfilled `WORKER-N.json` manifests + per-subject `WORKER-N.audit.jsonl` HMAC chains
 - `data/biometric/uploads//_.` quarantined photo storage (mode 0700 dir / 0600 file). 2 photos uploaded for WORKER-2 during live verify.
 - `/audit/subject/{id}` mounted on gateway with chain_verified=true on every probe
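Review note on the key-rotation bullet in the hunk above: the claim that pre-rotation chains "tamper-detect under the new key" follows directly from how a keyed MAC chain verifies. The sketch below is a minimal illustration only. The row layout, the `GENESIS` seed, and the helper names `row_mac`, `append_row`, and `verify_chain` are assumptions made for this note; the actual `WORKER-N.audit.jsonl` format and the gateway's verification code are not shown in this patch.

```python
import hashlib
import hmac
import json

# Hypothetical seed standing in for the "previous MAC" of the first row.
GENESIS = "0" * 64


def row_mac(key, prev_mac, payload):
    """HMAC-SHA256 over the previous row's MAC plus this row's canonical JSON."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, prev_mac.encode() + body, hashlib.sha256).hexdigest()


def append_row(chain, key, payload):
    """Append a row whose MAC links it to the row before it."""
    prev = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "mac": row_mac(key, prev, payload)})


def verify_chain(chain, key):
    """Recompute every MAC under `key`; any mismatch means the chain fails."""
    prev = GENESIS
    for row in chain:
        expected = row_mac(key, prev, row["payload"])
        if not hmac.compare_digest(expected, row["mac"]):
            return False
        prev = row["mac"]
    return True


# Fixed demo keys so the example is deterministic; a real 32-byte key
# would come from a CSPRNG and live in a mode-0400 file.
old_key = bytes(range(32))
new_key = bytes(range(32, 64))

pre_rotation = []
append_row(pre_rotation, old_key, {"subject": "WORKER-1", "event": "backfill"})
append_row(pre_rotation, old_key, {"subject": "WORKER-1", "event": "read"})

post_rotation = []
append_row(post_rotation, new_key, {"subject": "WORKER-1", "event": "read"})

print(verify_chain(pre_rotation, old_key))   # True
print(verify_chain(pre_rotation, new_key))   # False
print(verify_chain(post_rotation, new_key))  # True
```

Under this model a verifier holding only the new key cannot distinguish "rows written under a lost key" from "rows altered after the fact", which is why the old chains report as tamper-detected and why the bullet flags that result as expected.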