phase 1.6: lock Gate 3b = C, reconcile docs to shipped state, fix double-upload file leak

Four threads landing together — all driven by the audit J asked for before
production cutover.

(1) Gate 3b DECIDED: Option C (defer classifications). `BiometricCollection.classifications`
    stays `Option<JSON> = None` in v1. `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` status
    flipped from "draft / awaits product" to DECIDED. Consent template + retention
    schedule revised to remove all "automated facial-classification" / "deepface"
    language so disclosed scope matches implemented scope.

(2) Endpoint-path drift reconciled across 3 docs. `PHASE_1_6_BIPA_GATES.md`,
    `BIPA_DESTRUCTION_RUNBOOK.md`, and `biometric_retention_schedule_v1.md` had
    references to legacy `/v1/identity/subjects/*` paths (proposed under a separate
    identityd daemon, never shipped) — corrected to actual shipped routes
    `/biometric/subject/*` (catalogd-local). Schema block in PHASE_1_6_BIPA_GATES
    rewritten to reflect JSON `SubjectManifest.biometric_collection` substrate
    (not the proposed Postgres `subjects` table).

(3) New operational artifacts:
    - `scripts/staffing/verify_biometric_erasure.sh` — checks 4 things post-erasure
      (manifest cleared, uploads dir empty, audit row matches, chain verified).
      Smoke-tested live against WORKER-2.
    - `scripts/staffing/biometric_destruction_report.sh` — monthly anonymized
      destruction-event aggregation. Smoke-tested clean.
    - `scripts/staffing/bundle_counsel_packet.sh` — tarballs the counsel-review
      packet with per-file SHA-256 manifest.
    - `docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md` — formal rotation procedure
      operationalized after the 2026-05-05 /tmp wipe incident.
    - `docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md` — cover note bundling
      all eng-staged BIPA docs for counsel review with per-doc questions, sign-off
      checklist, recommended review sequence.

(4) Double-upload file leak fixed in `crates/catalogd/src/biometric_endpoint.rs`.
    `verify_biometric_erasure.sh` smoked WORKER-2 and surfaced a stranded photo
    file. Investigation showed the file held 13 bytes of test-fixture data (zero PII,
    no biometric content); audit timeline showed two consecutive uploads followed
    by one erasure — the second upload had silently overwritten manifest.data_path,
    orphaning the first file. Patched `process_upload` to refuse a second upload
    with HTTP 409 + `error: "biometric_already_collected"` when
    `biometric_collection.is_some()` on the manifest. Operator must explicitly
    POST `/biometric/subject/{id}/erase` first.

    Tests: new `second_upload_without_erase_returns_409` (asserts 409 + manifest
    pointer unchanged + first file untouched on disk). Replaced
    `repeated_uploads_grow_the_chain` with `upload_erase_upload_grows_the_chain_cleanly`
    (covers the legitimate re-collection cycle: chain grows to 3 rows). Updated
    `content_type_with_parameters_accepted` to use 2 distinct subjects (was
    using 1 subject with 2 uploads to test content-type parsing — would now 409).

    22/22 biometric_endpoint tests + 59/59 catalogd lib tests green post-patch.

Production posture: gateway needs `cargo build --release -p gateway` +
`systemctl restart lakehouse.service` to pick up the new 409 in live traffic.

Counsel calendar is now the only remaining blocker for first real-photo intake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit b2c34b80b3 (parent 03e8a91d97)
root, 2026-05-05 06:19:40 -05:00
13 changed files with 1451 additions and 65 deletions

.gitignore

@@ -61,6 +61,16 @@ data/biometric/
# files stay local for forensics.
reports/scrum/
# Per-event biometric verification reports (timestamp-named, regenerated
# per `verify_biometric_erasure.sh` invocation). Source-of-truth is the
# audit chain itself; these reports are derived views.
reports/biometric/
# Counsel transmission tarballs + manifests are regenerated by
# `bundle_counsel_packet.sh` from the tracked `docs/counsel/` source.
# The bundle is transmittable, not source-of-truth.
reports/counsel/
# Local experiments scratchpad — per the "Test code in main is ACTIVELY
# being cleaned out" policy (commits 6aafd41 + f4ebd22), one-off
# experiments stay out of the tracked tree.

STATE_OF_PLAY.md

@@ -1,12 +1,58 @@
# STATE OF PLAY — Lakehouse
-**Last verified:** 2026-05-03 evening CDT
**Last verified:** 2026-05-05 morning CDT
-**Verified by:** live probe (gateway restarted 2x, all 11 catalogd subject tests + 11 biometric tests + 6 audit tests + 4 mcp-server Gate-4 tests green; cross-runtime parity 6/6 byte-identical against live audit logs; live curl roundtrip on /biometric returned 200 + chained audit row), not memory.
**Verified by:** live probe (`/audit/health` 200, `/biometric/subject/{id}/erase` 21-test substrate + `/audit/subject/{id}` legal-tier endpoint live verified against WORKER-100; new `verify_biometric_erasure.sh` + `biometric_destruction_report.sh` + `bundle_counsel_packet.sh` smoke-tested clean against live data) — not memory.
> **Read this FIRST.** When the user says "we're working on lakehouse," they mean the working code captured below — NOT what `git log` framed as "the cutover" or what memory snapshots from 2 days ago suggest. If memory contradicts this file, this file wins. Update it when something is verified working — not when a phase finishes.
---
## WHAT LANDED 2026-05-05 (doc reconciliation wave — Gate 3b decision + counsel packet ready)
This was a **doc-only wave**, not code. Background: J asked for an audit of the BIPA/biometric documentation before production cutover. Audit found moderate fragmentation between docs and shipped code (post-`identityd` collapse, post-Gate-3a-ship, pre-Gate-3b-decision). Closed it in one pass.
| Item | What changed | Status |
|---|---|---|
| **Gate 3b — DECIDED: Option C (defer classifications)** | `BiometricCollection.classifications` stays `Option<JSON> = None` in v1. `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` status flipped from "draft / awaits product" to "DECIDED 2026-05-05". | Locked |
| **Endpoint-path drift** | `PHASE_1_6_BIPA_GATES.md` (3 spots), `BIPA_DESTRUCTION_RUNBOOK.md` (2 spots), `biometric_retention_schedule_v1.md` (1 spot) updated from legacy `/v1/identity/subjects/*` (proposed under separate identityd daemon, never shipped) to actual `/biometric/subject/*` (catalogd-local, shipped `848a458`). Schema block in `PHASE_1_6_BIPA_GATES.md` rewritten to reflect JSON `SubjectManifest.biometric_collection` substrate (not the proposed Postgres `subjects` table). | Reconciled |
| **Consent template + retention schedule** | Both revised for Option C: removed all "automated facial-classification" / "deepface" language so disclosed scope matches implemented scope. Pending counsel review — they were already eng-staged with ⚖ markers. | Eng-staged for counsel |
| **`scripts/staffing/verify_biometric_erasure.sh`** (NEW) | Operator-side verification of an erasure event. Curls `/audit/subject/{id}` with legal-tier token, checks: manifest.biometric_collection null, uploads dir empty, last audit row is `biometric_erasure`/`full_erasure` with `erased`/`success`, chain_verified=true. Writes a hashed report to `reports/biometric/`. (Sketched below the table.) | Smoke-tested live |
| **`scripts/staffing/biometric_destruction_report.sh`** (NEW) | Monthly destruction-event aggregation. Anonymizes candidate IDs (sha256-12 prefix), counts by scope + trigger, flags anomalies. Smoke-test on May 2026 data found 1 historical `biometric_erasure`/`consent_withdrawal` event (test fixture). | Smoke-tested live |
| **`docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md`** (NEW) | Captures the rotation procedure operationalized after the 2026-05-05 `/tmp` wipe incident. Covers: when to rotate, pre-rotation snapshot, atomic-swap procedure, post-rotation verification (incl. expected pre-rotation chain tamper-detect under new key), recovery from lost key, ⚖ counsel notes. | Authored |
| **`docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md`** + `bundle_counsel_packet.sh` (NEW) | Cover note bundling all eng-staged BIPA docs for counsel review with per-doc questions, sign-off checklist, recommended review sequence. Bundler script tarballs the 8 referenced files + emits a SHA-256 manifest. Tarball ready for transmission: `reports/counsel/counsel_packet_2026-05-05.tar.gz`. | Bundled, ready to send |
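For orientation only — a minimal sketch of the verify script's four checks, NOT the shipped script. The response field names (`manifest.biometric_collection`, `rows[].accessor.kind`, `chain_verified`) follow the descriptions above; the exact outcome-field shape and the legal-token header name are assumptions here.
```bash
# Hedged sketch of verify_biometric_erasure.sh's four post-erasure checks.
set -euo pipefail
ID="$1"
RESP=$(curl -sf "http://localhost:3100/audit/subject/${ID}" \
  -H "X-Lakehouse-Legal-Token: $(cat /etc/lakehouse/legal_audit.token)")
# 1. Manifest cleared — biometric_collection must be null.
[ "$(jq -r '.manifest.biometric_collection' <<<"$RESP")" = "null" ]
# 2. Uploads dir empty — no stranded photo files.
[ -z "$(ls -A "data/biometric/uploads/${ID}" 2>/dev/null)" ]
# 3. Last audit row is an erasure event (field names assumed).
jq -e '.rows[-1].accessor.kind | test("^(biometric_erasure|full_erasure)$")' <<<"$RESP" >/dev/null
# 4. Chain verifies end-to-end under the current key.
[ "$(jq -r '.chain_verified' <<<"$RESP")" = "true" ]
echo "erasure verified for ${ID}"
```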
### Eng follow-up that this wave surfaced
- **Double-upload file leak — DIAGNOSED + FIXED** (2026-05-05 same wave). `verify_biometric_erasure.sh` smoked WORKER-2 and surfaced a stranded photo file. Investigation showed:
- The file was 13 bytes of test fixture (`ff d8 ff d9 + ASCII "TESTBYTES"`), byte-identical to the unit-test fixture at `biometric_endpoint.rs:841`. NO PII, NO biometric content, NO synthetic-face content. Came from manual integration testing on 2026-05-03.
- Audit log timeline showed two consecutive uploads (09:54, 10:04) followed by one erasure (10:22). The erasure unlinked only the SECOND file (which the manifest pointed at by then); the first file was orphaned because the second upload had silently overwritten `manifest.data_path`.
- **Real bug found**: the upload handler did NOT refuse a second upload to a subject with `biometric_collection.is_some()`. Patched `process_upload` to return HTTP 409 + `error: "biometric_already_collected"` when a re-upload is attempted; operator must explicitly POST `/biometric/subject/{id}/erase` first.
- Stranded test file removed (`rm` of the 13-byte fixture).
- New unit test `second_upload_without_erase_returns_409` asserts the 409 + that the first photo's data_path remains unchanged + that the first file remains untouched on disk.
- Existing `repeated_uploads_grow_the_chain` replaced with `upload_erase_upload_grows_the_chain_cleanly` (covers the legitimate re-collection cycle: upload → erase → upload, chain grows to 3 rows).
- Existing `content_type_with_parameters_accepted` test updated to use two distinct subjects (it had used one subject for two uploads to test content-type parsing — now would 409).
- **22 biometric_endpoint tests + 59 catalogd lib tests all green** post-patch (was 21+58 pre-patch).
- Production posture: gateway binary needs rebuild (`cargo build --release`) + `systemctl restart lakehouse.service` to pick up the new 409 behavior in live traffic (smoke-check sketch below).
- **Pre-rotation chain tamper-detect (expected, not a bug).** WORKER-{1..5} had pre-2026-05-05 audit chains under the prior `LH_SUBJECT_AUDIT_KEY`. Under the new key (post-`/tmp` wipe rotation), those chains correctly tamper-detect. The rotation runbook §4.4 documents this as expected; a §2.2 pre-rotation snapshot is what would prove they were intact pre-rotation if defensibility ever needs it.
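A hedged post-restart smoke check (the service-token path and fixture file are illustrative; the 409 shape is from the patch above):
```bash
cargo build --release -p gateway
sudo systemctl restart lakehouse.service
# Second POST to an already-collected subject should now return 409
# with error "biometric_already_collected" (subject ID illustrative).
curl -s -o /dev/null -w '%{http_code}\n' \
  -X POST "http://localhost:3100/biometric/subject/WORKER-2/photo" \
  -H "Authorization: Bearer $(cat /etc/lakehouse/service.token)" \
  -H "Content-Type: image/jpeg" \
  --data-binary @fixture.jpg   # expect: 409
```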
### What's blocking production cutover NOW (after this wave)
- **Counsel calendar:** the four sign-off items in `COUNSEL_REVIEW_PACKET_2026-05-05.md` (retention schedule, consent template, destruction runbook, pre-identityd attestation). The packet tarball is ready; ⚖ counsel is the bottleneck.
- **Nothing else.** Engineering is no longer the long pole.
### Phase 1.6 BIPA gates — status table (this is the final post-Option-C state)
| # | Gate | Status |
|---|---|---|
| 1 | Public retention schedule | **eng-staged**, revised for Option C, ready for counsel |
| 2 | Informed consent template | **eng-staged**, revised for Option C, ready for counsel |
| 3a | Photo upload endpoint | **DONE** (shipped `f1fa6e4`, 11 unit tests, live verified) |
| 3b | Deepface classification | **DECIDED 2026-05-05: Option C (defer)** |
| 4 | Name → ethnicity inference removal | **DONE** (shipped, 4/4 mcp-server tests pass) |
| 5 | Destruction runbook + erasure endpoint | **eng-DONE** (`848a458`, 21 tests). Runbook scripts (verify + report) shipped 2026-05-05. Counsel review pending. |
| §2 | Pre-identityd attestation | **eng-DONE** (3/3 evidence checks). Awaits J + counsel signature. |
| §3 | Employee training | **deferred** (consolidated into runbook §7 acknowledgment for current operator population) |
---
## WHAT LANDED 2026-05-03 (16 commits this wave — local-first audit substrate + Phase 1.6 BIPA gates)
The dominant work today: **`docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md` Steps 1-8 SHIPPED end-to-end** + **5 of 7 Phase 1.6 BIPA pre-launch gates** + **6th cross-runtime parity probe**. Wave was structured as eight ship-then-scrum cycles — every wave caught real bugs, every fix wave landed within the same session.

crates/catalogd/src/biometric_endpoint.rs

@@ -307,6 +307,21 @@ pub async fn process_upload(
consent_status: None,
}));
}
// Refuse double-upload. If a BiometricCollection already exists on
// the manifest, the operator must explicitly erase before re-uploading.
// Without this gate, a second POST silently overwrites manifest.data_path
// and orphans the previous photo file on disk — creating a forever-leak
// pattern and a BIPA defensibility hole ("we said we erased the photo,
// but the previous version of it is still under the same subject dir").
// Caught 2026-05-05 by verify_biometric_erasure.sh against WORKER-2.
if manifest.biometric_collection.is_some() {
return Err((StatusCode::CONFLICT, ErrorResponse {
error: "biometric_already_collected",
detail: "subject already has a BiometricCollection on the manifest; \
POST /biometric/subject/{id}/erase first if you intend to replace the photo".into(),
consent_status: None,
}));
}
let template_hash = {
let mut h = Sha256::new();
@@ -947,15 +962,92 @@ mod tests {
}

#[tokio::test]
-async fn repeated_uploads_grow_the_chain() {
-let state = fixture_state("repeated").await;
-let _ = state.registry.put_subject(fixture_manifest("WORKER-5", BiometricConsentStatus::Given, SubjectStatus::Active)).await;
-let writer = state.writer.clone();
-for _ in 0..2 {
-let _ = process_upload(&state, "WORKER-5", Some(TEST_TOKEN), Some("image/jpeg"), "", "", &jpeg_bytes())
-.await.unwrap();
-}
-assert_eq!(writer.verify_chain("WORKER-5").await.unwrap(), 2);
-}
async fn second_upload_without_erase_returns_409() {
// BIPA defensibility: a second upload to a subject that already
// has a BiometricCollection must fail-closed. Without this gate,
// the second upload silently overwrites manifest.data_path and
// orphans the first photo on disk forever (caught 2026-05-05 on
// WORKER-2 by verify_biometric_erasure.sh).
let state = fixture_state("second_upload_409").await;
let _ = state.registry.put_subject(fixture_manifest("WORKER-DUP", BiometricConsentStatus::Given, SubjectStatus::Active)).await;
let storage_root = state.storage_root.clone();
let registry = state.registry.clone();
// First upload succeeds.
let resp1 = process_upload(&state, "WORKER-DUP", Some(TEST_TOKEN), Some("image/jpeg"), "v1", "", &jpeg_bytes())
.await.unwrap();
let first_path = storage_root.join(&resp1.data_path);
assert!(first_path.exists(), "first upload should produce a file");
// Second upload refused with 409.
let err = process_upload(&state, "WORKER-DUP", Some(TEST_TOKEN), Some("image/jpeg"), "v1", "", &jpeg_bytes())
.await.unwrap_err();
assert_eq!(err.0, StatusCode::CONFLICT);
assert_eq!(err.1.error, "biometric_already_collected");
// Manifest still points at the first upload — pointer was NOT overwritten.
let m = registry.get_subject("WORKER-DUP").await.unwrap();
let bc = m.biometric_collection.as_ref().expect("collection should still be set");
assert_eq!(bc.data_path, resp1.data_path,
"manifest data_path must be unchanged after refused second upload");
// First file remains on disk untouched (refusal must not unlink it).
assert!(first_path.exists(), "first upload's file must remain after refused second upload");
let still_on_disk = tokio::fs::read(&first_path).await.unwrap();
assert_eq!(still_on_disk, jpeg_bytes(),
"first upload's bytes must not have been overwritten");
}

#[tokio::test]
async fn upload_erase_upload_grows_the_chain_cleanly() {
// Prior version of this test allowed repeated uploads to chain;
// that conflated chain growth with allowed re-upload. Under the
// double-upload guard (409 above), the only legitimate way to
// re-collect is upload → erase → upload. Chain grows to 3 rows
// (collection, erasure, collection); on-disk file count returns
// to one after the second upload.
let state = fixture_state("upload_erase_upload").await;
let _ = state.registry.put_subject(fixture_manifest("WORKER-CYCLE", BiometricConsentStatus::Given, SubjectStatus::Active)).await;
let writer = state.writer.clone();
let storage_root = state.storage_root.clone();
// First upload.
let resp1 = process_upload(&state, "WORKER-CYCLE", Some(TEST_TOKEN), Some("image/jpeg"), "", "", &jpeg_bytes())
.await.unwrap();
let first_path = storage_root.join(&resp1.data_path);
assert!(first_path.exists());
// Erase. Uses process_erase test helper (the production path
// parses the EraseRequest from request body; tests inject it
// directly). Note: the erase flow flips biometric.status to
// Withdrawn, so the post-erase second upload must reset consent
// first (production flow would require new consent collection).
let _ = process_erase(&state, "WORKER-CYCLE", Some(TEST_TOKEN), "trace-cycle", fixture_erase_request("biometric_only"))
.await.unwrap();
assert!(!first_path.exists(), "first photo file must be unlinked by erase");
// Reset consent + status on the post-erase manifest so the second
// upload can proceed (production flow would require new consent
// collection here; for this test we directly flip the manifest).
let mut post_erase = state.registry.get_subject("WORKER-CYCLE").await.unwrap();
post_erase.consent.biometric.status = BiometricConsentStatus::Given;
post_erase.status = SubjectStatus::Active;
post_erase.biometric_collection = None;
let _ = state.registry.put_subject(post_erase).await;
// Second upload (legitimate, after erase).
let resp2 = process_upload(&state, "WORKER-CYCLE", Some(TEST_TOKEN), Some("image/jpeg"), "", "", &jpeg_bytes())
.await.unwrap();
let second_path = storage_root.join(&resp2.data_path);
assert!(second_path.exists(), "second upload should produce a file");
assert_ne!(resp1.data_path, resp2.data_path, "second upload should land at a new path");
// Chain has 3 rows: collection, erasure, collection.
assert_eq!(writer.verify_chain("WORKER-CYCLE").await.unwrap(), 3);
let rows = writer.read_rows_in_range("WORKER-CYCLE", None, None).await.unwrap();
assert_eq!(rows[0].accessor.kind, "biometric_collection");
assert_eq!(rows[1].accessor.kind, "biometric_erasure");
assert_eq!(rows[2].accessor.kind, "biometric_collection");
}

#[tokio::test]
@@ -985,18 +1077,23 @@ mod tests {
// Caught 2026-05-03 opus scrum WARN; regression test ensures
// the bare media type is matched after stripping parameters.
let state = fixture_state("ct_with_params").await;
-let _ = state.registry.put_subject(fixture_manifest("WORKER-CT", BiometricConsentStatus::Given, SubjectStatus::Active)).await;
// Two distinct subjects so each upload exercises the "first upload"
// path. Prior version used one subject and two uploads — now blocked
// by the double-upload guard (409). The test's actual intent is
// content-type parsing, not re-upload tolerance.
let _ = state.registry.put_subject(fixture_manifest("WORKER-CT-A", BiometricConsentStatus::Given, SubjectStatus::Active)).await;
let _ = state.registry.put_subject(fixture_manifest("WORKER-CT-B", BiometricConsentStatus::Given, SubjectStatus::Active)).await;
let resp = process_upload(
-&state, "WORKER-CT", Some(TEST_TOKEN),
&state, "WORKER-CT-A", Some(TEST_TOKEN),
Some("image/jpeg; charset=binary"), "", "", &jpeg_bytes(),
).await.unwrap();
-assert_eq!(resp.candidate_id, "WORKER-CT");
assert_eq!(resp.candidate_id, "WORKER-CT-A");
// Also case-insensitive matching: "Image/JPEG" should work too.
let resp2 = process_upload(
-&state, "WORKER-CT", Some(TEST_TOKEN),
&state, "WORKER-CT-B", Some(TEST_TOKEN),
Some("Image/JPEG"), "", "", &jpeg_bytes(),
).await.unwrap();
-assert_eq!(resp2.candidate_id, "WORKER-CT");
assert_eq!(resp2.candidate_id, "WORKER-CT-B");
}

// ─── Erasure tests (Gate 5) ──────────────────────────────────────

docs/PHASE_1_6_BIPA_GATES.md

@@ -22,7 +22,7 @@ Each gate is a deliverable that must ship before real-photo intake. None is optional.
- `docs/policies/consent/biometric_retention_schedule_v1.md` — public file
- Linked from public privacy policy at the deployment URL
- Specifies:
-  - Categories of biometric data collected (facial geometry derived from candidate photos, age estimate, gender classification, race classification — per Phase 1.5 deepface walk)
  - Categories of biometric data collected (facial photograph for staff identification at job sites; classifications deferred per Gate 3b — see `docs/specs/GATE_3B_DEEPFACE_DESIGN.md`)
  - Purpose of collection (identity matching for staffing operations)
  - Maximum retention: BIPA §15(a) caps at "3 years from the individual's last interaction with the private entity, whichever occurs first" — recommend 18-24 months as the operational ceiling (provides safety margin)
  - Destruction procedure: per Gate 5 below
@@ -67,7 +67,7 @@
**What ships:**
-A new endpoint (proposed: `POST /v1/identity/subjects/{candidate_id}/photo`) with the following behavior:
An endpoint at `POST /biometric/subject/{candidate_id}/photo` (catalogd-local — the original v1 spec named this `/v1/identity/subjects/{candidate_id}/photo` under a separate identityd daemon; that daemon was collapsed into catalogd per the architecture pivot. See `IDENTITY_SERVICE_DESIGN.md` deprecation header.) with the following behavior:
1. Caller authenticates with service-tier token
2. Endpoint queries identityd for `subjects.biometric_consent_status`
@@ -75,18 +75,23 @@
4. If status = `'given'`:
   a. Photo bytes accepted, stored to a quarantined path under `data/biometric/uploads/{candidate_id}/{ts}.{ext}` (NOT `data/headshots/`)
   b. deepface tagging runs against the photo
-   c. Classifications (gender, race, age) stored to `subjects` table fields (NEW columns — see schema additions below)
   c. Classifications (gender, race, age) **DEFERRED to Gate 3b** (`docs/specs/GATE_3B_DEEPFACE_DESIGN.md`). `BiometricCollection.classifications` remains `None` in v1.
   d. Original photo bytes encrypted under DEK + retained per Gate 1 schedule
   e. `pii_access_log` row written with `purpose_token='biometric_collection'`
5. Response: `{candidate_id, retention_until, consent_version}`
-**Schema additions to identityd `subjects`:**
**Schema (as shipped — catalogd `SubjectManifest.biometric_collection`):**
-```sql
-ALTER TABLE subjects ADD COLUMN biometric_classifications JSONB; -- {gender, race, age} from deepface
-ALTER TABLE subjects ADD COLUMN biometric_data_path TEXT; -- quarantined path
-ALTER TABLE subjects ADD COLUMN biometric_collected_at TIMESTAMPTZ;
-ALTER TABLE subjects ADD COLUMN biometric_template_hash TEXT; -- hash of the photo bytes (for integrity, NOT for re-derivation)
-```
The original spec proposed JSONB columns on a Postgres `subjects` table under identityd. The shipped implementation collapses this into a per-subject JSON manifest at `data/_catalog/subjects/<id>.json`, with the `BiometricCollection` struct holding `data_path`, `template_hash`, `collected_at`, and `classifications: Option<JSON>`. See `crates/catalogd/src/subject_manifest.rs` for the canonical type.

```rust
// crates/catalogd/src/subject_manifest.rs (paraphrased)
pub struct BiometricCollection {
    pub data_path: String,              // quarantined path
    pub template_hash: String,          // SHA-256 of original bytes (integrity, NOT re-derivation)
    pub collected_at: DateTime<Utc>,
    pub classifications: Option<Value>, // None until Gate 3b ships (deferred — see GATE_3B_DEEPFACE_DESIGN.md)
}
```
**Engineering acceptance:**
@@ -130,8 +135,8 @@
- `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md` — operator-facing
- Specifies:
  - Triggers: retention expiry (per Gate 1), withdrawal, RTBF request, candidate request
-  - Procedure: identityd `POST /v1/identity/subjects/{id}/erase` (legal-tier auth)
  - Procedure: catalogd-local `POST /biometric/subject/{id}/erase` (legal-tier auth) — formerly proposed under identityd; now serves from catalogd directly
-  - Erasure scope: `subjects.biometric_*` columns ciphertext-deleted, `biometric_data_path` files securely overwritten + unlinked, deepface classifications nulled
  - Erasure scope: `BiometricCollection` set to `None` on the subject manifest (drops `data_path`, `template_hash`, `classifications` together), quarantined photo files at `data/biometric/uploads/<id>/*` securely unlinked, audit row appended BEFORE photo unlink so the chain proves intent even if file delete fails
  - Backup window: per `IDENTITY_SERVICE_DESIGN` v3-B12, residual exists in DB backups for 30 days max; subject is informed
  - Witnessed: every erasure event written to `pii_access_log` with `purpose_token='biometric_erasure'` and the legal-tier JWT signature (proves authorized destruction)
  - Reporting: monthly internal report of erasures + retention-expiry sweeps; available to counsel on request
@@ -140,7 +145,7 @@
**Engineering acceptance:**
- Runbook committed
-- `POST /v1/identity/subjects/{id}/erase` endpoint includes biometric-specific erasure path
- `POST /biometric/subject/{id}/erase` endpoint includes biometric-specific erasure path (shipped `848a458` — 21 unit tests, two scopes: biometric_only / full)
- Daily sweep job destroys biometric data past `biometric_retention_until` (separate from general retention sweep — biometric has stricter clock)
- Erasure events are logged with cryptographic attestation
@@ -188,7 +193,7 @@ of 2026-05-03 — scaffolds vs. counsel sign-off vs. shipped code:
|---|---|---|---|---|
| 1 | Public retention schedule | scaffolded at `docs/policies/consent/biometric_retention_schedule_v1.md` | pending | **eng-staged** |
| 2 | Consent template | scaffolded at `docs/policies/consent/biometric_consent_template_v1.md` | pending | **eng-staged** |
-| 3 | Photo-upload endpoint with consent enforcement | DONE for the consent-gate substrate (`crates/catalogd/src/biometric_endpoint.rs` mounted at `/biometric/subject/{id}/photo`, 10 unit tests, live-verified end-to-end). Deepface classification deferred to **Gate 3b** (own session — needs Python subprocess design after sidecar drop). | n/a until 3b | **3a DONE, 3b deferred** |
| 3 | Photo-upload endpoint with consent enforcement | DONE — `crates/catalogd/src/biometric_endpoint.rs` mounted at `/biometric/subject/{id}/photo`, 11 unit tests, live-verified end-to-end. **Gate 3b DECIDED 2026-05-05: Option C (defer classifications).** `BiometricCollection.classifications` stays `Option<JSON> = None` in v1; consent + retention docs revised to match. See `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` §6 + change log. | reviewed under Gate 2 (matching consent text) | **DONE — 3a shipped, 3b deferred per design doc** |
| 4 | Name → ethnicity inference removed | DONE — `mcp-server/search.html:3372` removal note + `mcp-server/phase_1_6_gate_4.test.ts` absence test (3/3 green) | none required | **DONE** |
| 5 | Destruction runbook | scaffolded at `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md`; erasure endpoint + verify/report scripts marked TODO | pending | **eng-staged** |
@@ -212,10 +217,14 @@
expected operator population size, or restore it to the blocking set.
**Calendar bottleneck:** Items 1, 2, 5, 6 (and #7) await counsel
-review of the engineering scaffolds. Gate 3 (photo-upload endpoint)
-is the only remaining engineering work; it's deferred to its own
-session because it crosses into identityd photo intake and deepface
-integration scope that hasn't been designed yet.
review of the engineering scaffolds. Gate 3 substrate is fully
shipped; Gate 3b deepface classification was DECIDED on 2026-05-05
as Option C (defer) — `BiometricCollection.classifications` stays
`None` in v1, consent + retention docs revised to match this
narrower scope. If a future product requirement surfaces a real
need for classifications, the substrate is forward-compatible
(`Option<JSON>`) and either Option A (~1 day) or Option B (~5 days)
of the design doc can be picked up then under a v2 consent template.
---
@@ -258,4 +267,5 @@ integration scope that hasn't been designed yet.
## Change log
- 2026-05-05 — Reconciled with shipped state: endpoint paths corrected from the legacy identityd v1 spec (`/v1/identity/subjects/*`) to the catalogd-local routes that actually shipped (`/biometric/subject/*`). Schema block rewritten to reflect the JSON `SubjectManifest.biometric_collection` substrate that replaced the proposed Postgres columns. Gate 3b deepface deferral marked in-line where Disclosure 1 / Gate 3 step 5c / Gate 5 erasure scope previously assumed classifications were collected. No legal text changed; this was doc/code drift cleanup.
- 2026-05-03 — Initial draft. Authored after `IDENTITY_SERVICE_DESIGN` v3 §5 Step 0 named Phase 1.6 as a hard prerequisite to backfill.

docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md

@@ -0,0 +1,260 @@
# Counsel Review Packet — Phase 1.6 BIPA Pre-Launch
**Date assembled:** 2026-05-05
**For:** outside counsel
**From:** J, operator of record
**Scope:** documents that engineering has staged for legal sufficiency review
before the staffing platform begins collecting any real candidate
biometric data (BIPA §15(a)(b)).
> **What this packet is.** The Phase 1.6 BIPA gates outline what
> engineering must ship before real-photo intake. As of 2026-05-05,
> all engineering substrate is shipped and verified live (see §1
> below for the inventory). What remains is binding-text authoring
> + counsel sign-off on five documents, plus operational notification
> obligations counsel may want to layer on top.
>
> **What this packet is NOT.** Not a request for counsel to write
> binding text from scratch. The documents are eng-staged in
> reasonable plain language; the request is for counsel to render
> them into legally-sufficient text and attest where signatures
> are required.
---
## 1. Engineering substrate — shipped + verified
For factual context on what counsel is reviewing AGAINST. None of
this requires sign-off here; it's the system the documents bind to.
| Component | Where it lives | Verification |
|---|---|---|
| Subject manifest registry | `crates/catalogd/src/registry.rs`, `data/_catalog/subjects/<id>.json` | 17 unit tests + 100 backfilled WORKER manifests in production |
| Per-subject HMAC audit chain (SHA-256) | `crates/catalogd/src/subject_audit.rs`, `data/_catalog/subjects/<id>.audit.jsonl` | Tamper-detection + concurrent-append race tests pass |
| Photo upload (consent-gated) | `POST /biometric/subject/{id}/photo` | 11 unit tests + live roundtrip 200 |
| Erasure (two-scope) | `POST /biometric/subject/{id}/erase` (`biometric_only` / `full`) | 21 unit tests; transactional rollback on audit failure |
| Legal-tier audit read | `GET /audit/subject/{id}` (X-Lakehouse-Legal-Token header) | Constant-time auth, chain re-verification per request |
| Retention sweep (BIPA-aware clock) | `crates/catalogd/src/bin/retention_sweep` | 8 unit tests; live verified against 100 backfilled subjects |
| Cross-runtime parity (Rust ↔ Go) | `scripts/cutover/parity/subject_audit_parity.sh` | 6/6 byte-identical assertions pass |
**Key insight for counsel:** the audit chain is the BIPA-defensible
substrate. Every state-changing event (consent given, photo uploaded,
photo erased, legal-tier read) appends to a per-subject HMAC-chained
JSONL log. The chain verifies end-to-end on every legal-tier read.
A tampered chain is detectable; a forged chain requires the HMAC
signing key, which is held under root-only mode 0400 at
`/etc/lakehouse/subject_audit.key` and rotated per the runbook in
attachment §6 below.
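To make the chaining concrete, a verification sketch — illustrative only. The real HMAC input shape is defined in `crates/catalogd/src/subject_audit.rs`; the `sig` field name, the `del(.sig)` payload convention, and a text-encoded key are all assumptions here.
```bash
# Recompute each row's HMAC over (previous signature + row payload).
# Field names and key encoding are assumptions; subject_audit.rs is canonical.
KEY="$(cat /etc/lakehouse/subject_audit.key)"
PREV=""
while IFS= read -r row; do
  payload=$(jq -cS 'del(.sig)' <<<"$row")
  want=$(jq -r '.sig' <<<"$row")
  got=$(printf '%s%s' "$PREV" "$payload" \
        | openssl dgst -sha256 -hmac "$KEY" -r | cut -d' ' -f1)
  [ "$got" = "$want" ] || { echo "tamper-detected"; exit 1; }
  PREV="$want"
done < data/_catalog/subjects/WORKER-2.audit.jsonl
echo "chain verified"
```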
**Gate 3b (deepface classification) — decided 2026-05-05: Option C
(defer).** The system collects only the photograph, not derived
demographic information. The consent template + retention schedule
in this packet were revised the same day to match.
---
## 2. Documents requiring counsel review + sign-off
In recommended review order:
| # | Document | Path | Counsel ask | Sign-off |
|---|---|---|---|---|
| A | Biometric Retention Schedule v1 | `docs/policies/consent/biometric_retention_schedule_v1.md` | Render into binding language; confirm 18-month operational ceiling vs. BIPA 3-year statutory cap | Counsel + J |
| B | Biometric Consent Template v1 | `docs/policies/consent/biometric_consent_template_v1.md` | Render Disclosures 1-3 into binding consent language; specify electronic vs. paper signature mechanism | Counsel + J |
| C | BIPA Destruction Runbook | `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md` | Confirm 30-day SLA from trigger; confirm two-operator (operator + witness) requirement; confirm legal-hold check procedure | Counsel attestation |
| D | BIPA Pre-IdentityD Attestation | `docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md` | Sign as countersigning party; J signs as operator-of-record | Counsel + J |
| E | Legal-Tier Audit Key Rotation | `docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md` | Confirm rotation cadence; opine on candidate-notification obligation when rotation is compromise-driven | Counsel notes |
| F | Gate 3b Deepface Design (FYI) | `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` | Decision-of-record showing classifications were *deliberately deferred*, not omitted by oversight. No sign-off needed; provided for audit-trail completeness. | None |
The five documents requiring sign-off are A, B, C, D, E. Document F
is included so the audit trail shows the Gate 3b decision was
deliberate.
---
## 3. Specific questions for counsel — by document
### Document A — Retention Schedule
1. The schedule sets an **18-month** operational ceiling against the
BIPA 3-year statutory cap. Is the safety margin appropriate, or
should we move to a tighter window (12 months) given the
plaintiff-friendly Illinois posture?
2. The schedule references the **catalogd-local** storage substrate
rather than a separate identityd Postgres table. Does the
public-facing language need to mention the storage architecture
at all, or is "we keep the photo and a SHA-256 hash" sufficient?
3. Public publication URL — counsel to specify (placeholder marked
in §7 of the schedule).
4. Confirm whether existing consent under v1 carries forward when
a future v2 is published, or whether re-consent is required.
### Document B — Consent Template
1. Disclosure 1 says "we do NOT run automated facial-classification
in v1." Does that disclosure need to mention the *possibility* of
future classification, or is silence-with-supersession-clause
adequate?
2. Plain-language summary in §1 — counsel to confirm it's appropriate
to include alongside the binding disclosure, or recommend an
alternative comprehension aid.
3. Withdrawal SLA is set to **30 days** in §2. Counsel to confirm
against jurisdiction (Illinois primary; secondary deployments
would inherit).
4. Contact for withdrawal — counsel to specify the channel
(placeholder in §3).
5. Sign-off mechanism: electronic signature service, in-app
click-acceptance with timestamp, paper form? Each has different
evidentiary weight.
### Document C — Destruction Runbook
1. Confirm 30-day SLA from each of four triggers (retention expiry,
consent withdrawal, RTBF, court order). Some interpretations
prefer 7 or 14 days for withdrawal/RTBF.
2. Two-operator requirement (operator-of-record + witness): is the
witness role acceptable for counsel's defensibility view, or
should we elevate to dual-control with cryptographic split-key?
3. Legal-hold check procedure (§2 step 3) — counsel to specify the
actual procedure for confirming no hold is in force before
erasing.
4. Backup-window disclosure (§4) — confirm 30-day backup retention
is acceptable.
5. Candidate notification template (§3 step 4) — counsel to supply.
### Document D — Pre-IdentityD Attestation
1. Both signature lines blank — J signs as operator-of-record;
counsel signs as the countersigning legal party.
2. The attestation hash anchors the evidence; once signed, the
hash itself becomes a tamper-evident witness. Counsel to confirm
storage location for the signed copy (firm files?).
### Document E — Key Rotation Runbook
1. Recommended rotation cadence — 90 days suggested in §1.
Counsel to confirm or override.
2. Custody schedule for `/etc/lakehouse/_archived/` raw key files —
§7.2 question; suggested 1-year retention but counsel-driven.
3. Candidate-notification obligation when rotation is
compromise-driven (§7.3) — counsel call.
---
## 4. Engineering changes counsel should know about (recent)
These reconciled doc/code drift after a rapid wave on 2026-05-03:
- **Endpoint paths:** the original v1 spec proposed
`/v1/identity/subjects/*` under a separate identityd daemon. That
daemon was collapsed into catalogd; endpoints actually shipped at
`/biometric/subject/*` (catalogd-local). Documents in this packet
reference the catalogd-local routes; legacy references in
`IDENTITY_SERVICE_DESIGN.md` are flagged "do NOT implement
as-written" in that doc's deprecation header.
- **No identityd Postgres database:** the original spec proposed
encrypted-at-rest Postgres + HashiCorp Vault + S3 Object Lock for
PII storage. The shipped substrate is local JSON manifests +
per-subject HMAC-chained JSONL, sized for J's local-only
deployment per `PRD.md` line 70 ("Everything runs locally — no
cloud APIs").
- **Gate 3b deferral (Option C, 2026-05-05):** classifications
(gender / race / age inference) were deliberately deferred. The
consent template and retention schedule in this packet do NOT
disclose collection of derived demographic data, because we are
not collecting it. If a future product requirement reverses this,
we will publish a v2 consent + v2 retention with re-consent.
- **Key rotation 2026-05-05:** the prior `LH_SUBJECT_AUDIT_KEY` was
lost when a `/tmp` wipe on reboot disabled the audit and biometric
endpoints. The new key is at `/etc/lakehouse/subject_audit.key`
(mode 0400). Pre-rotation audit chains tamper-detect under the
new key — this is correct, expected behavior, not a bug.
---
## 5. Open eng items NOT awaiting counsel
For transparency. These are engineering work items, not legal items:
1. **Residual photo unlink on erasure.** During verification of the
one historical erasure event (`WORKER-2`), the verify script
surfaced a stranded photo file that was not unlinked when
`BiometricCollection` was cleared from the manifest. Engineering
investigates; if the bug is real, the fix is `crates/catalogd/src/biometric_endpoint.rs`
in the erasure handler. This does NOT affect the current packet —
no real candidate photos have been collected yet (per §1
attestation), so the residual is from a synthetic test event.
2. **Phase 1.6 §3 employee training.** Currently deferred per
acknowledgement coverage in §7 of the destruction runbook
(single-operator population). Re-promotes to blocking if the
operator population grows; counsel may want to opine on the
threshold.
---
## 6. Sign-off sequence
Recommended order so a hold-up on one doc doesn't block others:
1. **First wave (parallel):** A (retention schedule) + B (consent
template). These two have the tightest interdependence (consent
v1 references retention v1 by hash); review them together.
2. **Second wave:** C (destruction runbook). Depends on A's retention
period being fixed.
3. **Third wave:** D (pre-identityd attestation). Sign once A + B + C
are settled; the attestation snapshot is the boundary between
pre-Phase-1.6 and post-Phase-1.6 system state.
4. **Fourth wave:** E (key rotation). Independent of A-D; can be
reviewed in parallel any time.
---
## 7. After sign-off — engineering steps
Once each document is signed:
| Document | Engineering action | Trigger |
|---|---|---|
| A retention schedule | Hash + commit; reference in `consent_versions` table | Counsel signature |
| B consent template | Hash + commit; reference in candidate-facing intake UI | Counsel signature |
| C destruction runbook | Adopt; operator acknowledgment recorded in §7 | Counsel attestation |
| D pre-identityd attestation | Anchor hash to filesystem + git; counsel keeps original signature | Both signatures |
| E key rotation | Adopt; rotation event log seeded with counsel-approved cadence | Counsel notes |
The HARD blocker for first real-candidate photo collection is
A + B + D signed. C and E are operationally important but do not
block the *first* photo (they govern destruction + key handling
which apply to any state, not the boundary state).
---
## 8. Cover-note hash
This packet is itself a snapshot. Future-Claude / future-J will refer
back to this packet to know what counsel saw on 2026-05-05.
**Packet attached files (referenced by path):**
- `docs/policies/consent/biometric_retention_schedule_v1.md`
- `docs/policies/consent/biometric_consent_template_v1.md`
- `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md`
- `docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md`
- `docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md`
- `docs/specs/GATE_3B_DEEPFACE_DESIGN.md`
- `docs/PHASE_1_6_BIPA_GATES.md` (the spec they all reference)
Per-file SHA-256 hashes are produced by the bundler script (next
section); the bundler also creates a tarball ready for transmission.
---
## 9. Generating the bundle for transmission
```bash
./scripts/staffing/bundle_counsel_packet.sh
```
Produces `reports/counsel/counsel_packet_<DATE>.tar.gz` with all
referenced documents + a manifest listing per-file SHA-256 hashes.
Counsel can verify file integrity on receipt by re-running
sha256sum against each file in the tarball.
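Receipt-side check, sketched under the assumption that the manifest inside the tarball is a `sha256sum`-format file (exact manifest and extraction-dir names are whatever the bundler emits):
```bash
tar -xzf counsel_packet_2026-05-05.tar.gz
cd counsel_packet_2026-05-05        # extraction dir name assumed
sha256sum -c *.sha256               # every referenced file should print OK
```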

docs/policies/consent/biometric_consent_template_v1.md

@@ -3,6 +3,7 @@
**Spec:** docs/PHASE_1_6_BIPA_GATES.md §1 Gate 2 (BIPA §15(b)(1)-(3))
**Status:** Engineering scaffold — ⚖ COUNSEL must author the binding text before deployment
**Version:** v1 (initial; supersession requires a new version + new hash)
**Updated 2026-05-05:** Disclosure 1 + plain-language summary revised to match the Gate 3b deferral recommendation in `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` (Option C — defer classifications). Pending J's product confirmation of Gate 3b; if Gate 3b chooses Option A or B, this template needs counsel re-authoring.
> This is the consent template a candidate signs (electronically or
> on paper) BEFORE Lakehouse collects, stores, or processes any
@@ -25,11 +26,12 @@ content; counsel provides the legally-sufficient wording.
### Disclosure 1 — Notice of collection (§15(b)(1))
-Lakehouse will collect, store, and use my **biometric identifier**
-(facial geometry derived from a photograph of me) and **biometric
-information** (gender, race, and age classifications derived from
-that photograph by an automated facial-classification model called
-deepface).
Lakehouse will collect and store my **biometric identifier** (a
photograph of me from which facial geometry is implicit). The
photograph itself is the data we keep — we do NOT run automated
facial-classification (gender / race / age inference) against it
in v1. If at a later date we add automated classification, we will
re-collect consent under a superseding template before doing so.
### Disclosure 2 — Specific purpose and length of term (§15(b)(2))
@@ -66,9 +68,9 @@ is appropriate to include or whether a different plain-language
section is preferred.
> **What you're agreeing to:** if you upload a photo of yourself,
-> we'll keep that photo and a few descriptive labels about the photo
-> (estimated age, perceived gender, perceived race) to help your
-> staffing coordinator recognize you when you arrive at job sites.
> we'll keep that photo so your staffing coordinator can recognize
> you when you arrive at job sites. We don't run automated guesses
> about your age, gender, or race against the photo.
>
> **How long we keep it:** at most 18 months after your last
> placement or interaction with us, then it's permanently destroyed.

docs/policies/consent/biometric_retention_schedule_v1.md

@@ -3,6 +3,7 @@
**Spec:** docs/PHASE_1_6_BIPA_GATES.md §1 Gate 1 (BIPA §15(a))
**Status:** Engineering scaffold — ⚖ COUNSEL must author the binding text before public publication
**Version:** v1 (initial; supersession requires a new version + new hash)
**Updated 2026-05-05:** §1 + §2 revised to match the Gate 3b deferral recommendation in `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` (Option C — defer classifications). §5 destruction-trigger endpoint corrected to the shipped catalogd-local route. Pending J's product confirmation of Gate 3b.
> This is a publicly-available retention schedule for biometric identifiers
> and biometric information collected by the Lakehouse staffing platform.
@@ -15,12 +16,15 @@
This schedule applies to:
-- **Biometric identifiers** as defined in 740 ILCS 14/10: facial geometry
-derived from candidate photographs.
- **Biometric identifiers** as defined in 740 ILCS 14/10: candidate
photographs from which facial geometry is implicit.
- **Biometric information** as defined in 740 ILCS 14/10: any information
-derived from a biometric identifier, including but not limited to
-the gender, race, and age classifications produced by the deepface
-model when applied to a candidate photograph.
derived from a biometric identifier. **In v1 of this schedule, no
derived information is collected** — automated facial-classification
(gender, race, age inference) is deferred per
`docs/specs/GATE_3B_DEEPFACE_DESIGN.md` Option C. If a future version
of this schedule introduces classification, that is a superseding
v2 schedule with re-consent under the matching v2 consent template.
**Out of scope** (explicitly NOT biometric data under this schedule):
@@ -39,14 +43,15 @@ This schedule applies to:
| Category | Source | Storage location |
|---|---|---|
-| Photograph (raw bytes) | Candidate upload via the consent-gated photo endpoint | Quarantined under `data/biometric/uploads/<candidate_id>/<ts>.<ext>`; encrypted at rest |
-| Facial geometry classifications | deepface inference run against the photograph | `subjects.biometric_classifications` (JSONB on the identityd `subjects` row) |
-| Photograph integrity hash | SHA-256 of the original bytes | `subjects.biometric_template_hash` |
| Photograph (raw bytes) | Candidate upload via the consent-gated photo endpoint | Quarantined under `data/biometric/uploads/<candidate_id>/<ts>_<uuid>.<ext>`; mode 0700 dir / 0600 file |
| Photograph integrity hash | SHA-256 of the original bytes | `SubjectManifest.biometric_collection.template_hash` (catalogd JSON manifest at `data/_catalog/subjects/<id>.json`) |
We do NOT collect raw biometric template vectors that could be used
-to re-derive a face from the encoded form. The deepface output is
-stored as discrete classification labels (e.g. `{"age_estimate": 32,
-"gender": "...", "race": "..."}`), not as a re-identifiable embedding.
to re-derive a face from the encoded form. We do NOT run automated
facial-classification (gender, race, age inference) in v1 — see
`docs/specs/GATE_3B_DEEPFACE_DESIGN.md` for the deferral rationale.
The `BiometricCollection.classifications` field on the subject
manifest exists in the schema but is `None` for every subject.
---
@@ -104,8 +109,8 @@ Runbook** (`docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md`) when:
- Retention period under §4 expires
- Candidate withdraws biometric consent under the consent template (Gate 2)
- Candidate exercises a right-to-be-forgotten request
-- An identityd `POST /v1/identity/subjects/{id}/erase` is invoked under
-legal-tier authentication
- A catalogd-local `POST /biometric/subject/{id}/erase` is invoked
under legal-tier authentication (shipped `848a458`)
Every destruction event is recorded as an append-only audit row in
the affected subject's per-subject HMAC-chained audit log (see

docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md

@@ -66,10 +66,11 @@ Before initiating destruction, the operator MUST:
Invoke the legal-tier erasure endpoint:
```bash
curl -sf -X POST "http://localhost:3100/biometric/subject/${CANDIDATE_ID}/erase" \
-H "Authorization: Bearer $(cat /etc/lakehouse/legal_audit.token)" \
-H "Content-Type: application/json" \
-d '{
"scope": "biometric_only|full",
"trigger": "retention_expiry|consent_withdrawal|rtbf|court_order", "trigger": "retention_expiry|consent_withdrawal|rtbf|court_order",
"trigger_evidence_path": "<path to signed artifact>", "trigger_evidence_path": "<path to signed artifact>",
"operator_of_record": "<operator name>", "operator_of_record": "<operator name>",
@@ -77,17 +78,25 @@ curl -sf -X POST "http://localhost:3100/v1/identity/subjects/${CANDIDATE_ID}/erase" \
}'
```
-⚖ ENGINEERING — `POST /v1/identity/subjects/{id}/erase` is Phase 1.6
-Gate 3 dependent. Until it ships, the manual procedure is:
-a. Set `SubjectManifest.consent.biometric.status = "withdrawn"` and
-`SubjectManifest.status = "erased"` via direct registry write
-(operator-of-record only).
-b. Securely overwrite + unlink the quarantined photo path:
-`shred -uvz data/biometric/uploads/${CANDIDATE_ID}/*.jpg`
-(or equivalent for the configured backend).
-c. NULL the deepface classification fields on the subject row.
-d. Append the destruction-event audit row (Step 2 below).
The endpoint is **shipped** (commit `848a458`, 21 unit tests). It is
served from catalogd-local at `/biometric/subject/{id}/erase` (the
original v1 spec proposed `/v1/identity/subjects/{id}/erase` under a
separate identityd daemon — that daemon was collapsed into catalogd
per the architecture pivot).

The endpoint exposes two scopes:

- **`scope: "biometric_only"`** — clears `BiometricCollection` from
the SubjectManifest (drops `data_path`, `template_hash`, and
`classifications` together) + securely unlinks the quarantined
photo file. Subject manifest itself remains. Use for retention
expiry / consent withdrawal where only biometric data must go.
- **`scope: "full"`** — full subject erasure (manifest + biometric
files). Use for court-ordered erasure or full RTBF requests.

In both scopes, the audit row is appended BEFORE photo unlink so
the chain has legal proof of intent even if the file delete fails
(transactional rollback on audit failure).
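For example, a biometric-only erasure on consent withdrawal (values illustrative; any remaining body fields follow the template above):

```bash
curl -sf -X POST "http://localhost:3100/biometric/subject/WORKER-2/erase" \
  -H "Authorization: Bearer $(cat /etc/lakehouse/legal_audit.token)" \
  -H "Content-Type: application/json" \
  -d '{
    "scope": "biometric_only",
    "trigger": "consent_withdrawal",
    "trigger_evidence_path": "docs/evidence/WORKER-2_withdrawal.pdf",
    "operator_of_record": "J"
  }'
```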
### Step 2 — Append the destruction-event audit row
@@ -224,5 +233,12 @@ training program.
## 8. Change log
- 2026-05-05 — Endpoint path reconciled with shipped state:
`/v1/identity/subjects/{id}/erase` (legacy proposal under a
separate identityd daemon) → `/biometric/subject/{id}/erase`
(catalogd-local, shipped `848a458`). Step 1 manual-fallback
block removed (the endpoint is no longer "TODO"). Two-scope
body shape (`biometric_only` / `full`) documented to match
the implementation.
- 2026-05-03 — Initial scaffold. ⚖ COUNSEL review required before adoption.


@@ -0,0 +1,308 @@
# Legal-Tier Audit Key & Token Rotation Runbook
**Spec companion:** `docs/PHASE_1_6_BIPA_GATES.md` §2 + `docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md`
**Audience:** Operators with root on the gateway host (J + named operators)
**Status:** Engineering-authored — ⚖ counsel review encouraged before formal adoption
> This runbook covers rotation of the two crypto-credentials that gate
> the Phase 1.6 audit substrate:
>
> 1. **`LH_SUBJECT_AUDIT_KEY`** — the 32-byte HMAC-SHA256 signing key
> that chains every per-subject audit row. If this key changes, all
> pre-rotation chain rows tamper-detect under the new key. That is
> correct, expected, BIPA-defensible behavior — the chain integrity
> it provided pre-rotation remains intact in the archive of the old
> key, and post-rotation chains remain intact going forward.
>
> 2. **`LH_LEGAL_AUDIT_TOKEN`** — the 32+-character bearer token that
> authorizes calls to `/audit/subject/{id}` and
> `/biometric/subject/{id}/erase`. Rotation does NOT touch any audit
> history; only access to the legal-tier endpoints flips.
>
> Both live at `/etc/lakehouse/` (mode 0400, owned by root) and are
> loaded by the gateway via systemd `Environment=` directives in
> `/etc/systemd/system/lakehouse.service.d/audit_env.conf`. They are
> NOT loaded from `/tmp` — a 2026-05-05 reboot incident wiped a
> `/tmp`-resident key and caused `/audit` + `/biometric` to fail-closed
> (which is what they should do); the rotation fix moved them to the
> persistent path.
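A read-only way to confirm that wiring on a live host; this sketch only inspects the unit and does not assume whether the variables carry file paths or raw bytes:

```bash
# Show the drop-in plus the environment systemd actually passes to the
# gateway. If the values embed raw key material this prints it, so
# treat the output as sensitive.
sudo cat /etc/systemd/system/lakehouse.service.d/audit_env.conf
sudo systemctl show lakehouse.service -p Environment
```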
---
## 1. When to rotate
Rotate **immediately** when any of the following is true:
| Trigger | Urgency | Notes |
|---|---|---|
| Suspected operator credential compromise | Within 1 hour | Token mismatch is fail-closed by default; immediate rotation closes the window. |
| Operator with legal-tier access leaves the team | Within 24 hours | Treat as compromise. |
| Key/token file's filesystem permissions were ever weakened (mode > 0400, group readable, etc.) | Within 24 hours | Filesystem audit may have leaked the bytes. |
| Token was ever transmitted over an untrusted channel (printed in CI log, sent over SMS, etc.) | Within 24 hours | Same reasoning. |
| Scheduled rotation (recommended) | Every 90 days | BIPA does not mandate a rotation cadence; counsel may set one. |
Do **not** rotate when:
- A subject's audit chain tamper-detects in isolation. That is normal
if the audit log was edited (which would itself be the BIPA finding,
not the key). Investigate the chain, not the key (a triage snippet
follows this list).
- Cross-runtime parity drift appears. That's an HMAC-input-shape bug
(Go vs Rust serialization), not a key issue. See
`STATE_OF_PLAY.md` "three runtime-divergence classes" entry.
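Both non-rotation cases can be triaged read-only before anyone touches the key. A minimal sketch, using the legal-token header from §2.2 (`<candidate_id>` is a placeholder):

```bash
# A key problem breaks EVERY chain at once; an edited log breaks only
# the affected subject. Check the suspect chain in isolation first.
TOKEN=$(cat /etc/lakehouse/legal_audit.token)
curl -sf -H "X-Lakehouse-Legal-Token: $TOKEN" \
  "http://localhost:3100/audit/subject/<candidate_id>" \
  | jq '.audit_log | {chain_verified, chain_verification_error}'
```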
---
## 2. Pre-rotation checks (5 minutes)
Before generating new credentials, capture a clean baseline so you can
prove the rotation cause and sequence afterward.
### 2.1. Take the engineering snapshot
```bash
# Confirm the canonical files exist with correct permissions.
ls -la /etc/lakehouse/subject_audit.key /etc/lakehouse/legal_audit.token
# Hash the existing key + token (NEVER the bytes themselves) so the
# old credential is identifiable in retrospect without storing it.
sha256sum /etc/lakehouse/subject_audit.key
sha256sum /etc/lakehouse/legal_audit.token
# Confirm the gateway is currently using these files.
sudo systemctl cat lakehouse.service | grep -E "Environment.*AUDIT"
# Verify the audit endpoint is healthy with the current credentials.
curl -sf http://localhost:3100/audit/health
```
If `/audit/health` is already 503, the rotation is **recovery**, not
preventive — note this in the rotation event record (§5).
### 2.2. Capture a known-good chain root
Pick one or two subjects with non-empty audit logs and record their
chain roots **under the current key**:
```bash
TOKEN=$(cat /etc/lakehouse/legal_audit.token)
for cid in WORKER-2 WORKER-100; do
curl -sf -H "X-Lakehouse-Legal-Token: $TOKEN" \
"http://localhost:3100/audit/subject/$cid" \
| jq '{cid: .candidate_id, verified: .audit_log.chain_verified, root: .audit_log.chain_root, rows: .audit_log.chain_rows_total}'
done
```
Save the output. Post-rotation, those chains will tamper-detect under
the new key — that is **expected** and the saved snapshot is the proof
that the chain WAS intact under the old key, before rotation.
---
## 3. Generation + rotation
### 3.1. Generate the new key
```bash
# 33 random bytes base64-encoded = exactly 44 chars with no '=' padding.
# (64-char hex of 32 bytes would also work for HMAC-SHA256; we keep the
# existing 44-char convention.)
sudo install -m 0400 -o root -g root <(openssl rand -base64 33 | tr -d '\n=' | head -c 44) \
/etc/lakehouse/subject_audit.key.new
sudo install -m 0400 -o root -g root <(openssl rand -base64 33 | tr -d '\n=' | head -c 44) \
/etc/lakehouse/legal_audit.token.new
# Sanity: confirm 44-char content + correct mode.
sudo wc -c /etc/lakehouse/subject_audit.key.new /etc/lakehouse/legal_audit.token.new
sudo ls -la /etc/lakehouse/*.new
```
Both must be `mode 0400`, owned by root, exactly **44 chars** (the
audit endpoint refuses tokens shorter than 32 chars at load — see
`crates/catalogd/src/audit_endpoint.rs:73`).
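A pre-flight guard an operator might add before the swap (a suggestion, not part of the shipped procedure), enforcing the same floor locally:

```bash
# Refuse to proceed if either new credential is under the 32-char
# minimum the endpoint enforces at load.
for f in /etc/lakehouse/subject_audit.key.new /etc/lakehouse/legal_audit.token.new; do
  LEN=$(sudo wc -c "$f" | awk '{print $1}')
  [ "$LEN" -ge 32 ] || { echo "FAIL: $f is $LEN bytes (< 32)" >&2; exit 1; }
done
```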
### 3.2. Atomic swap
The gateway reads these files **once at boot** (per
`crates/catalogd/src/audit_endpoint.rs::AuditEndpointState::new` and
the equivalent for the writer). Atomic mv → restart is required.
```bash
# Move the old credentials to a quarantine path with timestamp so the
# old hashes remain identifiable post-rotation.
TS=$(date -u +%Y%m%dT%H%M%SZ)
sudo install -d -m 0700 -o root -g root /etc/lakehouse/_archived
sudo mv /etc/lakehouse/subject_audit.key /etc/lakehouse/_archived/subject_audit.key.$TS
sudo mv /etc/lakehouse/legal_audit.token /etc/lakehouse/_archived/legal_audit.token.$TS
sudo mv /etc/lakehouse/subject_audit.key.new /etc/lakehouse/subject_audit.key
sudo mv /etc/lakehouse/legal_audit.token.new /etc/lakehouse/legal_audit.token
sudo ls -la /etc/lakehouse/subject_audit.key /etc/lakehouse/legal_audit.token
```
### 3.3. Restart the gateway
```bash
sudo systemctl restart lakehouse.service
sleep 2
sudo systemctl status lakehouse.service --no-pager | head -10
```
Wait for the gateway to bind port 3100 cleanly. If it doesn't, check
`journalctl -u lakehouse.service -n 50 --no-pager` for the failure
mode — the most common cause is the new file having wrong mode/owner.
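One more probe worth running before moving to §4 (a suggestion, not a mandated step):

```bash
# Confirm the gateway actually bound port 3100 after the restart.
if ss -ltn | grep -q ':3100 '; then
  echo "gateway listening on :3100"
else
  echo "NOT bound; check journalctl -u lakehouse.service" >&2
fi
```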
---
## 4. Post-rotation verification (5 minutes)
### 4.1. Health probes
```bash
# Audit endpoint must be 200, not 503.
curl -sf http://localhost:3100/audit/health
# Expect: "audit endpoint ready"
# /v1/health must list the gateway's full provider set.
curl -sf http://localhost:3100/v1/health | jq '.providers, .worker_count'
```
### 4.2. Confirm the new token works
```bash
NEW_TOKEN=$(cat /etc/lakehouse/legal_audit.token)
curl -sS -o /dev/null -w '%{http_code}\n' \
-H "X-Lakehouse-Legal-Token: $NEW_TOKEN" \
http://localhost:3100/audit/subject/WORKER-100
# Expect: 200
```
If 401, the file the gateway loaded does NOT match the file you wrote.
Check ownership, mode, and trailing-whitespace differences with
`hexdump -C /etc/lakehouse/legal_audit.token | head`.
### 4.3. Confirm the new chain works
Append-only chains are key-tied. Any *new* audit row written
post-rotation is signed under the new key and verifies cleanly:
```bash
# Issue a /v1/validate call against any worker — it spawns an audit row.
curl -sf -X POST http://localhost:3100/v1/validate \
-H 'Content-Type: application/json' \
-d '{"mode":"fill","candidate_id":"WORKER-100","worker_id":"WORKER-100","fields":["exists"]}' >/dev/null
# Read the chain back. Last row must verify under the new key.
curl -sf -H "X-Lakehouse-Legal-Token: $NEW_TOKEN" \
http://localhost:3100/audit/subject/WORKER-100 \
| jq '.audit_log | {verified: .chain_verified, rows: .chain_rows_total, last_kind: .rows[-1].accessor.kind}'
```
`chain_verified: true` confirms the new key is signing + verifying.
### 4.4. Confirm pre-rotation chains tamper-detect (expected)
```bash
curl -sf -H "X-Lakehouse-Legal-Token: $NEW_TOKEN" \
http://localhost:3100/audit/subject/WORKER-2 \
| jq '.audit_log | {verified: .chain_verified, error: .chain_verification_error}'
```
For any subject whose chain was written under the old key, this
returns `chain_verified: false` with an HMAC-mismatch error. **This
is correct behavior**, not a bug. The old chain was correctly signed
under the old key and verified under it; the new key cannot retroactively
verify rows it didn't sign. The pre-rotation snapshot you captured in
§2.2 is the defensible proof that those rows WERE valid pre-rotation.
If, instead, you see a chain that *should* verify post-rotation
returning `verified: false`, that's the rotation having gone wrong —
likely an old-key file that didn't get archived cleanly. Restore the
timestamped files (`subject_audit.key.<ts>`, `legal_audit.token.<ts>`)
from `/etc/lakehouse/_archived/`, then re-attempt.
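A restore sketch for that failure mode, using the §3.2 archive naming (`<ts>` is the timestamp captured during the swap):

```bash
# Roll back to the archived credentials, then restart and re-verify.
TS="<ts>"
sudo mv "/etc/lakehouse/_archived/subject_audit.key.$TS" /etc/lakehouse/subject_audit.key
sudo mv "/etc/lakehouse/_archived/legal_audit.token.$TS" /etc/lakehouse/legal_audit.token
sudo systemctl restart lakehouse.service
```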
---
## 5. Record the rotation event
Append a row to the rotation log:
```bash
sudo tee -a /etc/lakehouse/_archived/rotation_log.jsonl <<EOF
{"ts":"$(date -u +%Y-%m-%dT%H:%M:%SZ)","operator":"<your name>","reason":"<scheduled|compromise|cred_loss|recovery>","old_key_sha256":"<hash from §2.1>","new_key_sha256":"$(sha256sum /etc/lakehouse/subject_audit.key | awk '{print $1}')","old_token_sha256":"<hash from §2.1>","new_token_sha256":"$(sha256sum /etc/lakehouse/legal_audit.token | awk '{print $1}')","witness":"<witness name or N/A for routine>"}
EOF
sudo chmod 0600 /etc/lakehouse/_archived/rotation_log.jsonl
sudo chown root:root /etc/lakehouse/_archived/rotation_log.jsonl
```
This file is the operator-side record of when the key changed and why.
It does NOT contain the key itself — only hashes — so it is safe to
back up and share with counsel on request.
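A spot-check that ties the newest log row to the live files (both reads need root, given the file modes):

```bash
# The newest rotation row's hashes must match the live credentials.
sudo tail -n 1 /etc/lakehouse/_archived/rotation_log.jsonl \
  | jq -r '"key   " + .new_key_sha256, "token " + .new_token_sha256'
sudo sha256sum /etc/lakehouse/subject_audit.key /etc/lakehouse/legal_audit.token
```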
---
## 6. Recovery from a lost key
If the active `subject_audit.key` is destroyed (filesystem corruption,
accidental delete, /tmp wipe per the 2026-05-05 incident), the gateway
will fail-closed at startup:
- `/audit/subject/{id}` → 503 ("audit endpoint disabled (legal token missing)" or equivalent for the signing key)
- `/biometric/subject/{id}/photo` → 503 (same fail-closed posture)
This is correct behavior — a server that cannot HMAC-sign new audit
rows must not accept new biometric writes.
**Recovery is rotation.** Generate a new key per §3.1, atomic-swap
per §3.2, restart per §3.3, verify per §4. Pre-loss chains tamper-detect
under the new key (the old key is gone — there is no way to verify
them). Treat the loss event as the BIPA-defensible boundary: pre-loss
chain verification was provided by the working key; post-loss new
chains are signed under the new key.
If a counsel-grade attestation of the pre-loss chains is needed,
`/etc/lakehouse/_archived/rotation_log.jsonl` holds the historical
key hashes. Combined with the cross-runtime parity probe (the Go
reader gives the same byte-identical view as Rust), the pre-loss
chain history is preservable as long as the on-disk JSONL files
were not also lost.
---
## 7. ⚖ counsel notes
These are areas where counsel may want to opine before this runbook
is formally adopted:
1. **Rotation cadence.** BIPA itself does not require periodic rotation;
counsel may set a 90-day schedule to satisfy a separate compliance
posture (SOC2, internal policy).
2. **Custody of `/etc/lakehouse/_archived/`.** The rotation-log hashes
do NOT expose the keys, but the archived raw key files DO. Counsel
may want a more aggressive destruction schedule for the raw archived
keys — say 1 year — to reduce a long-tail compromise surface (a
destruction sketch follows this list).
3. **Notification obligations on rotation due to compromise.** §1
triggers a rotation; §1 does not address whether candidates whose
biometric data was protected by the compromised key must be notified.
This is a counsel call.
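If counsel adopts the more aggressive schedule in item 2, a destruction sketch (the 365-day cadence and the exact command are assumptions pending sign-off):

```bash
# Shred raw archived keys older than a year; the rotation log's hashes
# remain, so the rotation history stays attestable.
sudo find /etc/lakehouse/_archived -name 'subject_audit.key.*' -mtime +365 \
  -exec shred -uvz {} \;
sudo find /etc/lakehouse/_archived -name 'legal_audit.token.*' -mtime +365 \
  -exec shred -uvz {} \;
```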
---
## 8. Operator acknowledgment
| Operator | Date acknowledged | Signature |
|---|---|---|
| J | _____ | _______________ |
| _____ | _____ | _______________ |
---
## 9. Change log
- 2026-05-05 — Initial runbook authored after the /tmp wipe incident
on the same day (key was at `/tmp/subject_audit.key` and was deleted
on reboot, disabling `/audit` + `/biometric` until the key was
regenerated at `/etc/lakehouse/subject_audit.key`). Recovery of
that incident produced a working procedure; this runbook captures
it as the canonical playbook for any future rotation.


@@ -1,6 +1,8 @@
# Gate 3b — Deepface Classification Integration (Design)
**Status:** **DECIDED 2026-05-05 — Option C (defer classifications)** · Original design draft 2026-05-03 morning · **Companion to:** [`PHASE_1_6_BIPA_GATES.md`](../PHASE_1_6_BIPA_GATES.md) Gate 3 · **Depends on:** Gate 3a (photo upload) which is shipped (`f1fa6e4`)
> **Decision summary (2026-05-05):** J accepted Option C. `BiometricCollection.classifications` remains `Option<JSON> = None` in v1. The consent template and retention schedule were revised the same day to remove all "automated facial-classification" language so the disclosed scope matches the implemented scope. If a real product requirement for classifications surfaces later, this doc's Option A (Python subprocess) or Option B (ONNX-in-Rust) is picked up under a v2 consent template + v2 retention schedule.
> **What this is.** Three options for how `BiometricCollection.classifications` (currently `Option<JSON>`, always `None`) gets populated by an automated facial-attribute classifier. Phase 1.6 Gate 3a ships the consent-gated upload + audit chain + transactional rollback; Gate 3b adds the classification step. The substrate is ready — what's missing is the design choice for HOW classification happens.
>
@@ -152,6 +154,8 @@ Reasoning:
⚖ J — pick A / B / C. The substrate accommodates any choice; the cost is the design-doc → counsel-coordination → engineering loop, which differs by an order of magnitude across the options.
**[2026-05-05] J's decision: Option C.** Reasoning recorded in change log below. Consent + retention doc revisions for Option C shipped same day; counsel review of revised text is the remaining work.
---

## Open questions for J


@@ -0,0 +1,263 @@
#!/usr/bin/env bash
# biometric_destruction_report — monthly destruction event aggregation.
#
# Specification: docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md §5.
# Spec: docs/PHASE_1_6_BIPA_GATES.md §1 Gate 5.
#
# Why this exists: counsel and operations review need a periodic
# attestation that destructions have happened in a defensible cadence.
# This script produces an anonymized monthly report aggregating
# per-subject audit logs.
#
# Output is anonymized — counts, timings, scope/trigger breakdowns,
# and chain attestations. Candidate IDs are hashed (sha256-prefix) so
# the report can be shared with counsel without exposing identifiers.
#
# Usage:
# biometric_destruction_report.sh \
# [--month YYYY-MM] \
# [--audit-dir data/_catalog/subjects] \
# [--output reports/biometric/destruction_<month>.md]
#
# Defaults:
# --month — current UTC month (YYYY-MM)
# --audit-dir — data/_catalog/subjects
# --output — reports/biometric/destruction_<month>.md
#
# Exit codes:
# 0 — report written successfully (whether or not events were found)
# 1 — report written but with anomalies that need review
# 2 — script error (missing tools, unreadable audit dir)
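#
# Example (illustrative invocation):
#   ./scripts/staffing/biometric_destruction_report.sh --month 2026-05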
set -uo pipefail
cd "$(dirname "$0")/../.."
MONTH=""
AUDIT_DIR="data/_catalog/subjects"
OUT=""
while [ "$#" -gt 0 ]; do
case "$1" in
--month) MONTH="$2"; shift 2 ;;
--audit-dir) AUDIT_DIR="$2"; shift 2 ;;
--output) OUT="$2"; shift 2 ;;
-h|--help)
sed -n '2,30p' "$0" | sed 's/^# \?//'
exit 0 ;;
*) echo "unknown flag: $1" >&2; exit 2 ;;
esac
done
# Default month = current UTC YYYY-MM. Validate format defensively
# so a malformed --month value (e.g. "May 2026") doesn't silently
# match nothing in the JSONL filter.
if [ -z "$MONTH" ]; then
MONTH=$(date -u +%Y-%m)
fi
if ! echo "$MONTH" | grep -qE '^[0-9]{4}-(0[1-9]|1[0-2])$'; then
echo "[report] FAIL: --month must be YYYY-MM, got '$MONTH'" >&2
exit 2
fi
if [ -z "$OUT" ]; then
OUT="reports/biometric/destruction_${MONTH}.md"
fi
# Dependency gates.
for cmd in jq sha256sum; do
if ! command -v "$cmd" >/dev/null 2>&1; then
echo "[report] FAIL: required tool '$cmd' not found in PATH" >&2
exit 2
fi
done
if [ ! -d "$AUDIT_DIR" ]; then
echo "[report] FAIL: audit dir not found at $AUDIT_DIR" >&2
exit 2
fi
mkdir -p "$(dirname "$OUT")"
# Aggregator storage.
EVENTS=$(mktemp)
ANOMALIES=$(mktemp)
trap 'rm -f "$EVENTS" "$ANOMALIES"' EXIT
# Iterate every per-subject audit log under AUDIT_DIR. Each file is
# JSONL — one row per line. We extract erasure rows in the requested
# month + emit a normalized one-line record per event.
TOTAL_FILES=0
TOTAL_ROWS_SCANNED=0
SHARDS_WITH_EVENTS=0
for f in "$AUDIT_DIR"/*.audit.jsonl; do
[ -e "$f" ] || continue
TOTAL_FILES=$((TOTAL_FILES + 1))
# File-level row count (cheap).
ROWS=$(wc -l < "$f" 2>/dev/null || echo 0)
TOTAL_ROWS_SCANNED=$((TOTAL_ROWS_SCANNED + ROWS))
# Filter rows for the month + erasure kinds.
HAD_EVENT=0
while IFS= read -r line; do
[ -n "$line" ] || continue
KIND=$(printf '%s' "$line" | jq -r '.accessor.kind // ""' 2>/dev/null || echo "")
case "$KIND" in
biometric_erasure|full_erasure) ;;
*) continue ;;
esac
TS=$(printf '%s' "$line" | jq -r '.ts // ""' 2>/dev/null || echo "")
case "$TS" in
"${MONTH}-"*) ;; # only this month
*) continue ;;
esac
HAD_EVENT=1
CID=$(printf '%s' "$line" | jq -r '.candidate_id // ""' 2>/dev/null || echo "")
PURPOSE=$(printf '%s' "$line" | jq -r '.accessor.purpose // ""' 2>/dev/null || echo "")
RESULT=$(printf '%s' "$line" | jq -r '.result // ""' 2>/dev/null || echo "")
# accessor.purpose has shape "trigger=<name>;..." per biometric_endpoint
TRIGGER=$(printf '%s' "$PURPOSE" | sed -nE 's/.*trigger=([a-z_]+).*/\1/p')
[ -n "$TRIGGER" ] || TRIGGER="unknown"
# Hash candidate_id so the report stays anonymized.
CID_HASH=$(printf '%s' "$CID" | sha256sum | awk '{print substr($1,1,12)}')
# Anomaly: erasure row but result not in {erased, success}.
case "$RESULT" in
erased|success) ;;
*)
echo " - candidate_hash=$CID_HASH ts=$TS kind=$KIND result=$RESULT trigger=$TRIGGER (unexpected result)" >> "$ANOMALIES"
;;
esac
# Tab-separated event line: ts, kind, trigger, result, cid_hash
printf '%s\t%s\t%s\t%s\t%s\n' "$TS" "$KIND" "$TRIGGER" "$RESULT" "$CID_HASH" >> "$EVENTS"
done < "$f"
if [ "$HAD_EVENT" = "1" ]; then
SHARDS_WITH_EVENTS=$((SHARDS_WITH_EVENTS + 1))
fi
done
EVENT_COUNT=$(wc -l < "$EVENTS" 2>/dev/null || echo 0)
EVENT_COUNT=$(printf '%s' "$EVENT_COUNT" | tr -d '[:space:]')
: "${EVENT_COUNT:=0}"
# Compute breakdowns.
COUNT_BIOMETRIC_ONLY=0
COUNT_FULL=0
if [ "$EVENT_COUNT" != "0" ]; then
COUNT_BIOMETRIC_ONLY=$(awk -F '\t' '$2=="biometric_erasure"' "$EVENTS" | wc -l | tr -d '[:space:]')
COUNT_FULL=$(awk -F '\t' '$2=="full_erasure"' "$EVENTS" | wc -l | tr -d '[:space:]')
fi
ANOMALY_COUNT=$(wc -l < "$ANOMALIES" 2>/dev/null || echo 0)
ANOMALY_COUNT=$(printf '%s' "$ANOMALY_COUNT" | tr -d '[:space:]')
: "${ANOMALY_COUNT:=0}"
# Render the report.
GENERATED_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)
{
echo "# Biometric Destruction Report — $MONTH"
echo
echo "**Generated:** $GENERATED_AT"
echo "**Audit dir scanned:** \`$AUDIT_DIR\`"
echo "**Spec:** docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md §5"
echo "**Generator:** scripts/staffing/biometric_destruction_report.sh"
echo
echo "## Scope"
echo
echo "- **Subject audit shards scanned:** $TOTAL_FILES"
echo "- **Audit rows scanned (all kinds):** $TOTAL_ROWS_SCANNED"
echo "- **Shards containing $MONTH erasure events:** $SHARDS_WITH_EVENTS"
echo
echo "## Destruction events in $MONTH"
echo
echo "- **Total events:** $EVENT_COUNT"
echo "- **By scope:**"
echo " - \`biometric_erasure\` (BiometricCollection cleared, manifest retained): $COUNT_BIOMETRIC_ONLY"
echo " - \`full_erasure\` (manifest + biometric data cleared): $COUNT_FULL"
echo
if [ "$EVENT_COUNT" = "0" ]; then
echo "**No destruction events recorded for $MONTH.** This is correct"
echo "for a month with no retention expiries / withdrawal requests"
echo "/ RTBF requests / court orders."
echo
else
echo "### By trigger"
echo
echo "| Trigger | Count |"
echo "|---|---|"
awk -F '\t' '{print $3}' "$EVENTS" | sort | uniq -c | \
sort -rn | awk '{ printf("| %s | %d |\n", $2, $1); }'
echo
echo "### Event detail (anonymized)"
echo
echo "Candidate IDs are hashed (sha256-12-prefix) so this report can"
echo "be shared with outside counsel without exposing identifiers."
echo
echo "| ts | kind | trigger | result | candidate_hash |"
echo "|---|---|---|---|---|"
sort -k1,1 "$EVENTS" | awk -F '\t' '{
printf("| %s | %s | %s | %s | %s |\n", $1, $2, $3, $4, $5);
}'
echo
fi
if [ "$ANOMALY_COUNT" != "0" ]; then
echo "## Anomalies ($ANOMALY_COUNT)"
echo
echo "Events whose audit row deviates from expected shape (kind/result"
echo "mismatch, missing trigger, etc.). These do NOT necessarily mean"
echo "the destruction failed — the BIPA-load-bearing surface is the"
echo "audit chain, which still verifies cryptographically. They are"
echo "logged here so an operator can investigate and confirm."
echo
echo '```'
cat "$ANOMALIES"
echo '```'
echo
fi
echo "## Cryptographic attestation"
echo
echo "This report was produced by aggregating per-subject HMAC-chained"
echo "audit logs. The chain itself is the BIPA-defensible substrate;"
echo "this report is a derived view, not the chain of record. To verify"
echo "any individual event, run:"
echo
echo '```bash'
echo "./scripts/staffing/verify_biometric_erasure.sh <candidate_id>"
echo '```'
echo "(operator must un-hash the candidate ID through their own"
echo " operator log to perform spot-checks)."
echo
echo "**Cross-runtime parity:** the same audit logs are byte-identical"
echo "under Rust + Go (per scripts/cutover/parity/subject_audit_parity.sh)."
echo "If counsel needs cross-runtime attestation, that probe provides it."
echo
EVIDENCE_HASH=$(sha256sum "$EVENTS" 2>/dev/null | awk '{print $1}')
: "${EVIDENCE_HASH:=$(echo -n '' | sha256sum | awk '{print $1}')}"
echo "**Events SHA-256:** \`$EVIDENCE_HASH\`"
echo
echo "---"
echo
echo "**Operator (J):** _______________________________ Date: __________"
echo
} > "$OUT"
echo "[report] $EVENT_COUNT destruction events in $MONTH ($COUNT_BIOMETRIC_ONLY biometric_only, $COUNT_FULL full)"
echo "[report] anomalies: $ANOMALY_COUNT"
echo "[report] output: $OUT"
# Exit 1 if anomalies present (review needed) but report still written.
if [ "$ANOMALY_COUNT" != "0" ]; then
exit 1
fi
exit 0


@@ -0,0 +1,99 @@
#!/usr/bin/env bash
# bundle_counsel_packet — assemble the counsel-review packet tarball.
#
# Specification: docs/counsel/COUNSEL_REVIEW_PACKET_<DATE>.md §9.
#
# Why this exists: the cover note references a list of documents.
# Counsel needs them as a single transmittable artifact, with per-file
# integrity hashes so they can verify nothing changed in transit.
#
# Output:
# reports/counsel/counsel_packet_<DATE>.tar.gz
# reports/counsel/counsel_packet_<DATE>.manifest.txt (sha256 per file)
#
# Usage:
# bundle_counsel_packet.sh [--date YYYY-MM-DD]
#
# Exit codes:
# 0 — packet bundled successfully
# 1 — one or more referenced documents are missing
# 2 — script error (missing tools, write failure)
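#
# Example (illustrative invocation):
#   ./scripts/staffing/bundle_counsel_packet.sh --date 2026-05-05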
set -uo pipefail
cd "$(dirname "$0")/../.."
DATE="$(date -u +%Y-%m-%d)"
while [ "$#" -gt 0 ]; do
case "$1" in
--date) DATE="$2"; shift 2 ;;
-h|--help)
sed -n '2,20p' "$0" | sed 's/^# \?//'
exit 0 ;;
*) echo "unknown flag: $1" >&2; exit 2 ;;
esac
done
# Dependency gate.
for cmd in tar sha256sum; do
if ! command -v "$cmd" >/dev/null 2>&1; then
echo "[bundle] FAIL: required tool '$cmd' not found in PATH" >&2
exit 2
fi
done
# Files in the packet. Order is the recommended counsel-review order
# from the cover note §6.
FILES=(
"docs/counsel/COUNSEL_REVIEW_PACKET_${DATE}.md"
"docs/policies/consent/biometric_retention_schedule_v1.md"
"docs/policies/consent/biometric_consent_template_v1.md"
"docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md"
"docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md"
"docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md"
"docs/specs/GATE_3B_DEEPFACE_DESIGN.md"
"docs/PHASE_1_6_BIPA_GATES.md"
)
# Verify all referenced files exist before opening the tarball.
MISSING=0
for f in "${FILES[@]}"; do
if [ ! -r "$f" ]; then
echo "[bundle] MISSING: $f" >&2
MISSING=$((MISSING + 1))
fi
done
if [ "$MISSING" -gt 0 ]; then
echo "[bundle] FAIL: $MISSING required documents missing — aborting" >&2
exit 1
fi
OUT_DIR="reports/counsel"
mkdir -p "$OUT_DIR"
TARBALL="$OUT_DIR/counsel_packet_${DATE}.tar.gz"
MANIFEST="$OUT_DIR/counsel_packet_${DATE}.manifest.txt"
# Build the manifest first — counsel uses this to verify integrity.
{
echo "# Counsel Packet Manifest — $DATE"
echo "# Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo "# Each file is listed with its SHA-256 hash. To verify on receipt:"
echo "# tar xzf counsel_packet_${DATE}.tar.gz"
echo "# sha256sum -c counsel_packet_${DATE}.manifest.txt"
echo "# (re-format the lines below with two spaces between hash and path"
echo "# for sha256sum -c compatibility — sha256sum's strict format)"
echo
for f in "${FILES[@]}"; do
sha256sum "$f"
done
} > "$MANIFEST"
# Build the tarball — include the manifest itself.
tar -czf "$TARBALL" "${FILES[@]}" "$MANIFEST"
PACKET_HASH=$(sha256sum "$TARBALL" | awk '{print $1}')
echo "[bundle] packet: $TARBALL"
echo "[bundle] manifest: $MANIFEST"
echo "[bundle] tarball SHA-256: $PACKET_HASH"
echo "[bundle] files: ${#FILES[@]}"


@@ -0,0 +1,266 @@
#!/usr/bin/env bash
# verify_biometric_erasure — confirm that a biometric erasure completed cleanly.
#
# Specification: docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md §3 step 3.
# Spec: docs/PHASE_1_6_BIPA_GATES.md §1 Gate 5.
#
# Why this exists: when an operator runs the erasure curl call against
# /biometric/subject/{id}/erase, they need a defensible artifact proving
# destruction completed. This script produces that artifact by checking
# four things:
#
# 1. SubjectManifest.biometric_collection is null (catalogd cleared the row)
# 2. data/biometric/uploads/<safe_id>/ is empty or absent (photo file gone)
# 3. Most recent audit row has accessor.kind in {biometric_erasure, full_erasure}
# AND result is "erased" or "success" (the chain logged the erasure intent)
# 4. audit_log.chain_verified is true (HMAC chain still intact end-to-end)
#
# All four must pass for an operator to mark the destruction complete.
#
# Usage:
# verify_biometric_erasure.sh <candidate_id> [--from ISO] [--to ISO]
#
# Environment:
# GATEWAY_URL — default http://localhost:3100
# LEGAL_TOKEN_FILE — default /etc/lakehouse/legal_audit.token
# UPLOADS_ROOT — default data/biometric/uploads (relative to repo root)
# OUT_DIR — default reports/biometric (where the verification report lands)
#
# Exit codes:
# 0 — all four checks pass; erasure verified
# 1 — one or more checks failed; do NOT mark destruction complete; escalate
# 2 — script error (missing tools, network failure, bad token)
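#
# Example (illustrative invocation):
#   ./scripts/staffing/verify_biometric_erasure.sh WORKER-2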
set -uo pipefail
cd "$(dirname "$0")/../.."
if [ "$#" -lt 1 ]; then
echo "usage: verify_biometric_erasure.sh <candidate_id> [--from ISO] [--to ISO]" >&2
exit 2
fi
CANDIDATE_ID="$1"
shift
FROM=""
TO=""
while [ "$#" -gt 0 ]; do
case "$1" in
--from) FROM="$2"; shift 2 ;;
--to) TO="$2"; shift 2 ;;
*) echo "unknown flag: $1" >&2; exit 2 ;;
esac
done
GATEWAY_URL="${GATEWAY_URL:-http://localhost:3100}"
LEGAL_TOKEN_FILE="${LEGAL_TOKEN_FILE:-/etc/lakehouse/legal_audit.token}"
UPLOADS_ROOT="${UPLOADS_ROOT:-data/biometric/uploads}"
OUT_DIR="${OUT_DIR:-reports/biometric}"
# Dependency gates — fail fast with clear errors rather than producing
# a misleading "evidence" file from missing tools.
for cmd in curl jq sha256sum; do
if ! command -v "$cmd" >/dev/null 2>&1; then
echo "[verify] FAIL: required tool '$cmd' not found in PATH" >&2
exit 2
fi
done
if [ ! -r "$LEGAL_TOKEN_FILE" ]; then
echo "[verify] FAIL: cannot read legal token at $LEGAL_TOKEN_FILE" >&2
echo "[verify] This script requires legal-tier auth to query /audit/subject/." >&2
exit 2
fi
LEGAL_TOKEN=$(tr -d '[:space:]' < "$LEGAL_TOKEN_FILE")
if [ -z "$LEGAL_TOKEN" ]; then
echo "[verify] FAIL: legal token file is empty" >&2
exit 2
fi
# safe_id matches catalogd::biometric_endpoint::sanitize_for_path:
# any non-[A-Za-z0-9_.-] char is replaced with underscore.
SAFE_ID=$(printf '%s' "$CANDIDATE_ID" | sed 's/[^A-Za-z0-9_.-]/_/g')
mkdir -p "$OUT_DIR"
DATE=$(date -u +%Y-%m-%dT%H-%M-%SZ)
OUT="$OUT_DIR/erasure_verify_${SAFE_ID}_${DATE}.md"
EVIDENCE=$(mktemp)
trap 'rm -f "$EVIDENCE"' EXIT
PASS=0
FAIL=0
note() { echo "$1" >> "$EVIDENCE"; }
mark_pass() { PASS=$((PASS+1)); note " - PASS: $1"; }
mark_fail() { FAIL=$((FAIL+1)); note " - FAIL: $1"; }
note "## Verification target"
note ""
note "- **candidate_id:** \`$CANDIDATE_ID\`"
note "- **safe_id (filesystem):** \`$SAFE_ID\`"
note "- **gateway:** \`$GATEWAY_URL\`"
note "- **uploads root:** \`$UPLOADS_ROOT\`"
note "- **window:** ${FROM:-unbounded}${TO:-unbounded}"
note ""
# ── Fetch the audit response ────────────────────────────────────────
QUERY=""
if [ -n "$FROM" ]; then QUERY="from=$FROM"; fi
if [ -n "$TO" ]; then
if [ -n "$QUERY" ]; then QUERY="${QUERY}&to=$TO"; else QUERY="to=$TO"; fi
fi
URL="$GATEWAY_URL/audit/subject/$CANDIDATE_ID"
if [ -n "$QUERY" ]; then URL="$URL?$QUERY"; fi
RESP_FILE=$(mktemp)
HTTP_CODE=$(curl -sS -o "$RESP_FILE" -w '%{http_code}' \
-H "X-Lakehouse-Legal-Token: $LEGAL_TOKEN" \
-H "Accept: application/json" \
"$URL" 2>&1) || HTTP_CODE="000"
if [ "$HTTP_CODE" != "200" ]; then
echo "[verify] FAIL: GET $URL returned HTTP $HTTP_CODE" >&2
echo "[verify] response head:" >&2
head -c 500 "$RESP_FILE" >&2
echo >&2
rm -f "$RESP_FILE"
exit 2
fi
# Schema sanity — refuse to evaluate against an unrecognized response shape.
SCHEMA=$(jq -r '.schema // ""' < "$RESP_FILE")
if [ "$SCHEMA" != "subject_audit_response.v1" ]; then
echo "[verify] FAIL: unexpected response schema '$SCHEMA' (want subject_audit_response.v1)" >&2
rm -f "$RESP_FILE"
exit 2
fi
# ── Check 1: manifest.biometric_collection is null ──────────────────
note "## Check 1 — Subject manifest biometric_collection is null"
note ""
BIO_COLL=$(jq -c '.manifest.biometric_collection // null' < "$RESP_FILE")
note "**manifest.biometric_collection:** \`$BIO_COLL\`"
note ""
if [ "$BIO_COLL" = "null" ]; then
mark_pass "biometric_collection field is null on the subject manifest"
else
mark_fail "biometric_collection is still populated — erasure incomplete"
fi
note ""
# ── Check 2: filesystem uploads dir is empty/absent ─────────────────
note "## Check 2 — Quarantined upload directory empty or absent"
note ""
UPLOAD_DIR="$UPLOADS_ROOT/$SAFE_ID"
note "**path:** \`$UPLOAD_DIR\`"
if [ ! -e "$UPLOAD_DIR" ]; then
note "**state:** absent (directory was removed during erasure or never existed)"
note ""
mark_pass "upload directory is absent"
elif [ ! -d "$UPLOAD_DIR" ]; then
note "**state:** path exists but is not a directory — investigate"
note ""
mark_fail "upload path exists and is not a directory: $UPLOAD_DIR"
else
REMAINING=$(find "$UPLOAD_DIR" -maxdepth 1 -mindepth 1 2>/dev/null | wc -l | tr -d '[:space:]')
: "${REMAINING:=0}"
note "**state:** directory exists with $REMAINING remaining entries"
note ""
if [ "$REMAINING" = "0" ]; then
mark_pass "upload directory is empty (no residual photo files)"
else
mark_fail "$REMAINING file(s) remain under $UPLOAD_DIR — must be unlinked"
note "### Residual files"
note ""
note '```'
find "$UPLOAD_DIR" -maxdepth 2 >> "$EVIDENCE"
note '```'
note ""
fi
fi
# ── Check 3: most recent audit row reflects erasure ─────────────────
note "## Check 3 — Audit log records the erasure event"
note ""
ROW_COUNT=$(jq '.audit_log.rows | length' < "$RESP_FILE")
note "**rows in window:** $ROW_COUNT"
if [ "$ROW_COUNT" = "0" ]; then
mark_fail "no audit rows in the requested window — erasure should have appended one"
note ""
else
LAST_KIND=$(jq -r '.audit_log.rows | last | .accessor.kind // ""' < "$RESP_FILE")
LAST_RESULT=$(jq -r '.audit_log.rows | last | .result // ""' < "$RESP_FILE")
LAST_TS=$(jq -r '.audit_log.rows | last | .ts // ""' < "$RESP_FILE")
note "**last row:** ts=\`$LAST_TS\` accessor.kind=\`$LAST_KIND\` result=\`$LAST_RESULT\`"
note ""
case "$LAST_KIND" in
biometric_erasure|full_erasure)
case "$LAST_RESULT" in
erased|success)
mark_pass "last audit row is an erasure event ($LAST_KIND/$LAST_RESULT)"
;;
*)
mark_fail "last row kind is $LAST_KIND but result is '$LAST_RESULT' (expected erased/success)"
;;
esac
;;
*)
mark_fail "last audit row accessor.kind is '$LAST_KIND' (expected biometric_erasure or full_erasure)"
;;
esac
fi
note ""
# ── Check 4: HMAC chain verifies end-to-end ─────────────────────────
note "## Check 4 — HMAC chain integrity"
note ""
CHAIN_VERIFIED=$(jq -r '.audit_log.chain_verified' < "$RESP_FILE")
CHAIN_ROOT=$(jq -r '.audit_log.chain_root // ""' < "$RESP_FILE")
CHAIN_ROWS=$(jq -r '.audit_log.chain_rows_total // 0' < "$RESP_FILE")
CHAIN_ERR=$(jq -r '.audit_log.chain_verification_error // ""' < "$RESP_FILE")
note "**chain_verified:** \`$CHAIN_VERIFIED\`"
note "**chain_rows_total:** $CHAIN_ROWS"
note "**chain_root:** \`$CHAIN_ROOT\`"
if [ -n "$CHAIN_ERR" ]; then
note "**chain_verification_error:** \`$CHAIN_ERR\`"
fi
note ""
if [ "$CHAIN_VERIFIED" = "true" ]; then
mark_pass "chain verifies end-to-end ($CHAIN_ROWS rows)"
else
mark_fail "chain integrity broken — destruction is NOT defensible until investigated"
fi
note ""
# ── Render report ───────────────────────────────────────────────────
TOTAL=$((PASS + FAIL))
note "## Summary"
note ""
note "**$PASS / $TOTAL** verification checks pass."
note ""
if [ "$FAIL" -gt 0 ]; then
note "**Status: ERASURE NOT VERIFIED.** Do NOT mark destruction complete. Escalate to engineering before responding to candidate / counsel."
note ""
fi
# Hash response body so the report has a tamper-evident anchor.
RESP_HASH=$(sha256sum "$RESP_FILE" | awk '{print $1}')
EVIDENCE_HASH=$(sha256sum "$EVIDENCE" | awk '{print $1}')
{
echo "# Biometric Erasure Verification — $CANDIDATE_ID"
echo
echo "**Date:** $DATE"
echo "**Spec:** docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md §3 step 3"
echo "**Generator:** scripts/staffing/verify_biometric_erasure.sh"
echo
cat "$EVIDENCE"
echo "---"
echo
echo "**Audit response SHA-256:** \`$RESP_HASH\`"
echo "**Evidence summary SHA-256:** \`$EVIDENCE_HASH\`"
echo
} > "$OUT"
rm -f "$RESP_FILE"
echo "[verify] $PASS / $TOTAL checks pass — report: $OUT"
echo "[verify] response hash: $RESP_HASH"
[ "$FAIL" -eq 0 ]