lakehouse/docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md
root b2c34b80b3 phase 1.6: lock Gate 3b = C, reconcile docs to shipped state, fix double-upload file leak
Four threads landing together — all driven by the audit J asked for before
production cutover.

(1) Gate 3b DECIDED: Option C (defer classifications). `BiometricCollection.classifications`
    stays `Option<JSON> = None` in v1. `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` status
    flipped from "draft / awaits product" to DECIDED. Consent template + retention
    schedule revised to remove all "automated facial-classification" / "deepface"
    language so disclosed scope matches implemented scope.

(2) Endpoint-path drift reconciled across 3 docs. `PHASE_1_6_BIPA_GATES.md`,
    `BIPA_DESTRUCTION_RUNBOOK.md`, and `biometric_retention_schedule_v1.md` had
    references to legacy `/v1/identity/subjects/*` paths (proposed under a separate
    identityd daemon, never shipped) — corrected to actual shipped routes
    `/biometric/subject/*` (catalogd-local). Schema block in PHASE_1_6_BIPA_GATES
    rewritten to reflect JSON `SubjectManifest.biometric_collection` substrate
    (not the proposed Postgres `subjects` table).

(3) New operational artifacts:
    - `scripts/staffing/verify_biometric_erasure.sh` — checks 4 things post-erasure
      (manifest cleared, uploads dir empty, audit row matches, chain verified).
      Smoke-tested live against WORKER-2.
    - `scripts/staffing/biometric_destruction_report.sh` — monthly anonymized
      destruction-event aggregation. Smoke-tested clean.
    - `scripts/staffing/bundle_counsel_packet.sh` — tarballs the counsel-review
      packet with per-file SHA-256 manifest.
    - `docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md` — formal rotation procedure
      operationalized after the 2026-05-05 /tmp wipe incident.
    - `docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md` — cover note bundling
      all eng-staged BIPA docs for counsel review with per-doc questions, sign-off
      checklist, recommended review sequence.

(4) Double-upload file leak fixed in `crates/catalogd/src/biometric_endpoint.rs`.
    `verify_biometric_erasure.sh` smoked WORKER-2 and surfaced a stranded photo
    file. Investigation showed the file was 13-byte test-fixture bytes (zero PII,
    no biometric content); audit timeline showed two consecutive uploads followed
    by one erasure — the second upload had silently overwritten manifest.data_path,
    orphaning the first file. Patched `process_upload` to refuse a second upload
    with HTTP 409 + `error: "biometric_already_collected"` when
    `biometric_collection.is_some()` on the manifest. Operator must explicitly
    POST `/biometric/subject/{id}/erase` first.

    Tests: new `second_upload_without_erase_returns_409` (asserts 409 + manifest
    pointer unchanged + first file untouched on disk). Replaced
    `repeated_uploads_grow_the_chain` with `upload_erase_upload_grows_the_chain_cleanly`
    (covers the legitimate re-collection cycle: chain grows to 3 rows). Updated
    `content_type_with_parameters_accepted` to use 2 distinct subjects (was
    using 1 subject with 2 uploads to test ct parsing — would now 409).

    22/22 biometric_endpoint tests + 59/59 catalogd lib tests green post-patch.

Production posture: gateway needs `cargo build --release -p gateway` +
`systemctl restart lakehouse.service` to pick up the new 409 in live traffic.

Counsel calendar is now the only remaining blocker for first real-photo intake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 06:19:40 -05:00


Counsel Review Packet — Phase 1.6 BIPA Pre-Launch

Date assembled: 2026-05-05
For: outside counsel
From: J, operator of record
Scope: documents that engineering has staged for legal sufficiency review before the staffing platform begins collecting any real candidate biometric data (BIPA §15(a)(b)).

What this packet is. The Phase 1.6 BIPA gates outline what engineering must ship before real-photo intake. As of 2026-05-05, all engineering substrate is shipped and verified live (see §1 below for the inventory). What remains is binding-text authoring and counsel sign-off on five documents, plus any operational notification obligations counsel may want to layer on top.

What this packet is NOT. Not a request for counsel to write binding text from scratch. The documents are eng-staged in reasonable plain language; the request is for counsel to render them into legally-sufficient text and attest where signatures are required.


1. Engineering substrate — shipped + verified

For factual context on what counsel is reviewing AGAINST. None of this requires sign-off here; it's the system the documents bind to.

| Component | Where it lives | Verification |
|---|---|---|
| Subject manifest registry | crates/catalogd/src/registry.rs, data/_catalog/subjects/<id>.json | 17 unit tests + 100 backfilled WORKER manifests in production |
| Per-subject HMAC audit chain (SHA-256) | crates/catalogd/src/subject_audit.rs, data/_catalog/subjects/<id>.audit.jsonl | Tamper-detection + concurrent-append race tests pass |
| Photo upload (consent-gated) | POST /biometric/subject/{id}/photo | 11 unit tests + live roundtrip 200 |
| Erasure (two-scope) | POST /biometric/subject/{id}/erase (biometric_only / full) | 21 unit tests; transactional rollback on audit failure |
| Legal-tier audit read | GET /audit/subject/{id} (X-Lakehouse-Legal-Token header) | Constant-time auth, chain re-verification per request |
| Retention sweep (BIPA-aware clock) | crates/catalogd/src/bin/retention_sweep | 8 unit tests; live verified against 100 backfilled subjects |
| Cross-runtime parity (Rust ↔ Go) | scripts/cutover/parity/subject_audit_parity.sh | 6/6 byte-identical assertions pass |

Key insight for counsel: the audit chain is the BIPA-defensible substrate. Every state-changing event (consent given, photo uploaded, photo erased, legal-tier read) appends to a per-subject HMAC-chained JSONL log. The chain verifies end-to-end on every legal-tier read. A tampered chain is detectable; a forged chain requires the HMAC signing key, which is held under root-only mode 0400 at /etc/lakehouse/subject_audit.key and rotated per the key rotation runbook (Document E in §2 below).
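
To make the tamper-evidence property concrete, here is a minimal Python sketch of an HMAC-SHA-256 chained JSONL log. It is illustrative only: the shipped implementation is Rust, and the record fields (`event`, `prev`, `mac`), the all-zero genesis value, and the function names are assumptions, not the format used by crates/catalogd/src/subject_audit.rs.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder meaning "no previous row"


def _mac(key: bytes, prev: str, event: dict) -> str:
    # The MAC covers the previous row's MAC plus a canonical encoding of the
    # event, so each row is bound to everything that came before it.
    payload = prev.encode() + json.dumps(event, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_event(key: bytes, lines: list[str], event: dict) -> None:
    """Append one JSONL row whose MAC chains to the last existing row."""
    prev = json.loads(lines[-1])["mac"] if lines else GENESIS
    row = {"event": event, "prev": prev, "mac": _mac(key, prev, event)}
    lines.append(json.dumps(row, sort_keys=True))


def verify_chain(key: bytes, lines: list[str]) -> bool:
    """Recompute every MAC in order; an altered or reordered row breaks the chain."""
    prev = GENESIS
    for line in lines:
        row = json.loads(line)
        expected = _mac(key, prev, row["event"])
        # compare_digest is a constant-time comparison.
        if row["prev"] != prev or not hmac.compare_digest(row["mac"], expected):
            return False
        prev = row["mac"]
    return True
```

Without the key, an editor of the log cannot produce a MAC that `verify_chain` accepts, which is the property the paragraph above relies on. This simplified sketch does not detect removal of the final rows; whether and how the shipped chain handles that is not covered here.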

Gate 3b (deepface classification) — decided 2026-05-05: Option C (defer). The system collects only the photograph, not derived demographic information. The consent template + retention schedule in this packet were revised the same day to match.


2. Documents requiring counsel review + sign-off

In recommended review order:

| # | Document | Path | Counsel ask | Sign-off |
|---|---|---|---|---|
| A | Biometric Retention Schedule v1 | docs/policies/consent/biometric_retention_schedule_v1.md | Render into binding language; confirm 18-month operational ceiling vs. BIPA 3-year statutory cap | Counsel + J |
| B | Biometric Consent Template v1 | docs/policies/consent/biometric_consent_template_v1.md | Render Disclosures 1-3 into binding consent language; specify electronic vs. paper signature mechanism | Counsel + J |
| C | BIPA Destruction Runbook | docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md | Confirm 30-day SLA from trigger; confirm two-operator (operator + witness) requirement; confirm legal-hold check procedure | Counsel attestation |
| D | BIPA Pre-IdentityD Attestation | docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md | Sign as countersigning party; J signs as operator-of-record | Counsel + J |
| E | Legal-Tier Audit Key Rotation | docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md | Confirm rotation cadence; opine on candidate-notification obligation when rotation is compromise-driven | Counsel notes |
| F | Gate 3b Deepface Design (FYI) | docs/specs/GATE_3B_DEEPFACE_DESIGN.md | Decision-of-record showing classifications were deliberately deferred, not omitted by oversight. No sign-off needed; provided for audit-trail completeness. | None |

The five documents requiring sign-off are A, B, C, D, E. Document F is included so the audit trail shows the Gate 3b decision was deliberate.


3. Specific questions for counsel — by document

Document A — Retention Schedule

  1. The schedule sets an 18-month operational ceiling against the BIPA 3-year statutory cap. Is the safety margin appropriate, or should we move to a tighter window (12 months) given the plaintiff-friendly Illinois posture?
  2. The schedule references the catalogd-local storage substrate rather than a separate identityd Postgres table. Does the public-facing language need to mention the storage architecture at all, or is "we keep the photo and a SHA-256 hash" sufficient?
  3. Public publication URL — counsel to specify (placeholder marked in §7 of the schedule).
  4. Confirm whether existing consent under v1 carries forward when a future v2 is published, or whether re-consent is required.

Document B — Consent Template

  1. Disclosure 1 says "we do NOT run automated facial-classification in v1." Does that disclosure need to mention the possibility of future classification, or is silence-with-supersession-clause adequate?
  2. Plain-language summary in §1 — counsel to confirm it's appropriate to include alongside the binding disclosure, or recommend an alternative comprehension aid.
  3. Withdrawal SLA is set to 30 days in §2. Counsel to confirm against jurisdiction (Illinois primary; secondary deployments would inherit).
  4. Contact for withdrawal — counsel to specify the channel (placeholder in §3).
  5. Sign-off mechanism: electronic signature service, in-app click-acceptance with timestamp, paper form? Each has different evidentiary weight.

Document C — Destruction Runbook

  1. Confirm 30-day SLA from each of four triggers (retention expiry, consent withdrawal, RTBF, court order). Some interpretations prefer 7 or 14 days for withdrawal/RTBF.
  2. Two-operator requirement (operator-of-record + witness): is the witness role acceptable for counsel's defensibility view, or should we elevate to dual-control with cryptographic split-key?
  3. Legal-hold check procedure (§2 step 3) — counsel to specify the actual procedure for confirming no hold is in force before erasing.
  4. Backup-window disclosure (§4) — confirm 30-day backup retention is acceptable.
  5. Candidate notification template (§3 step 4) — counsel to supply.

Document D — Pre-IdentityD Attestation

  1. Both signature lines blank — J signs as operator-of-record; counsel signs as the countersigning legal party.
  2. The attestation hash anchors the evidence; once signed, the hash itself becomes a tamper-evident witness. Counsel to confirm storage location for the signed copy (firm files?).

Document E — Key Rotation Runbook

  1. Recommended rotation cadence — 90 days suggested in §1. Counsel to confirm or override.
  2. Custody schedule for /etc/lakehouse/_archived/ raw key files — §7.2 question; suggested 1-year retention but counsel-driven.
  3. Candidate-notification obligation when rotation is compromise-driven (§7.3) — counsel call.

4. Engineering changes counsel should know about (recent)

These changes reconciled doc/code drift left over from a rapid wave of engineering work on 2026-05-03:

  • Endpoint paths: the original v1 spec proposed /v1/identity/subjects/* under a separate identityd daemon. That daemon was collapsed into catalogd; endpoints actually shipped at /biometric/subject/* (catalogd-local). Documents in this packet reference the catalogd-local routes; legacy references in IDENTITY_SERVICE_DESIGN.md are flagged "do NOT implement as-written" in that doc's deprecation header.
  • No identityd Postgres database: the original spec proposed encrypted-at-rest Postgres + HashiCorp Vault + S3 Object Lock for PII storage. The shipped substrate is local JSON manifests + per-subject HMAC-chained JSONL, sized for J's local-only deployment per PRD.md line 70 ("Everything runs locally — no cloud APIs").
  • Gate 3b deferral (Option C, 2026-05-05): classifications (gender / race / age inference) were deliberately deferred. The consent template and retention schedule in this packet do NOT disclose collection of derived demographic data, because we are not collecting it. If a future product requirement reverses this, we will publish a v2 consent + v2 retention with re-consent.
  • Key rotation 2026-05-05: the prior LH_SUBJECT_AUDIT_KEY was lost in a /tmp wipe on reboot, which disabled the audit and biometric endpoints. The new key is at /etc/lakehouse/subject_audit.key (mode 0400). Audit chains written before the rotation fail verification (report as tampered) under the new key; this is correct, expected behavior, not a bug.
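
The behavior in the key-rotation bullet follows directly from how HMAC works: a MAC written under one key cannot be reproduced under a different key, so an untouched pre-rotation row looks the same to the verifier as a tampered one. A minimal Python sketch, using made-up key values and a made-up row:

```python
import hashlib
import hmac

# Hypothetical values for illustration only.
old_key = b"old-key-lost-in-tmp-wipe"
new_key = b"new-key-after-rotation"
row = b'{"kind":"photo_uploaded"}'

# MAC stored in the log before the rotation.
mac_written = hmac.new(old_key, row, hashlib.sha256).hexdigest()
# MAC the verifier recomputes after the rotation, holding only the new key.
mac_recomputed = hmac.new(new_key, row, hashlib.sha256).hexdigest()

# The row itself is unchanged, yet verification fails because the keys differ.
rotation_mismatch = not hmac.compare_digest(mac_written, mac_recomputed)
```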

5. Open eng items NOT awaiting counsel

For transparency. These are engineering work items, not legal items:

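  1. Residual photo unlink on erasure. During verification of the one historical erasure event (WORKER-2), the verify script surfaced a stranded photo file that remained on disk after BiometricCollection was cleared from the manifest. Engineering traced it to two consecutive uploads followed by one erasure: the second upload overwrote the manifest's data path and orphaned the first file. The fix, in crates/catalogd/src/biometric_endpoint.rs, makes the upload handler refuse a second upload with HTTP 409 until the subject is erased; it reaches live traffic once the gateway is rebuilt and restarted. This does NOT affect the current packet: no real candidate photos have been collected yet (per §1 attestation), and the stranded file was 13 bytes of synthetic test-fixture data.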
  2. Phase 1.6 §3 employee training. Currently deferred per acknowledgement coverage in §7 of the destruction runbook (single-operator population). Re-promotes to blocking if the operator population grows; counsel may want to opine on the threshold.
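
On item 1, the commit note accompanying this packet describes a guard in `process_upload`: a second upload is refused with HTTP 409 and `error: "biometric_already_collected"` while `biometric_collection` is set on the manifest. The rule can be sketched in a few lines of Python. The real handler is Rust; the dict-based manifest, the `erase` helper, and the return shape below are assumptions made for illustration.

```python
def process_upload(manifest: dict, data_path: str) -> tuple[int, dict]:
    """Record a photo upload unless one is already on file for this subject."""
    if manifest.get("biometric_collection") is not None:
        # Refuse rather than overwrite: replacing the pointer would orphan
        # the first file on disk.
        return 409, {"error": "biometric_already_collected"}
    manifest["biometric_collection"] = {"data_path": data_path}
    return 200, {"data_path": data_path}


def erase(manifest: dict) -> None:
    """Clear the collection so a later re-collection is allowed (file unlink omitted)."""
    manifest["biometric_collection"] = None
```

Under this rule the only way to replace a photo is upload, erase, upload, so every file that was ever written has a matching erasure event.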

6. Sign-off sequence

Recommended order so a hold-up on one doc doesn't block others:

  1. First wave (parallel): A (retention schedule) + B (consent template). These two have the tightest interdependence (consent v1 references retention v1 by hash); review them together.
  2. Second wave: C (destruction runbook). Depends on A's retention period being fixed.
  3. Third wave: D (pre-identityd attestation). Sign once A + B + C are settled; the attestation snapshot is the boundary between pre-Phase-1.6 and post-Phase-1.6 system state.
  4. Fourth wave: E (key rotation). Independent of A-D; can be reviewed in parallel any time.

7. After sign-off — engineering steps

Once each document is signed:

| Document | Engineering action | Trigger |
|---|---|---|
| A retention schedule | Hash + commit; reference in consent_versions table | Counsel signature |
| B consent template | Hash + commit; reference in candidate-facing intake UI | Counsel signature |
| C destruction runbook | Adopt; operator acknowledgment recorded in §7 | Counsel attestation |
| D pre-identityd attestation | Anchor hash to filesystem + git; counsel keeps original signature | Both signatures |
| E key rotation | Adopt; rotation event log seeded with counsel-approved cadence | Counsel notes |

The HARD blocker for first real-candidate photo collection is A + B + D signed. C and E are operationally important but do not block the first photo (they govern destruction + key handling which apply to any state, not the boundary state).


8. Cover-note hash

This packet is itself a snapshot. Future-Claude / future-J will refer back to this packet to know what counsel saw on 2026-05-05.

Packet attached files (referenced by path):

  • docs/policies/consent/biometric_retention_schedule_v1.md
  • docs/policies/consent/biometric_consent_template_v1.md
  • docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md
  • docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md
  • docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md
  • docs/specs/GATE_3B_DEEPFACE_DESIGN.md
  • docs/PHASE_1_6_BIPA_GATES.md (the spec they all reference)

Per-file SHA-256 hashes are produced by the bundler script (next section); the bundler also creates a tarball ready for transmission.


9. Generating the bundle for transmission

./scripts/staffing/bundle_counsel_packet.sh

Produces reports/counsel/counsel_packet_<DATE>.tar.gz with all referenced documents + a manifest listing per-file SHA-256 hashes. Counsel can verify file integrity on receipt by re-running sha256sum against each file in the tarball.
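
For a recipient who prefers a script to running sha256sum by hand, the check could look like the Python sketch below. It assumes the tarball has already been extracted and that the manifest uses the `sha256sum` output format (one `<hex digest>  <relative path>` entry per line); this packet does not confirm the manifest's file name or format, so treat both as placeholders.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large attachments need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest: Path, root: Path) -> list[str]:
    """Return relative paths that are missing or whose hash differs from the manifest."""
    mismatches = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, rel = line.split(maxsplit=1)
        rel = rel.lstrip("*")  # sha256sum prefixes binary-mode entries with '*'
        target = root / rel
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel)
    return mismatches
```

An empty list means every listed file matches what was hashed at bundling time; any returned path should be re-requested before review proceeds.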