Per docs/PHASE_1_6_BIPA_GATES.md. Status table now reflects:
DONE (engineering-only, no counsel dependency):
- Gate 4: name→ethnicity inference removed from mcp-server.
Removal note in search.html:3372 + new Bun absence test
(mcp-server/phase_1_6_gate_4.test.ts) with 3 assertions:
walker actually scans files, regex catches synthetic positives,
no offending DEFINITION patterns in any .html/.ts/.js source.
3/3 pass.
ENG-DONE, signature pending:
- §2 attestation: scripts/staffing/attest_pre_identityd_biometric_state.sh
runs three checks against the live state:
1. workers_500k.parquet schema has no biometric/photo/face/image col
2. data/_kb/*.jsonl + pathway state contain no base64 image magic
bytes (JPEG /9j/, PNG iVBOR), no data:image/* MIME prefixes,
no field-name patterns ("photo", "biometric", "deepface_*")
3. data/headshots/manifest.jsonl is entirely synthetic-tagged
3/3 evidence checks pass on the live data dir. Generates a
signed-by-operator+counsel attestation document committed at
docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md
with SHA-256 of the evidence summary so post-signature tampering
is detectable.
ENG-STAGED, awaiting counsel review:
- Gate 1 retention schedule scaffold at
docs/policies/consent/biometric_retention_schedule_v1.md (BIPA
§15(a)). Engineering facts (categories, 18-month operational
ceiling vs 3-year statutory cap, destruction procedure pointer
to Gate 5 runbook) plus ⚖ COUNSEL markers for the binding text.
- Gate 2 consent template scaffold at
docs/policies/consent/biometric_consent_template_v1.md (BIPA
§15(b)(1)-(3)). Required disclosures + plain-language summary +
withdrawal procedure + the structured fields the consent UI must
post to identityd.
- Gate 5 destruction runbook at docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md.
Triggers, pre-destruction checks (incl. chain-verified gate via
/audit/subject/{id}), procedure (legal-tier endpoint), automatic
audit row append (subject_audit.v1 with kind=biometric_erasure),
backup-window disclosure, monthly reporting cadence, audit-trail
attestation procedure cross-referencing the cross-runtime parity
probe.
BLOCKED on engineering design:
- Gate 3 photo-upload endpoint. Requires identityd photo intake
design + deepface integration scope. Deferred to its own session.
DEFERRED:
- §3 employee training material. Gate 5 runbook §7 may serve as
substrate; counsel decides whether a separate program is needed.
Calendar bottleneck is now counsel review. Engineering can stage no
further deliverables until either (a) Gate 3's design conversation
happens or (b) counsel completes review of items 1/2/5/6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1.6 — BIPA Pre-Launch Gates
Status: Draft — 2026-05-03 · Owner: J + outside counsel · Companion to: AUDIT_TRAIL_PRD.md, AUDIT_PHASE_1_5_BIPA_AND_OUTCOMES.md, IDENTITY_SERVICE_DESIGN.md
Why this exists. IDENTITY_SERVICE_DESIGN.md v3 §5 Step 0 names Phase 1.6 as a HARD PREREQUISITE: identityd backfill cannot start until Phase 1.6 ships. This doc specifies what Phase 1.6 contains.
Scope. BIPA (740 ILCS 14) compliance gates that must be in place BEFORE the system accepts a single real candidate photo. The synthetic-data face pool can keep operating; real-photo intake CANNOT begin without these gates.
Authority. This is an engineering scaffold. Sections marked ⚖ COUNSEL need outside counsel to author the actual legally binding text. Engineering ships the procedural gates; counsel writes the words.
1. The five BIPA pre-launch gates
Each gate is a deliverable that must ship before real-photo intake. None is optional. Order shown is the recommended ship sequence.
Gate 1 — Public retention schedule (BIPA §15(a))
Required: A publicly-available, written retention schedule for biometric identifiers and information.
What ships:
- docs/policies/consent/biometric_retention_schedule_v1.md: public file
- Linked from public privacy policy at the deployment URL
- Specifies:
- Categories of biometric data collected (facial geometry derived from candidate photos, age estimate, gender classification, race classification — per Phase 1.5 deepface walk)
- Purpose of collection (identity matching for staffing operations)
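- Maximum retention: BIPA §15(a) requires destruction "when the initial purpose for collecting or obtaining such identifiers or information has been satisfied or within 3 years of the individual's last interaction with the private entity, whichever occurs first"; recommend 18-24 months as the operational ceiling (provides a safety margin under the 3-year cap)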
- Destruction procedure: per Gate 5 below
- Versioned (this is v1; future updates supersede with a new version)
⚖ COUNSEL — write the actual schedule. Engineering provides the operational facts; counsel writes the binding language.
Engineering acceptance: the file is committed, the public URL renders it, and identityd's consent_versions table references it by hash.
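The retention ceiling described above can be sketched as a date calculation. This is a minimal illustration, not identityd's implementation: the function names are hypothetical, the 18-month figure is one point in the recommended 18-24 month range, and the "initial purpose satisfied" trigger from §15(a) is not modeled because it is not a pure date rule.

```python
import calendar
from datetime import date

# Assumption: 18 months is picked from the 18-24 month range recommended above.
OPERATIONAL_CEILING_MONTHS = 18
# BIPA §15(a) outer bound: 3 years from the individual's last interaction.
STATUTORY_CAP_MONTHS = 36


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def retention_until(last_interaction: date) -> date:
    """Return the earlier of the operational ceiling and the statutory cap."""
    operational = add_months(last_interaction, OPERATIONAL_CEILING_MONTHS)
    statutory = add_months(last_interaction, STATUTORY_CAP_MONTHS)
    return min(operational, statutory)
```

Taking the minimum of both dates keeps the statutory cap binding even if the operational ceiling is later reconfigured to a larger value.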
Gate 2 — Informed written consent (BIPA §15(b))
Required: Informed, written consent BEFORE any biometric collection occurs.
What ships:
- docs/policies/consent/biometric_consent_template_v1.md: public consent template
- Versioned, hashed, referenced from identityd's consent_versions table
- Must disclose, per BIPA §15(b)(1)-(3):
- That biometric identifiers/information will be collected
- The specific purpose for collection (and the length of term — references Gate 1)
- Receipt of a written release authorizing collection
- Consent flow at intake:
- Candidate sees the disclosure on a UI surface (web form / paper / digital signature)
- Candidate provides explicit affirmative action (signature, click-acceptance with timestamp, etc.)
- Identityd records biometric_consent_status='given' with consent_version reference + consent_given_at timestamp
- Without identityd recording 'given', no biometric data flows through deepface.
⚖ COUNSEL — write the consent template. Recommended content (engineering view):
- Clear language (not just legal boilerplate)
- Specific to facial-classification (not generic biometrics)
- Includes withdrawal procedure
- Includes data-subject rights enumeration
Engineering acceptance: consent gate is enforced in code at the photo-upload endpoint; identityd refuses biometric writes when biometric_consent_status != 'given'; pre-existing synthetic-face pool is exempt (no consent needed because no real subject).
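A minimal sketch of the consent gate, assuming an in-memory record rather than identityd's real storage. The field names biometric_consent_status, consent_version and consent_given_at come from this section; the class and function names are hypothetical, and using a SHA-256 of the template text as the version reference is an assumption about how "hashed" is implemented.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    biometric_consent_status: str                 # only 'given' unlocks processing
    consent_version: Optional[str] = None         # hash of the consent template shown
    consent_given_at: Optional[datetime] = None


def template_hash(template_text: str) -> str:
    """Hash the exact template text so the record pins what the candidate saw."""
    return hashlib.sha256(template_text.encode("utf-8")).hexdigest()


def record_consent(template_text: str, now: datetime) -> ConsentRecord:
    """Build the record written after an explicit affirmative action."""
    return ConsentRecord("given", template_hash(template_text), now)


def may_process_biometrics(record: Optional[ConsentRecord]) -> bool:
    """Refuse unless consent is 'given' and tied to a known template version."""
    return (
        record is not None
        and record.biometric_consent_status == "given"
        and record.consent_version is not None
    )
```

A missing record is treated the same as a refused one, so the default path is denial.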
Gate 3 — Photo-upload endpoint with consent enforcement
Required: Code-level enforcement that real-photo intake checks consent before processing.
What ships:
A new endpoint (proposed: POST /v1/identity/subjects/{candidate_id}/photo) with the following behavior:
- Caller authenticates with service-tier token
- Endpoint queries identityd for subjects.biometric_consent_status
- If status ≠ 'given' → HTTP 403 with reason "BIPA consent required before biometric processing"
- If status = 'given':
  a. Photo bytes accepted, stored to a quarantined path under data/biometric/uploads/{candidate_id}/{ts}.{ext} (NOT data/headshots/)
  b. deepface tagging runs against the photo
  c. Classifications (gender, race, age) stored to subjects table fields (NEW columns; see schema additions below)
  d. Original photo bytes encrypted under DEK + retained per Gate 1 schedule
  e. pii_access_log row written with purpose_token='biometric_collection'
- Response: {candidate_id, retention_until, consent_version}
Schema additions to identityd subjects:
ALTER TABLE subjects ADD COLUMN biometric_classifications JSONB; -- {gender, race, age} from deepface
ALTER TABLE subjects ADD COLUMN biometric_data_path TEXT; -- quarantined path
ALTER TABLE subjects ADD COLUMN biometric_collected_at TIMESTAMPTZ;
ALTER TABLE subjects ADD COLUMN biometric_template_hash TEXT; -- hash of the photo bytes (for integrity, NOT for re-derivation)
Engineering acceptance:
- Endpoint refuses uploads when consent missing (verified by integration test)
- deepface output never lands in the synthetic-face manifest (data/headshots/manifest.jsonl)
- Real-photo classifications are isolated to identityd's subjects table; they never flow to JSONL sinks
- The /headshots/:key route in mcp-server REMAINS synthetic-only: it does NOT serve real candidate photos to LLMs without an explicit allowance (proposed: real photos served only to authenticated staffer UI, never to model context)
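The consent branch of the proposed endpoint can be sketched as a pure decision function. This is an illustration under stated assumptions: the function and class names are hypothetical, the 201 success status is an assumption (the proposal only fixes the 403 case and the response fields), and the deepface, encryption and pii_access_log steps are left as comments.

```python
from dataclasses import dataclass
from typing import Optional

CONSENT_REQUIRED = "BIPA consent required before biometric processing"


@dataclass
class UploadResult:
    status_code: int
    body: dict
    quarantine_path: Optional[str] = None   # internal only, not part of the response


def handle_photo_upload(candidate_id, consent_status, consent_version,
                        retention_until, ts, ext) -> UploadResult:
    """Decide the outcome of a photo upload from the consent status alone."""
    if consent_status != "given":
        # No bytes are stored and deepface never runs on this path.
        return UploadResult(403, {"reason": CONSENT_REQUIRED})
    # Quarantined location, deliberately outside data/headshots/.
    path = f"data/biometric/uploads/{candidate_id}/{ts}.{ext}"
    # Steps b-e (deepface tagging, storing classifications, DEK encryption,
    # pii_access_log row) would follow here in a real handler.
    body = {
        "candidate_id": candidate_id,
        "retention_until": retention_until,
        "consent_version": consent_version,
    }
    return UploadResult(201, body, path)   # 201 is an assumption
```

Checking consent before touching the payload keeps the refusal path free of any biometric side effects, which is what the integration test in the acceptance list would assert.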
Gate 4 — Deprecate name → ethnicity inference
Required: The hard-coded NAMES_HISPANIC / SURNAMES_* lookup tables in mcp-server/search.html:3375-3432 (per Phase 1.5 §1B walk) get removed.
What ships:
- A code commit that removes:
- FEMALE_NAMES, MALE_NAMES constants
- NAMES_HISPANIC, NAMES_BLACK, NAMES_SOUTH_ASIAN, NAMES_EAST_ASIAN, NAMES_MIDDLE_EASTERN constants
- SURNAMES_HISPANIC, SURNAMES_SOUTH_ASIAN, SURNAMES_EAST_ASIAN, SURNAMES_MIDDLE_EASTERN, SURNAMES_BLACK constants
- The genderFor() and guessEthnicityFromFirstName() functions
- All call sites that consumed these (face-pool bucket selection)
- Replacement strategy:
- For SYNTHETIC face pool routing: deterministic hash of candidate_id selects a face bucket, no demographic inference
- For REAL candidate photos: the candidate's actual photo IS the representation; no inference needed
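Deterministic bucket selection for the synthetic pool might look like the sketch below. The function name and the choice of SHA-256 are assumptions; the only property taken from the replacement strategy is that the bucket depends on candidate_id alone.

```python
import hashlib


def face_bucket(candidate_id: str, bucket_count: int) -> int:
    """Map a candidate_id to a stable bucket index with no demographic input."""
    if bucket_count <= 0:
        raise ValueError("bucket_count must be positive")
    digest = hashlib.sha256(candidate_id.encode("utf-8")).digest()
    # The first 8 bytes are plenty for an evenly spread index.
    return int.from_bytes(digest[:8], "big") % bucket_count
```

Because the input is only the identifier, the same candidate always lands in the same bucket, and neither the name nor any inferred attribute can influence the choice.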
Why this is BIPA + Title VII risk separately: name-based ethnicity classification is BOTH a discriminatory feature engineering practice (Title VII) AND, when combined with photo-based attribute extraction, a "biometric information derived from a biometric identifier" pattern (BIPA broad reading). Removing the lookup tables forecloses both arguments.
Engineering acceptance:
- Lookup tables removed from search.html
- Unit test asserts no protected-attribute inference functions exist in search.html or any mcp-server module
- Face-pool routing for synthetic faces uses candidate_id hash exclusively
- Phase 1.5 §1B finding closed
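The absence check in the acceptance list can be approximated as a source walker plus a definition-only pattern. The shipped test is a Bun test in mcp-server/phase_1_6_gate_4.test.ts; the Python below is a separate sketch, and its regex and function name are assumptions rather than a copy of that test.

```python
import re
from pathlib import Path

# Matches DEFINITIONS only, so removal notes that mention a name in a comment pass.
DEFINITION_RE = re.compile(
    r"\b(?:const|let|var|function)\s+"
    r"(?:NAMES_[A-Z_]+|SURNAMES_[A-Z_]+|FEMALE_NAMES|MALE_NAMES"
    r"|genderFor|guessEthnicityFromFirstName)\b"
)
SCANNED_SUFFIXES = {".html", ".ts", ".js"}


def find_offending_definitions(root: Path) -> list:
    """Return (path, line_number) pairs for every offending definition under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SCANNED_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            if DEFINITION_RE.search(line):
                hits.append((str(path), lineno))
    return hits
```

A test built on this would also assert that the walker visited at least one file and that the regex catches a synthetic positive; otherwise an empty result could mean the scan never ran.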
Gate 5 — Documented destruction procedure
Required: A written procedure for biometric data destruction at retention expiry OR consent withdrawal OR right-to-be-forgotten request.
What ships:
- docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md: operator-facing
- Specifies:
- Triggers: retention expiry (per Gate 1), withdrawal, RTBF request, candidate request
- Procedure: identityd POST /v1/identity/subjects/{id}/erase (legal-tier auth)
- Erasure scope: subjects.biometric_* columns ciphertext-deleted, biometric_data_path files securely overwritten + unlinked, deepface classifications nulled
- Backup window: per IDENTITY_SERVICE_DESIGN v3-B12, residual exists in DB backups for 30 days max; subject is informed
- Witnessed: every erasure event written to pii_access_log with purpose_token='biometric_erasure' and the legal-tier JWT signature (proves authorized destruction)
- Reporting: monthly internal report of erasures + retention-expiry sweeps; available to counsel on request
⚖ COUNSEL — review the runbook for legal sufficiency. Engineering writes the procedure; counsel attests that the procedure satisfies BIPA §15(a) destruction requirements.
Engineering acceptance:
- Runbook committed
- POST /v1/identity/subjects/{id}/erase endpoint includes biometric-specific erasure path
- Daily sweep job destroys biometric data past biometric_retention_until (separate from general retention sweep; biometric has a stricter clock)
- Erasure events are logged with cryptographic attestation
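The selection step of the daily sweep can be sketched as a filter over subject rows. The dataclass and function name are hypothetical; only biometric_retention_until and biometric_data_path are named in this document, and the actual erasure call (the legal-tier erase endpoint) is out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Subject:
    subject_id: str
    biometric_retention_until: Optional[datetime]
    biometric_data_path: Optional[str]


def due_for_biometric_erasure(subjects, now: datetime) -> list:
    """Return ids of subjects whose biometric retention deadline has passed."""
    return [
        s.subject_id
        for s in subjects
        if s.biometric_data_path is not None
        and s.biometric_retention_until is not None
        and s.biometric_retention_until <= now
    ]
```

Passing the clock in as an argument keeps the sweep testable and makes each run's cut-off time explicit in whatever report is generated from it.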
2. Cryptographic attestation: no biometric data exists pre-identityd
Per IDENTITY_SERVICE_DESIGN v3-B11. Plaintiffs may argue that the EXISTENCE of biometric schema fields constitutes constructive notice of intent to collect biometric data — therefore consent should have preceded the schema. The defense: prove that no biometric data was actually collected from real candidates before identityd + the consent gate.
What ships:
- A one-shot script scripts/staffing/attest_pre_identityd_biometric_state.sh that:
  - Queries data/datasets/workers_500k.parquet schema and confirms NO column named photo, biometric_*, face_*, image_* exists
  - Greps data/_kb/*.jsonl and data/_pathway_memory/state.json for any base64-encoded image bytes (deepface output, photo blobs)
  - Verifies data/headshots/manifest.jsonl rows ≤ synthetic face pool size
  - Hashes the schema + summary; commits the hash to S3 Object Lock (per identity service v3 anchor pattern)
- Attestation document docs/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-XX.md signed by J + outside counsel
This is a one-time defense artifact. It establishes the baseline: "as of this date, no biometric data was collected from real candidates."
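The grep and hashing steps can be illustrated in a few lines. The real attestation is a shell script; this Python sketch is separate from it, and the exact regexes, the 16-character minimum after a magic prefix, and the function names are assumptions. The markers themselves (JPEG /9j/, PNG iVBOR, data:image/ prefixes, photo/biometric/deepface_ field names) are the ones listed in the commit message.

```python
import hashlib
import json
import re

# Base64 magic prefixes for JPEG and PNG, plus inline data URIs.
IMAGE_MARKERS = re.compile(
    r"/9j/[A-Za-z0-9+/]{16,}|iVBOR[A-Za-z0-9+/]{16,}|data:image/"
)
# JSON keys that start with a biometric-looking name.
FIELD_NAMES = re.compile(r'"(?:photo|biometric|deepface_)[^"]*"\s*:', re.IGNORECASE)


def scan_lines(lines) -> dict:
    """Count lines that carry image payloads or suspect field names."""
    findings = {"image_payloads": 0, "suspect_fields": 0}
    for line in lines:
        if IMAGE_MARKERS.search(line):
            findings["image_payloads"] += 1
        if FIELD_NAMES.search(line):
            findings["suspect_fields"] += 1
    return findings


def summary_digest(summary: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order cannot change the hash."""
    canonical = json.dumps(summary, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing a canonical serialization is what makes post-signature tampering detectable: any change to the evidence summary yields a different digest from the one recorded in the attestation.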
3. Employee training acknowledgment (general BIPA hygiene)
Required: People with access to biometric data acknowledge BIPA-handling training.
What ships:
- docs/policies/BIPA_HANDLING_TRAINING_v1.md: training material covering:
- What constitutes biometric identifiers / information
- The consent + retention procedures
- Destruction obligations
- Reporting suspected exposure
- Acknowledgment record per individual (initially: J + counsel + named operators)
- Annual refresh
⚖ COUNSEL — write training content. Engineering doesn't author legal-compliance training.
4. Phase 1.6 exit criteria (gates Phase 2 backfill)
All 5 gates must be DONE before identityd backfill begins. Status as of 2026-05-03 — scaffolds vs. counsel sign-off vs. shipped code:
| # | Gate | Engineering | Counsel | Status |
|---|---|---|---|---|
| 1 | Public retention schedule | scaffolded at docs/policies/consent/biometric_retention_schedule_v1.md | pending | eng-staged |
| 2 | Consent template | scaffolded at docs/policies/consent/biometric_consent_template_v1.md | pending | eng-staged |
| 3 | Photo-upload endpoint with consent enforcement | NOT STARTED: depends on identityd photo intake design + deepface integration | n/a until eng | blocked-on-design |
| 4 | Name → ethnicity inference removed | DONE: mcp-server/search.html:3372 removal note + mcp-server/phase_1_6_gate_4.test.ts absence test (3/3 green) | none required | DONE |
| 5 | Destruction runbook | scaffolded at docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md; erasure endpoint + verify/report scripts marked TODO | pending | eng-staged |
PLUS:
| # | Item | Engineering | Counsel | Status |
|---|---|---|---|---|
| 6 | Cryptographic attestation pre-identityd | DONE: scripts/staffing/attest_pre_identityd_biometric_state.sh + docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md (3/3 evidence checks pass; signature lines pending) | pending signature | eng-DONE, signature-pending |
| 7 | Employee training material | scaffold deferred — Gate 5 runbook §7 acknowledgment may serve as substrate | pending | deferred |
Until items 1-5 + 6 are checked off, identity service backfill (Phase 2 §5 Step 5) cannot proceed.
Calendar bottleneck: Items 1, 2, 5, 6 (and #7) await counsel review of the engineering scaffolds. Gate 3 (photo-upload endpoint) is the only remaining engineering work; it's deferred to its own session because it crosses into identityd photo intake and deepface integration scope that hasn't been designed yet.
5. Effort estimate
| Gate | Engineering effort | Legal effort |
|---|---|---|
| Gate 1 (retention schedule) | 0.5 day | counsel-dependent (typically 1-2 weeks for review) |
| Gate 2 (consent template) | 0.5 day | counsel-dependent (typically 2-4 weeks for review and consent UX design) |
| Gate 3 (photo-upload endpoint) | 1-2 days | review of endpoint behavior |
| Gate 4 (deprecate name-ethnicity inference) | 0.5 day | none (engineering-only fix) |
| Gate 5 (destruction runbook) | 1 day | counsel sign-off |
| §2 cryptographic attestation | 0.5 day | counsel + J signature |
| §3 employee training | 0.25 day (admin) | counsel-authored content |
| Total engineering | ~4-5 days | — |
| Total counsel | — | ~3-6 weeks calendar (review cycles) |
The calendar bottleneck is counsel, not engineering. Engineering can stage all 5 gates ready-to-ship in a week. Counsel sign-off + consent UX rollout is the longer pole.
6. Open questions for J + counsel
- Photo-upload UX: is there an existing intake form / staffer console where photo upload would happen? Or is this new UI work?
- Consent collection mechanism: electronic signature service (DocuSign, Adobe Sign), in-app click-acceptance, paper form? Each has different evidentiary weight in litigation.
- Operator list with biometric access: who, today, would be on the named-operators list for §3 training?
- Counsel for sign-off: named outside counsel — same or different from the dual-control legal-token party in identity service?
- Public privacy policy URL: does one exist? If yes, where; if no, that's a separate Gate-1.5 deliverable.
7. What this PRD is NOT
- Not legal advice. The ⚖ COUNSEL markers exist because the binding text needs lawyers, not engineers.
- Not a substitute for a DPIA / PIA. Phase 1.6 satisfies BIPA-specific gates; a Data Protection Impact Assessment is broader and may be required separately.
- Not a SOC2 Type II deliverable. SOC2 is a parallel work stream.
- Not the only gate before production. The full 9-phase audit-trail program continues; Phase 1.6 specifically unblocks Phase 2 (identity service implementation).
Change log
- 2026-05-03: Initial draft. Authored after IDENTITY_SERVICE_DESIGN v3 §5 Step 0 named Phase 1.6 as a hard prerequisite to backfill.