Two follow-up walks per AUDIT_PHASE_1_DISCOVERY §10/C4 + gemini scrum
flag. Read-only. No code changes.
BIPA findings:
- scripts/staffing/tag_face_pool.py uses deepface to extract gender +
race + age from face images. Output persists to data/headshots/
manifest.jsonl. For synthetic faces this is fine; for real candidate
photos this becomes a regulated biometric database (740 ILCS 14/10).
- mcp-server/index.ts:1408: the ComfyUI prompt EXPLICITLY embeds
protected attributes (age + race + gender) into the model prompt, a
system-level encoding of protected-attribute features into the AI
workflow.
- mcp-server/search.html:3375-3432 has hard-coded FEMALE_NAMES /
MALE_NAMES / NAMES_HISPANIC / SURNAMES_* lookup tables — name-based
ethnicity inference. Title VII / disparate-impact risk separate
from BIPA.
- data/headshots/manifest.jsonl is TRACKED IN GIT today (synthetic
classifications only). With real photos, this would put biometric
data into version control, which is a serious failure.
- No consent flow, no public retention schedule, no deletion
procedure, no employee training documented. The written public
retention/destruction policy (BIPA §15(a)) and informed written
consent (§15(b)) must be in place before real-photo intake.
outcomes.jsonl sample:
- 39/101 rows persist candidate names in fills[*].name field today
- Sample names: "Carmen I. Garcia", "Jamal Z. Jones", "Jacob N. Patel"
(synthetic but real shape)
- 0 hits for proxy phrases such as "culture fit" or "communication":
synthetic data doesn't generate them, but real models reasoning about
real candidates will. Because outcomes.jsonl is append-only, RTBF
(right to be forgotten) requests can only be met by cryptographic
erasure.
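A minimal sketch of how the outcomes.jsonl counts above could be
reproduced. It assumes each row is a JSON object whose optional
"fills" list holds objects with a "name" key (the fills[*].name shape
noted above); the function name and the phrase tuple are illustrative,
not the script used for this walk.

```python
import json

# Phrases named in the audit; the "etc" in the findings means this is not exhaustive.
PROXY_PHRASES = ("culture fit", "communication")


def audit_rows(lines):
    """Return (total_rows, rows_with_names, rows_with_proxy_phrases)."""
    total = named = proxied = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        total += 1
        row = json.loads(line)
        # Assumed shape: {"fills": [{"name": "..."}, ...]}
        fills = (row.get("fills") or []) if isinstance(row, dict) else []
        if any(isinstance(f, dict) and f.get("name") for f in fills):
            named += 1
        # Case-insensitive substring match over the raw row text.
        lowered = line.lower()
        if any(phrase in lowered for phrase in PROXY_PHRASES):
            proxied += 1
    return total, named, proxied
```

Passing an open file handle for outcomes.jsonl (its directory is not
stated here) returns the three counts as a tuple.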
Recommends Phase 1.6 (NEW): BIPA pre-launch gates between Phase 1.5
and Phase 2:
- BIPA_COMPLIANCE_POLICY.md
- consent gate at upload endpoint
- quarantine real-photo classifications to data/biometric/
- deprecate name->ethnicity lookup tables
- unit test that synthetic manifest stays synthetic
Estimated 4-8 hours of design + one code commit.
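One possible shape for the "manifest stays synthetic" check, as a
sketch only. The manifest schema is not shown in this walk, so the
"source" field and its "synthetic" value are HYPOTHETICAL; the real
check would key on whatever provenance marker the manifest carries.

```python
import json

# HYPOTHETICAL marker: the real manifest schema is not documented here.
SYNTHETIC_MARKER = "synthetic"


def non_synthetic_rows(lines, field="source"):
    """Return 1-based line numbers whose `field` is not the synthetic marker."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines but keep the line count aligned
        row = json.loads(line)
        # A row with no provenance field is treated as a failure, not a pass.
        if not isinstance(row, dict) or row.get(field) != SYNTHETIC_MARKER:
            bad.append(lineno)
    return bad
```

A unit test could open data/headshots/manifest.jsonl and assert that
non_synthetic_rows(fh) == []. Treating a missing field as a failure
is deliberate: a row with no provenance fails closed.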
5 open questions for J: where real photos enter, whether the deepface
tagging path stays for real photos, consent UX, retention duration
floor, and who the designated privacy officer is.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>