lakehouse/docs/AUDIT_TRAIL_PRD.md
root b2d717ae44 audit PRD: add §10.5 jurisdictional surface (IL + IN, federal, SOC2)
J flagged that the staffing system targets Chicago + Indiana — added a
jurisdictional checklist section to the audit-trail PRD so counsel has
a working starting point.

Covered:
- Federal: Title VII, ADEA, ADA, EEOC, OFCCP, FCRA, Section 1981
- Illinois: BIPA (high risk if any candidate photos), AI Video Interview
  Act (820 ILCS 42), Illinois Human Rights Act (broader than Title VII),
  PIPA breach notification, Day and Temporary Labor Services Act
  (directly applies — staffing industry-specific recordkeeping), Cook
  County + City of Chicago Human Rights Ordinances (additional protected
  classes including source of income, parental status, credit history)
- Indiana: Data Breach Disclosure, Civil Rights Law (lighter than IL),
  Genetic Information Privacy Act
- SOC 2 Type II as the typical SaaS sale gate (Privacy + Security TSCs
  most relevant; 6-9 month effort to first report)
- HIPAA / PCI / ISO 27001 noted as out of current scope but flagged

Phase reordering implications captured:
- BIPA risk on real candidate photos may need to be resolved BEFORE
  audit-trail work (class-action exposure)
- SOC 2 Type II prep runs in parallel, not after
- IL Day and Temporary Labor Services recordkeeping may override our
  proposed 4-year retention SLA

7 open questions added that counsel must answer before the §8 phases
can be locked in. Document is explicit (multiple times) that this is
NOT legal advice — a research-grade checklist for J's counsel
conversation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:56:28 -05:00


PRD: Production-Ready Audit Trail

Status: Draft — 2026-05-03 · Owner: J · Drafted by: working session 2026-05-03

Why this document exists. Staffing client won't sign until we can prove the AI system can defend a discrimination claim. We've been claiming "production-ready" off smoke + parity tests; those prove the surface compiles, NOT that an audit response can be produced for a specific person. This PRD writes the audit-trail capability down before we start building it, so the phases are accountable and the scope doesn't drift mid-implementation.


1. The worked example

Scenario. John Martinez worked at Warehouse B as a placed candidate. Six months later he files a complaint claiming discrimination during the hiring process. His lawyer requests an audit under EEOC discovery: produce every AI-system decision affecting John between dates D1 and D2.

What we must produce. A response that proves either:

  • (a) John was treated identically to other candidates with comparable qualifications — same scoring criteria, same model invocations, same decision rules — and the outcome differences are explained by non-protected factors, OR
  • (b) The system surfaces exactly what factors led to outcomes, in a form a court can verify, so the claim can be defended on documented criteria rather than "trust the AI."

What we must NOT produce.

  • Other subjects' data (response leaks if even one other candidate's name appears)
  • Internal infrastructure details (DB paths, server names, internal IDs that aren't candidate-shaped)
  • Raw model prompts/completions that contain protected attributes (race, gender, age, etc.) — even if the model didn't use them, their presence in the audit log creates new evidence

The defensibility chain. The audit shows:

  1. Indexing-time decisions — when John was added to the candidate pool, what embedding the model produced, what features were extracted, what categories he was placed into
  2. Search-time decisions — every query that included him in candidate sets, what rank he received, what the model used to compute that rank
  3. Recommendation-time decisions — every fill/recommendation event involving him, what scoring drove it, what validators ran, what they returned
  4. Iteration decisions — any iterate retries that touched him (validator failures, model self-corrections)
  5. Outcome decisions — final fills, rejections, hand-offs

For each, the audit row must show: timestamp, decision type, model + provider, input features (sanitized of protected attributes — see §4), output decision, rationale.


2. The subject audit response — output format

GET /audit/subject/{candidate_id}?from=D1&to=D2
→ JSON or signed PDF (legal preference TBD)

Header section:

  • subject identifier (candidate_id), date range, response generation timestamp, signing daemon, integrity hash
  • pre-translation note: candidate_id ↔ PII mapping is held by the identity service (§5), NOT by this audit endpoint. Legal counsel re-correlates separately under their own access controls.

Per-decision row schema (shape, not exhaustive):

{
  "ts": "ISO-8601 UTC",
  "decision_kind": "embedding_create | search_inclusion | search_rank | fill_recommendation | validation_outcome | iterate_attempt | observer_signal",
  "daemon": "gateway | validatord | observerd | matrixd | ingestd",
  "model": "kimi-k2.6 | deepseek-v3.2 | ...",
  "provider": "ollama_cloud | opencode | openrouter",
  "input_features": { /* what the model SAW, sanitized per §4 */ },
  "output": { /* what the model decided */ },
  "rationale": "model's natural-language explanation, or rule-based justification",
  "trace_id": "X-Lakehouse-Trace-Id linking to Langfuse trace tree",
  "session_id": "iterate session that produced this row"
}
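The same row as a typed sketch (runtime-neutral Python; field names follow the schema above, sample values are illustrative):

```python
from dataclasses import dataclass, asdict
from typing import Any

@dataclass
class AuditRow:
    """One decision about one subject, per the §2 schema (sketch only)."""
    ts: str                          # ISO-8601 UTC timestamp
    decision_kind: str               # e.g. "search_rank", "fill_recommendation"
    daemon: str                      # which daemon made the decision
    model: str
    provider: str
    input_features: dict[str, Any]   # sanitized per §4 -- no protected attributes
    output: dict[str, Any]
    rationale: str
    trace_id: str
    session_id: str

row = AuditRow(
    ts="2026-05-03T05:56:28Z",
    decision_kind="search_rank",
    daemon="gateway",
    model="kimi-k2.6",
    provider="ollama_cloud",
    input_features={"years_experience": 4, "certifications": ["forklift"]},
    output={"rank": 3, "pool_size": 41},
    rationale="ranked on certification match and recency of warehouse experience",
    trace_id="trace-00f3",
    session_id="sess-812",
)
assert "race" not in row.input_features  # the §4 property this row exists to prove
```

The closing assert is the point of the shape: by inspecting any row, absence of protected attributes in input_features is checkable, not asserted.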

Footer section:

  • Coverage attestation: "this response includes ALL decisions about candidate_id between D1 and D2 that are retained per §6 retention policy"
  • Sign-off: cryptographic signature from a daemon whose key is in escrow (proves audit was generated by the system, not hand-edited)
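A sketch of how the footer's hash and signature could hang together. HMAC-SHA256 is a symmetric stand-in for the sketch; the real escrow signature would likely be asymmetric (e.g. Ed25519) so legal can verify without holding the signing key:

```python
import hashlib
import hmac
import json

def audit_digest(rows: list[dict]) -> str:
    """Integrity hash over the canonical JSON of all decision rows."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def sign_response(rows: list[dict], escrow_key: bytes) -> dict:
    """Footer fields: integrity hash + signature (HMAC stand-in for the sketch)."""
    digest = audit_digest(rows)
    sig = hmac.new(escrow_key, digest.encode(), hashlib.sha256).hexdigest()
    return {"integrity_hash": digest, "signature": sig}

def verify_response(rows: list[dict], footer: dict, escrow_key: bytes) -> bool:
    """Recompute the digest and check the signature; any edit to the rows fails."""
    expected = sign_response(rows, escrow_key)
    return hmac.compare_digest(expected["signature"], footer["signature"])
```

Tampering with any row changes the digest, so a hand-edited response fails verification, which is exactly the "not hand-edited" claim the footer makes.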

3. Surface map — where decisions happen

| Decision happens at | Currently logged where | Audit-completeness gap |
|---|---|---|
| Ingestion (candidate added to pool) | data/_kb/outcomes.jsonl? journald mutation log? | UNKNOWN — needs walk |
| Embedding creation (vector built for candidate) | NOWHERE per-candidate; embed cache hits aren't subject-tagged | MAJOR GAP — need to subject-tag every embedding |
| Search inclusion (candidate appeared in a result set) | Pathway memory + session JSONL (?) | Partial — need subject-correlation |
| Search rank (position in result set) | Result set in chat traces, but not indexed by candidate | Partial |
| Fill recommendation | data/datasets/fill_events.parquet (per CLAUDE.md decision A) + pathway memory | Probably OK but not verified |
| Validation outcome (FillValidator/EmailValidator pass/fail) | /v1/iterate session JSONL — but validation_kind not populated per yesterday's misread | Partial — fix today |
| Iterate retry escalations | Session JSONL attempts[] array | OK |
| Observer signals | observerd events at :3800 (or :4219 Go side) | UNKNOWN — needs walk |
| Matrix-indexer compounding (semantic flags, bug fingerprints) | pathway_memory/state.json (currently 91 traces) | Probably leaks — these are tagged by file/task, not by subject |

Substantive finding from this walk: the matrix indexer and pathway memory are tagged by code, not by subject. They surface "this code path failed for this task class" — they don't currently let us answer "every decision matrix-indexer made about John." If matrix-indexer fingerprints leak protected-attribute correlations (e.g., a fingerprint that says "candidates from [zip code with majority demographic X] got outcome Y"), that's a discrimination smoking gun we currently have no way to audit cleanly.


4. PII handling rules

Tokenization rule: candidate_id is the only identifier that crosses runtime boundaries (logs, JSONL, traces, pathway memory, observer events, model prompts). Email / name / address / phone / SSN / DOB are NEVER in any of these surfaces.

Identity service (§5) holds the candidate_id ↔ PII mapping. Only legal-authorized access reads it.

Protected-attribute exclusion at decision time: the model NEVER receives:

  • Race, ethnicity, national origin
  • Sex, gender, marital status, pregnancy
  • Age, date of birth (allowed: years of experience)
  • Religion
  • Disability, genetic information
  • Veteran status (unless legally relevant for the role)
  • Sexual orientation, gender identity

If the model never sees these, no decision can be predicated on them. The audit row's input_features field proves this: by inspecting the row, a lawyer can confirm protected attributes were absent from input.

Inferred-attribute risk. A model can infer protected attributes from non-protected proxies (zip code → race, name → ethnicity, photo → multiple). The audit must surface this risk. Open question: do we ban photo features from candidate scoring? Do we ban surname tokenization? These are policy calls.
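One way to make these policy calls enforceable rather than aspirational is an allowlist: only features legal has explicitly approved reach the model, so a new proxy can't slip in by default. A sketch, where the allowlist contents are illustrative assumptions, not decided policy:

```python
# Allowlist of features legal has approved as non-proxy (illustrative contents).
# Everything else is rejected, so a new proxy (zip_code, surname tokens,
# photo-derived features) can't slip in without an explicit policy decision.
APPROVED_FEATURES = {
    "years_experience", "certifications", "shift_availability", "forklift_certified",
}

def filter_features(raw: dict) -> tuple[dict, list[str]]:
    """Return (approved features, rejected feature names) for the audit row."""
    approved = {k: v for k, v in raw.items() if k in APPROVED_FEATURES}
    rejected = sorted(k for k in raw if k not in APPROVED_FEATURES)
    return approved, rejected

features, dropped = filter_features(
    {"years_experience": 4, "zip_code": "60617", "surname_token": "mar"}
)
assert dropped == ["surname_token", "zip_code"]  # proxies never reach the model
```

The rejected list goes into the audit row alongside input_features, so the response shows not just what the model saw but what the boundary refused.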

Audit response sanitization: the response goes to the candidate's lawyer, not to the world. It contains the candidate's own name (re-correlated by legal). It must NOT contain other candidates' names, even in comparison/ranking rows.


5. Identity service — candidate_id ↔ PII mapping

Current state: data/datasets/workers_500k.parquet has the full PII (per CLAUDE.md). The candidates_safe view (post-fix c3c9c21) is the masked projection. GAP: candidate_id is currently the row position / a derived field — there's no separate identity service. This needs to change.

Target state:

  • identity/ subsystem (new) — holds the candidate_id → {email, name, address, phone, SSN_last4, DOB, ...} mapping
  • All other systems (gateway, validatord, observerd, matrixd, pathwayd) only ever see candidate_id
  • Identity reads require a separate auth credential held by legal-authorized operators
  • Every identity read is itself audited (log who accessed PII for which candidate when)
  • Identity service runs as its own daemon, port-isolated from the gateway
  • Cross-runtime: same identity service backs both Rust and Go

Open question: does the identity service need to be a separate physical daemon (most defensible) or a logically-separated process within an existing one (easier to ship)? Recommend separate daemon — gives legal a single attestable boundary.
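The audited-read rule can be sketched in a few lines (runtime-neutral; `IdentityService` and its field set are illustrative names, not a decided API):

```python
import datetime

class IdentityService:
    """Sketch: candidate_id -> PII mapping where every read is itself logged."""

    def __init__(self):
        self._pii: dict[str, dict] = {}   # candidate_id -> PII record
        self.read_log: list[dict] = []    # who read whose PII, when, why

    def enroll(self, candidate_id: str, pii: dict) -> None:
        self._pii[candidate_id] = pii

    def read_pii(self, candidate_id: str, operator: str, reason: str) -> dict:
        # The read is logged BEFORE the data is returned, so there is no
        # code path that reveals PII without leaving an audit row behind.
        self.read_log.append({
            "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "candidate_id": candidate_id,
            "operator": operator,
            "reason": reason,
        })
        return self._pii[candidate_id]
```

The real service adds the separate auth credential and port isolation; the invariant to preserve is that the log write precedes the PII return.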


6. Retention policy

Current state: UNKNOWN. Pathway memory is append-only. Session JSONL is append-only. We have no documented retention SLA.

Target state (proposed):

  • Active retention: while client is in the system, all audit rows kept hot (queryable in <1s)
  • Legal hold: N years after client/candidate leaves the system, audit rows retained on warm storage. N is TBD — typical EEOC retention is 1-3 years; some state-level claims have 4-year statutes; Title VII discovery can subpoena older. Recommend 4 years minimum, configurable per client contract.
  • Right to be forgotten: if a candidate requests deletion under CCPA/GDPR, we apply tombstoning to the identity service (PII removed) BUT preserve the audit-decision rows under candidate_id (anonymized via PII removal at the source). The decision history remains; the human identification is severed.
  • Cryptographic erasure for append-only logs: pathway memory and matrix indexer can't be selectively deleted without breaking integrity. Encryption-at-rest with per-subject keys lets us "delete" by destroying the key — the encrypted row remains but is unreadable.
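The key-destruction mechanic, sketched with a per-subject key store. The XOR keystream below is a toy purely to keep the sketch self-contained; a production implementation would use an authenticated cipher such as AES-GCM:

```python
import hashlib
import secrets

class ErasableStore:
    """Per-subject keys; destroying a key makes that subject's rows unreadable."""

    def __init__(self):
        self._keys: dict[str, bytes] = {}

    def _keystream(self, key: bytes, n: int) -> bytes:
        # Toy keystream for the sketch ONLY -- production would use AES-GCM.
        out, counter = b"", 0
        while len(out) < n:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:n]

    def encrypt(self, candidate_id: str, plaintext: bytes) -> bytes:
        key = self._keys.setdefault(candidate_id, secrets.token_bytes(32))
        ks = self._keystream(key, len(plaintext))
        return bytes(a ^ b for a, b in zip(plaintext, ks))

    def decrypt(self, candidate_id: str, ciphertext: bytes) -> bytes:
        key = self._keys.get(candidate_id)
        if key is None:
            # Post-erasure, the audit response returns a "subject erased" header.
            raise KeyError("subject erased")
        ks = self._keystream(key, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, ks))

    def erase(self, candidate_id: str) -> None:
        # Right to be forgotten: the append-only bytes stay on disk,
        # but without the key they are unreadable.
        self._keys.pop(candidate_id, None)
```

This preserves append-only integrity (nothing is rewritten) while still severing readability per subject.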

Open question: does the staffing client want a documented retention SLA in their contract? If yes, this PRD becomes contract-grade and the numbers above need their sign-off.


7. Current state vs target state

| Capability | Today | Production-ready target | Gap |
|---|---|---|---|
| candidate_id as canonical token | partial (row position?) | UUID, separate from PII | Real — needs identity service |
| Identity service | none | separate daemon, audited reads | Real — build new |
| /audit/subject/{id} endpoint | none | live with the §2 schema | Real — build new |
| Subject-tagged embeddings | no | every embed creates an audit row | Real — instrument |
| Subject-tagged search results | partial | every result set logged with subject IDs | Partial — needs walk |
| Subject-tagged validation outcomes | yes (in session JSONL) | yes + integrity-signed | Partial |
| Subject-tagged matrix indexer entries | NO | yes (decide first whether matrix should be subject-aware at all) | Major |
| Protected-attribute filter at decision time | informal | enforced at gateway boundary, audited | Unknown — needs code walk |
| Retention policy | none | documented 4-year hot, configurable cold tier | Real — design + build |
| Right to be forgotten | none | per-subject cryptographic erasure | Real — design + build |
| Cross-runtime parity for all of the above | partial (5 algorithm probes) | new audit-parity probes | Real — extend probe set |

8. Implementation phases (proposed sequence)

Each phase has an exit criterion the next phase can lean on. Don't start phase N+1 until phase N's exit holds.

Phase 1 — Discovery walk (read-only, ~3-4 hours)

Walk every daemon and tag every code path that touches subject identifiers. Output: a complete map of where candidate_id lives today, where email/name/PII leak today, what's logged where. No code changes. Fills in all "UNKNOWN" entries in §3 and §7 with file:line references.

Exit: §3 surface map is fully populated with current-state evidence. §7 gap table has no "Unknown" cells.

Phase 2 — Identity service design (design doc, ~2 hours)

Write docs/IDENTITY_SERVICE.md: schema, port, auth model, read-audit format, cross-runtime contract, migration path from current state. No code changes.

Exit: J approves the design.

Phase 3 — Audit response endpoint (skeleton, ~4-6 hours)

Build /audit/subject/{id} endpoint that returns ALL information CURRENTLY logged about the subject — even before identity service is built, even if logs leak PII, even if subject-tagging is incomplete. This is the "what John Martinez would get today" baseline. Reading the output reveals exactly what's wrong.

Exit: endpoint returns a JSON response for any candidate_id in workers_500k. Contents are reviewed; gaps catalogued.
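The Phase 3 skeleton is essentially grep-and-group over today's substrates. A runtime-neutral sketch (the substrate list and the JSONL assumption are placeholders for whatever the Phase 1 walk finds):

```python
import json
from pathlib import Path

def baseline_audit(candidate_id: str, substrates: list[Path]) -> list[dict]:
    """Collect every currently-logged JSONL line mentioning the subject.

    Deliberately naive: it surfaces whatever is logged today, PII leaks
    and all, so reading the output reveals exactly what's wrong.
    """
    rows = []
    for path in substrates:
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if candidate_id in line:
                try:
                    payload = json.loads(line)
                except json.JSONDecodeError:
                    payload = {"raw": line}   # non-JSON substrates still surface
                rows.append({"source": str(path), "line": lineno, "payload": payload})
    return sorted(rows, key=lambda r: (r["source"], r["line"]))
```

The gap catalogue for the exit criterion falls out of reviewing this output by hand.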

Phase 4 — Subject tagging across substrates

Instrument the missing decision points (embedding creation, search rank, observer signals, matrix indexer entries) with subject identifiers. Each daemon's instrumentation lands as its own commit. Cross-runtime: each Rust commit ships with a Go-side mirror.

Exit: /audit/subject/{id} response is complete for the worked example (John Martinez at Warehouse B can be reconstructed end-to-end).

Phase 5 — Identity service build

Stand up the identity daemon. Migrate candidate_id ↔ PII mapping out of workers_500k.parquet into the new service. Audit every read. Update all callers to never see PII directly.

Exit: PII grep across all log files / JSONL streams / pathway memory state returns 0 hits. Cross-runtime parity probe added: audit_parity.sh validates Rust + Go produce identical audit responses for the same subject.
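The PII-grep exit can be sketched as a pattern scan over every log substrate. The regexes below are illustrative; the real probe would be seeded from the identity service's actual field values, not patterns alone:

```python
import re
from pathlib import Path

# Illustrative PII patterns -- the production probe would also check for the
# literal values held by the identity service (names, emails, phone numbers).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def pii_scan(paths: list[Path]) -> list[tuple[str, int, str]]:
    """Return (file, line number, pattern name) for every hit; the Phase 5
    exit criterion holds when this list is empty across all substrates."""
    hits = []
    for path in paths:
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            for name, pattern in PII_PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), lineno, name))
    return hits
```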

Phase 6 — Protected-attribute boundary enforcement

Add a hard filter at the gateway: any model invocation must declare the input features it sees, and protected attributes are stripped at the boundary. Audit row's input_features becomes load-bearing.

Exit: can run discrimination-test scenario: feed protected attribute through, verify it's stripped before model sees it, verify audit row shows the stripping.
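The exit scenario can be sketched as a boundary function that both strips protected attributes and records the strip in the audit row, so the row itself is the evidence. The attribute names mirror the §4 list:

```python
# §4 exclusion list (federal baseline; IL/Chicago additions per §10.5 would extend it).
PROTECTED_ATTRIBUTES = {
    "race", "ethnicity", "national_origin", "sex", "gender", "marital_status",
    "pregnancy", "age", "date_of_birth", "religion", "disability",
    "genetic_information", "veteran_status", "sexual_orientation", "gender_identity",
}

def strip_at_boundary(declared_features: dict) -> dict:
    """Gateway-boundary filter: returns the audit-row fragment proving the strip."""
    stripped = sorted(k for k in declared_features if k in PROTECTED_ATTRIBUTES)
    passed = {k: v for k, v in declared_features.items()
              if k not in PROTECTED_ATTRIBUTES}
    return {"input_features": passed, "stripped_attributes": stripped}

# Discrimination-test scenario: feed a protected attribute through, verify it is
# stripped before the model sees it, verify the audit row shows the stripping.
row = strip_at_boundary({"years_experience": 4, "age": 52})
assert "age" not in row["input_features"]
assert row["stripped_attributes"] == ["age"]
```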

Phase 7 — Retention + right-to-be-forgotten

Document retention SLA. Implement tier-down (hot → warm → cold → encrypted-with-deletable-key). Implement subject-erasure endpoint.

Exit: test scenario: subject requests deletion, identity service tombstones, decision rows remain under candidate_id but are unreadable post-erasure, audit response for that subject returns "subject erased" header instead of decision rows.

Phase 8 — Legal export format

Decide JSON vs signed PDF for legal output. Build the export pipeline. Sign with a key in escrow.

Exit: can produce the John Martinez audit response in the format legal will accept; signature verifies.

Phase 9 — End-to-end discrimination defense rehearsal

Run the worked example: simulate John Martinez's complaint, generate the audit, walk through what a lawyer would see, identify any remaining gaps, fix them.

Exit: J + (eventually) the staffing client's legal team sign off on the format and completeness.


9. Cross-runtime requirement

Both the Rust legacy and the Go rewrite must satisfy every phase's exit criterion. The 5 existing parity probes (validator, extract_json, session_log, materializer, embed) cover algorithmic equivalence; they do NOT cover audit. A new parity probe, audit_parity.sh, lands as part of phase 5.
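audit_parity.sh itself will be shell; the comparison it needs can be sketched as canonical-JSON equality after dropping fields that legitimately differ per runtime. The volatile-field list is an assumption to be confirmed during phase 5:

```python
import json

# Fields expected to differ between runtimes without breaking parity (assumed).
VOLATILE = {"ts", "trace_id", "session_id"}

def canonicalize(response: dict) -> str:
    """Drop volatile fields from every row, then emit deterministic JSON."""
    rows = [
        {k: v for k, v in row.items() if k not in VOLATILE}
        for row in response.get("rows", [])
    ]
    rows.sort(key=lambda r: json.dumps(r, sort_keys=True))
    return json.dumps(rows, sort_keys=True, separators=(",", ":"))

def audit_parity(rust_response: dict, go_response: dict) -> bool:
    """True when both runtimes produced identical decisions for the same subject."""
    return canonicalize(rust_response) == canonicalize(go_response)
```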

The identity service is the new shared substrate — both runtimes call it; the daemon itself is one implementation (no per-runtime version).


10. Open questions blocking phase 1

These are the things I need J to decide before phase 1 can start, OR I need to investigate-and-propose:

  1. Identity service: separate daemon vs in-process? Recommend separate. Confirm.
  2. Retention period N years? Recommend 4. Need staffing client's legal call.
  3. Photo / surname / zip-code policy? These are inferred-attribute risks. Need policy decision.
  4. JSON or signed PDF for legal export? Different downstream costs.
  5. Right-to-be-forgotten under append-only logs: cryptographic erasure (proposed) or hard delete (breaks integrity)? Confirm crypto-erasure approach.
  6. Audit endpoint auth model: legal-only credential, or shared with admin? Recommend legal-only with separate token rotation.
  7. The "indexed before search" concern: matrix indexer + pathway memory currently fingerprint by code, not subject. Do we (a) make them subject-aware (more audit completeness, more PII surface area), (b) keep them code-only and assert in audit response that "no subject-specific compounding state was used," or (c) something else?

Items 1-6 can be resolved by J's call. Item 7 needs design discussion — the safest answer for legal defense is (b), but it loses the "pathway learns about THIS candidate" signal that may be load-bearing for the staffing UX.


10.5 Jurisdictional surface (IL + IN)

⚠ Not legal advice. This is a research-grade checklist for J to take into a conversation with actual employment + privacy counsel. The system is targeting Chicago (Illinois) and Indiana placements per 2026-05-03 conversation. Counsel needs to verify what currently applies, what's pending, and whether case law has moved any of these in 2026. Verify with counsel before claiming compliance with any item below.

Federal layer (always applies)

| Statute / framework | Relevance to this system |
|---|---|
| Title VII (Civil Rights Act) | Bans discrimination on race, color, religion, sex, national origin in hiring. The discrimination-claim defense is the worked example in §1. |
| ADEA (Age Discrimination in Employment Act) | Bans age-based discrimination for workers 40+. DOB must be excluded from features per §4. |
| ADA (Americans with Disabilities Act) | Bans disability discrimination and requires reasonable accommodation. Disability-inferring features (gait, photo features, medical history) need exclusion. |
| EEOC enforcement | Receives complaints, issues right-to-sue letters. The audit response per §2 is what defends in an EEOC investigation. |
| OFCCP | Applies if our staffing client serves federal contractors. Adds affirmative-action recordkeeping on top of EEOC. |
| FCRA (Fair Credit Reporting Act) | Triggers if background checks are performed. Pre-adverse-action notice + dispute process needed. |
| Section 1981 | Race-based contract discrimination — staffing is a contract relationship. |

Illinois-specific (Chicago jurisdiction)

| Statute | What | What we need |
|---|---|---|
| BIPA (Biometric Information Privacy Act, 740 ILCS 14) | Bans collection of biometric identifiers (face geometry, fingerprints, voiceprints) without informed written consent + a retention schedule. Penalties: $1,000-$5,000 per violation per person. Class actions are common and aggressive. | If we use candidate photos for any feature (face match, headshot rendering, photo-derived attributes), BIPA almost certainly applies. The headshot pool we generate (per CLAUDE.md, commit 5d93a71 area) needs careful review — synthetic faces are probably OK; real candidate photos are NOT without explicit BIPA-compliant consent. Counsel must review. |
| Illinois AI Video Interview Act (820 ILCS 42) | If AI analyzes recorded video interviews, the employer must disclose AI use, obtain consent, explain how the AI works, and limit who can review the video. | If we ever ingest video, this applies. Currently we don't, but worth flagging to counsel as a "what if we add this in 12 months" boundary. |
| Illinois Human Rights Act (775 ILCS 5) | Broader than federal Title VII — adds protected classes including arrest record, military status, marital status, order of protection, citizenship status (in some cases), unfavorable military discharge. | The protected-attribute exclusion list in §4 needs expanding to cover IL-specific classes. |
| Personal Information Protection Act (PIPA, 815 ILCS 530) | Breach notification — must notify Illinois residents whose unencrypted PII was breached. | If the identity service or the workers parquet is breached, the notification clock starts. Need an incident-response runbook. |
| Illinois Day and Temporary Labor Services Act (820 ILCS 175) | Specific to the staffing/temporary-services industry. Includes equal-pay-for-equal-work, recordkeeping requirements, and worker notification. | Highly relevant — applies directly to staffing-company clients. Audit retention may interact with these recordkeeping requirements. |
| Workplace Transparency Act | Restrictions on non-disclosure agreements re: harassment/discrimination. | Tangential but worth noting. |
| City of Chicago Human Rights Ordinance (Title 6, Chicago Municipal Code) | Adds protected classes beyond IHRA (source of income, parental status, military discharge status, credit history). | Chicago-specific protected-attributes list. |
| Cook County Human Rights Ordinance | Similar additions county-wide. | Chicago is in Cook County, so this stacks. |
| Possible: AI hiring transparency | Several states/cities have proposed/passed laws modeled on NYC Local Law 144 (annual bias audit + candidate notification). I do not know whether IL or Chicago has such a law on the books as of the 2026-01 cutoff. | Counsel must check current state. If it exists, we need annual bias-audit reports (which IS what this PRD is building toward, but the report format may have specific requirements). |

Indiana-specific

| Statute | What | What we need |
|---|---|---|
| Indiana Data Breach Disclosure (IC 24-4.9) | Breach notification "without unreasonable delay". | Same incident-response runbook as IL PIPA. |
| Indiana Civil Rights Law (IC 22-9) | State-level employment discrimination. | Largely tracks federal Title VII; fewer expansions than IL. |
| Indiana Genetic Information Privacy Act | Bans use of genetic info in employment. | Already in the §4 protected list. |
| General observation | Indiana is generally less aggressive than Illinois on AI/employment regulation as of cutoff. | The IL bar is higher — if we satisfy IL, IN typically follows. Counsel must confirm this isn't backwards. |

Cross-cutting (security frameworks for SaaS sales)

These aren't laws but are commonly required by enterprise customers (including staffing clients) before sale.

| Framework | What | Relevance |
|---|---|---|
| SOC 2 Type II | Auditor attestation of operating effectiveness over 6-12 months across the Trust Service Criteria (Security, Availability, Processing Integrity, Confidentiality, Privacy). | The Privacy criterion overlaps heavily with this PRD. Privacy + Security are the two load-bearing TSCs. Effort to first Type II report: 6-9 months. Type I (point-in-time) is faster (weeks), but enterprise buyers usually want Type II. |
| SOC 3 | Public-facing summary of SOC 2 (no detailed control descriptions). | Nice-to-have for marketing, but the staffing client will want the SOC 2 Type II report under NDA. |
| HIPAA | Healthcare data protection. | Triggers ONLY if staffing places workers into healthcare roles where they handle PHI. Currently not in scope per CLAUDE.md. Confirm scoping with J. |
| PCI DSS | Payment card data. | Not currently in scope. |
| ISO 27001 | International information-security management. | Alternative to SOC 2; more common in the EU. Probably unnecessary for IL/IN-only deployments. |

What this means for phase ordering

The 9-phase plan in §8 is technically correct but may need re-ordering once counsel weighs in:

  • BIPA exposure on photos is so high, and enforcement so aggressive, that if we use real candidate photos anywhere, resolving that may need to be the FIRST thing we do — before the audit-trail work starts. Class-action exposure is enormous.
  • SOC 2 Type II prep runs in parallel with this work, not after. If the staffing client says "show us your SOC 2 report" we need to have started the engagement weeks/months before.
  • Day and Temporary Labor Services Act may impose recordkeeping that interacts with our retention SLA (§6) — counsel may say "no, retention has to be N years for THIS reason, not your defaulted 4."

Open questions for counsel (one ask)

  1. Does the staffing client have an existing SOC 2 report we leverage, or do we need our own?
  2. Are we using any real candidate photos? If yes, is BIPA consent in place?
  3. Does Illinois have an AI hiring transparency law on the books in 2026? If yes, what does the bias audit report need to look like?
  4. What's the IL Day and Temporary Labor Services Act recordkeeping retention period? Does it interact with our 4-year proposed SLA?
  5. Are background checks performed? If yes, do we need FCRA pre-adverse-action workflow integration?
  6. Any healthcare placements? (HIPAA scoping)
  7. Is the staffing client a federal contractor? (OFCCP scoping)

Counsel's answers shape whether the §8 phase plan ships as-is or needs reordering.


11. What this PRD is NOT

  • Not a contract with the staffing client. That document needs lawyers and gets signed after this is built.
  • Not a regulatory compliance attestation. We can build to the spirit of GDPR/CCPA/EEOC/BIPA/etc. — passing actual certification is its own project.
  • Not a guarantee against discrimination claims. It's a guarantee that if a claim is filed, we can produce evidence about how decisions were made.
  • Not a substitute for human review. The audit shows what the AI did; humans still own the final call on hires.
  • Not legal advice. The §10.5 jurisdictional surface is a research-grade checklist, NOT counsel's analysis. Verify everything with actual employment + privacy counsel licensed in IL + IN before claiming compliance with anything in this document.

12. Appendix — terms

  • Subject — a person whose data flows through the system (candidate, worker, applicant). Identified by candidate_id.
  • Decision — any system action that changes a subject's standing (added to candidate pool, ranked in search, recommended for fill, validated, scored, etc.).
  • Audit row — one record in the audit response per decision, with the schema in §2.
  • PII — personally identifiable information per the broad CCPA/GDPR definitions. In this system: name, email, phone, address, SSN, DOB, plus inferred-from-photo attributes.
  • Protected attribute — characteristics that are illegal to discriminate on under federal/state law. The §4 list.
  • Inferred attribute — a protected attribute the model derives from a non-protected feature (zip → race correlation, name → ethnicity correlation).
  • Identity service — the daemon that holds candidate_id ↔ PII mapping. Separate auth.
  • Subject tagging — the practice of labeling every decision/embedding/log row with a candidate_id so the audit endpoint can find it.
  • Cryptographic erasure — making data unrecoverable by destroying its decryption key, even if the encrypted bytes remain on disk. Used for right-to-be-forgotten on append-only logs.

Change log

  • 2026-05-03 — Initial draft. Authored after J flagged the audit-trail gap as the production-readiness blocker.