lakehouse/ops/systemd/lakehouse-retention-sweep.service
root 68d226c314 phase 1.6: BIPA withdrawal endpoint + UI + retention sweep timer
Closes the four production gaps that were live after the consent
endpoint shipped (76cb5ac):

(1) Withdrawal endpoint POST /biometric/subject/{id}/withdraw
    backs the BIPA right of withdrawal that consent template v1 §2
    explicitly promises. Without it, the only way to honor a
    candidate's withdrawal request was the heavier /erase, which
    destroys immediately rather than starting the 30-day SLA clock
    that the consent template commits to. Side-effects:
      - manifest.consent.biometric.status: Given → Withdrawn
      - manifest.consent.biometric.retention_until: 18mo → 30d
      - audit row kind=biometric_consent_withdrawal, captures
        reason + operator_of_record + evidence_path
      - DOES NOT touch general_pii or subject.status — biometric
        is independently revocable
    State machine:
      - Given → Withdrawn (happy path)
      - NeverCollected/Pending → 409 nothing_to_withdraw
      - Withdrawn → 409 already_withdrawn (won't advance the
        destruction clock)
      - Expired → 409 already_expired
      - subject Erased/RetentionExpired → 403 subject_inactive
    12 new unit tests covering happy path + all guards + a full
    grant→withdraw cycle that asserts retention_until is correctly
    accelerated and the audit chain has 2 rows in correct order.

(2) Withdraw UI at /biometric/withdraw (mcp-server-served HTML).
    3-screen flow: operator auth (token + name in sessionStorage),
    withdrawal form (candidate_id + free-text reason ≥10 chars +
    optional evidence path), confirmation showing the audit row
    HMAC + the 30-day retention_until clock + a curl recipe for
    /audit/subject/{id} verification. Same neo-brutalist styling
    as biometric_intake.html. Mounted at
    http://localhost:3700/biometric/withdraw and externally at
    https://devop.live/lakehouse/biometric/withdraw.

(3) Retention sweep systemd timer. crates/catalogd/bin/retention_sweep
    binary already existed; this commit schedules it. Daily 03:00 UTC,
    Persistent=true so a run missed during downtime fires at next boot. Service
    runs as oneshot with --apply (writes a date-stamped JSONL to
    data/_catalog/subjects/_retention_sweep_<date>.jsonl ONLY when
    overdue subjects exist, per the binary's existing semantics).
    install.sh updated to handle .timer + paired .service correctly:
    enables the timer, skips direct start of the oneshot service
    (the timer pulls it in). One-shot manual test confirmed clean:
    100 subjects scanned, 0 overdue (all backfill subjects within
    their 4-year general retention window).

(4) operator_of_record bug fix in intake UI. Previously the page
    hardcoded the literal string 'intake_ui_operator' as the
    operator_of_record sent to /consent — meaning every audit row
    captured the same useless placeholder, defeating the whole
    point of operator traceability. Fixed by adding an operator
    name field to the token-paste step (sessionStorage-backed),
    passed through to consent + photo POSTs as the actual operator.

Verified live post-restart:
- gateway /audit/health + /biometric/health both 200
- mcp-server /biometric/intake + /biometric/withdraw both 200
- Live withdraw probes: 401 (no token), 400 (empty body), 404
  (ghost subject), 409 nothing_to_withdraw on WORKER-1 (which
  is NeverCollected per backfill default) — all expected
- Binary strings contain: process_withdraw, withdraw_consent,
  biometric_consent_withdrawal, biometric_withdraw_response.v1,
  nothing_to_withdraw, already_withdrawn, already_expired,
  /subject/{candidate_id}/withdraw route
- systemd: lakehouse-retention-sweep.timer active+enabled,
  next fire Tue 2026-05-05 22:00 CDT (= 03:00 UTC May 6)
- Manual one-shot of retention sweep service: exit 0/SUCCESS,
  100 subjects loaded, 0 overdue

83/83 catalogd lib tests + 46/46 biometric_endpoint tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:09:32 -05:00


[Unit]
Description=Lakehouse retention sweep — flag biometric + general PII subjects past their retention_until clock (BIPA + general retention compliance)
Documentation=file:///home/profit/lakehouse/docs/PHASE_1_6_BIPA_GATES.md
After=lakehouse.service

[Service]
Type=oneshot
User=root
Group=root
WorkingDirectory=/home/profit/lakehouse
# Use the release binary built alongside catalogd. retention_sweep
# deliberately does NOT auto-mutate state — it writes a date-stamped
# JSONL report to data/_catalog/subjects/_retention_sweep_<date>.jsonl
# (see crates/catalogd/src/bin/retention_sweep.rs). Operators run
# actual destruction via the destruction runbook + verify scripts.
#
# Without --apply the sweep is dry-run only. With it, the report is
# persisted whenever overdue subjects exist, and the log below
# records every daily run.
ExecStart=/home/profit/lakehouse/target/release/retention_sweep \
    --storage-root /home/profit/lakehouse/data \
    --apply
StandardOutput=append:/var/log/lakehouse/retention_sweep.log
StandardError=append:/var/log/lakehouse/retention_sweep.log
# Bound the runtime: a sweep over 100k subjects should complete in
# seconds; if it ever reaches this timeout we want a hard stop, not
# a runaway.
TimeoutStartSec=10min
# Guardrails. retention_sweep is read-mostly + writes one JSONL.
CPUQuota=100%
MemoryMax=2G
Nice=15

[Install]
WantedBy=multi-user.target
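
The commit message says this oneshot service is pulled in by lakehouse-retention-sweep.timer, daily at 03:00 UTC with Persistent=true. The timer file itself is not shown here, so the following is only a minimal sketch of what such a paired unit might look like. The Description text, the explicit Unit= line, and WantedBy=timers.target are assumptions, not copied from the repository.

```ini
# Hypothetical sketch of lakehouse-retention-sweep.timer.
# Only the schedule and Persistent=true come from the commit message.
[Unit]
Description=Daily trigger for the Lakehouse retention sweep (assumed wording)

[Timer]
# Fire once a day at 03:00 UTC regardless of the host's local timezone.
OnCalendar=*-*-* 03:00:00 UTC
# If the host was down at the scheduled time, run once after the next boot.
Persistent=true
# Optional: a timer activates the same-named .service by default.
Unit=lakehouse-retention-sweep.service

[Install]
# Assumption: timers are conventionally enabled under timers.target.
WantedBy=timers.target
```

With a layout like this, only the timer is enabled. The oneshot service is not started directly, which matches the install.sh behaviour described in the commit message.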