lakehouse/scripts/staffing/bundle_counsel_packet.sh
root b2c34b80b3 phase 1.6: lock Gate 3b = C, reconcile docs to shipped state, fix double-upload file leak
Four threads landing together — all driven by the audit J asked for before
production cutover.

(1) Gate 3b DECIDED: Option C (defer classifications). `BiometricCollection.classifications`
    stays `Option<JSON> = None` in v1. `docs/specs/GATE_3B_DEEPFACE_DESIGN.md` status
    flipped from "draft / awaits product" to DECIDED. Consent template + retention
    schedule revised to remove all "automated facial-classification" / "deepface"
    language so disclosed scope matches implemented scope.

(2) Endpoint-path drift reconciled across 3 docs. `PHASE_1_6_BIPA_GATES.md`,
    `BIPA_DESTRUCTION_RUNBOOK.md`, and `biometric_retention_schedule_v1.md` had
    references to legacy `/v1/identity/subjects/*` paths (proposed under a separate
    identityd daemon, never shipped) — corrected to actual shipped routes
    `/biometric/subject/*` (catalogd-local). Schema block in PHASE_1_6_BIPA_GATES
    rewritten to reflect JSON `SubjectManifest.biometric_collection` substrate
    (not the proposed Postgres `subjects` table).

(3) New operational artifacts:
    - `scripts/staffing/verify_biometric_erasure.sh` — checks 4 things post-erasure
      (manifest cleared, uploads dir empty, audit row matches, chain verified).
      Smoke-tested live against WORKER-2.
    - `scripts/staffing/biometric_destruction_report.sh` — monthly anonymized
      destruction-event aggregation. Smoke-tested clean.
    - `scripts/staffing/bundle_counsel_packet.sh` — tarballs the counsel-review
      packet with per-file SHA-256 manifest.
    - `docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md` — formal rotation procedure
      operationalized after the 2026-05-05 /tmp wipe incident.
    - `docs/counsel/COUNSEL_REVIEW_PACKET_2026-05-05.md` — cover note bundling
      all eng-staged BIPA docs for counsel review with per-doc questions, sign-off
      checklist, recommended review sequence.

(4) Double-upload file leak fixed in `crates/catalogd/src/biometric_endpoint.rs`.
    `verify_biometric_erasure.sh` smoked WORKER-2 and surfaced a stranded photo
    file. Investigation showed the file was 13-byte test-fixture bytes (zero PII,
    no biometric content); audit timeline showed two consecutive uploads followed
    by one erasure — the second upload had silently overwritten manifest.data_path,
    orphaning the first file. Patched `process_upload` to refuse a second upload
    with HTTP 409 + `error: "biometric_already_collected"` when
    `biometric_collection.is_some()` on the manifest. Operator must explicitly
    POST `/biometric/subject/{id}/erase` first.

    Tests: new `second_upload_without_erase_returns_409` (asserts 409 + manifest
    pointer unchanged + first file untouched on disk). Replaced
    `repeated_uploads_grow_the_chain` with `upload_erase_upload_grows_the_chain_cleanly`
    (covers the legitimate re-collection cycle: chain grows to 3 rows). Updated
    `content_type_with_parameters_accepted` to use 2 distinct subjects (was
    using 1 subject with 2 uploads to test ct parsing — would now 409).

    22/22 biometric_endpoint tests + 59/59 catalogd lib tests green post-patch.

Production posture: gateway needs `cargo build --release -p gateway` +
`systemctl restart lakehouse.service` to pick up the new 409 in live traffic.

Counsel calendar is now the only remaining blocker for first real-photo intake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 06:19:40 -05:00

100 lines
3.0 KiB
Bash
Executable File

#!/usr/bin/env bash
# bundle_counsel_packet — assemble the counsel-review packet tarball.
#
# Specification: docs/counsel/COUNSEL_REVIEW_PACKET_<DATE>.md §9.
#
# Why this exists: the cover note references a list of documents.
# Counsel needs them as a single transmittable artifact, with per-file
# integrity hashes so they can verify nothing changed in transit.
#
# Output:
# reports/counsel/counsel_packet_<DATE>.tar.gz
# reports/counsel/counsel_packet_<DATE>.manifest.txt (sha256 per file)
#
# Usage:
# bundle_counsel_packet.sh [--date YYYY-MM-DD]
#
# Exit codes:
# 0 — packet bundled successfully
# 1 — one or more referenced documents are missing
# 2 — script error (missing tools, write failure)
set -uo pipefail
cd "$(dirname "$0")/../.."
DATE="$(date -u +%Y-%m-%d)"
while [ "$#" -gt 0 ]; do
case "$1" in
--date) DATE="$2"; shift 2 ;;
-h|--help)
sed -n '2,20p' "$0" | sed 's/^# \?//'
exit 0 ;;
*) echo "unknown flag: $1" >&2; exit 2 ;;
esac
done
# Dependency gate.
for cmd in tar sha256sum; do
if ! command -v "$cmd" >/dev/null 2>&1; then
echo "[bundle] FAIL: required tool '$cmd' not found in PATH" >&2
exit 2
fi
done
# Files in the packet. Order is the recommended counsel-review order
# from the cover note §6.
FILES=(
"docs/counsel/COUNSEL_REVIEW_PACKET_${DATE}.md"
"docs/policies/consent/biometric_retention_schedule_v1.md"
"docs/policies/consent/biometric_consent_template_v1.md"
"docs/runbooks/BIPA_DESTRUCTION_RUNBOOK.md"
"docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md"
"docs/runbooks/LEGAL_AUDIT_KEY_ROTATION.md"
"docs/specs/GATE_3B_DEEPFACE_DESIGN.md"
"docs/PHASE_1_6_BIPA_GATES.md"
)
# Verify all referenced files exist before opening the tarball.
MISSING=0
for f in "${FILES[@]}"; do
if [ ! -r "$f" ]; then
echo "[bundle] MISSING: $f" >&2
MISSING=$((MISSING + 1))
fi
done
if [ "$MISSING" -gt 0 ]; then
echo "[bundle] FAIL: $MISSING required documents missing — aborting" >&2
exit 1
fi
OUT_DIR="reports/counsel"
mkdir -p "$OUT_DIR"
TARBALL="$OUT_DIR/counsel_packet_${DATE}.tar.gz"
MANIFEST="$OUT_DIR/counsel_packet_${DATE}.manifest.txt"
# Build the manifest first — counsel uses this to verify integrity.
{
echo "# Counsel Packet Manifest — $DATE"
echo "# Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo "# Each file is listed with its SHA-256 hash. To verify on receipt:"
echo "# tar xzf counsel_packet_${DATE}.tar.gz"
echo "# sha256sum -c counsel_packet_${DATE}.manifest.txt"
echo "# (re-format the lines below with two spaces between hash and path"
echo "# for sha256sum -c compatibility — sha256sum's strict format)"
echo
for f in "${FILES[@]}"; do
sha256sum "$f"
done
} > "$MANIFEST"
# Build the tarball — include the manifest itself.
tar -czf "$TARBALL" "${FILES[@]}" "$MANIFEST"
PACKET_HASH=$(sha256sum "$TARBALL" | awk '{print $1}')
echo "[bundle] packet: $TARBALL"
echo "[bundle] manifest: $MANIFEST"
echo "[bundle] tarball SHA-256: $PACKET_HASH"
echo "[bundle] files: ${#FILES[@]}"