phase 1.6 BIPA: scrum-driven fixes
Per 2026-05-03 phase_1_6_bipa_gates scrum (13 findings, 0 convergent).
1 BLOCK verified false positive, 4 real fixes shipped:
False positive (verified):
- opus BLOCK on attest:55 — claimed `set -uo pipefail` without `-e`
makes the post-python3 `if [ $? -ne 0 ]` check unreachable. Verified
WRONG: `X=$(false); echo $?` prints 1. Bash propagates command-
substitution exit through $? on the assignment line. The check IS
the python3 exit gate. Inline comment added to the script noting
the false positive so future scrums don't re-flag.
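The bash behaviour described above can be checked mechanically. The following is a minimal sketch, assuming `bash` is available on `PATH`; the helper name `status_after_assignment` is hypothetical and is not part of the attestation script. It reuses the script's `set -uo pipefail` prologue (no `-e`), assigns from a command substitution, and reports what `$?` holds on the following line.

```python
import subprocess


def status_after_assignment(command: str) -> int:
    """Return the value of $? right after `X=$(command)` in bash.

    Hypothetical helper for illustration only; it mirrors the script's
    `set -uo pipefail` prologue, which deliberately omits -e.
    """
    script = "\n".join([
        "set -uo pipefail",
        f"X=$({command})",
        "echo $?",
    ])
    result = subprocess.run(
        ["bash", "-c", script],
        capture_output=True,
        text=True,
        check=True,  # the wrapper script exits 0 because `echo` succeeds
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print(status_after_assignment("false"))
    print(status_after_assignment("true"))
```

This holds for a plain assignment, which is what the script uses. With `local X=$(...)` or `export X=$(...)`, `$?` reflects the builtin rather than the substitution, so the gate would only become unreachable if the assignment were rewritten into one of those forms.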
Real fixes:
1. opus WARN attestation:18 — schema fingerprint hashed names ONLY,
missing column-type changes. A column repurposed to hold base64
photo bytes under its existing name would pass undetected. Now
hashes "name<TAB>type<TAB>nullable=bool" per row. Re-run produced
evidence SHA-256 1fdcc9f1... (vs old 230fffeb..., reflecting the
broader fingerprint scope).
2. opus WARN gate_4_test:60 — definition regex didn't catch
object-literal property forms (`const t = { FEMALE_NAMES: [...] }`)
or TypeScript class fields (`class L { public NAMES_X: string[] = [] }`).
Added two new patterns + a regression test
(Gate 4: object-literal and class-field bypasses are caught) that
exercises 5 bypass forms. 4/4 tests green; 1 minor regex tweak
needed mid-fix to handle single-line class bodies.
3. kimi WARN python3-reliance — script assumed pyarrow installed and
would emit a stack trace into the attestation if not. Added
`python3 -c "import pyarrow"` gate at top with clean install
instructions on failure.
4. opus INFO PHASE_1_6:200 — item 7 (training) silently dropped from
blocking set with bare "deferred" rationale. Now explicitly states
the deferral is conditional on small operator population (J + 1-2
named ops); item 7 re-promotes to blocking if population grows.
⚖ COUNSEL marker added.
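To illustrate why fix 1 matters, here is a minimal sketch comparing a names-only fingerprint with a name + type + nullability fingerprint. The column tuples, the helper names, and the newline-joined hash input are assumptions made for illustration; the attestation script's exact hash input is not shown here, so these digests are not expected to reproduce the SHA-256 values quoted above.

```python
import hashlib
from typing import List, Tuple

# (name, type, nullable) per column; values are illustrative only.
Column = Tuple[str, str, bool]


def names_only_rows(columns: List[Column]) -> List[str]:
    """Old behaviour: one row per column, name only."""
    return [name for name, _dtype, _nullable in columns]


def typed_rows(columns: List[Column]) -> List[str]:
    """New behaviour: name<TAB>type<TAB>nullable=bool per column."""
    return [f"{name}\t{dtype}\tnullable={nullable}" for name, dtype, nullable in columns]


def fingerprint(rows: List[str]) -> str:
    """SHA-256 over newline-joined rows (joining scheme is an assumption)."""
    return hashlib.sha256("\n".join(rows).encode("utf-8")).hexdigest()


before = [("worker_id", "int64", True), ("resume_text", "string", True)]
# Same column names, but resume_text has been repurposed to hold bytes.
after = [("worker_id", "int64", True), ("resume_text", "binary", True)]

if __name__ == "__main__":
    print("names-only unchanged:",
          fingerprint(names_only_rows(before)) == fingerprint(names_only_rows(after)))
    print("typed changed:",
          fingerprint(typed_rows(before)) != fingerprint(typed_rows(after)))
```

A type-only change leaves the names-only digest identical, while the typed digest moves, which is the class of evasion the broader fingerprint is meant to surface.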
Skipped (acceptable as ⚖ COUNSEL placeholders by design):
- kimi WARN consent template:30-day-SLA (counsel decides number)
- kimi WARN consent template:email-placeholder (counsel supplies)
- kimi WARN parquet absence (env override exists; redeployment-aware)
- kimi INFO runbook manual-erasure (marked TODO when /erase ships)
- qwen INFO doc path/status nits (already addressed by file moves)
Tests: 4/4 Gate 4 absence test (incl. new bypass-coverage), 3/3
attestation evidence checks pass on live data.
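The bypass forms covered by the new Gate 4 regression test can also be replayed outside the test suite. The sketch below is a Python port of the two added patterns, written for illustration only: the real patterns live in the TypeScript `definitionPatternsFor`, and the `SYMBOLS` list and `find_offenders` helper here are assumptions loosely modelled on the names that appear in the test, not the project's actual symbol list or function signature.

```python
import re
from typing import List

# Python translations of the two patterns added in this commit.
OBJECT_LITERAL = r"(?:^|[,{])\s*%s\s*:\s*(?:\[|\{|new\s+(?:Set|Map|Array)\b)"
CLASS_FIELD = (
    r"(?:^|[{};])\s*(?:public|private|protected|readonly|static\s+)*"
    r"\s*%s\s*:\s*[^=;{]+=\s*[\[\{]"
)

# Hypothetical symbol list, taken from the names used in the bypass test.
SYMBOLS = [
    "FEMALE_NAMES", "NAMES_BLACK", "NAMES_HISPANIC",
    "SURNAMES_BLACK", "NAMES_EAST_ASIAN",
]


def find_offenders(source: str) -> List[str]:
    """Return every symbol that one of the two new patterns flags as defined."""
    offenders = []
    for symbol in SYMBOLS:
        for template in (OBJECT_LITERAL, CLASS_FIELD):
            # re.escape is a no-op for these identifiers; kept for safety.
            if re.search(template % re.escape(symbol), source, re.MULTILINE):
                offenders.append(symbol)
                break
    return offenders


BYPASS_FORMS = [
    'const tables = { FEMALE_NAMES: ["Maria"] };',
    'const lookups = { NAMES_BLACK: new Set(["X"]) };',
    "const all = {\n  NAMES_HISPANIC: [...],\n};",
    'class Lookup {\n  SURNAMES_BLACK: string[] = ["X"];\n}',
    "class L { public NAMES_EAST_ASIAN: string[] = []; }",
]

if __name__ == "__main__":
    for form in BYPASS_FORMS:
        print(find_offenders(form), "<-", form.splitlines()[0])
```

A bare reference such as `console.log(FEMALE_NAMES.length);` is not flagged by either pattern, which matches the stated intent that only definition forms trip the gate.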
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 4708717f6b
commit c7aa607ae4
@@ -199,7 +199,17 @@ PLUS:
 | 6 | Cryptographic attestation pre-identityd | DONE — `scripts/staffing/attest_pre_identityd_biometric_state.sh` + `docs/attestations/BIPA_PRE_IDENTITYD_ATTESTATION_2026-05-03.md` (3/3 evidence checks pass; signature lines pending) | pending signature | **eng-DONE, signature-pending** |
 | 7 | Employee training material | scaffold deferred — Gate 5 runbook §7 acknowledgment may serve as substrate | pending | **deferred** |
 
-Until items 1-5 + 6 are checked off, **identity service backfill (Phase 2 §5 Step 5) cannot proceed.**
+**Blocking set for Phase 2 backfill:** items **1, 2, 3, 4, 5, 6** must
+all be DONE. Item 7 (employee training) is reduced from blocking to
+"deferred" because the Gate 5 destruction runbook §7 already requires
+operator acknowledgment before legal-tier credentials are issued —
+that acknowledgment is procedurally equivalent to the training-record
+requirement when the operator population is small (J + 1-2 named
+operators). If the operator population grows beyond that, item 7
+re-promotes to blocking and a separate training program must be authored.
+
+⚖ COUNSEL — confirm whether item 7 deferral is acceptable for the
+expected operator population size, or restore it to the blocking set.
 
 **Calendar bottleneck:** Items 1, 2, 5, 6 (and #7) await counsel
 review of the engineering scaffolds. Gate 3 (photo-upload endpoint)
@@ -22,27 +22,27 @@ tamper-evident store (filesystem with backups + version control).
 **Schema columns** (18 total):
 
 ```
-worker_id
-name
-role
-email
-phone
-city
-state
-zip
-skills
-certifications
-archetype
-reliability
-responsiveness
-engagement
-compliance
-availability
-communications
-resume_text
+worker_id int64 nullable=True
+name string nullable=True
+role string nullable=True
+email string nullable=True
+phone string nullable=True
+city string nullable=True
+state string nullable=True
+zip int64 nullable=True
+skills string nullable=True
+certifications string nullable=True
+archetype string nullable=True
+reliability double nullable=True
+responsiveness double nullable=True
+engagement double nullable=True
+compliance double nullable=True
+availability double nullable=True
+communications string nullable=True
+resume_text string nullable=True
 ```
 
-**Schema SHA-256:** `4ba17870ce25a186a62bdfc29a3b336947dc2fba8a62c42ca249c81f41d32e30`
+**Schema SHA-256:** `973b9abe56420de8f88122278b633e813f90a64cf0ddaac6a9811dc0940be676`
 
 - PASS: no biometric / photo / face / image column present
 
@@ -81,7 +81,7 @@ No biometric identifiers or biometric information from real
 candidates have been collected, processed, or stored prior to
 the deployment of the Phase 1.6 BIPA pre-launch gates.
 
-**Evidence SHA-256:** `230fffeb77b502717bcd7161cc74d5a3401b8722acc8d6ed3d524f93e261cd0b`
+**Evidence SHA-256:** `1fdcc9f1682de27e1a0556d698ce221b74c1e71cf54128763828b4bca7b5c1bf`
 
 ---
 
@@ -65,6 +65,12 @@ function* walkSource(dir: string): Generator<string> {
 // definitionPatternsFor: returns regexes that match common DEFINITION
 // forms in JS/TS/HTML embedded scripts. A bare reference inside a
 // comment is intentionally NOT matched.
+//
+// 2026-05-03 opus scrum WARN (gate_4_test:60) added object-literal +
+// class-field patterns: a developer wrapping the lookup tables in
+// `const tables = { FEMALE_NAMES: [...] }` or a TypeScript class
+// field `FEMALE_NAMES: string[] = [...]` would have bypassed the
+// original 4 patterns silently.
 function definitionPatternsFor(symbol: string): RegExp[] {
   return [
     // var / const / let SYMBOL =
@@ -73,8 +79,16 @@ function definitionPatternsFor(symbol: string): RegExp[] {
     new RegExp(`\\bfunction\\s+${symbol}\\s*\\(`),
     // SYMBOL = function( OR SYMBOL = (...) =>
     new RegExp(`\\b${symbol}\\s*=\\s*(?:function\\s*\\(|\\([^)]*\\)\\s*=>|async\\s*(?:\\(|function))`),
-    // class member: SYMBOL(...) { (a method declaration)
+    // class method: SYMBOL(...) {
     new RegExp(`^\\s*${symbol}\\s*\\([^)]*\\)\\s*\\{`, "m"),
+    // object-literal property assigned to an array OR object value:
+    // { SYMBOL: [...] } or SYMBOL: {...} or SYMBOL: new Set(...)
+    new RegExp(`(?:^|[,{])\\s*${symbol}\\s*:\\s*(?:\\[|\\{|new\\s+(?:Set|Map|Array)\\b)`, "m"),
+    // TypeScript / class field with type annotation. The boundary
+    // before SYMBOL is start-of-line OR `{`/`;`/`}` so single-line
+    // class bodies (`class L { public NAMES_X: string[] = []; }`)
+    // are caught alongside multi-line ones.
+    new RegExp(`(?:^|[{};])\\s*(?:public|private|protected|readonly|static\\s+)*\\s*${symbol}\\s*:\\s*[^=;{]+=\\s*[\\[\\{]`, "m"),
   ];
 }
 
@@ -128,3 +142,29 @@ test("Gate 4: regex catches a synthetic positive (defense in depth)", () => {
   expect(offenders.some((o) => o.includes("NAMES_HISPANIC"))).toBe(true);
   expect(offenders.some((o) => o.includes("guessEthnicityFromFirstName"))).toBe(true);
 });
+
+// 2026-05-03 opus scrum WARN regression: the bypass forms a developer
+// might use to wrap the lookup tables without tripping the original
+// four patterns. All of these MUST trip a definition regex.
+test("Gate 4: object-literal and class-field bypasses are caught", () => {
+  const bypassForms = [
+    // Inline object-literal property → array
+    `const tables = { FEMALE_NAMES: ["Maria"] };`,
+    // Object property → Set/Map constructor
+    `const lookups = { NAMES_BLACK: new Set(["X"]) };`,
+    // Multi-line object literal
+    `const all = {\n NAMES_HISPANIC: [...],\n};`,
+    // TypeScript class field with type annotation + initializer
+    `class Lookup {\n SURNAMES_BLACK: string[] = ["X"];\n}`,
+    // TypeScript public field
+    `class L { public NAMES_EAST_ASIAN: string[] = []; }`,
+  ];
+  for (const synthetic of bypassForms) {
+    const offenders = findOffenders("bypass_synthetic", synthetic);
+    if (offenders.length === 0) {
+      throw new Error(
+        `Gate 4 bypass not caught — pattern would slip past:\n ${synthetic}`,
+      );
+    }
+  }
+});
@@ -34,6 +34,17 @@
 set -uo pipefail
 cd "$(dirname "$0")/../.."
 
+# Dependency gate: pyarrow is required to read the parquet schema. Fail
+# fast with a clear message rather than letting python3 -c emit a stack
+# trace that gets captured into the attestation as "evidence". (Caught
+# 2026-05-03 kimi scrum WARN python3-reliance.)
+if ! python3 -c "import pyarrow" 2>/dev/null; then
+  echo "[attest] FAIL: python3 -c 'import pyarrow' failed." >&2
+  echo "[attest] pyarrow is required to verify workers_500k.parquet schema." >&2
+  echo "[attest] Install with: pip install pyarrow" >&2
+  exit 2
+fi
+
 DATE="${OVERRIDE_DATE:-$(date -u +%Y-%m-%d)}"
 OUT_DIR="docs/attestations"
 OUT="$OUT_DIR/BIPA_PRE_IDENTITYD_ATTESTATION_${DATE}.md"
@@ -62,12 +73,20 @@ if [ ! -r "$WORKERS_PARQUET" ]; then
   rm -f "$EVIDENCE"
   exit 2
 fi
+# Hash NAME + TYPE + nullability per column, not just names. A schema
+# fingerprint over names alone would not invalidate if a column got
+# repurposed (e.g. resume_text reused to hold base64 photo bytes under
+# its existing name). Including types catches that class of evasion.
+# (Caught 2026-05-03 opus scrum WARN on attestation:18.)
 SCHEMA=$(python3 -c "
 import sys, pyarrow.parquet as pq
 schema = pq.read_schema('$WORKERS_PARQUET')
 for f in schema:
-    print(f.name)
+    print(f'{f.name}\t{f.type}\tnullable={f.nullable}')
 " 2>&1)
+# Bash assigns + propagates the substitution's exit through \$?.
+# Verified: X=\$(false); echo \$? -> 1. opus 2026-05-03 BLOCK on this
+# location was a false positive — the check IS the python3 exit gate.
 if [ $? -ne 0 ]; then
   echo "[attest] FAIL: schema read error: $SCHEMA" >&2
   rm -f "$EVIDENCE"