384 Commits

Author SHA1 Message Date
root
dbcd05c5c5 audit docs: deprecation headers — over-scoped for local-only deployment
Today's PRD-line-70 reframe (everything runs locally) means the audit-trail
docs I drafted earlier this session are over-engineered for J's actual
deployment model. They were sized for SaaS-tier infra (Vault/KMS/S3
Object Lock/dual-control JWT/separate Postgres) — appropriate for a
multi-tenant cloud service, wrong for a single-box local install.

Adding clear deprecation headers so future sessions don't read these
as authoritative and propose another 17-20 day plan involving cloud
infrastructure that would re-violate PRD line 70.

What STAYS valid (preserved in headers):
- The legal use case (John Martinez worked example)
- The IL/IN jurisdictional surface (counsel checklist)
- The Phase 1 + 1.5 discovery findings (PII flow paths file:line)
- Phase 1.6 BIPA gates (when real photos arrive)

What's OVER-SCOPED (flagged in headers):
- The 9-phase implementation plan
- The identity service design (Vault/KMS/dual-control)

Future v2 of these docs needs to be sized for local single-box: a few
hundred LOC of local writers + signed local audit file, not 17-20 days
of distributed-systems design.

No code changes. Just doc-level guardrails for future scope drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:42:05 -05:00
root
5f40b7a312 STATE_OF_PLAY: lock in today's reverts as DO NOT RELITIGATE
Five new entries to prevent today's cleanup from being undone by future
sessions or future PRs that don't read the full context:

- PRD line 70 load-bearing — local-only on customer hot path. PR #13's
  cloud-routing defaults reverted (d054c0b). Cloud is opt-in dev-only.
- /v1/usage by_provider=ollama is the canary. Anything else for
  customer-shape traffic = regression.
- ./scrum is a TOOL, not architecture. Outputs to data/_kb/
  scrum_findings.jsonl. Findings inform dev, do NOT auto-fold into
  design docs.
- Test code in main is actively being cleaned. Today: 12 files / ~2900
  LOC removed (commits 6aafd41 + f4ebd22). Surface more candidates,
  don't auto-delete unless clearly orphaned.

The intent: future me (or future Claude session) reads STATE_OF_PLAY
on cold-start, sees these entries, and doesn't re-make the same
mistakes that drifted scope today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:07:57 -05:00
root
f4ebd2278b remove 7 more orphaned experimental scripts from scripts/
Continuing the test-code-in-main cleanup. These are sequential mode-runner
experiment passes (2/3/4/5) that completed and whose findings were captured
in pathway_memory + the matrix index — the scripts themselves are dead
weight. Plus two one-off probe scripts.

Removed (all 0 refs in production code or automation):
- mode_pass2_corpus_sweep.ts     — 2026-04 corpus sweep experiment
- mode_pass3_variance.ts         — variance measurement run
- mode_pass4_staffing.ts         — staffing-domain pass
- mode_pass5_summarize.ts        — summarization variance
- mode_pass5_variance_paid.ts    — paid-model variance
- overnight_proof.sh             — overnight stress probe (output in logs/)
- ab_t3_test.sh                  — T3 overseer A/B test (output captured in KB)

Verified: 0 references in package.json / justfile / Makefile / any active
.ts/.rs/.sh file. Two mentions remain in docs/recon and
docs/MODE_RUNNER_TUNING_PLAN — those are historical design-doc
references, not consumers.

KEPT in scripts/ (have live consumers OR are runtime tools):
- mode_experiment.ts (14 refs), mode_compare.ts (7 refs)
- lance_smoke.sh, build_*_corpus.ts, staffing_demo.py, lance_tune.py,
  generate_demo.py, generate_workers.py, copilot.py, kb_measure.py,
  kb_staffer_report.py, analyze_chicago_contracts.ts, dump_raw_corpus.sh,
  check_phase44_callers.sh, autonomous_agent.py, build_answers_corpus.ts,
  build_lakehouse_corpus.ts, build_scrum_findings_corpus.ts,
  build_symbols_corpus.ts, e2e_pipeline_check.sh, scale_test.py,
  scale_10m_test.sh, run_staffer_demo.sh, stress_test.py

Build clean. If any of these are needed back: git show HEAD~1 -- scripts/<file>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:06:29 -05:00
root
6aafd41785 remove 5 orphaned dev experiments from tests/real-world/
Per J: "all of our test code ended up in the main." These are 5 one-time
dev experiments that were never wired into any automation and have zero
live consumers in the production code path. Deleting them.

Removed (1418 LOC total):
- enrich_prd_pipeline.ts      (528 LOC) — Phase 21 architecture stress test
- nine_consecutive_audits.ts  (185 LOC) — empirical study of audit compounding
- hard_task_escalation.ts     (267 LOC) — escalation-ladder test (refs retired
                                            cloud models gpt-oss:20b/120b)
- autonomous_loop.ts          (214 LOC) — wrapper experiment around scrum_master
- consensus_reducer_design.ts (224 LOC) — N=3 design consultation; output JSON
                                            referenced from pathway_memory.rs
                                            comment but the script itself has
                                            no consumer

Verified: 0 references in package.json / justfile / Makefile / any
production .ts/.rs/.sh file. The single mention from pathway_memory.rs
is a //! doc comment referencing the JSON output (data/_kb/
consensus_reducer_design_*.json), not the script. Build clean post-delete.

KEPT:
- scrum_master_pipeline.ts — referenced from observer.ts, vectord, scripts
- scrum_applier.ts          — referenced from auditor schemas

If you need any of these back, they're in git history. cherry-pick or
git show HEAD~1 -- tests/real-world/<file>.ts will recover the source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:05:24 -05:00
root
bb5a3b3f5e execution_loop: align overseer log/KB strings with reverted local route
The earlier revert (d054c0b) changed the API CALL from cloud to local
but missed the LogEntry + KB row that record what model fired. Result:
honest API call to qwen3.5:latest, dishonest log/KB rows saying
"claude-opus-4-7". That's a real audit-trail integrity issue — the
record didn't match reality.

Fixed:
- LogEntry "system" role label (line 663)
- KB row's "model" field (line 685)

Both now correctly show "qwen3.5:latest". Build + restart + smoke
10/10 green. Gateway healthy.

Side note: the only remaining "claude-opus-4-7" mentions in this file
are now in COMMENTS describing the v1 cloud route + the revert
rationale — those are documentation, not log fields. Safe to keep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:03:06 -05:00
root
a44ccde845 observer: overseer fallback label → qwen3.5:latest (matches reverted route)
Mirror of the earlier execution_loop overseer revert (commit d054c0b).
The observer logs an "overseer:<model>" endpoint string for analysis;
when row.model is missing it falls back to a hardcoded label. PR #13
set that fallback to "claude-opus-4-7" — but the route now goes to
local Ollama qwen3.5:latest, so the label was wrong.

Trivial one-line fix, no behavior change. Just keeps observer's
endpoint string honest when older rows from the cloud-routing window
get re-analyzed.

End-to-end verification of the local hot path (post-revert):
  BEFORE  /v1/usage by_provider: []
  AFTER   /v1/usage by_provider: [{"k":"ollama","v":2}]
  → /v1/iterate fired 2 chat calls, both to local ollama
  → ZERO cloud requests (no kimi/openrouter/opencode/ollama_cloud)
  → API meter on cloud providers stays at 0 for customer requests

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:01:18 -05:00
root
2f5ca95875 scrum: real tool — auto-bundle current diff, run 3-reviewer scrum, push to KB
J's frustration captured: scrum was a TOOL meant to find gaps in the work
and push findings to a KB that informs how WE work on the code. It became
welded into architecture instead of staying a tool. This commit ships
the tool in the form J actually meant.

Usage:
  ./scrum                       auto-bundle origin/main..HEAD, auto-label
  ./scrum my_label              same with explicit label
  ./scrum --staged              bundle staged-only diff (pre-commit)
  ./scrum --since=COMMIT        bundle from a specific commit

Output:
  KB row    → data/_kb/scrum_findings.jsonl  (one row per scrum run)
  Verdicts  → reports/scrum/_evidence/<date>/verdicts/<label>_*.md

The KB row carries: timestamp, label, diff size, findings count,
convergent count, branch, head SHA, tally excerpt, paths to full
verdicts. Queryable via jq or DuckDB.

Cloud models (opus + kimi-k2 + qwen3-coder) are used here BY DESIGN
— 3-lineage cross-review needs distinct training corpora. This is dev
tooling, not the runtime hot path. PRD line 70 (no cloud APIs) applies
to customer requests, not to J's dev tools.

Tested live: ./scrum --since=HEAD~1 revert_only → 7 findings, 1 convergent
(INFO), all 3 reviewers VERDICT: ship for the revert commit.

KB row written:
  {"label":"revert_only","findings_total":7,"findings_convergent":1,
   "diff_bytes":5806,"branch":"demo/post-pr11-polish-2026-04-28",...}

What scrum does NOT do (intentionally):
- It does NOT auto-fold findings into architecture / design docs
- It does NOT block any commit/push
- It does NOT mutate any code
- It does NOT run as part of normal customer requests

It reports. J reads. J decides what to do with the findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:59:32 -05:00
root
d054c0b8b1 REVERT cloud routing on hot path — back to local Ollama per PRD line 70
PRD line 70: "Everything runs locally — no cloud APIs, total data privacy."
Yesterday's PR #13 (feb638e) violated this by routing customer-facing
inference paths to opencode + ollama_cloud + openrouter. Reverting the
hot-path routes only; cloud providers stay configured in providers.toml
for explicit dev-tool opt-in.

Reverted:
- modes.toml staffing_inference: kimi-k2.6 → qwen3.5:latest (local Ollama)
- modes.toml doc_drift_check: gemini-3-flash-preview → qwen3.5:latest
- execution_loop overseer: opencode/claude-opus-4-7 → ollama/qwen3.5:latest
  Was a paid Anthropic call on every overseer escalation; now local + free.

Gateway compiles + restarts clean. Lance smoke 10/10. Live providers list
unchanged (kimi/ollama_cloud/opencode/openrouter all still CONFIGURED;
they just aren't ROUTED to from the staffing inference path anymore).

This stops the API meter on customer requests. Cloud providers remain
opt-in via explicit provider= caller hint, which the scrum tool +
auditor pipeline + bot/propose use deliberately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:57:20 -05:00
root
0c74b82fc8 phase 1.6 gate 4: REMOVE name → ethnicity / gender inference
Per docs/PHASE_1_6_BIPA_GATES.md Gate 4 + AUDIT_TRAIL_PRD §4 protected-
attribute exclusion rule. The lookup tables + inference functions in
search.html (3375-3499) and console.html (245-311) were dead code in
the rendering path — headshot rendering disabled 2026-04-28 left these
functions defined but unused. Removing them forecloses both Title VII
discriminatory-feature-engineering AND BIPA biometric-information-
derived-from-biometric-identifier arguments.

Removed:
- FEMALE_NAMES, MALE_NAMES, NAMES_HISPANIC, NAMES_BLACK,
  NAMES_SOUTH_ASIAN, NAMES_EAST_ASIAN, NAMES_MIDDLE_EASTERN
- SURNAMES_HISPANIC, SURNAMES_SOUTH_ASIAN, SURNAMES_EAST_ASIAN,
  SURNAMES_MIDDLE_EASTERN, SURNAMES_BLACK
- guessGenderFromFirstName(), guessEthnicityFromName(),
  guessEthnicityFromFirstName(), genderFor()

From both search.html and console.html. Replacement: deprecation
comment block referencing the BIPA gates doc.

Verified: zero live consumers anywhere in mcp-server/. Searched for
genderFor()/guessEthnicityFromName()/guessEthnicityFromFirstName()/
guessGenderFromFirstName() call sites — none remain.

Per J 2026-05-03: this kind of test-code-leaked-into-main is exactly
what J wants cleaned up. The face-pool inference was meant as a
testing tool for synthetic icon generation but ended up as
production-shape inference logic in the customer-facing UI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:48:21 -05:00
root
cd440d4cee audit phase 1.6: BIPA pre-launch gates — block identity-service backfill
Per IDENTITY_SERVICE_DESIGN v3 §5 Step 0, Phase 1.6 is hard
prerequisite to identityd backfill. This doc specifies the 5 gates +
2 supporting deliverables that must ship before real-photo intake.

Five gates (BIPA §15 compliance):
1. Public retention schedule — counsel writes; engineering files+hash
2. Informed written consent — counsel writes template; engineering
   wires identityd consent-status enforcement
3. Photo-upload endpoint with consent enforcement — POST /v1/identity/
   subjects/{id}/photo with hard 403 when biometric_consent_status
   != 'given'; quarantined storage path; deepface output isolated
   to identityd subjects table (not synthetic-face manifest)
4. Deprecate name → ethnicity inference (mcp-server/search.html
   lookup tables removed; Phase 1.5 §1B finding closed)
5. Destruction runbook — operator-facing; ties to identityd
   /erase endpoint with biometric-specific erasure path; daily
   sweep job for biometric_retention_until expiry

Plus:
- Cryptographic attestation that no biometric data exists
  pre-identityd (per v3-B11) — defends against
  infrastructure-as-notice plaintiff argument
- Employee BIPA-handling training acknowledgment

Engineering effort: ~4-5 days (roughly one week to stage everything).
Counsel effort: ~3-6 weeks calendar (review cycles dominate).
Calendar bottleneck is counsel, not engineering.

Phase 1.6 exit = 7 checked gates + signoffs. Until done, identityd
backfill cannot proceed (per identity service design v3 §5 Step 0).

5 open questions for J + counsel: photo-upload UX, consent
mechanism (DocuSign/click/paper), named operator list, named
counsel for sign-off, public privacy policy URL.

No code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:41:29 -05:00
root
8129ddd883 identity service: v3 amendments — second-pass scrum BUILD-WITH-CHANGES
Re-scrummed v2 across opus + kimi + gemini. All 3 verdict:
BUILD-WITH-CHANGES. v1 blockers verified RESOLVED. 12 new v2
findings folded as v3 amendments in §12.

Convergent v2 findings (≥2 reviewers):
v3-A1: mTLS CA root must NOT live in identityd (opus + gemini).
       v3 fix: Vault PKI for CA, identityd as intermediate.
v3-A2: Dual-control public key registry must be tamper-evident
       (opus + gemini). v3 fix: Vault KV with separate access
       policies + server-issued nonces for replay protection.

Single-reviewer v3 amendments (10 more):
- B1: Step 8 fallback-to-SQL needs explicit 14-day time bound
- B2: NER drop-on-detect needs Prometheus alerting
- B3: legal-tier notification transport spec'd (signed Slack/email,
      no PII in body, failure non-blocking)
- B4: Step 6 human review SLA flagged — ~7 months at 500/day for
      ~100k unknown rows; operational decision needed
- B5: Memory zeroing in Go is best-effort (Rust uses zeroize crate);
      documented as not cryptographic-grade
- B6: purpose_definitions needs versioning + emergency revocation
      (purpose_versions + purpose_revocations tables)
- B7: Cache invalidation needs erasure_generation atomicity
      (subjects.erasure_generation int; gateway rejects stale-gen
      cache hits) — replaces best-effort pub/sub
- B8: 15-min cooling-off period for dual-control issuance to
      prevent emergency-bypass culture
- B9: NER calibrated test set with target recall ≥99.5% on
      synthetic adversarial PII
- B10: S3 Object Lock in separate AWS account with write-only IAM;
       root credentials held by external party
- B11: BIPA infrastructure-as-notice attestation in Phase 1.6 doc
- B12: Backup retention vs ciphertext-deletion erasure window
       documented in RTBF runbook

Estimate revised v2 12-15d → v3 17-20d. Worth it — the cost is what
buys "I would build this" from 3 independent senior security
architects across 3 model lineages.

Must-have v3 items (block implementation): A1, A2, B1, B6, B7, B11.
Should-have (ship in Phase 5 if calendar tight): B2-B5, B8-B10, B12.

Re-scrum NOT recommended for v3 — diminishing returns; must-have
items are concrete fixes with clear acceptance criteria.

No code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:39:35 -05:00
root
298fadce41 identity service: v2 — fold cross-lineage scrum findings + 4 'would not build' blocker fixes
Scrummed v1 across opus + kimi + gemini lineages via the new model
fleet. 3/3 reviewers said 'I would NOT build v1 as written.' 4
convergent blockers, all resolved in v2:

1. Migration order wrong — backfill before validation creates dark
   database; if backfill bug, no production traffic catches it.
   v2 inserts BIPA-prereq Step 0 + shadow-write before backfill +
   shadow-read before cutover. 9-step migration with cryptographic
   attestation of completeness at quarantine.

2. Master key on disk + legal token static file = 'security theater'
   per all 3. v2: HashiCorp Vault Transit / AWS KMS for KEK (not
   sealed file). Legal token: split-secret short-lived JWT (max 24h),
   dual-control issuance (J + counsel both sign), revocable in <60s.

3. consent_status='inferred_existing' is BIPA prima facie violation
   (kimi+gemini explicit). v2 backfill uses 'pending_backfill_review';
   biometric data NEVER backfilled — separate consent stream.

4. Healthcare default 'general' = HIPAA exposure window for every
   misclassified subject. v2 default 'unknown' with fail-closed
   routing (treat unknown as healthcare-equivalent until classified
   by manual review). Auto-escalation to healthcare on resume_text
   pattern match.

Plus 12 single-reviewer additions:
- mTLS mandatory between gateway↔identityd (kimi)
- External anchor for audit chain: S3 Object Lock 7-year compliance
  mode, hourly + on-event commits (all 3)
- Audit-log signing key separate from encryption KEK (opus)
- Field-level authorization via purpose_definitions table (kimi)
- Per-row encryption keys deferred to Phase 7 (kimi simplification)
- pii_access_log itself needs legal-tier read auth (opus)
- Synchronous cache invalidation pub/sub on RTBF (opus)
- Outbound NER pass for Langfuse defense-in-depth (opus TOCTOU)
- model_version_hash per decision row (gemini)
- /vertical minimal-disclosure endpoint (kimi HIPAA min-necessary)
- Auto-escalation healthcare on resume_text pattern (kimi)
- Rate limiting + token revocation list (opus)
- Oracle tests in audit_parity.sh (kimi SOC2 CC4.1)

Architecturally simplified per scrum:
- Per-row encryption keys deferred to Phase 7 (single DEK + HSM-
  wrapped KEK + ciphertext deletion is equivalent practical erasure
  with less complexity)
- PDF render deferred (JSON ships first)
- Training-safe export deferred (not critical path)

Estimated effort revised 8-10 → 12-15 days. Worth it — every
addition was a 3/3-reviewer convergent finding.

Re-scrum recommended before implementation starts to verify v2
addresses the v1 blockers.

No code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:36:07 -05:00
root
565ea4b32a audit phase 2: IDENTITY_SERVICE_DESIGN.md — full design doc
Incorporates J's confirmed answers (2026-05-03):
- separate daemon (identityd) on :3225 / :4225
- signed JSON with PDF render for legal export
- legal-only credential separate from admin token
- Langfuse self-hosted (drops cross-border concern)
- EU placeholder fields, not enforced
- healthcare vertical routing — local-only models for healthcare PHI
- training-safe export with hashed pseudonyms

Plus Phase 1 + 1.5 findings + scrum-driven priorities:
- UUID v7 candidate_id (drops kimi enumeration risk)
- per-row encryption with per-subject keys (crypto-erasure target)
- pii_access_log with Merkle-style integrity hash chain (FRE 901)
- subject_id top-level promotion in all JSONL sinks
- Langfuse boundary redaction layer (scrum C2 priority)
- adverse-impact comparator pool in audit response (scrum C3)
- BIPA-specific consent + retention metadata (scrum C4)
- vertical detection at gateway boundary (J answer 10)

Implementation single-language: Go (one identityd, both runtimes call
it via HTTP). Postgres backing store, isolated schema. Master key in
sealed file v1, vault migration path documented.

8-step migration path: stand up empty → backfill from parquet → behind
feature flag → cut over reads incrementally → quarantine PII columns
in workers_500k. Each step its own commit + gate + rollback.

6 open questions for J before implementation: master key location,
Postgres shared vs isolated, vertical backfill default, legal token
issuance procedure, crypto-erasure sweep cadence, EU enforcement
timeline.

Estimated 8-10 working days total. Largest single phase in the audit
program.

No code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:25:40 -05:00
root
fd429f4185 audit phase 1.5: BIPA schema audit + outcomes.jsonl content sample
Two follow-up walks per AUDIT_PHASE_1_DISCOVERY §10/C4 + gemini scrum
flag. Read-only. No code changes.

BIPA findings:
- scripts/staffing/tag_face_pool.py uses deepface to extract gender +
  race + age from face images. Output persists to data/headshots/
  manifest.jsonl. For synthetic faces this is fine; for real candidate
  photos this becomes a regulated biometric database (740 ILCS 14/10).
- mcp-server/index.ts:1408 ComfyUI prompt EXPLICITLY embeds protected
  attributes (age + race + gender) into model prompt — system-level
  encoding of protected-attribute features into AI workflow.
- mcp-server/search.html:3375-3432 has hard-coded FEMALE_NAMES /
  MALE_NAMES / NAMES_HISPANIC / SURNAMES_* lookup tables — name-based
  ethnicity inference. Title VII / disparate-impact risk separate
  from BIPA.
- data/headshots/manifest.jsonl is TRACKED IN GIT today (synthetic
  classifications). For real photos, this would be biometric data
  in version control — serious failure.
- No consent flow, no public retention schedule, no deletion
  procedure, no employee training documented. All required by BIPA
  §15(a)/(b) before real-photo intake.

outcomes.jsonl sample:
- 39/101 rows persist candidate names in fills[*].name field today
- Sample names: "Carmen I. Garcia", "Jamal Z. Jones", "Jacob N. Patel"
  (synthetic but real shape)
- 0 hits for "culture fit" / "communication" / etc proxy phrases —
  synthetic data doesn't generate them. When real models reason about
  real candidates, they will. Append-only persistence makes RTBF
  cryptographic-erasure-only.

Recommends Phase 1.6 (NEW) — BIPA pre-launch gates between Phase 1.5
and Phase 2: BIPA_COMPLIANCE_POLICY.md, consent gate at upload
endpoint, quarantine real-photo classifications to data/biometric/,
deprecate name->ethnicity lookup tables, unit test that synthetic
manifest stays synthetic. 4-8 hours of design + one code commit.

5 open questions for J: where do real photos enter, will deepface
tagging path stay for real photos, consent UX, retention duration
floor, designated privacy officer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:22:53 -05:00
root
64bda21614 audit PRD: J answered 5 open questions — fold into §10, revise phase plan
Conversation 2026-05-03 — J confirmed:
- Photos/video YES → BIPA in full force ($1k-$5k per violation)
- Langfuse self-hosted → drops GDPR Art. 44 cross-border concern
- EU not in scope now but placeholder needed → design EU-compatible
- Healthcare vertical YES → HIPAA BAA needed with model providers,
  PHI redaction at gateway boundary OR local-only routing for those
  requests, vertical-detection at boundary is Phase 2 requirement
- Training/RAG MAY re-run on outcomes → design as if it will, training-
  safe export interface needed, crypto-erasure becomes load-bearing
  evidence chain

§10 updated with answered/pending status per question. New §10.5
"Effect on phase plan" introduces:
- Phase 1.5 (NEW) — BIPA photo/video schema audit + Langfuse boundary
  scoping + outcomes.jsonl content sample, BEFORE Phase 2 design
- Phase 2 design must now include: EU-placeholder fields, vertical
  detection, training-safe export, BIPA consent metadata
- Phase 9 rehearsal must cover discrimination + BIPA + healthcare PHI

3 questions still pending J's call before Phase 2 design ships:
identity service daemon vs in-process, JSON vs signed PDF for legal
export, audit endpoint auth model.

No code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:16:27 -05:00
root
627a5f0c3d audit phase 1: §10 scrum-review findings + walk back §1F over-claim
Ran cross-lineage scrum on the discovery doc with the new model fleet
(opus + kimi-k2.6 + gemini-3-flash via Go gateway :4110, custom
"senior security architect" prompt). 3/3 reviewers responded with
substantive 800-1200 word reviews. Saved at /tmp/audit_scrum/.

5 convergent findings (≥2 reviewers) added as §10/C1-C5:

C1. §1F matrix-indexer "good for audit defensibility" claim is over-
    claimed — walked back in TL;DR. Trace bodies unverified; treat as
    SUSPECTED PII sink until §8.1 sampling completes.

C2. §1E (Langfuse) is the most dangerous leak — fix FIRST, ahead of
    view-routing. Boundary-crossing leak (GDPR Art. 44 / CPRA sale /
    SOC2 disposal). All 3 reviewers converge on this priority.

C3. Discrimination defense requires the FULL CANDIDATE POOL, not just
    fills. EEOC UGESP (1978): need adverse-impact stats on everyone
    who could have been picked. Phase 1 worked example missed this.

C4. BIPA / biometric exposure understated in findings (in PRD §10.5
    but not translated to actionables). $1k-$5k per-violation regime.

C5. candidate_id must be promoted to top-level field in all JSONL
    sinks. Grepping natural-language strings is not defensible audit
    strategy. 3/3 reviewers converge.

11 single-reviewer high-value catches added as §10 single-reviewer
section: opus on LLM provider egress (8th PII path), Art. 22 right-
to-explanation, special-category data, DPIA/ROPA/DPA inventory; kimi
on sequential ID enumeration risk, Langfuse retention config, CCPA
de-identified-in-place vs crypto-shred, Bun common-mode failure,
cryptographic audit-trail integrity (Merkle/FRE 901), HIPAA BAA,
revised SELECT * effort estimate; gemini on data residency, "culture
fit" reasoning proxies, comparator-pool snapshot.

§9 reordered: sample first → defense-layer second → Langfuse
boundary third (was view-routing first per original draft;
boundary-crossing leak is higher priority per scrum).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:13:07 -05:00
root
505ea93726 audit phase 1: discovery walk complete — subject + PII surface map
Read-only walk of both runtimes per AUDIT_TRAIL_PRD.md §8 phase 1.
Fills "UNKNOWN" cells in PRD §3 + §7 with file:line evidence.

Headline findings:
- candidates_safe + workers_safe views EXIST as a defense layer but
  are BYPASSED — tool registry SQL templates query raw tables
- PII traverses 7+ persistence/transmission paths per fill scenario:
  SQL → tool_result → LogEntry → /v1/respond → Langfuse → outcomes.jsonl
  → overseer_corrections.jsonl
- candidate_id is stable but co-located with PII in workers_500k.parquet
  (no separate identity service)
- /audit/subject/{id} endpoint does not exist
- Append-only persistence is universal — RTBF requires crypto-erasure
- Pathway memory is structurally subject-agnostic in fingerprints
  (defensive); trace bodies may leak PII (needs sampling)
- Go side mirrors Rust PII shape — parity in the leak too
- Worked example (John Martinez audit today): NOT POSSIBLE to produce
  complete-and-defensible response

Recommends 4 cheap high-value moves before Phase 2 design starts:
defense-layer enforcement (rewrite 3 SQL templates to _safe views),
sample state.json/Langfuse to confirm pathway memory is clean, walk
Bun mcp-server tool surface, schema-audit for protected-attribute
proxies. None are commitments — J's call.

No code changes in this commit. Companion to AUDIT_TRAIL_PRD.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:04:07 -05:00
root
b2d717ae44 audit PRD: add §10.5 jurisdictional surface (IL + IN, federal, SOC2)
J flagged that the staffing system targets Chicago + Indiana — added a
jurisdictional checklist section to the audit-trail PRD so counsel has
a working starting point.

Covered:
- Federal: Title VII, ADEA, ADA, EEOC, OFCCP, FCRA, Section 1981
- Illinois: BIPA (high risk if any candidate photos), AI Video Interview
  Act (820 ILCS 42), Illinois Human Rights Act (broader than Title VII),
  PIPA breach notification, Day and Temporary Labor Services Act
  (directly applies — staffing industry-specific recordkeeping), Cook
  County + City of Chicago Human Rights Ordinances (additional protected
  classes including source of income, parental status, credit history)
- Indiana: Data Breach Disclosure, Civil Rights Law (lighter than IL),
  Genetic Information Privacy Act
- SOC 2 Type II as the typical SaaS sale gate (Privacy + Security TSCs
  most relevant; 6-9 month effort to first report)
- HIPAA / PCI / ISO 27001 noted as out of current scope but flagged

Phase reordering implications captured:
- BIPA risk on real candidate photos may need to be resolved BEFORE
  audit-trail work (class-action exposure)
- SOC 2 Type II prep runs in parallel, not after
- IL Day and Temporary Labor Services recordkeeping may override our
  proposed 4-year retention SLA

7 open questions added that counsel must answer before the §8 phases
can be locked in. Document is explicit (multiple times) that this is
NOT legal advice — a research-grade checklist for J's counsel
conversation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:56:28 -05:00
root
c170ebc86e docs: AUDIT_TRAIL_PRD — production-readiness gate for staffing client
J flagged that smoke + parity tests prove the surface compiles, NOT
that an audit response can be produced for a specific person — and the
staffing client won't sign without defensible discrimination-claim
response capability.

New docs/AUDIT_TRAIL_PRD.md captures:
- worked example: John Martinez at Warehouse B requests audit
- subject audit response output format (per-decision row schema)
- surface map: where decisions happen today, where the gaps are
- PII handling rules (tokenization, protected-attribute exclusion,
  inferred-attribute risk)
- identity service design intent (separate daemon, audited reads)
- retention + right-to-be-forgotten policy intent
- 9-phase implementation sequence with explicit per-phase exit criteria
- cross-runtime requirement (both Rust + Go must satisfy)
- 7 open questions blocking phase 2+ that need J's call

STATE_OF_PLAY + PRD updated with explicit "production-ready blocker"
section pointing at the new doc. The "substrate is shipped" framing
gets a caveat: substrate ≠ production-ready until audit phase 9 exits.

No code changes. This is the planning artifact J asked for before we
start building.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:54:46 -05:00
root
5368aca4d4 docs: sync ADR-019 + PRD + DECISIONS with 2026-05-02 substrate changes
ADR-019: closed the "re-bench when 10M corpus exists" follow-up. Added
"Follow-up: 10M re-bench (2026-05-02)" section with the post-fix numbers
(search ~20ms warm / ~46ms cold, doc-fetch ~5ms post-btree). Documented
the lance-bench-bypassing-IndexMeta bug + 2-layer fix + gauntlet
(7 unit + 12 sanitize + 10 smoke probes). Reframes the strategic
question as "Lance vs Parquet+HNSW-with-spilling" since HNSW doesn't
fit RAM at 10M.

DECISIONS: added ADR-022 — drop Python sidecar from Rust hot path.
Captures the rationale (236× embed perf gap was pure overhead),
co-shipped LRU cache, dev-only Python that survives, cross-runtime
parity verification, and the operator runbook signal (ps -ef ABSENT
post-deploy).

PRD: updated AI Boundary table line + aibridge crate description to
reflect direct Ollama path (was: Python FastAPI sidecar → Ollama).
Both lines reference ADR-022 for the full rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:44:57 -05:00
root
e9d17f7d5a sanitize: drop over-broad path-missing branch + UTF-8-safe redaction
Re-scrum of yesterday's sanitizer fix surfaced 2 more real bugs in the
fix itself (opus, both WARN, neither caught by kimi/qwen):

W1 (service.rs:1949) — `mentions_path_missing` standalone branch was
too aggressive. A registry-internal error like "/root/.cargo/.../x.rs:
no such file or directory" would 404 because it triggers without
dataset context. That's a real 500. Dropped the standalone branch;
require dataset context AND missing-shape phrase. Lance's actual
"Dataset at path X was not found" still satisfies it.

W2 (service.rs:2018) — `out.push(bytes[i] as char)` corrupted
multi-byte UTF-8 by casting raw bytes to char (only sound for ASCII
< 128). A path containing user-supplied non-ASCII names produced
Latin-1 mojibake. Rewrote redact_paths to track byte indices and
emit unmatched runs as &str slices via push_str(&s[range]) — preserves
multi-byte sequences verbatim. Step advance is now per-char, not
per-byte, via a small utf8_char_len helper.
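
A minimal sketch of the slice-based emission (standalone; assumes the
scanner yields char-boundary byte ranges for the matched paths):

  fn redact_ranges(s: &str, matches: &[std::ops::Range<usize>]) -> String {
      let mut out = String::with_capacity(s.len());
      let mut last = 0;
      for r in matches {
          out.push_str(&s[last..r.start]); // unmatched run, copied verbatim
          out.push_str("[REDACTED]");
          last = r.end;
      }
      out.push_str(&s[last..]); // tail after the final match
      out
  }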

Two new regression tests:
- is_not_found_does_not_match_unrelated_path_missing
- redact_preserves_multibyte_utf8 (uses 工作 + café in input)

12/12 sanitize tests PASS. Smoke 10/10 PASS. Loop closure for opus
re-scrum on the 2026-05-02 fix bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:15:23 -05:00
root
ac7c996596 sweep up scrum WARNs — model const, stale config, temp_path entropy, smoke gate
Four findings deferred from the 2026-05-02 scrum, all 1-5 line fixes:

W1 (kimi WARN @ scrum_master_pipeline.ts:1143) — `gemini-3-flash-preview`
hardcoded twice in MAP and REDUCE phases. Extracted TREE_SPLIT_MODEL +
TREE_SPLIT_PROVIDER constants near the existing config block. Diverging
the two would break tree-split coherence (per-shard digests must come
from the same model the reducer collapses).

W2 (qwen WARN @ providers.toml:30) — stale `kimi-k2:1t` reference in
operator-facing comments after PR #13 noted it's upstream-broken. Reframed
as historical context ("was X here pre-2026-05-03 — that model is broken")
so future operators don't paste-route from the comment.

W3 (opus WARN @ vectord-lance/src/lib.rs:622) — temp_path() entropy was
only pid+nanos, which collide under tokio scheduling when multiple tests
in the same cargo process create temp dirs back-to-back. Added per-process
AtomicU64 sequence counter — guarantees uniqueness regardless of clock.
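
A minimal sketch of the counter (names hypothetical; the real
temp_path keeps pid + nanos and appends the sequence):

  use std::sync::atomic::{AtomicU64, Ordering};

  static TEMP_SEQ: AtomicU64 = AtomicU64::new(0);

  fn temp_path(base: &std::path::Path) -> std::path::PathBuf {
      let nanos = std::time::SystemTime::now()
          .duration_since(std::time::UNIX_EPOCH)
          .unwrap_or_default()
          .as_nanos();
      let seq = TEMP_SEQ.fetch_add(1, Ordering::Relaxed); // never repeats in-process
      base.join(format!("vectord-{}-{}-{}", std::process::id(), nanos, seq))
  }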

W4 (opus INFO @ scripts/lance_smoke.sh:38) — `|| echo '{}'` swallowed
curl transport failures (gateway down, network broken, timeout), surfacing
as misleading "no method field" jq errors at the next probe. Now captures
$? separately, gates a "curl reachable" probe, and only falls back to
empty body for the dependent jq parse. Smoke went 9 → 10 probes.

Verified: vectord-lance 7/7 tests PASS, gateway cargo check clean,
lance_smoke.sh 10/10 PASS against live gateway.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:11:59 -05:00
root
7bb66f08c3 lance: scrum-driven sanitizer + smoke-gate fixes (opus 2026-05-02 BLOCK)
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across"
Cross-lineage scrum on the lance wave (4 bundles, 33 distinct findings)
surfaced 1 real BLOCK and 2 real WARNs from opus that the kimi/qwen
lineages missed. Per feedback_cross_lineage_review.md, opus is the
load-bearing reviewer; cross-lineage convergence is noise unless verified.

BLOCK fix — sanitize_lance_err path-stripping was unsound:
  err.split("/home/").next().unwrap_or(&err)
returns Some("") when err STARTS with "/home/", erasing the entire
message. Replaced truncation with redact_paths() — a hand-rolled scanner
that walks the input once, replacing path-shaped substrings with
[REDACTED] while preserving surrounding error context. Catches:
- absolute paths under /root/.cargo, /home, /var, /tmp, /etc, /usr, /opt
- relative variants (Lance occasionally strips leading slash —
  observed live "Dataset at path home/profit/lakehouse/data/lance/x
  was not found")
- multiple occurrences in one error
- preserves quote/comma/whitespace terminators

WARN fix #1 — is_not_found heuristic was too broad:
  lower.contains("not found")
caught real 500s like "column not found", "field not found in schema".
Narrowed to require dataset-shape phrasing AND exclude the
column/field/schema patterns explicitly.
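
A sketch of the narrowed predicate (exact patterns in service.rs
differ; illustrative only):

  fn is_not_found(err: &str) -> bool {
      let lower = err.to_lowercase();
      let dataset_missing =
          lower.contains("dataset at path") && lower.contains("was not found");
      let excluded = lower.contains("column not found")
          || lower.contains("field not found")
          || lower.contains("not found in schema");
      dataset_missing && !excluded
  }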

WARN fix #2 — lance_smoke.sh `grep -qvE` was an unsound regression gate.
  bash -c "echo '$BODY' | grep -qvE 'pat'"
With -v -q, exits 0 if ANY line lacks the pattern — so a multi-line
body with one leak line + any clean line FALSE-PASSES. Replaced with
the correct "pattern absent" form: `! grep -qE 'pat'`. Also expanded
the pattern set (added /var/, /tmp/) since the scrum surfaced these
as additional leak vectors.

Also unblocks a pre-existing pathway_memory test compile error (stale
PathwayTrace init missing the 6 Mem0-versioning fields added in
6ac7f61). The tests were filled in with sensible defaults — needed to
run sanitize_tests.

10/10 new sanitize tests pass. Smoke 9/9 PASS against rebuilt+restarted
gateway. Live missing-index probe now returns:
  "lance dataset not found: no-such-11205" + HTTP 404
(was: leaked absolute paths + HTTP 500; then leaked absolute and
relative paths after the first fix; clean message + 404 now.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 23:34:54 -05:00
root
a294a61ee4 Merge remote-tracking branch 'origin/main' into demo/post-pr11-polish-2026-04-28
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across"
2026-05-02 22:40:03 -05:00
feb638e4cd infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths (#13)
7 hot-path call sites swapped to Ollama Pro / OpenCode Zen models. All replacements live-probed. Auditor surfaced 2 kimi BLOCKs, both verified false-positive on 2026-05-02. Compiles cleanly in isolation.
2026-05-03 03:39:52 +00:00
root
0af62861d2 STATE_OF_PLAY: refresh for 2026-05-02 wave (Lance gauntlet + parity + housekeeping)
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Anchor was 5 days stale. Adds the 12-commit wave (Lance backend hardening,
sidecar drop, observability parity, gitignore cleanup, gray-zone content
add) with verification status for each. Updates DO NOT RELITIGATE with
the 4 new things this wave makes load-bearing:
- python sidecar dropped from hot path (don't wire it back)
- lance gauntlet shipped (don't re-discover the bugs we just fixed)
- 32/32 cross-runtime parity (don't build a 6th probe for already-covered surface)
- ARCHITECTURE_COMPARISON.md is the single source of truth for cross-runtime decisions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:23:36 -05:00
root
41b0a99ed2 chore: add real content that was sitting untracked
Surfaced by today's untracked-files audit. None of these are accidents —
several are referenced by name in CLAUDE.md and memory files but were
never added.

Categories:
- docs/PHASE_AUDIT_GUIDE.md (106 LOC) — Claude Code phase audit guidance
- ops/systemd/lakehouse-langfuse-bridge.service — Langfuse bridge unit
- package.json — top-level npm manifest
- scripts/e2e_pipeline_check.sh + production_smoke.sh — real test scripts
- reports/kimi/audit-last-week*.md — the "Two reports live" that CLAUDE.md cites
- tests/multi-agent/scenarios/ — 44 staffing scenarios (cutover decision A)
- tests/multi-agent/playbooks/ — 102 playbook records
- tests/battery/, tests/agent_test/PRD.md, tests/real-world/* — real tests
- sidecar/sidecar/{lab_ui,pipeline_lab}.py — 888 LOC dev-only UIs that
  remain in service post-sidecar-drop (commit ba928b1 explicitly kept them)

Sensitivity check: scenarios use synthetic company names ("Heritage Foods",
"Cornerstone Fabrication"); audit reports describe code findings only;
no PII or secrets surfaced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:22:10 -05:00
root
6e34ef7baf gitignore: stop tracking runtime data, logs, build artifacts, scratch
Untracked count was 100+; almost all were data/_*/ daemon state, generated
parquets under data/datasets and data/vectors, the 33GB data/lance/ tree,
node_modules, exports, logs, per-run distillation reports, and test
scratchpads. None of these are content — all regenerate from inputs.

Now down to 33 untracked items, all real content (scripts, systemd unit,
test scenarios, dev-only sidecar UIs, kimi audit reports). Those need
J's call on what to track vs leave parked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:20:14 -05:00
root
044650a1da lance-bench: also build doc_id btree post-IVF — match gateway's migrate behavior
The bench's own measure_random_access_lance uses take(row_position) —
doesn't need the btree. But datasets written by this bench are commonly
queried via /vectors/lance/doc/<name>/<doc_id> downstream, and without
the btree that path falls back to a full table scan. Building inline
keeps bench-produced datasets immediately production-shape and removes
a footgun (the same one that made scale_test_10m's doc-fetch ~100ms
until commit 5d30b3d fixed it via the migrate handler path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:19:16 -05:00
root
5d30b3da89 lance: auto-build doc_id btree in migrate handler (root-cause for 10M doc-fetch slowness)
scale_test_10m doc-fetch p50 was ~100ms — full table scan over 35GB. Root
cause: the auto-build at service.rs:1492-1503 only fires for IndexMeta-
registered indexes during set_active_profile warming. lance-bench writes
datasets through /vectors/lance/migrate/* directly, bypassing IndexMeta,
so its datasets never get the doc_id btree that ADR-019 depends on.

Fix: build the btree inline at the end of lance_migrate. Costs ~1.2s on
10M rows (+269MB on disk), drops doc-fetch from ~100ms to ~5ms (20x).
Failure is non-fatal — logs a warning and the dataset stays queryable.
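
The posture, sketched (helper name hypothetical):

  // End of lance_migrate: best-effort index build, never fail the migrate.
  if let Err(e) = build_doc_id_btree(&dataset).await {
      tracing::warn!("doc_id btree build failed; dataset stays queryable: {e}");
  }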

Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across
5 calls, smoke 9/9 PASS, vectord-lance 7/7 unit tests PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:38:00 -05:00
root
7594725c25 lance backend: 4-pack — bug fix + smoke + tests + 10M re-bench
Some checks failed
lakehouse/auditor 12 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Surfaced by the 2026-05-02 audit (vectord-lance + lance-bench + glue
existed and worked but had no tests, no smoke, leaked server paths
on missing-index search, and the ADR-019 10M re-bench was deferred).

## 1. Fix: missing-index search returned 500 + leaked filesystem path

Pre-fix:
  $ POST /vectors/lance/search/no-such-index
  HTTP 500
  Dataset at path home/profit/lakehouse/data/lance/no-such-index was
  not found: Not found: home/profit/lakehouse/data/lance/no-such-index/
  _versions, /root/.cargo/registry/src/index.crates.io-...-1949cf8c.../
  lance-table-4.0.0/src/io/commit.rs:364:26, ...

Post-fix:
  HTTP 404
  lance dataset not found: no-such-index

Added `sanitize_lance_err()` in crates/vectord/src/service.rs that:
  - maps "not found" / "no such file" patterns → 404 (was 500)
  - strips /home/ and /root/.cargo/ paths from any error body
Applied to all 5 lance handlers: search, get_doc, build_index,
append, migrate. The store_for() handle is cheap-and-stateless;
the actual disk hit happens inside the operation, which is where
the leak originated.
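
A sketch of the handler-facing shape (signature and helper
hypothetical; the scrub here is the simple pattern strip this commit
shipped — later hardened into redact_paths by 7bb66f0 + e9d17f7):

  fn sanitize_lance_err(index: &str, err: &str) -> (u16, String) {
      // Scrub path-shaped substrings before the error leaves the process.
      let scrubbed = strip_paths(err); // hypothetical helper
      let lower = scrubbed.to_lowercase();
      if lower.contains("not found") || lower.contains("no such file") {
          (404, format!("lance dataset not found: {index}"))
      } else {
          (500, scrubbed)
      }
  }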

## 2. scripts/lance_smoke.sh — first regression gate

9-probe smoke against the live HTTP surface. Exercises only read
paths (no state mutation in CI). Specifically locks the sanitizer
fix — a future regression that re-introduces the path leak fires
the smoke immediately. 9/9 PASS against the live :3100 today.

## 3. Unit tests on vectord-lance/src/lib.rs (was: zero tests)

7 tests covering the public LanceVectorStore API:
  - fresh_store_reports_no_state — handle is lazy
  - migrate_then_count_and_fetch — Parquet → Lance round-trip
  - get_by_doc_id_missing_returns_none — Ok(None) vs Err contract
    that lets the HTTP handler return 404 cleanly
  - append_grows_count_and_new_rows_fetchable — ADR-019's
    structural-difference claim verified at the unit level
  - append_dim_mismatch_errors — guards against silently breaking
    search by accepting inconsistent-dim rows
  - search_returns_nearest — exact-vector match → top-1
  - stats_reports_post_migrate_state — locks the field shape

7/7 PASS. cargo test -p vectord-lance --lib green.

## 4. 10M re-bench (deferred from ADR-019)

reports/lance_10m_rebench_2026-05-02.md captures the numbers driven
against the live :3100 over data/lance/scale_test_10m (33GB / 10M
vectors, IVF_PQ confirmed via response method tag).

Headline:
  Search cold (10 diverse queries):   median ~32ms, mean ~46ms
  Search warm (5x same query):        ~20ms p50
  Doc fetch (5x same id):             ~100ms p50

Search latency at 10M is acceptable for batch / async workloads,
too slow for sub-10ms voice/recommendation paths. ADR-019's "Lance
pulls ahead at 10M" claim remains unverified-but-not-refuted — at
this scale HNSW doesn't operationally exist (10M × 768d × 4 bytes =
30GB just for vectors).

Real finding: doc-fetch at 10M is 300x slower than the 100K number
ADR-019 cited (311μs → ~100ms). Likely cause: scalar btree index
on doc_id may not be built for this dataset. Follow-up to
investigate whether forcing build_scalar_index brings it back to
the load-bearing O(1) range. Captured in the report.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:06:56 -05:00
root
98b6647f2a gateway: IterateResponse echoes trace_id + enable session_log_path
Some checks failed
lakehouse/auditor 14 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Closes the 2026-05-02 cross-runtime parity gap: Go's
validator.IterateResponse carried trace_id back to callers; Rust's
didn't. A caller pivoting from response → Langfuse → session log
worked on Go but failed on Rust because the join key wasn't visible
in the response body.

## Changes

crates/gateway/src/v1/iterate.rs:
  - IterateResponse + IterateFailure gain `trace_id: Option<String>`
    (skip-serializing-if-none preserves backward-compat for any
    consumer parsing the response without the field)
  - Both return sites populated with the resolved trace_id
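
A minimal sketch of the field addition (struct trimmed to the new
field; everything else unchanged):

  use serde::Serialize;

  #[derive(Serialize)]
  struct IterateResponse {
      // ... existing fields ...
      #[serde(skip_serializing_if = "Option::is_none")]
      trace_id: Option<String>,
  }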

lakehouse.toml:
  - [gateway].session_log_path set to /tmp/lakehouse-validator/sessions.jsonl
    — same path Go validatord writes to. The two daemons now co-write
    one unified longitudinal log; rows tag daemon="gateway" vs
    daemon="validatord" so producers stay distinguishable in DuckDB
    queries. Append-write is atomic at the row sizes both runtimes
    produce, so concurrent writes from both daemons are safe.

## Verification

Post-restart of lakehouse.service:
  POST /v1/iterate with X-Lakehouse-Trace-Id: rust-fix1-test
    → response.trace_id = "rust-fix1-test" ✓ (was: field absent)
    → sessions.jsonl latest row daemon=gateway, session_id=rust-fix1-test ✓ (was: no row)

Cross-runtime drive — same prompt to Rust :3100 and Go :4110:
  Rust:  trace_id=unified-rust-001, daemon=gateway, accepted
  Go:    trace_id=unified-go-001,   daemon=validatord, accepted
  Same file, distinct daemons, one query covers both:
    SELECT daemon, COUNT(*) FROM read_json_auto('sessions.jsonl', format='nd') GROUP BY daemon
    → gateway: 2, validatord: 19

All 4 parity probes still 6/6 + 12/12 + 4/4 + 2/2 against live
:3100 + :4110 stacks. Cargo test 4/4 PASS for v1::iterate module.

## Architecture invariant

The "unified longitudinal log" thesis is now demonstrated. Operators
running both runtimes in production point both daemons at the same
session_log_path and DuckDB queries naturally span both producers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 06:24:41 -05:00
root
57bde63a06 gateway: trace-id propagation + coordinator session JSONL (Rust parity)
Some checks failed
lakehouse/auditor 10 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Cross-runtime parity with the Go-side observability wave (commits
d6d2fdf + 1a3a82a in golangLAKEHOUSE). The two layers J flagged:
the LIVE per-call view (Langfuse) and the LONGITUDINAL forensic view
(JSONL queryable via DuckDB). Hard correctness gate (FillValidator
phantom-rejection) was already in place; this is the observability
on top.

## Trace-id propagation

X-Lakehouse-Trace-Id header constant declared in
crates/gateway/src/v1/iterate.rs (matches Go's shared.TraceIDHeader
byte-for-byte). When set on an inbound /v1/iterate request, the
handler reuses it; the chat + validate self-loopback hops forward
the same header so chatd's trace emit nests under the parent rather
than minting a fresh top-level trace per call.
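
A minimal sketch of the reuse-or-mint step (http + uuid crates
assumed; the real handler also forwards the header on the loopback
hops):

  const TRACE_ID_HEADER: &str = "X-Lakehouse-Trace-Id";

  fn resolve_trace_id(headers: &http::HeaderMap) -> String {
      headers
          .get(TRACE_ID_HEADER)
          .and_then(|v| v.to_str().ok())
          .map(str::to_owned)
          .unwrap_or_else(|| uuid::Uuid::new_v4().to_string())
  }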

ChatTrace gains a parent_trace_id field. emit_chat_inner skips the
trace-create event when parent is set, only emits the
generation-create which attaches to the existing trace tree. Result:
an iterate session with N retries shows in Langfuse as ONE tree, not
N+1 disconnected traces.

emit_attempt_span (new) writes one Langfuse span per iteration
attempt with input={iteration, model, provider, prompt} and
output={verdict, raw, error}. WARNING level on non-accepted
verdicts. The returned span id is stamped on the corresponding
SessionRecord attempt for cross-log correlation.

## Coordinator session JSONL

crates/gateway/src/v1/session_log.rs — new writer matching Go's
internal/validator/session_log.go schema byte-for-byte:
  - SessionRecord with schema=session.iterate.v1
  - SessionAttemptRecord per retry
  - SessionLogger.append: tokio Mutex serialized append-only
  - Best-effort posture (slog.Warn on error, never blocks request)

iterate.rs builds + appends a row on EVERY code path:
  - accepted: write_session_accepted with grounded_in_roster bool
    derived from validate_workers WorkerLookup (matches Go's
    handlers.rosterCheckFor("fill") semantics)
  - max-iter-exhausted: write_session_failure
  - infra-error: write_infra_error (so a missing /v1/iterate event
    never silently disappears from the longitudinal log)

[gateway].session_log_path config field (empty = disabled).
Production: /var/lib/lakehouse/gateway/sessions.jsonl. Operators who
want a unified longitudinal stream can point both Rust and Go
loggers at the same path — write-append is safe at the row sizes we
produce.

## Cross-runtime parity probe

crates/gateway/src/bin/parity_session_log: tiny stdin/stdout helper
that round-trips a fixture through SessionRecord serde.
golangLAKEHOUSE/scripts/cutover/parity/session_log_parity.sh feeds
4 fixtures through both helpers and diffs the rows after stripping
timestamp + daemon (the two fields that legitimately differ between
producers).

Result: **4/4 byte-equal** including the unicode-prompt fixture
("Café résumé  你好"). Schema parity holds. The non-trivial-equal
guard in the probe rejects the case where both sides fail
identically — protecting against a regression where one side
silently stops producing valid JSON.

## Verification

- cargo test -p gateway --lib: 90/90 PASS (3 new session_log tests
  including concurrent-append safety)
- cargo check --workspace: clean
- session_log_parity.sh: 4/4 fixtures byte-equal
- Both runtimes can append to the same path; DuckDB sees one stream
- The Go-side validatord smoke remains 5/5 (unchanged)

## Architecture invariant

Don't propose to "wire trace-id propagation in Rust" or "add Rust
session log" — both are now shipped on the demo/post-pr11-polish
branch. The longitudinal log + Langfuse tree together cover the
multi-call observability concern J flagged 2026-05-02.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 05:39:29 -05:00
root
ba928b1d64 aibridge: drop Python sidecar from hot path; AiClient → direct Ollama
Some checks failed
lakehouse/auditor 11 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
The "drop Python sidecar from Rust aibridge" item from the
architecture_comparison decisions tracker. Universal-win cleanup —
removes 1 process + 1 runtime + 1 hop from every embed/generate
request, with no behavior change.

## What was on the hot path before

  gateway → AiClient → http://:3200 (FastAPI sidecar)
                          ├── embed.py    → http://:11434 (Ollama)
                          ├── generate.py → http://:11434
                          ├── rerank.py   → http://:11434 (loops generate)
                          └── admin.py    → http://:11434 (/api/ps + nvidia-smi)

The sidecar's hot-path code (~120 LOC across embed.py / generate.py /
rerank.py / admin.py) was pure pass-through: each route translated
its request body to Ollama's wire format and returned Ollama's
response in a sidecar envelope. Zero logic, one full HTTP hop of
overhead.

## What's on the hot path now

  gateway → AiClient → http://:11434 (Ollama directly)

Inline rewrites in crates/aibridge/src/client.rs:
- embed_uncached: per-text loop to /api/embed; computes dimension
  from response[0].length (matches the sidecar's prior shape; see the
  sketch after this list)
- generate (direct path): translates GenerateRequest → /api/generate
  (model, prompt, stream:false, options:{temperature, num_predict},
  system, think); maps response → GenerateResponse using Ollama's
  field names (response, prompt_eval_count, eval_count)
- rerank: per-doc loop with the same score-prompt the sidecar used;
  parses leading number, clamps 0-10, sorts desc
- unload_model: /api/generate with prompt:"", keep_alive:0
- preload_model: /api/generate with prompt:" ", keep_alive:"5m",
  num_predict:1
- vram_snapshot: GET /api/ps + std::process::Command nvidia-smi;
  same envelope shape as the sidecar's /admin/vram so callers keep
  parsing
- health: GET /api/version, wrapped in a sidecar-shaped envelope
  ({status, ollama_url, ollama_version})
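
A sketch of the per-text embed hop (reqwest with the json feature +
serde_json + anyhow assumed; field names per Ollama's documented
/api/embed shape):

  async fn embed_one(
      client: &reqwest::Client,
      model: &str,
      text: &str,
  ) -> anyhow::Result<Vec<f32>> {
      let resp: serde_json::Value = client
          .post("http://localhost:11434/api/embed")
          .json(&serde_json::json!({ "model": model, "input": text }))
          .send()
          .await?
          .json()
          .await?;
      let row = resp["embeddings"][0]
          .as_array()
          .ok_or_else(|| anyhow::anyhow!("no embedding in response"))?;
      // dimension falls out of row.len(), mirroring the response[0].length check
      Ok(row.iter().filter_map(|v| v.as_f64()).map(|f| f as f32).collect())
  }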

Public AiClient API is unchanged — Request/Response types untouched.
Callers (gateway routes, vectord, etc.) require zero updates.

## Config changes

- crates/shared/src/config.rs: default_sidecar_url() bumps to
  :11434. The TOML field stays `[sidecar].url` for migration compat
  (operators with existing configs don't need to rename anything).
- lakehouse.toml + config/providers.toml: bumped to localhost:11434
  with comments explaining the 2026-05-02 transition.

## What stays Python

sidecar/sidecar/lab_ui.py (385 LOC) + pipeline_lab.py (503 LOC) are
dev-mode Streamlit-shape UIs for prompt experimentation. Not on the
runtime hot path; continue running for ad-hoc work. The
embed/generate/rerank/admin routes inside sidecar can be retired,
but operators who want to keep the sidecar process running for the
lab UI face no breakage — those routes still call Ollama and work.

## Verification

- cargo check --workspace: clean
- cargo test -p aibridge --lib: 32/32 PASS
- Live smoke against test gateway on :3199 with new config:
    /ai/embed     → 768-dim vector for "forklift operator" ✓
    /v1/chat      → provider=ollama, model=qwen2.5:latest, content=OK ✓
- nvidia-smi parsing tested via std::process::Command path
- Live `lakehouse.service` (port :3100) NOT yet restarted — deploy
  step is operator-driven (sudo systemctl restart lakehouse.service)

## Architecture comparison update

(Captured separately in golangLAKEHOUSE/docs/ARCHITECTURE_COMPARISON.md
decisions tracker.) The "drop Python sidecar" line moves from _open_
to DONE. The Rust process model now has 1 mega-binary instead of
1 mega-binary + 1 sidecar process — a small but real reduction in
ops surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:59:47 -05:00
root
654797a429 gateway: pub extract_json + parity_extract_json bin (cross-runtime probe)
Some checks failed
lakehouse/auditor 10 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Supports the 2026-05-02 cross-runtime parity probe at
golangLAKEHOUSE/scripts/cutover/parity/extract_json_parity.sh which
feeds identical model-output strings through both runtimes' extract_json
and diffs results.

## Changes
- crates/gateway/src/v1/iterate.rs: extract_json gains `pub` + a
  comment pointing at the Go counterpart and the parity probe path
- crates/gateway/src/lib.rs: NEW thin lib facade re-exporting the
  modules so sub-binaries can reuse them. main.rs is unchanged
  (still uses local mod declarations)
- crates/gateway/src/bin/parity_extract_json.rs: NEW ~30-LOC binary
  that reads stdin, calls extract_json, prints {matched, value} JSON

## Probe result (logged in golangLAKEHOUSE)
12/12 match across fenced blocks, nested objects, unicode, escaped
quotes, top-level array, malformed JSON. Both runtimes' algorithms
are genuinely equivalent.

The substrate gate the probe enforces: `cargo test -p gateway extract_json`
must PASS before any parity comparison runs. So a future divergence in
the live extract_json fires either as a Rust test failure (live
behavior changed) or a probe diff (Go behavior changed) — never
silently.
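
The probe loop itself is tiny. A TypeScript sketch of the harness idea
(the real probe is the shell script above; the binary paths and the
fixture list here are placeholders, not the real 12 cases):

```ts
import { execSync } from "node:child_process";

// Placeholder fixtures; the real probe's 12 cases cover fenced blocks,
// nested objects, unicode, escaped quotes, arrays, and malformed JSON.
const fixtures: string[] = [
  '```json\n{"a": 1}\n```',
  '{"nested": {"b": [1, 2]}}',
  "not json at all",
];

// Feed the identical string to both runtimes' extract_json, diff output.
function runProbe(rustBin: string, goBin: string): boolean {
  let ok = true;
  for (const input of fixtures) {
    const rust = execSync(rustBin, { input }).toString().trim();
    const go = execSync(goBin, { input }).toString().trim();
    if (rust !== go) {
      console.error(`DIVERGED on ${JSON.stringify(input)}\n  rust=${rust}\n  go=${go}`);
      ok = false;
    }
  }
  return ok;
}

process.exit(runProbe("./parity_extract_json", "./go_extract_json") ? 0 : 1);
```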

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 04:44:11 -05:00
root
c5654d417c docs: pointer to ARCHITECTURE_COMPARISON.md source in golangLAKEHOUSE
Some checks failed
lakehouse/auditor 18 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Per J's request: the parallel-runtime comparison is a living source
file maintained at /home/profit/golangLAKEHOUSE/docs/ARCHITECTURE_COMPARISON.md.
This commit adds a pointer file under the Rust repo's docs/ so the
comparison is discoverable from either side.

Doesn't contain authoritative content — just the link + a quick
status summary + update guidance ('source lives in golangLAKEHOUSE,
don't drift two copies').
2026-05-01 04:57:09 -05:00
root
150cc3b681 aibridge: LRU embed cache - 236x RPS gain on warm workloads. Per architecture_comparison.md universal-win for Rust side. Cache key (model,text), default 4096 entries, in-process inside gateway. Load test: 128 RPS -> 30k+ RPS, p50 78ms -> 129us.
Some checks failed
lakehouse/auditor 20 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
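
Per the parameters above (cache key (model, text), 4096 entries), a
TypeScript sketch of the shape for illustration (the real cache is
Rust, in-process inside the gateway; everything beyond the commit's
stated parameters is assumed):

```ts
// LRU keyed on (model, text). A JS Map iterates in insertion order,
// so deleting + re-setting a key moves it to the "most recent" end.
class EmbedCache {
  private map = new Map<string, number[]>();
  constructor(private capacity = 4096) {}

  private key(model: string, text: string): string {
    return `${model}\u0000${text}`; // NUL separator avoids collisions
  }

  get(model: string, text: string): number[] | undefined {
    const k = this.key(model, text);
    const hit = this.map.get(k);
    if (hit !== undefined) {
      this.map.delete(k); // refresh recency
      this.map.set(k, hit);
    }
    return hit;
  }

  put(model: string, text: string, vec: number[]): void {
    const k = this.key(model, text);
    this.map.delete(k);
    this.map.set(k, vec);
    if (this.map.size > this.capacity) {
      // Evict least-recently-used: the first key in insertion order.
      this.map.delete(this.map.keys().next().value!);
    }
  }
}
```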
2026-05-01 04:45:20 -05:00
root
9eed982f1a mcp-server: /_go/* pass-through for G5 cutover slice
Adds an opt-in pass-through that routes Bun mcp-server requests
to the Go gateway when GO_LAKEHOUSE_URL is set. /_go/v1/embed,
/_go/v1/matrix/search etc. flow through Bun frontend → Go
backend without touching any existing tool. Off-by-default
(empty GO_LAKEHOUSE_URL → 503 with rationale); enabled via
systemd drop-in at:
  /etc/systemd/system/lakehouse-agent.service.d/go-cutover.conf

This is the first slice of real Bun-fronted traffic hitting the
Go substrate. The /api/* pass-through (Rust gateway) and every
existing tool are unmodified — fully additive cutover step.

Reversible: unset GO_LAKEHOUSE_URL or remove the systemd drop-in
and restart lakehouse-agent.service.
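
A hedged sketch of the pass-through handler shape (route wiring and
error copy simplified; GO_LAKEHOUSE_URL is the real switch):

```ts
// Opt-in /_go/* pass-through: forward to the Go gateway when configured.
const GO_LAKEHOUSE_URL = process.env.GO_LAKEHOUSE_URL ?? "";

async function goPassthrough(req: Request): Promise<Response> {
  if (!GO_LAKEHOUSE_URL) {
    // Off by default: 503 with a rationale instead of a silent failure.
    return Response.json(
      { error: "pass-through disabled: GO_LAKEHOUSE_URL is not set" },
      { status: 503 },
    );
  }
  const url = new URL(req.url);
  // /_go/v1/embed -> <GO_LAKEHOUSE_URL>/v1/embed, method + body preserved.
  const target = GO_LAKEHOUSE_URL + url.pathname.replace(/^\/_go/, "") + url.search;
  const body =
    req.method === "GET" || req.method === "HEAD"
      ? undefined
      : await req.arrayBuffer();
  return fetch(target, { method: req.method, headers: req.headers, body });
}
```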

Verified end-to-end against persistent Go stack on :4110:
  /_go/health → {"status":"ok","service":"gateway"}
  /_go/v1/embed → nomic-embed-text-v2-moe vectors (dim=768)
  /_go/v1/matrix/search → 3/3 Forklift Operators (role+geo match)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 03:44:10 -05:00
root
3d068681f5 distillation: regenerated acceptance + audit reports (run_hash refresh)
Some checks failed
lakehouse/auditor 17 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Phase 6 acceptance + Phase 8 full-audit reports re-run; bit-for-bit
reproducibility property still holds (run 1 hash == run 2 hash),
just at a new value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:13:17 -05:00
root
8de94eba08 cleanup: bump qwen2.5 → qwen3.5:latest in active defaults
Some checks failed
lakehouse/auditor 16 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
stronger local rung is now the small-model-pipeline tier-1 default
across both Rust legacy + Go rewrite (cf. golangLAKEHOUSE phase 1).
same JSON-clean property as qwen2.5, more capacity. ollama still
serves both side-by-side; rollback is a 4-line revert if a workload
regresses.

active-default sites:
- lakehouse.toml [ai] gen_model + rerank_model → qwen3.5:latest
- mcp-server/observer.ts diagnose call (Phase 44 /v1/chat path) → qwen3.5:latest
- mcp-server/index.ts model roster doc → qwen3.5:latest first
- crates/vectord/src/rag.rs ContinuableOpts + RagResponse.model → qwen3.5:latest

skipped: execution_loop/mod.rs comments describing historic qwen2.5
tool_call quirks — those are documentation of past behavior, not
active defaults. data/_catalog/profiles/*.json are runtime-generated
(gitignored), not in scope for tracked changes.

cargo check -p vectord: clean. no behavioral change in the audit
pipeline — same JSON-clean local model, same think=Some(false)
posture, just stronger upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 00:10:57 -05:00
root
a00e9bb438 infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths
Some checks failed
lakehouse/auditor 2 blocking issues: State field rename likely incomplete — `opencode_key` may not exist on `self.state`
Ollama Pro plan went live today (39-model fleet on the same
OLLAMA_CLOUD_KEY) and OpenCode Zen was already wired in the gateway
but not consumed. Routing every gpt-oss call site to faster /
stronger replacements:

| Site | gpt-oss → replacement | Why |
|---|---|---|
| ollama_cloud default | gpt-oss:120b → deepseek-v3.2 | newest DeepSeek revision; live-probed `pong` |
| openrouter default | openai/gpt-oss-120b:free → x-ai/grok-4.1-fast | already the scrum LADDER's PRIMARY |
| modes.toml staffing_inference | openai/gpt-oss-120b:free → kimi-k2.6 | coding-specialized, on Ollama Pro |
| modes.toml doc_drift_check | gpt-oss:120b → gemini-3-flash-preview | speed leader for factual checks |
| scrum_master_pipeline tree-split MAP+REDUCE | gpt-oss:120b → gemini-3-flash-preview | latency-dominated path (5-20× per file) |
| bot/propose.ts CLOUD_MODEL | gpt-oss:120b → deepseek-v3.2 | same Ollama key, faster |
| mcp-server/observer.ts overseer label fallback | gpt-oss:120b → claude-opus-4-7 | matches new overseer model |
| crates/gateway/src/execution_loop overseer escalation | ollama_cloud/gpt-oss:120b → opencode/claude-opus-4-7 | frontier reasoning matters here — fires only after local self-correct fails twice; Zen pay-per-token cost is bounded |

Verification:
- `cargo check -p gateway --tests` — clean
- Live probes through localhost:3100/v1/chat:
  - `opencode/claude-opus-4-7` → "pong"
  - `gemini-3-flash-preview` (ollama_cloud) → "pong"
  - `kimi-k2.6` (ollama_cloud) → "pong"
  - `deepseek-v3.2` (ollama_cloud) → "Pong! 🏓"

Notes:
- kimi-k2:1t still upstream-broken (HTTP 500 on Ollama Pro probe today,
  matches yesterday's memory). Replacement table never picks it.
- The Rust changes need a `systemctl restart lakehouse.service` to
  take effect on the running gateway. TS callers reload on next run.
- aibridge/src/context.rs still has gpt-oss:{20b,120b} in its window-
  size lookup table; harmless and kept for callers that pass it
  explicitly as an override.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:13:48 -05:00
root
d475fc7fff infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths
2026-04-28 06:13:30 -05:00
root
f4dc1b29e3 demo: search.html — Live Market explainer rewrite + fp-bar viewport-paint + compact contract cards
Some checks failed
lakehouse/auditor 18 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Four UI changes landing together since they all polish Section ① and
Section ② of the public demo:

1. Section ① (Live Market — Chicago) explainer rewritten data-source-
   first ("Live from City of Chicago Open Data...") with bolded dial
   names so a skimmer can map the visual to the prose. Drops the
   "internal calendar" jargon and the slightly-overclaiming "rest of
   the page is reacting" framing — downstream sections read the same
   feed but don't react to the per-shift filter, so the new copy says
   "this row is its heartbeat" instead.

2. Fill-probability bar gets a left-to-right paint reveal (clip-path
   inset animation) so the green→gold→orange→red gradient reads as a
   *timeline growing* instead of a static heatmap with a "danger zone"
   at the right. Followed by a 30%-wide shimmer sweep on a 3.4s loop
   for live-signal feel.

3. Paint trigger moved from on-render to IntersectionObserver — by
   the time the user scrolls to Section ② the on-render animation had
   already finished. Now each bar paints in over 2.8s when it enters
   the viewport (threshold 0.2, 350ms entry delay). Single shared
   observer, unobserve()s after firing so the watch list trends to
   zero (sketch after this list).

4. Contract cards now compact-by-default with click-to-expand. New
   summary strip shows revenue / margin / fill-by-1wk / top candidate
   so scanners get the punchline without expanding. Click anywhere on
   the card surface (excluding inner content) to expand the full FP
   curve, economics grid, candidates list, and Project Index. Project
   Index auto-opens with the parent card so users actually find the
   build signals — but only on user-driven expand (avoiding 20× OSHA
   scrapes on page load). grid-template-rows: 0fr → 1fr animation
   handles the smooth height transition.

All four animations honor prefers-reduced-motion.
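
Sketch of the shared observer from item 3 (the .fp-bar selector and
the painted class are illustrative, not the real markup):

```ts
// One shared observer for every fill-probability bar.
const reduceMotion =
  window.matchMedia("(prefers-reduced-motion: reduce)").matches;

const observer = new IntersectionObserver(
  (entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      const bar = entry.target as HTMLElement;
      // 350ms entry delay, then the CSS clip-path reveal (~2.8s) runs.
      setTimeout(() => bar.classList.add("painted"), 350);
      observer.unobserve(bar); // fire once; watch list trends to zero
    }
  },
  { threshold: 0.2 },
);

for (const bar of document.querySelectorAll<HTMLElement>(".fp-bar")) {
  if (reduceMotion) bar.classList.add("painted"); // jump to final state
  else observer.observe(bar);
}
```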

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
f892230699 demo: search.html UX polish — skeleton loader, card-in stagger, hero takeover, B&W faces
Search results no longer pop in as a single block. New behavior:

- Skeleton list pre-claims the vertical space results will occupy
  with shimmering placeholder cards, so arriving results fade in
  over the skeleton instead of pushing layout. Sweep is staggered
  per row for a "rolling wave" not "everything blinking together".
- Domain-language stage caption ("matching against permits",
  "ranking by reliability") rotates on a fixed schedule so users
  read progress, not a stuck spinner (sketch after this list).
- @keyframes card-in: real worker cards rise 4px and fade in over
  350ms with nth-child stagger across the first ~12 rows. Honors
  prefers-reduced-motion.
- Avatar imgs filter through grayscale + slight contrast/blur to
  pull the SDXL Turbo color cast (which screams "AI generated" at
  small sizes). Cert icons get the same treatment.
- Once-per-session hero takeover compresses the Section ⓪ strip
  ("Not a CRM — an index that learns from you") into a centered
  hero on first paint, dismissed by clicking anywhere. Stats
  hydrate from live endpoints.
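
The caption rotation from the second bullet reduces to a timer; a
minimal sketch (stage strings and timing here are illustrative):

```ts
// Domain-language stage captions rotate on a fixed schedule while
// results load, so users read progress instead of a stuck spinner.
const STAGES = [
  "matching against permits",
  "ranking by reliability",
  "checking certifications",
];

function startStageCaption(el: HTMLElement): () => void {
  let i = 0;
  el.textContent = STAGES[0];
  const timer = setInterval(() => {
    i = (i + 1) % STAGES.length;
    el.textContent = STAGES[i];
  }, 1200);
  return () => clearInterval(timer); // call when results arrive
}
```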

console.html: mirrors the avatar B&W filter for visual consistency,
and removes the headshot insertion entirely — back to monogram
initials. The console (internal staffer view) doesn't need synthetic
faces; the public demo at /lakehouse/ does.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
4b92d1da91 demo: icon recipe pipeline + role-aware portraits + ComfyUI negative-prompt override
Adds two single-source-of-truth recipe files that drive both the
hot-path render server and the offline pre-render scripts:

- role_scenes.ts: per-role-band scene clauses (clothing + backdrop).
  Forklift operators look like forklift operators instead of
  collapsing to interchangeable studio shots. SCENES_VERSION mixes
  into the headshot cache key so a coordinator tweak refreshes every
  matching face on next view.
- icon_recipes.ts: cert / role-prop / status / hazard / empty icons
  with deterministic per-recipe seeds + fuzzy text resolver.
  ICONS_VERSION suffix on the cached file means edits don't
  overwrite in place — misfires are recoverable.

Routes (mcp-server/index.ts):
- GET /headshots/_scenes — exposes SCENES + version to the
  pre-render script so prompts don't drift between batch and hot-path.
- GET /icons/_recipes — same idea for icons.
- GET /icons/cert?text=... — resolves free-text cert names to a
  recipe and 302s to the rendered icon. 404 (not 500) when no recipe
  matches so the front-end can hang `onerror="this.remove()"` (sketch
  after this list).
- GET /icons/render/{category}/{slug} — cache-or-render at 256² (8
  steps) for crisper edges than 512² when downsampled to 14px.
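
A hedged sketch of the cert-resolver route shape (the recipe entries
below are made up; the real table lives in icon_recipes.ts with
per-recipe seeds and the ICONS_VERSION suffix):

```ts
// Hypothetical recipe table for illustration only.
const CERT_RECIPES: { slug: string; aliases: string[] }[] = [
  { slug: "cert/osha10", aliases: ["osha 10", "osha-10"] },
  { slug: "cert/forklift", aliases: ["forklift", "powered industrial truck"] },
];

function resolveCertRecipe(text: string): string | null {
  const needle = text.trim().toLowerCase();
  const hit = CERT_RECIPES.find((r) => r.aliases.some((a) => needle.includes(a)));
  return hit ? hit.slug : null;
}

function handleCertIcon(req: Request): Response {
  const text = new URL(req.url).searchParams.get("text") ?? "";
  const slug = resolveCertRecipe(text);
  // 404 (not 500) so the front-end's onerror handler can quietly
  // remove the <img> when no recipe matches.
  if (!slug) return new Response("no matching icon recipe", { status: 404 });
  return new Response(null, {
    status: 302,
    headers: { location: `/icons/render/${slug}` },
  });
}
```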

ComfyUI portrait support (scripts/serve_imagegen.py):
The editorial workflow had `human, person, face` baked into its
negative prompt — actively sabotaging portraits. _comfyui_generate
now accepts negative_prompt/cfg/sampler/scheduler overrides, and
those mix into the cache key so portrait calls don't collapse into
hero-shot cache hits.

scripts/staffing/render_role_pool.py: pre-renders the role-aware
face pool by reading SCENES from /headshots/_scenes — single source
of truth verified at run time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
1745881426 staffing: face pool fetch preserves prior tags + --shrink gate + atomic manifest write
fetch_face_pool was wiping 952 hand-classified rows when re-run from
a Python environment without deepface installed (it reset every gender
to None).
Now:

- Loads existing manifest by id and overlays only fetch-owned fields,
  so gender/race/age/excluded survive a refetch.
- deepface pass tags only records that don't already have a gender;
  deepface unavailable means "leave existing tags alone" not "reset".
- New --shrink flag required to drop ids >= --count. Default refuses
  to shrink the pool silently.
- Atomic write via tmp + os.replace so an interrupted run can't
  corrupt the manifest (sketch after this list).
- Dedupes duplicate id lines (root cause of the 2497-row manifest
  backing a 1000-face pool).
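
The overlay + atomic-write pattern, sketched in TypeScript for
illustration (the actual script is Python; the row fields and the
fetch-owned set are illustrative):

```ts
import { existsSync, readFileSync, renameSync, writeFileSync } from "node:fs";

type Row = { id: string; url?: string; gender?: string; race?: string };

// Fields the fetch step may overwrite; hand-classified tags survive.
const FETCH_OWNED = new Set(["url"]);

function mergeManifest(path: string, fetched: Row[]): void {
  const existing = new Map<string, Row>();
  if (existsSync(path)) {
    for (const line of readFileSync(path, "utf8").split("\n")) {
      if (!line.trim()) continue;
      const row: Row = JSON.parse(line);
      existing.set(row.id, row); // last line wins: dedupes duplicate ids
    }
  }
  for (const row of fetched) {
    const prior = existing.get(row.id) ?? { id: row.id };
    for (const k of FETCH_OWNED) (prior as any)[k] = (row as any)[k];
    existing.set(row.id, prior);
  }
  // Atomic write: a crash mid-write leaves the old manifest intact.
  const tmp = path + ".tmp";
  writeFileSync(tmp, [...existing.values()].map((r) => JSON.stringify(r)).join("\n") + "\n");
  renameSync(tmp, path);
}
```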

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
a05174d2fa ops: track tif_polygons.ts orphan import
entity.ts imports findTifDistrict from ./tif_polygons.js but the
source file was never committed — only present in the working tree.
Adding it so a fresh clone compiles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
f9a408e4c4 Surname → ethnicity routing + ComfyUI fallback for sparse pool buckets + cache-buster
Three problems J flagged ("not matching properly", "same faces", "still
showing old icons") had three different roots:

1. MISMATCH: front-end was first-name only, so "Anna Cruz" / "Patricia
   Garcia" / "John Jimenez" all defaulted to caucasian. Added
   SURNAMES_HISPANIC / _SOUTH_ASIAN / _EAST_ASIAN / _MIDDLE_EASTERN
   dicts to both search.html and console.html. Surname is checked
   FIRST (stronger signal for hispanic + asian than first names),
   then first-name fallback. Cruz → hispanic, Patel → south_asian,
   Nguyen → east_asian, regardless of first name (sketch after this list).

2. SAME FACES: pool buckets are uneven — woman/south_asian=3,
   man/black=4, woman/middle_eastern=2 — so any worker in those
   buckets collapses to 2-4 photos no matter how good the hash is.
   /headshots/:key now 302-redirects to /headshots/generate/:key
   when the gender × race intersection is below 30 faces. ComfyUI
   on-demand gives infinite uniqueness for the sparse buckets
   (deterministic-per-worker via djb2 seed). Dense buckets still
   serve from the pool — no GPU cost there.

3. STALE CACHE: Cache-Control was max-age=86400, immutable — pinned
   old photos in browsers for 24h after any server-side update.
   Dropped to max-age=3600, must-revalidate, and added a v=2
   cache-buster query param to all front-end /headshots/ URLs so
   existing cached entries are bypassed on next page load.
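
Sketch of the surname-first resolution from item 1 (dictionaries
trimmed to a few entries; helper names are illustrative):

```ts
// Illustrative subsets; the real dictionaries in search.html /
// console.html are much larger.
const SURNAMES_HISPANIC = new Set(["cruz", "garcia", "jimenez", "torres", "rivera"]);
const SURNAMES_SOUTH_ASIAN = new Set(["patel", "sharma"]);
const SURNAMES_EAST_ASIAN = new Set(["nguyen", "chen", "kim"]);
const SURNAMES_MIDDLE_EASTERN = new Set(["omar", "hassan"]);

// Pre-existing first-name heuristic (see the earlier headshot commits);
// stubbed here so the sketch stands alone.
function guessEthnicityFromFirstName(_first: string): string {
  return "caucasian"; // confident default when no bucket matches
}

function ethnicityHint(fullName: string): string {
  const parts = fullName.toLowerCase().split(/\s+/);
  const surname = parts[parts.length - 1];
  // Surname FIRST: a stronger signal than first name for these buckets.
  if (SURNAMES_HISPANIC.has(surname)) return "hispanic";
  if (SURNAMES_SOUTH_ASIAN.has(surname)) return "south_asian";
  if (SURNAMES_EAST_ASIAN.has(surname)) return "east_asian";
  if (SURNAMES_MIDDLE_EASTERN.has(surname)) return "middle_eastern";
  return guessEthnicityFromFirstName(parts[0]); // first-name fallback
}
```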

Also surfacing X-Face-Pool-Bucket / Bucket-Size headers for diagnosis.

Verified: playwright run shows surname routing correct (Torres,
Rivera, Alvarez, Gutierrez, Patel, Nguyen, Omar all bucketed
correctly), sparse buckets 302 to ComfyUI, dense buckets stay on
the thumb pool.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
a3b65f314e Synthetic face pool — 1000 StyleGAN headshots, ComfyUI hot-swap, 60x smaller thumbs
Worker cards now ship a real photo per person instead of monogram tiles:

  - fetch_face_pool.py pulls 1000 faces from thispersondoesnotexist.com
  - tag_face_pool.py runs deepface for gender/race/age, excludes <22yo
  - manifest.jsonl: 952 servable, gender/race buckets populated
  - /headshots/_thumbs/ pre-resized to 384px webp (587KB -> 11KB,
    60x smaller; without this Chrome's parallel-connection budget
    drops ~75% of tiles in a 40-card grid)
  - /headshots/:key gender x race x age intersection bucketing with
    gender-only fallback when intersection is sparse
  - /headshots/generate/:key ComfyUI on-demand for the contractor
    profile spotlight (cold ~1.5s, cached ~1ms; worker-derived
    djb2 seed makes faces deterministic-per-worker but unique
    across workers sharing the same prompt)
  - serve_imagegen.py _cache_key() now includes seed (was caching
    by prompt only -> 3 different worker seeds collapsed to 1
    cached image; verified fix produces 3 distinct md5s)
  - confidence-default name resolution: Xavier->man+hispanic,
    Aisha->woman+black, etc. Every worker resolves to a bucket.

End-to-end: playwright run on /?q=forklift+operators+IL -> 21/21
cards loaded, 0 broken, all 384px webp.

Cache + binary pool gitignored; manifest tracked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 06:01:04 -05:00
root
10ed3bc630 demo: real synthetic headshots — fetch pool + serve route + UI wire
Three layers shipped:

1. SCRIPT — scripts/staffing/fetch_face_pool.py
   Pulls N synthetic StyleGAN faces from thispersondoesnotexist.com
   into data/headshots/face_NNNN.jpg, writes manifest.jsonl. Idempotent:
   re-running skips existing files. Optional gender tagging via deepface
   (currently unavailable on this box; the script handles ImportError
   gracefully and tags everything as untagged). Fetched 198 faces with
   concurrency=3 in ~67s.

2. SERVER — /headshots/:key route in mcp-server/index.ts
   Loads manifest at first hit, caches in globalThis._faces. Hashes the
   key with djb2-style mixing → pool index → returns the JPG. Same
   key always gets the same face (deterministic). Accepts
   ?g=man|woman&e=caucasian|black|hispanic|south_asian|east_asian|middle_eastern
   to bias pool selection — the gender/ethnicity buckets fall back to
   the full pool when no tagged matches exist. Cache-Control:
   max-age=86400, immutable so faces ride the browser cache after
   first hit. The hash-to-bucket selection is sketched after this list.
   /headshots/__reload re-reads the manifest without restart.

3. UI — search.html + console.html worker cards
   Re-added overlay <img> on top of the monogram .av circle. img.src
   = /headshots/<encoded-key>?g=<hint>&e=<hint>. img.onerror removes
   the failed image so the monogram stays visible if the face pool
   isn't fetched / CDN is blocked. .av now has overflow:hidden +
   position:relative to clip the img to a perfect circle.
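
The deterministic selection from layer 2, sketched (djb2 per the
commit; the manifest row shape and function names are assumptions):

```ts
// djb2-style hash: the same key always lands on the same pool index,
// so a worker's face is stable across page loads.
function djb2(key: string): number {
  let h = 5381;
  for (let i = 0; i < key.length; i++) {
    h = ((h << 5) + h + key.charCodeAt(i)) >>> 0; // h*33 + c, unsigned
  }
  return h;
}

// Hypothetical manifest row shape; real rows come from manifest.jsonl.
type Face = { file: string; gender?: string; ethnicity?: string };

function pickFace(pool: Face[], key: string, g?: string, e?: string): Face {
  // Bias toward the tagged gender/ethnicity bucket; fall back to the
  // full pool when no tagged matches exist.
  const bucket = pool.filter(
    (f) => (!g || f.gender === g) && (!e || f.ethnicity === e),
  );
  const candidates = bucket.length > 0 ? bucket : pool;
  return candidates[djb2(key) % candidates.length];
}
```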

Forced-confident name resolution (J: "we're CREATING the profile,
created as though you truly have the information Xavier is more
likely Hispanic and he's a male"):

   genderFor(name)        — looks up MALE_NAMES + FEMALE_NAMES,
                            falls back to a deterministic hash split
                            so unknown names spread ~50/50. Sets now
                            include cross-cultural names: Alejandro/
                            Andres/Mateo/Santiago/Joaquin/Cesar/Hugo/
                            Felipe/Gerardo/Salvador/Ramon (Hispanic),
                            Raj/Anil/Vikram/Krishna/Pradeep (South
                            Asian), Wei/Yi/Hiroshi/Akira/Hyun (East
                            Asian), Demetrius/Kareem/DaQuan/Khalil
                            (Black), Omar/Khalid/Hassan/Ahmed/Bilal
                            (Middle Eastern). FEMALE_NAMES extended
                            in parallel.

   guessEthnicityFromFirstName(name)
                          — confident default of 'caucasian' for any
                            name not in the cultural buckets so every
                            worker resolves to a category the face
                            pool can be biased toward. Order: ME → Black
                            → Hispanic → South Asian → East Asian →
                            Caucasian (matters where names overlap,
                            e.g. Aisha appears in ME + Black, biases
                            toward ME for visual fit).

   Both helpers also ported into console.html so the triage backfills
   and try-it-yourself rendering get the same hint stack.

Privacy note in the script + route comments: the synthetic data uses
the worker's name as the seed; production should hash worker_id (not
name) to avoid leaking PII to a third-party CDN. The fetch URL itself
is referenced once per pool build, not per-worker.

.gitignore — added data/headshots/face_*.jpg (~100MB for 198 faces;
the manifest + script are tracked). Re-running the script on a fresh
checkout rebuilds the pool from scratch.

Verified end-to-end via playwright on devop.live/lakehouse:
   forklift query → 10 worker cards
   10/10 with face images (real synthetic headshots, not monograms)
   0/10 broken
   Alejandro G. Nelson  → ?g=man&e=hispanic
   Patricia K. Garcia    → ?g=woman&e=caucasian
   Each name → unique face, deterministic across loads.
   Console triage backfills get the same treatment.
2026-04-28 06:01:04 -05:00