lakehouse

Author	SHA1	Message	Date
root	87b034f5f9	phase 1.6: ops dashboard + consent_versions allowlist + subject timeline tool

Closes the afternoon's "all four" wave (per J's request to do all the items in one pass instead of pick-one-of-options):

(1) Live demo on WORKER-100: full lifecycle exercised end-to-end against the running gateway. 3 audit rows landed in correct order (consent_grant → biometric_collection → consent_withdrawal), chain_verified=true, photo on disk at data/biometric/uploads/WORKER-100/1778011967957907731_027b6bb1.jpg (180 bytes JFIF). retention_until=2026-06-04 (30d from withdrawal per consent template v1 §2).

(2) GET /biometric/stats: read-only aggregate over all subjects. Returns counts by biometric.status + subject.status, photo count, oldest_active_retention_until, and the last 20 state-change events (consent_grant / collection / withdrawal / erasure; validator_lookup and other noise filtered out). Walks per-subject audit logs via the existing writer; cheap for 100 subjects, would want an event-stream index at 100k. Legal-tier auth (same posture as /audit). 4 unit tests.

(3) /biometric/dashboard mcp-server frontend. Auto-refreshes /biometric/stats every 15s, neo-brutalist tile layout for the per-status counts + retention horizon block + recent events table with kind badges + event-kind breakdown pills. sessionStorage-backed token; logout button clears state. DOM-built throughout (textContent + createElement), never innerHTML on audit-row values, since trace_id et al. could in theory carry operator-supplied strings.

(4) consent_versions allowlist. BiometricEndpointState gains `allowed_consent_versions: Option<Arc<HashSet<String>>>`, loaded at startup from /etc/lakehouse/consent_versions.json (override via LH_CONSENT_VERSIONS_FILE). process_consent refuses unknown hashes with HTTP 400 consent_version_unknown when configured.
Resolution semantics:
- Missing file → permissive (v1 compat, warn-log)
- Parse error → permissive (error-log; broken config silently going strict would be worse)
- Empty array → strict, refuse all (deliberate freeze mode for "counsel hasn't signed v1 yet")
- Populated → strict, lowercase-normalized comparison

5 unit tests (known/unknown/case/empty/none-permissive). Example template at ops/consent_versions.example.json with a counsel-tier deployment note.

(5) scripts/staffing/subject_timeline.sh: operator one-shot pretty-print of any subject's full BIPA lifecycle. Curls /audit/subject/{id} with legal token; renders manifest summary + on-disk photo state + chronological audit chain with kind badges + chain verification status. Smoke-tested on WORKER-100 (3 rows verified).

(6) STATE_OF_PLAY.md refresh. New section "afternoon wave" captures all four commits (76cb5ac, 7f0f500, 68d226c, this one) + the live demo evidence + the v1 endpoint matrix + UI/CLI inventory + the production-cutover blocking set (counsel calendar only; eng substrate is done).

Verified live post-restart:
- /audit/health + /biometric/health both 200
- /biometric/stats returns 100 subjects, 2 withdrawn (WORKER-2 from earlier scrum + WORKER-100 from today's demo), 1 photo on record, 6 recent state-change events
- /biometric/intake + /biometric/withdraw + /biometric/dashboard all 200 on mcp-server :3700
- subject_timeline.sh on WORKER-100: chain_verified=true, chain_root=a47563ff937d50de…
- 88/88 catalogd lib tests + 55/55 biometric_endpoint tests green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:27:52 -05:00
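Note on 87b034f5f9, item (4): the resolution semantics listed in that commit can be sketched roughly as below. This is a minimal illustration only, not the code in the repository. `FileState`, `resolve_allowlist` and `is_allowed` are hypothetical names, file I/O and logging are left out, and the `Arc` wrapper is omitted for brevity.

```rust
use std::collections::HashSet;

/// Hypothetical summary of what happened when the allowlist file was read.
/// Modelled without I/O so the four resolution rules can be checked directly.
pub enum FileState {
    /// File does not exist.
    Missing,
    /// File exists but could not be parsed.
    Unparseable,
    /// File parsed into a (possibly empty) list of consent version hashes.
    Parsed(Vec<String>),
}

/// Returns `None` for permissive mode and `Some(set)` for strict mode.
pub fn resolve_allowlist(state: FileState) -> Option<HashSet<String>> {
    match state {
        // Missing file and parse error both stay permissive.
        FileState::Missing | FileState::Unparseable => None,
        // An empty list yields an empty set, which refuses every hash.
        FileState::Parsed(hashes) => Some(
            hashes
                .into_iter()
                .map(|h| h.to_ascii_lowercase())
                .collect(),
        ),
    }
}

/// Lowercase-normalized membership check; permissive when no allowlist is set.
pub fn is_allowed(allowlist: &Option<HashSet<String>>, hash: &str) -> bool {
    match allowlist {
        None => true,
        Some(set) => set.contains(&hash.to_ascii_lowercase()),
    }
}
```

Keeping "permissive" as `None` rather than as a special set value means the empty-array freeze mode needs no extra flag: an empty `Some` set simply contains nothing.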
root	68d226c314	phase 1.6: BIPA withdrawal endpoint + UI + retention sweep timer

Closes the four production gaps that were live after the consent endpoint shipped (76cb5ac):

(1) Withdrawal endpoint. POST /biometric/subject/{id}/withdraw backs the BIPA right of withdrawal that consent template v1 §2 explicitly promises. Without it, the only way to honor a candidate's withdrawal request was the heavier /erase, which destroys immediately rather than starting the 30-day SLA clock that the consent template commits to.

Side-effects:
- manifest.consent.biometric.status: Given → Withdrawn
- manifest.consent.biometric.retention_until: 18mo → 30d
- audit row kind=biometric_consent_withdrawal, captures reason + operator_of_record + evidence_path
- DOES NOT touch general_pii or subject.status; biometric is independently revocable

State machine: Given→Withdrawn (happy), NeverCollected/Pending→409 nothing_to_withdraw, Withdrawn→409 already_withdrawn (won't advance the destruction clock), Expired→409 already_expired, subject Erased/RetentionExpired→403 subject_inactive.

12 new unit tests covering happy path + all guards + a full grant→withdraw cycle that asserts retention_until is correctly accelerated and the audit chain has 2 rows in correct order.

(2) Withdraw UI at /biometric/withdraw (mcp-server-served HTML). 3-screen flow: operator auth (token + name in sessionStorage), withdrawal form (candidate_id + free-text reason ≥10 chars + optional evidence path), confirmation showing the audit row HMAC + the 30-day retention_until clock + a curl recipe for /audit/subject/{id} verification. Same neo-brutalist styling as biometric_intake.html. Mounted at http://localhost:3700/biometric/withdraw and externally at https://devop.live/lakehouse/biometric/withdraw.

(3) Retention sweep systemd timer. crates/catalogd/bin/retention_sweep binary already existed; this commit schedules it. Daily 03:00 UTC, Persistent=true so a missed boot triggers on next start.
Service runs as oneshot with --apply (writes a date-stamped JSONL to data/_catalog/subjects/_retention_sweep_<date>.jsonl ONLY when overdue subjects exist, per the binary's existing semantics). install.sh updated to handle .timer + paired .service correctly: enables the timer, skips direct start of the oneshot service (the timer pulls it in). One-shot manual test confirmed clean: 100 subjects scanned, 0 overdue (all backfill subjects within their 4-year general retention window).

(4) operator_of_record bug fix in intake UI. Previously the page hardcoded the literal string 'intake_ui_operator' as the operator_of_record sent to /consent, meaning every audit row captured the same useless placeholder, defeating the whole point of operator traceability. Fixed by adding an operator name field to the token-paste step (sessionStorage-backed), passed through to consent + photo POSTs as the actual operator.

Verified live post-restart:
- gateway /audit/health + /biometric/health both 200
- mcp-server /biometric/intake + /biometric/withdraw both 200
- Live withdraw probes: 401 (no token), 400 (empty body), 404 (ghost subject), 409 nothing_to_withdraw on WORKER-1 (which is NeverCollected per backfill default); all expected
- Binary strings contain: process_withdraw, withdraw_consent, biometric_consent_withdrawal, biometric_withdraw_response.v1, nothing_to_withdraw, already_withdrawn, already_expired, /subject/{candidate_id}/withdraw route
- systemd: lakehouse-retention-sweep.timer active+enabled, next fire Tue 2026-05-05 22:00 CDT (= 03:00 UTC May 6)
- Manual one-shot of retention sweep service: exit 0/SUCCESS, 100 subjects loaded, 0 overdue

83/83 catalogd lib tests + 46/46 biometric_endpoint tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:09:32 -05:00
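Note on 68d226c314, item (1): the withdrawal state machine described in that commit can be written as a single transition function. The sketch below is illustrative only. The status names, HTTP codes and error strings are taken from the commit message; `SubjectStatus::Active`, the `Refusal` type, the `withdraw` function name, and the choice to check the subject-level guard before the biometric status are all assumptions.

```rust
/// Biometric consent states named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BiometricStatus {
    NeverCollected,
    Pending,
    Given,
    Withdrawn,
    Expired,
}

/// Subject states; `Active` is a hypothetical name for "not erased or expired".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SubjectStatus {
    Active,
    Erased,
    RetentionExpired,
}

/// Hypothetical refusal type: HTTP status plus machine-readable error code.
#[derive(Debug, PartialEq, Eq)]
pub struct Refusal {
    pub http_status: u16,
    pub code: &'static str,
}

/// Returns the new biometric status on success, or the refusal to send back.
pub fn withdraw(
    subject: SubjectStatus,
    biometric: BiometricStatus,
) -> Result<BiometricStatus, Refusal> {
    // Assumption: the subject-level guard is evaluated before the biometric one.
    if matches!(subject, SubjectStatus::Erased | SubjectStatus::RetentionExpired) {
        return Err(Refusal { http_status: 403, code: "subject_inactive" });
    }
    match biometric {
        // Happy path: the only state that can be withdrawn.
        BiometricStatus::Given => Ok(BiometricStatus::Withdrawn),
        BiometricStatus::NeverCollected | BiometricStatus::Pending => {
            Err(Refusal { http_status: 409, code: "nothing_to_withdraw" })
        }
        // Refusing here keeps a repeat request from moving the destruction clock.
        BiometricStatus::Withdrawn => {
            Err(Refusal { http_status: 409, code: "already_withdrawn" })
        }
        BiometricStatus::Expired => {
            Err(Refusal { http_status: 409, code: "already_expired" })
        }
    }
}
```

Because the function is pure, each guard in the commit's state machine maps to one assertion, for example `withdraw(SubjectStatus::Active, BiometricStatus::Pending)` yielding a 409 with `nothing_to_withdraw`.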
root	41b0a99ed2	chore: add real content that was sitting untracked

Surfaced by today's untracked-files audit. None of these are accidents: multiple are referenced by name in CLAUDE.md and memory files but were never added.

Categories:
- docs/PHASE_AUDIT_GUIDE.md (106 LOC): Claude Code phase audit guidance
- ops/systemd/lakehouse-langfuse-bridge.service: Langfuse bridge unit
- package.json: top-level npm manifest
- scripts/e2e_pipeline_check.sh + production_smoke.sh: real test scripts
- reports/kimi/audit-last-week.md: the "Two reports live" CLAUDE.md cites
- tests/multi-agent/scenarios/: 44 staffing scenarios (cutover decision A)
- tests/multi-agent/playbooks/: 102 playbook records
- tests/battery/, tests/agent_test/PRD.md, tests/real-world/: real tests
- sidecar/sidecar/{lab_ui,pipeline_lab}.py: 888 LOC dev-only UIs that remain in service post-sidecar-drop (commit ba928b1 explicitly kept them)

Sensitivity check: scenarios use synthetic company names ("Heritage Foods", "Cornerstone Fabrication"); audit reports describe code findings only; no PII or secrets surfaced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 22:22:10 -05:00
root	21fd3b9c61	Scrum-driven fixes: P5-001 auth wired, P42-001 truth evaluator, P9-001 journal on ingest

Apply the highest-confidence findings from the Phase 0→42 forensic sweep after four scrum-master iterations under the adversarial prompt. Each fix is independently validated by a later scrum iteration scoring the same file higher under the same bar.

Code changes
────────────
P5-001: crates/gateway/src/auth.rs + main.rs
api_key_auth was marked #[allow(dead_code)] and never wrapped around the router, so `[auth] enabled=true` logged a green message and enforced nothing. Now wired via from_fn_with_state, with constant-time header compare and /health exempted for LB probes.

P42-001: crates/truth/src/lib.rs
TruthStore::check() ignored RuleCondition entirely: signature looked like enforcement, body returned every action unconditionally. Added evaluate(task_class, ctx) that actually walks FieldEquals / FieldEmpty / FieldGreater / Always against a serde_json::Value via dot-path lookup. check() kept for back-compat. Tests 14 → 24 (10 new exercising real pass/fail semantics). serde_json moved to [dependencies].

P9-001 (partial): crates/ingestd/src/service.rs
Added Optional<Journal> to IngestState + a journal.record_ingest() call on /ingest/file success. Gateway wires it with `journal.clone()` before the /journal nest consumes the original. First-ever internal mutation journal event verified live (total_events_created 0→1 after probe).
Iter-4 scrum scored these files higher under same prompt:
  ingestd/src/service.rs       3 → 6 (P9-001 visible)
  truth/src/lib.rs             3 → 4 (P42-001 visible)
  gateway/src/auth.rs          3 → 4 (P5-001 visible)
  gateway/src/execution_loop   4 → 6 (indirect)
  storaged/src/federation      3 → 4 (indirect)

Infrastructure additions
────────────────────────
* tests/real-world/scrum_master_pipeline.ts
  - cloud-first ladder: kimi-k2:1t → deepseek-v3.1:671b → mistral-large-3:675b → gpt-oss:120b → devstral-2:123b → qwen3.5:397b (deep final thinker)
  - LH_SCRUM_FORENSIC env: injects SCRUM_FORENSIC_PROMPT.md as adversarial preamble
  - LH_SCRUM_PROPOSAL env: per-iter fix-wave doc override
  - Confidence extraction (markdown + JSON), schema v4 KB rows with: verdict, critical_failures_count, verified_components_count, missing_components_count, output_format, gradient_tier
  - Model trust profile written per file-accept to data/_kb/model_trust.jsonl
  - Fire-and-forget POST to observer /event so by_source.scrum appears in /stats
* mcp-server/observer.ts: unchanged in shape, confirmed receiving scrum events
* ui/: new Visual Control Plane on :3950
  - Bun.serve with /data/{services,reviews,metrics,trust,overrides,findings,file,refactor_signals,search,logs/:svc,scrum_log}
  - Views: MAP (D3 graph, 5 overlays) / TRACE (per-file iter timeline) / TRAJECTORY (refactor signals + reverse index search) / METRICS (explainers with SOURCE + GOOD lines) / KB (card grid with tooltips) / CONSOLE (per-service journalctl tail, tabs for gateway/sidecar/observer/mcp/ctx7/auditor/langfuse)
  - tryFetch always attempts JSON.parse (fix for observer returning JSON without content-type)
  - renderNodeContext primitive-vs-object guard (fix for gateway /health string)
* docs/SCRUM_FIX_WAVE.md: iter-specific scope directing the scrum
* docs/SCRUM_FORENSIC_PROMPT.md: adversarial audit prompt (verdict/critical/verified schema)
* docs/SCRUM_LOOP_NOTES.md: iteration observations + fix-next-loop queue
* docs/SYSTEM_EVOLUTION_LAYERS.md: Layers 1-10 roadmap (trust profiling, execution DNA, drift sentinel, etc)

Measurements across iterations
──────────────────────────────
iter 1 (soft prompt, gpt-oss:120b): mean score 5.00/10
iter 3 (forensic, kimi-k2:1t):      mean score 3.56/10 (−1.44; bar raised)
iter 4 (same bar, post fixes):      mean score 4.00/10 (+0.44; fixes landed)

Score movement iter3→iter4: ↑5 ↓1 =12
21/21 first-attempt accept by kimi-k2:1t in iter 4
20/21 emitted forensic JSON (richer signal than markdown)
16 verified_components captured (proof-of-life, new metric)
Permission Gradient distribution: 0 auto · 16 dry_run · 4 sim · 1 block
Observer loop: by_source {scrum: 21, langfuse: 1985, phase24_audit: 1}
v1/usage: 224 requests, 477K tokens, all tracked

Signal classes per file (iter 3 → iter 4):
  CONVERGING: 1 (ingestd/service.rs; fix clearly landed)
  LOOPING:    4 (catalogd/registry, main, queryd/service, vectord/index_registry)
  ORBITING:   1 (truth; novel findings surfacing as surface ones fix)
  PLATEAU:    9 (scores flat with high confidence; diminishing returns)
  MIXED:      6

Loop thesis status
──────────────────
A file's score rises only when the scrum confirms a real fix landed. No false positives yet across 3 iterations. Fixes applied to 3 files all raised their independent scores under the same adversarial prompt. Loop is measurable, not hand-wavy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:25:43 -05:00
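Note on 21fd3b9c61, P42-001: the commit describes evaluating rule conditions against a JSON context via dot-path lookup. A rough, dependency-free sketch of that idea follows. It is not the code in crates/truth. The `Value` enum is a minimal stand-in for serde_json::Value, the condition variant names come from the commit message but their payload fields are assumptions, `lookup` and `holds` are hypothetical names, and treating a missing field as "empty" is likewise an assumption.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (the commit itself uses serde_json::Value).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Num(f64),
    Str(String),
    Obj(BTreeMap<String, Value>),
}

/// Variant names are from the commit message; the fields are assumed shapes.
pub enum RuleCondition {
    Always,
    FieldEquals { path: String, expected: Value },
    FieldEmpty { path: String },
    FieldGreater { path: String, threshold: f64 },
}

/// Resolve a dot path such as "task.score" by descending through nested objects.
/// Returns `None` as soon as a segment is missing or a non-object is reached.
pub fn lookup<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    path.split('.').try_fold(root, |node, key| match node {
        Value::Obj(map) => map.get(key),
        _ => None,
    })
}

/// Decide whether a single condition holds for the given context.
pub fn holds(cond: &RuleCondition, ctx: &Value) -> bool {
    match cond {
        RuleCondition::Always => true,
        RuleCondition::FieldEquals { path, expected } => {
            lookup(ctx, path) == Some(expected)
        }
        // Assumption: missing, null, and empty-string fields all count as empty.
        RuleCondition::FieldEmpty { path } => match lookup(ctx, path) {
            None | Some(Value::Null) => true,
            Some(Value::Str(s)) => s.is_empty(),
            Some(_) => false,
        },
        // Only numeric fields can satisfy a greater-than comparison.
        RuleCondition::FieldGreater { path, threshold } => {
            matches!(lookup(ctx, path), Some(Value::Num(n)) if n > threshold)
        }
    }
}
```

The point the commit makes is that the condition must actually be consulted: here every variant except `Always` can return false, so a rule no longer passes unconditionally.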
profit	c85c55006d	ops: systemd units for auditor + context7 bridge

Promotes two previously manual-start Bun services to systemd so they survive restarts + run continuously.
- ops/systemd/lakehouse-auditor.service: polls Gitea every 90s, runs 4 audit checks per PR head SHA, posts commit status + review comment. Runs as root to match existing lakehouse-* service conventions on this host; can read /home/profit/.git-credentials (0600 profit:profit).
- ops/systemd/lakehouse-context7-bridge.service: HTTP wrapper on :3900 for Phase 45 doc-drift detection. Decoupled from gateway; runs independently.
- ops/systemd/install.sh: idempotent installer (copy → daemon-reload → enable --now). Prints post-install active/enabled status.
- ops/systemd/README.md: run/stop/logs/pause docs. Pause control stays per-service (bot.paused / auditor.paused files at repo root).

Not wired to branch protection yet: the auditor's commit status is currently advisory, not enforcing. Flip via Gitea branch_protections API when confident.	2026-04-22 04:15:58 -05:00

5 Commits