371 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
fd429f4185 |
audit phase 1.5: BIPA schema audit + outcomes.jsonl content sample
Two follow-up walks per AUDIT_PHASE_1_DISCOVERY §10/C4 + gemini scrum flag. Read-only. No code changes. BIPA findings: - scripts/staffing/tag_face_pool.py uses deepface to extract gender + race + age from face images. Output persists to data/headshots/ manifest.jsonl. For synthetic faces this is fine; for real candidate photos this becomes a regulated biometric database (740 ILCS 14/10). - mcp-server/index.ts:1408 ComfyUI prompt EXPLICITLY embeds protected attributes (age + race + gender) into model prompt — system-level encoding of protected-attribute features into AI workflow. - mcp-server/search.html:3375-3432 has hard-coded FEMALE_NAMES / MALE_NAMES / NAMES_HISPANIC / SURNAMES_* lookup tables — name-based ethnicity inference. Title VII / disparate-impact risk separate from BIPA. - data/headshots/manifest.jsonl is TRACKED IN GIT today (synthetic classifications). For real photos, this would be biometric data in version control — serious failure. - No consent flow, no public retention schedule, no deletion procedure, no employee training documented. All required by BIPA §15(a)/(b) before real-photo intake. outcomes.jsonl sample: - 39/101 rows persist candidate names in fills[*].name field today - Sample names: "Carmen I. Garcia", "Jamal Z. Jones", "Jacob N. Patel" (synthetic but real shape) - 0 hits for "culture fit" / "communication" / etc proxy phrases — synthetic data doesn't generate them. When real models reason about real candidates, they will. Append-only persistence makes RTBF cryptographic-erasure-only. Recommends Phase 1.6 (NEW) — BIPA pre-launch gates between Phase 1.5 and Phase 2: BIPA_COMPLIANCE_POLICY.md, consent gate at upload endpoint, quarantine real-photo classifications to data/biometric/, deprecate name->ethnicity lookup tables, unit test that synthetic manifest stays synthetic. 4-8 hours of design + one code commit. 5 open questions for J: where do real photos enter, will deepface tagging path stay for real photos, consent UX, retention duration floor, designated privacy officer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
64bda21614 |
audit PRD: J answered 5 open questions — fold into §10, revise phase plan
Conversation 2026-05-03 — J confirmed: - Photos/video YES → BIPA in full force ($1k-$5k per violation) - Langfuse self-hosted → drops GDPR Art. 44 cross-border concern - EU not in scope now but placeholder needed → design EU-compatible - Healthcare vertical YES → HIPAA BAA needed with model providers, PHI redaction at gateway boundary OR local-only routing for those requests, vertical-detection at boundary is Phase 2 requirement - Training/RAG MAY re-run on outcomes → design as if it will, training- safe export interface needed, crypto-erasure becomes load-bearing evidence chain §10 updated with answered/pending status per question. New §10.5 "Effect on phase plan" introduces: - Phase 1.5 (NEW) — BIPA photo/video schema audit + Langfuse boundary scoping + outcomes.jsonl content sample, BEFORE Phase 2 design - Phase 2 design must now include: EU-placeholder fields, vertical detection, training-safe export, BIPA consent metadata - Phase 9 rehearsal must cover discrimination + BIPA + healthcare PHI 3 questions still pending J's call before Phase 2 design ships: identity service daemon vs in-process, JSON vs signed PDF for legal export, audit endpoint auth model. No code changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
627a5f0c3d |
audit phase 1: §10 scrum-review findings + walk back §1F over-claim
Ran cross-lineage scrum on the discovery doc with the new model fleet
(opus + kimi-k2.6 + gemini-3-flash via Go gateway :4110, custom
"senior security architect" prompt). 3/3 reviewers responded with
substantive 800-1200 word reviews. Saved at /tmp/audit_scrum/.
5 convergent findings (≥2 reviewers) added as §10/C1-C5:
C1. §1F matrix-indexer "good for audit defensibility" claim is over-
claimed — walked back in TL;DR. Trace bodies unverified; treat as
SUSPECTED PII sink until §8.1 sampling completes.
C2. §1E (Langfuse) is the most dangerous leak — fix FIRST, ahead of
view-routing. Boundary-crossing leak (GDPR Art. 44 / CPRA sale /
SOC2 disposal). All 3 reviewers converge on this priority.
C3. Discrimination defense requires the FULL CANDIDATE POOL, not just
fills. EEOC UGESP (1978): need adverse-impact stats on everyone
who could have been picked. Phase 1 worked example missed this.
C4. BIPA / biometric exposure understated in findings (in PRD §10.5
but not translated to actionables). $1k-$5k per-violation regime.
C5. candidate_id must be promoted to top-level field in all JSONL
sinks. Grepping natural-language strings is not defensible audit
strategy. 3/3 reviewers converge.
11 single-reviewer high-value catches added as §10 single-reviewer
section: opus on LLM provider egress (8th PII path), Art. 22 right-
to-explanation, special-category data, DPIA/ROPA/DPA inventory; kimi
on sequential ID enumeration risk, Langfuse retention config, CCPA
de-identified-in-place vs crypto-shred, Bun common-mode failure,
cryptographic audit-trail integrity (Merkle/FRE 901), HIPAA BAA,
revised SELECT * effort estimate; gemini on data residency, "culture
fit" reasoning proxies, comparator-pool snapshot.
§9 reordered: sample first → defense-layer second → Langfuse
boundary third (was view-routing first per original draft;
boundary-crossing leak is higher priority per scrum).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
505ea93726 |
audit phase 1: discovery walk complete — subject + PII surface map
Read-only walk of both runtimes per AUDIT_TRAIL_PRD.md §8 phase 1.
Fills "UNKNOWN" cells in PRD §3 + §7 with file:line evidence.
Headline findings:
- candidates_safe + workers_safe views EXIST as a defense layer but
are BYPASSED — tool registry SQL templates query raw tables
- PII traverses 7+ persistence/transmission paths per fill scenario:
SQL → tool_result → LogEntry → /v1/respond → Langfuse → outcomes.jsonl
→ overseer_corrections.jsonl
- candidate_id is stable but co-located with PII in workers_500k.parquet
(no separate identity service)
- /audit/subject/{id} endpoint does not exist
- Append-only persistence is universal — RTBF requires crypto-erasure
- Pathway memory is structurally subject-agnostic in fingerprints
(defensive); trace bodies may leak PII (needs sampling)
- Go side mirrors Rust PII shape — parity in the leak too
- Worked example (John Martinez audit today): NOT POSSIBLE to produce
complete-and-defensible response
Recommends 4 cheap high-value moves before Phase 2 design starts:
defense-layer enforcement (rewrite 3 SQL templates to _safe views),
sample state.json/Langfuse to confirm pathway memory is clean, walk
Bun mcp-server tool surface, schema-audit for protected-attribute
proxies. None are commitments — J's call.
No code changes in this commit. Companion to AUDIT_TRAIL_PRD.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b2d717ae44 |
audit PRD: add §10.5 jurisdictional surface (IL + IN, federal, SOC2)
J flagged that the staffing system targets Chicago + Indiana — added a jurisdictional checklist section to the audit-trail PRD so counsel has a working starting point. Covered: - Federal: Title VII, ADEA, ADA, EEOC, OFCCP, FCRA, Section 1981 - Illinois: BIPA (high risk if any candidate photos), AI Video Interview Act (820 ILCS 42), Illinois Human Rights Act (broader than Title VII), PIPA breach notification, Day and Temporary Labor Services Act (directly applies — staffing industry-specific recordkeeping), Cook County + City of Chicago Human Rights Ordinances (additional protected classes including source of income, parental status, credit history) - Indiana: Data Breach Disclosure, Civil Rights Law (lighter than IL), Genetic Information Privacy Act - SOC 2 Type II as the typical SaaS sale gate (Privacy + Security TSCs most relevant; 6-9 month effort to first report) - HIPAA / PCI / ISO 27001 noted as out of current scope but flagged Phase reordering implications captured: - BIPA risk on real candidate photos may need to be resolved BEFORE audit-trail work (class-action exposure) - SOC 2 Type II prep runs in parallel, not after - IL Day and Temporary Labor Services recordkeeping may override our proposed 4-year retention SLA 7 open questions added that counsel must answer before the §8 phases can be locked in. Document is explicit (multiple times) that this is NOT legal advice — a research-grade checklist for J's counsel conversation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c170ebc86e |
docs: AUDIT_TRAIL_PRD — production-readiness gate for staffing client
J flagged that smoke + parity tests prove the surface compiles, NOT that an audit response can be produced for a specific person — and the staffing client won't sign without defensible discrimination-claim response capability. New docs/AUDIT_TRAIL_PRD.md captures: - worked example: John Martinez at Warehouse B requests audit - subject audit response output format (per-decision row schema) - surface map: where decisions happen today, where the gaps are - PII handling rules (tokenization, protected-attribute exclusion, inferred-attribute risk) - identity service design intent (separate daemon, audited reads) - retention + right-to-be-forgotten policy intent - 9-phase implementation sequence with explicit per-phase exit criteria - cross-runtime requirement (both Rust + Go must satisfy) - 7 open questions blocking phase 2+ that need J's call STATE_OF_PLAY + PRD updated with explicit "production-ready blocker" section pointing at the new doc. The "substrate is shipped" framing gets a caveat: substrate ≠ production-ready until audit phase 9 exits. No code changes. This is the planning artifact J asked for before we start building. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5368aca4d4 |
docs: sync ADR-019 + PRD + DECISIONS with 2026-05-02 substrate changes
ADR-019: closed the "re-bench when 10M corpus exists" follow-up. Added "Follow-up: 10M re-bench (2026-05-02)" section with the post-fix numbers (search ~20ms warm / ~46ms cold, doc-fetch ~5ms post-btree). Documented the lance-bench-bypassing-IndexMeta bug + 2-layer fix + gauntlet (7 unit + 12 sanitize + 10 smoke probes). Reframes the strategic question as "Lance vs Parquet+HNSW-with-spilling" since HNSW doesn't fit RAM at 10M. DECISIONS: added ADR-022 — drop Python sidecar from Rust hot path. Captures the rationale (236× embed perf gap was pure overhead), co-shipped LRU cache, dev-only Python that survives, cross-runtime parity verification, and the operator runbook signal (ps -ef ABSENT post-deploy). PRD: updated AI Boundary table line + aibridge crate description to reflect direct Ollama path (was: Python FastAPI sidecar → Ollama). Both lines reference ADR-022 for the full rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e9d17f7d5a |
sanitize: drop over-broad path-missing branch + UTF-8-safe redaction
Re-scrum of yesterday's sanitizer fix surfaced 2 more real bugs in the fix itself (opus, both WARN, neither caught by kimi/qwen): W1 (service.rs:1949) — `mentions_path_missing` standalone branch was too aggressive. A registry-internal error like "/root/.cargo/.../x.rs: no such file or directory" would 404 because it triggers without dataset context. That's a real 500. Dropped the standalone branch; require dataset context AND missing-shape phrase. Lance's actual "Dataset at path X was not found" still satisfies it. W2 (service.rs:2018) — `out.push(bytes[i] as char)` corrupted multi-byte UTF-8 by casting raw bytes to char (only sound for ASCII < 128). A path containing user-supplied non-ASCII names produced Latin-1 mojibake. Rewrote redact_paths to track byte indices and emit unmatched runs as &str slices via push_str(&s[range]) — preserves multi-byte sequences verbatim. Step advance is now per-char, not per-byte, via small utf8_char_len helper. Two new regression tests: - is_not_found_does_not_match_unrelated_path_missing - redact_preserves_multibyte_utf8 (uses 工作 + café in input) 12/12 sanitize tests PASS. Smoke 10/10 PASS. Loop closure for opus re-scrum on the 2026-05-02 fix bundle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
ac7c996596 |
sweep up scrum WARNs — model const, stale config, temp_path entropy, smoke gate
Four findings deferred from the 2026-05-02 scrum, all 1-5 line fixes: W1 (kimi WARN @ scrum_master_pipeline.ts:1143) — `gemini-3-flash-preview` hardcoded twice in MAP and REDUCE phases. Extracted TREE_SPLIT_MODEL + TREE_SPLIT_PROVIDER constants near the existing config block. Diverging the two would break tree-split coherence (per-shard digests must come from the same model the reducer collapses). W2 (qwen WARN @ providers.toml:30) — stale `kimi-k2:1t` reference in operator-facing comments after PR #13 noted it's upstream-broken. Reframed as historical context ("was X here pre-2026-05-03 — that model is broken") so future operators don't paste-route from the comment. W3 (opus WARN @ vectord-lance/src/lib.rs:622) — temp_path() entropy was only pid+nanos, which collide under tokio scheduling when multiple tests in the same cargo process create temp dirs back-to-back. Added per-process AtomicU64 sequence counter — guarantees uniqueness regardless of clock. W4 (opus INFO @ scripts/lance_smoke.sh:38) — `|| echo '{}'` swallowed curl transport failures (gateway down, network broken, timeout), surfacing as misleading "no method field" jq errors at the next probe. Now captures $? separately, gates a "curl reachable" probe, and only falls back to empty body for the dependent jq parse. Smoke went 9 → 10 probes. Verified: vectord-lance 7/7 tests PASS, gateway cargo check clean, lance_smoke.sh 10/10 PASS against live gateway. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7bb66f08c3 |
lance: scrum-driven sanitizer + smoke-gate fixes (opus 2026-05-02 BLOCK)
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across"
Cross-lineage scrum on the lance wave (4 bundles, 33 distinct findings)
surfaced 1 real BLOCK and 2 real WARNs from opus that the kimi/qwen
lineages missed. Per feedback_cross_lineage_review.md, opus is the
load-bearing reviewer; cross-lineage convergence is noise unless verified.
BLOCK fix — sanitize_lance_err path-stripping was unsound:
err.split("/home/").next().unwrap_or(&err)
returns Some("") when err STARTS with "/home/", erasing the entire
message. Replaced truncation with redact_paths() — a hand-rolled scanner
that walks the input once, replacing path-shaped substrings with
[REDACTED] while preserving surrounding error context. Catches:
- absolute paths under /root/.cargo, /home, /var, /tmp, /etc, /usr, /opt
- relative variants (Lance occasionally strips leading slash —
observed live "Dataset at path home/profit/lakehouse/data/lance/x
was not found")
- multiple occurrences in one error
- preserves quote/comma/whitespace terminators
WARN fix #1 — is_not_found heuristic was too broad:
lower.contains("not found")
caught real 500s like "column not found", "field not found in schema".
Narrowed to require dataset-shape phrasing AND exclude the
column/field/schema patterns explicitly.
WARN fix #2 — lance_smoke.sh `grep -qvE` was an unsound regression gate.
bash -c "echo '$BODY' | grep -qvE 'pat'"
With -v -q, exits 0 if ANY line lacks the pattern — so a multi-line
body with one leak line + any clean line FALSE-PASSES. Replaced with
the correct "pattern absent" form: `! grep -qE 'pat'`. Also expanded
the pattern set (added /var/, /tmp/) since the scrum surfaced these
as additional leak vectors.
Also unblocks pre-existing pathway_memory test compile error (stale
PathwayTrace init missing 6 Mem0-versioning fields added in 6ac7f61).
Tests filled in with sensible defaults — needed to run sanitize_tests.
10/10 new sanitize tests pass. Smoke 9/9 PASS against rebuilt+restarted
gateway. Live missing-index probe now returns:
"lance dataset not found: no-such-11205" + HTTP 404
(was: leaked absolute paths + HTTP 500 → leaked absolute and relative
paths post-first-fix → clean message + 404 now.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a294a61ee4 |
Merge remote-tracking branch 'origin/main' into demo/post-pr11-polish-2026-04-28
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across"
|
||
| feb638e4cd |
infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths (#13)
7 hot-path call sites swapped to Ollama Pro / OpenCode Zen models. All replacements live-probed. Auditor surfaced 2 kimi BLOCKs both verified false-positive on 2026-05-02. Compiles cleanly in isolation. |
|||
|
|
0af62861d2 |
STATE_OF_PLAY: refresh for 2026-05-02 wave (Lance gauntlet + parity + housekeeping)
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Anchor was 5 days stale. Adds the 12-commit wave (Lance backend hardening, sidecar drop, observability parity, gitignore cleanup, gray-zone content add) with verification status for each. Updates DO NOT RELITIGATE with the 4 new things this wave makes load-bearing: - python sidecar dropped from hot path (don't wire it back) - lance gauntlet shipped (don't re-discover the bugs we just fixed) - 32/32 cross-runtime parity (don't build a 6th probe for already-covered surface) - ARCHITECTURE_COMPARISON.md is the single source of truth for cross-runtime decisions Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
41b0a99ed2 |
chore: add real content that was sitting untracked
Surfaced by today's untracked-files audit. None of these are accidents —
multiple are referenced by name in CLAUDE.md and memory files but were
never added.
Categories:
- docs/PHASE_AUDIT_GUIDE.md (106 LOC) — Claude Code phase audit guidance
- ops/systemd/lakehouse-langfuse-bridge.service — Langfuse bridge unit
- package.json — top-level npm manifest
- scripts/e2e_pipeline_check.sh + production_smoke.sh — real test scripts
- reports/kimi/audit-last-week*.md — the "Two reports live" CLAUDE.md cites
- tests/multi-agent/scenarios/ — 44 staffing scenarios (cutover decision A)
- tests/multi-agent/playbooks/ — 102 playbook records
- tests/battery/, tests/agent_test/PRD.md, tests/real-world/* — real tests
- sidecar/sidecar/{lab_ui,pipeline_lab}.py — 888 LOC dev-only UIs that
remain in service post-sidecar-drop (commit ba928b1 explicitly kept them)
Sensitivity check: scenarios use synthetic company names ("Heritage Foods",
"Cornerstone Fabrication"); audit reports describe code findings only;
no PII or secrets surfaced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6e34ef7baf |
gitignore: stop tracking runtime data, logs, build artifacts, scratch
Untracked count was 100+; almost all were data/_*/ daemon state, generated parquets under data/datasets and data/vectors, the 33GB data/lance/ tree, node_modules, exports, logs, per-run distillation reports, and test scratchpads. None of these are content — all regenerate from inputs. Now down to 33 untracked items, all real content (scripts, systemd unit, test scenarios, dev-only sidecar UIs, kimi audit reports). Those need J's call on what to track vs leave parked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
044650a1da |
lance-bench: also build doc_id btree post-IVF — match gateway's migrate behavior
The bench's own measure_random_access_lance uses take(row_position) — doesn't need the btree. But datasets written by this bench are commonly queried via /vectors/lance/doc/<name>/<doc_id> downstream, and without the btree that path falls back to a full table scan. Building inline keeps bench-produced datasets immediately production-shape and removes a footgun (the same one that made scale_test_10m's doc-fetch ~100ms until commit 5d30b3d fixed it via the migrate handler path). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5d30b3da89 |
lance: auto-build doc_id btree in migrate handler (root-cause for 10M doc-fetch slowness)
scale_test_10m doc-fetch p50 was ~100ms — full table scan over 35GB. Root cause: the auto-build at service.rs:1492-1503 only fires for IndexMeta- registered indexes during set_active_profile warming. lance-bench writes datasets through /vectors/lance/migrate/* directly, bypassing IndexMeta, so its datasets never get the doc_id btree that ADR-019 depends on. Fix: build the btree inline at the end of lance_migrate. Costs ~1.2s on 10M rows (+269MB on disk), drops doc-fetch from ~100ms to ~5ms (20x). Failure is non-fatal — logs a warning and the dataset stays queryable. Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across 5 calls, smoke 9/9 PASS, vectord-lance 7/7 unit tests PASS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7594725c25 |
lance backend: 4-pack — bug fix + smoke + tests + 10M re-bench
Some checks failed
lakehouse/auditor 12 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Surfaced by the 2026-05-02 audit (vectord-lance + lance-bench + glue
existed and worked but had no tests, no smoke, leaked server paths
on missing-index search, and the ADR-019 10M re-bench was deferred).
## 1. Fix: missing-index search returned 500 + leaked filesystem path
Pre-fix:
$ POST /vectors/lance/search/no-such-index
HTTP 500
Dataset at path home/profit/lakehouse/data/lance/no-such-index was
not found: Not found: home/profit/lakehouse/data/lance/no-such-index/
_versions, /root/.cargo/registry/src/index.crates.io-...-1949cf8c.../
lance-table-4.0.0/src/io/commit.rs:364:26, ...
Post-fix:
HTTP 404
lance dataset not found: no-such-index
Added `sanitize_lance_err()` in crates/vectord/src/service.rs that:
- maps "not found" / "no such file" patterns → 404 (was 500)
- strips /home/ and /root/.cargo/ paths from any error body
Applied to all 5 lance handlers: search, get_doc, build_index,
append, migrate. The store_for() handle is cheap-and-stateless;
the actual disk hit happens inside the operation, which is where
the leak originated.
## 2. scripts/lance_smoke.sh — first regression gate
9-probe smoke against the live HTTP surface. Exercises only read
paths (no state mutation in CI). Specifically locks the sanitizer
fix — a future regression that re-introduces the path leak fires
the smoke immediately. 9/9 PASS against the live :3100 today.
## 3. Unit tests on vectord-lance/src/lib.rs (was: zero tests)
7 tests covering the public LanceVectorStore API:
- fresh_store_reports_no_state — handle is lazy
- migrate_then_count_and_fetch — Parquet → Lance round-trip
- get_by_doc_id_missing_returns_none — Ok(None) vs Err contract
that lets the HTTP handler return 404 cleanly
- append_grows_count_and_new_rows_fetchable — ADR-019's
structural-difference claim verified at the unit level
- append_dim_mismatch_errors — guards against silently breaking
search by accepting inconsistent-dim rows
- search_returns_nearest — exact-vector match → top-1
- stats_reports_post_migrate_state — locks the field shape
7/7 PASS. cargo test -p vectord-lance --lib green.
## 4. 10M re-bench (deferred from ADR-019)
reports/lance_10m_rebench_2026-05-02.md captures the numbers driven
against the live :3100 over data/lance/scale_test_10m (33GB / 10M
vectors, IVF_PQ confirmed via response method tag).
Headline:
Search cold (10 diverse queries): median ~32ms, mean ~46ms
Search warm (5x same query): ~20ms p50
Doc fetch (5x same id): ~100ms p50
Search latency at 10M is acceptable for batch / async workloads,
too slow for sub-10ms voice/recommendation paths. ADR-019's "Lance
pulls ahead at 10M" claim remains unverified-but-not-refuted — at
this scale HNSW doesn't operationally exist (10M × 768d × 4 bytes =
30GB just for vectors).
Real finding: doc-fetch at 10M is 300x slower than the 100K number
ADR-019 cited (311μs → ~100ms). Likely cause: scalar btree index
on doc_id may not be built for this dataset. Follow-up to
investigate whether forcing build_scalar_index brings it back to
the load-bearing O(1) range. Captured in the report.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
98b6647f2a |
gateway: IterateResponse echoes trace_id + enable session_log_path
Some checks failed
lakehouse/auditor 14 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Closes the 2026-05-02 cross-runtime parity gap: Go's
validator.IterateResponse carried trace_id back to callers; Rust's
didn't. A caller pivoting from response → Langfuse → session log
worked on Go but failed on Rust because the join key wasn't visible
in the response body.
## Changes
crates/gateway/src/v1/iterate.rs:
- IterateResponse + IterateFailure gain `trace_id: Option<String>`
(skip-serializing-if-none preserves backward-compat for any
consumer parsing the response without the field)
- Both return sites populated with the resolved trace_id
lakehouse.toml:
- [gateway].session_log_path set to /tmp/lakehouse-validator/sessions.jsonl
— same path Go validatord writes to. The two daemons now co-write
one unified longitudinal log; rows tag daemon="gateway" vs
daemon="validatord" so producers stay distinguishable in DuckDB
queries. Append-write is atomic at the row sizes both runtimes
produce, so concurrent writes from both daemons are safe.
## Verification
Post-restart of lakehouse.service:
POST /v1/iterate with X-Lakehouse-Trace-Id: rust-fix1-test
→ response.trace_id = "rust-fix1-test" ✓ (was: field absent)
→ sessions.jsonl latest row daemon=gateway, session_id=rust-fix1-test ✓ (was: no row)
Cross-runtime drive — same prompt to Rust :3100 and Go :4110:
Rust: trace_id=unified-rust-001, daemon=gateway, accepted
Go: trace_id=unified-go-001, daemon=validatord, accepted
Same file, distinct daemons, one query covers both:
SELECT daemon, COUNT(*) FROM read_json_auto('sessions.jsonl', format='nd') GROUP BY daemon
→ gateway: 2, validatord: 19
All 4 parity probes still 6/6 + 12/12 + 4/4 + 2/2 against live
:3100 + :4110 stacks. Cargo test 4/4 PASS for v1::iterate module.
## Architecture invariant
The "unified longitudinal log" thesis is now demonstrated. Operators
running both runtimes in production point both daemons at the same
session_log_path and DuckDB queries naturally span both producers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
57bde63a06 |
gateway: trace-id propagation + coordinator session JSONL (Rust parity)
Some checks failed
lakehouse/auditor 10 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Cross-runtime parity with the Go-side observability wave (commits
d6d2fdf + 1a3a82a in golangLAKEHOUSE). The two layers J flagged:
the LIVE per-call view (Langfuse) and the LONGITUDINAL forensic view
(JSONL queryable via DuckDB). Hard correctness gate (FillValidator
phantom-rejection) was already in place; this is the observability
on top.
## Trace-id propagation
X-Lakehouse-Trace-Id header constant declared in
crates/gateway/src/v1/iterate.rs (matches Go's shared.TraceIDHeader
byte-for-byte). When set on an inbound /v1/iterate request, the
handler reuses it; the chat + validate self-loopback hops forward
the same header so chatd's trace emit nests under the parent rather
than minting a fresh top-level trace per call.
ChatTrace gains a parent_trace_id field. emit_chat_inner skips the
trace-create event when parent is set, only emits the
generation-create which attaches to the existing trace tree. Result:
an iterate session with N retries shows in Langfuse as ONE tree, not
N+1 disconnected traces.
emit_attempt_span (new) writes one Langfuse span per iteration
attempt with input={iteration, model, provider, prompt} and
output={verdict, raw, error}. WARNING level on non-accepted
verdicts. The returned span id is stamped on the corresponding
SessionRecord attempt for cross-log correlation.
## Coordinator session JSONL
crates/gateway/src/v1/session_log.rs — new writer matching Go's
internal/validator/session_log.go schema byte-for-byte:
- SessionRecord with schema=session.iterate.v1
- SessionAttemptRecord per retry
- SessionLogger.append: tokio Mutex serialized append-only
- Best-effort posture (slog.Warn on error, never blocks request)
iterate.rs builds + appends a row on EVERY code path:
- accepted: write_session_accepted with grounded_in_roster bool
derived from validate_workers WorkerLookup (matches Go's
handlers.rosterCheckFor("fill") semantics)
- max-iter-exhausted: write_session_failure
- infra-error: write_infra_error (so a missing /v1/iterate event
never silently disappears from the longitudinal log)
[gateway].session_log_path config field (empty = disabled).
Production: /var/lib/lakehouse/gateway/sessions.jsonl. Operators who
want a unified longitudinal stream can point both Rust and Go
loggers at the same path — write-append is safe at the row sizes we
produce.
## Cross-runtime parity probe
crates/gateway/src/bin/parity_session_log: tiny stdin/stdout helper
that round-trips a fixture through SessionRecord serde.
golangLAKEHOUSE/scripts/cutover/parity/session_log_parity.sh feeds
4 fixtures through both helpers and diffs the rows after stripping
timestamp + daemon (the two fields that legitimately differ between
producers).
Result: **4/4 byte-equal** including the unicode-prompt fixture
("Café résumé ⭐ 你好"). Schema parity holds. The non-trivial-equal
guard in the probe rejects the case where both sides fail
identically — protecting against a regression where one side
silently stops producing valid JSON.
## Verification
- cargo test -p gateway --lib: 90/90 PASS (3 new session_log tests
including concurrent-append safety)
- cargo check --workspace: clean
- session_log_parity.sh: 4/4 fixtures byte-equal
- Both runtimes can append to the same path; DuckDB sees one stream
- The Go-side validatord smoke remains 5/5 (unchanged)
## Architecture invariant
Don't propose to "wire trace-id propagation in Rust" or "add Rust
session log" — both are now shipped on the demo/post-pr11-polish
branch. The longitudinal log + Langfuse tree together cover the
multi-call observability concern J flagged 2026-05-02.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
ba928b1d64 |
aibridge: drop Python sidecar from hot path; AiClient → direct Ollama
Some checks failed
lakehouse/auditor 11 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
The "drop Python sidecar from Rust aibridge" item from the architecture_comparison decisions tracker. Universal-win cleanup — removes 1 process + 1 runtime + 1 hop from every embed/generate request, with no behavior change. ## What was on the hot path before gateway → AiClient → http://:3200 (FastAPI sidecar) ├── embed.py → http://:11434 (Ollama) ├── generate.py → http://:11434 ├── rerank.py → http://:11434 (loops generate) └── admin.py → http://:11434 (/api/ps + nvidia-smi) The sidecar's hot-path code (~120 LOC across embed.py / generate.py / rerank.py / admin.py) was pure pass-through: each route translated its request body to Ollama's wire format and returned Ollama's response in a sidecar envelope. Zero logic, one full HTTP hop of overhead. ## What's on the hot path now gateway → AiClient → http://:11434 (Ollama directly) Inline rewrites in crates/aibridge/src/client.rs: - embed_uncached: per-text loop to /api/embed; computes dimension from response[0].length (matches the sidecar's prior shape) - generate (direct path): translates GenerateRequest → /api/generate (model, prompt, stream:false, options:{temperature, num_predict}, system, think); maps response → GenerateResponse using Ollama's field names (response, prompt_eval_count, eval_count) - rerank: per-doc loop with the same score-prompt the sidecar used; parses leading number, clamps 0-10, sorts desc - unload_model: /api/generate with prompt:"", keep_alive:0 - preload_model: /api/generate with prompt:" ", keep_alive:"5m", num_predict:1 - vram_snapshot: GET /api/ps + std::process::Command nvidia-smi; same envelope shape as the sidecar's /admin/vram so callers keep parsing - health: GET /api/version, wrapped in a sidecar-shaped envelope ({status, ollama_url, ollama_version}) Public AiClient API is unchanged — Request/Response types untouched. Callers (gateway routes, vectord, etc.) require zero updates. ## Config changes - crates/shared/src/config.rs: default_sidecar_url() bumps to :11434. The TOML field stays `[sidecar].url` for migration compat (operators with existing configs don't need to rename anything). - lakehouse.toml + config/providers.toml: bumped to localhost:11434 with comments explaining the 2026-05-02 transition. ## What stays Python sidecar/sidecar/lab_ui.py (385 LOC) + pipeline_lab.py (503 LOC) are dev-mode Streamlit-shape UIs for prompt experimentation. Not on the runtime hot path; continue running for ad-hoc work. The embed/generate/rerank/admin routes inside sidecar can be retired, but operators who want to keep the sidecar process running for the lab UI face no breakage — those routes still call Ollama and work. ## Verification - cargo check --workspace: clean - cargo test -p aibridge --lib: 32/32 PASS - Live smoke against test gateway on :3199 with new config: /ai/embed → 768-dim vector for "forklift operator" ✓ /v1/chat → provider=ollama, model=qwen2.5:latest, content=OK ✓ - nvidia-smi parsing tested via std::process::Command path - Live `lakehouse.service` (port :3100) NOT yet restarted — deploy step is operator-driven (sudo systemctl restart lakehouse.service) ## Architecture comparison update (Captured separately in golangLAKEHOUSE/docs/ARCHITECTURE_COMPARISON.md decisions tracker.) The "drop Python sidecar" line moves from _open_ to DONE. The Rust process model now has 1 mega-binary instead of 1 mega-binary + 1 sidecar process — a small but real reduction in ops surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
654797a429 |
gateway: pub extract_json + parity_extract_json bin (cross-runtime probe)
Some checks failed
lakehouse/auditor 10 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Supports the 2026-05-02 cross-runtime parity probe at
golangLAKEHOUSE/scripts/cutover/parity/extract_json_parity.sh which
feeds identical model-output strings through both runtimes' extract_json
and diffs results.
## Changes
- crates/gateway/src/v1/iterate.rs: extract_json gains `pub` + a
comment pointing at the Go counterpart and the parity probe path
- crates/gateway/src/lib.rs: NEW thin lib facade re-exporting the
modules so sub-binaries can reuse them. main.rs is unchanged
(still uses local mod declarations)
- crates/gateway/src/bin/parity_extract_json.rs: NEW ~30-LOC binary
that reads stdin, calls extract_json, prints {matched, value} JSON
## Probe result (logged in golangLAKEHOUSE)
12/12 match across fenced blocks, nested objects, unicode, escaped
quotes, top-level array, malformed JSON. Both runtimes' algorithms
are genuinely equivalent.
Substrate gate the probe enforces: `cargo test -p gateway extract_json`
PASS before any parity comparison runs. So a future divergence in
the live extract_json fires either as a Rust test failure (live
behavior changed) or a probe diff (Go behavior changed) — never
silently.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c5654d417c |
docs: pointer to ARCHITECTURE_COMPARISON.md source in golangLAKEHOUSE
Some checks failed
lakehouse/auditor 18 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
Per J's request: the parallel-runtime comparison is a living source
file maintained at /home/profit/golangLAKEHOUSE/docs/ARCHITECTURE_COMPARISON.md.
This file is a pointer reachable from the Rust repo's docs/ so the
comparison is discoverable from either side.
Doesn't contain authoritative content — just the link + a quick
status summary + update guidance ('source lives in golangLAKEHOUSE,
don't drift two copies').
|
||
|
|
150cc3b681 |
aibridge: LRU embed cache - 236x RPS gain on warm workloads. Per architecture_comparison.md universal-win for Rust side. Cache key (model,text), default 4096 entries, in-process inside gateway. Load test: 128 RPS -> 30k+ RPS, p50 78ms -> 129us.
Some checks failed
lakehouse/auditor 20 blocking issues: cloud: claim not backed — "Verified end-to-end against persistent Go stack on :4110:"
|
||
|
|
9eed982f1a |
mcp-server: /_go/* pass-through for G5 cutover slice
Adds an opt-in pass-through that routes Bun mcp-server requests
to the Go gateway when GO_LAKEHOUSE_URL is set. /_go/v1/embed,
/_go/v1/matrix/search etc. flow through Bun frontend → Go
backend without touching any existing tool. Off-by-default
(empty GO_LAKEHOUSE_URL → 503 with rationale); enabled via
systemd drop-in at:
/etc/systemd/system/lakehouse-agent.service.d/go-cutover.conf
This is the first slice of real Bun-fronted traffic hitting the
Go substrate. The /api/* pass-through (Rust gateway) and every
existing tool are unmodified — fully additive cutover step.
Reversible: unset GO_LAKEHOUSE_URL or remove the systemd drop-in
and restart lakehouse-agent.service.
Verified end-to-end against persistent Go stack on :4110:
/_go/health → {"status":"ok","service":"gateway"}
/_go/v1/embed → nomic-embed-text-v2-moe vectors (dim=768)
/_go/v1/matrix/search → 3/3 Forklift Operators (role+geo match)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
3d068681f5 |
distillation: regenerated acceptance + audit reports (run_hash refresh)
Some checks failed
lakehouse/auditor 17 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Phase 6 acceptance + Phase 8 full-audit reports re-run; bit-for-bit reproducibility property still holds (run 1 hash == run 2 hash), just at a new value. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8de94eba08 |
cleanup: bump qwen2.5 → qwen3.5:latest in active defaults
Some checks failed
lakehouse/auditor 16 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
stronger local rung is now the small-model-pipeline tier-1 default across both Rust legacy + Go rewrite (cf. golangLAKEHOUSE phase 1). same JSON-clean property as qwen2.5, more capacity. ollama still serves both side-by-side; rollback is a 4-line revert if a workload regresses. active-default sites: - lakehouse.toml [ai] gen_model + rerank_model → qwen3.5:latest - mcp-server/observer.ts diagnose call (Phase 44 /v1/chat path) → qwen3.5:latest - mcp-server/index.ts model roster doc → qwen3.5:latest first - crates/vectord/src/rag.rs ContinuableOpts + RagResponse.model → qwen3.5:latest skipped: execution_loop/mod.rs comments describing historic qwen2.5 tool_call quirks — those are documentation of past behavior, not active defaults. data/_catalog/profiles/*.json are runtime-generated (gitignored), not in scope for tracked changes. cargo check -p vectord: clean. no behavioral change in the audit pipeline — same JSON-clean local model, same think=Some(false) posture, just stronger upstream. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a00e9bb438 |
infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths
Some checks failed
lakehouse/auditor 2 blocking issues: State field rename likely incomplete — `opencode_key` may not exist on `self.state`
Ollama Pro plan went live today (39-model fleet on the same
OLLAMA_CLOUD_KEY) and OpenCode Zen was already wired in the gateway
but not consumed. Routing every gpt-oss call site to faster /
stronger replacements:
| Site | gpt-oss → replacement | Why |
|---|---|---|
| ollama_cloud default | gpt-oss:120b → deepseek-v3.2 | newest DeepSeek revision; live-probed `pong` |
| openrouter default | openai/gpt-oss-120b:free → x-ai/grok-4.1-fast | already the scrum LADDER's PRIMARY |
| modes.toml staffing_inference | openai/gpt-oss-120b:free → kimi-k2.6 | coding-specialized, on Ollama Pro |
| modes.toml doc_drift_check | gpt-oss:120b → gemini-3-flash-preview | speed leader for factual checks |
| scrum_master_pipeline tree-split MAP+REDUCE | gpt-oss:120b → gemini-3-flash-preview | latency-dominated path (5-20× per file) |
| bot/propose.ts CLOUD_MODEL | gpt-oss:120b → deepseek-v3.2 | same Ollama key, faster |
| mcp-server/observer.ts overseer label fallback | gpt-oss:120b → claude-opus-4-7 | matches new overseer model |
| crates/gateway/src/execution_loop overseer escalation | ollama_cloud/gpt-oss:120b → opencode/claude-opus-4-7 | frontier reasoning matters here — fires only after local self-correct fails twice; Zen pay-per-token cost is bounded |
Verification:
- `cargo check -p gateway --tests` — clean
- Live probes through localhost:3100/v1/chat:
- `opencode/claude-opus-4-7` → "pong"
- `gemini-3-flash-preview` (ollama_cloud) → "pong"
- `kimi-k2.6` (ollama_cloud) → "pong"
- `deepseek-v3.2` (ollama_cloud) → "Pong! 🏓"
Notes:
- kimi-k2:1t still upstream-broken (HTTP 500 on Ollama Pro probe today,
matches yesterday's memory). Replacement table never picks it.
- The Rust changes need a `systemctl restart lakehouse.service` to
take effect on the running gateway. TS callers reload on next run.
- aibridge/src/context.rs still has gpt-oss:{20b,120b} in its window-
size lookup table; harmless and kept for callers that pass it
explicitly as an override.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d475fc7fff |
infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths
Ollama Pro plan went live today (39-model fleet on the same
OLLAMA_CLOUD_KEY) and OpenCode Zen was already wired in the gateway
but not consumed. Routing every gpt-oss call site to faster /
stronger replacements:
| Site | gpt-oss → replacement | Why |
|---|---|---|
| ollama_cloud default | gpt-oss:120b → deepseek-v3.2 | newest DeepSeek revision; live-probed `pong` |
| openrouter default | openai/gpt-oss-120b:free → x-ai/grok-4.1-fast | already the scrum LADDER's PRIMARY |
| modes.toml staffing_inference | openai/gpt-oss-120b:free → kimi-k2.6 | coding-specialized, on Ollama Pro |
| modes.toml doc_drift_check | gpt-oss:120b → gemini-3-flash-preview | speed leader for factual checks |
| scrum_master_pipeline tree-split MAP+REDUCE | gpt-oss:120b → gemini-3-flash-preview | latency-dominated path (5-20× per file) |
| bot/propose.ts CLOUD_MODEL | gpt-oss:120b → deepseek-v3.2 | same Ollama key, faster |
| mcp-server/observer.ts overseer label fallback | gpt-oss:120b → claude-opus-4-7 | matches new overseer model |
| crates/gateway/src/execution_loop overseer escalation | ollama_cloud/gpt-oss:120b → opencode/claude-opus-4-7 | frontier reasoning matters here — fires only after local self-correct fails twice; Zen pay-per-token cost is bounded |
Verification:
- `cargo check -p gateway --tests` — clean
- Live probes through localhost:3100/v1/chat:
- `opencode/claude-opus-4-7` → "pong"
- `gemini-3-flash-preview` (ollama_cloud) → "pong"
- `kimi-k2.6` (ollama_cloud) → "pong"
- `deepseek-v3.2` (ollama_cloud) → "Pong! 🏓"
Notes:
- kimi-k2:1t still upstream-broken (HTTP 500 on Ollama Pro probe today,
matches yesterday's memory). Replacement table never picks it.
- The Rust changes need a `systemctl restart lakehouse.service` to
take effect on the running gateway. TS callers reload on next run.
- aibridge/src/context.rs still has gpt-oss:{20b,120b} in its window-
size lookup table; harmless and kept for callers that pass it
explicitly as an override.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f4dc1b29e3 |
demo: search.html — Live Market explainer rewrite + fp-bar viewport-paint + compact contract cards
Some checks failed
lakehouse/auditor 18 blocking issues: cloud: claim not backed — "Verified end-to-end via playwright on devop.live/lakehouse:"
Four UI changes landing together since they all polish Section ① and
Section ② of the public demo:
1. Section ① (Live Market — Chicago) explainer rewritten data-source-
first ("Live from City of Chicago Open Data...") with bolded dial
names so a skimmer can map the visual to the prose. Drops the
"internal calendar" jargon and the slightly-overclaiming "rest of
the page is reacting" framing — downstream sections read the same
feed but don't react to the per-shift filter, so the new copy says
"this row is its heartbeat" instead.
2. Fill-probability bar gets a left-to-right paint reveal (clip-path
inset animation) so the green→gold→orange→red gradient reads as a
*timeline growing* instead of a static heatmap with a "danger zone"
at the right. Followed by a 30%-wide shimmer sweep on a 3.4s loop
for live-signal feel.
3. Paint trigger moved from on-render to IntersectionObserver — by
the time the user scrolls to Section ② the on-render animation had
already finished. Now each bar paints in over 2.8s when it enters
viewport (threshold 0.2, 350ms entry delay). Single shared observer,
unobserve()s after firing so the watch list trends to zero.
4. Contract cards now compact-by-default with click-to-expand. New
summary strip shows revenue / margin / fill-by-1wk / top candidate
so scanners get the punchline without expanding. Click anywhere on
the card surface (excluding inner content) to expand the full FP
curve, economics grid, candidates list, and Project Index. Project
Index auto-opens with the parent card so users actually find the
build signals — but only on user-driven expand (avoiding 20× OSHA
scrapes on page load). grid-template-rows: 0fr → 1fr animation
handles the smooth height transition.
All four animations honor prefers-reduced-motion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f892230699 |
demo: search.html UX polish — skeleton loader, card-in stagger, hero takeover, B&W faces
Search results no longer pop in as a single block. New behavior:
- Skeleton list pre-claims the vertical space results will occupy
with shimmering placeholder cards, so arriving results fade in
over the skeleton instead of pushing layout. Sweep is staggered
per row for a "rolling wave" not "everything blinking together".
- Domain-language stage caption ("matching against permits",
"ranking by reliability") rotates on a fixed schedule so users
read progress, not a stuck spinner.
- @keyframes card-in: real worker cards rise 4px and fade in over
350ms with nth-child stagger across the first ~12 rows. Honors
prefers-reduced-motion.
- Avatar imgs filter through grayscale + slight contrast/blur to
pull the SDXL Turbo color cast (which screams "AI generated" at
small sizes). Cert icons get the same treatment.
- Once-per-session hero takeover compresses the Section ⓪ strip
("Not a CRM — an index that learns from you") into a centered
hero on first paint, dismissed by clicking anywhere. Stats
hydrate from live endpoints.
console.html: mirrors the avatar B&W filter for visual consistency,
and removes the headshot insertion entirely — back to monogram
initials. The console (internal staffer view) doesn't need synthetic
faces; the public demo at /lakehouse/ does.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4b92d1da91 |
demo: icon recipe pipeline + role-aware portraits + ComfyUI negative-prompt override
Adds two single-source-of-truth recipe files that drive both the
hot-path render server and the offline pre-render scripts:
- role_scenes.ts: per-role-band scene clauses (clothing + backdrop).
Forklift operators look like forklift operators instead of
collapsing to interchangeable studio shots. SCENES_VERSION mixes
into the headshot cache key so a coordinator tweak refreshes every
matching face on next view.
- icon_recipes.ts: cert / role-prop / status / hazard / empty icons
with deterministic per-recipe seeds + fuzzy text resolver.
ICONS_VERSION suffix on the cached file means edits don't
overwrite in place — misfires are recoverable.
Routes (mcp-server/index.ts):
- GET /headshots/_scenes — exposes SCENES + version to the
pre-render script so prompts don't drift between batch and hot-path.
- GET /icons/_recipes — same idea for icons.
- GET /icons/cert?text=... — resolves free-text cert names to a
recipe and 302s to the rendered icon. 404 (not 500) when no recipe
matches so the front-end can hang `onerror="this.remove()"`.
- GET /icons/render/{category}/{slug} — cache-or-render at 256² (8
steps) for crisper edges than 512² when downsampled to 14px.
ComfyUI portrait support (scripts/serve_imagegen.py):
The editorial workflow had `human, person, face` baked into its
negative prompt — actively sabotaging portraits. _comfyui_generate
now accepts negative_prompt/cfg/sampler/scheduler overrides, and
those mix into the cache key so portrait calls don't collapse into
hero-shot cache hits.
scripts/staffing/render_role_pool.py: pre-renders the role-aware
face pool by reading SCENES from /headshots/_scenes — single source
of truth verified at run time.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
1745881426 |
staffing: face pool fetch preserves prior tags + --shrink gate + atomic manifest write
fetch_face_pool was wiping 952 hand-classified rows when re-run from a Python without deepface installed (it reset every gender to None). Now: - Loads existing manifest by id and overlays only fetch-owned fields, so gender/race/age/excluded survive a refetch. - deepface pass tags only records that don't already have a gender; deepface unavailable means "leave existing tags alone" not "reset". - New --shrink flag required to drop ids >= --count. Default refuses to shrink the pool silently. - Atomic write via tmp + os.replace so an interrupted run can't corrupt the manifest. - Dedupes duplicate id lines (root cause of the 2497-row manifest backing a 1000-face pool). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a05174d2fa |
ops: track tif_polygons.ts orphan import
entity.ts imports findTifDistrict from ./tif_polygons.js but the source file was never committed — only present in the working tree. Adding it so a fresh clone compiles. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f9a408e4c4 |
Surname → ethnicity routing + ComfyUI fallback for sparse pool buckets + cache-buster
Three problems J flagged ("not matching properly", "same faces", "still
showing old icons") had three different roots:
1. MISMATCH: front-end was first-name only, so "Anna Cruz" / "Patricia
Garcia" / "John Jimenez" all defaulted to caucasian. Added
SURNAMES_HISPANIC / _SOUTH_ASIAN / _EAST_ASIAN / _MIDDLE_EASTERN
dicts to both search.html and console.html. Surname is checked
FIRST (stronger signal for hispanic + asian than first names),
then first-name fallback. Cruz → hispanic, Patel → south_asian,
Nguyen → east_asian, regardless of first name.
2. SAME FACES: pool buckets are uneven — woman/south_asian=3,
man/black=4, woman/middle_eastern=2 — so any worker in those
buckets collapses to 2-4 photos no matter how good the hash is.
/headshots/:key now 302-redirects to /headshots/generate/:key
when the gender × race intersection is below 30 faces. ComfyUI
on-demand gives infinite uniqueness for the sparse buckets
(deterministic-per-worker via djb2 seed). Dense buckets still
serve from the pool — no GPU cost there.
3. STALE CACHE: Cache-Control was max-age=86400, immutable — pinned
old photos in browsers for 24h after any server-side update.
Dropped to max-age=3600, must-revalidate, and added a v=2
cache-buster query param to all front-end /headshots/ URLs so
existing cached entries are bypassed on next page load.
Also surfacing X-Face-Pool-Bucket / Bucket-Size headers for diagnosis.
Verified: playwright run shows surname routing correct (Torres,
Rivera, Alvarez, Gutierrez, Patel, Nguyen, Omar all bucketed
correctly), sparse buckets 302 to ComfyUI, dense buckets stay on
the thumb pool.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a3b65f314e |
Synthetic face pool — 1000 StyleGAN headshots, ComfyUI hot-swap, 60x smaller thumbs
Worker cards now ship a real photo per person instead of monogram tiles:
- fetch_face_pool.py pulls 1000 faces from thispersondoesnotexist.com
- tag_face_pool.py runs deepface for gender/race/age, excludes <22yo
- manifest.jsonl: 952 servable, gender/race buckets populated
- /headshots/_thumbs/ pre-resized to 384px webp (587KB -> 11KB,
60x smaller; without this Chrome's parallel-connection budget
drops ~75% of tiles in a 40-card grid)
- /headshots/:key gender x race x age intersection bucketing with
gender-only fallback when intersection is sparse
- /headshots/generate/:key ComfyUI on-demand for the contractor
profile spotlight (cold ~1.5s, cached ~1ms; worker-derived
djb2 seed makes faces deterministic-per-worker but unique
across workers sharing the same prompt)
- serve_imagegen.py _cache_key() now includes seed (was caching
by prompt only -> 3 different worker seeds collapsed to 1
cached image; verified fix produces 3 distinct md5s)
- confidence-default name resolution: Xavier->man+hispanic,
Aisha->woman+black, etc. Every worker resolves to a bucket.
End-to-end: playwright run on /?q=forklift+operators+IL -> 21/21
cards loaded, 0 broken, all 384px webp.
Cache + binary pool gitignored; manifest tracked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
10ed3bc630 |
demo: real synthetic headshots — fetch pool + serve route + UI wire
Three layers shipped:
1. SCRIPT — scripts/staffing/fetch_face_pool.py
Pulls N synthetic StyleGAN faces from thispersondoesnotexist.com
into data/headshots/face_NNNN.jpg, writes manifest.jsonl. Idempotent:
re-running skips existing files. Optional gender tagging via deepface
(currently unavailable on this box; the script handles ImportError
gracefully and tags everything as untagged). Fetched 198 faces with
concurrency=3 in ~67s.
2. SERVER — /headshots/:key route in mcp-server/index.ts
Loads manifest at first hit, caches in globalThis._faces. Hashes the
key with djb2-style mixing → pool index → returns the JPG. Same
key always gets the same face (deterministic). Accepts
?g=man|woman&e=caucasian|black|hispanic|south_asian|east_asian|middle_eastern
to bias pool selection — the gender/ethnicity buckets fall back to
the full pool when no tagged matches exist. Cache-Control:
86400 immutable so faces ride the browser cache after first hit.
/headshots/__reload re-reads the manifest without restart.
3. UI — search.html + console.html worker cards
Re-added overlay <img> on top of the monogram .av circle. img.src
= /headshots/<encoded-key>?g=<hint>&e=<hint>. img.onerror removes
the failed image so the monogram stays visible if the face pool
isn't fetched / CDN is blocked. .av now has overflow:hidden +
position:relative to clip the img to a perfect circle.
Forced-confident name resolution (J: "we're CREATING the profile,
created as though you truly have the information Xavier is more
likely Hispanic and he's a male"):
genderFor(name) — looks up MALE_NAMES + FEMALE_NAMES,
falls back to a deterministic hash split
so unknown names spread ~50/50. Sets now
include cross-cultural names: Alejandro/
Andres/Mateo/Santiago/Joaquin/Cesar/Hugo/
Felipe/Gerardo/Salvador/Ramon (Hispanic),
Raj/Anil/Vikram/Krishna/Pradeep (South
Asian), Wei/Yi/Hiroshi/Akira/Hyun (East
Asian), Demetrius/Kareem/DaQuan/Khalil
(Black), Omar/Khalid/Hassan/Ahmed/Bilal
(Middle Eastern). FEMALE_NAMES extended
in parallel.
guessEthnicityFromFirstName(name)
— confident default of 'caucasian' for any
name not in the cultural buckets so every
worker resolves to a category the face
pool can be biased toward. Order: ME → Black
→ Hispanic → South Asian → East Asian →
Caucasian (matters where names overlap,
e.g. Aisha appears in ME + Black, biases
toward ME for visual fit).
Both helpers also ported into console.html so the triage backfills
and try-it-yourself rendering get the same hint stack.
Privacy note in the script + route comments: the synthetic data uses
the worker's name as the seed; production should hash worker_id (not
name) to avoid leaking PII to a third-party CDN. The fetch URL itself
is referenced once per pool build, not per-worker.
.gitignore — added data/headshots/face_*.jpg (~100MB for 198 faces;
the manifest + script are tracked). Re-running the script on a fresh
checkout rebuilds the pool from scratch.
Verified end-to-end via playwright on devop.live/lakehouse:
forklift query → 10 worker cards
10/10 with face images (real synthetic headshots, not monograms)
0/10 broken
Alejandro G. Nelson → ?g=man&e=hispanic
Patricia K. Garcia → ?g=woman&e=caucasian
Each name → unique face, deterministic across loads.
Console triage backfills get the same treatment.
|
||
|
|
cdf5f5926a |
demo: console — sober worker cards (mirror dashboard styling)
J: "can you update Staffer's Console too the same look." Console
rendered worker rows in three places (Chapter 4 permit-contract
candidates, Chapter 8 triage backfills, Chapter 9 try-it-yourself
results) with the original 28px square avatar + flat backgrounds —
inconsistent with the new dashboard design.
Three changes:
1. CSS — .worker now has a 3px left-edge border that color-codes the
role family, and .av is a 32px circle with a muted dark background
+ 1px ring + monogram initials. Five role-band colors mirror
search.html: warehouse blue / production amber / trades purple /
driver green / lead orange. Plus a .role-pill style matching the
dashboard's small uppercase chip.
2. Helpers — added ROLE_BANDS regex table + roleBand() classifier and
a new workerRow(name, role, detail, opts) builder. Same regex
patterns as search.html so a "Forklift Operator" classifies
identically on every page. opts.endorsed adds the green endorsed
chip; opts.score appends a rank badge.
3. Replaced the three inline avatar+row constructors with workerRow()
calls. Net: console.html lost ~20 lines of duplicated DOM building
while gaining role bands + pills.
Verified end-to-end via playwright on devop.live/lakehouse/console:
Chapter 8 triage scenario "Marcus running late site 4422":
5 backfill rows render with [warehouse] band + WAREHOUSE pill +
monogram avatars (SBC, ETW, SHC, WMG, MEB).
Same sober look as the dashboard worker cards. No emojis, no
cartoons, color-coded role family on the left edge.
|
||
|
|
f92b55615f |
demo: worker cards — sober monogram avatars + role bands (no cartoons)
J: "It's two cartoonish right now the website looks like it was made
by first grade teacher." Pulled the DiceBear personas-style headshots
and the emoji role badges. They were generative-illustration playful;
this is supposed to read like a staffing tool, not a kindergarten
attendance sheet.
Replacement design — restraint, signal, no glyphs:
Avatar: 40px circle, monogram initials, muted dark background
(#161b22), 1px ring (#21262d), white-ish text. No image,
no emoji. Looks like a pre-photo placeholder slot in a
real ATS.
Role band: the role gets classified into one of five families:
WAREHOUSE / PRODUCTION / SKILLED TRADE / DRIVER / LEAD
(regex-based; falls back to the first word of the role
for unknown families). Each family has a single muted
color: blue / amber / purple / green / orange. The
color appears as:
- a 3px left border on the .iworker card
- a 2px left border + matching text color on a small
uppercase pill in the detail line
That's it. No images, no emojis, no per-role illustrations. The
staffer sees role-family at a glance via the band color, name and
initials prominently, full role + city + zip in the detail string
behind the pill. Five colors total instead of an eight-color rainbow.
CSS:
.iworker[data-role-band="warehouse"] etc. → 3px left border
.role-pill[data-rb="warehouse"] etc. → matching pill border
JS:
ROLE_BANDS = 6 regex → band+label entries (warehouse, production,
trades, driver, lead, quality)
roleBand(role) = first matching entry, fallback to first
word of role uppercased
Verified end-to-end via playwright on devop.live/lakehouse:
forklift query → 10 cards
every card → monogram avatar + WAREHOUSE pill (blue band)
no images, no emojis, no rainbow
Restart sequence after these edits:
pkill -9 -f "/home/profit/lakehouse/mcp-server/index.ts"
( setsid bun run /home/profit/lakehouse/mcp-server/index.ts \
> /tmp/mcp-server.log 2>&1 < /dev/null & disown )
|
||
|
|
d571d62e9b |
demo: spec — refresh repo layout + model fleet + per-staffer + paths 8-9
J: "how about devop.live/lakehouse/spec." Spec was anchored on
2026-04-21 state (v2 footer): mistral mentioned in the model matrix,
13 crates not 15, missing validator/truth/auditor crates, no mention
of OpenCode 40-model fleet, no pathway memory, no per-staffer
hot-swap, no Construction Activity Signal Engine, ADR count was 20.
Footer claimed Phases 19-25.
Edits, in order:
Ch1 Repository layout
+ crates/truth/ (ADR-021 rule store)
+ crates/validator/ (Phase 43 — schema/completeness/policy gates)
+ auditor/ (cross-lineage Kimi↔Haiku/Opus auto-promote)
+ scripts/distillation/ (frozen substrate v1.0.0 at e7636f2)
Updated aibridge to mention ProviderAdapter dispatch
Updated gateway to mention OpenAI-compat /v1/* drop-in middleware
Updated mcp-server route list to include /staffers + profiler/contractor pages
Updated config/ to mention modes.toml + providers.toml + routing.toml
Updated docs/ ADR count from 20 → 21
Updated data/ to mention _pathway_memory + _auditor/kimi_verdicts
Ch3 Measurement & indexing
REPLACED stale "Model matrix (Phase 20)" T1-T5 table that
mentioned mistral with the current 5-provider fleet:
ollama / ollama_cloud / openrouter / opencode (40 models, one
sk-* key reaches Claude Opus 4.7, GPT-5.5-pro, Gemini 3.1-pro,
Kimi K2.6, GLM, DeepSeek, Qwen, MiniMax, free) / kimi
ADDED 9-rung cloud-first ladder pseudocode
ADDED N=3 consensus + cross-architecture tie-breaker math
ADDED auditor cross-lineage Kimi K2.6 ↔ Haiku 4.5 + Opus auto-promote
ADDED distillation v1.0.0 freeze paragraph (145 tests, 22/22, 16/16)
Updated Continuation primitive to mention Phase 44 Rust port
Ch5 What a CRM can't do
Extended the table with 6 new capabilities:
- Per-staffer relevance gradient
- Triage in one shot (late-worker → backfills + draft SMS)
- Permit → fill plan derivation
- Public-issuer attribution across contractor graph
- Cross-lineage AI audit on every PR
- Pathway memory (system-level hot-swap, ADR-021)
Ch6 How it gets better over time
Lede updated from 7 paths → 10 paths
NEW Path 7 — Pathway memory (ADR-021)
NEW Path 8 — Per-staffer hot-swap index
NEW Path 9 — Construction Activity Signal Engine
Original Path 7 (observer ingest) renumbered to Path 10
Ch9 Per-staffer context
Lede now anchors Path 8 from Ch6
NEW lead section: Per-staffer hot-swap index — Maria/Devon/Aisha,
same query reshapes per coordinator (167 IL / 89 IN / 16 WI),
MARIA'S MEMORY pill, /staffers endpoint, metro-agnostic by
construction. The original Phase 17 profile / Phase 23 competence
sections retained beneath as the deeper architecture detail.
Ch10 A day in the life
Updated 14:00 emergency event to use the late-worker triage
handler — coordinator types "Dave running late site 4422", gets
profile + draft SMS + 5 backfills + Copy SMS button in 250ms.
The old Click No-show button → /log_failure flow remains valid
(penalty still records); the user-facing surface is the new
triage card.
Ch11 Known limits
REPLACED the Mem0/Letta/Phase-26 era list with current honest
limits: BAI persistence + backtesting, NYC DOB adapter, 12
awaiting public-data sources for contractor profile, rate/margin
awareness, Mem0-style UPDATE/DELETE, Letta hot cache (now 5K
not 1.9K), confidence calibration, SEC fuzzy precision, tighter
pathway+scrum integration.
Footer
v2 2026-04-21 → v3 2026-04-27
Phases 19-25 → 19-45
Lists today's phases: distillation v1.0.0 substrate, gateway as
OpenAI-compat drop-in, mode runner, validator + iterate, ADR-021
pathway memory, per-staffer hot-swap, Construction Activity Signal
Engine.
Nav
+ Profiler link
Date pill v1 · 2026-04-20 → v3 · 2026-04-27
Verified end-to-end on devop.live/lakehouse/spec — 11 chapter h2s
render in order, 67KB page (was 50KB-ish), all internal links resolve.
|
||
|
|
631b0329b1 |
demo: proof — full architecture-page rewrite for current state
J: "needs a rewrite." Old version was anchored on a dual-agent
mistral+qwen2.5 loop that hasn't been the model story for weeks,
called the system 13 crates (it's 15), referenced "Local 7B models"
in the honest-limits section, and had no mention of:
- the 40-model OpenCode fleet via one sk-* key
- the 9-rung cloud-first ladder
- N=3 consensus + cross-architecture tie-breaker
- auditor cross-lineage (Kimi K2.6 ↔ Haiku 4.5, Opus auto-promote)
- distillation v1.0.0 frozen substrate (e7636f2)
- pathway memory (88 traces, 11/11 replays, ADR-021)
- per-staffer hot-swap index
- Construction Activity Signal Engine + BAI + ticker network
- the gateway as OpenAI-compat drop-in middleware
Rewrote into 10 chapters:
1. Receipts — live tests + new live tile showing the Signal Engine
view for THIS load (issuer count, attributed build value,
contractor count, attribution edges)
2. Architecture — corrected to 15 crates with current responsibilities;
ASCII diagram showing OpenAI consumers + MCP + Browser all hitting
gateway /v1/*; provider fleet table with all 5 (ollama, ollama_cloud,
openrouter, opencode 40-model, kimi); validator + truth + auditor
crates added
3. Model fleet — REPLACED the dual-agent mistral story. Now: the
9-rung ladder (kimi-k2:1t through openrouter:free → ollama local),
N=3 consensus + tie-breaker math, auditor Kimi↔Haiku alternation
with Opus auto-promote on big diffs, distillation v1.0.0 freeze
tag e7636f2 (145 tests · 22/22 · 16/16 · bit-identical)
4. Two memory layers — kept playbook content (Phase 19 boost math
still load-bearing), added pathway memory (ADR-021) section with
live counters in the page (88 / 11-11 / 100% reuse rate)
5. Per-staffer hot-swap — NEW. Pseudocode showing how staffer_id
scopes state filter + playbook geo + UI relabel to MARIA'S MEMORY
6. Construction Activity Signal Engine — NEW. Three attribution
flavors (direct, parent, associated), BAI math, cross-metro
replication framing (NYC DOB next, then LA / Houston / Boston)
7. Architectural choices — added ADR-021 row + distillation freeze row
8. Measured at scale — kept (uses /proof.json scale data)
9. Verify or dispute — REFRESHED with current endpoints. Removed the
stale "bun run tests/multi-agent/scenario.ts" recipe; added curl
examples for /v1/health, pathway/stats, per-staffer scoping (3-loop
bash script), late-worker triage, profiler_index, ticker_quotes,
auditor verdicts, distillation acceptance gate
10. What we are NOT claiming — REFRESHED. Removed "Local 7B models"
caveat; added: 12 awaiting public-data sources are placeholders,
SEC name-fuzzy has rare false positives, BAI is a thesis not a
backtest yet, single-metro today
Live data probes added:
loadPathwayLive — fills pwm-traces / pwm-replays / pwm-rate spans
loadSignalLive — renders the LIVE Signal Engine tile under Ch1
Nav also gained a Profiler link to match search.html and console.html.
Verified end-to-end on devop.live/lakehouse/proof:
10 chapters render, 5/5 live tests pass, pathway shows 88 traces +
100% reuse rate, live signal tile shows 11 issuers + $347M attributed
+ 200 contractors + 14 attribution edges. Architecture diagram +
crate table accurate as of HEAD.
|
||
|
|
4c46cf6a21 |
demo: console — three new chapters reflecting recent shipments
J: "it's outdated." Console walkthrough was stuck on the original 6
chapters (legacy-bridge / permits / catalog / ranking demo / playbook
memory / try-it-yourself). Three weeks of new work weren't visible.
Three new chapters added between the existing playbook-memory chapter
and the input box; all pull live data from the running system:
Chapter 6 — Three coordinators, three views of the same corpus
Renders Maria/Devon/Aisha cards from /staffers with their
territories. Frames the per-staffer hot-swap as the relevance
gradient that compounds independently per coordinator. Same query
"forklift operators" returns 89 IN / 16 WI / 167 IL workers
depending on who's acting.
Chapter 7 — The hidden signal — public issuers in your contractor graph
Pulls /intelligence/profiler_index, builds the basket, shows
issuer count + attributed build value + contractor count as the
three top metrics. Lists top 8 issuers with attribution counts
and direct-link to the profiler. This is the BAI / Signal Engine
pitch in walkthrough form: every contractor name is also a forward
indicator on a public equity. Cross-metro replication explicit
in the closing paragraph.
Chapter 8 — When something breaks — triage in one shot
Live triage demo against /intelligence/chat with body
{message:"Marcus running late site 4422"}. Renders the worker
card + draft SMS + 5 backfills + duration_ms. The 250ms-vs-20min
moment, made concrete with real Quincy IL workers.
Chapter 9 (was 6) — Try it yourself
Updated input examples to demonstrate each new route:
"8 production workers near 60607" → headcount + zip parser
"Marcus running late site 4422" → triage handler
"Marcus" → bare-name lookup
"what came in last night" → temporal route
"reliable forklift operators with OSHA certs" → hybrid SQL+vector
Each is a click-to-run link beneath the input.
Two new accent classes: .accent-g (green for issuer-count) and
.accent-r (red for triage event).
Verified end-to-end on devop.live/lakehouse/console: 9 chapters
render, ch6 shows 3 staffer personas, ch7 shows 11 issuers / $347M /
200 contractors, ch8 shows Marcus V. Campbell + draft SMS + 5
backfills.
|
||
|
|
6366487b45 |
ops: persist runtime fixes — iterate.rs unused state, catalog cleanup
Two load-bearing runtime changes that were never committed: 1. crates/gateway/src/v1/iterate.rs — `state` → `_state` on the unused route-state parameter. Cleared the one cargo workspace warning. Fix was made earlier this session but the working-tree change never made it into a commit. 2. data/_catalog/manifests/564b00ae-cbf3-4efd-aa55-84cdb6d2b0b7.json — DELETED. This was the dead manifest for `client_workerskjkk`, a typo dataset whose parquet was deleted but whose catalog entry stayed registered. Every SQL query failed schema inference on the missing file before reaching its target table — that's the bug that made /system/summary report 0 workers and the demo show zero bench. Deleting the manifest keeps the fix on disk; committing the deletion keeps it in git so a fresh checkout doesn't regress. 3. data/_catalog/manifests/32ee74a0-59b4-4e5b-8edb-70c9347a4bf3.json — runtime catalog metadata update from the successful_playbooks_live write path. Ride-along change. Reports under reports/distillation/phase[68]-*.md are auto-regenerated by the audit cycle each run; skipping those. |
||
|
|
db81fd8836 |
demo: System Activity panel — capability index reflects every recent shipment
Old panel showed playbook ops + search counts and went empty in a
fresh demo (no operations yet). J: "update System Activity to coincide
with all of our recent updates."
Rebuilt as a live capability index — each tile is a thing the
substrate has learned to do, with the metric proving it's running.
Pulled in parallel from /staffers, /system/summary,
/api/vectors/playbook_memory/stats, /api/vectors/pathway/stats,
/intelligence/profiler_index, /intelligence/activity. Each probe
catches its own error so a single missing endpoint doesn't collapse
the panel.
Nine capability cards (verified end-to-end on devop.live/lakehouse):
1. Per-staffer hot-swap index 3 personas (Maria/Devon/Aisha)
2. Construction Activity Signal Engine 11 issuers · $347M attributed
build value · network 11/14
3. Late-worker / no-show triage one-shot — name+late → backfills+SMS
4. Permit → staffing bridge 24/day, every Chicago permit ≥$250K
5. Hybrid SQL + vector search 500K workers · 5,474 playbook entries
6. Schema-agnostic ingestion 36 datasets · 2.98M rows
7. Contractor profile + project index 6 wired · 12 queued sources
8. Pathway memory 88 traces · 11/11 replays · 100%
9. Ticker association network 11 tickers · 3 direct + 11 associated
Each card carries:
- capability title + ship date pill ("baseline" or "shipped 2026-04-27")
- big metric (live, not pre-baked)
- sub-context line in coordinator language
- "why a staffer cares" explanation
- optional "Open →" deep link to the surface (Profiler, Contractor)
Header + intro paragraph reframed: "what the substrate has learned to
do" instead of "what the substrate has learned." Operational learning
(fills, playbooks, hot-swaps) compounds INSIDE each capability; the
panel surfaces the set of capabilities the corpus knows how to express.
Closing operational-stats row at the bottom shows fills/searches/
recent playbooks when /intelligence/activity has any.
|
||
|
|
a789000982 |
demo: profiler — Construction Activity Signal Engine narrative + BAI
J's prompt: shoot for the stars, frame the data corpus's value as a
predictive signal, not just a contractor directory. The thesis is
that every name in this corpus is also a forward indicator on public
equities — permits filed today predict construction starts in ~45
days, staffing in ~30, revenue recognition months later. The
associated-ticker network surfaces this signal before any 10-Q does.
Two new layers above the basket:
1. HERO THESIS PANEL — "Chicago Construction Activity Signal Engine"
header + 3-line value statement, then 4 live metrics:
- BAI (Building Activity Index) — attribution-weighted average of
day-change % across surfaced issuers. Weight = attribution count
so issuers we have more depth on count more. Today: +0.76%
(9 issuers · top contributors FCBC +2.4%, ACRE +1.7%, JPM +1.5%).
Color-coded green/red.
- Indexed build value — total $ of permits attributable to ANY
public issuer in this view. Today: $344M.
- Network depth — issuers / attribution edges. Today: 9 / 15.
This is the "we see what nobody else sees" metric: how many
contractors are bridges from a private builder back to a public
equity holder.
- Market replication roadmap — chips showing "Chicago — live ·
NYC DOB — adapter ready · LA County · Houston BCD · Boston ISD
· DC DCRA". Frames the corpus as metro-agnostic from day one.
2. PER-TICKER ACTIVITY MAP — when a basket card is clicked, a leaflet
map appears below the basket plotting that ticker's geocoded permit
activity. Pulls /intelligence/contractor_profile for up to 6
attributed contractors, merges their geocoded permits, plots on a
dark Chicago tile layer. Color-banded by permit cost (green <$100K,
amber $100K-$1M, red ≥$1M). Click TGT → 23 Target permits across
Chicago; click JPM → JPMorgan-adjacent contractor activity. Cached
per ticker so toggling is instant.
Verified end-to-end on devop.live/lakehouse/profiler:
Default load: hero panel renders with all 4 metrics, basket strip
with 9 issuers + live prices in 669ms.
Click TGT : signal map activates, "23 geocoded permits across
1 contractor", table filters to 2 rows.
Tooltip on basket cards: full reason path including matched name +
contributors attributed to that ticker.
Architecture-side: zero new server code — all metrics computed
client-side from the existing profiler_index + ticker_quotes payloads.
The corpus already had the value; the page just needed to articulate it.
|
||
|
|
aa56fbce61 |
demo: profiler — scrolling ticker basket with live prices + click-to-filter
J asked: "kind of like a scrolling ticker that has all of the companies
and their stock prices and where they fit in the map." Implemented as
a horizontal-scroll strip at the top of /profiler:
9 public issuers in this view · quotes via Stooq · 669ms
┌────┬────┬────┬────┬────┬────┬────┐
│TGT │JPM │BALY│ACRE│FCBC│NREF│LSBK│ ← live price + day-change per
│129 │311 │... │... │... │... │... │ ticker, color-banded by
│+.17│+1.5│... │... │... │... │... │ attribution kind
└────┴────┴────┴────┴────┴────┴────┘
Each card carries:
- ticker + live price + day-change % (red/green)
- attribution count + kind (exact / direct / parent / associated)
- left bar color = strongest attribution kind (green for direct
issuer, amber for parent, blue for co-permit associated, gradient
when both direct and associated apply)
- tooltip on hover lists the contractors attributed to this ticker
- click toggles a filter on the table below — clicking TGT cuts the
200-row list down to just TARGET CORPORATION + TORNOW, KYLE F
(Target's primary co-permit contractor)
Server-side:
- entity.ts exports fetchStooqQuote (was internal)
- new POST /intelligence/ticker_quotes — accepts {tickers: [...]},
fans out to Stooq.us in parallel, returns
{ticker, price, price_date, open, high, low, day_change_pct,
stooq_url} per symbol or null for non-US listings (HOC.DE, SKA-B.ST,
LLC.AX). Capped at 50 symbols per call.
Front-end:
- mcp-server/profiler.html — new .basket-wrap section above the
controls. buildBasket() runs after profiler_index loads:
1. Aggregates unique tickers from .tickers.direct + .associated
across all surfaced contractors
2. Renders shells immediately (ticker symbol + "—" placeholder)
3. Batch-fetches quotes via /intelligence/ticker_quotes
4. Updates each card with price + day-change in place
Click on a card sets a tickerFilter; render() skips rows whose
attributions don't include that ticker. "clear filter" button on
the basket strip resets it.
Verified end-to-end on devop.live/lakehouse/profiler:
Default load → 9 issuers, live prices populated in 669ms
TGT click → table filters to TARGET CORPORATION + TORNOW, KYLE F
(the contractor who runs 3 of Target's recent permits
gets the TGT correlation indicator)
JPM card → $311.63, +1.55% — JPMorgan-adjacent contractors
Tooltip → list of contractors attributed to the ticker
|
||
|
|
ba41ad2846 |
demo: profiler index — ticker associations (direct, parent, co-permit)
J's framing: "if a contractor works for Target, future Target contracts
mean money flows back to the contractor — the ticker is an associated
indicator." Now the profiler index attaches three flavors of ticker per
contractor and renders them as colored pills:
green DIRECT contractor IS the public issuer (Target Corp → TGT)
amber PARENT contractor is a subsidiary of a public parent
(Turner Construction → HOC.DE via Hochtief AG)
blue ASSOCIATED contractor co-appears on permits with a public
entity (TORNOW, KYLE F → TGT, 3 shared permits with
TARGET CORPORATION)
The associated flavor is the correlation signal J described — it pulls
the ticker for whoever the contractor has been working *with*, not
just what they are themselves. Most contractors are private; the
associated link is how the moat shows up.
Server-side:
- entity.ts new export `lookupTickerLite(name)` — cheap in-memory
resolver that does only the SEC tickers index lookup + curated
KNOWN_PARENT_MAP check, no per-call SEC profile or Stooq fetch.
~10ms per name after the index is loaded once.
- /intelligence/profiler_index now runs a third Socrata pull
(5K permit pairs in window) to build a co-occurrence map. For each
contractor in the result, attaches:
.tickers.direct[] — name matches a public issuer
.tickers.associated[] — top 5 co-permit partners that resolve
to a ticker, with partner_name +
co_permits count + partner_via reason
Front-end:
- mcp-server/profiler.html — new .ticker-pill styles (3 colors per
attribution kind), pills render under the contractor name in the
table. Hover title gives the full reason path.
Verified end-to-end on the public URL:
search="tornow" → blue TGT pill, hint "Associated via co-permits
with TARGET CORPORATION (3 shared permits) —
TARGET CORP"
search="target" → green TGT × 2 (TARGET CORPORATION +
CORPORATION TARGET name variants both resolve
direct to the same issuer)
default top 200 → 15 ticker pills surface across the page including
JPM (via JPMORGAN CHASE BANK co-permits) and
parent-link tickers for the construction majors.
|
||
|
|
f6a7621b2d |
demo: profiler index — directory of every Chicago contractor
J asked for "a profiler index that shows a history of everyone." This
is a /profiler directory page (also reachable via /contractors) that
ranks every contractor who's filed a Chicago permit, by total permit
value. Rows are clickable into the full /contractor profile.
Defaults: since 2025-06-01, min permit cost $250K, top 200 contractors
by total_cost. Server pulls two Socrata GROUP BY queries (one keyed on
contact_1_name, one on contact_2_name), merges them so contractors
listed in either applicant or contractor slot appear once with combined
counts/cost. ~300ms cold.
UI: live search box, since-date selector, min-cost selector, sortable
columns (name / permits / total_cost / last_filed). Live numbers as of
this write: 200 contractors, 1,702 permits, $14.22B aggregate. Filter
"Target" returns TARGET CORPORATION + CORPORATION TARGET (name variants
from Socrata).
Also fixes J's other complaint — "no new contracts, Target is gone":
/intelligence/permit_contracts was hard-capped at $limit=6 + only
the most recent 6 over $250K, so any day with 6 fresh permits would
push older contractors (Target) off the panel entirely. Now defaults
to 24 (caller can pass body.limit up to 100), so 2-3 days of permits
stay on the panel. Added body.contractor — passes a name into the
WHERE so the staffer can pin a specific contractor to the panel
("Target Corporation" → 3 of their permits over $250K).
Server-side:
- new POST /intelligence/profiler_index — paginated contractor index
(since, min_cost, search, limit) with merged contact_1+contact_2
aggregations
- /intelligence/permit_contracts — body.limit + body.contractor
- /profiler and /contractors routes serve profiler.html
Front-end:
- new mcp-server/profiler.html — sortable table, live filter, deep
links to /contractor?name=... (prefix-aware via P, so /lakehouse
works on devop.live)
- search.html + console.html nav: added "Profiler" link
Verified end-to-end via playwright on the public URL.
|
||
|
|
31d8ef918c |
demo: contractor links — respect the /lakehouse path prefix
J reported https://devop.live/contractor?name=3115%20W%20POLK%20ST.%20LLC returned 404. Cause: the anchor href was a bare /contractor, which on devop.live routes to the LLM Team UI (port 5000) at the main site root, not the lakehouse mcp-server (which lives under /lakehouse/*). Every page that renders a contractor link now uses the same prefix detector the dashboard already had: var P = location.pathname.indexOf('/lakehouse') >= 0 ? '/lakehouse' : ''; Files updated: - search.html: entity-brief anchor + preview anchor → P+/contractor - console.html: permit-card contractor list → P+/contractor - contractor.html: history.replaceState + back-link + the /intelligence/contractor_profile fetch all use P prefix. The page is reachable at /lakehouse/contractor on the public URL and bare /contractor on localhost; both work without further config. Verified: https://devop.live/lakehouse/contractor?name=3115%20W%20POLK%20ST.%20LLC → 200, 29.9 KB, full profile renders. Contractor has 1 permit on file (a small LLC), 1 geocoded so the heat map plots one marker. |
||
|
|
a1066db87b |
demo: contractor profile — heat map, project index, 12 awaiting sources
The contractor.html click-target J asked for: a separate page (not a
modal, not a fall-through search) showing every angle on a contractor.
Reachable from the Co-Pilot dashboard, the staffers console, and the
search box — all anchor-wrap contractor names to /contractor?name=...
What's new on the page:
1. PROJECT INDEX — build-signal score
Single 0-100 number with the drivers laid out beneath. Driver list
is staffer-readable: "59 Chicago permits in 180d (+30) · OSHA 20
inspections (-25) · federal contractor (+15)". Score weights are
placeholders to be replaced by an ML model once the 12 awaiting
sources ship — the current 6 wired signals would not give a real
model enough features.
2. HEAT MAP — every Chicago permit they've been contact_1 or contact_2
on, last 24 months, plotted on a leaflet dark map. Color by cost
(green <$100K, amber $100K-$1M, red ≥$1M), radius proportional to
cost so the staffer sees where money + activity concentrates. Click
a marker for permit detail (cost, date, work type, address, permit
ID). All 50 of Turner Construction's geocoded recent permits in
Chicago plot end-to-end.
3. ACTIVITY TIMELINE — monthly permit count, bar chart, with the
first/last month labels so the staffer sees momentum. Tooltip on
each bar gives the count and total cost for that month.
4. 12 AWAITING SOURCES — placeholder cards for the public datasets
that would 3× the build-signal feature count. Each card has:
- source name (real, e.g. DOL Wage & Hour, EPA ECHO, MSHA, BBB)
- one-liner in coordinator language ("Has this contractor stiffed
workers? Will they pay our staffing invoices?")
- "Would show:" sample shape so the engineering scope is concrete
Order is staffing-decision relevance:
1. DOL Wage & Hour (WHD violations)
2. State Licensure Boards (active license + expiry)
3. Surety Bond Capacity (bonding ceiling)
4. EPA ECHO Compliance (env violations at sites)
5. DOT/FMCSA Carrier Safety (crash + OOS rates)
6. BBB Complaints + Rating
7. PACER Civil Suits (FLSA / Title VII / ADA)
8. UCC Lien Filings (cash flow distress)
9. D&B / Credit Bureau (PAYDEX, payment behavior)
10. State UI Employer Claims (workforce stability)
11. MSHA Mine Safety (excavation / aggregate / heavy)
12. Registered Apprenticeships (DOL RAPIDS pipeline)
Server-side: entity.ts fetchContractorHistory now pulls the 50 most
recent permits with id + lat/lng + work_description, so the heat map
and timeline have what they need without a second SQL hop. The
ContractorHistory.recent_permits type gained the optional fields.
Front-end: contractor.html got 4 new render sections, leaflet wiring
(stylesheet + script in head), placeholder grid CSS, and a PLACEHOLDERS
const at the bottom with the 12 sources. All popup HTML is built via
DOM construction (textContent + appendChild) — no innerHTML, no XSS.
console.html: contractor names from /intelligence/permit_contracts now
anchor-wrapped to /contractor?name=... so the click-through J described
works from the staffers console too. Click stops propagation so the
permit details element doesn't toggle on the same click.
Verified end-to-end via playwright — Turner Construction profile shows:
PIX score "Mixed signals — review drivers below"
Heat map: "50 permits plotted · green/amber/red"
4 section labels in order
12 placeholder cards in the documented order
|