golangLAKEHOUSE

profit/golangLAKEHOUSE

Commit Graph


Author

SHA1

Message

Date

root

277884b5eb

multitier_100k: 335k scenarios @ 1,115/sec against 100k corpus, 4/6 at 0% fail

J asked for a much more sophisticated test using the 100k corpus from
the Rust legacy database. This commit ships:

scripts/cutover/multitier/main.go — 6-scenario harness with weighted
random selection per goroutine. Mixes search, email/SMS/fill
validators (in-process via internal/validator), profile swap with
ExcludeIDs, repeat-cache exercise, and playbook record/replay.
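
The no-overlap check behind the profile swap can be sketched roughly as below. This is an illustration only: the function name `overlap` and the worker IDs are hypothetical, and the harness in scripts/cutover/multitier/main.go may do this differently.

```go
package main

import "fmt"

// overlap returns the IDs present in both result sets. An empty result
// means the swapped search honoured the exclusion list.
// (Hypothetical helper, not taken from the harness.)
func overlap(original, swapped []string) []string {
	seen := make(map[string]struct{}, len(original))
	for _, id := range original {
		seen[id] = struct{}{}
	}
	var dup []string
	for _, id := range swapped {
		if _, ok := seen[id]; ok {
			dup = append(dup, id)
		}
	}
	return dup
}

func main() {
	original := []string{"w-101", "w-102", "w-103"} // made-up IDs
	swapped := []string{"w-204", "w-205", "w-206"}
	fmt.Println(len(overlap(original, swapped)) == 0)          // true
	fmt.Println(overlap(original, []string{"w-102", "w-300"})) // [w-102]
}
```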

Scenarios + weights (per-scenario fractions, summing to 100%):
  35% cold_search_email      — search + email outreach + EmailValidator
  15% surge_fill_validate    — search + fill proposal + FillValidator + record
  15% profile_swap           — original search + ExcludeIDs swap + no-overlap check
  15% repeat_cache           — same query × 5 (cache effectiveness)
  10% sms_validate           — SMS draft (≤160 chars, phone for SSN-FP guard)
  10% playbook_record_replay — cold → record → warm w/ use_playbook=true
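
A minimal sketch of weighted random selection over the mix above, assuming integer percent weights and one random source per goroutine. The type and function names (`scenario`, `pick`, `totalWeight`) are hypothetical and not taken from the harness.

```go
package main

import (
	"fmt"
	"math/rand"
)

// scenario pairs a name with its share of the mix, in percent.
type scenario struct {
	name   string
	weight int
}

// mix mirrors the weights listed in the commit message.
var mix = []scenario{
	{"cold_search_email", 35},
	{"surge_fill_validate", 15},
	{"profile_swap", 15},
	{"repeat_cache", 15},
	{"sms_validate", 10},
	{"playbook_record_replay", 10},
}

// totalWeight sums the weights; for the mix above it is 100.
func totalWeight(mix []scenario) int {
	t := 0
	for _, s := range mix {
		t += s.weight
	}
	return t
}

// pick maps a roll in [0, total) onto a scenario by subtracting each
// weight in turn until the roll falls inside one scenario's band.
func pick(mix []scenario, roll int) string {
	for _, s := range mix {
		if roll < s.weight {
			return s.name
		}
		roll -= s.weight
	}
	return mix[len(mix)-1].name // only reached if roll >= total
}

func main() {
	// A private *rand.Rand per goroutine avoids sharing one source.
	rng := rand.New(rand.NewSource(1))
	total := totalWeight(mix)
	counts := map[string]int{}
	for i := 0; i < 100000; i++ {
		counts[pick(mix, rng.Intn(total))]++
	}
	for _, s := range mix {
		fmt.Printf("%-24s %d\n", s.name, counts[s.name])
	}
}
```

Over many draws the counts should approach the listed fractions, which is consistent with the roughly 35/15/15/15/10/10 split in the reported scenario totals.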

Test results (5-min sustained, conc=50, 100k workers indexed):
  TOTAL 335,257 scenarios @ 1,115/sec
  cold_search_email     117k @ 0.0% fail · p50 2.2ms · p99 8.6ms
  surge_fill_validate    50k @ 98.8% fail (substrate bug below)
  profile_swap           50k @ 0.0% fail · p50 4.5ms · ExcludeIDs verified
  repeat_cache           50k × 5 ≈ 252k searches @ 0.0% fail · p50 11.7ms
  sms_validate           33k @ 0.0% fail · phone-pattern guard works
  playbook_record_replay 33k @ 96.8% fail (substrate bug below)
  Total successful workflows: ~250k+
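
For reference, p50/p99 and fail-rate figures like those above can be derived from raw samples along these lines. This sketch assumes the nearest-rank percentile definition; the harness may use a different estimator, and the sample values are made up rather than taken from the run.

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (p in 1..100) of
// latencies given in milliseconds. It sorts a copy of the input.
func percentile(ms []float64, p int) float64 {
	if len(ms) == 0 {
		return 0
	}
	sorted := append([]float64(nil), ms...)
	sort.Float64s(sorted)
	rank := (p*len(sorted) + 99) / 100 // ceil(p/100 * n) in integer math
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// failRate returns failures as a percentage of total attempts.
func failRate(failed, total int) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(failed) / float64(total)
}

func main() {
	lat := []float64{4, 1, 3, 2, 10} // illustrative samples, not measured data
	fmt.Printf("p50=%.1fms p99=%.1fms\n", percentile(lat, 50), percentile(lat, 99))
	fmt.Printf("fail=%.1f%%\n", failRate(3, 200))
}
```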

Validator integration verified at load:
  150,930 EmailValidator passes across cold_search_email + sms_validate
  35 successful FillValidator passes + 1,061 successful playbook_record
    calls (where the bug didn't fire)
  zero false positives on the SSN-pattern guard against phone numbers
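
As a rough illustration of why a phone number need not trip an SSN-pattern guard: an SSN-shaped string groups digits 3-2-4, while a US phone number groups them 3-3-4. The regex, the `validateSMS` name, and the rune-based length check below are assumptions for the sketch, not the rules in internal/validator.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"unicode/utf8"
)

// ssnLike is an assumed pattern: digit groups of 3, 2 and 4 separated
// by hyphens. A 3-3-4 phone number does not fit this shape.
var ssnLike = regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)

// maxSMSChars is the 160-character limit from the scenario description.
const maxSMSChars = 160

// validateSMS rejects drafts that are too long or contain an SSN-like
// string. Counting runes is a simplification of real SMS encoding rules.
func validateSMS(body string) error {
	if n := utf8.RuneCountInString(body); n > maxSMSChars {
		return fmt.Errorf("sms too long: %d chars (max %d)", n, maxSMSChars)
	}
	if ssnLike.MatchString(body) {
		return errors.New("sms contains an SSN-like pattern")
	}
	return nil
}

func main() {
	fmt.Println(validateSMS("Shift open tomorrow. Call 555-123-4567 to confirm."))
	fmt.Println(validateSMS("Please confirm SSN 123-45-6789."))
}
```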

Resource footprint at 100k:
  vectord 1.23GB RSS (linear with 100k vectors)
  matrixd 26MB, 75% CPU (1-core saturated at conc=50)
  Total across 11 daemons: 1.7GB
  Compare to Rust at 14.9GB: roughly 9× less (14.9 / 1.7 ≈ 8.8) even at 100k.

SUBSTRATE BUG SURFACED: coder/hnsw v0.6.1 nil-deref in
layerNode.search at graph.go:95. Triggers on /v1/matrix/playbooks/record
under sustained writes to the small playbook_memory index. Both Add
and Search paths can panic.

Workaround applied (this commit) in internal/vectord/index.go
BatchAdd: recover() guard converts panic to error; daemon stays up
instead of crashing the request handler.
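
The general shape of such a recover() guard is sketched below. The names `safeBatchAdd` and `addFunc` are hypothetical stand-ins; the actual BatchAdd in internal/vectord/index.go is not reproduced here and may differ.

```go
package main

import "fmt"

// addFunc stands in for the underlying index insert, which may panic.
// (Hypothetical type, used only for this sketch.)
type addFunc func(ids []string, vecs [][]float32)

// safeBatchAdd runs add and converts any panic into an error through a
// deferred recover(). The named return value lets the deferred
// function overwrite the result after the panic unwinds.
func safeBatchAdd(add addFunc, ids []string, vecs [][]float32) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("index add panicked: %v", r)
		}
	}()
	add(ids, vecs)
	return nil
}

func main() {
	// Simulate a nil dereference inside the library call.
	bad := func(ids []string, vecs [][]float32) {
		var p *int
		*p = 1 // nil-pointer write, panics at runtime
	}
	good := func(ids []string, vecs [][]float32) {}

	fmt.Println(safeBatchAdd(bad, nil, nil) != nil)  // true
	fmt.Println(safeBatchAdd(good, nil, nil) == nil) // true
}
```

Note that recover() only intercepts panics raised on the same goroutine, and the index may be left in an inconsistent state after a recovered panic, which is presumably why the operator recovery procedure below deletes and recreates it.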

Operator recovery procedure (also documented in the report):
  curl -X DELETE http://localhost:4215/vectors/index/playbook_memory
Next record recreates the index fresh.

Real fix DEFERRED — open in docs/ARCHITECTURE_COMPARISON.md
Decisions tracker. Three options:
  a) upstream patch to coder/hnsw
  b) custom small-index Add path that always rebuilds when len < threshold
  c) alternate store for playbook_memory (Lance? in-memory map?)
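
To make option (c) concrete, one possible shape for an in-memory map store is an exact linear scan with cosine similarity. This is a sketch under the assumption that playbook_memory stays small enough for a full scan per query; `flatStore`, `hit`, and the metric are hypothetical and not part of the codebase.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"sync"
)

// hit is one search result: an ID and its similarity to the query.
type hit struct {
	ID    string
	Score float64
}

// flatStore keeps vectors in a map and searches by brute force.
type flatStore struct {
	mu   sync.RWMutex
	vecs map[string][]float32
}

func newFlatStore() *flatStore {
	return &flatStore{vecs: make(map[string][]float32)}
}

// Add stores a copy of v under id, replacing any previous entry.
func (s *flatStore) Add(id string, v []float32) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.vecs[id] = append([]float32(nil), v...)
}

// cosine returns cosine similarity, or 0 for mismatched or zero vectors.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Search scores every stored vector and returns the top k, highest
// score first, with ties broken by ID for deterministic output.
func (s *flatStore) Search(q []float32, k int) []hit {
	s.mu.RLock()
	defer s.mu.RUnlock()
	hits := make([]hit, 0, len(s.vecs))
	for id, v := range s.vecs {
		hits = append(hits, hit{ID: id, Score: cosine(q, v)})
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].ID < hits[j].ID
	})
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	s := newFlatStore()
	s.Add("pb-1", []float32{1, 0}) // made-up playbook IDs
	s.Add("pb-2", []float32{0, 1})
	s.Add("pb-3", []float32{1, 1})
	for _, h := range s.Search([]float32{1, 0}, 2) {
		fmt.Printf("%s %.3f\n", h.ID, h.Score)
	}
}
```

A scan like this has no graph structure to corrupt under concurrent writes, at the cost of O(n) work per query, so it would only suit an index that is expected to remain small.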

Evidence: reports/cutover/multitier_100k.md (full methodology +
results + repro + bug analysis). docs/ARCHITECTURE_COMPARISON.md
Decisions tracker updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-01 06:28:50 -05:00

root

2a974d6dea

docs: ARCHITECTURE_COMPARISON.md as living source file

Per J's request: move the parallel-runtime comparison from
reports/cutover/ (where it lived as cutover-prep evidence) into
docs/ as the source-of-truth file. J will keep updating it as
fixes ship on either side.

Restructured for living-document use:
- Status header (last refresh date, owner, update triggers)
- 'How to update this doc' section with explicit dos and don'ts
- Decisions tracker at top — actioned items with commit refs
  + open backlog with LOC estimates
- Each comparison section now has 'Last verified' columns where
  numbers are time-sensitive
- Change log section at bottom for one-line entries on every
  meaningful refresh

The original at reports/cutover/architecture_comparison.md gains
a 'THIS IS A SNAPSHOT' header pointing at the docs/ source. Kept
as historical record but no longer the place to update.

Sister pointer file in /home/profit/lakehouse/docs/ARCHITECTURE_COMPARISON.md
so the doc is reachable from either repo side. That file explicitly
says the source lives in golangLAKEHOUSE and warns against
authoritative content in the pointer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-01 04:56:20 -05:00

2 Commits