-
eb4308d8fd
real_006 diagnosis: Q43 leak is cross-city, not cross-role
main
root
2026-05-05 05:13:19 -05:00
-
95f155b017
real_006: distribution-shift test on rows 10-59 of fill_events
root
2026-05-05 04:54:03 -05:00
-
0e530f4436
drift fix: validatord in start_go_stack + parity refresh
root
2026-05-05 03:27:47 -05:00
-
c0a55b1182
parity reports: regenerated 2026-05-03 morning verification
root
2026-05-03 05:27:14 -05:00
-
df2a9d1f77
catalogd: BiometricCollection on Go SubjectManifest reader
root
2026-05-03 04:55:49 -05:00
-
857ca4c971
catalogd: HTML-safe escape fix + decisions tracker entry
root
2026-05-03 04:29:53 -05:00
-
262a77a52a
subject-audit parity (Step 8) — Go reader + cross-runtime probe
root
2026-05-03 04:17:15 -05:00
-
22c0b42e96
config: mark unwired ModelsConfig tier fields as scaffolding
root
2026-05-03 02:54:10 -05:00
-
5d3996b51d
STATE_OF_PLAY: Rust is not maintenance-only as of 2026-05-02
root
2026-05-02 22:24:14 -05:00
-
916923440a
docs(comparison): close Lance backend deferral, reframe as Lance-vs-Parquet+HNSW
root
2026-05-02 22:10:46 -05:00
-
b314ed1c94
parity: /v1/embed cross-runtime probe (5th probe, 8/8 cosine match)
root
2026-05-02 06:28:40 -05:00
-
a21a34b057
docs: close 2 cross-runtime parity gaps + document unified log
root
2026-05-02 06:25:21 -05:00
-
6847bbc180
validatord: honor X-Lakehouse-Trace-Id even when Langfuse is off
root
2026-05-02 06:16:25 -05:00
-
1263720497
validatord: always populate session_id (fallback when Langfuse off)
root
2026-05-02 06:03:43 -05:00
-
fa4e1b4e16
parity: session_log probe + Rust observability parity recorded
root
2026-05-02 05:39:49 -05:00
-
1a3a82aedb
validatord: coordinator session JSONL for offline analysis (B follow-up)
root
2026-05-02 05:22:09 -05:00
-
d6d2fdf81f
trace-id propagation through /v1/iterate (multi-call observability)
root
2026-05-02 05:13:18 -05:00
-
afdeca80d9
docs: record Python sidecar drop in architecture comparison
root
2026-05-02 05:00:51 -05:00
-
7d6636b33e
validator: align ValidationError JSON to Rust serde shape (6/6 parity)
root
2026-05-02 04:49:28 -05:00
-
b0c8a3f227
parity probes: materializer + extract_json (caught + fixed real bug)
root
2026-05-02 04:43:54 -05:00
-
e8cf113af8
gauntlet 2026-05-02: smoke chain + per-component scrum + parity probe
root
2026-05-02 04:05:18 -05:00
-
f9e72412c1
validatord: /v1/validate + /v1/iterate HTTP surface (port 3221)
root
2026-05-02 03:53:20 -05:00
-
09299a27b7
scrum 2026-05-02: materializer+replay+vectord — ship-with-fixes
root
2026-05-02 03:35:12 -05:00
-
89ca72d471
materializer + replay ports + vectord substrate fix verified at scale
root
2026-05-02 03:31:02 -05:00
-
277884b5eb
multitier_100k: 335k scenarios @ 1,115/sec against 100k corpus, 4/6 at 0% fail
root
2026-05-01 06:28:50 -05:00
-
3a2823c02f
g5 cutover: bigger load test — 5.87M req, 0 errors, 370MB RSS
root
2026-05-01 05:18:00 -05:00
-
2a974d6dea
docs: ARCHITECTURE_COMPARISON.md as living source file
root
2026-05-01 04:56:20 -05:00
-
b03521a506
validator: port FillValidator + EmailValidator from Rust validator crate
root
2026-05-01 04:49:55 -05:00
-
b3ad14832d
architecture_comparison: Rust vs Go lakehouse — weaknesses, strengths, abstracts to address
root
2026-05-01 04:34:24 -05:00
-
c164a3da96
g5 cutover: production load test — 0 errors / 101k req · Go direct = 2,772 RPS
root
2026-05-01 04:20:41 -05:00
-
6507dff26d
g5 cutover: first 5-loop end-to-end through Bun frontend
root
2026-05-01 04:14:21 -05:00
-
c522acec8b
g5 cutover slice live — first real Bun-frontend traffic to Go substrate
root
2026-05-01 03:45:41 -05:00
-
4fd560cad6
start_go_stack.sh: third isolation layer (port range :4xxx for persistent)
root
2026-05-01 03:26:41 -05:00
-
c48b58ff8d
start_go_stack.sh: 2-layer isolation from smoke harness
root
2026-05-01 03:20:00 -05:00
-
77a3dcf266
cutover: first end-to-end coordinator query against persistent Go stack
root
2026-05-01 03:10:09 -05:00
-
54b2e7db76
start_go_stack.sh: document smoke-vs-persistent-stack pkill conflict
root
2026-05-01 02:56:52 -05:00
-
09904d5222
cutover: persistent Go stack milestone — first long-running deployment + first Go-emitted audit_baselines entry
root
2026-05-01 02:55:29 -05:00
-
ee2a40c505
audit-FULL: port phases 1/2/5/7 — only acceptance.ts (TS-only) remains skipped
root
2026-05-01 02:35:13 -05:00
-
55b8c76a8c
distillation: audit-FULL pipeline port (phases 0/3/4) — cross-runtime metric parity verified
root
2026-05-01 01:30:23 -05:00
-
eb0dfdff04
vectord: v2 envelope + handleMerge robustness — actions post_role_gate_v1 scrum
root
2026-05-01 01:20:37 -05:00
-
0d4f033b34
audit_baselines: round-trip validation against live Rust data
root
2026-05-01 00:19:36 -05:00
-
ca142b9271
distillation: audit-baselines lineage port — fully closes the OPEN #2 surface
root
2026-05-01 00:11:47 -05:00
-
7bb432f6c8
distillation: full SFT export port — closes OPEN #2 fully
root
2026-05-01 00:06:57 -05:00
-
b216b7e5b6
fix the other 4: close all OPEN-list items in one wave
root
2026-04-30 23:42:11 -05:00
-
356d76b4b0
multi_coord_stress: thread role through matrix retrieve + playbook record
root
2026-04-30 23:10:49 -05:00
-
cca32344f3
reality_test real_005: negation probe — substrate gap is correctly out-of-scope
root
2026-04-30 23:06:06 -05:00
-
434f466288
matrix: roleNormalize allowlist for non-plural-s tokens (scrum role_gate_v1)
root
2026-04-30 22:58:02 -05:00
-
0331288641
playbook_lift: LLM-based role extractor closes shorthand bleed (real_004)
root
2026-04-30 22:51:27 -05:00
-
3263254f1c
reality_test real_003: 40-query paraphrase stress + extractor extension
root
2026-04-30 21:42:02 -05:00
-
997527be4d
matrix: cross-role playbook gate — closes real_001 bleed (OPEN #1)
root
2026-04-30 20:34:10 -05:00
-
7f2f112e6a
reality_test real_001: real-shape coordinator queries — surfaces cross-role bleed
root
2026-04-30 20:18:40 -05:00
-
5687ec65c2
G5 cutover prep: embed parity probe — Rust /ai/embed ↔ Go /v1/embed verified
root
2026-04-30 20:07:04 -05:00
-
a2fa9a2ce7
scripts/scrum_review: pipe diff via temp files — fixes argv overflow on large bundles
root
2026-04-30 19:57:34 -05:00
-
68d9e554b0
shared: auto-emit Langfuse trace+span per HTTP request — closes OPEN #2
root
2026-04-30 19:55:42 -05:00
-
5a3364f539
matrix: judge-gated Shape B inject — closes lift-suite tail issues
root
2026-04-30 19:38:12 -05:00
-
247e36e687
STATE_OF_PLAY: trim OPEN list — 9 rows → 6, ordered by product leverage
root
2026-04-30 19:32:31 -05:00
-
54a05d9311
Sprint 4 deployment artifacts: Dockerfile + docker-compose
root
2026-04-30 18:58:47 -05:00
-
a59ef5b930
Sprint 4 deployment artifacts: 11 systemd units + REPLICATION.md + env templates
root
2026-04-30 18:54:49 -05:00
-
814197cfd3
ADR-006: auth posture for non-loopback deploy + token rotation impl
root
2026-04-30 17:51:14 -05:00
-
6c93a38093
scrum multi_coord_phase3: 4 fixes from cross-lineage review
root
2026-04-30 17:42:07 -05:00
-
f971e64745
g2_smoke: accept nomic-embed-text* family members as default
root
2026-04-30 17:37:20 -05:00
-
db2e57402e
STATE_OF_PLAY: capture multi-coord stress wave (Phase 1-3 verified)
root
2026-04-30 17:30:04 -05:00
-
5d49967833
multi_coord_stress: full Langfuse coverage — every phase + every call
root
2026-04-30 16:43:32 -05:00
-
08a086779b
multi_coord_stress: fresh_workers two-tier index — fresh-resume now top-1
root
2026-04-30 16:31:45 -05:00
-
7e6431e4fd
langfuse: Go-side client + Phase 1c instrumentation
root
2026-04-30 16:25:03 -05:00
-
ce940f4a14
multi_coord_stress: judge re-rates inbox top-1 — recovers honesty signal
root
2026-04-30 16:16:49 -05:00
-
186d209aae
multi_coord_stress: LLM-parsed inbox demands (qwen2.5)
root
2026-04-30 14:51:19 -05:00
-
e7fc63b216
observerd: /observer/inbox + multi-coord stress phase 1c (priority-ordered events)
root
2026-04-30 08:34:36 -05:00
-
4da32ad102
embedd: bump default to nomic-embed-text-v2-moe (475M MoE, 768d drop-in)
root
2026-04-30 08:26:52 -05:00
-
84a32f0d29
multi-coord stress Phase 2: ExcludeIDs + fresh-resume + 200-worker swap
root
2026-04-30 08:19:29 -05:00
-
0fa42a0cc3
multi-coord stress Phase 1.5: shared-role contracts + paraphrase handover
root
2026-04-30 08:03:16 -05:00
-
61c7b55e48
multi-coord stress harness — Phase 1 of 48-hour mock
root
2026-04-30 07:55:29 -05:00
-
b13b5cd7a1
playbook_lift v4 metric: warm-top-1 re-judge — quality lift +24%/-14%
root
2026-04-30 07:42:04 -05:00
-
87cbd10090
STATE_OF_PLAY: v4 split-threshold result + adjacent-query observation
root
2026-04-30 07:26:23 -05:00
-
67d1957b87
matrix: split boost / inject thresholds — kills Shape B cross-pollination
root
2026-04-30 07:24:55 -05:00
-
94fc3b67ec
STATE_OF_PLAY: capture v3 reality test + Shape B + cross-pollination
root
2026-04-30 07:09:31 -05:00
-
154a72ea5e
matrix: Shape B — inject playbook misses + 6/6 paraphrase recovery
root
2026-04-30 07:06:13 -05:00
-
e9822f025d
playbook_lift v2: paraphrase pass + run #002 finds boost-only limit
root
2026-04-30 06:47:41 -05:00
-
9ce067bd9d
observerd: test that locks ADR-005 Decision 5.3
root
2026-04-30 06:35:41 -05:00
-
2c71d1c637
ADR-005: observer fail-safe semantics
root
2026-04-30 06:32:12 -05:00
-
6c02c905c8
scrum lift_001: 4 fixes from cross-lineage review
root
2026-04-30 06:27:24 -05:00
-
b2e45f7f26
playbook_lift: harness expansion + reality test #001 (7/8 lift, 87.5%)
root
2026-04-30 06:22:21 -05:00
-
740eb0d00c
scrum_review: switch curl to stdin so large diffs don't blow argv
root
2026-04-30 02:46:52 -05:00
-
511083ae40
docs: SPEC §3.9 (chatd) + §3.10 (local-review-harness sibling)
root
2026-04-30 01:01:23 -05:00
-
c5c31b6ca6
docs: STATE_OF_PLAY.md — Go-side truth anchor (mirrors Rust convention)
root
2026-04-30 00:37:24 -05:00
-
e4ee0029c0
scrum_review.sh: reusable 3-lineage cross-review driver
root
2026-04-30 00:29:36 -05:00
-
0efc7363c5
scrum 2026-04-30: 4 real fixes + 2 INFOs from cross-lineage review
root
2026-04-30 00:28:08 -05:00
-
05273ac06b
phase 4: chatd — multi-provider LLM dispatcher (ollama / cloud / openrouter / opencode / kimi)
root
2026-04-30 00:08:29 -05:00
-
848cbf5fef
phase 3: playbook_lift harness reads judge from config
root
2026-04-29 23:57:28 -05:00
-
622e124b8f
phase 2: matrix.downgrade reads WeakModels from config
root
2026-04-29 23:52:18 -05:00
-
ec1d031996
phase 1: add [models] tier config — additive, no callers migrate yet
root
2026-04-29 23:48:45 -05:00
-
3dd7d9fe30
reality-tests: playbook-lift harness — does the 5-loop substrate beat raw cosine?
root
2026-04-29 23:22:36 -05:00
-
8278eb9a87
scrum2 cleanup: JSON-marshal in stringifyValue, drop dead detectCycle, name SourceWorkflow
root
2026-04-29 23:16:07 -05:00
-
c41698acae
scrum rerun-2 — 50/60 (Δ R1 +7, Δ baseline +15) at c7e3124
root
2026-04-29 23:13:01 -05:00
-
c7e3124208
§3.8 second slice: real modes wired (matrix.relevance/downgrade/search, distillation.score, drift.scorer)
root
2026-04-29 20:39:26 -05:00
-
e30da6e5aa
§3.8 first slice: workflow runner skeleton + DAG executor + observerd integration
root
2026-04-29 20:34:30 -05:00
-
97dd3f826d
SPEC §3.5/§3.6/§3.7/§3.8 — name F/B/C as port targets + add Archon-style workflow runner
root
2026-04-29 20:27:41 -05:00
-
bc9ab93afe
H: observerd — autonomous-iteration witness loop (SPEC §2 port)
root
2026-04-29 20:18:02 -05:00
-
6392772f41
C: bulk playbook record — operational rating wiring
root
2026-04-29 20:10:13 -05:00
-
b199093d1f
B: matrix metadata filter — post-retrieval structured gate
root
2026-04-29 20:08:56 -05:00