golangLAKEHOUSE

Author	SHA1	Message	Date
root	d6d2fdf81f	trace-id propagation through /v1/iterate (multi-call observability)	2026-05-02 05:13:18 -05:00

Closes J's 2026-05-02 multi-call observability gap: a single /v1/iterate session with N retries used to surface in Langfuse as N+1 disconnected traces (one per /v1/chat hop + one for the iterate request itself), with no parent/child linkage. Operators couldn't scroll the retry chain in one trace tree to spot where grounding failed.

## Wire-level change

- New header constant `shared.TraceIDHeader = "X-Lakehouse-Trace-Id"`
- `langfuseMiddleware` honors the header on inbound requests: if set, reuses that trace id instead of minting a new one. Stashes the trace id on the request context so handlers can attach application-level child spans.
- `validatord.chatCaller` forwards the header to chatd. Every chat hop in an iterate session lands as a child of the parent trace.

## Application-level spans

- `validator.IterateConfig` gains `Tracer` (optional callback). When wired, each iteration attempt emits one Langfuse span via `validator.AttemptSpan`:
    Name:   iterate.attempt[N]
    Input:  { iteration, model, provider, prompt }
    Output: { verdict, raw, error }
    Level:  WARNING when verdict != accepted
- `validatord.iterTracer` is the production hook; it bridges `validator.Tracer` → `langfuse.Client.Span`.
- `IterateRequest`/`IterateResponse`/`IterateFailure` gain `TraceID`; each `IterateAttempt` gains `SpanID`. The /v1/iterate caller can pivot from the JSON response straight into the Langfuse trace tree.

## What an operator sees post-cutover

  GET /v1/iterate {kind=fill, prompt=...}
    → Trace TR-1
       ├─ http.request span (from middleware)
       ├─ iterate.attempt[0] span (validator.Iterate emit)
       │    input: prompt+model
       │    output: { verdict: validation_failed, error: ..., raw }
       ├─ chatd /v1/chat call (X-Lakehouse-Trace-Id: TR-1)
       │    ├─ http.request span (chatd middleware)
       │    └─ chatd-internal spans (existing)
       ├─ iterate.attempt[1] span
       └─ ...

All in one Langfuse trace tree, not N+1 separate traces.

## Hallucinated-worker safety net is unchanged

The /v1/iterate flow's hard correctness gate is still FillValidator + WorkerLookup. Phantom candidate IDs raise ValidationError::Consistency which 422s and forces the iteration loop to retry. The trace-id propagation is the OBSERVABILITY layer on top: it makes the existing safety net's outcomes visible per-call, not a replacement for it.

## Verification

- internal/validator: 4 new tests
  - TestIterate_TracerEmitsSpanPerAttempt: span/attempt count + SpanID
  - TestIterate_NoTraceIDSkipsTracer: no orphan spans without trace_id
  - TestIterate_ChatCallerReceivesTraceID: propagation contract
  - (existing iterate tests updated for new ChatCaller signature)
- internal/shared: 1 new test
  - TestLangfuseMiddleware_HonorsTraceIDHeader: cross-service linkage
- cmd/validatord: existing HTTP tests still PASS via the dual-shape UnmarshalJSON contract.
- validatord_smoke.sh: 5/5 PASS through gateway :3110 (unchanged).
- Full go test ./... green across 33 packages.

## Architecture invariant added

STATE_OF_PLAY "DO NOT RELITIGATE" gains a paragraph documenting the X-Lakehouse-Trace-Id header contract + the iterate.attempt[N] span emission. Future-Claude won't re-propose "wire trace-id propagation"; the header IS the wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
root	7d6636b33e	validator: align ValidationError JSON to Rust serde shape (6/6 parity)	2026-05-02 04:49:28 -05:00

Closes the 2026-05-02 parity finding: validator_parity probe found 5/6 body shapes diverging because Go emitted {"Kind":"...","Field":"...","Reason":"..."} while Rust emits the externally-tagged-enum {"Schema":{"field":"...","reason":"..."}}. A caller parsing the error envelope would break silently in cutover.

## Changes

internal/validator/types.go:
- Custom MarshalJSON emits the Rust shape:
    Schema:       {"Schema": {"field":"x","reason":"y"}}
    Completeness: {"Completeness":{"reason":"y"}}
    Consistency:  {"Consistency": {"reason":"y"}}
    Policy:       {"Policy": {"reason":"y"}}
- Custom UnmarshalJSON accepts BOTH the new Rust shape AND the legacy flat shape (migration safety for any persisted error rows).
- Unknown variants (e.g. a future Rust addition Go hasn't learned) surface as an Unmarshal error, not a silent default.

internal/validator/types_test.go:
- 4 pinning tests anchor the wire format. Failing them = wire-format drift; the parity probe is the secondary line of defense.

scripts/validatord_smoke.sh:
- Updated probes to read the new variant-name shape (jq keys[0], .Schema.field) instead of legacy .Kind/.Field.

## Verification

- internal/validator unit tests: PASS (4 new + all existing).
- cmd/validatord HTTP tests: PASS (UnmarshalJSON falls through to flat shape so existing tests reading ValidationError still work).
- validatord_smoke.sh: 5/5 PASS through gateway :3110.
- validator parity probe re-run: 6/6 match (was 1/6).

## Pattern

Per architecture_comparison's "use the dual-implementation as a measurement instrument" thesis: a parity probe surfaced this gap; 50 LOC of MarshalJSON closed it; 4 pinning tests prevent regression; the probe is the longitudinal gate. Cutover-friendly direction (Go matches Rust) chosen because Rust is the existing production contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
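
The dual-shape contract in 7d6636b33e (emit the externally tagged shape, accept both it and the legacy flat shape, reject unknown variants) might look something like the sketch below. It is an illustration under assumptions, not the code in internal/validator/types.go: the struct layout, the `payload` helper, the `encode`/`decode` wrappers, and the assumption that legacy `Kind` values equal the variant names are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ValidationError uses the field names shown in the legacy flat shape.
// Assumption: Kind holds one of the four variant names below.
type ValidationError struct {
	Kind   string
	Field  string // only meaningful for Schema
	Reason string
}

var knownKinds = map[string]bool{
	"Schema": true, "Completeness": true, "Consistency": true, "Policy": true,
}

// payload is the inner object of the externally tagged shape (hypothetical helper).
type payload struct {
	Field  string `json:"field,omitempty"`
	Reason string `json:"reason"`
}

// MarshalJSON emits {"<Variant>": {"field": ..., "reason": ...}}.
func (e ValidationError) MarshalJSON() ([]byte, error) {
	if !knownKinds[e.Kind] {
		return nil, fmt.Errorf("unknown variant %q", e.Kind)
	}
	return json.Marshal(map[string]payload{e.Kind: {Field: e.Field, Reason: e.Reason}})
}

// UnmarshalJSON accepts the tagged shape and the legacy flat shape, and
// returns an error for variants it does not know.
func (e *ValidationError) UnmarshalJSON(data []byte) error {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	if _, legacy := raw["Kind"]; legacy {
		var flat struct{ Kind, Field, Reason string }
		if err := json.Unmarshal(data, &flat); err != nil {
			return err
		}
		if !knownKinds[flat.Kind] {
			return fmt.Errorf("unknown variant %q", flat.Kind)
		}
		*e = ValidationError{Kind: flat.Kind, Field: flat.Field, Reason: flat.Reason}
		return nil
	}
	if len(raw) != 1 {
		return fmt.Errorf("expected exactly one variant key, got %d", len(raw))
	}
	for kind, body := range raw {
		if !knownKinds[kind] {
			return fmt.Errorf("unknown variant %q", kind)
		}
		var p payload
		if err := json.Unmarshal(body, &p); err != nil {
			return err
		}
		*e = ValidationError{Kind: kind, Field: p.Field, Reason: p.Reason}
	}
	return nil
}

// encode and decode are small wrappers used by the demo below.
func encode(e ValidationError) string {
	b, err := json.Marshal(e)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func decode(s string) (ValidationError, error) {
	var e ValidationError
	err := json.Unmarshal([]byte(s), &e)
	return e, err
}

func main() {
	fmt.Println(encode(ValidationError{Kind: "Schema", Field: "fingerprint", Reason: "required"}))
	// prints {"Schema":{"field":"fingerprint","reason":"required"}}

	legacy, err := decode(`{"Kind":"Policy","Field":"","Reason":"blocked"}`)
	fmt.Println(legacy.Kind, legacy.Reason, err) // prints Policy blocked <nil>

	_, err = decode(`{"Quota":{"reason":"new"}}`)
	fmt.Println(err != nil) // prints true
}
```

One consequence of `omitempty` in this sketch is that a Schema error with an empty field name would drop the `field` key; a real implementation that must always emit it for Schema would need a per-variant payload type.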
root	f9e72412c1	validatord: /v1/validate + /v1/iterate HTTP surface (port 3221)	2026-05-02 03:53:20 -05:00

Closes the last "Go primary" backlog item in docs/ARCHITECTURE_COMPARISON.md. Go now owns the entire validator path end-to-end: no Rust dep for staffing safety net.

Architecture: cmd/validatord on :3221 hosts both endpoints. Calls chatd directly for the iterate loop's LLM hop (no gateway self-loopback like the Rust shape). Gateway proxies /v1/validate + /v1/iterate to validatord.

What's in:
- internal/validator/playbook.go: 3rd validator kind (PRD checks: fill: prefix, endorsed_names ≤ target_count×2, fingerprint required)
- internal/validator/lookup_jsonl.go: JSONL roster loader (Parquet deferred; producer one-liner documented in package comment)
- internal/validator/iterate.go: ExtractJSON helper + Iterate orchestrator with ChatCaller seam for unit tests
- cmd/validatord/main.go: HTTP routes, roster load, chat client
- internal/shared/config.go: ValidatordConfig + gateway URL field
- lakehouse.toml: [validatord] section
- cmd/gateway/main.go: proxy routes for /v1/validate + /v1/iterate

Smoke: 5/5 PASS through gateway :3110:
  ✓ playbook happy path
  ✓ playbook missing fingerprint → 422 schema/fingerprint
  ✓ phantom candidate W-PHANTOM → 422 consistency
  ✓ unknown kind → 400
  ✓ roster loaded with 3 records

go test ./... green across 33 packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
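
To make the "ExtractJSON helper + Iterate orchestrator with ChatCaller seam" in f9e72412c1 concrete, here is a rough sketch of how such a retry loop could be wired so that unit tests can swap the LLM call for a stub. The signatures, the `matchBrace` helper, the `worker_id` field, the roster contents, and the choice to append the rejection reason to the prompt are assumptions for illustration. The actual internal/validator/iterate.go may differ on all of them.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
)

// ChatCaller is the seam: production would call chatd, tests pass a stub.
// The signature is hypothetical.
type ChatCaller func(ctx context.Context, prompt string) (string, error)

// ExtractJSON returns the first balanced, valid JSON object found in s,
// skipping any prose the model wrapped around it.
func ExtractJSON(s string) (string, bool) {
	for i := 0; i < len(s); i++ {
		if s[i] != '{' {
			continue
		}
		if end := matchBrace(s, i); end > 0 && json.Valid([]byte(s[i:end])) {
			return s[i:end], true
		}
	}
	return "", false
}

// matchBrace returns the index just past the brace closing s[start], or -1.
// Braces inside JSON strings are ignored.
func matchBrace(s string, start int) int {
	depth, inStr, esc := 0, false, false
	for i := start; i < len(s); i++ {
		c := s[i]
		switch {
		case esc:
			esc = false
		case inStr && c == '\\':
			esc = true
		case c == '"':
			inStr = !inStr
		case inStr:
			// inside a string literal: nothing to count
		case c == '{':
			depth++
		case c == '}':
			depth--
			if depth == 0 {
				return i + 1
			}
		}
	}
	return -1
}

// Iterate asks the model, extracts JSON, validates it, and retries on
// rejection. It returns the accepted artifact and the attempt count.
func Iterate(ctx context.Context, chat ChatCaller, validate func(string) error, prompt string, maxAttempts int) (string, int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		raw, err := chat(ctx, prompt)
		if err != nil {
			return "", attempt, err
		}
		artifact, ok := ExtractJSON(raw)
		if !ok {
			err = errors.New("no JSON object in reply")
		} else {
			err = validate(artifact)
		}
		if err == nil {
			return artifact, attempt, nil
		}
		// Assumption: feed the rejection back so the next attempt can correct it.
		prompt += "\nPrevious attempt rejected: " + err.Error()
	}
	return "", maxAttempts, fmt.Errorf("no accepted artifact after %d attempts", maxAttempts)
}

// demo drives Iterate with a stub that names a phantom worker first.
func demo() (string, int, error) {
	replies := []string{
		`Sure, here it is: {"worker_id":"W-PHANTOM"} hope that helps`,
		`{"worker_id":"W-001"}`,
	}
	calls := 0
	chat := func(ctx context.Context, prompt string) (string, error) {
		r := replies[calls%len(replies)]
		calls++
		return r, nil
	}
	roster := map[string]bool{"W-001": true} // hypothetical roster
	validate := func(artifact string) error {
		var p struct {
			WorkerID string `json:"worker_id"`
		}
		if err := json.Unmarshal([]byte(artifact), &p); err != nil {
			return err
		}
		if !roster[p.WorkerID] {
			return fmt.Errorf("consistency: unknown worker %q", p.WorkerID)
		}
		return nil
	}
	return Iterate(context.Background(), chat, validate, "fill the shift", 3)
}

func main() {
	artifact, attempts, err := demo()
	fmt.Println(artifact, attempts, err) // prints {"worker_id":"W-001"} 2 <nil>
}
```

Because the chat call is just a function value, the phantom-worker retry path can be exercised without a running chatd.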
root	b03521a506	validator: port FillValidator + EmailValidator from Rust validator crate	2026-05-01 04:49:55 -05:00

Per architecture_comparison.md universal-win for Go side: ports the Rust crates/validator/src/staffing/ to internal/validator/. Production safety net Go was missing: FillValidator catches phantom worker IDs + status/blacklist/geo/role mismatches; EmailValidator catches SSN-shape PII + salary disclosure + wrong-target name in email/SMS drafts.

Files:
- types.go: Artifact (FillProposal | EmailDraft), Validator interface, WorkerLookup interface, ValidationError + Finding + Severity
- lookup.go: InMemoryWorkerLookup with case-insensitive ID lookup
- fill.go: FillValidator, checking schema → completeness → cross-roster (phantom ID / status / blacklist / geo / role)
- email.go: EmailValidator, checking schema → length → PII (SSN + salary) → worker-name consistency
- fill_test.go + email_test.go: 24 tests covering happy path + every error variant + the load-bearing edge cases (phone-pattern not flagged as SSN, flanking-digit guard rejects extended numeric runs)

Validator names match Rust (staffing.fill / staffing.email) so cross-runtime audit logs share the same identifier. PII scanners (containsSSNPattern, containsSalaryDisclosure) ported byte-for-byte so a draft flagged by one runtime is flagged by the other.

Caveat: the Rust validator crate also has parquet_lookup.rs (loads workers_500k.parquet at startup) and playbook.rs (additional checks). Those weren't ported in this wave; only the two load-bearing validators that were named in the comparison doc.

Closes one of the two universal-win items for Go side. The other (materializer port) remains deferred; it's a bigger surface change and depends on transforms.ts source-class adapters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
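
The two edge cases called out in b03521a506 (a phone number must not be flagged as an SSN, and a flanking-digit guard must reject longer numeric runs) can be illustrated with a hand-rolled scanner. This is not the ported `containsSSNPattern`; its actual logic is not shown in the commit. The sketch below assumes the target shape is ddd-dd-dddd with no digit immediately before or after, and scans manually because Go's `regexp` package (RE2 syntax) has no lookaround assertions.

```go
package main

import "fmt"

// isDigit reports whether b is an ASCII digit.
func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// digitsAt reports whether s[i:i+n] exists and is entirely ASCII digits.
func digitsAt(s string, i, n int) bool {
	if i+n > len(s) {
		return false
	}
	for j := i; j < i+n; j++ {
		if !isDigit(s[j]) {
			return false
		}
	}
	return true
}

// containsSSNShape reports whether s contains ddd-dd-dddd that is not
// flanked by another digit on either side. Illustrative only: the name
// and the exact rule are assumptions, not the repository's scanner.
func containsSSNShape(s string) bool {
	const width = 11 // len("ddd-dd-dddd")
	for i := 0; i+width <= len(s); i++ {
		shape := digitsAt(s, i, 3) && s[i+3] == '-' &&
			digitsAt(s, i+4, 2) && s[i+6] == '-' &&
			digitsAt(s, i+7, 4)
		if !shape {
			continue
		}
		// Flanking-digit guard: part of a longer numeric run is not an SSN shape.
		if i > 0 && isDigit(s[i-1]) {
			continue
		}
		if i+width < len(s) && isDigit(s[i+width]) {
			continue
		}
		return true
	}
	return false
}

func main() {
	fmt.Println(containsSSNShape("SSN 123-45-6789 on file")) // prints true
	fmt.Println(containsSSNShape("Call 555-123-4567 today")) // prints false (3-3-4 phone shape)
	fmt.Println(containsSSNShape("ref 1123-45-67890"))       // prints false (flanked by digits)
}
```

The guard checks neighbours by index instead of relying on word boundaries, so an SSN-shaped run directly adjacent to letters or punctuation is still caught.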

4 Commits