Surfaced by the 2026-05-02 cross-runtime test: when a caller
forwarded X-Lakehouse-Trace-Id but the langfuse middleware was a
passthrough (no Langfuse env), the header was never read — Go minted
a fallback id, breaking cross-daemon parent-trace linkage.
The middleware only honored the header when its lf client was
non-nil. With LANGFUSE_URL unset on the persistent stack, every
inbound iterate request lost the parent linkage.
Fix: validatord's iterate handler reads the header DIRECTLY (matches
Rust's iterate.rs pattern) before falling through to the ctx value
+ fallback id. Now Go behavior matches Rust regardless of Langfuse
configuration.
Resolution order is:
1. req.TraceID (caller put it in the JSON body)
2. X-Lakehouse-Trace-Id header (read directly here)
3. context value from langfuse middleware (when configured)
4. fallback to a locally-minted time-ordered hex id
Verified end-to-end:
curl -X POST -H 'X-Lakehouse-Trace-Id: go-cmp-fixed' /v1/iterate
→ response.trace_id = "go-cmp-fixed" ✓
→ sessions.jsonl row session_id = "go-cmp-fixed" ✓
Pre-fix (this commit's parent, run from the /tmp/val-fresh3 binary):
same call → trace_id minted as 18abbb5a008061b7-008061e9
(header silently ignored)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaced during the 2026-05-02 deploy + reality wave: the persistent
Go stack runs without LANGFUSE_URL/PUBLIC_KEY/SECRET_KEY env, so
shared.langfuseMiddleware operates as a passthrough — never minting
a trace id, never stashing it on the request context. Result:
session_id was empty on every JSONL row, breaking correlation across
the longitudinal log + replay_runs.jsonl + future Langfuse traces.
The fix: validatord falls back to a locally-generated time-ordered
hex id when both the X-Lakehouse-Trace-Id header AND the middleware
context are empty. Same shape Langfuse accepts, so a future deploy
that turns Langfuse on doesn't break correlation — already-emitted
session_ids stay valid as Langfuse trace ids.
Verified post-deploy by driving 9 /v1/iterate sessions through the
persistent stack at :4110:
- 6 accepted on iter 0 (qwen2.5:latest first-shot 75%, i.e. 6 of the
8 sessions that did not hit the infra error)
- 2 max_iter_exhausted (no_json on prose-y prompts)
- 1 infra_error (chatd cold-start probe timed out at 5s)
Latest row's session_id: "18abbabdc2306a83-c2306aa9" (was: "")
Probe re-runs (validator_parity, session_log_parity) are included as
post-deploy artifacts; they pass 6/6 and 4/4 respectively against the
freshly restarted persistent gateway+validatord binaries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the second half of J's 2026-05-02 multi-call observability
concern. Trace-id propagation (commit d6d2fdf) gave us the *live*
view in Langfuse; this gives us the *longitudinal* view for ad-hoc
DuckDB queries over thousands of sessions:
"show me every session where the model produced a real candidate
without ever needing a retry"
"find sessions where validation rejected three times in a row"
"first-shot success rate per model — did we feed it enough corpus?"
## What's in
internal/validator/session_log.go:
- SessionRecord type (schema=session.iterate.v1)
- SessionLogger writer — mutex-guarded append, best-effort posture,
nil-safe (NewSessionLogger("") = nil = no-op on Append)
- BuildSessionRecord helper — assembles a row from any
iterate response/failure/infra-error combination, callable from
other daemons that wrap iterate (cross-daemon shared schema)
- 7 unit tests including concurrent-append safety + the three
code paths (success / max_iter_exhausted / infra_error)
cmd/validatord/main.go:
- handlers.sessionLog field + wiring from cfg.Validatord.SessionLogPath
- Iterate handler: build + append a SessionRecord on every call
- rosterCheckFor("fill") closure stamps grounded_in_roster — the
load-bearing forensic property J flagged ("we can never
hallucinate available staff members to contracts")
internal/shared/config.go + lakehouse.toml:
- [validatord].session_log_path field; empty = disabled
- Production: /var/lib/lakehouse/validator/sessions.jsonl
scripts/validatord_smoke.sh:
- Adds a probe verifying validatord announces session log path on
startup. Smoke is now 6/6 (was 5/5).
docs/SESSION_LOG.md:
- Schema reference + 5 worked DuckDB query examples including the
"alarm" query (sessions where grounded_in_roster=false on an
accepted fill — should always be empty; if not, something is
bypassing FillValidator).
## What this is NOT
This is NOT a duplicate of replay_runs.jsonl. They're siblings:
- replay_runs.jsonl: replay tool's per-task retrieval+model output
- sessions.jsonl: validatord's per-iterate full retry chain +
grounded-in-roster verdict
A single coordinator session can produce rows in both streams; the
session_id (= Langfuse trace_id) is the join key.
## Layered observability now in place
- Live view: Langfuse trace tree (X-Lakehouse-Trace-Id propagation),
`iterate.attempt[N]` spans with prompt/raw/verdict
- Offline: sessions.jsonl (this commit), DuckDB-queryable;
longitudinal forensics
- Hard gate: FillValidator + WorkerLookup (existing); phantom IDs are
structurally rejected and never reach the session log's
grounded_in_roster=true bucket
Per the architecture invariant in STATE_OF_PLAY's DO NOT RELITIGATE
section, these layers are wired; future work targets the data, not
the wiring.
## Verification
- internal/validator: 7 new tests (session_log_test.go) — all PASS
- cmd/validatord: 3 new integration tests covering the success,
failure, and grounded=false paths — all PASS
- validatord_smoke.sh: 6/6 PASS through gateway :3110
- Full go test ./... green across 33 packages
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes J's 2026-05-02 multi-call observability gap: a single
/v1/iterate session with N retries used to surface in Langfuse as
N+1 disconnected traces (one per /v1/chat hop + one for the iterate
request itself), with no parent/child linkage. Operators couldn't
scroll the retry chain in one trace tree to spot where grounding
failed.
## Wire-level change
- New header constant `shared.TraceIDHeader = "X-Lakehouse-Trace-Id"`
- `langfuseMiddleware` honors the header on inbound requests: if
set, reuses that trace id instead of minting a new one. Stashes
the trace id on the request context so handlers can attach
application-level child spans.
- `validatord.chatCaller` forwards the header to chatd. Every chat
hop in an iterate session lands as a child of the parent trace.
## Application-level spans
- `validator.IterateConfig` gains `Tracer` (optional callback).
When wired, each iteration attempt emits one Langfuse span
via `validator.AttemptSpan`:
Name: iterate.attempt[N]
Input: { iteration, model, provider, prompt }
Output: { verdict, raw, error }
Level: WARNING when verdict != accepted
- `validatord.iterTracer` is the production hook — bridges
`validator.Tracer` → `langfuse.Client.Span`.
- `IterateRequest`/`IterateResponse`/`IterateFailure` gain
`TraceID`; each `IterateAttempt` gains `SpanID`. The /v1/iterate
caller can pivot from the JSON response straight into the
Langfuse trace tree.
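
The span shape listed above can be illustrated with a small sketch. The `Tracer` signature, the struct layout, and the helper names `buildAttemptSpan` and `emit` are assumptions; only the span name, the input/output keys, and the WARNING rule come from this commit. The "DEFAULT" level for accepted attempts is likewise assumed.

```go
package main

import "fmt"

// AttemptSpan mirrors the fields listed in the commit message.
type AttemptSpan struct {
	Name   string
	Input  map[string]any
	Output map[string]any
	Level  string
}

// Tracer is a HYPOTHETICAL callback shape: it receives the parent trace id
// and one span, and returns the id of the span it emitted.
type Tracer func(traceID string, span AttemptSpan) (spanID string)

// buildAttemptSpan assembles one span for a single iteration attempt.
func buildAttemptSpan(iteration int, model, provider, prompt, verdict, raw, errMsg string) AttemptSpan {
	level := "DEFAULT" // ASSUMED level for accepted attempts
	if verdict != "accepted" {
		level = "WARNING"
	}
	return AttemptSpan{
		Name: fmt.Sprintf("iterate.attempt[%d]", iteration),
		Input: map[string]any{
			"iteration": iteration, "model": model,
			"provider": provider, "prompt": prompt,
		},
		Output: map[string]any{"verdict": verdict, "raw": raw, "error": errMsg},
		Level:  level,
	}
}

// emit skips tracing when no tracer is wired or no trace id is known,
// so no orphan spans are produced.
func emit(tr Tracer, traceID string, span AttemptSpan) string {
	if tr == nil || traceID == "" {
		return ""
	}
	return tr(traceID, span)
}

func main() {
	// A stand-in tracer that prints instead of calling a Langfuse client.
	tracer := Tracer(func(traceID string, s AttemptSpan) string {
		fmt.Printf("%s %s level=%s\n", traceID, s.Name, s.Level)
		return traceID + "/" + s.Name
	})

	first := buildAttemptSpan(0, "qwen2.5:latest", "local", "p", "validation_failed", "{}", "unknown worker")
	second := buildAttemptSpan(1, "qwen2.5:latest", "local", "p", "accepted", "{}", "")

	emit(tracer, "TR-1", first)                // TR-1 iterate.attempt[0] level=WARNING
	emit(tracer, "TR-1", second)               // TR-1 iterate.attempt[1] level=DEFAULT
	fmt.Println(emit(tracer, "", first) == "") // true
}
```

The provider value "local" and the error text are placeholders chosen for the example.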
## What an operator sees post-cutover
POST /v1/iterate {kind=fill, prompt=...} → Trace TR-1
├─ http.request span (from middleware)
├─ iterate.attempt[0] span (validator.Iterate emit)
│ input: prompt+model
│ output: { verdict: validation_failed, error: ..., raw }
├─ chatd /v1/chat call (X-Lakehouse-Trace-Id: TR-1)
│ ├─ http.request span (chatd middleware)
│ └─ chatd-internal spans (existing)
├─ iterate.attempt[1] span
└─ ...
All in one Langfuse trace tree, not N+1 separate traces.
## Hallucinated-worker safety net is unchanged
The /v1/iterate flow's hard correctness gate is still
FillValidator + WorkerLookup. Phantom candidate IDs raise
ValidationError::Consistency which 422s and forces the iteration
loop to retry. The trace-id propagation is the OBSERVABILITY layer
on top: it makes the existing safety net's outcomes visible per call.
It is not a replacement for that safety net.
## Verification
- internal/validator: 4 new tests
- TestIterate_TracerEmitsSpanPerAttempt — span/attempt count + SpanID
- TestIterate_NoTraceIDSkipsTracer — no orphan spans without trace_id
- TestIterate_ChatCallerReceivesTraceID — propagation contract
- (existing iterate tests updated for new ChatCaller signature)
- internal/shared: 1 new test
- TestLangfuseMiddleware_HonorsTraceIDHeader — cross-service linkage
- cmd/validatord: existing HTTP tests still PASS via the dual-shape
UnmarshalJSON contract.
- validatord_smoke.sh: 5/5 PASS through gateway :3110 (unchanged).
- Full go test ./... green across 33 packages.
## Architecture invariant added
STATE_OF_PLAY "DO NOT RELITIGATE" gains a paragraph documenting
the X-Lakehouse-Trace-Id header contract + the iterate.attempt[N]
span emission. Future-Claude won't re-propose "wire trace-id
propagation" — the header IS the wiring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>