proof harness: fix queryd refresh-tick race in 04_query_correctness
Caught by the audit rerun: with cache-warm binaries, 04 fires its
first SELECT faster than queryd's 500ms refresh tick — Q1 returned
400 ("table not found") even though 03_ingest had registered the
manifest. Subsequent queries (after the next tick) succeeded.
This is an eventual-consistency wait, not a retry — queryd's
contract is that views appear within one tick of catalogd having the
manifest. Production code does not need changing.
Added to lib/http.sh:
proof_wait_for_sql <budget_sec> <sql>
polls a SQL probe until it returns 200 or budget elapses; emits
no evidence (test setup, not a claim).
Used in 04_query_correctness:
Wait up to 5s for queryd to have the view before running the 5
SQL assertions. Skip-with-loud-reason if the view never appears.
Verified: integration mode back to 104 pass / 0 fail / 1 skip after
fix. The skip is the unchanged GOLAKE-085 informational record.
This is exactly the kind of finding the harness was designed to
surface — the regression existed in the codebase the moment Phase D
shipped, but only fired when the next compare run hit cache-warm
timing. Without the harness, it would have surfaced on a CI run
weeks from now and been hard to bisect.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
4bb6548cbc
commit
4840c10311
@ -26,6 +26,15 @@ substitute_table() {
|
||||
sed "s/FROM workers/FROM ${DATASET}/g; s/from workers/from ${DATASET}/g"
|
||||
}
|
||||
|
||||
# Wait for queryd to have the view from 03's ingest. queryd refreshes
|
||||
# every 500ms (proof override of the 30s prod default); on cache-warm
|
||||
# runs cases fire faster than the next tick. Up to 5s budget.
|
||||
if ! proof_wait_for_sql 5 "SELECT 1 FROM ${DATASET} LIMIT 0"; then
|
||||
proof_skip "$CASE_ID" "queryd view ${DATASET} never appeared in 5s" \
|
||||
"queryd refresh ticker may be stalled or 03_ingest registration failed"
|
||||
return 0 2>/dev/null || exit 0
|
||||
fi
|
||||
|
||||
# Iterate the 5 queries.
|
||||
n=$(jq '.queries | length' "$EXPECTED_FILE")
|
||||
for i in $(seq 0 $((n-1))); do
|
||||
|
||||
@ -108,6 +108,32 @@ proof_call() {
|
||||
_proof_http_run "$case_id" "$probe" "$method" "$url" "$@"
|
||||
}
|
||||
|
||||
# proof_wait_for_sql: wait for a SQL probe to return 200, up to budget
|
||||
# seconds. Use when a case follows an ingest and queryd's view-refresh
|
||||
# (default 500ms tick) may not have fired yet. NOT a retry — a wait
|
||||
# for a known eventual-consistency event. No evidence emitted (this
|
||||
# is test setup, not a claim).
|
||||
#
|
||||
# proof_wait_for_sql <budget_sec> <sql>
|
||||
#
|
||||
# Returns 0 if the probe succeeded; 1 on timeout.
|
||||
proof_wait_for_sql() {
|
||||
local budget="${1:-10}" sql="$2"
|
||||
local deadline=$(($(date +%s) + budget))
|
||||
local body
|
||||
body=$(jq -nc --arg s "$sql" '{sql:$s}')
|
||||
while [ "$(date +%s)" -lt "$deadline" ]; do
|
||||
if curl -sf --max-time 1 -X POST \
|
||||
-H 'Content-Type: application/json' \
|
||||
-d "$body" \
|
||||
"${PROOF_GATEWAY_URL}/v1/sql" >/dev/null 2>&1; then
|
||||
return 0
|
||||
fi
|
||||
sleep 0.1
|
||||
done
|
||||
return 1
|
||||
}
|
||||
|
||||
# Helper accessors — reads the per-probe JSON.
|
||||
proof_status_of() {
|
||||
local case_id="$1" probe="$2"
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user