root 7bb66f08c3
Some checks failed
lakehouse/auditor 9 blocking issues: cloud: claim not backed — "Verified live (post-restart): scale_test_10m doc-fetch 4-15ms across"
lance: scrum-driven sanitizer + smoke-gate fixes (opus 2026-05-02 BLOCK)
Cross-lineage scrum on the lance wave (4 bundles, 33 distinct findings)
surfaced 1 real BLOCK and 2 real WARNs from opus that the kimi/qwen
lineages missed. Per feedback_cross_lineage_review.md, opus is the
load-bearing reviewer; cross-lineage convergence is noise unless verified.

BLOCK fix — sanitize_lance_err path-stripping was unsound:
  err.split("/home/").next().unwrap_or(&err)
returns Some("") when err STARTS with "/home/", erasing the entire
message. Replaced truncation with redact_paths() — a hand-rolled scanner
that walks the input once, replacing path-shaped substrings with
[REDACTED] while preserving surrounding error context. Catches:
- absolute paths under /root/.cargo, /home, /var, /tmp, /etc, /usr, /opt
- relative variants (Lance occasionally strips leading slash —
  observed live "Dataset at path home/profit/lakehouse/data/lance/x
  was not found")
- multiple occurrences in one error
- preserves quote/comma/whitespace terminators

WARN fix #1 — is_not_found heuristic was too broad:
  lower.contains("not found")
caught real 500s like "column not found", "field not found in schema".
Narrowed to require dataset-shape phrasing AND exclude the
column/field/schema patterns explicitly.

WARN fix #2 — lance_smoke.sh `grep -qvE` was an unsound regression gate.
  bash -c "echo '$BODY' | grep -qvE 'pat'"
With -v -q, exits 0 if ANY line lacks the pattern — so a multi-line
body with one leak line + any clean line FALSE-PASSES. Replaced with
the correct "pattern absent" form: `! grep -qE 'pat'`. Also expanded
the pattern set (added /var/, /tmp/) since the scrum surfaced these
as additional leak vectors.

Also unblocks pre-existing pathway_memory test compile error (stale
PathwayTrace init missing 6 Mem0-versioning fields added in 6ac7f61).
Tests filled in with sensible defaults — needed to run sanitize_tests.

10/10 new sanitize tests pass. Smoke 9/9 PASS against rebuilt+restarted
gateway. Live missing-index probe now returns:
  "lance dataset not found: no-such-11205" + HTTP 404
(was: leaked absolute paths + HTTP 500 → leaked absolute and relative
paths post-first-fix → clean message + 404 now.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 23:34:54 -05:00
2026-04-22 02:41:15 -05:00
Description
Rust-first object storage system
6.3 GiB
Languages
TypeScript 38.4%
Rust 35.8%
HTML 13.9%
Python 7.8%
Shell 2.1%
Other 2%