Phase E gave us soft-delete at query time (tombstones hide rows via a
DataFusion filter view). This commit completes the invariant: after
compaction, tombstoned rows are PHYSICALLY absent from the Parquet file
on disk.
delta::compact changes:
- Signature adds tombstones: &[Tombstone]
- After merging base + deltas, apply_tombstone_filter builds a
BooleanArray keep-mask per batch (True where row_key_value is NOT
in the tombstone set) and applies arrow::compute::filter_record_batch
- Supports Utf8, Int32, Int64 key columns (matches refresh.rs coverage
for pg- and csv-derived schemas)
- CompactResult gains tombstones_applied + rows_dropped_by_tombstones
- Caller clears tombstone store on success
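
The keep-mask step can be sketched without Arrow. This is a std-only
stand-in, not the code in delta::compact: keep_mask and apply_mask are
hypothetical names, a Vec<bool> plays the role of the BooleanArray, and
apply_mask mimics what arrow::compute::filter_record_batch does to one
column.

```rust
use std::collections::HashSet;

/// Build a keep-mask for one batch: `true` where the row's key is NOT
/// in the tombstone set, `false` where the row must be dropped.
/// (Hypothetical helper; stands in for the BooleanArray keep-mask.)
fn keep_mask(row_keys: &[&str], tombstoned: &HashSet<&str>) -> Vec<bool> {
    row_keys.iter().map(|k| !tombstoned.contains(k)).collect()
}

/// Apply the mask to one column, keeping only rows whose mask entry is
/// `true`. (Hypothetical helper; a real batch would apply the same mask
/// to every column at once.)
fn apply_mask<T: Clone>(column: &[T], mask: &[bool]) -> Vec<T> {
    column
        .iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(value, _)| value.clone())
        .collect()
}
```

The same mask is applied to every column of the batch, so row alignment
is preserved; rows_dropped_by_tombstones would then be the number of
`false` entries summed over all batches.
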
Critical correctness fix surfaced during E2E testing:
The original Phase 8 compact concatenated N independent Parquet byte
streams from record_batch_to_parquet(), each with its own footer.
A Parquet file has exactly one footer, which readers locate from the
end of the file, so only one stream's data is visible; the rest is
invisible. Latent since Phase 8 shipped; triggered by tombstone
filtering producing multiple batches. Corrupted candidates.parquet on
first test run (restored from UI fixture copy, a good argument for
keeping test data in the repo).
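
Why concatenation cannot work follows from the file layout: a Parquet
file ends with a 4-byte little-endian footer length followed by the
magic bytes "PAR1", and that trailer is the single entry point to the
metadata. The sketch below is illustrative only: toy_stream is a
hypothetical helper whose "footer" bytes are not real Thrift metadata,
and footer_len_from_tail only mimics the first step a reader takes.

```rust
const MAGIC: &[u8] = b"PAR1";

/// Read the footer length from the file trailer: the last 8 bytes are a
/// 4-byte little-endian length followed by the magic "PAR1".
/// Returns None if the buffer is too short or lacks the magic bytes.
fn footer_len_from_tail(file: &[u8]) -> Option<u32> {
    let n = file.len();
    if n < 12 || !file.starts_with(MAGIC) || !file.ends_with(MAGIC) {
        return None;
    }
    Some(u32::from_le_bytes([
        file[n - 8],
        file[n - 7],
        file[n - 6],
        file[n - 5],
    ]))
}

/// Toy stand-in for one serialized stream: magic, data, footer, footer
/// length, magic. (Hypothetical; the footer is an opaque byte string.)
fn toy_stream(data: &[u8], footer: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(data);
    out.extend_from_slice(footer);
    out.extend_from_slice(&(footer.len() as u32).to_le_bytes());
    out.extend_from_slice(MAGIC);
    out
}
```

Concatenating two toy streams still yields a buffer with a single
reachable trailer, so the reader has no way to discover that a second
footer exists elsewhere in the bytes.
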
Fix:
- Single ArrowWriter per compaction, writes every batch into one
properly-footered Parquet
- Snappy compression to match ingest defaults (without it the rewrite
inflated the file about 3×, 10.5MB → 34MB, because no compression was
set)
- Verify-before-swap: parse written buf back to confirm row count
matches expected; refuses to overwrite base_key if verification fails
- Write to {base_key}.compact-{ts}.tmp first, then to base_key; delete
temp; only then delete delta files. Any error along the way leaves
the original base intact.
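
The ordering of the last two steps can be sketched on a local
filesystem. Everything here is an assumption made for illustration:
swap_in_compacted and count_rows are hypothetical names, std::fs stands
in for the object store, and a "row" is one line of text rather than a
row parsed back out of Parquet.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for "parse the written buffer and count rows".
/// Here one newline-terminated line counts as one row.
fn count_rows(buf: &[u8]) -> usize {
    buf.iter().filter(|&&b| b == b'\n').count()
}

/// Verify, then write temp, then write base, then delete temp, and only
/// then delete the delta files. Any early return leaves `base` untouched
/// or already fully rewritten, never half-replaced by this function.
fn swap_in_compacted(
    base: &Path,
    deltas: &[PathBuf],
    buf: &[u8],
    expected_rows: usize,
    ts: u64,
) -> io::Result<()> {
    // 1. Verify before touching anything on disk.
    let actual = count_rows(buf);
    if actual != expected_rows {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("row count mismatch: expected {expected_rows}, got {actual}"),
        ));
    }
    // 2. Write the temp copy: {base}.compact-{ts}.tmp
    let tmp = PathBuf::from(format!("{}.compact-{}.tmp", base.display(), ts));
    fs::write(&tmp, buf)?;
    // 3. Write the verified bytes over the base.
    fs::write(base, buf)?;
    // 4. Delete the temp copy.
    fs::remove_file(&tmp)?;
    // 5. Only now are the deltas redundant and safe to delete.
    for delta in deltas {
        fs::remove_file(delta)?;
    }
    Ok(())
}
```

Deleting the deltas last means a failure at any earlier step still
leaves enough data on disk to rerun the compaction.
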
TombstoneStore::clear(dataset) drops all tombstone batch files and
evicts the per-dataset AppendLog from cache. Called after successful
compact.
QueryEngine::catalog() accessor exposes the Registry so queryd
handlers can reach the tombstone store without routing through gateway
state.
E2E on candidates (100K rows, 15 cols):
- Baseline: 10.59 MB, 100000 rows
- Tombstone CAND-000001/2/3 (soft-delete): 99997 visible, 100000 raw
- Compact: tombstones_applied=3, rows_dropped=3, final_rows=99997
- Post: 10.72 MB (Snappy), valid parquet (1 row_group), 99997 rows
- Restart: persists, tombstones list empty, __raw__candidates also
99997 (the 3 IDs are physically gone from disk)
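
The numbers above satisfy a simple accounting rule that a test could
assert. This is a sketch under assumptions: only tombstones_applied and
rows_dropped_by_tombstones are named as CompactResult fields in this
message; the final_rows field, the usize types, and the
row_accounting_holds helper are hypothetical.

```rust
/// Assumed shape of the compaction summary, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct CompactResult {
    final_rows: usize,
    tombstones_applied: usize,
    rows_dropped_by_tombstones: usize,
}

/// Rows after merging base + deltas, minus rows dropped by tombstones,
/// must equal the rows written. With no tombstones applied, nothing may
/// be dropped.
fn row_accounting_holds(merged_rows: usize, r: &CompactResult) -> bool {
    let dropped_needs_tombstones =
        r.tombstones_applied > 0 || r.rows_dropped_by_tombstones == 0;
    let rows_add_up =
        merged_rows.checked_sub(r.rows_dropped_by_tombstones) == Some(r.final_rows);
    dropped_needs_tombstones && rows_add_up
}
```

For the run above, merged_rows would be 100000 with 3 rows dropped and
99997 rows written.
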
Closes the PRD invariant: deletion is now actual deletion, not just
masking. GDPR erasure request → tombstone + schedule compact → data
gone.
Deferred:
- Compact-all-datasets cron (currently manual per-dataset via
POST /query/compact)
- Compaction of tombstone batch files themselves (with
flush_threshold=1 each tombstone produces its own batch file;
TombstoneStore::compact exists but is not auto-called)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
23 lines
579 B
TOML
[package]
name = "queryd"
version = "0.1.0"
edition = "2024"

[dependencies]
shared = { path = "../shared" }
catalogd = { path = "../catalogd" }
storaged = { path = "../storaged" }
tokio = { workspace = true }
axum = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
tracing = { workspace = true }
datafusion = { workspace = true }
object_store = { workspace = true }
arrow = { workspace = true }
parquet = { workspace = true }
bytes = { workspace = true }
futures = { workspace = true }
url = { workspace = true }
chrono = { workspace = true }