lakehouse

root cd8c59a53d gateway: Step 5 — wire SubjectAuditWriter into validator WorkerLookup

Implementation of docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md §5 Step 5.
Every WorkerLookup.find() call from the validator path now produces
one audit row in the per-subject HMAC-chained JSONL. Failures are
non-blocking — validator continues whether audit succeeds or fails.

Approach: decorator pattern. WorkerLookup is a sync trait by design
(validator's contract is "in-memory snapshot, no per-call I/O") and
audit writes are async, so find() can't await the write and the trait
can't be made async without breaking that contract. Instead, a new
AuditingWorkerLookup wraps the inner lookup, captures a
tokio::runtime::Handle at construction, and spawns audit writes from
sync find() onto that handle. The chain stays intact under spawn
fan-out because the writer's per-subject Mutex (shipped in the previous
scrum-fix commit) serializes same-subject appends regardless of the
order in which the spawned writes arrive.
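
A minimal sketch of the wrapper shape described above, not the code in
auditing_worker_lookup.rs. Assumptions: the trait signature, the
AuditRow layout, the in-memory Writer, SnapshotLookup, flush() and
demo() are all hypothetical, and std::thread::spawn stands in for
spawning onto the captured tokio::runtime::Handle so the example builds
with no dependencies.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical sync lookup contract; the real trait may differ.
trait WorkerLookup {
    fn find(&self, subject_id: &str) -> bool;
}

// In-memory snapshot standing in for the raw lookup.
struct SnapshotLookup {
    known: HashSet<String>,
}
impl WorkerLookup for SnapshotLookup {
    fn find(&self, subject_id: &str) -> bool {
        self.known.contains(subject_id)
    }
}

// Values follow the commit message; the row layout itself is assumed.
#[derive(Debug, Clone, PartialEq)]
struct AuditRow {
    subject_id: String,
    kind: &'static str,
    purpose: &'static str,
    fields_accessed: Vec<&'static str>,
    result: &'static str,
}

// Stand-in writer: appends to memory, not to HMAC-chained JSONL.
#[derive(Default)]
struct Writer {
    rows: Mutex<Vec<AuditRow>>,
}
impl Writer {
    fn append(&self, row: AuditRow) -> Result<(), String> {
        self.rows.lock().map_err(|e| e.to_string())?.push(row);
        Ok(())
    }
}

struct AuditingWorkerLookup {
    inner: Box<dyn WorkerLookup>,
    audit: Option<Arc<Writer>>,
    // Kept only so the demo can wait for background writes.
    pending: Mutex<Vec<JoinHandle<()>>>,
}

impl AuditingWorkerLookup {
    fn new(inner: Box<dyn WorkerLookup>, audit: Option<Arc<Writer>>) -> Self {
        Self { inner, audit, pending: Mutex::new(Vec::new()) }
    }

    fn flush(&self) {
        let handles: Vec<JoinHandle<()>> = self.pending.lock().unwrap().drain(..).collect();
        for handle in handles {
            let _ = handle.join();
        }
    }
}

impl WorkerLookup for AuditingWorkerLookup {
    fn find(&self, subject_id: &str) -> bool {
        let found = self.inner.find(subject_id);
        if let Some(writer) = &self.audit {
            let writer = Arc::clone(writer);
            let row = AuditRow {
                subject_id: subject_id.to_string(),
                kind: "validator_lookup",
                purpose: "validator_worker_lookup",
                fields_accessed: vec!["exists"],
                result: if found { "success" } else { "not_found" },
            };
            // Fire and forget: a failed append is logged, never returned.
            let handle = thread::spawn(move || {
                if let Err(err) = writer.append(row) {
                    eprintln!("audit append failed: {}", err);
                }
            });
            self.pending.lock().unwrap().push(handle);
        }
        found
    }
}

fn demo() -> Vec<AuditRow> {
    let raw = SnapshotLookup { known: HashSet::from(["worker-1".to_string()]) };
    let writer = Arc::new(Writer::default());
    let lookup = AuditingWorkerLookup::new(Box::new(raw), Some(Arc::clone(&writer)));
    for id in ["worker-1", "phantom"] {
        lookup.find(id);
        lookup.flush(); // wait per call so row order is deterministic here
    }
    let rows = writer.rows.lock().unwrap().clone();
    rows
}

fn main() { println!("{:?}", demo()); }
```

In this sketch find() returns the inner result before the append has
necessarily run, which is the non-blocking property the commit
describes. With audit set to None the wrapper spawns nothing and simply
forwards to the inner lookup.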

Files changed:

  crates/gateway/src/v1/auditing_worker_lookup.rs (NEW, 175 LOC):
    - AuditingWorkerLookup<inner: dyn WorkerLookup, audit: Option<Arc<Writer>>>
    - new() captures Tokio Handle if audit is Some
    - find() runs inner lookup, then spawns audit append with:
        accessor.kind = "validator_lookup"
        accessor.purpose = "validator_worker_lookup"
        fields_accessed = ["exists"] (validator only proves existence
                          of a subject; downstream code reads policy
                          fields separately and would have its own
                          audit if those become PII)
        result = "success" if found, "not_found" otherwise
    - Audit-disabled path (audit: None) is a transparent passthrough:
      no spawn, no panic, no runtime requirement.

  crates/gateway/src/v1/mod.rs:
    + pub mod auditing_worker_lookup;

  crates/gateway/src/main.rs:
    - Hoisted subject_audit_writer construction OUT of the V1State
      literal (declaration-order constraint: validate_workers needs
      access to the writer). The hoisted Arc is then reused for the
      V1State.subject_audit field.
    - validate_workers now wraps the raw lookup with
      AuditingWorkerLookup::new(raw, subject_audit_writer.clone())

Tests (4/4 passing):
  - find_existing_subject_writes_success_audit_row
  - find_missing_subject_writes_not_found_audit_row (phantom-id case)
  - audit_disabled_means_no_writes_no_overhead (None pathway)
  - many_finds_to_same_subject_produce_intact_chain (30 sequential
    spawns on the same subject — chain verifies all 30, regression
    against the race we fixed in catalogd subject_audit)
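
The property the last test guards can be illustrated with a toy chain.
This is a sketch under stated assumptions, not the catalogd writer:
Row, SubjectChain, link_of, verify and append_concurrently are
hypothetical names, and std's DefaultHasher stands in for HMAC purely
to keep the example dependency-free (it is not a MAC and gives no
tamper resistance against an attacker).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, Clone)]
struct Row {
    seq: u64,
    payload: String,
    prev: u64,
    link: u64,
}

// Toy link function; a real chain would use a keyed HMAC here.
fn link_of(prev: u64, seq: u64, payload: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    (prev, seq, payload).hash(&mut hasher);
    hasher.finish()
}

// One chain per subject. Holding the Mutex across "read tail, compute
// link, push" is what keeps concurrent appends from forking the chain.
#[derive(Default)]
struct SubjectChain {
    rows: Mutex<Vec<Row>>,
}

impl SubjectChain {
    fn append(&self, payload: &str) {
        let mut rows = self.rows.lock().unwrap();
        let prev = rows.last().map_or(0, |row| row.link);
        let seq = rows.len() as u64;
        let link = link_of(prev, seq, payload);
        rows.push(Row { seq, payload: payload.to_string(), prev, link });
    }
}

// True when every row points at its predecessor and its own link matches.
fn verify(rows: &[Row]) -> bool {
    let mut expected_prev = 0;
    for (index, row) in rows.iter().enumerate() {
        let ok = row.seq == index as u64
            && row.prev == expected_prev
            && row.link == link_of(row.prev, row.seq, &row.payload);
        if !ok {
            return false;
        }
        expected_prev = row.link;
    }
    true
}

// Spawns `count` appends to the same subject and waits for all of them.
fn append_concurrently(count: usize) -> Vec<Row> {
    let chain = Arc::new(SubjectChain::default());
    let handles: Vec<_> = (0..count)
        .map(|i| {
            let chain = Arc::clone(&chain);
            thread::spawn(move || chain.append(&format!("lookup-{}", i)))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let rows = chain.rows.lock().unwrap().clone();
    rows
}

fn main() {
    let mut rows = append_concurrently(30);
    println!("rows={} intact={}", rows.len(), verify(&rows));
    rows[10].payload.push('!'); // tamper with one row
    println!("after tamper intact={}", verify(&rows));
}
```

The payload order varies from run to run, but the chain verifies
either way because each append reads the tail and pushes under the
same lock. Without that lock, two appends could read the same tail and
both link to it.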

This also covers the iterate.rs:324 phantom-id check transparently:
that codepath calls state.validate_workers.find(...), which now goes
through the wrapper, so every phantom-id rejection writes an audit row
with no further changes.

NOT in this commit (future steps):
  - Step 6: /audit/subject/{id} HTTP endpoint
  - Step 7: Daily retention sweep
  - Threading X-Lakehouse-Trace-Id from request through to audit row
    (currently audit row's accessor.trace_id is empty)

cargo build --release clean. cargo test -p gateway auditing_worker_lookup
4/4 PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-03 03:43:40 -05:00

.archon/workflows

.archon: add lakehouse-architect-review workflow

2026-04-26 18:05:43 -05:00

auditor

auditor: layer-2 path-traversal guard — symlink resolution before read

2026-04-27 08:32:33 -05:00

bot

infra: replace gpt-oss with Ollama Pro + OpenCode Zen across hot paths

2026-04-28 06:13:30 -05:00

config

REVERT cloud routing on hot path — back to local Ollama per PRD line 70