Implementation of docs/specs/SUBJECT_MANIFESTS_ON_CATALOGD.md Step 2.
Per-subject append-only audit JSONL with HMAC-SHA256 chain. Local-first
— no Vault, no external anchor (those are v2 if SOC2 Type II becomes
contract-required; v1 deliberately stays small).
shared/types.rs additions:
- AuditAccessor — kind, daemon, purpose, trace_id
- SubjectAuditRow — schema/ts/candidate_id/accessor/fields_accessed/
result/prev_chain_hash/row_hmac
crates/catalogd/src/subject_audit.rs (NEW):
- SubjectAuditWriter — holds signing key + per-subject latest-hash cache
- from_key_file() — loads key from sealed file, requires ≥32 bytes
- with_inline_key() — for tests + bring-up
- append() — computes HMAC chain link, persists JSONL row, returns new
chain root (caller mirrors to SubjectManifest.audit_log_chain_root)
- verify_chain() — full re-verification of a subject's audit log,
catches both prev_hash drift AND row-level HMAC tampering
- scan_latest_hash() — cold-start path, finds prev_hash from JSONL tail
- append_line() — read-modify-write pattern (object stores have no
native append; same shape as the rest of catalogd's persistence)
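The chain logic behind append() and verify_chain() can be sketched as below. This is a minimal illustration under stated assumptions, not the code in subject_audit.rs: Row, GENESIS, link() and toy_mac() are hypothetical names, the in-memory Vec stands in for the per-subject JSONL file, and toy_mac() is a NON-cryptographic stand-in (std's DefaultHasher) used only so the sketch builds without dependencies. The real writer uses HMAC-SHA256 from the `hmac` crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical genesis marker used as prev_chain_hash for a subject's first row.
const GENESIS: &str = "GENESIS";

/// Simplified audit row; `payload` stands in for ts/accessor/fields_accessed/result.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    candidate_id: String,
    payload: String,
    prev_chain_hash: String,
    row_hmac: String,
}

/// STAND-IN keyed digest, NOT cryptographic. Real code would call HMAC-SHA256.
fn toy_mac(key: &[u8], msg: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    msg.hash(&mut h);
    format!("{:016x}", h.finish())
}

/// Compute one chain link: the MAC covers the previous hash plus the row content.
fn link(key: &[u8], prev: &str, candidate_id: &str, payload: &str) -> String {
    let msg = format!("{}\n{}\n{}", prev, candidate_id, payload);
    toy_mac(key, msg.as_bytes())
}

/// Append a row and return the new chain root (the row's own MAC).
fn append(key: &[u8], log: &mut Vec<Row>, candidate_id: &str, payload: &str) -> String {
    let prev = log
        .last()
        .map(|r| r.row_hmac.clone())
        .unwrap_or_else(|| GENESIS.to_string());
    let row_hmac = link(key, &prev, candidate_id, payload);
    log.push(Row {
        candidate_id: candidate_id.to_string(),
        payload: payload.to_string(),
        prev_chain_hash: prev,
        row_hmac: row_hmac.clone(),
    });
    row_hmac
}

/// Re-verify the whole chain; Err(i) is the index of the first bad row.
fn verify_chain(key: &[u8], log: &[Row]) -> Result<(), usize> {
    let mut expected_prev = GENESIS.to_string();
    for (i, row) in log.iter().enumerate() {
        // Catches prev_hash drift (a row removed, reordered, or re-pointed).
        if row.prev_chain_hash != expected_prev {
            return Err(i);
        }
        // Catches row-level tampering (content changed without the key).
        let recomputed = link(key, &row.prev_chain_hash, &row.candidate_id, &row.payload);
        if recomputed != row.row_hmac {
            return Err(i);
        }
        expected_prev = row.row_hmac.clone();
    }
    Ok(())
}
```

Because each MAC covers the previous row's MAC, editing a row in the middle of the log fails verification at that row, and rewriting it consistently would require the signing key.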
Crypto: HMAC-SHA256 via the RustCrypto `hmac` crate (added to workspace
and catalogd deps; no hand-rolled crypto). Output is lowercase hex,
matching the rest of the codebase's SHA-256 conventions.
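The output convention can be illustrated with a small helper. This is only a sketch: to_lower_hex() is a hypothetical name, and the HMAC computation itself (done by the `hmac` crate) is not shown. It assumes the digest is already available as raw bytes.

```rust
use std::fmt::Write;

/// Encode bytes as lowercase hex, two characters per byte.
fn to_lower_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        // Writing to a String cannot fail, so the Result is safe to ignore.
        let _ = write!(out, "{:02x}", b);
    }
    out
}
```

A 32-byte HMAC-SHA256 tag encoded this way is always 64 characters long, which keeps row_hmac and prev_chain_hash directly comparable as strings.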
Security choices:
- NO Debug impl on SubjectAuditWriter — auto-deriving Debug would risk
leaking the signing key into log lines. Tests work around this by
matching on Result instead of using .unwrap_err(), which requires
the Ok type to implement Debug.
- Key min length 32 bytes (the SHA-256 output size, per RFC 2104 key
length guidance; the SHA-256 block size is 64 bytes).
- Failures are NOT swallowed — Result returned, caller decides whether
to log + continue (per spec §3.2 the gateway tool registry SHOULD
log + continue rather than block reads).
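The three choices above can be sketched together. Everything here is illustrative and assumed rather than taken from subject_audit.rs: Writer, AuditError, from_key_bytes() and read_with_audit() are hypothetical names, and the signing step is elided.

```rust
/// Minimum key length in bytes (the SHA-256 output size).
const MIN_KEY_LEN: usize = 32;

#[derive(Debug, PartialEq)]
enum AuditError {
    KeyTooShort { got: usize },
    EmptyCandidateId,
}

/// Holds the signing key. Deliberately NOT #[derive(Debug)], so the key
/// cannot end up in a log line through a stray `{:?}`.
struct Writer {
    key: Vec<u8>,
}

impl Writer {
    /// Hypothetical constructor: reject keys shorter than MIN_KEY_LEN.
    fn from_key_bytes(key: &[u8]) -> Result<Self, AuditError> {
        if key.len() < MIN_KEY_LEN {
            return Err(AuditError::KeyTooShort { got: key.len() });
        }
        Ok(Writer { key: key.to_vec() })
    }

    /// Failures are returned to the caller, never swallowed.
    fn append(&self, candidate_id: &str) -> Result<(), AuditError> {
        if candidate_id.is_empty() {
            return Err(AuditError::EmptyCandidateId);
        }
        // Real code would compute the HMAC chain link with the key here.
        let _ = &self.key;
        Ok(())
    }
}

/// Caller-side policy sketch: log the failure and let the read continue.
/// Returns whether an audit row was written.
fn read_with_audit(writer: &Writer, candidate_id: &str) -> bool {
    if let Err(e) = writer.append(candidate_id) {
        // Only the error is logged; the writer (and its key) has no Debug impl.
        eprintln!("audit append failed: {:?}", e);
        return false;
    }
    true
}
```

Since Writer has no Debug impl, Result<Writer, AuditError>::unwrap_err() does not compile; a test would instead use matches!(Writer::from_key_bytes(&[0u8; 16]), Err(AuditError::KeyTooShort { got: 16 })) or an explicit match.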
Tests (7/7 passing):
- first_append_uses_genesis_prev_hash
- chain_links_each_append (3-row chain verifies)
- separate_subjects_have_independent_chains (per-subject isolation)
- tamper_detected_on_verify (mutation in middle of chain breaks verify)
- cold_writer_picks_up_existing_chain (process restart preserves chain)
- empty_candidate_id_rejected
- key_too_short_rejected_via_file
NOT in this commit (future steps):
- Step 3: Backfill ETL from workers_500k.parquet (next per J)
- Step 4: Wire gateway tool registry to call append() on every
candidate_id returned by search_candidates / get_candidate
- Step 5: Wire validator WorkerLookup similarly
- Step 6: /audit/subject/{id} HTTP endpoint
- Step 7: Daily retention sweep
- Mirroring chain root to SubjectManifest.audit_log_chain_root
(separate concern; do at the call site)
`cargo check --workspace` is clean. `cargo test -p catalogd subject_audit`
passes 7/7.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
55 lines
1.5 KiB
TOML
[workspace]
resolver = "2"
members = [
    "crates/shared",
    "crates/proto",
    "crates/storaged",
    "crates/catalogd",
    "crates/queryd",
    "crates/aibridge",
    "crates/ingestd",
    "crates/vectord",
    "crates/journald",
    "crates/gateway",
    "crates/ui",
    "crates/lance-bench",
    "crates/vectord-lance",
    "crates/truth",
    "crates/validator",
]

[workspace.dependencies]
tokio = { version = "1", features = ["full"] }
axum = "0.8"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
thiserror = "2"
uuid = { version = "1", features = ["v4", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
tower-http = { version = "0.6", features = ["cors", "trace"] }
object_store = { version = "0.12", features = ["aws"] }
arrow = "55"
parquet = { version = "55", features = ["arrow", "async"] }
datafusion = "47"
bytes = "1"
futures = "0.3"
sha2 = "0.10"
hmac = "0.12"
url = "2"
tonic = "0.13"
prost = "0.13"
tonic-build = "0.13"
opentelemetry = "0.28"
opentelemetry_sdk = { version = "0.28", features = ["rt-tokio"] }
opentelemetry-stdout = { version = "0.28", features = ["trace"] }
tracing-opentelemetry = "0.29"
toml = "0.8"
csv = "1"
lopdf = "0.35"
encoding_rs = "0.8"
instant-distance = "0.6"
tokio-postgres = { version = "0.7", features = ["with-serde_json-1", "with-chrono-0_4", "with-uuid-1"] }
mysql_async = { version = "0.34", default-features = false, features = ["minimal"] }