root 7c1222d240 Phase E: Scheduled ingest — the substrate runs itself
Background Scheduler task fires due ingests on interval, records
outcomes, reschedules. Single-flight per schedule_id so a slow run
can't pile up. 10s tick cadence, schedules' own intervals independent.

ScheduleDef persisted as JSON at primary://_schedules/{id}.json,
rebuilt on startup. ScheduleKind supports Mysql and Postgres (both
through existing streaming paths). ScheduleTrigger::Interval is
live; Cron variant defined in the enum but parsing stubbed with a
safe 1h fallback.

next_run_at set to "now" on creation so operators see success or
failure within one tick — no waiting for the first full interval.
run-now endpoint fires even when schedule is disabled (manual
override for testing). Full catalog integration: PII detection,
lineage with redacted DSN, mark-stale + autotune agent trigger.

Verified live: 20s MySQL schedule against MariaDB lh_demo.customers.
Source mutated between runs (added row + updated value). Second
auto-fire picked up both changes (10→11 rows). DataFusion SQL
confirmed mutations in the lakehouse. 6 unit tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 20:36:04 -05:00

13 lines
230 B
Rust

pub mod db_ingest;
pub mod my_stream;
pub mod pg_stream;
pub mod schedule;
pub mod detect;
pub mod csv_ingest;
pub mod json_ingest;
pub mod pdf_ingest;
pub mod pipeline;
pub mod schema_evolution;
pub mod service;
pub mod watcher;