New /onboard page. Client-facing wizard for getting real data into the system without engineering help. Flow: 1. Drop a CSV (or click 'Use the sample as my data' — ships a 25-row realistic staffing roster under /samples/staffing_roster_sample.csv) 2. Browser parses client-side. Columns auto-typed (text/int/decimal/ date). PII flagged by name hint AND content regex (emails, phones). First rows previewed. Read-only — nothing written yet. 3. Name the dataset (lowercase+underscores). Commit. 4. Post-commit: dataset is live. Shows 4 next steps the operator can take (SQL query, vector index, dashboard search, playbook training). Backend: - /onboard serves onboard.html - /samples/*.csv serves CSV files from mcp-server/samples/ with filename validation (only [a-zA-Z0-9_-.]+.csv, prevents path traversal) - /onboard/ingest forwards multipart/form-data to gateway /ingest/file preserving the boundary. The generic /api/* passthrough breaks multipart because it reads as text and forwards as JSON; this route uses arrayBuffer + original Content-Type. Verified end-to-end: upload sample roster (25 rows, 12 columns) → parse in browser → show columns + PII flags + preview → commit → gateway writes Parquet, registers in catalog → immediately queryable: SELECT * FROM onboard_demo2 LIMIT 3 → Sarah Johnson, Forklift Operator, Chicago, IL, 0.92 Round-trip <1 second. Nav updated on all pages to link Onboard. Shipped with a sample CSV so the full flow is demonstrable without real client data. When a real client shows up, same path — they upload their CSV. No engineering ticket, no code change, no schema pre-definition. Security: sample filename regex prevents path traversal. CSV parse is client-side pure JS (no DOM injection). Commit uses existing /ingest/file validation (schema fingerprint, PII server-side, content-hash dedup).
Description
Rust-first object storage system
Languages
TypeScript
38.4%
Rust
35.8%
HTML
13.9%
Python
7.8%
Shell
2.1%
Other
2%