Per architecture_comparison.md universal-win for Go side: ports the Rust crates/validator/src/staffing/ to internal/validator/.

Production safety net Go was missing: FillValidator catches phantom worker IDs plus status/blacklist/geo/role mismatches; EmailValidator catches SSN-shape PII, salary disclosure, and wrong-target names in email/SMS drafts.

Files:
- types.go: Artifact (FillProposal | EmailDraft), Validator interface, WorkerLookup interface, ValidationError + Finding + Severity
- lookup.go: InMemoryWorkerLookup with case-insensitive ID lookup
- fill.go: FillValidator, running schema → completeness → cross-roster (phantom ID / status / blacklist / geo / role)
- email.go: EmailValidator, running schema → length → PII (SSN + salary) → worker-name consistency
- fill_test.go + email_test.go: 24 tests covering happy path + every error variant + the load-bearing edge cases (phone-pattern not flagged as SSN, flanking-digit guard rejects extended numeric runs)

Validator names match Rust (staffing.fill / staffing.email) so cross-runtime audit logs share the same identifier. PII scanners (containsSSNPattern, containsSalaryDisclosure) ported byte-for-byte so a draft flagged by one runtime is flagged by the other.

Caveat: the Rust validator crate also has parquet_lookup.rs (loads workers_500k.parquet at startup) and playbook.rs (additional checks). Those weren't ported in this wave; only the two load-bearing validators named in the comparison doc were.

Closes one of the two universal-win items for Go side. The other (materializer port) remains deferred: it's a bigger surface change and depends on transforms.ts source-class adapters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
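The two SSN edge cases named in the test list (phone numbers not flagged, extended numeric runs rejected by a flanking-digit guard) can be illustrated with a small standalone sketch. This is not the ported containsSSNPattern, whose source is not shown here; the function name looksLikeSSN and the exact NNN-NN-NNNN shape are assumptions made for illustration only.

```go
package main

import "fmt"

func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// looksLikeSSN is a hypothetical stand-in, not the ported scanner.
// It reports whether s contains a NNN-NN-NNNN run that is not
// directly preceded or followed by another digit.
func looksLikeSSN(s string) bool {
	const shape = "ddd-dd-dddd" // 'd' = any ASCII digit, '-' = literal dash
	for i := 0; i+len(shape) <= len(s); i++ {
		match := true
		for j := 0; j < len(shape); j++ {
			c := s[i+j]
			if shape[j] == 'd' {
				if !isDigit(c) {
					match = false
					break
				}
			} else if c != '-' {
				match = false
				break
			}
		}
		if !match {
			continue
		}
		// Flanking-digit guard: a digit on either side means this is
		// part of a longer numeric run, not an SSN-shaped token.
		if i > 0 && isDigit(s[i-1]) {
			continue
		}
		if end := i + len(shape); end < len(s) && isDigit(s[end]) {
			continue
		}
		return true
	}
	return false
}

func main() {
	fmt.Println(looksLikeSSN("SSN 123-45-6789 on file")) // true
	fmt.Println(looksLikeSSN("call 555-123-4567"))       // false: middle group has 3 digits
	fmt.Println(looksLikeSSN("ref 123-45-67890"))        // false: trailing digit extends the run
}
```

A byte-wise scan like this avoids a regex dependency, which may make it easier to keep two runtimes in step, but whether the real scanners work this way is not something the commit message states.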
57 lines
1.6 KiB
Go
package validator

import "strings"

// InMemoryWorkerLookup is a zero-deps WorkerLookup useful for tests
// and small-fixture validation. Mirrors Rust's
// `InMemoryWorkerLookup::from_records`.
//
// Lookup is case-insensitive on candidate_id: Rust's HashMap matches
// keys exactly, and the source data's casing is inconsistent (some IDs
// uppercase, some lowercase, some mixed), so a case-sensitive lookup
// misses real matches. Trimming and lower-casing keys on insert and on
// lookup keeps the contract.
type InMemoryWorkerLookup struct {
	byID map[string]WorkerRecord
}

// NewInMemoryWorkerLookup builds a lookup from a list of records.
// Duplicate candidate_ids: last-write-wins. Empty or whitespace-only
// candidate_id: skipped.
func NewInMemoryWorkerLookup(records []WorkerRecord) *InMemoryWorkerLookup {
	m := make(map[string]WorkerRecord, len(records))
	for _, r := range records {
		// Normalize before the emptiness check so a whitespace-only ID
		// is not stored under the empty key.
		key := strings.ToLower(strings.TrimSpace(r.CandidateID))
		if key == "" {
			continue
		}
		m[key] = r
	}
	return &InMemoryWorkerLookup{byID: m}
}

// Find satisfies WorkerLookup. Returns (rec, true) on hit,
// (nil, false) on miss. Safe to call on a nil receiver.
func (l *InMemoryWorkerLookup) Find(candidateID string) (*WorkerRecord, bool) {
	if l == nil {
		return nil, false
	}
	r, ok := l.byID[strings.ToLower(strings.TrimSpace(candidateID))]
	if !ok {
		return nil, false
	}
	// Return a copy so callers can't mutate the lookup's internal state.
	// The copy is shallow: pointer fields (City/State/Role) still refer
	// to the same underlying strings as the stored record.
	cp := r
	return &cp, true
}

// Len exposes the size for tests + admin endpoints.
func (l *InMemoryWorkerLookup) Len() int {
	if l == nil {
		return 0
	}
	return len(l.byID)
}

// strPtr is a tiny convenience for tests that need *string fields
// on WorkerRecord.City/State/Role.
func strPtr(s string) *string { return &s }
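
The lookup's contract (case-insensitive keys, last-write-wins on duplicates, blank IDs skipped, copy-on-return, nil-safe receiver) can be exercised with a minimal standalone sketch. WorkerRecord below is a hypothetical cut-down stand-in, since the real type lives in types.go and is not shown here; the lowercase lookup, newLookup, find, and normalizeID names are likewise illustrative, and the sketch trims before the emptiness check.

```go
package main

import (
	"fmt"
	"strings"
)

// WorkerRecord is a hypothetical, cut-down stand-in for the real type.
type WorkerRecord struct {
	CandidateID string
	City        *string
}

type lookup struct{ byID map[string]WorkerRecord }

// normalizeID applies the same trim + lower-case rule on insert and lookup.
func normalizeID(id string) string {
	return strings.ToLower(strings.TrimSpace(id))
}

func newLookup(records []WorkerRecord) *lookup {
	m := make(map[string]WorkerRecord, len(records))
	for _, r := range records {
		key := normalizeID(r.CandidateID)
		if key == "" {
			continue // empty or whitespace-only ID: skipped
		}
		m[key] = r // duplicate key: last write wins
	}
	return &lookup{byID: m}
}

func (l *lookup) find(id string) (*WorkerRecord, bool) {
	if l == nil {
		return nil, false
	}
	r, ok := l.byID[normalizeID(id)]
	if !ok {
		return nil, false
	}
	cp := r // shallow copy; pointer fields still alias the stored record
	return &cp, true
}

func strPtr(s string) *string { return &s }

func main() {
	l := newLookup([]WorkerRecord{
		{CandidateID: "W-001", City: strPtr("Austin")},
		{CandidateID: "w-001", City: strPtr("Dallas")}, // same key as above
		{CandidateID: "   "},                           // skipped
	})

	rec, ok := l.find(" W-001 ")
	fmt.Println(ok, *rec.City, len(l.byID)) // true Dallas 1

	// Mutating the returned copy does not touch the stored record.
	rec.CandidateID = "mutated"
	again, _ := l.find("w-001")
	fmt.Println(again.CandidateID) // w-001

	// A nil lookup reports a miss instead of panicking.
	var none *lookup
	_, ok = none.find("W-001")
	fmt.Println(ok) // false
}
```

Because the returned value is a shallow copy, assigning through a pointer field such as `*rec.City = "x"` would still be visible to later callers; a deep copy would be needed to rule that out.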