Fix orchestrator process hang after cleanup

The orchestrator process was hanging after completing its work because:
1. Fire-and-forget Redis operations in MessageBus.handleMessage() left
   unhandled promises that kept the event loop alive
2. No explicit process.exit() call after cleanup
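
The first cause can be reproduced in isolation. A minimal sketch of the fire-and-forget pattern, where the hypothetical `fakeRedisOp` stands in for a Redis client call (this is illustrative, not the project's code):

```typescript
// Sketch: why a discarded async call needs an error handler.
// `fakeRedisOp` is a hypothetical stand-in for a Redis client operation.
function fakeRedisOp(fail: boolean): Promise<string> {
  return fail
    ? Promise.reject(new Error("connection closed"))
    : Promise.resolve("OK");
}

// Without this wrapper, a rejection from a discarded promise becomes an
// unhandledRejection, which terminates modern Node.js processes.
function fireAndForget(p: Promise<unknown>): void {
  p.catch(() => {}); // deliberately swallow errors
}

fireAndForget(fakeRedisOp(true));  // rejection swallowed, process continues
fireAndForget(fakeRedisOp(false)); // success, result discarded
```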

Changes:
- coordination.ts: Add .catch(() => {}) to fire-and-forget Redis ops
- orchestrator.ts: Add explicit process.exit(exitCode) after cleanup
- orchestrator.ts: Improve error handling in main() with proper exit codes

Tested: Pipeline mksup1wq completed full flow and exited cleanly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
profit 2026-01-24 17:01:45 -05:00
parent 92d3602852
commit ccc3b01609
2 changed files with 12 additions and 5 deletions

coordination.ts

@@ -188,10 +188,10 @@ export class MessageBus {
  }

  private handleMessage(msg: AgentMessage): void {
-    // Store in message log
-    this.redis.rPush(`msg:${this.taskId}:log`, JSON.stringify(msg));
-    this.redis.hIncrBy(`metrics:${this.taskId}`, "total_messages", 1);
-    this.redis.hIncrBy(`metrics:${this.taskId}`, "direct_messages", 1);
+    // Store in message log (fire-and-forget, errors ignored)
+    this.redis.rPush(`msg:${this.taskId}:log`, JSON.stringify(msg)).catch(() => {});
+    this.redis.hIncrBy(`metrics:${this.taskId}`, "total_messages", 1).catch(() => {});
+    this.redis.hIncrBy(`metrics:${this.taskId}`, "direct_messages", 1).catch(() => {});
    // Call registered handlers
    for (const handler of this.messageHandlers.values()) {

orchestrator.ts

@@ -389,6 +389,7 @@ The solution should consider fault tolerance, data consistency, and cost optimiz
  const orchestrator = new MultiAgentOrchestrator(model);
+  let exitCode = 0;
  try {
    await orchestrator.initialize();
    const metrics = await orchestrator.runTask(task);
@@ -402,9 +403,15 @@ The solution should consider fault tolerance, data consistency, and cost optimiz
  } catch (e: any) {
    console.error("Orchestrator error:", e.message);
+    exitCode = 1;
  } finally {
    await orchestrator.cleanup();
+    // Explicitly exit to ensure all connections are closed
+    process.exit(exitCode);
  }
}
-main().catch(console.error);
+main().catch((e) => {
+  console.error("Fatal error:", e);
+  process.exit(1);
+});
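
The control flow added to orchestrator.ts, reduced to a self-contained sketch. The `work` callback is a hypothetical stand-in for the orchestrator's run-and-cleanup sequence; returning the exit code (rather than calling `process.exit()` directly) keeps the logic testable:

```typescript
// Sketch of the exit-code pattern from the diff, with process.exit()
// factored out so the logic can be exercised in tests.
// `work` is a hypothetical stand-in for orchestrator.runTask().
async function run(work: () => Promise<void>): Promise<number> {
  let exitCode = 0;
  try {
    await work();
  } catch (e: any) {
    console.error("Orchestrator error:", e.message);
    exitCode = 1;
  } finally {
    // cleanup() would run here; the caller then invokes
    // process.exit(exitCode) so lingering handles cannot keep
    // the event loop alive.
  }
  return exitCode;
}
```

An alternative to `process.exit()` is closing every open handle explicitly (for node-redis, `client.quit()` or `client.disconnect()`) so the event loop drains naturally; the explicit exit is the more defensive choice when third-party clients may leave handles open.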