Primary finding
Provider selection can oversubscribe maxConcurrency when all providers are full
- src/providers/orchestrator.ts:197-201
- src/providers/orchestrator.ts:203-209
When every provider has reached its maxConcurrency, the fallback rebuilds the eligible set without the concurrency check, so providers already at their limit can be selected. Because the send/stream loops increment activeConcurrency without rechecking the limit, a provider can be pushed beyond its configured maxConcurrency, which defeats the throttle and risks overloading it.
Recommendation
When repopulating eligible after an empty set, do not drop the maxConcurrency guard. If a constraint must be relaxed, relax status only: admit degraded providers, but still require status !== 'down' AND activeConcurrency < maxConcurrency. If no provider qualifies even then (or as an alternative to relaxing anything), enqueue the request or short-circuit with a clear overload error.
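
A minimal sketch of the fallback that keeps the concurrency guard, for illustration only. The Provider shape, the 'healthy' status value, selectEligible, hasCapacity, and ProviderOverloadError are hypothetical names, not taken from orchestrator.ts; only status, 'degraded', 'down', activeConcurrency, and maxConcurrency come from the finding above.

```typescript
// Hypothetical types: the real definitions in orchestrator.ts may differ.
type ProviderStatus = 'healthy' | 'degraded' | 'down';

interface Provider {
  name: string;
  status: ProviderStatus;
  activeConcurrency: number;
  maxConcurrency: number;
}

// Hypothetical error used to short-circuit when nothing has spare capacity.
class ProviderOverloadError extends Error {
  constructor() {
    super('All providers are down or at maxConcurrency');
    this.name = 'ProviderOverloadError';
  }
}

// The concurrency guard, applied in every pass.
const hasCapacity = (p: Provider): boolean =>
  p.activeConcurrency < p.maxConcurrency;

function selectEligible(providers: Provider[]): Provider[] {
  // First pass: healthy providers with spare capacity.
  let eligible = providers.filter(
    (p) => p.status === 'healthy' && hasCapacity(p),
  );

  if (eligible.length === 0) {
    // Fallback: relax status (admit degraded), never the concurrency guard.
    eligible = providers.filter(
      (p) => p.status !== 'down' && hasCapacity(p),
    );
  }

  if (eligible.length === 0) {
    // Nothing can take more work: fail fast instead of oversubscribing.
    throw new ProviderOverloadError();
  }

  return eligible;
}
```

With this shape, a caller can catch the overload error and decide whether to queue the request or surface the failure. The send/stream loops would still benefit from rechecking the limit at increment time, since selection and increment are not atomic.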