AntFleet

Anatomy · 52c1a3b9-0

isTransientError always returns true — non-transient errors are retried, burning attempts and LLM budget

highbugclosed in 867c815
repo e24ef98c·PR #11·reviewed 1 week ago·closed 1 week ago

The vulnerable code

apps/web/lib/review-worker.ts:268-282

268 ms: p.ms,
269 })),
270 });
271
272 // Onboarder summary fires regardless of agreed/degraded. It self-gates
273 // on ONBOARDER_ENABLED + first-review-only. Failures are logged but
274 // never bubble; the review itself is the load-bearing outcome.
275 try {
276 const perProviderFindingCounts: Record<string, number> = {};
277 for (const p of bundle.perProvider) {
278 perProviderFindingCounts[p.name] = p.output?.findings.length ?? 0;
279 }
280 await deps.runFirstReviewSummary({
281 installationId: row.installationId,
282 owner: row.owner,

The reasoning

Opus

isTransientError always returns true — non-transient errors are retried, burning attempts and LLM budget

highbughigh
  • apps/web/lib/review-worker.ts:268-282
  • apps/web/lib/review-worker.ts:247-282
The function's documentation says non-transient errors (own bugs, 4xx other than 429) should not be retried because retrying won't help and would burn LLM budget. However, the function unconditionally returns true at the end after the (incomplete) heuristics. This means any TypeError, ReferenceError, 400, 401, 403, 404, 422 from the GitHub API, validation error in code, etc. is classified as transient and triggers retries up to MAX_PROCESSING_ATTEMPTS (6) attempts, costing ~62 minutes of backoff per failure plus 6× the cost of the LLM call (since reviewPR can be invoked again each retry). The tests in review-worker.test.ts only exercise positive cases of isTransientError; they never assert that a deterministic non-retryable error returns false, so this lying default is unguarded. The comment block above the bare `return true;` actively misleads readers: it says reviewPR-pipeline throws are 'almost certainly infrastructure' — but ANY error (even from getChangedFiles, updateReview, postPRComment, recordFindingStatuses, getInstallationToken, runFirstReviewSummary, formatPRComment) is rolled into this default-true.

Recommendation

Default to `return false;` after the explicit transient signal checks. Optionally add an allow-list for known 4xx-permanent codes (400/401/403/404/422) to be explicit. Update tests to cover the false case.

GPT-5

Output unavailable for this row.

The agreement

Both frontier models flagged this within the same line range. AntFleet's unanimous gate fired — the finding posted on the PR. Closed in 867c815.

The fix

268 name: p.name,
269 ok: p.output !== null,
270 ms: p.ms,
271 })),
272 });
273
274 // Onboarder summary fires regardless of agreed/degraded. It self-gates
275 // on ONBOARDER_ENABLED + first-review-only. Failures are logged but
276 // never bubble; the review itself is the load-bearing outcome.
277 try {
278 const perProviderFindingCounts: Record<string, number> = {};
279 for (const p of bundle.perProvider) {
280 perProviderFindingCounts[p.name] = p.output?.findings.length ?? 0;
281 }
282 await deps.runFirstReviewSummary({

Closure

Closed 1 week ago

SHA: 867c815221cf9983afea6d51360baecf012cfdc6

View closure receipt on GitHub →

Tweet thread template

tweet 1 of 8185 / 280

Two frontier models reviewed PR #11 on e24ef98c. Both found this bug: high bug: isTransientError always returns true — non-transient errors are retried, burning attempts and LLM budget

tweet 2 of 8123 / 280

The vulnerable code (apps/web/lib/review-worker.ts:268-282): (full snippet at https://www.antfleet.dev/anatomy/52c1a3b9-0)

tweet 3 of 8280 / 280

What Opus saw: "The function's documentation says non-transient errors (own bugs, 4xx other than 429) should not be retried because retrying won't help and would burn LLM budget. However, the function unconditionally returns true at the end after the (incomplete) heuristics. Th…

tweet 4 of 837 / 280

What GPT-5 saw: "Output unavailable"

tweet 5 of 897 / 280

Both flagged the same line range. AntFleet's unanimous gate fired — the finding posted on the PR.

tweet 6 of 893 / 280

The fix landed in commit 867c815: (view diff at https://www.antfleet.dev/anatomy/52c1a3b9-0)

tweet 7 of 881 / 280

AntFleet reviews every PR with two frontier models. Only unanimous findings post.

tweet 8 of 877 / 280

Full anatomy + reasoning + diffs: https://www.antfleet.dev/anatomy/52c1a3b9-0

Paste into X composer one tweet at a time. X has no multi-tweet intent API.