AntFleet

Disagreement · 748568f3-openai-2

Dry-run summary counter can be misleading: rowsFlipped always 0 while decisions report would-flip counts

solo GPT-5
repo e24ef98c · PR #9 · reviewed 1 week ago

GPT-5 finding

low · maintainability · medium
  • apps/web/scripts/backfill-benchmark-flag.ts
In dry-run mode, each entry in `decisions` records the number of rows that would flip, but the summary `rowsFlipped` stays at 0 by design. An operator who reads only the summary line sees `rowsFlipped=0 (dry-run)` even though the per-group counts above it are nonzero, which is easy to misread as "nothing would change".

Recommendation

In dry-run mode, either (a) set `rowsFlipped` to the total would-flip count, or (b) rename the field to `rowsWouldFlip` in the returned summary and the post-state log when `dryRun` is true. Clarify the label to avoid misinterpretation.
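
Option (b) could look roughly like the sketch below. The script's real types are not shown in this review, so `Decision`, `Summary`, `summarize`, and `formatSummary` are hypothetical names, and the sketch assumes each decision carries a single per-group `rowsFlipped` count.

```typescript
// Hypothetical shapes: the real script's types are not shown in the review.
interface Decision {
  group: string;
  rowsFlipped: number; // in dry-run mode this is the number of rows that WOULD flip
}

// The field name depends on the mode, so a dry-run summary can never
// be mistaken for a report of rows that were actually changed.
type Summary =
  | { dryRun: true; rowsWouldFlip: number }
  | { dryRun: false; rowsFlipped: number };

function summarize(decisions: Decision[], dryRun: boolean): Summary {
  // Total the per-group counts recorded in the decisions.
  const total = decisions.reduce((sum, d) => sum + d.rowsFlipped, 0);
  return dryRun
    ? { dryRun: true, rowsWouldFlip: total }
    : { dryRun: false, rowsFlipped: total };
}

function formatSummary(s: Summary): string {
  return s.dryRun
    ? `rowsWouldFlip=${s.rowsWouldFlip} (dry-run)`
    : `rowsFlipped=${s.rowsFlipped}`;
}
```

With two groups that would flip 3 and 4 rows, a dry run would then log `rowsWouldFlip=7 (dry-run)` instead of `rowsFlipped=0 (dry-run)`.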

Other reviewer

The other reviewer flagged nothing in this file/line range.

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.
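
As a rough illustration of the gate described above, the sketch below keeps a finding only when the other reviewer has a finding in an overlapping file/line range with the same severity and category. AntFleet's actual schema and matching logic are not shown on this page, so the `Finding` shape, the severity values, and the overlap rule are all assumptions.

```typescript
// Hypothetical finding shape; AntFleet's actual schema is not shown here.
interface Finding {
  file: string;
  startLine: number;
  endLine: number;
  severity: "low" | "medium" | "high";
  category: string;
}

// Two findings overlap when they are in the same file and neither
// line range ends before the other starts.
function overlaps(a: Finding, b: Finding): boolean {
  return a.file === b.file && a.startLine <= b.endLine && b.startLine <= a.endLine;
}

// Keep only reviewer A's findings that reviewer B also flagged
// with the same severity and category.
function unanimous(reviewerA: Finding[], reviewerB: Finding[]): Finding[] {
  return reviewerA.filter((a) =>
    reviewerB.some(
      (b) => overlaps(a, b) && a.severity === b.severity && a.category === b.category,
    ),
  );
}
```

Under these assumptions, a finding like the one on this page is dropped because the second reviewer's list contains no entry for the same file and line range.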

From the same review

These findings passed the unanimous gate on the same PR review. The disagreement above was filtered out; the findings below were posted.