GPT-5 finding
Dry-run summary counter can be misleading: `rowsFlipped` is always 0 while `decisions` reports nonzero would-flip counts
- apps/web/scripts/backfill-benchmark-flag.ts
In dry-run mode, each entry in `decisions` records the number of rows that would flip, but the summary `rowsFlipped` stays at 0 by design. An operator who reads only the summary line sees `rowsFlipped=0 (dry-run)` and may conclude that nothing would change, even though the per-group would-flip counts logged above it are nonzero.
Recommendation
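In dry-run mode, either (a) set `rowsFlipped` to the total would-flip count, or (b) rename the field to `rowsWouldFlip` in both the returned summary and the post-state log when `dryRun` is true. Whichever option is chosen, the summary total should match the sum of the per-group counts, and the label should make clear that no rows were actually modified.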
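A minimal sketch of option (b), assuming the script can sum the per-decision counts itself. The `Decision` and `Summary` shapes, the `flipped` field name, and the `summarize` and `formatSummary` helpers are all hypothetical. The actual types in `backfill-benchmark-flag.ts` may differ.

```typescript
// Hypothetical shapes for illustration; not taken from the real script.
interface Decision {
  group: string;
  flipped: number; // rows flipped, or rows that would flip in dry-run mode
}

// Discriminated union: the field name itself says whether rows were modified.
type Summary =
  | { dryRun: true; rowsWouldFlip: number }
  | { dryRun: false; rowsFlipped: number };

function summarize(
  decisions: Decision[],
  dryRun: boolean,
  rowsActuallyFlipped: number, // count reported by the real update path
): Summary {
  if (dryRun) {
    // Sum the per-group would-flip counts so the summary matches the log above it.
    const total = decisions.reduce((sum, d) => sum + d.flipped, 0);
    return { dryRun: true, rowsWouldFlip: total };
  }
  return { dryRun: false, rowsFlipped: rowsActuallyFlipped };
}

function formatSummary(s: Summary): string {
  return s.dryRun
    ? `rowsWouldFlip=${s.rowsWouldFlip} (dry-run)`
    : `rowsFlipped=${s.rowsFlipped}`;
}
```

With this shape, a caller cannot read `rowsFlipped` from a dry-run summary without a type error, which moves the ambiguity from the log line into the type checker.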