AntFleet

Disagreement · b83a0cbc-anthropic-3

OUT-OF-SCOPE template states the PR will be closed; an unconditional 'Closing-as-not-planned' message is posted even when §8 blocks the close

mismatch
repo 6f7fc663·PR #7·reviewed 1 week ago

Primary finding

OUT-OF-SCOPE template states the PR will be closed; an unconditional 'Closing-as-not-planned' message is posted even when §8 blocks the close

mediumdocs-gaphigh
  • skills/pr-triage/SKILL.md:125-128
  • skills/pr-triage/SKILL.md:121-131
The OUT-OF-SCOPE comment template unconditionally tells the author the PR is being closed, but §8 restricts closing to a narrower set than what §5 classifies as OUT-OF-SCOPE. A contributor reading the comment will see a contradictory result (PR comment says closed, PR remains open) for protected scripts/prefetch-* and scripts/postprocess-* paths.

Recommendation

Make the OUT-OF-SCOPE comment template parametric on whether the close will run, or simplify §5 so OUT-OF-SCOPE only covers paths that §8 actually closes.

Counterpart finding

Fallback dedup based on recent triage comment ignores headRefOid; can skip re-triage after new commits

mediumbughigh
  • skills/pr-triage/SKILL.md:77-81
  • skills/pr-triage/SKILL.md:118
Primary idempotency uses (PR, headRefOid). The fallback dedup (when the JSON state is missing) only checks for any triage comment in the last 7 days, not matching headRefOid. If a new commit is pushed within 7 days and the state file is absent, the PR can be incorrectly skipped and not re-triaged.

Recommendation

Make the fallback dedup head-aware. Options: - Embed the headRefOid (short SHA) in the triage comment (e.g., hidden HTML or in the body), then compare current head with the SHA in the most recent triage comment. - Or, when the state file is missing, do not skip based solely on recent comment; instead, consult logs or always allow re-triage if headRefOid differs from any SHA found in comments. - Document the chosen approach in Step 3 and update the `gh api` example accordingly.

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.

read the methodology →