AntFleet

Disagreement · b83a0cbc-anthropic-4

Dedup check on '**Triage:**' comment ignores authorship — attacker or contributor can pre-post a comment to suppress triage

mismatch
repo 6f7fc663·PR #7·reviewed 1 week ago

Primary finding

Dedup check on '**Triage:**' comment ignores authorship — attacker or contributor can pre-post a comment to suppress triage

mediumsecuritymedium
  • skills/pr-triage/SKILL.md:67-75
The dedup defensive check selects comments by body prefix only, not by author. An untrusted PR author can include a leading line '**Triage:** ACCEPTED ...' in any of their own comments to bypass triage for 7 days. Untrusted-input note in Constraints addresses bodies/diffs, but comments by the author are not constrained.

Recommendation

Filter the dedup selector to comments authored by the bot/agent identity (e.g. .user.login == 'github-actions[bot]' or the known agent login), not by body prefix alone.

Counterpart finding

Fallback dedup based on recent triage comment ignores headRefOid; can skip re-triage after new commits

mediumbughigh
  • skills/pr-triage/SKILL.md:77-81
  • skills/pr-triage/SKILL.md:118
Primary idempotency uses (PR, headRefOid). The fallback dedup (when the JSON state is missing) only checks for any triage comment in the last 7 days, not matching headRefOid. If a new commit is pushed within 7 days and the state file is absent, the PR can be incorrectly skipped and not re-triaged.

Recommendation

Make the fallback dedup head-aware. Options: - Embed the headRefOid (short SHA) in the triage comment (e.g., hidden HTML or in the body), then compare current head with the SHA in the most recent triage comment. - Or, when the state file is missing, do not skip based solely on recent comment; instead, consult logs or always allow re-triage if headRefOid differs from any SHA found in comments. - Document the chosen approach in Step 3 and update the `gh api` example accordingly.

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.

read the methodology →