AntFleet

Disagreement · 24997e37-openai-0

Remote custom-skill scan hardcodes ref=main, breaking forks with non-main default branches

mismatch
repo 6f7fc663 · PR #12 · reviewed 1 week ago

Primary finding

Remote custom-skill scan hardcodes ref=main, breaking forks with non-main default branches

medium · bug · high
  • skills/v4-readiness/SKILL.md:42
  • skills/v4-readiness/SKILL.md:117
In remote mode, the spec instructs listing custom skills via the GitHub Contents API with ?ref=main. For repositories whose default branch is not named main (e.g., master or a custom branch), that request returns incorrect or missing data, so custom-skill detection is incomplete or wrong. Elsewhere in the same section, the table omits the ref parameter, so remote reads behave inconsistently.

Recommendation

Do not hardcode ref=main. Either omit ref to use the repository default branch, or dynamically query the repo’s default_branch via gh api repos/${TARGET} and pass that value consistently for all remote reads. Document the branch used for remote audits.
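Both options in this recommendation can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the helper names (`contents_endpoint`, `parse_default_branch`, `resolve_default_branch`) and the `skills` path are hypothetical. It relies only on the documented behavior that the Contents API uses the default branch when `ref` is omitted and that the repository metadata includes a `default_branch` field.

```python
import json
import subprocess
from typing import Optional
from urllib.parse import quote, urlencode


def contents_endpoint(target: str, path: str, ref: Optional[str] = None) -> str:
    """Build a Contents API endpoint for `gh api`.

    When ref is None, no ?ref= is added, so GitHub reads the default branch.
    """
    endpoint = f"repos/{target}/contents/{quote(path)}"
    if ref:
        endpoint += "?" + urlencode({"ref": ref})
    return endpoint


def parse_default_branch(repo_json: str) -> str:
    """Extract default_branch from the JSON body of GET /repos/{owner}/{repo}."""
    return json.loads(repo_json)["default_branch"]


def resolve_default_branch(target: str) -> str:
    """Ask the gh CLI for the repository metadata (needs gh installed and authenticated)."""
    result = subprocess.run(
        ["gh", "api", f"repos/{target}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_default_branch(result.stdout)
```

Resolving the branch once and passing the same value to every `contents_endpoint` call keeps all remote reads on one ref, which also makes it straightforward to log the branch that was audited.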

Counterpart finding

Var regex allows leading dot / consecutive dots — accepts invalid GitHub slugs

low · api-contract · medium
  • skills/v4-readiness/SKILL.md:113-114
The regex permits `./.git/config`-style segments, `..`, and leading dashes, none of which GitHub accepts in owner/repo names. `gh api` would ultimately reject these, but the documentation also says `Anything else → log V4_READINESS_BAD_VAR and exit (no notify, no article)`. Strings like `.../...` slip past the validator and then degrade into V4_READINESS_REMOTE_API_ERROR (itself undocumented). This is not a security risk, because the value is passed to `gh api` as a URL path segment via gh's own handling, but it weakens the BAD_VAR contract.

Recommendation

Tighten the regex to GitHub's actual owner/repo rules (e.g. `^[A-Za-z0-9][A-Za-z0-9-]{0,38}/[A-Za-z0-9._-]{1,100}$` excluding leading `.`/`-`).
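A tightened validator along these lines might look like the following Python sketch. The function name `is_valid_target` is hypothetical, and the pattern is the one suggested above plus a negative lookahead that enforces the "excluding leading `.`/`-`" condition on the repo part. It approximates GitHub's naming rules rather than reproducing them exactly.

```python
import re

# Owner: starts with an alphanumeric, then up to 38 alphanumerics or hyphens.
# Repo: 1-100 characters from [A-Za-z0-9._-], not starting with '.' or '-'.
SLUG_RE = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9-]{0,38}/(?![.-])[A-Za-z0-9._-]{1,100}$"
)


def is_valid_target(value: str) -> bool:
    """Return True only for strings shaped like a plausible owner/repo slug."""
    return SLUG_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["octocat/hello-world", ".../...", "../..", "-owner/repo", "owner/.git"]:
        print(candidate, is_valid_target(candidate))
```

With this check, inputs such as `.../...` fail validation up front and can be reported as V4_READINESS_BAD_VAR instead of surfacing later as an API error. Note that excluding a leading `.` also rejects repository names such as `.github`, so the repo rule may need loosening if such targets matter.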

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.
