Anatomy · e488cbca-0

Trusted-sources logic documented in SKILL.md is not implemented in scan.sh

medium · docs-gap · closed in 4b9b492

repo 6f7fc663 · PR #29 · reviewed 1 week ago · closed 1 week ago

The vulnerable code

skills/skill-security-scan/SKILL.md:30-35

30   ./skills/skill-security-scan/scan.sh <path-to-SKILL.md>
31   ```
32   The scanner checks for the threat categories above and outputs findings with severity levels (HIGH, MEDIUM, LOW).
33
34   3. **Check trusted sources**: Read `skills/security/trusted-sources.txt`. Skills from trusted sources get a reduced scan (format validation only, skip content analysis). The source is determined by checking git remote or the skill's frontmatter for an origin field.
35

The reasoning

Opus

Scanner scans itself and self-flags HIGH due to pattern strings, producing false positives on --all

medium · bug · high

skills/skill-security-scan/scan.sh:79-85
skills/skill-security-scan/SKILL.md:1-60

When run with --all, scan.sh finds skills/skill-security-scan/SKILL.md and scans it. That SKILL.md contains literal strings like "ignore previous instructions", "you are now...", "rm -rf", "git push --force", and "curl/wget" exfiltration discussions which match HIGH and MEDIUM patterns (e.g. '[Ii]gnore\s+(all\s+)?previous\s+instructions', '[Yy]ou\s+are\s+now\s+', 'rm\s+-rf\s+\*', 'git\s+push\s+--force'). As a result, the skill that defines the scan will FAIL its own scan, causing the orchestrator (per SKILL.md step 6) to notify and exit 1 even when nothing is wrong. There is no allowlist/self-skip and no trusted-source filtering actually applied in scan.sh (TRUSTED_OWNERS/TRUSTED_REPOS are loaded but never consulted).

Recommendation

Either (a) exclude the security-scan skill from --all by default, (b) treat fenced-code/threat-model sections in SKILL.md differently, or (c) actually consult TRUSTED_OWNERS/TRUSTED_REPOS to downgrade self-scan / known sources to format validation as the SKILL.md step 3 promises.
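Option (c) could look roughly like the sketch below. It is only an illustration: the list values, the `owner/repo` origin format, and the helper names `is_trusted_source` and `scan_mode` are assumptions, since the finding only states that TRUSTED_OWNERS and TRUSTED_REPOS are loaded but never consulted.

```sh
#!/bin/sh
# Hypothetical values; in scan.sh these would be loaded from trusted-sources.txt.
TRUSTED_OWNERS="example-org"
TRUSTED_REPOS="someone/vetted-skills"

# Succeeds when the "owner/repo" origin matches a trusted owner or a trusted repo.
is_trusted_source() {
  origin=$1
  owner=${origin%%/*}          # text before the first slash
  for o in $TRUSTED_OWNERS; do
    [ "$o" = "$owner" ] && return 0
  done
  for r in $TRUSTED_REPOS; do
    [ "$r" = "$origin" ] && return 0
  done
  return 1
}

# Prints the scan depth described in SKILL.md step 3:
# trusted sources get format validation only, everything else a full scan.
scan_mode() {
  if is_trusted_source "$1"; then
    echo "format-only"
  else
    echo "full"
  fi
}
```

With these assumed lists, `scan_mode example-org/any-skill` prints `format-only`, while `scan_mode someone/other` prints `full`.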

GPT-5

Security scanner uses PCRE tokens (\s, \b) with grep -E, causing widespread false negatives

high · security · high

skills/skill-security-scan/scan.sh:196
skills/skill-security-scan/scan.sh:81
skills/skill-security-scan/scan.sh:97
skills/skill-security-scan/scan.sh:106-109
skills/skill-security-scan/scan.sh:113-118
skills/skill-security-scan/scan.sh:136-141
skills/skill-security-scan/scan.sh:144-145

grep -E implements POSIX ERE, which does not support \s (whitespace) or \b (word boundary). These tokens are used throughout HIGH/MEDIUM/LOW patterns, so many intended matches (e.g., "rm -rf /", "curl http://...") will not be detected, producing false negatives and potentially allowing dangerous skills to pass.

Recommendation

Either: (a) rewrite patterns to POSIX ERE using [[:space:]]+, explicit separators, and avoid \b; or (b) switch to grep -P (PCRE) if available, with a runtime check/fallback to POSIX patterns. Example fixes: replace '\s+' with '[[:space:]]+', replace '\b' with '(^|[^[:alnum:]_])' around tokens; audit all patterns accordingly.
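A minimal sketch of option (a), using only POSIX ERE constructs. The variable names and the `scan_line` helper are hypothetical, and the patterns are simplified versions of those quoted in the findings, not the actual contents of scan.sh.

```sh
#!/bin/sh
# POSIX character class in place of \s+
WS='[[:space:]]+'

# Simplified rewrites of patterns quoted in the findings.
PAT_RM="rm${WS}-rf"
PAT_IGNORE="[Ii]gnore${WS}(all${WS})?previous${WS}instructions"

# In place of \b: the token must follow start-of-line or a non-word character.
PAT_CURL="(^|[^[:alnum:]_])curl${WS}https?://"

# Prints HIT when the text ($2) matches the ERE pattern ($1), otherwise clean.
# -e keeps grep from treating a pattern that starts with '-' as an option.
scan_line() {
  if printf '%s\n' "$2" | grep -Eq -e "$1"; then
    echo "HIT"
  else
    echo "clean"
  fi
}
```

For example, `scan_line "$PAT_CURL" "run curl http://example.com"` prints `HIT`, while the same pattern against `libcurl http://example.com` prints `clean` because `curl` is preceded by a word character.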

The agreement

Both frontier models flagged this within the same line range. AntFleet's unanimous gate fired — the finding posted on the PR. Closed in 4b9b492.

The fix

30   - `.github/workflows/*.yml` (CI — especially `run:` blocks referencing `${{ ... }}`)
31   - `scripts/*.sh` (repo-level scripts)
32
33   When `${var}` is set:
34   - If it matches an existing SKILL.md path (absolute or relative) → scan that file only
35   - Else if a directory exists at `skills/${var}/` → scan everything under it
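The two resolution rules in this excerpt could be sketched as follows. The helper name `resolve_targets` is hypothetical, and the final failure branch is an assumption, because the excerpt ends before any further cases are shown.

```sh
#!/bin/sh
# Prints the files to scan for a given target argument, one per line.
resolve_targets() {
  target=$1
  if [ -f "$target" ] && [ "$(basename "$target")" = "SKILL.md" ]; then
    # An existing SKILL.md path (absolute or relative): scan that file only.
    printf '%s\n' "$target"
  elif [ -d "skills/$target" ]; then
    # A skill directory: scan everything under it.
    find "skills/$target" -type f
  else
    # Assumed behavior: report failure when nothing matches.
    return 1
  fi
}
```

Run from the repository root, `resolve_targets skills/demo/SKILL.md` would print that single path, and `resolve_targets demo` would list every file under `skills/demo/`.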

Closure

Closed 1 week ago

SHA: 4b9b49251c8c9808bf147d55aa2930352af2e8c0


