Disagreement · e488cbca-openai-0

Security scanner uses PCRE tokens (\s, \b) with grep -E, causing widespread false negatives

mismatch

repo 6f7fc663 · PR #29 · reviewed 1 week ago

Primary finding

Security scanner uses PCRE tokens (\s, \b) with grep -E, causing widespread false negatives

high · security · high

skills/skill-security-scan/scan.sh:196
skills/skill-security-scan/scan.sh:81
skills/skill-security-scan/scan.sh:97
skills/skill-security-scan/scan.sh:106-109
skills/skill-security-scan/scan.sh:113-118
skills/skill-security-scan/scan.sh:136-141
skills/skill-security-scan/scan.sh:144-145

grep -E selects POSIX ERE, which does not define \s (whitespace) or \b (word boundary). GNU grep accepts both as extensions, but other grep implementations are not required to. These tokens are used throughout the HIGH/MEDIUM/LOW patterns, so wherever they are unsupported, many intended matches (e.g., "rm -rf /", "curl http://...") will go undetected, producing false negatives and potentially allowing dangerous skills to pass.

Recommendation

Either (a) rewrite the patterns in POSIX ERE, using [[:space:]]+ and explicit separators in place of \s and \b, or (b) switch to grep -P (PCRE) where available, with a runtime check and a fallback to POSIX patterns. Example fixes: replace '\s+' with '[[:space:]]+'; replace a leading '\b' with '(^|[^[:alnum:]_])' and a trailing '\b' with '([^[:alnum:]_]|$)'. Audit all patterns accordingly.
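
A minimal sketch of the POSIX ERE rewrite in option (a) might look like the following. The pattern names (PAT_RM, PAT_CURL, PAT_FORCE) and the matches helper are hypothetical and only illustrate the substitution; the actual patterns in scan.sh are not reproduced here.

```shell
#!/bin/sh
# Hypothetical building blocks; none of these names come from scan.sh.
WS='[[:space:]]+'          # portable stand-in for \s+
LB='(^|[^[:alnum:]_])'     # stand-in for a leading \b
RB='([^[:alnum:]_]|$)'     # stand-in for a trailing \b

# Illustrative patterns modelled on the examples named in the finding.
PAT_RM="${LB}rm${WS}-rf${WS}/"
PAT_CURL="${LB}curl${WS}https?://"
PAT_FORCE="${LB}git${WS}push${WS}--force${RB}"

# matches PATTERN TEXT: exit status 0 when TEXT matches PATTERN.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

matches "$PAT_RM" 'sudo rm   -rf /' && echo "HIGH: rm -rf"
matches "$PAT_CURL" 'curl http://example.invalid/x' && echo "MEDIUM: curl"
matches "$PAT_FORCE" 'git push --force' && echo "MEDIUM: force push"
matches "$PAT_RM" 'farm -rf /' || echo "no match: farm -rf"
```

Unlike \b, the boundary groups consume the neighbouring character. That makes no difference when grep is only asked whether a line matches, but it would change what -o prints.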

Counterpart finding

Scanner scans itself and self-flags HIGH due to pattern strings, producing false positives on --all

medium · bug · high

skills/skill-security-scan/scan.sh:79-85
skills/skill-security-scan/SKILL.md:1-60

When run with --all, scan.sh finds skills/skill-security-scan/SKILL.md and scans it. That SKILL.md contains literal strings like "ignore previous instructions", "you are now...", "rm -rf", "git push --force", and "curl/wget" exfiltration discussions which match HIGH and MEDIUM patterns (e.g. '[Ii]gnore\s+(all\s+)?previous\s+instructions', '[Yy]ou\s+are\s+now\s+', 'rm\s+-rf\s+\*', 'git\s+push\s+--force'). As a result, the skill that defines the scan will FAIL its own scan, causing the orchestrator (per SKILL.md step 6) to notify and exit 1 even when nothing is wrong. There is no allowlist/self-skip and no trusted-source filtering actually applied in scan.sh (TRUSTED_OWNERS/TRUSTED_REPOS are loaded but never consulted).

Recommendation

Either (a) exclude the security-scan skill from --all by default, (b) treat fenced-code and threat-model sections in SKILL.md differently, or (c) actually consult TRUSTED_OWNERS/TRUSTED_REPOS to downgrade the self-scan and known sources to format validation, as SKILL.md step 3 promises.
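
Options (a) and (c) could be combined in one small dispatch function, sketched below. The function name scan_mode, the SELF_SKILL variable, the space-separated owner/repo format assumed for TRUSTED_REPOS, and the example repository names are all assumptions made for illustration; scan.sh may store and load these values differently.

```shell
#!/bin/sh
# Assumption: skills live at skills/<name>/SKILL.md, as the paths above suggest.
SELF_SKILL="skill-security-scan"

# Assumption: TRUSTED_REPOS is a space-separated list of owner/repo strings.
# The entries below are placeholders, not real configuration.
TRUSTED_REPOS="example-org/skills example-org/tools"

# scan_mode SKILL_MD_PATH SOURCE_REPO
# Prints one of: skip, format-only, full.
scan_mode() {
    skill_name=$(basename "$(dirname "$1")")
    if [ "$skill_name" = "$SELF_SKILL" ]; then
        echo skip            # option (a): never scan the scanner's own skill
        return
    fi
    for repo in $TRUSTED_REPOS; do
        if [ "$repo" = "$2" ]; then
            echo format-only # option (c): trusted source, format validation only
            return
        fi
    done
    echo full
}

scan_mode skills/skill-security-scan/SKILL.md example-org/skills
scan_mode skills/other/SKILL.md example-org/skills
scan_mode skills/other/SKILL.md unknown/repo
```

Skipping by directory name is the bluntest form of option (a). A stricter variant would compare resolved paths so that a third-party skill cannot evade the scan simply by reusing the name.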

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.

