Anatomy · cdf9ffa0-3

ADMANAGE_API_KEY missing is documented as 'Hard-fails' but script actually exits 0

mediumdocs-gapclosed in 4b9b492

repo 6f7fc663·PR #4·reviewed 1 week ago·closed 1 week ago

The vulnerable code

scripts/postprocess-admanage.sh:18-18

18#   - On any API error, writes the error to results/ and continues with the next

The reasoning

Opus

ADMANAGE_API_KEY missing is documented as 'Hard-fails' but script actually exits 0

mediumdocs-gaphigh

scripts/postprocess-admanage.sh:18
scripts/postprocess-admanage.sh:46-50
scripts/postprocess-admanage-create.sh:22
scripts/postprocess-admanage-create.sh:49-54

The header comments in both scripts assert a hard-fail on missing key, but both scripts actually emit a warning and `exit 0` — i.e. soft-fail. This is a deceptive comment that an auditor will rely on when assessing safety posture. The behavior may even be intentional (so CI doesn't fail when key isn't injected for forks/PRs), but the docs must match the code.

Recommendation

Either change the header to 'Soft-fails (warns + notifies) if ADMANAGE_API_KEY is not set' or change the exit to `exit 1` to actually hard-fail.

GPT-5

Daily spend cap circuit breaker fails open if spend API returns invalid/empty JSON

mediumbughigh

scripts/postprocess-admanage.sh:55-69

If SPEND_RESP is non-JSON (or jq fails), TODAY_SPEND becomes empty. The AWK comparison then evaluates an invalid expression (" >= <cap>") and returns non-zero, which makes the if not trigger. The script proceeds to launch despite an unknown/possibly over-cap spend state. A circuit breaker should fail closed for safety.

Recommendation

Harden parsing: default TODAY_SPEND to a safe numeric value and fail closed on parse errors. Example: parsed=$(echo "$SPEND_RESP" | jq -er '.metadata.totalSpend' 2>/dev/null || echo '__ERR__'); if [ "$parsed" = '__ERR__' ]; then block launches with a warning; else compare numerically using bc or awk with explicit numbers. Alternatively, treat any fetch/parse failure as over-cap and exit.

The agreement

Both frontier models flagged this within the same line range. AntFleet's unanimous gate fired — the finding posted on the PR. Closed in 4b9b492.

The fix

18#   - On any API error, writes the error to results/ and continues with the next

Closure

Closed 1 week ago

SHA: 4b9b49251c8c9808bf147d55aa2930352af2e8c0

View closure receipt on GitHub →

Tweet thread template

tweet 1 of 8169 / 280

Two frontier models reviewed PR #4 on 6f7fc663. Both found this bug: medium docs-gap: ADMANAGE_API_KEY missing is documented as 'Hard-fails' but script actually exits 0

tweet 2 of 8123 / 280

The vulnerable code (scripts/postprocess-admanage.sh:18-18): (full snippet at https://www.antfleet.dev/anatomy/cdf9ffa0-3)

tweet 3 of 8280 / 280

What Opus saw: "The header comments in both scripts assert a hard-fail on missing key, but both scripts actually emit a warning and `exit 0` — i.e. soft-fail. This is a deceptive comment that an auditor will rely on when assessing safety posture. The behavior may even be intent…

tweet 4 of 8280 / 280

What GPT-5 saw: "If SPEND_RESP is non-JSON (or jq fails), TODAY_SPEND becomes empty. The AWK comparison then evaluates an invalid expression (" >= <cap>") and returns non-zero, which makes the if not trigger. The script proceeds to launch despite an unknown/possibly over-cap sp…

tweet 5 of 897 / 280

Both flagged the same line range. AntFleet's unanimous gate fired — the finding posted on the PR.

tweet 6 of 893 / 280

The fix landed in commit 4b9b492: (view diff at https://www.antfleet.dev/anatomy/cdf9ffa0-3)

tweet 7 of 881 / 280

AntFleet reviews every PR with two frontier models. Only unanimous findings post.

tweet 8 of 877 / 280

Full anatomy + reasoning + diffs: https://www.antfleet.dev/anatomy/cdf9ffa0-3

Paste into X composer one tweet at a time. X has no multi-tweet intent API.

view receipt →all receipts →view disagreements →case studies →