Anatomy · 57e5c9ae-3

README claims testsPassed=13/13 but API hardcodes testsPassed=8/8

low · docs-gap · closed in 3c10efc

repo bf0d040b · PR #3 · reviewed 1 month ago · closed 1 month ago

The vulnerable code

agent/src/api.ts:109-117

109    health: {
110      lastEpochAt,
111      agentAlive,
112      totalEpochs: scoreCount,
113      totalProposals: proposals.length,
114      totalPromotions: proposals.filter((p: any) => p.status === 'executed').length,
115      testsPassed: 8,
116      testsTotal: 8,
117    },

The reasoning

Opus

README claims testsPassed=13/13 but API hardcodes testsPassed=8/8

low · docs-gap · high

agent/src/api.ts:109-117
README.md:88-110

The README and pitch deck loudly claim '13/13 Forge tests passing,' but the agent's /api/state response reports testsPassed=8, testsTotal=8 as hardcoded constants. The dashboard reading this state will display the wrong number, contradicting the README narrative. Also, the values are stale (a previous test count), suggesting they were never wired to actual test results.

Recommendation

Either remove these hardcoded fields or read them from a build artifact / forge JSON output.
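A minimal sketch of the second option, reading the counts from a build artifact. The artifact name, its `{"passed": ..., "total": ...}` shape, and the helper names `parseTestSummary` and `readTestSummary` are all assumptions for illustration; the repository's actual test output format is not shown here.

```typescript
import * as fs from "node:fs";

interface TestSummary {
  testsPassed: number;
  testsTotal: number;
}

// Parse a hypothetical artifact such as {"passed": 13, "total": 13}.
// Returns null on malformed or inconsistent input so the caller can omit
// the fields instead of reporting a made-up count.
function parseTestSummary(raw: string): TestSummary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { passed, total } = data as { passed?: unknown; total?: unknown };
  if (
    typeof passed !== "number" ||
    typeof total !== "number" ||
    !Number.isInteger(passed) ||
    !Number.isInteger(total) ||
    passed < 0 ||
    total < passed
  ) {
    return null;
  }
  return { testsPassed: passed, testsTotal: total };
}

// Read the artifact from disk; a missing or unreadable file yields null.
function readTestSummary(artifactPath: string): TestSummary | null {
  try {
    return parseTestSummary(fs.readFileSync(artifactPath, "utf8"));
  } catch {
    return null;
  }
}
```

The health object could then spread the result, for example `...(readTestSummary(artifactPath) ?? {})`, so the fields disappear when no artifact exists rather than showing stale constants.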

GPT-5

Proposal status update endpoint uses unsanitized path segment, enabling path traversal within BRAIN_PATH

medium · security · medium

agent/src/api.ts:112-156

proposalId comes from the URL and is interpolated directly into a filename without validating against path separators. An attacker with admin privileges could use segments like ../ to target unintended files within BRAIN_PATH. While admin auth is required, this still expands the write surface and can be abused if the admin list is misconfigured (see prior finding).

Recommendation

Validate proposalId against a strict whitelist regex (e.g., /^[A-Za-z0-9._-]+$/) and reject any path separator. Alternatively, resolve the path and assert it remains within the proposals directory using path.resolve and a prefix check.
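The two checks can be combined, as in the sketch below. It is an illustration under assumptions, not the project's code: the helper name `resolveProposalPath`, the `.json` suffix, and the directory layout are hypothetical. Note that the suggested regex on its own still accepts `.` and `..`, so the sketch rejects those explicitly and then confirms containment after resolving the path.

```typescript
import * as path from "node:path";

// Allow-list from the recommendation: letters, digits, dot, underscore, hyphen.
const PROPOSAL_ID_PATTERN = /^[A-Za-z0-9._-]+$/;

// Hypothetical helper: map a proposalId from the URL to a file inside
// proposalsDir, throwing if the id is malformed or escapes the directory.
function resolveProposalPath(proposalsDir: string, proposalId: string): string {
  // The pattern alone would still match "." and "..", so reject them by name.
  if (
    !PROPOSAL_ID_PATTERN.test(proposalId) ||
    proposalId === "." ||
    proposalId === ".."
  ) {
    throw new Error("invalid proposalId");
  }

  const base = path.resolve(proposalsDir);
  // The ".json" suffix is an assumption about how proposals are stored.
  const target = path.resolve(base, `${proposalId}.json`);

  // Defense in depth: the resolved path must stay inside the base directory.
  const rel = path.relative(base, target);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error("proposalId escapes the proposals directory");
  }
  return target;
}
```

Running both checks is deliberate: the allow-list gives a clear rejection for obviously bad input, while the containment check still holds if the pattern is later loosened.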

The agreement

Both frontier models flagged agent/src/api.ts in overlapping line ranges (109-117 and 112-156). AntFleet's unanimous gate fired, and the finding was posted on the PR. Closed in 3c10efc.
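For illustration only, an overlap test on inclusive line ranges might look like the sketch below. AntFleet's actual gate logic is not described on this page; the `LineRange` type and `rangesOverlap` function are hypothetical.

```typescript
// Hypothetical shape for a flagged location: one file plus an inclusive line range.
interface LineRange {
  file: string;
  start: number;
  end: number;
}

// Two inclusive ranges in the same file overlap when each one starts
// at or before the line where the other ends.
function rangesOverlap(a: LineRange, b: LineRange): boolean {
  return a.file === b.file && a.start <= b.end && b.start <= a.end;
}

// The two locations cited in this finding.
const opusRange: LineRange = { file: "agent/src/api.ts", start: 109, end: 117 };
const gptRange: LineRange = { file: "agent/src/api.ts", start: 112, end: 156 };
const flaggedTogether = rangesOverlap(opusRange, gptRange);
```

A check like this says only that the two locations intersect; it does not say whether the two models described the same underlying issue.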

The fix

109    health: {
110      lastEpochAt,
111      agentAlive,
112      totalEpochs: scoreCount,
113      totalProposals: proposals.length,
114      totalPromotions: proposals.filter((p: any) => p.status === 'executed').length,
115      testsPassed: 8,
116      testsTotal: 8,
117    },

Closure

Closed 1 month ago

SHA: 3c10efc6038bc5ab182e8b192224745b99bcf729


Tweet thread template

tweet 1 of 8 · 149 / 280

Two frontier models reviewed PR #3 on bf0d040b. Both found this bug: low docs-gap: README claims testsPassed=13/13 but API hardcodes testsPassed=8/8

tweet 2 of 8 · 110 / 280

The vulnerable code (agent/src/api.ts:109-117): (full snippet at https://www.antfleet.dev/anatomy/57e5c9ae-3)

tweet 3 of 8 · 280 / 280

What Opus saw: "The README and pitch deck loudly claim '13/13 Forge tests passing,' but the agent's /api/state response reports testsPassed=8, testsTotal=8 as hardcoded constants. The dashboard reading this state will display the wrong number, contradicting the README narrative…

tweet 4 of 8 · 280 / 280

What GPT-5 saw: "proposalId comes from the URL and is interpolated directly into a filename without validating against path separators. An attacker with admin privileges could use segments like ../ to target unintended files within BRAIN_PATH. While admin auth is required, this…

tweet 5 of 8 · 97 / 280

Both flagged the same line range. AntFleet's unanimous gate fired — the finding posted on the PR.

tweet 6 of 8 · 93 / 280

The fix landed in commit 3c10efc: (view diff at https://www.antfleet.dev/anatomy/57e5c9ae-3)

tweet 7 of 8 · 81 / 280

AntFleet reviews every PR with two frontier models. Only unanimous findings post.

tweet 8 of 8 · 77 / 280

Full anatomy + reasoning + diffs: https://www.antfleet.dev/anatomy/57e5c9ae-3

Paste into X composer one tweet at a time. X has no multi-tweet intent API.
