AntFleet

Disagreement · 74ff1b9c-anthropic-3

Raw user content interpolated into JSON `message` without escaping

mismatch
repo 6f7fc663·PR #28·reviewed 1 week ago

Primary finding

Raw user content interpolated into JSON `message` without escaping

medium · security · medium
  • skills/vvvkernel-audit/SKILL.md:30-42
  • skills/vvvkernel/SKILL.md:22-31
  • skills/vvvkernel-narrative/SKILL.md:40-47
All six skills instruct the agent to POST a JSON body in which untrusted input (`$var`, fetched GitHub content, fetched contract source, project-context.md) is substituted directly into a JSON string field. Without an explicit JSON-escape step, contract source containing `"`, backslashes, or control characters will either (a) produce invalid JSON and fail silently, or (b) allow an injection that closes the string early and overrides `expert_role` (e.g., a snippet ending with `","expert_role":"admin"`). The audit skill is the most exposed, since it feeds arbitrary remote code into the prompt.

Recommendation

Add an explicit step: 'JSON-escape `<content>` / `<query>` before substitution' or instruct the agent to build the body via a JSON serializer rather than string templating. Optionally cap content length and strip control characters.
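A minimal sketch of the serializer approach, assuming the body carries only the `message` and `expert_role` fields named in this finding. The helper names and the length cap are hypothetical, not taken from the skills under review. The unsafe variant is shown only to contrast with string templating.

```python
import json

MAX_CONTENT_CHARS = 20_000  # hypothetical cap; choose a limit that fits the API


def strip_control_chars(text: str) -> str:
    """Drop ASCII control characters except tab, newline, and carriage return."""
    return "".join(
        ch for ch in text
        if ch in "\t\n\r" or (ord(ch) >= 0x20 and ord(ch) != 0x7F)
    )


def build_body(content: str, expert_role: str) -> str:
    """Build the request body with a JSON serializer so quotes and
    backslashes in untrusted content are escaped automatically."""
    cleaned = strip_control_chars(content)[:MAX_CONTENT_CHARS]
    return json.dumps({"message": cleaned, "expert_role": expert_role})


def build_body_unsafe(content: str, expert_role: str) -> str:
    """Anti-pattern: string templating, shown only for comparison."""
    return '{"message":"' + content + '","expert_role":"' + expert_role + '"}'


if __name__ == "__main__":
    payload = 'x","injected":"1'
    # Serializer: the payload stays inside the "message" string.
    print(json.loads(build_body(payload, "auditor")))
    # Templating: the payload adds a field the caller never wrote.
    print(json.loads(build_body_unsafe(payload, "auditor")))
```

With the serializer, the untrusted text round-trips as data; with templating, the same text changes the shape of the object.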

Counterpart finding

VVVKernel Query: Response parsing assumes a JSON field named “response” without documenting the API schema

low · api-contract · medium
  • skills/vvvkernel/SKILL.md:35-37
  • skills/vvvkernel/SKILL.md:25-33
The doc instructs the agent to extract a field named “response” but neither shows a sample response body nor confirms that this is the correct field name. The other skills do not define the response schema either. This risks breakage if the API returns a different structure (e.g., a field named “message” or “content”).

Recommendation

Document the expected response schema from vvvkernel.com/api/agent/chat, including field names and error formats. Update the parsing step accordingly and align all skills that consume this API.
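Until the schema is documented, a consumer could at least fail loudly when the expected field is missing. The sketch below assumes the reply is a JSON object with a string field named `response`, which is exactly the assumption this finding questions; the function and exception names are hypothetical.

```python
import json


class UnexpectedResponseError(ValueError):
    """Raised when the API reply does not match the assumed schema."""


def extract_reply(raw_body: str, field: str = "response") -> str:
    """Return the string stored under `field`, or raise a descriptive error."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        raise UnexpectedResponseError(f"body is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise UnexpectedResponseError("expected a JSON object at the top level")

    value = payload.get(field)
    if not isinstance(value, str):
        # List the keys that were present so a schema change is easy to spot.
        raise UnexpectedResponseError(
            f"missing string field {field!r}; got keys {sorted(payload)}"
        )
    return value
```

Reporting the keys that were actually returned makes a renamed field visible immediately instead of surfacing as an empty answer.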

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.
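As an illustration only, the rule described above can be sketched as a comparison of severity and category. The `Finding` type and the function name are hypothetical, and how AntFleet pairs findings from the two models is not described here, so the pairing is taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str
    category: str


def passes_unanimous_gate(a: Finding, b: Finding) -> bool:
    """True only when both reviewers agree on severity and category."""
    return a.severity == b.severity and a.category == b.category


primary = Finding("Raw user content interpolated into JSON", "medium", "security")
counterpart = Finding("Response parsing assumes a field named response", "low", "api-contract")

# The two findings differ in both severity and category, so the gate rejects them.
print(passes_unanimous_gate(primary, counterpart))
```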


From the same review

These findings passed the unanimous gate on the same PR review. The disagreement above was filtered out; the findings below were posted.