AntFleet

Disagreement · f3aba81f-anthropic-0

Target-side semantics contradict between schema docs and evaluation logic

mismatch
repo 6f7fc663 · PR #23 · reviewed 1 week ago

Primary finding

Target-side semantics contradict between schema docs and evaluation logic

medium · docs-gap · high
  • skills/price-threshold-alert/SKILL.md:84-86
  • skills/price-threshold-alert/SKILL.md:146-152
The invariant text describes `side` from the operator's perspective: 'waiting for the price to climb to it' implies that the target sits above the current price. The cross condition, in turn, treats `side=above` as 'price must rise to/above target'. The two readings agree for `above`, but the schema comment is still confusing, because `side` actually names the *target's position relative to the current price at registration*, not the direction the price must move. The names `above`/`below` are ambiguous, and the docs never state whether `side` refers to the operator's goal direction, the target's position, or the cross direction. An implementer reading only the schema block could swap the comparison.

Recommendation

Rename `side` to something unambiguous like `cross_direction: up|down` (the direction the price must move to hit the target) and align all three references (schema, invariant, Step 6) to one definition with an example.
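A minimal sketch of what a single, unambiguous definition could look like, assuming the rename to `cross_direction` is adopted. The function name `has_crossed` and the inclusive comparison for `down` are assumptions made for illustration; only the 'rise to/above target' reading for the upward case comes from the finding above.

```python
from typing import Literal

# Hypothetical type for the proposed field; the skill itself currently uses `side`.
CrossDirection = Literal["up", "down"]


def has_crossed(price: float, target: float, cross_direction: CrossDirection) -> bool:
    """Return True once the price has reached the target in the stated direction.

    "up"   : the price must rise to or above the target.
    "down" : the price must fall to or below the target (assumed symmetric).
    """
    if cross_direction == "up":
        return price >= target
    if cross_direction == "down":
        return price <= target
    # Reject anything else instead of silently picking a comparison.
    raise ValueError(f"unknown cross_direction: {cross_direction!r}")
```

For example, a target of 1.00 registered with `cross_direction="up"` fires at a price of 1.05 but not at 0.95. Naming the direction of movement leaves no room to swap the comparison.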

Counterpart finding

Token symbol is used in logs/notifications but not extracted from the API response

mediumapi-contracthigh
  • skills/price-threshold-alert/SKILL.md:98-101
  • skills/price-threshold-alert/SKILL.md:221-223
  • skills/price-threshold-alert/SKILL.md:156-164
The extraction list omits the token symbol, yet the log and notification templates reference `$TOKEN`/`${SYMBOL}`. Unless the symbol is extracted (e.g., from `.baseToken.symbol`), these messages cannot be populated correctly.

Recommendation

Extract the symbol from the chosen pair (DexScreener provides `baseToken.symbol`). Fall back to the contract or a placeholder if it is missing.
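The fallback chain could be sketched roughly as follows, assuming the chosen pair has already been parsed into a dictionary. The function name `extract_symbol`, the `contract` parameter (taken here to be the contract identifier already known to the skill), and the `"UNKNOWN"` placeholder are hypothetical; only the `baseToken.symbol` path comes from the finding.

```python
from typing import Any, Mapping


def extract_symbol(
    pair: Mapping[str, Any],
    contract: str,
    placeholder: str = "UNKNOWN",  # hypothetical placeholder value
) -> str:
    """Pick a display symbol for logs and notifications.

    Order of preference: pair["baseToken"]["symbol"], then the contract,
    then a fixed placeholder.
    """
    base_token = pair.get("baseToken") or {}
    symbol = base_token.get("symbol")
    # Treat missing, non-string, and blank symbols the same way.
    if isinstance(symbol, str) and symbol.strip():
        return symbol.strip()
    return contract or placeholder
```

With a fallback like this in place, the templates always receive some value, even when the response carries no symbol.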

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.
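As a rough illustration of the rule described above, the sketch below compares two findings on severity and category only. The `Finding` type and the function name are hypothetical, and the values assume that the badges on each finding read as severity, then category, then a third label that plays no part in the comparison. How AntFleet decides that two findings are about the same issue is not described here and is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str
    category: str


def meets_agreement_threshold(a: Finding, b: Finding) -> bool:
    """True only when both reviewers assigned the same severity and category."""
    return a.severity == b.severity and a.category == b.category


# Values read from the two findings on this page (an interpretation of the badges).
primary = Finding(severity="medium", category="docs-gap")
counterpart = Finding(severity="medium", category="api-contract")

print(meets_agreement_threshold(primary, counterpart))
```

Under these assumptions the two findings share a severity but differ in category, so the check fails.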
