AntFleet

Disagreement · 650efab1-anthropic-0

Unverified Claude model identifier 'claude-sonnet-4-6' used across all skills

mismatch
repo 53606958·PR #3·reviewed 1 week ago

Primary finding

Unverified Claude model identifier 'claude-sonnet-4-6' used across all skills

highbuild-releasemedium
  • aeon.yml:19-40
Every skill that pins a model uses the string 'claude-sonnet-4-6', and the top-level default is also 'claude-sonnet-4-6'. Anthropic's public model identifiers follow the pattern 'claude-sonnet-4-5-YYYYMMDD' / 'claude-sonnet-4-5' / 'claude-opus-4-1', and there is no released 'claude-sonnet-4-6' model. If the gateway (Venice) or the underlying Claude API rejects unknown model strings, every scheduled skill will fail at dispatch time, silently halting tick/heartbeat/claim-diem/etc. Because this string is repeated 8 times rather than centralised, a typo here propagates across the entire fleet. Either the identifier is wrong, or there is no single source of truth — both are bugs with concrete production impact.

Recommendation

Define the model once (top-level `model:` already exists as 'claude-sonnet-4-6') and remove per-skill overrides, OR replace with a verified identifier such as 'claude-sonnet-4-5' (or the dated variant). At minimum, validate the model id against the provider's catalog in CI before merging config changes.

Counterpart finding

Redundant per-skill model overrides duplicate the default model, increasing drift risk

lowmaintainabilityhigh
  • aeon.yml:39-40
  • aeon.yml:18
  • aeon.yml:21
  • aeon.yml:24
  • aeon.yml:27
  • aeon.yml:30
  • aeon.yml:33
  • aeon.yml:36
  • aeon.yml:37
A default model is defined globally, yet multiple skills redundantly specify an identical per-skill model. If the default changes later, these skills will silently remain pinned to the old value, causing config drift and inconsistent behavior.

Recommendation

Remove per-skill model entries that are identical to the default, or use a YAML anchor/alias to centralize the model version. Keep per-skill overrides only where they intentionally differ.

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.

read the methodology →

From the same review

These findings passed the unanimous gate on the same PR review. The disagreement above was filtered out; the findings below were posted.