AntFleet

Disagreement · 3a9ae97b-anthropic-6

Observability /events SSE stream allocates seen_ids set bounded only by full clear, opening a memory + dedup correctness gap

mismatch
repo 193af03f · PR #1 · reviewed 1 week ago

Primary finding

Observability /events SSE stream allocates seen_ids set bounded only by full clear, opening a memory + dedup correctness gap

low · bug · high
  • backend/app/api/observability.py:65-75
When seen_ids reaches 5000 entries it is cleared wholesale, so any event that was seen before the clear and reappears in a later batch is re-emitted to the SSE consumer as a duplicate. A bounded structure that evicts only the oldest entries (e.g. a collections.deque paired with a set for FIFO eviction, or an OrderedDict for LRU eviction) would preserve the dedup property. Minor, but a correctness regression on long-lived SSE connections.

Recommendation

Replace the seen_ids set with a fixed-capacity structure (an OrderedDict-based LRU, or a deque paired with a set) that evicts the oldest entries instead of clearing everything at once.
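
A minimal sketch of the recommended replacement, assuming event IDs are strings and that the existing 5000-entry cap is kept as the capacity. The class name BoundedSeenIds and the method add_if_new are hypothetical and do not come from the reviewed code; the actual SSE loop in observability.py is not shown in this finding.

```python
from collections import OrderedDict


class BoundedSeenIds:
    """Fixed-capacity dedup set that evicts the oldest ID instead of clearing.

    Hypothetical helper for illustration; not part of the reviewed module.
    """

    def __init__(self, capacity: int = 5000) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._ids: "OrderedDict[str, None]" = OrderedDict()

    def add_if_new(self, event_id: str) -> bool:
        """Return True if event_id has not been seen (caller should emit it)."""
        if event_id in self._ids:
            # Already seen: refresh its recency so it is evicted last (LRU).
            self._ids.move_to_end(event_id)
            return False
        self._ids[event_id] = None
        if len(self._ids) > self._capacity:
            # Evict only the single least recently seen entry.
            self._ids.popitem(last=False)
        return True

    def __len__(self) -> int:
        return len(self._ids)
```

In the stream loop, a check such as `if seen.add_if_new(event_id): emit(event)` would replace the membership test plus periodic clear. Memory stays bounded by the capacity, and only IDs older than the most recent 5000 can ever be re-emitted.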

Counterpart finding

Observability endpoints accept unvalidated simulation_id in path construction (potential path traversal)

low · security · medium
  • backend/app/api/observability.py:40-84
  • backend/app/api/observability.py:116-135
The simulation_id query parameter is interpolated into filesystem paths without validation (unlike other modules, which call validate_simulation_id). Although the filename is constrained to 'events.jsonl', a crafted value containing '..' segments can attempt directory traversal above the simulation data root. Even if such a request typically results in a miss, invalid input should be rejected early.

Recommendation

Validate simulation_id with the same validator used by report/share/watch and reject invalid values with a 400 response. Alternatively, normalize the path and ensure the resolved path stays under the configured root before opening or tailing the file.
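
Both options in the recommendation can be combined in a small helper, sketched below. The function name resolve_events_path, the data_root parameter, and the ID pattern are assumptions made for illustration; the project's own validate_simulation_id may accept a different format and should be preferred where available.

```python
import re
from pathlib import Path

# Assumed ID format (letters, digits, '_' and '-'); the real validator may differ.
_SIMULATION_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def resolve_events_path(data_root: Path, simulation_id: str) -> Path:
    """Return the events.jsonl path for simulation_id, or raise ValueError.

    Hypothetical helper: rejects malformed IDs first, then confirms the
    resolved path is still inside data_root as a second line of defence.
    """
    if not _SIMULATION_ID_RE.fullmatch(simulation_id):
        raise ValueError("invalid simulation_id")

    root = data_root.resolve()
    candidate = (root / simulation_id / "events.jsonl").resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError("simulation_id escapes the data root") from None
    return candidate
```

An endpoint would call this before opening or tailing the file and translate ValueError into a 400 response, so traversal attempts such as '../../etc' are rejected before any filesystem access.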

Why this didn't post

This finding didn't meet AntFleet's unanimous agreement threshold. Both frontier models review every PR independently; only findings they both flag with the same severity and category are posted to the PR. This one fell through.


From the same review

These findings passed the unanimous gate on the same PR review. The disagreement above was filtered out; the findings below were posted.