AntFleet

Architecture · the fleet

AntFleet is operated by a small fleet of agents.

Two of them are reviewer-fleet language models (Claude Opus 4.7, GPT-5). A third — Onboarder — owns the partner-facing lifecycle. The rest are deterministic workers: a webhook receiver, an agreement gate, a daily sweeper, a reaction poller. Together they manufacture the receipts you see on /receipts. This page is the operator's diagram.

The agents

  • Webhook Receiver

    deterministic

    Listens at /api/github/webhook. Verifies HMAC signature. Inserts a stub review row. Dispatches the heavy work to the after()-scheduled review pipeline.

    runs · on every pull_request open / synchronize / reopen event

  • Reviewer · Claude Opus 4.7

    language model

    Reads the changed files. Returns a structured findings list — title, severity, category, evidence, reasoning.

    runs · in parallel with the GPT-5 reviewer on each PR event

  • Reviewer · GPT-5

    language model

    Same contract as the Claude reviewer. Two independent providers; no shared prompt cache; no cross-talk.

    runs · in parallel with the Claude reviewer on each PR event

  • Agreement Gate

    deterministic

    Compares both reviewer outputs. Emits only the findings both agents flagged on overlapping evidence. Silence is the correct output when there is no unanimous overlap.

    runs · once per review, immediately after both reviewers return

  • Sweeper

    deterministic

    For every open finding, fetches main HEAD and compares against the review commit. If the evidence file appears in the changed-files set, marks the finding closed at the new SHA and posts a closure receipt comment on the PR.

    runs · daily at 06:00 UTC; can be fired on demand by an operator

  • Reaction Poller

    deterministic

    Re-fetches each posted review comment at the 24h / 7d / 30d marks. Persists maintainer reactions (thumbs-up / down / heart / rocket / etc.) as implicit RLHF signal for future routing.

    runs · daily as part of the sweeper pass

  • Onboarder

    language model

    Owns the partner-facing lifecycle. Posts a model-authored welcome on a fresh install, frames the partner's first review, reads agent@antfleet.dev for public-receipts opt-in, and posts a 7-day reaction-tally check-in.

    runs · on installation.created webhook, on the first PR per install, on inbound email at agent@antfleet.dev, and on a 7-day cron
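The Webhook Receiver's signature check can be sketched roughly as follows. The page says only that the receiver verifies an HMAC signature. GitHub's documented scheme sends a hex HMAC-SHA256 of the raw request body in the X-Hub-Signature-256 header, prefixed with sha256=. The function name and the way the secret is passed in are assumptions for illustration, not AntFleet's actual code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: returns true when the header matches the
// HMAC-SHA256 of the raw (unparsed) request body.
function verifyGithubSignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;

  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");

  const received = Buffer.from(signatureHeader, "utf8");
  const computed = Buffer.from(expected, "utf8");

  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return received.length === computed.length && timingSafeEqual(received, computed);
}
```

The comparison runs on the raw body, before any JSON parsing, because re-serialising the payload can change the bytes and invalidate the digest.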

The review pipeline

Each PR event triggers one pass through this pipeline. The agents on the left are stateless workers; the row on the right is the durable artifact that gets written to Postgres and (when agreement happens) to GitHub.

  1. Pull-request event → Webhook Receiver
  2. Webhook Receiver (verifies HMAC, inserts stub) → reviews row
  3. Webhook Receiver (dispatches) → Reviewer Fleet (2 agents)
  4. Reviewer · Claude Opus 4.7 (findings[]) → Agreement Gate
  5. Reviewer · GPT-5 (findings[]) → Agreement Gate
  6. Agreement Gate (unanimous only) → finding_status rows + PR comment

The Agreement Gate is the trust primitive. A finding only crosses into the PR comment if both reviewers flagged the same code with overlapping evidence. Silence on a PR means "no unanimous finding," not "no findings at all" — individual reviewer outputs are persisted to reviews.provider_responses for analysis but never posted.
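A minimal sketch of the gate's filtering rule, under stated assumptions. The page does not define "overlapping evidence" precisely; here it is taken to mean the same file path with intersecting line ranges. The Finding fields follow the reviewer contract listed above, while the Evidence shape and both function names are hypothetical.

```typescript
// Hypothetical evidence shape: a file path plus an inclusive line range.
interface Evidence {
  path: string;
  startLine: number;
  endLine: number;
}

interface Finding {
  title: string;
  severity: string;
  category: string;
  evidence: Evidence;
  reasoning: string;
}

// Assumed overlap rule: same file, and the inclusive line ranges intersect.
function overlaps(a: Evidence, b: Evidence): boolean {
  return a.path === b.path && a.startLine <= b.endLine && b.startLine <= a.endLine;
}

// Keep only reviewer A's findings that reviewer B also flagged on
// overlapping evidence. An empty result means the gate stays silent.
function agreementGate(fromA: Finding[], fromB: Finding[]): Finding[] {
  return fromA.filter((a) => fromB.some((b) => overlaps(a.evidence, b.evidence)));
}
```

The sketch emits reviewer A's wording for each agreed finding; how the real gate chooses or merges titles and severities is not stated on this page.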

The sweep loop

A finding stays open in Postgres until the Sweeper detects that the flagged file changed on the repo's default branch. The closure mechanic is intentionally cheap — "evidence file touched" is a strong proxy for "bug addressed" on a corpus where unanimous findings are ~100% real.

  1. Cron · 06:00 UTC daily → Sweeper
  2. Sweeper (loadSweepWork()) → open findings grouped by repo
  3. Sweeper (per-repo: getRef + compareCommits) → changed-files set
  4. Sweeper (if evidence file changed) → markFindingClosed + closure SHA
  5. Sweeper (posts on original PR) → Closure receipt comment
  6. Reaction Poller (at 24h / 7d / 30d) → maintainer_reactions rows

The closure receipt comment is the artifact. It lives on GitHub's event log — third-party-witnessed — and it is what the public /receipts counter is counting.
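The "evidence file touched" check from the sweep loop can be sketched as a pure function. In the real Sweeper the changed-files set comes from getRef + compareCommits; here it is simply passed in so the example makes no network calls. The type names, findClosures, and evidenceFile are hypothetical, and the evidence string format is assumed to match the path:lines form shown in the receipt example.

```typescript
interface OpenFinding {
  id: string;
  evidence: string; // assumed format: "path/to/file.ts:5-11" or "path/to/file.ts"
}

interface Closure {
  findingId: string;
  closedAtSha: string;
}

// Strip a trailing ":line" or ":start-end" suffix to recover the file path.
function evidenceFile(evidence: string): string {
  return evidence.replace(/:\d+(?:-\d+)?$/, "");
}

// A finding closes when its evidence file appears in the changed-files set
// between the review commit and the default branch's current HEAD.
function findClosures(
  open: OpenFinding[],
  changedFiles: ReadonlySet<string>,
  headSha: string,
): Closure[] {
  return open
    .filter((f) => changedFiles.has(evidenceFile(f.evidence)))
    .map((f) => ({ findingId: f.id, closedAtSha: headSha }));
}
```

Matching on the file path alone, rather than on the flagged lines, is what keeps the check cheap: it needs only the list of changed filenames, not a diff.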

The Onboarder lifecycle

Onboarder is the third language-model agent in the fleet. It owns everything partner-facing that isn't a PR review: the welcome on install, a one-time framing comment on the partner's first PR, and a 7-day check-in. Each action is structured tool output from claude-opus-4-7 and persists a row in onboarding_events with the prompt, output, and GitHub artifact ids — the same audit shape as reviews.

  1. installation.created webhook → Webhook Receiver
  2. Webhook Receiver (dispatches) → Onboarder · welcome
  3. Onboarder · welcome (repos.get + LLM tool call) → issues.create on partner repo
  4. First PR per install (after Reviewer posts) → Onboarder · first-review summary
  5. Onboarder · first-review summary (LLM tool call) → issues.createComment on the PR
  6. Cron · 06:00 UTC daily (alongside Sweeper) → Onboarder · check-in driver
  7. Onboarder · check-in driver (for installs aged 7-8d) → Onboarder · 7-day check-in
  8. Onboarder · 7-day check-in (LLM tool call) → issues.createComment on welcome issue

The whole lifecycle is gated behind an ONBOARDER_ENABLED env flag, which defaults to off in every environment. Cutover is operator-controlled, not webhook-controlled: a fresh install on a repo with the flag off generates a server log line and nothing else. Idempotency is keyed per (installation_id, owner, repo, event_type), so a repeated webhook never produces a duplicate issue or comment.
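A minimal sketch of the flag gate and the idempotency key, under stated assumptions. The event-type names, the "true" flag value, and the in-memory Set standing in for the onboarding_events table are all hypothetical; a real deployment would more likely rely on a unique constraint in Postgres.

```typescript
// Hypothetical event-type names for the three Onboarder actions.
type OnboardingEventType = "welcome" | "first_review_summary" | "seven_day_checkin";

interface OnboardingKey {
  installationId: number;
  owner: string;
  repo: string;
  eventType: OnboardingEventType;
}

// JSON-encode the tuple so the key is unambiguous for any owner/repo string.
function idempotencyKey(k: OnboardingKey): string {
  return JSON.stringify([k.installationId, k.owner, k.repo, k.eventType]);
}

// Returns true only when the flag is on AND this exact action has not run yet.
// `alreadyDone` stands in for a lookup against persisted onboarding events.
function shouldRunOnboarder(
  k: OnboardingKey,
  alreadyDone: Set<string>,
  env: Record<string, string | undefined>,
): boolean {
  if (env["ONBOARDER_ENABLED"] !== "true") return false; // default off
  const key = idempotencyKey(k);
  if (alreadyDone.has(key)) return false; // repeated webhook: do nothing
  alreadyDone.add(key);
  return true;
}
```

Checking the flag before touching the key set means that events received while the flag is off leave no record, so they do not block the same action after cutover.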

What is in a receipt

A closure receipt is the artifact the Sweeper posts on the original PR when a finding closes. Every field below is either authored by AntFleet or third-party-witnessed by GitHub's event log. Nothing is rendered from a database we control alone.

AntFleet's signal succeeds when the underlying fix lands upstream, whether via merge of our PR (merged) or via a separate upstream commit that applies the same fix (absorbed_inline). Both are receipt-eligible. Absorbed-inline detection uses an LLM judge to compare our PR's diff against recent upstream commits; only matches above 70% confidence are classified as absorbed_inline.
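The classification rule above can be sketched as a small function. The merged and absorbed_inline labels and the 70% threshold come from the text; the "unclassified" label, the JudgeMatch shape, and the choice to keep the highest-confidence match are assumptions for illustration.

```typescript
// "unclassified" is a hypothetical label for the no-match case.
type Outcome = "merged" | "absorbed_inline" | "unclassified";

// Hypothetical shape of one LLM-judge verdict; confidence is in [0, 1].
interface JudgeMatch {
  commitSha: string;
  confidence: number;
}

interface Classification {
  outcome: Outcome;
  commitSha?: string;
}

function classifyOutcome(
  prMerged: boolean,
  judgeMatches: JudgeMatch[],
  threshold = 0.7,
): Classification {
  if (prMerged) return { outcome: "merged" };

  // Keep the strongest match strictly above the threshold, if any.
  let best: JudgeMatch | undefined;
  for (const m of judgeMatches) {
    if (m.confidence > threshold && (best === undefined || m.confidence > best.confidence)) {
      best = m;
    }
  }
  return best
    ? { outcome: "absorbed_inline", commitSha: best.commitSha }
    : { outcome: "unclassified" };
}
```

The comparison is strict, so a match at exactly 70% is not classified, following the "above 70%" wording.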

antfleet[bot] · commented on PR

  • Heading

    AntFleet · finding f1b5393a-0 closed in 4640404

    finding id = our DB · SHA = GitHub commit log

  • Category + Severity

    Security · High

    emitted only when both reviewers agreed

  • Title

    Health endpoint exposes operational details

    produced by the Reviewer Fleet at review time

  • Evidence path

    apps/web/app/api/health/route.ts:5-11

    from the Reviewer Fleet output; verifiable in the PR diff

  • Attribution

    Originally flagged in the AntFleet review · Receipt automated

    links to the original GitHub comment; you can click and read it
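As a rough illustration of the receipt's shape, the fields above could be modelled like this. The interface, its field names, and the 7-character SHA abbreviation are assumptions inferred from the example receipt, not AntFleet's actual schema.

```typescript
// Hypothetical model of the fields shown in a closure receipt.
interface ClosureReceipt {
  findingId: string;        // from AntFleet's database
  closureSha: string;       // full commit SHA from GitHub's commit log
  category: string;         // e.g. "Security"
  severity: string;         // e.g. "High"
  title: string;
  evidence: string;         // e.g. "path/to/file.ts:5-11"
  originalCommentUrl: string;
}

// Render the heading line, abbreviating the SHA to its first 7 characters
// (an assumption based on the 7-character SHA in the example receipt).
function receiptHeading(r: ClosureReceipt): string {
  return `AntFleet · finding ${r.findingId} closed in ${r.closureSha.slice(0, 7)}`;
}
```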