AntFleet

Anatomy · 42eb81fe-1

v4-readiness manifest references files not included in the skill’s read set, causing missed Review detections

high · bug · closed in 4b9b492
repo 6f7fc663 · PR #12 · reviewed 1 week ago · closed 1 week ago

The vulnerable code

skills/v4-readiness/SKILL.md:0-0

0---
1name: v4-readiness
2description: Generate a per-fork v4 upgrade readiness checklist — reads the fork's aeon.yml, skills.json, and MEMORY.md, cross-references against the embedded v4 change manifest, emits Safe / Review / Custom / Action-items breakdown
3var: ""
4tags: [meta, dx]
5---
6
7> **${var}** — Optional. Pass `dry-run` to skip the notification (article still writes, log still appends). Pass a fork repo slug (e.g. `someuser/aeon`) to read remote `aeon.yml` + `skills.json` from that fork instead of the local working tree (useful for surveying the fleet ahead of a v4 announcement). Empty = audit the local fork.
8
9Today is ${today}. Convert the **current** state of this fork — its enabled skills, model overrides, chain definitions, custom skill list — into a personalized checklist for the upcoming v4 release. The point is to give every fork operator a structured surface for "what's safe, what's about to change, what I added myself" **before** v4 lands, not after they've already pulled and discovered something broke.
10
11## Why this exists
12
13v4 is announced as a full redesign (~2 weeks lead time per operator's social posts). 40+ forks are running on the current architecture. Without a structured per-fork readiness check, operators hit breaking changes blind: they pull the upstream, their custom `aeon.yml` contains a now-removed key, a chain consumer references a renamed skill, a model override points to a retired model, their custom skill imports from a path that moved. Every one of those is recoverable in five minutes if it's surfaced ahead of time and unrecoverable in five hours if it's discovered at the moment a cron fires.
14
15This skill surfaces them ahead of time. It is read-only across the fork; it never auto-edits config, never opens PRs, never auto-pulls upstream. It writes one article and one notification — the operator owns the upgrade decision.
16
17## When this skill runs
18
19`workflow_dispatch` only. There is no cron — the article only matters in the window before v4 lands and during the upgrade itself. Operators dispatch it manually:
20
21- **Pre-announcement** — to see which embedded patterns the fork is currently leaning on, regardless of whether v4 has marked them yet.
22- **At v4 announcement** — once the manifest in this skill is updated with the actual v4 change list, re-dispatch to get a real readiness verdict.
23- **During upgrade** — re-dispatch after any partial change to confirm the gap list shrank.
24- **Post-upgrade** — run once more on the v4 branch to confirm zero remaining items before merging.
25
26## Config
27
28No new secrets. No new env vars. No new state files. Pure local file I/O over the fork's own working tree, plus optional `gh api` for the `${var}=owner/repo` remote-survey mode.
29
30Reads:
31- `aeon.yml` — enabled skills, model overrides, chain definitions, schedule strings, reactive triggers, gateway block, channels block.
32- `skills.json` — total skill count, category breakdown, per-skill metadata; used as the catalog fingerprint to detect drift from upstream.
33- `memory/MEMORY.md` — Skills Built table (custom skills with no upstream equivalent get the most attention from the readiness check).
34- `skills/*/SKILL.md` (frontmatter only) — confirms which custom skills are actually present on disk, not just remembered in MEMORY.md.
35- The **embedded v4 change manifest** in this file (§Manifest below). This is the source of truth for what counts as Safe / Review / Removed / Renamed.
36
37Writes:
38- `articles/v4-readiness-${today}.md` — the full per-fork readiness report.
39- `memory/logs/${today}.md` — log block.
40
41If `${var}` is a fork slug instead of `dry-run` or empty, replace every local file read with `gh api repos/${var}/contents/<path>` and decode the base64 content. Custom-skill scan via `gh api repos/${var}/contents/skills?ref=main`.
42
43## Manifest
44
45The change manifest is embedded here so it travels with the skill — operators never need a separate config file. **Update this section as v4 details are announced.** Until the v4 list is finalized, the manifest below is seeded with the patterns the operator's social posts have flagged as in-scope for v4 and the patterns historically known to be stable.
46
47### Safe — patterns confirmed stable into v4
48
49| Pattern | Where it lives | Why it stays |
50|---------|----------------|--------------|
51| SKILL.md frontmatter keys (`name`, `description`, `var`, `tags`) | `skills/*/SKILL.md` | Public skill contract; renaming would break every fork |
52| `./notify "message"` interface | bash | Operator-facing CLI; documented in CLAUDE.md |
53| `memory/` directory layout (`MEMORY.md`, `logs/`, `topics/`, `issues/`) | filesystem | File-based memory is the project's identity; layout is documented in CLAUDE.md |
54| `articles/${skill}-${today}.md` output convention | per-skill | Consumed by chains, dashboard, syndicate-article — too many readers to break |
55| `memory/watched-repos.md` format (`- owner/repo` per line) | filesystem | Read by repo-pulse, repo-actions, fork-fleet, star-momentum-alert |
56| `gh api` and `gh pr create` usage in skills | bash | GitHub CLI is stable; sandbox workaround for env-var-in-headers |
57| `${today}` template variable | SKILL.md prose | Substituted by the runner; no plan to change |
58
59### Review — patterns flagged for review in v4
60
61These are the patterns the operator's social posts have signposted as in-scope for v4 redesign, OR patterns that are internal-enough to plausibly change. Not all will change; presence here means "look at this skill manually before merging the v4 PR."
62
63| Pattern | Where it lives | What might change |
64|---------|----------------|--------------------|
65| `chains:` runner interface | `aeon.yml` | Step format (`parallel:` / `consume:` keys, `on_error` semantics) |
66| `reactive:` trigger conditions | `aeon.yml` | Condition vocabulary (`consecutive_failures`, `success_rate`, `last_status`) |
67| Chain runner output passing | `.outputs/*.md`, `chain-runner.yml` | Path layout, file shape |
68| Schedule syntax (`workflow_dispatch`, `reactive`, cron) | `aeon.yml` | Naming of pseudo-schedules; cron escapes |
69| Model selector strings | `aeon.yml` per-skill `model:` | Model id references — Opus/Sonnet/Haiku version pins |
70| `gateway:` provider block | `aeon.yml` | Bankr/direct selector, env var names |
71| `channels:` block (`jsonrender.enabled`) | `aeon.yml` | Toggle key names, channel set |
72| MCP server tool naming (`aeon-${skill_slug}`) | `mcp-server/src/index.ts` | Naming convention for forks consuming the MCP |
73| `add-skill`, `add-mcp`, `add-a2a` CLIs | repo root | Argument shape, supported sources |
74| `skills.json` schema (`version`, `categories`, `skills[].install`) | `skills.json` | Field set; rename/remove of optional fields |
75| `dashboard/lib/catalog.ts` json-render catalog shape | `dashboard/` | Spec shape for `dashboard/outputs/*.json` |
76
77### Custom — skills with no upstream equivalent
78
79Anything listed in this fork's `memory/MEMORY.md` Skills Built table OR present under `skills/` and missing from upstream's `skills.json` (compared to the install metadata in this fork's own `skills.json` if present). These need a manual v4 compat check from the operator — the upstream maintainer cannot guarantee their patterns travel.
80
81For each custom skill, we list:
82- name
83- declared `var` (if any) — same as upstream contract
84- whether it consumes any path from the **Review** table above
85- count of references to other skills (chained or implicit)
86
87### Removed (placeholder)
88
89| Pattern | Replacement | Migration note |
90|---------|-------------|----------------|
91
92(Empty until v4 is announced. The operator's job — and the maintainer's job upstream — is to populate this row by row as the v4 PRs land. Each row should have a one-line migration recipe so the readiness report can convert it directly into an action item.)
93
94## Steps
95
96### 1. Parse var
97
98- If `${var}` matches `^dry-run$` → `MODE=dry-run`. No notification, article still writes.
99- Else if `${var}` matches `^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$` → `MODE=remote`, `TARGET=${var}`. All file reads go through `gh api repos/${TARGET}/contents/...`.
100- Else if `${var}` is empty → `MODE=local`, `TARGET=$(gh repo view --json nameWithOwner --jq .nameWithOwner)`.
101- Anything else → log `V4_READINESS_BAD_VAR: ${var}` and exit (no notify, no article).
102
103### 2. Load fork inputs
104
105```bash
106mkdir -p articles
107```
108
109Read each input. Any missing input is non-fatal — log `V4_READINESS_MISSING_INPUT: <name>` and proceed without it. The skill never invents content for missing inputs.
110
111| Input | Local | Remote (`MODE=remote`) |
112|-------|-------|------------------------|
113| `aeon.yml` | direct read | `gh api repos/${TARGET}/contents/aeon.yml --jq .content \| base64 -d` |
114| `skills.json` | direct read | same pattern |
115| `memory/MEMORY.md` | direct read | same pattern |
116| Custom skills | `ls skills/` minus skills present in `skills.json` install rows | `gh api repos/${TARGET}/contents/skills` JSON |
117
118If `aeon.yml` is unreadable, log `V4_READINESS_NO_CONFIG` and exit with no notification (the fork is not initialized; nothing to check).
119
120### 3. Compute the enabled-skill snapshot
121
122From `aeon.yml`, extract for each skill under `skills:`:
123- `enabled` (true/false)
124- `schedule` (cron or `workflow_dispatch` or `reactive`)
125- `var` (default value, if set)
126- `model` (override, if set)
127
128Ignore commented entries. Capture `chains:`, `reactive:`, `gateway:`, `channels:` blocks verbatim — the readiness check looks at their **shape**, not their values.
129
130### 4. Walk the manifest categories
131
132For each row in the **Safe** table: scan the fork's inputs for the pattern. If found, record under `safe[]` with the file/line where it was matched. If absent, do not flag — Safe means "if you use it, it stays working," not "you must use it."
133
134For each row in the **Review** table: scan the fork's inputs for the pattern. If the fork uses it, record under `review[]` with the matched location and the manifest's "what might change" note. The presence of a Review row is what the operator should manually inspect before merging v4.
135
136For each custom-skill candidate: confirm it exists on disk (`skills/${name}/SKILL.md`) and is **not** present in the upstream-fingerprint heuristic (skills with `install: ./add-skill aaronjmars/aeon ${name}` in this fork's `skills.json` are upstream; everything else is custom). Cross-reference custom skills against the Review patterns — a custom skill that uses `chains:` consume is the highest-priority audit candidate.
137
138For each row in the **Removed** table (currently empty): if the fork uses the removed pattern, record under `action[]` with the migration note as the action.
139
140### 5. Score effort per Review item
141
142Per item, assign a complexity tag based on the manifest pattern:
143
144| Tag | Heuristic |
145|-----|-----------|
146| `trivial` | Config rename only; one-line `aeon.yml` edit |
147| `minor` | Config restructure; ≤ 5 lines or one chain block edit |
148| `moderate` | SKILL.md prose changes needed (e.g. chained skill consuming an output whose shape changed) |
149| `manual` | Custom-skill review required; outcome cannot be predicted from the fork's metadata alone |
150
151The score is informational — it is not a green-light gate. A `trivial` item still counts; the tag tells the operator whether they need 60 seconds or 60 minutes to address it.
152
153### 6. Build the article
154
155Path: `articles/v4-readiness-${today}.md`. Overwrite if exists.
156
157```markdown
158# v4 Readiness — ${TARGET} — ${today}
159
160**Verdict:** ${one of: READY — 0 review items, 0 action items | REVIEW — N items to inspect, M action items | ACTION — M removed-pattern items must be addressed before upgrade}
161
162*Audit basis: aeon.yml + skills.json + MEMORY.md + skills/ on disk · Manifest version: embedded in skills/v4-readiness/SKILL.md as of ${today}*
163
164---
165
166## Safe (${safe_count})
167
168Patterns this fork uses that are confirmed stable into v4. No action needed.
169
170| Pattern | Where in this fork |
171|---------|---------------------|
172| ${pattern} | ${file_or_line} |
173
174## Review (${review_count})
175
176Patterns this fork uses that v4 may change. Inspect each one before merging the upstream v4 PR.
177
178| Pattern | Where | What might change | Effort |
179|---------|-------|--------------------|--------|
180| ${pattern} | ${file_or_line} | ${manifest_note} | ${tag} |
181
182## Custom (${custom_count})
183
184Skills present on this fork but not in the upstream catalog. The upstream maintainer cannot guarantee their patterns travel into v4.
185
186| Skill | Schedule | Reads from Review patterns? | Notes |
187|-------|----------|-----------------------------|-------|
188| ${name} | ${schedule} | ${yes/no — list} | ${one-line summary from MEMORY.md if present} |
189
190## Action items (${action_count})
191
192(Populated when the **Removed** table is non-empty. Each row is a concrete numbered step the operator must take before pulling v4.)
193
1941. ${action} (${tag})
195
196---
197
198## Methodology
199
200- Manifest read from `skills/v4-readiness/SKILL.md` §Manifest.
201- Custom-skill detection: skills present under `skills/` whose slug does not appear in `skills.json[skills][].slug` with an `install` line referencing the upstream catalog.
202- Effort tags are heuristic — `trivial`/`minor`/`moderate`/`manual`. They do not predict v4 release-note exact wording.
203- This skill never modifies `aeon.yml`, never opens a PR, never pulls upstream. It only reports.
204
205---
206
207*Re-run after the v4 announcement updates the embedded manifest. Until then the **Removed** section is empty by design — only Safe / Review / Custom rows are populated.*
208```
209
210If `safe_count == 0 AND review_count == 0 AND custom_count == 0 AND action_count == 0`: the verdict is `READY` and the article still writes — operators may want a paper trail confirming an empty-state audit.
211
212### 7. Notify
213
214If `MODE == dry-run` → skip notify, log `V4_READINESS_DRY_RUN`, exit cleanly (article still wrote).
215
216If verdict is `READY` AND no items in any bucket → still notify, but with a single-line body. Operators dispatched this skill manually; silence on a manual run is worse than a one-line "all clear" reply.
217
218Standard notify body:
219
220```
221*v4 Readiness — ${today} — ${TARGET}*
222
223Verdict: ${verdict}
224
225- Safe: ${safe_count}
226- Review: ${review_count} (${trivial}/${minor}/${moderate}/${manual} effort split)
227- Custom: ${custom_count}
228- Action: ${action_count}
229
230Top review item: ${first_review_pattern_or_"—"}
231Top custom skill: ${first_custom_skill_or_"—"}
232
233Article: articles/v4-readiness-${today}.md
234Manifest version: embedded in skills/v4-readiness/SKILL.md as of ${today}
235
236Re-dispatch after the next v4 manifest update to refresh the verdict.
237```
238
239Cap message at ~3500 chars (Telegram safe limit). If exceeded, drop the Custom section first; Action and Review are higher priority.
240
241### 8. Log to `memory/logs/${today}.md`
242
243```
244## v4 Readiness
245- **Skill**: v4-readiness
246- **Mode**: ${local|remote|dry-run}
247- **Target**: ${TARGET}
248- **Verdict**: ${READY|REVIEW|ACTION}
249- **Counts**: safe=${N} review=${N} custom=${N} action=${N}
250- **Article**: articles/v4-readiness-${today}.md
251- **Notification**: ${sent|skipped — dry-run}
252- **Status**: ${V4_READINESS_OK | V4_READINESS_DRY_RUN | V4_READINESS_NO_CONFIG | V4_READINESS_BAD_VAR | V4_READINESS_PARTIAL}
253```
254
255`V4_READINESS_PARTIAL` means at least one input was missing (logged in step 2) but the audit still wrote — the operator should sanity-check the affected section.
256
257## Exit taxonomy
258
259| Status | Meaning | Notify? |
260|--------|---------|---------|
261| `V4_READINESS_OK` | Audit completed against all inputs | Yes |
262| `V4_READINESS_PARTIAL` | At least one input missing; audit ran on remaining inputs | Yes |
263| `V4_READINESS_DRY_RUN` | `var=dry-run` mode | No (article still writes) |
264| `V4_READINESS_NO_CONFIG` | `aeon.yml` unreadable; fork not initialized | No |
265| `V4_READINESS_BAD_VAR` | `${var}` was non-empty, non-`dry-run`, not a `owner/repo` slug | No |
266
267## Sandbox note
268
269**Local mode (default).** Pure local file I/O — no curl, no env-var-in-headers, no prefetch. Every read is a directory listing or file read against the working tree. The only outbound call is `./notify` itself, which uses the postprocess pattern (see CLAUDE.md).
270
271**Remote mode (`var=owner/repo`).** Each input read is a single `gh api repos/${TARGET}/contents/${path}` call. `gh` handles auth via the workflow's `GITHUB_TOKEN`, so there is no env-var-in-curl pattern to work around. The remote-survey mode is rate-limit-bounded: with five reads per fork, the standard 5,000/h `GITHUB_TOKEN` budget covers ~1,000 fork audits per hour — the realistic per-day operator workload is far below this.
272
273## Constraints
274
275- **Never auto-mutate the fork.** The skill is read-only. It does not edit `aeon.yml`, does not open a PR, does not pull upstream. The upgrade decision belongs to the operator.
276- **Never invent v4 details.** The Manifest is the only source of truth. Operators update the Manifest when the maintainer posts v4 changes; the skill reports against whatever the Manifest currently says. If the Manifest is stale, the report is stale, but it is never wrong-by-fabrication.
277- **Custom-skill detection is heuristic.** A skill present under `skills/` with no `install:` in `skills.json` is treated as custom. False positives (the fork removed and re-added an upstream skill manually) are tagged `manual` so the operator notices; false negatives (the fork edited an upstream skill in place) are caught by `skill-update-check`, not here.
278- **Idempotent.** Same-day reruns overwrite the article. The log line is appended (multiple runs visible if the operator re-dispatches during an upgrade).
279- **One notification max per run.** Even if remote-mode audits multiple targets in sequence (not currently supported by `var` syntax — one slug per run), each invocation produces at most one notify call.
280- **Manifest evolves; skill body does not.** When the v4 announcement lands, the Manifest tables in this file are the only edit surface. The Steps and Constraints stay stable so operators can regenerate the article without merging upstream changes to skill prose.
281
282## Edge cases
283
284- **Empty `aeon.yml` skills block (fresh fork)** — verdict is `READY`, every bucket is empty. Article writes, notification fires with the single-line body. The operator confirms the fork has nothing to migrate.
285- **Custom skill that imports an upstream skill name** — listed under Custom with a `notes` cell flagging the collision. The operator must decide whether to keep the override after v4 lands.
286- **Manifest's Removed section non-empty AND fork uses the pattern** — verdict escalates to `ACTION` regardless of Review counts. Action items list before Review in the notification.
287- **`gh api` fails in remote mode** — log `V4_READINESS_REMOTE_API_ERROR: <code>` and fall back to a partial audit using only inputs that did read; emit `V4_READINESS_PARTIAL`. Do not retry; remote-mode is a survey tool and partial coverage is acceptable.
288- **Same fork audited twice in one day with `var=dry-run` then `var=` empty** — the empty-var run overwrites the article and sends the notification; the dry-run run already wrote the article body so the empty-var run produces a byte-identical or near-identical file (only the timestamp changes). This is intended; the operator gets a notification once they explicitly opt in.
289
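
The verdict rules are spread across the Step 6 template and the edge-case list in the file above. The following is a minimal Python sketch of one reading of them, not the skill's own implementation. The function name `verdict` and its two-argument shape are assumptions for illustration, and the precedence of ACTION over REVIEW is taken from the edge case about a non-empty Removed table.

```python
def verdict(review_count: int, action_count: int) -> str:
    """Map bucket counts to a verdict string (hypothetical helper).

    Assumed reading of the file above: any action item escalates to ACTION,
    otherwise any review item yields REVIEW, otherwise READY. Safe and Custom
    counts do not influence the result in this reading.
    """
    if action_count > 0:
        return "ACTION"
    if review_count > 0:
        return "REVIEW"
    return "READY"


if __name__ == "__main__":
    # A Review pattern that lives in a file the skill never reads adds nothing
    # to review_count, so the fork is reported as READY.
    print(verdict(review_count=0, action_count=0))  # READY
    # The same fork with that pattern actually detected.
    print(verdict(review_count=1, action_count=0))  # REVIEW
```

Under this reading a single undetected Review item is enough to flip the result from REVIEW to READY, which is why the completeness of the read set matters for the verdict.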

The reasoning

Opus

Output unavailable for this row.

GPT-5

v4-readiness manifest references files not included in the skill’s read set, causing missed Review detections

high · bug · high
  • skills/v4-readiness/SKILL.md
The skill promises to detect Review-pattern usage by scanning the fork’s inputs, but its declared input set excludes several locations explicitly listed in the Manifest (mcp-server/src/index.ts, .outputs/, chain-runner.yml, dashboard/). As a result, forks using those patterns will not be detected and will yield undercounted Review items or an incorrect READY verdict. This directly undermines the feature’s purpose of pre-upgrade risk surfacing.
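
To make the gap concrete, here is a minimal Python sketch that compares the Review table's "Where it lives" column with the skill's declared "Reads:" list. Both lists are transcribed by hand from the SKILL.md quoted above (the "repo root" entry is left out because it is not a path). The helper name `unscanned_locations` and the exact-string matching rule are assumptions made for this illustration; the skill itself does not define how a location is matched to an input.

```python
# Locations from the Review table's "Where it lives" column.
REVIEW_LOCATIONS = [
    "aeon.yml",
    ".outputs/*.md",
    "chain-runner.yml",
    "mcp-server/src/index.ts",
    "skills.json",
    "dashboard/",
]

# Inputs from the skill's "Reads:" list (the embedded manifest is omitted).
READ_SET = [
    "aeon.yml",
    "skills.json",
    "memory/MEMORY.md",
    "skills/*/SKILL.md",
]


def unscanned_locations(review_locations, read_set):
    """Return Review locations that no declared input covers.

    Coverage is an exact string match, an assumption of this sketch.
    """
    covered = set(read_set)
    return [loc for loc in review_locations if loc not in covered]


if __name__ == "__main__":
    for loc in unscanned_locations(REVIEW_LOCATIONS, READ_SET):
        print(f"not scanned: {loc}")
```

With the lists as transcribed, the uncovered entries are the same four locations named in the finding.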

Recommendation

Align the read set with the Manifest or narrow the Manifest to only patterns actually scanned. Concretely: (1) Add optional reads for mcp-server/src/index.ts, chain-runner.yml, .outputs/, and dashboard/* in both local and remote modes; (2) If these inputs are missing, mark the audit as PARTIAL and annotate which Review buckets were not scanned; (3) Until implemented, move those patterns to a clearly labeled "Not scanned in this version" section to avoid false assurance.
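
Points (1) and (2) of the recommendation could look roughly like the Python sketch below for local mode. It is a sketch under stated assumptions, not the change that landed in 4b9b492: the function name `audit_status`, the mapping from optional input to Review row, and the rule that a row counts as unscanned when any of its inputs is missing are all hypothetical. Paths are used exactly as the Manifest writes them.

```python
from pathlib import Path

# Optional inputs named in the recommendation, mapped to the Review row each
# one backs. Row labels are copied from the Manifest; the mapping itself is an
# assumption of this sketch.
OPTIONAL_INPUTS = {
    "mcp-server/src/index.ts": "MCP server tool naming",
    "chain-runner.yml": "Chain runner output passing",
    ".outputs": "Chain runner output passing",
    "dashboard": "dashboard/lib/catalog.ts json-render catalog shape",
}


def audit_status(root, optional_inputs=OPTIONAL_INPUTS):
    """Return (status, unscanned_rows) for a local working tree at `root`.

    A Review row is reported as unscanned if any input backing it is missing,
    which is a deliberately conservative choice for this sketch.
    """
    root = Path(root)
    unscanned = []
    for rel_path, review_row in optional_inputs.items():
        if not (root / rel_path).exists() and review_row not in unscanned:
            unscanned.append(review_row)
    status = "V4_READINESS_PARTIAL" if unscanned else "V4_READINESS_OK"
    return status, unscanned


if __name__ == "__main__":
    status, rows = audit_status(".")
    print(status)
    for row in rows:
        print(f"Review row not scanned: {row}")
```

Returning the unscanned rows alongside the status lets the article annotate exactly which Review buckets were skipped, instead of silently reporting a lower count.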

The agreement

Both frontier models flagged this within the same line range. AntFleet's unanimous gate fired — the finding posted on the PR. Closed in 4b9b492.

The fix

0---
1name: v4-readiness
2description: Generate a per-fork v4 upgrade readiness checklist — reads the fork's aeon.yml, skills.json, and MEMORY.md, cross-references against the embedded v4 change manifest, emits Safe / Review / Custom / Action-items breakdown
3var: ""
4tags: [meta, dx]
5---
6
7> **${var}** — Optional. Pass `dry-run` to skip the notification (article still writes, log still appends). Pass a fork repo slug (e.g. `someuser/aeon`) to read remote `aeon.yml` + `skills.json` from that fork instead of the local working tree (useful for surveying the fleet ahead of a v4 announcement). Empty = audit the local fork.
8
9Today is ${today}. Convert the **current** state of this fork — its enabled skills, model overrides, chain definitions, custom skill list — into a personalized checklist for the upcoming v4 release. The point is to give every fork operator a structured surface for "what's safe, what's about to change, what I added myself" **before** v4 lands, not after they've already pulled and discovered something broke.
10
11## Why this exists
12
13v4 is announced as a full redesign (~2 weeks lead time per operator's social posts). 40+ forks are running on the current architecture. Without a structured per-fork readiness check, operators hit breaking changes blind: they pull the upstream, their custom `aeon.yml` contains a now-removed key, a chain consumer references a renamed skill, a model override points to a retired model, their custom skill imports from a path that moved. Every one of those is recoverable in five minutes if it's surfaced ahead of time and unrecoverable in five hours if it's discovered at the moment a cron fires.
14
15This skill surfaces them ahead of time. It is read-only across the fork; it never auto-edits config, never opens PRs, never auto-pulls upstream. It writes one article and one notification — the operator owns the upgrade decision.
16
17## When this skill runs
18
19`workflow_dispatch` only. There is no cron — the article only matters in the window before v4 lands and during the upgrade itself. Operators dispatch it manually:
20
21- **Pre-announcement** — to see which embedded patterns the fork is currently leaning on, regardless of whether v4 has marked them yet.
22- **At v4 announcement** — once the manifest in this skill is updated with the actual v4 change list, re-dispatch to get a real readiness verdict.
23- **During upgrade** — re-dispatch after any partial change to confirm the gap list shrank.
24- **Post-upgrade** — run once more on the v4 branch to confirm zero remaining items before merging.
25
26## Config
27
28No new secrets. No new env vars. No new state files. Pure local file I/O over the fork's own working tree, plus optional `gh api` for the `${var}=owner/repo` remote-survey mode.
29
30Reads:
31- `aeon.yml` — enabled skills, model overrides, chain definitions, schedule strings, reactive triggers, gateway block, channels block.
32- `skills.json` — total skill count, category breakdown, per-skill metadata; used as the catalog fingerprint to detect drift from upstream.
33- `memory/MEMORY.md` — Skills Built table (custom skills with no upstream equivalent get the most attention from the readiness check).
34- `skills/*/SKILL.md` (frontmatter only) — confirms which custom skills are actually present on disk, not just remembered in MEMORY.md.
35- The **embedded v4 change manifest** in this file (§Manifest below). This is the source of truth for what counts as Safe / Review / Removed / Renamed.
36
37Writes:
38- `articles/v4-readiness-${today}.md` — the full per-fork readiness report.
39- `memory/logs/${today}.md` — log block.
40
41If `${var}` is a fork slug instead of `dry-run` or empty, replace every local file read with `gh api repos/${var}/contents/<path>` and decode the base64 content. Custom-skill scan via `gh api repos/${var}/contents/skills?ref=main`.
42
43## Manifest
44
45The change manifest is embedded here so it travels with the skill — operators never need a separate config file. **Update this section as v4 details are announced.** Until the v4 list is finalized, the manifest below is seeded with the patterns the operator's social posts have flagged as in-scope for v4 and the patterns historically known to be stable.
46
47### Safe — patterns confirmed stable into v4
48
49| Pattern | Where it lives | Why it stays |
50|---------|----------------|--------------|
51| SKILL.md frontmatter keys (`name`, `description`, `var`, `tags`) | `skills/*/SKILL.md` | Public skill contract; renaming would break every fork |
52| `./notify "message"` interface | bash | Operator-facing CLI; documented in CLAUDE.md |
53| `memory/` directory layout (`MEMORY.md`, `logs/`, `topics/`, `issues/`) | filesystem | File-based memory is the project's identity; layout is documented in CLAUDE.md |
54| `articles/${skill}-${today}.md` output convention | per-skill | Consumed by chains, dashboard, syndicate-article — too many readers to break |
55| `memory/watched-repos.md` format (`- owner/repo` per line) | filesystem | Read by repo-pulse, repo-actions, fork-fleet, star-momentum-alert |
56| `gh api` and `gh pr create` usage in skills | bash | GitHub CLI is stable; sandbox workaround for env-var-in-headers |
57| `${today}` template variable | SKILL.md prose | Substituted by the runner; no plan to change |
58
59### Review — patterns flagged for review in v4
60
61These are the patterns the operator's social posts have signposted as in-scope for v4 redesign, OR patterns that are internal-enough to plausibly change. Not all will change; presence here means "look at this skill manually before merging the v4 PR."
62
63| Pattern | Where it lives | What might change |
64|---------|----------------|--------------------|
65| `chains:` runner interface | `aeon.yml` | Step format (`parallel:` / `consume:` keys, `on_error` semantics) |
66| `reactive:` trigger conditions | `aeon.yml` | Condition vocabulary (`consecutive_failures`, `success_rate`, `last_status`) |
67| Chain runner output passing | `.outputs/*.md`, `chain-runner.yml` | Path layout, file shape |
68| Schedule syntax (`workflow_dispatch`, `reactive`, cron) | `aeon.yml` | Naming of pseudo-schedules; cron escapes |
69| Model selector strings | `aeon.yml` per-skill `model:` | Model id references — Opus/Sonnet/Haiku version pins |
70| `gateway:` provider block | `aeon.yml` | Bankr/direct selector, env var names |
71| `channels:` block (`jsonrender.enabled`) | `aeon.yml` | Toggle key names, channel set |
72| MCP server tool naming (`aeon-${skill_slug}`) | `mcp-server/src/index.ts` | Naming convention for forks consuming the MCP |
73| `add-skill`, `add-mcp`, `add-a2a` CLIs | repo root | Argument shape, supported sources |
74| `skills.json` schema (`version`, `categories`, `skills[].install`) | `skills.json` | Field set; rename/remove of optional fields |
75| `dashboard/lib/catalog.ts` json-render catalog shape | `dashboard/` | Spec shape for `dashboard/outputs/*.json` |
76
77### Custom — skills with no upstream equivalent
78
79Anything listed in this fork's `memory/MEMORY.md` Skills Built table OR present under `skills/` and missing from upstream's `skills.json` (compared to the install metadata in this fork's own `skills.json` if present). These need a manual v4 compat check from the operator — the upstream maintainer cannot guarantee their patterns travel.
80
81For each custom skill, we list:
82- name
83- declared `var` (if any) — same as upstream contract
84- whether it consumes any path from the **Review** table above
85- count of references to other skills (chained or implicit)
86
87### Removed (placeholder)
88
89| Pattern | Replacement | Migration note |
90|---------|-------------|----------------|
91
92(Empty until v4 is announced. The operator's job — and the maintainer's job upstream — is to populate this row by row as the v4 PRs land. Each row should have a one-line migration recipe so the readiness report can convert it directly into an action item.)
93
94## Steps
95
96### 1. Parse var
97
98- If `${var}` matches `^dry-run$` → `MODE=dry-run`. No notification, article still writes.
99- Else if `${var}` matches `^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$` → `MODE=remote`, `TARGET=${var}`. All file reads go through `gh api repos/${TARGET}/contents/...`.
100- Else if `${var}` is empty → `MODE=local`, `TARGET=$(gh repo view --json nameWithOwner --jq .nameWithOwner)`.
101- Anything else → log `V4_READINESS_BAD_VAR: ${var}` and exit (no notify, no article).
102
103### 2. Load fork inputs
104
105```bash
106mkdir -p articles
107```
108
109Read each input. Any missing input is non-fatal — log `V4_READINESS_MISSING_INPUT: <name>` and proceed without it. The skill never invents content for missing inputs.
110
111| Input | Local | Remote (`MODE=remote`) |
112|-------|-------|------------------------|
113| `aeon.yml` | direct read | `gh api repos/${TARGET}/contents/aeon.yml --jq .content \| base64 -d` |
114| `skills.json` | direct read | same pattern |
115| `memory/MEMORY.md` | direct read | same pattern |
116| Custom skills | `ls skills/` minus skills present in `skills.json` install rows | `gh api repos/${TARGET}/contents/skills` JSON |
117
118If `aeon.yml` is unreadable, log `V4_READINESS_NO_CONFIG` and exit with no notification (the fork is not initialized; nothing to check).
119
120### 3. Compute the enabled-skill snapshot
121
122From `aeon.yml`, extract for each skill under `skills:`:
123- `enabled` (true/false)
124- `schedule` (cron or `workflow_dispatch` or `reactive`)
125- `var` (default value, if set)
126- `model` (override, if set)
127
128Ignore commented entries. Capture `chains:`, `reactive:`, `gateway:`, `channels:` blocks verbatim — the readiness check looks at their **shape**, not their values.
129
### 4. Walk the manifest categories

For each row in the **Safe** table: scan the fork's inputs for the pattern. If found, record under `safe[]` with the file/line where it was matched. If absent, do not flag — Safe means "if you use it, it stays working," not "you must use it."

For each row in the **Review** table: scan the fork's inputs for the pattern. If the fork uses it, record under `review[]` with the matched location and the manifest's "what might change" note. Every recorded Review row is something the operator should manually inspect before merging v4.

For each custom-skill candidate: confirm it exists on disk (`skills/${name}/SKILL.md`) and does **not** match the upstream-fingerprint heuristic (skills with `install: ./add-skill aaronjmars/aeon ${name}` in this fork's `skills.json` are upstream; everything else is custom). Cross-reference custom skills against the Review patterns — a custom skill that acts as a `chains:` consumer is the highest-priority audit candidate.

For each row in the **Removed** table (currently empty): if the fork uses the removed pattern, record under `action[]` with the migration note as the action.

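
A rough Python sketch of the category walk and the custom-skill check, under stated assumptions: manifest rows are modeled as dicts with a plain substring `pattern` and a `note`, inputs are a mapping from file name to text, and the example rows are placeholders rather than real Manifest content. The `slug` and `install` fields follow the `skills.json` description in the skill; everything else is hypothetical.

```python
def walk_manifest(manifest: dict, inputs: dict) -> dict:
    """Scan each input for manifest patterns and bucket the matches.

    `manifest` maps a category ("safe", "review", "removed") to rows shaped as
    {"pattern": str, "note": str}; `inputs` maps a file name to its text.
    Both shapes are assumptions made for this sketch.
    """
    buckets = {"safe": [], "review": [], "action": []}
    target = {"safe": "safe", "review": "review", "removed": "action"}
    for category, rows in manifest.items():
        for row in rows:
            for filename, text in inputs.items():
                for lineno, line in enumerate(text.splitlines(), start=1):
                    if row["pattern"] in line:
                        buckets[target[category]].append({
                            "pattern": row["pattern"],
                            "where": f"{filename}:{lineno}",
                            "note": row.get("note", ""),
                        })
    return buckets


def custom_skills(on_disk: list, skills_json: dict,
                  upstream: str = "aaronjmars/aeon") -> list:
    """Skills on disk without an upstream `install` line are treated as custom."""
    upstream_slugs = {
        skill["slug"] for skill in skills_json.get("skills", [])
        if upstream in skill.get("install", "")
    }
    return sorted(name for name in on_disk if name not in upstream_slugs)


# Placeholder rows for illustration; not the real Manifest.
manifest = {
    "safe": [{"pattern": "schedule:", "note": ""}],
    "review": [{"pattern": "chains:", "note": "placeholder note"}],
    "removed": [],
}
inputs = {
    "aeon.yml": "skills:\n  digest:\n    schedule: '0 7 * * *'\nchains:\n  - morning\n"
}
result = walk_manifest(manifest, inputs)
custom = custom_skills(
    ["digest", "my-skill"],
    {"skills": [{"slug": "digest", "install": "./add-skill aaronjmars/aeon digest"}]},
)
```

Note that the scan can only match files present in `inputs`, so a Manifest pattern that points at a file outside the read set is never detected by a walk of this kind.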
### 5. Score effort per Review item

Per item, assign a complexity tag based on the manifest pattern:

| Tag | Heuristic |
|-----|-----------|
| `trivial` | Config rename only; one-line `aeon.yml` edit |
| `minor` | Config restructure; ≤ 5 lines or one chain block edit |
| `moderate` | SKILL.md prose changes needed (e.g. chained skill consuming an output whose shape changed) |
| `manual` | Custom-skill review required; outcome cannot be predicted from the fork's metadata alone |

The score is informational — it is not a green-light gate. A `trivial` item still counts; the tag tells the operator whether they need 60 seconds or 60 minutes to address it.

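
Once each Review item carries one of the four tags, the counts can be summarized as a `trivial/minor/moderate/manual` split. A small sketch, assuming each item is a dict with a `tag` key (a hypothetical shape):

```python
from collections import Counter

EFFORT_TAGS = ("trivial", "minor", "moderate", "manual")


def effort_split(review_items: list) -> str:
    """Format the trivial/minor/moderate/manual counts as one string."""
    counts = Counter(item["tag"] for item in review_items)
    unknown = set(counts) - set(EFFORT_TAGS)
    if unknown:
        raise ValueError(f"unknown effort tags: {sorted(unknown)}")
    # Counter returns 0 for tags that never occurred.
    return "/".join(str(counts[tag]) for tag in EFFORT_TAGS)


items = [{"tag": "trivial"}, {"tag": "manual"}, {"tag": "trivial"}]
split = effort_split(items)  # "2/0/0/1"
```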
### 6. Build the article

Path: `articles/v4-readiness-${today}.md`. Overwrite if exists.

```markdown
# v4 Readiness — ${TARGET} — ${today}

**Verdict:** ${one of: READY — 0 review items, 0 action items | REVIEW — N items to inspect, M action items | ACTION — M removed-pattern items must be addressed before upgrade}

*Audit basis: aeon.yml + skills.json + MEMORY.md + skills/ on disk · Manifest version: embedded in skills/v4-readiness/SKILL.md as of ${today}*

---

## Safe (${safe_count})

Patterns this fork uses that are confirmed stable into v4. No action needed.

| Pattern | Where in this fork |
|---------|---------------------|
| ${pattern} | ${file_or_line} |

## Review (${review_count})

Patterns this fork uses that v4 may change. Inspect each one before merging the upstream v4 PR.

| Pattern | Where | What might change | Effort |
|---------|-------|--------------------|--------|
| ${pattern} | ${file_or_line} | ${manifest_note} | ${tag} |

## Custom (${custom_count})

Skills present on this fork but not in the upstream catalog. The upstream maintainer cannot guarantee their patterns travel into v4.

| Skill | Schedule | Reads from Review patterns? | Notes |
|-------|----------|-----------------------------|-------|
| ${name} | ${schedule} | ${yes/no — list} | ${one-line summary from MEMORY.md if present} |

## Action items (${action_count})

(Populated when the **Removed** table is non-empty. Each row is a concrete numbered step the operator must take before pulling v4.)

1. ${action} (${tag})

---

## Methodology

- Manifest read from `skills/v4-readiness/SKILL.md` §Manifest.
- Custom-skill detection: skills present under `skills/` whose slug does not appear in `skills.json[skills][].slug` with an `install` line referencing the upstream catalog.
- Effort tags are heuristic — `trivial`/`minor`/`moderate`/`manual`. They do not predict v4 release-note exact wording.
- This skill never modifies `aeon.yml`, never opens a PR, never pulls upstream. It only reports.

---

*Re-run after the v4 announcement updates the embedded manifest. Until then the **Removed** section is empty by design — only Safe / Review / Custom rows are populated.*
```

If `safe_count == 0 AND review_count == 0 AND custom_count == 0 AND action_count == 0`: the verdict is `READY` and the article still writes — operators may want a paper trail confirming an empty-state audit.

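
The verdict line can be derived from the bucket counts alone. The sketch below is one reading of the template: `ACTION` takes precedence whenever a removed pattern is in use, `REVIEW` applies when only Review items exist, and Safe and Custom counts do not affect the outcome. The function name is hypothetical.

```python
def verdict(review_count: int, action_count: int) -> str:
    """Map bucket counts to READY / REVIEW / ACTION.

    ACTION wins whenever a removed pattern is in use, regardless of Review
    counts; Safe and Custom counts never change the verdict.
    """
    if action_count > 0:
        return "ACTION"
    if review_count > 0:
        return "REVIEW"
    return "READY"
```

For example, `verdict(3, 0)` gives `REVIEW`, and `verdict(3, 1)` gives `ACTION`.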
### 7. Notify

If `MODE == dry-run` → skip notify, log `V4_READINESS_DRY_RUN`, exit cleanly (article still wrote).

If verdict is `READY` AND no items in any bucket → still notify, but with a single-line body. Operators dispatched this skill manually; silence on a manual run is worse than a one-line "all clear" reply.

Standard notify body:

```
*v4 Readiness — ${today} — ${TARGET}*

Verdict: ${verdict}

- Safe: ${safe_count}
- Review: ${review_count} (${trivial}/${minor}/${moderate}/${manual} effort split)
- Custom: ${custom_count}
- Action: ${action_count}

Top review item: ${first_review_pattern_or_"—"}
Top custom skill: ${first_custom_skill_or_"—"}

Article: articles/v4-readiness-${today}.md
Manifest version: embedded in skills/v4-readiness/SKILL.md as of ${today}

Re-dispatch after the next v4 manifest update to refresh the verdict.
```

Cap message at ~3500 chars (Telegram safe limit). If exceeded, drop the Custom section first; Action and Review are higher priority.

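
One way to apply the length cap is to render the body from named sections and drop the lowest-priority ones until it fits. In this sketch the section names are hypothetical; dropping Custom first comes from the text, while dropping Safe next and hard-truncating as a last resort are assumptions.

```python
MAX_CHARS = 3500  # approximate cap from the text


def cap_message(sections: dict, limit: int = MAX_CHARS) -> str:
    """Join notify sections, dropping lower-priority ones if over the limit.

    `sections` maps a section name to its rendered text, in display order.
    Custom is dropped first as the text specifies; dropping Safe next and
    truncating as a last resort are assumptions of this sketch.
    """
    drop_order = ["custom", "safe"]
    kept = dict(sections)
    body = "\n".join(kept.values())
    for name in drop_order:
        if len(body) <= limit:
            break
        kept.pop(name, None)
        body = "\n".join(kept.values())
    return body[:limit]


long_run = cap_message({
    "header": "h",
    "review": "r" * 100,
    "custom": "c" * 5000,
    "action": "a",
})
short_run = cap_message({"header": "hi", "custom": "c"})
```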
### 8. Log to `memory/logs/${today}.md`

```
## v4 Readiness
- **Skill**: v4-readiness
- **Mode**: ${local|remote|dry-run}
- **Target**: ${TARGET}
- **Verdict**: ${READY|REVIEW|ACTION}
- **Counts**: safe=${N} review=${N} custom=${N} action=${N}
- **Article**: articles/v4-readiness-${today}.md
- **Notification**: ${sent|skipped — dry-run}
- **Status**: ${V4_READINESS_OK | V4_READINESS_DRY_RUN | V4_READINESS_NO_CONFIG | V4_READINESS_BAD_VAR | V4_READINESS_PARTIAL}
```

`V4_READINESS_PARTIAL` means at least one input was missing (logged in step 2) but the audit still wrote — the operator should sanity-check the affected section.

## Exit taxonomy

| Status | Meaning | Notify? |
|--------|---------|---------|
| `V4_READINESS_OK` | Audit completed against all inputs | Yes |
| `V4_READINESS_PARTIAL` | At least one input missing; audit ran on remaining inputs | Yes |
| `V4_READINESS_DRY_RUN` | `var=dry-run` mode | No (article still writes) |
| `V4_READINESS_NO_CONFIG` | `aeon.yml` unreadable; fork not initialized | No |
| `V4_READINESS_BAD_VAR` | `${var}` was non-empty, non-`dry-run`, not an `owner/repo` slug | No |

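
The taxonomy reduces to a small lookup plus a classifier for `${var}`. The sketch below encodes the table's Notify column and the three accepted `var` forms; the slug regular expression and the function names are assumptions, not the skill's actual validation rule.

```python
import re
from enum import Enum


class Status(Enum):
    OK = "V4_READINESS_OK"
    PARTIAL = "V4_READINESS_PARTIAL"
    DRY_RUN = "V4_READINESS_DRY_RUN"
    NO_CONFIG = "V4_READINESS_NO_CONFIG"
    BAD_VAR = "V4_READINESS_BAD_VAR"


# Only these two statuses have "Yes" in the Notify column.
NOTIFIES = {Status.OK, Status.PARTIAL}

# Assumed slug shape: owner/repo built from letters, digits, '_', '.', '-'.
SLUG = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def should_notify(status: Status) -> bool:
    return status in NOTIFIES


def classify_var(var: str) -> str:
    """Return "local", "dry-run", "remote", or "bad" for a raw var value."""
    if var == "":
        return "local"
    if var == "dry-run":
        return "dry-run"
    if SLUG.match(var):
        return "remote"
    return "bad"  # would be logged as V4_READINESS_BAD_VAR
```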
## Sandbox note

**Local mode (default).** Pure local file I/O — no curl, no env-var-in-headers, no prefetch. Every read is a directory listing or file read against the working tree. The only outbound call is `./notify` itself, which uses the postprocess pattern (see CLAUDE.md).

**Remote mode (`var=owner/repo`).** Each input read is a single `gh api repos/${TARGET}/contents/${path}` call. `gh` handles auth via the workflow's `GITHUB_TOKEN`, so there is no env-var-in-curl pattern to work around. The remote-survey mode is rate-limit-bounded: the `GITHUB_TOKEN` issued to a GitHub Actions workflow is limited to 1,000 REST requests per hour per repository (the 5,000/h figure applies to personal access tokens), so with five reads per fork the budget covers ~200 fork audits per hour. The realistic per-day operator workload is far below this.

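
The rate-limit bound is a single integer division. The sketch below keeps the hourly budget as a parameter, since GitHub documents different limits for different token types; the function name is hypothetical.

```python
def audits_per_hour(hourly_budget: int, reads_per_fork: int) -> int:
    """Whole fork audits that fit in one hour of API budget."""
    if reads_per_fork <= 0:
        raise ValueError("reads_per_fork must be positive")
    return hourly_budget // reads_per_fork


with_5000 = audits_per_hour(5000, 5)  # 1000
with_1000 = audits_per_hour(1000, 5)  # 200
```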
## Constraints

- **Never auto-mutate the fork.** The skill is read-only. It does not edit `aeon.yml`, does not open a PR, does not pull upstream. The upgrade decision belongs to the operator.
- **Never invent v4 details.** The Manifest is the only source of truth. Operators update the Manifest when the maintainer posts v4 changes; the skill reports against whatever the Manifest currently says. If the Manifest is stale, the report is stale, but it is never wrong-by-fabrication.
- **Custom-skill detection is heuristic.** A skill present under `skills/` with no `install:` in `skills.json` is treated as custom. False positives (the fork removed and re-added an upstream skill manually) are tagged `manual` so the operator notices; false negatives (the fork edited an upstream skill in place) are caught by `skill-update-check`, not here.
- **Idempotent.** Same-day reruns overwrite the article. The log line is appended (multiple runs visible if the operator re-dispatches during an upgrade).
- **One notification max per run.** Even if remote-mode audits multiple targets in sequence (not currently supported by `var` syntax — one slug per run), each invocation produces at most one notify call.
- **Manifest evolves; skill body does not.** When the v4 announcement lands, the Manifest tables in this file are the only edit surface. The Steps and Constraints stay stable so operators can regenerate the article without merging upstream changes to skill prose.

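
The idempotency constraint amounts to two different write modes: the article is overwritten and the log is appended. A minimal sketch, using a temporary directory in place of the fork's working tree; `write_outputs` and the sample date are hypothetical.

```python
import tempfile
from pathlib import Path


def write_outputs(root: Path, today: str, article: str, log_entry: str) -> None:
    """Overwrite the day's article and append to the day's log."""
    article_path = root / "articles" / f"v4-readiness-{today}.md"
    log_path = root / "memory" / "logs" / f"{today}.md"
    article_path.parent.mkdir(parents=True, exist_ok=True)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    article_path.write_text(article, encoding="utf-8")  # same-day rerun overwrites
    with log_path.open("a", encoding="utf-8") as fh:    # reruns stay visible
        fh.write(log_entry.rstrip("\n") + "\n")


# Two runs on the same day against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_outputs(root, "2025-01-01", "first draft", "## run 1")
    write_outputs(root, "2025-01-01", "second draft", "## run 2")
    article_text = (root / "articles" / "v4-readiness-2025-01-01.md").read_text(encoding="utf-8")
    log_text = (root / "memory" / "logs" / "2025-01-01.md").read_text(encoding="utf-8")
```

After the second run only the latest article body remains, while both log entries are preserved.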
## Edge cases

- **Empty `aeon.yml` skills block (fresh fork)** — verdict is `READY`, every bucket is empty. Article writes, notification fires with the single-line body. The operator confirms the fork has nothing to migrate.
- **Custom skill that imports an upstream skill name** — listed under Custom with a `notes` cell flagging the collision. The operator must decide whether to keep the override after v4 lands.
- **Manifest's Removed section non-empty AND fork uses the pattern** — verdict escalates to `ACTION` regardless of Review counts. Action items list before Review in the notification.
- **`gh api` fails in remote mode** — log `V4_READINESS_REMOTE_API_ERROR: <code>` and fall back to a partial audit using only inputs that did read; emit `V4_READINESS_PARTIAL`. Do not retry; remote-mode is a survey tool and partial coverage is acceptable.
- **Same fork audited twice in one day with `var=dry-run` then `var=` empty** — the empty-var run overwrites the article and sends the notification; the dry-run run already wrote the article body so the empty-var run produces a byte-identical or near-identical file (only the timestamp changes). This is intended; the operator gets a notification once they explicitly opt in.

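
The remote-failure fallback can be sketched as a single pass over the inputs with no retries. Here `fetch` is a hypothetical callable standing in for the `gh api` read, and `fake_fetch` simulates one failed call; the status rules follow the text (`aeon.yml` missing means no config, any other missing input means a partial audit).

```python
from typing import Callable, Dict, List, Tuple


def read_inputs(paths: List[str],
                fetch: Callable[[str], str]) -> Tuple[Dict[str, str], List[str], str]:
    """Read each input once, without retries, and report what failed.

    `fetch` is a hypothetical stand-in for the remote read; it should raise
    an exception when the call fails.
    """
    contents: Dict[str, str] = {}
    failed: List[str] = []
    for path in paths:
        try:
            contents[path] = fetch(path)
        except Exception as exc:  # broad on purpose: any failed read is "missing"
            failed.append(f"{path}: {exc}")
    if "aeon.yml" not in contents:
        status = "V4_READINESS_NO_CONFIG"
    elif failed:
        status = "V4_READINESS_PARTIAL"
    else:
        status = "V4_READINESS_OK"
    return contents, failed, status


def fake_fetch(path: str) -> str:
    """Simulated reader: skills.json fails, everything else succeeds."""
    if path == "skills.json":
        raise RuntimeError("HTTP 404")
    return f"contents of {path}"


contents, failed, status = read_inputs(
    ["aeon.yml", "skills.json", "memory/MEMORY.md"], fake_fetch
)
```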

Closure

Closed 1 week ago

SHA: 4b9b49251c8c9808bf147d55aa2930352af2e8c0


Tweet thread template

tweet 1 of 8 · 190 / 280

Two frontier models reviewed PR #12 on 6f7fc663. Both found this bug: high bug: v4-readiness manifest references files not included in the skill’s read set, causing missed Review detections

tweet 2 of 8 · 118 / 280

The vulnerable code (skills/v4-readiness/SKILL.md:0-0): (full snippet at https://www.antfleet.dev/anatomy/42eb81fe-1)

tweet 3 of 8 · 36 / 280

What Opus saw: "Output unavailable"

tweet 4 of 8 · 280 / 280

What GPT-5 saw: "The skill promises to detect Review-pattern usage by scanning the fork’s inputs, but its declared input set excludes several locations explicitly listed in the Manifest (mcp-server/src/index.ts, .outputs/, chain-runner.yml, dashboard/). As a result, forks using…

tweet 5 of 8 · 97 / 280

Both flagged the same line range. AntFleet's unanimous gate fired — the finding posted on the PR.

tweet 6 of 8 · 93 / 280

The fix landed in commit 4b9b492: (view diff at https://www.antfleet.dev/anatomy/42eb81fe-1)

tweet 7 of 8 · 81 / 280

AntFleet reviews every PR with two frontier models. Only unanimous findings post.

tweet 8 of 8 · 77 / 280

Full anatomy + reasoning + diffs: https://www.antfleet.dev/anatomy/42eb81fe-1

Paste into X composer one tweet at a time. X has no multi-tweet intent API.