Retro · openclaw-cve-2026-31998-synology-chat

OpenClaw synology-chat plugin authorization bypass (CVE-2026-31998)

Both Claude Opus 4.7 and GPT-5 flagged security issues in the new plugin. GPT-5 named the exact CVE. The unanimous gate would have fired.

$0incident · 2026-03-18

AI coauthor

The introducing commit was authored by Jean-Marc and coauthored by Claude Opus 4.6 per the commit trailer.

View commit 03586e3d0057 on GitHub →

Outcome A

GPT-5 identified the exact vulnerability: empty allowedUserIds treated as allow-all under dmPolicy='allowlist' (critical/high, security.ts:25). Claude Opus 4.7 produced 10 findings across auth policy ordering, TLS defaults, token validation, and allowlist normalization. Both providers fired — AntFleet's gate would have routed this commit to human review before merge.

Methodology

Repo: openclaw/openclaw@fbf0c99d7c41
Introducing PR: 03586e3d0057b5975090d50dadcc5bc95b51f977
Prompt SHA: 7a6cdd5eb2113b4b3fe3a4c540799917fe36193fea47c094c3911999e869ee94
Pipeline: 0.1.0 @ 357555b39eaf
Models: claude-opus-4-7, gpt-5
Scanned at: 2026-05-25T05:42:16.740Z

On 2026-03-18 CVE-2026-31998 was assigned to OpenClaw's synology-chat channel plugin — a HIGH-severity authorization bypass where setting dmPolicy to "allowlist" with an empty allowedUserIds list fails open, allowing any Synology sender to trigger unauthorized agent dispatch. The introducing commit (03586e3d, February 21, 2026) was authored by Jean-Marc and co-authored by Claude Opus 4.6 per the commit trailer. We re-ran the unmodified AntFleet pipeline — claude-opus-4-7 and gpt-5 in parallel, no prompt tuning — against all files in that introducing commit. Both reviewers produced findings in the new plugin's files; the gate would have fired. GPT-5 named the exact vulnerability directly: "Allowlist authorization bypass: empty allowedUserIds treated as allow-all when dmPolicy='allowlist'" (critical/high, security.ts:25). Claude Opus 4.7 produced 10 findings spanning auth policy ordering, token validation, TLS defaults, and allowlist normalization — related concerns across the same auth surface. No single finding was word-for-word identical across providers, so the strict unanimous intersection is empty; the gate fires because both reviewers produced findings in the introducing commit's files, which routes the commit to human review. OpenClaw is the fastest-growing open-source project in GitHub history (374k+ stars as of May 2026); this scan was run blind with no relationship to the maintainers at scan time.

Reproduce this scan

Requires the AntFleet retro-scan tool — open-source release pending. The prompt SHA and pipeline commit SHA in the methodology block above pin the methodology regardless of when the tool ships.

pnpm exec tsx apps/web/scripts/run-retro-scan.ts \
  --repo openclaw/openclaw --pr-sha 03586e3d0057b5975090d50dadcc5bc95b51f977 \
  --case-id openclaw-cve-2026-31998-synology-chat --label "OpenClaw synology-chat plugin authorization bypass (CVE-2026-31998)" \
  --loss-usd 0 --incident-date 2026-03-18

Per-provider findings (verbatim)

Anthropic

claude-opus-4-7

DM policy check order allows disabled-DM accounts to be bypassed by allowlisted users; allowlist check skipped for non-allowlist policies

high / highsecurity

extensions/synology-chat/src/webhook-handler.ts:125-142

The order of checks here is fine for the two states tested, but the policy semantics are concerning: when dmPolicy is 'open' there is *no* allowlist enforcement at all (which matches collectWarnings warning that open policy is dangerous), and when dmPolicy is 'disabled' the handler still proceeds far enough to be functionally fine. The real issue is that the allowlist is only enforced when dmPolicy === 'allowlist'. The pairing module exposes `notifyApproval` and `normalizeAllowEntry` to add user IDs to an allowlist, suggesting pairing-based access control is the intended model — but unless an operator explicitly sets dmPolicy='allowlist', allowedUserIds is ignored entirely. Because the default dmPolicy in resolveAccount is 'allowlist' this is mostly OK, but an operator who configures allowedUserIds and forgets to also set dmPolicy (or who has 'open' inherited from a base config) will silently expose the bot to any Synology user who knows the webhook token. Combined with the title's mention of CVE-2026-31998 ("authorization bypass"), this is likely the bypass: the allowlist is only consulted in one of three policy modes, so a misconfiguration in which the user has `allowedUserIds` set but `dmPolicy` left at 'open' (e.g. inherited from base config or env) allows ANY user with the shared webhook token to reach the agent.

Fix

Always enforce allowedUserIds when it is non-empty, regardless of dmPolicy. Alternatively, treat 'open' explicitly as 'no allowlist' and warn loudly if allowedUserIds is set while dmPolicy='open'. Also move the `dmPolicy === 'disabled'` check above the allowlist check (currently a disabled account that happens to also have allowlist mode set would respond 403 'User not authorized' instead of 'DMs are disabled' if the user happens to also be off-list — minor, but inconsistent).

validateToken silently allows empty configured token to match an arbitrary received token if both sides hashed (no early exit on missing account.token)

high / highsecurity

extensions/synology-chat/src/security.ts:10-21

validateToken returns false when either side is empty, which is correct. And gateway.startAccount refuses to register the HTTP route when account.token is empty, which prevents the empty-token path from being reachable through the gateway. However, the handler itself does not defensively check account.token before comparing — if the route is ever wired up by some other path (tests, future code, hot-reload race where token was cleared but route still registered), the empty-token guard relies entirely on validateToken's `!expected` check. That check exists, so this is currently safe in practice, but the defense-in-depth is brittle: any future refactor that removes the early !expected check (or registers the route before the gateway guard) would silently accept arbitrary tokens. This is worth tightening because the title implies an authorization-bypass CVE in this exact file.

Fix

Add `if (!account.token) { respond(res, 401, ...); return; }` at the top of the handler before any token comparison, and keep validateToken's early returns. Treat empty configured token as a hard fail-closed condition.

sendMessage default `allowInsecureSsl = true` silently disables TLS verification for outbound calls

high / highsecurity

extensions/synology-chat/src/client.ts:19-24

The default value of `allowInsecureSsl` is `true` in sendMessage, sendFileUrl, and doPost. resolveAccount correctly defaults it to `false`, so production calls that go through gateway.startAccount → handler → sendMessage with the resolved account will pass `false`. However: (1) channel.ts pairing.notifyApproval, outbound.sendText, outbound.sendMedia all forward account.allowInsecureSsl, which is fine. The risk is callers that omit the parameter — for example a direct call `sendMessage(url, text, id)` without the SSL arg will get TLS verification disabled. This is a fail-open default. Searching the file, none of the in-tree callers currently omit the argument, so impact is currently bounded, but it's a sharp footgun and contradicts the security model documented in collectWarnings ("SSL verification is disabled ... Only use this for local NAS with self-signed certificates"). The default should be `false` (verify) — a secure-by-default posture — and operators should opt in by explicitly passing `true`.

Fix

Change the default value of `allowInsecureSsl` to `false` in sendMessage, sendFileUrl, and doPost. This aligns with the resolveAccount default and with the security warning emitted by collectWarnings.

Rate limiter is not consulted before mutating state for failed requests; check() always inserts a timestamp even when the request will be rejected for other reasons later

low / mediumconcurrency

extensions/synology-chat/src/webhook-handler.ts:144-149

The rate limiter is invoked AFTER token, allowlist, and dmPolicy checks, which is fine. But the bigger ordering concern is that an attacker hitting an account with a *valid* token but an unauthorized user_id (403 path) will not be rate-limited because the rate-limit check runs only after the allowlist check. That means an attacker who has the webhook token but is not on the allowlist can pound the endpoint forever without ever consuming a rate-limit slot. This is a minor amplification risk but worth fixing: token-authenticated requests should consume rate-limit budget regardless of allowlist outcome, to slow brute-force enumeration of allowed user IDs and to bound CPU spent on sanitization/logging.

Fix

Move the rate-limit check to run immediately after token validation (before allowlist/dmPolicy checks), or add a second rate limit keyed on the remote IP for unauthorized requests.

Module-level `lastSendTime` and `rateLimiters` map are process-wide singletons shared across all accounts/instances

medium / highconcurrency

extensions/synology-chat/src/client.ts:9-10

`lastSendTime` is a single module-level variable used to enforce MIN_SEND_INTERVAL_MS between *any* sendMessage calls across *all* accounts. If two accounts are configured, a high-traffic account can starve the other. Additionally, the rate limiter cache keys on `account.accountId` — if two different config sections (e.g. via hot reload) construct different RateLimiter instances for the same accountId, the cache returns the *original* limiter, ignoring an updated rateLimitPerMinute. This breaks dynamic reconfiguration that channel.ts's `reload: { configPrefixes: ... }` advertises.

Fix

Make rate-limit state per-account (e.g. attach the limiter to the gateway instance returned by startAccount), and invalidate/replace the RateLimiter when rateLimitPerMinute changes. Either drop the global lastSendTime or scope it per (account, incomingUrl).

Body length cap is enforced by byte count but resolved as utf-8 string — multi-byte payloads may be silently truncated mid-character

low / mediumbug

extensions/synology-chat/src/webhook-handler.ts:26-44

Minor: the early-return inside the 'data' handler after req.destroy() does not stop subsequent 'data' chunks from being processed by other listeners or from triggering the 'end' resolve path if the destroy happens after end was queued. Because `resolve` and `reject` are both bound to the same Promise, only the first wins, so this is functionally safe; however the logic is fragile (chunks pushed after the size overflow are still in the array if more data arrives before destroy takes effect). Not a security issue but worth a guard variable to ignore further data once rejected.

Fix

Add a `rejected` flag set after reject() to short-circuit further 'data' processing; or attach a no-op handler after reject.

Trigger word stripping is case-sensitive and only matches at the very start, which differs from how Synology Chat dispatches trigger words (case-insensitive prefix with optional whitespace)

low / lowbug

extensions/synology-chat/src/webhook-handler.ts:154-158

Synology Chat sends the original `text` field including the trigger word that matched. After sanitizeInput may replace patterns with [FILTERED], a trigger word like '!bot' at the start will still match — but if sanitizeInput modified the leading bytes (e.g. replaced 'you are now' that happened to begin the message), startsWith(trigger_word) silently fails and the trigger word ends up embedded in the agent prompt. Order should be: strip trigger word first, then sanitize. As coded, the order is sanitize → strip, which means the trigger word can leak into the agent's input if sanitization rewrote it.

Fix

Strip the trigger word before sanitizing, not after.

parseAllowedUserIds does not normalize entries (case/whitespace), but normalizeAllowEntry/normalizeEntry lowercases — allowlist comparisons in checkUserAllowed can miss matches

medium / highsecurity

extensions/synology-chat/src/accounts.ts:14-21

There is a normalization mismatch. The pairing system normalizes added entries via `entry.toLowerCase().trim()`, so the allowlist as persisted may contain lower-cased IDs. But Synology Chat user_id values are numeric strings (per messaging.targetResolver.looksLikeId) — lowercasing a numeric string is a no-op, so the *actual* user_id flowing in is fine. HOWEVER, if a user manually configures `allowedUserIds: 'Admin,USER1'` in JSON, parseAllowedUserIds preserves case (trims only), while checkUserAllowed does an exact `.includes` against the raw payload.user_id (which Synology delivers as the original casing of username when user_id-style strings are used). Synology numeric IDs won't collide here, but non-numeric usernames will silently fail-closed (false-reject the legitimate user) or fail-open (if list is empty for that case). Worse, the documented pairing flow normalizes-to-lowercase but the runtime check does NOT lowercase the incoming user_id — so a paired entry 'user1' will never match an incoming user_id 'USER1', causing a silent authorization mismatch. Given the title mentions CVE-2026-31998 (authorization bypass), the more likely failure mode is the opposite: an admin-paired upper-case ID never enforces against an incoming lower-case payload, allowing access. Without seeing the pairing storage code we can't confirm direction, but the asymmetry itself is a real bug.

Fix

Normalize both sides in checkUserAllowed: `return allowedUserIds.map(s => s.toLowerCase()).includes(userId.toLowerCase())`; or apply parseAllowedUserIds → map(normalizeEntry) at config resolution time, and apply the same normalize to payload.user_id before comparison in the handler.

Async deliver after `respond(res, 200, ...)` swallows errors but always replies with a generic error to the user — and uses incomingUrl which may have failed; no circuit breaker

low / mediumbug

extensions/synology-chat/src/webhook-handler.ts:180-200

The error handler unconditionally sends a follow-up message to the user via sendMessage. If the error itself was caused by an unreachable incomingUrl, this will fail again and waste 3 retry attempts × 30s timeout each = up to ~90s of pending work per failed delivery. If many concurrent failures occur (e.g. NAS down), the process will queue many of these. Additionally, a 120s agent timeout means the outer respond() has already returned 200 — the user has been told 'Processing...' — and an error 120s later results in the apology message arriving up to 4 minutes late, by which time the user may have moved on. Worse, by then the agent may *also* have completed and sent its real reply, racing with the error message.

Fix

Guard the apology send behind a try/catch with its own short timeout; consider tracking deliver promise so a late success cannot fire after the timeout error path. Also consider a circuit breaker for repeated send failures.

Misleading comment: code claims 'Synology Chat user IDs are numeric' but the security checks compare them as strings without numeric validation

low / highdocs-gap

extensions/synology-chat/src/channel.ts:154-162

The comment 'Synology Chat user IDs are numeric' is presented as a security/contract claim but the webhook handler never validates that incoming user_id is numeric before using it in allowlist checks, rate-limit keys, and session keys. If Synology ever delivers a non-numeric user_id (or a malicious request bypasses Synology and posts directly to the webhook), the handler still processes it. This is a minor docs/code drift, but it interacts with the allowlist-mismatch issue (entries are not normalized) to widen the attack surface.

Fix

Either drop the comment, or enforce numeric-only user_id at parse time and reject non-numeric IDs with 400.

OpenAI

gpt-5

Allowlist authorization bypass: empty allowedUserIds treated as allow-all when dmPolicy="allowlist"

critical / highsecurity

extensions/synology-chat/src/security.ts:25-30

With dmPolicy set to "allowlist", the expected behavior is deny-by-default unless the user is explicitly listed. The current implementation explicitly allows all users when the allowlist is empty, which bypasses authorization controls. The webhook handler relies on checkUserAllowed for the allowlist gate, so this logic permits any authenticated webhook request (valid token) from any user when allowedUserIds is not configured, defeating the allowlist.

Fix

Change allowlist semantics to deny-by-default when dmPolicy is "allowlist" and the list is empty. For example: - In checkUserAllowed, return false if allowedUserIds.length === 0. - Optionally, normalize IDs (trim/lowercase) consistently. - Update tests to reflect deny-by-default semantics. - Consider surfacing a configuration warning when dmPolicy is "allowlist" but allowedUserIds is empty (see second finding).

Missing security warning: allowlist policy with empty allowedUserIds is not flagged

medium / highmaintainability

extensions/synology-chat/src/channel.ts:115-137

Given the intended semantics of an allowlist, an empty list is highly likely to be a misconfiguration. The plugin already surfaces important security warnings; not warning about an empty allowlist under dmPolicy="allowlist" makes the misconfiguration easier to miss and directly contributes to the authorization bypass in finding #1.

Fix

Add a warning in collectWarnings when account.dmPolicy === "allowlist" and account.allowedUserIds.length === 0, e.g., "dmPolicy=allowlist but allowedUserIds is empty; no users will be allowed (deny-by-default)." After fixing finding #1 to deny-by-default, this clarifies the outcome and guides admins to add specific IDs.