Retro · openclaw-cve-2026-31998-synology-chat
OpenClaw synology-chat plugin authorization bypass (CVE-2026-31998)
Both Claude Opus 4.7 and GPT-5 flagged security issues in the new plugin. GPT-5 named the exact CVE. The unanimous gate would have fired.
The introducing commit was authored by Jean-Marc and coauthored by Claude Opus 4.6 per the commit trailer.
View commit 03586e3d0057 on GitHub →Outcome A
GPT-5 identified the exact vulnerability: empty allowedUserIds treated as allow-all under dmPolicy='allowlist' (critical/high, security.ts:25). Claude Opus 4.7 produced 10 findings across auth policy ordering, TLS defaults, token validation, and allowlist normalization. Both providers fired — AntFleet's gate would have routed this commit to human review before merge.
Methodology
- Repo
- openclaw/openclaw@fbf0c99d7c41
- Introducing PR
- 03586e3d0057b5975090d50dadcc5bc95b51f977
- Prompt SHA
- 7a6cdd5eb2113b4b3fe3a4c540799917fe36193fea47c094c3911999e869ee94
- Pipeline
- 0.1.0 @ 357555b39eaf
- Models
- claude-opus-4-7, gpt-5
- Scanned at
- 2026-05-25T05:42:16.740Z
Reproduce this scan
Requires the AntFleet retro-scan tool — open-source release pending. The prompt SHA and pipeline commit SHA in the methodology block above pin the methodology regardless of when the tool ships.
pnpm exec tsx apps/web/scripts/run-retro-scan.ts \
--repo openclaw/openclaw --pr-sha 03586e3d0057b5975090d50dadcc5bc95b51f977 \
--case-id openclaw-cve-2026-31998-synology-chat --label "OpenClaw synology-chat plugin authorization bypass (CVE-2026-31998)" \
--loss-usd 0 --incident-date 2026-03-18Per-provider findings (verbatim)
Anthropic
claude-opus-4-7
DM policy check order allows disabled-DM accounts to be bypassed by allowlisted users; allowlist check skipped for non-allowlist policies
extensions/synology-chat/src/webhook-handler.ts:125-142
The order of checks here is fine for the two states tested, but the policy semantics are concerning: when dmPolicy is 'open' there is *no* allowlist enforcement at all (which matches collectWarnings warning that open policy is dangerous), and when dmPolicy is 'disabled' the handler still proceeds far enough to be functionally fine. The real issue is that the allowlist is only enforced when dmPolicy === 'allowlist'. The pairing module exposes `notifyApproval` and `normalizeAllowEntry` to add user IDs to an allowlist, suggesting pairing-based access control is the intended model — but unless an operator explicitly sets dmPolicy='allowlist', allowedUserIds is ignored entirely. Because the default dmPolicy in resolveAccount is 'allowlist' this is mostly OK, but an operator who configures allowedUserIds and forgets to also set dmPolicy (or who has 'open' inherited from a base config) will silently expose the bot to any Synology user who knows the webhook token. Combined with the title's mention of CVE-2026-31998 ("authorization bypass"), this is likely the bypass: the allowlist is only consulted in one of three policy modes, so a misconfiguration in which the user has `allowedUserIds` set but `dmPolicy` left at 'open' (e.g. inherited from base config or env) allows ANY user with the shared webhook token to reach the agent.
Fix
Always enforce allowedUserIds when it is non-empty, regardless of dmPolicy. Alternatively, treat 'open' explicitly as 'no allowlist' and warn loudly if allowedUserIds is set while dmPolicy='open'. Also move the `dmPolicy === 'disabled'` check above the allowlist check (currently a disabled account that happens to also have allowlist mode set would respond 403 'User not authorized' instead of 'DMs are disabled' if the user happens to also be off-list — minor, but inconsistent).
validateToken silently allows empty configured token to match an arbitrary received token if both sides hashed (no early exit on missing account.token)
extensions/synology-chat/src/security.ts:10-21
validateToken returns false when either side is empty, which is correct. And gateway.startAccount refuses to register the HTTP route when account.token is empty, which prevents the empty-token path from being reachable through the gateway. However, the handler itself does not defensively check account.token before comparing — if the route is ever wired up by some other path (tests, future code, hot-reload race where token was cleared but route still registered), the empty-token guard relies entirely on validateToken's `!expected` check. That check exists, so this is currently safe in practice, but the defense-in-depth is brittle: any future refactor that removes the early !expected check (or registers the route before the gateway guard) would silently accept arbitrary tokens. This is worth tightening because the title implies an authorization-bypass CVE in this exact file.
Fix
Add `if (!account.token) { respond(res, 401, ...); return; }` at the top of the handler before any token comparison, and keep validateToken's early returns. Treat empty configured token as a hard fail-closed condition.
sendMessage default `allowInsecureSsl = true` silently disables TLS verification for outbound calls
extensions/synology-chat/src/client.ts:19-24
The default value of `allowInsecureSsl` is `true` in sendMessage, sendFileUrl, and doPost. resolveAccount correctly defaults it to `false`, so production calls that go through gateway.startAccount → handler → sendMessage with the resolved account will pass `false`. However: (1) channel.ts pairing.notifyApproval, outbound.sendText, outbound.sendMedia all forward account.allowInsecureSsl, which is fine. The risk is callers that omit the parameter — for example a direct call `sendMessage(url, text, id)` without the SSL arg will get TLS verification disabled. This is a fail-open default. Searching the file, none of the in-tree callers currently omit the argument, so impact is currently bounded, but it's a sharp footgun and contradicts the security model documented in collectWarnings ("SSL verification is disabled ... Only use this for local NAS with self-signed certificates"). The default should be `false` (verify) — a secure-by-default posture — and operators should opt in by explicitly passing `true`.
Fix
Change the default value of `allowInsecureSsl` to `false` in sendMessage, sendFileUrl, and doPost. This aligns with the resolveAccount default and with the security warning emitted by collectWarnings.
Rate limiter is not consulted before mutating state for failed requests; check() always inserts a timestamp even when the request will be rejected for other reasons later
extensions/synology-chat/src/webhook-handler.ts:144-149
The rate limiter is invoked AFTER token, allowlist, and dmPolicy checks, which is fine. But the bigger ordering concern is that an attacker hitting an account with a *valid* token but an unauthorized user_id (403 path) will not be rate-limited because the rate-limit check runs only after the allowlist check. That means an attacker who has the webhook token but is not on the allowlist can pound the endpoint forever without ever consuming a rate-limit slot. This is a minor amplification risk but worth fixing: token-authenticated requests should consume rate-limit budget regardless of allowlist outcome, to slow brute-force enumeration of allowed user IDs and to bound CPU spent on sanitization/logging.
Fix
Move the rate-limit check to run immediately after token validation (before allowlist/dmPolicy checks), or add a second rate limit keyed on the remote IP for unauthorized requests.
Module-level `lastSendTime` and `rateLimiters` map are process-wide singletons shared across all accounts/instances
extensions/synology-chat/src/client.ts:9-10
`lastSendTime` is a single module-level variable used to enforce MIN_SEND_INTERVAL_MS between *any* sendMessage calls across *all* accounts. If two accounts are configured, a high-traffic account can starve the other. Additionally, the rate limiter cache keys on `account.accountId` — if two different config sections (e.g. via hot reload) construct different RateLimiter instances for the same accountId, the cache returns the *original* limiter, ignoring an updated rateLimitPerMinute. This breaks dynamic reconfiguration that channel.ts's `reload: { configPrefixes: ... }` advertises.
Fix
Make rate-limit state per-account (e.g. attach the limiter to the gateway instance returned by startAccount), and invalidate/replace the RateLimiter when rateLimitPerMinute changes. Either drop the global lastSendTime or scope it per (account, incomingUrl).
Body length cap is enforced by byte count but resolved as utf-8 string — multi-byte payloads may be silently truncated mid-character
extensions/synology-chat/src/webhook-handler.ts:26-44
Minor: the early-return inside the 'data' handler after req.destroy() does not stop subsequent 'data' chunks from being processed by other listeners or from triggering the 'end' resolve path if the destroy happens after end was queued. Because `resolve` and `reject` are both bound to the same Promise, only the first wins, so this is functionally safe; however the logic is fragile (chunks pushed after the size overflow are still in the array if more data arrives before destroy takes effect). Not a security issue but worth a guard variable to ignore further data once rejected.
Fix
Add a `rejected` flag set after reject() to short-circuit further 'data' processing; or attach a no-op handler after reject.
Trigger word stripping is case-sensitive and only matches at the very start, which differs from how Synology Chat dispatches trigger words (case-insensitive prefix with optional whitespace)
extensions/synology-chat/src/webhook-handler.ts:154-158
Synology Chat sends the original `text` field including the trigger word that matched. After sanitizeInput may replace patterns with [FILTERED], a trigger word like '!bot' at the start will still match — but if sanitizeInput modified the leading bytes (e.g. replaced 'you are now' that happened to begin the message), startsWith(trigger_word) silently fails and the trigger word ends up embedded in the agent prompt. Order should be: strip trigger word first, then sanitize. As coded, the order is sanitize → strip, which means the trigger word can leak into the agent's input if sanitization rewrote it.
Fix
Strip the trigger word before sanitizing, not after.
parseAllowedUserIds does not normalize entries (case/whitespace), but normalizeAllowEntry/normalizeEntry lowercases — allowlist comparisons in checkUserAllowed can miss matches
extensions/synology-chat/src/accounts.ts:14-21
There is a normalization mismatch. The pairing system normalizes added entries via `entry.toLowerCase().trim()`, so the allowlist as persisted may contain lower-cased IDs. But Synology Chat user_id values are numeric strings (per messaging.targetResolver.looksLikeId) — lowercasing a numeric string is a no-op, so the *actual* user_id flowing in is fine. HOWEVER, if a user manually configures `allowedUserIds: 'Admin,USER1'` in JSON, parseAllowedUserIds preserves case (trims only), while checkUserAllowed does an exact `.includes` against the raw payload.user_id (which Synology delivers as the original casing of username when user_id-style strings are used). Synology numeric IDs won't collide here, but non-numeric usernames will silently fail-closed (false-reject the legitimate user) or fail-open (if list is empty for that case). Worse, the documented pairing flow normalizes-to-lowercase but the runtime check does NOT lowercase the incoming user_id — so a paired entry 'user1' will never match an incoming user_id 'USER1', causing a silent authorization mismatch. Given the title mentions CVE-2026-31998 (authorization bypass), the more likely failure mode is the opposite: an admin-paired upper-case ID never enforces against an incoming lower-case payload, allowing access. Without seeing the pairing storage code we can't confirm direction, but the asymmetry itself is a real bug.
Fix
Normalize both sides in checkUserAllowed: `return allowedUserIds.map(s => s.toLowerCase()).includes(userId.toLowerCase())`; or apply parseAllowedUserIds → map(normalizeEntry) at config resolution time, and apply the same normalize to payload.user_id before comparison in the handler.
Async deliver after `respond(res, 200, ...)` swallows errors but always replies with a generic error to the user — and uses incomingUrl which may have failed; no circuit breaker
extensions/synology-chat/src/webhook-handler.ts:180-200
The error handler unconditionally sends a follow-up message to the user via sendMessage. If the error itself was caused by an unreachable incomingUrl, this will fail again and waste 3 retry attempts × 30s timeout each = up to ~90s of pending work per failed delivery. If many concurrent failures occur (e.g. NAS down), the process will queue many of these. Additionally, a 120s agent timeout means the outer respond() has already returned 200 — the user has been told 'Processing...' — and an error 120s later results in the apology message arriving up to 4 minutes late, by which time the user may have moved on. Worse, by then the agent may *also* have completed and sent its real reply, racing with the error message.
Fix
Guard the apology send behind a try/catch with its own short timeout; consider tracking deliver promise so a late success cannot fire after the timeout error path. Also consider a circuit breaker for repeated send failures.
Misleading comment: code claims 'Synology Chat user IDs are numeric' but the security checks compare them as strings without numeric validation
extensions/synology-chat/src/channel.ts:154-162
The comment 'Synology Chat user IDs are numeric' is presented as a security/contract claim but the webhook handler never validates that incoming user_id is numeric before using it in allowlist checks, rate-limit keys, and session keys. If Synology ever delivers a non-numeric user_id (or a malicious request bypasses Synology and posts directly to the webhook), the handler still processes it. This is a minor docs/code drift, but it interacts with the allowlist-mismatch issue (entries are not normalized) to widen the attack surface.
Fix
Either drop the comment, or enforce numeric-only user_id at parse time and reject non-numeric IDs with 400.
OpenAI
gpt-5
Allowlist authorization bypass: empty allowedUserIds treated as allow-all when dmPolicy="allowlist"
extensions/synology-chat/src/security.ts:25-30
With dmPolicy set to "allowlist", the expected behavior is deny-by-default unless the user is explicitly listed. The current implementation explicitly allows all users when the allowlist is empty, which bypasses authorization controls. The webhook handler relies on checkUserAllowed for the allowlist gate, so this logic permits any authenticated webhook request (valid token) from any user when allowedUserIds is not configured, defeating the allowlist.
Fix
Change allowlist semantics to deny-by-default when dmPolicy is "allowlist" and the list is empty. For example: - In checkUserAllowed, return false if allowedUserIds.length === 0. - Optionally, normalize IDs (trim/lowercase) consistently. - Update tests to reflect deny-by-default semantics. - Consider surfacing a configuration warning when dmPolicy is "allowlist" but allowedUserIds is empty (see second finding).
Missing security warning: allowlist policy with empty allowedUserIds is not flagged
extensions/synology-chat/src/channel.ts:115-137
Given the intended semantics of an allowlist, an empty list is highly likely to be a misconfiguration. The plugin already surfaces important security warnings; not warning about an empty allowlist under dmPolicy="allowlist" makes the misconfiguration easier to miss and directly contributes to the authorization bypass in finding #1.
Fix
Add a warning in collectWarnings when account.dmPolicy === "allowlist" and account.allowedUserIds.length === 0, e.g., "dmPolicy=allowlist but allowedUserIds is empty; no users will be allowed (deny-by-default)." After fixing finding #1 to deny-by-default, this clarifies the outcome and guides admins to add specific IDs.