fix(connect-evm): suppress switchChain `_failed` event on successful addEthereumChain recovery by adonesky1 · Pull Request #294 · MetaMask/connect-monorepo

adonesky1 · 2026-05-11T21:43:03Z

Stacked on top of #290. Targets wapi-1253-failure-reason-and-tracking-fixes; once #290 lands GitHub will retarget the base to main automatically and the diff will narrow to just the three files in this PR.

Summary

On the switchChain → "Unrecognized chain ID" → addEthereumChain recovery path, the EVM shim was firing four Mixpanel events for what is logically one chain change:

mmconnect_wallet_action_requested  (switch)
mmconnect_wallet_action_failed     (switch — swallowed by the shim, never surfaces to the dapp)
mmconnect_wallet_action_requested  (add)
mmconnect_wallet_action_succeeded  (add)

The middle _failed is misleading. The dapp called switchChain(chainId, chainConfiguration); the wallet didn't know the chain; the shim's recovery added it and switched; the dapp's promise resolves successfully. The user sees one chain change. Mixpanel saw a failure that never reached the dapp.

This PR hoists the recovery-eligibility check above the _failed track call so the failed event only fires when there's no recovery to attempt. Post-merge, the same successful-recovery flow emits three events:

mmconnect_wallet_action_requested  (switch)
mmconnect_wallet_action_requested  (add)
mmconnect_wallet_action_succeeded  (add)

Behavioural matrix

Scenario	Before (events fired)	After (events fired)
Switch succeeds	`switch _requested`, `switch _succeeded`	same
Switch fails, no `chainConfiguration`	`switch _requested`, `switch _failed`	same
Switch fails for non-"Unrecognized chain ID" reason (user rejection, etc.), even with `chainConfiguration`	`switch _requested`, `switch _failed`/`_rejected`	same
Switch "Unrecognized chain ID" + `chainConfiguration`, recovery succeeds	switch `_requested`, switch `_failed`, add `_requested`, add `_succeeded`	switch `_requested`, add `_requested`, add `_succeeded`
Switch "Unrecognized chain ID" + `chainConfiguration`, recovery itself fails	switch `_requested`, switch `_failed`, add `_requested`, add `_failed`	switch `_requested`, add `_requested`, add `_failed`

The suppression preserves the "exactly one _failed per logical chain change" invariant in the unhappy path too — the add's own _failed represents the failure of the whole flow.

Why not just remove the shim tracking entirely?

That was the framing of the closed #291, which claimed the multichain RequestRouter was tracking these methods at the wallet boundary so the shim was duplicating it. Source review showed that's not true — MetamaskConnectEVM.switchChain calls this.#request(...) → transport.sendEip1193Message(...), which does a raw postMessage (default transport) / dappClient.sendRequest({ name: 'metamask-provider', ... }) (MWP) directly. Those paths bypass RequestRouter.invokeMethod / wallet_invokeMethod and so bypass #withAnalyticsTracking entirely. The shim is the only analytics layer for the four EIP-1193 intercepted methods (wallet_switchEthereumChain, wallet_addEthereumChain, eth_requestAccounts / wallet_requestPermissions). #291 was closed for this reason; this PR is the correct, scoped fix for the one real intra-shim double-fire.

Recovering the lost signal

The pre-suppression switch _failed event was the only signal that told us "wallet X didn't know chain Y." Post-suppression this is still recoverable via a Mixpanel query: any wallet_addEthereumChain _requested event for which the same session has no preceding dapp-initiated addEthereumChain call is, by construction, a chain-switch-recovery attempt. Captured as backlog item (b) in the failure_reason follow-up plan.

Test plan

New test coverage in packages/connect-evm/src/connect.test.ts (describe('switchChain', ...)):

falls back to wallet_addEthereumChain when wallet_switchEthereumChain fails with "Unrecognized chain ID" and chainConfiguration is provided — extended to assert event names match [switch _requested, add _requested, add _succeeded] (no _failed).
still emits _failed for the switch when recovery is impossible (no chainConfiguration provided) — renamed from the old title; asserts [switch _requested, switch _failed].
rethrows non "Unrecognized chain ID" errors without falling back, even when chainConfiguration is provided — extended to assert one of _failed / _rejected still fires.
New: emits _failed for the add when recovery itself fails (suppression only applies on successful recovery) — asserts the unhappy-recovery path emits exactly one _failed (the add's), not two.

Test scaffolding:

mockCore extended with options.dapp, versions, transportType, and storage.getAnonId() so getWalletActionAnalyticsProperties runs end-to-end instead of being swallowed by #trackWalletAction*'s defensive try/catch. Confirmed by running the failing tests against the un-suppressed code — the first round of failures showed empty event arrays (analytics path silently throwing); after the mock fix, the failures show the actual switch _failed between switch and add, exactly the bug.
vi.spyOn(analytics, 'track') set up in beforeEach after the session-establishment events fire so test assertions only see calls made during the test body.

Validation:

yarn workspace @metamask/connect-evm test --run — 64 passed.
yarn workspace @metamask/connect-multichain test --run — 354 passed, 4 skipped (no regression from the mock change).
yarn build — all 11 tasks succeed.
yarn lint:eslint — clean.
yarn changelog:validate — clean.

Spike: WAPI-1253.
Sibling PRs from the same spike:
- feat(analytics): attach failure_reason to _failed events + fix isRejectionError #290 — failure_reason plumbing + classifier hardening. This PR is stacked on it.
- fix(connect-multichain): three isRejectionError correctness fixes (unwrap, drop 4100, tighten "user" heuristic) #292 — isRejectionError correctness fixes (independent).
- feat(connect-multichain): attach error_code and error_message_sample on _failed events #293 — error_code + error_message_sample companions (stacked on feat(analytics): attach failure_reason to _failed events + fix isRejectionError #290).
Closed: refactor(connect-evm): stop double-tracking EVM intercepted methods #291 — earlier "EVM-shim dedupe" attempt, retracted after source review.

…sRejectionError` Two bundled changes from the WAPI-1253 spike that share scaffolding (both reuse `getUnwrappedErrorDetails`): 1. **Attach `failure_reason` to `_failed` events.** Both `mmconnect_wallet_action_failed` and `mmconnect_connection_failed` now carry a producer-side tag describing why they fired — `transport_timeout`, `transport_disconnect`, `wallet_method_unsupported`, `wallet_invalid_params`, `wallet_internal_error`, `wallet_custom_error`, `no_active_session`, `unrecognised_chain`, `rpc_node_http_error`, `rpc_node_request_error`, `rpc_node_response_error`, or `unknown`. The schema-side change lives in: consensys-vertical-apps/metamask-sdk-analytics-api#31 Note: the `rpc_node_*` branches exist in the classifier but no producer currently emits them — read-only RPC failures still throw without tracking. A follow-up PR may wire them up. 2. **Unwrap `RPCInvokeMethodErr` in `isRejectionError`.** The router re-throws wallet errors wrapped in `RPCInvokeMethodErr`, whose outer `code` is the SDK's static `53`. Without unwrapping, wallet-side `code: 4001` rejections fell through to the brittle message-substring fallback, and any rejection whose message didn't happen to contain "reject"/"denied"/"cancel" was misclassified as a failure. Also tightened the "user" message heuristic so unrelated phrases like "user operation reverted" (Account Abstraction) no longer count as rejections. Tests: - New `analytics.test.ts` covering `isRejectionError` (including the `RPCInvokeMethodErr` unwrap and the tightened "user" heuristic) and every branch of `classifyFailureReason`. - New `requestRouter.test.ts` assertions for `failure_reason` on wallet internal errors and transport timeouts, plus the regression test that wallet-side 4001 with a non-matching message now lands in `_rejected`. Schema mirror: hand-edited `packages/analytics/src/schema.ts` to add the new optional property. Regenerating via `openapi-typescript@7` would have produced a 1k-line diff (the file was generated by an older version that emitted `type` aliases instead of `interface`s); hand-edit keeps the diff tight and matches the existing style. Related: WAPI-1253. Sibling PRs from the same spike: - #291 — dedupe EVM shim tracking. - #292 — `isRejectionError` unwrap fix on its own (this PR carries the same fix as a dependency; whichever lands second is a no-op).

…nd bench Follow-up to manual testing of the failure_reason classifier. Two behaviour fixes in `classifyFailureReason` / `isRejectionError`, plus moves the test-bench bits into a permanent home in the playground. Classifier: - Drop `code === 4100` from `isRejectionError`. EIP-1193 `4100 Unauthorized` is what MetaMask returns when a method is not in the granted CAIP-25 scope — the multichain permission layer rejects it before any method handler runs (see core/packages/ multichain-api-middleware/src/handlers/wallet-invokeMethod.ts). That's a permission/support signal, not a user-driven rejection. Misclassifying it inflated `_rejected` and hid genuine permission issues from `_failed`. After this change, dapp calls for methods outside the granted scope correctly surface as `_failed` with `failure_reason: wallet_unauthorized`. - Add `wallet_unauthorized` to the `FailureReason` union and route `4100` to it. Distinct from `wallet_method_unsupported` because the wallet never gets to see whether the method exists. - Add explicit branches for `4200 → wallet_method_unsupported` (wallet handler exists but explicitly refuses) and `4902 → unrecognised_chain` (`wallet_switchEthereumChain` to an unknown chain) so we don't lose signal to the catch-all `wallet_custom_error` range. Playground: - Rename `FailureReasonTestBench` → `AnalyticsTestBench` and wrap in a `<details>` so it's a permanent, collapsible section. No longer gated on `isConnected` — it always renders so testers can also exercise the `no_active_session` branch. - Use the first connected EVM account from `session.sessionScopes` for the `personal_sign` rejection sanity-check. Calling `personal_sign(msg, 0x0000…)` made the wallet bail with `-32602` before showing the prompt, defeating the rejection check. - Move the echo server from the temp `wapi-1253-test-rig/` dir into `playground/browser-playground/scripts/analytics-echo-server.mjs` and wire it as `yarn analytics:echo`. - Document the full manual-testing workflow in the playground README (echo server + `METAMASK_ANALYTICS_ENDPOINT` env var + CSP gotcha + the multichain scope behaviour that explains why so many buckets funnel into `wallet_unauthorized`).

`wallet_custom_error` was the catch-all for any EIP-1193 / EIP-1474 provider-defined code in the 1000-4999 range we hadn't carved out into its own bucket. With named branches now in place for 4001 (handled via `isRejectionError`), 4100, 4200, and 4902, the remaining "we don't know" cases are conceptually identical to `unknown` — having two buckets for the same signal adds noise without insight. After this: - Unrecognised provider codes (e.g. 4900 Disconnected) fall through to `unknown` instead of `wallet_custom_error`. - The catch-all `1000 <= code <= 4999` branch is removed. - Future codes can be promoted from `unknown` into named buckets by adding new branches above the fallback — no schema migration needed since `failure_reason` is still an open string. Updated: - Test for code 4900 now asserts `unknown` instead of `wallet_custom_error`. - Playground `ExpectedBucket` union drops the entry. - CHANGELOG entry reworded to reflect the tighter taxonomy.

The `no_active_session` classifier branch matched the SDK-thrown sentinel string `'No active session found'`, raised by `RequestRouter` and `Multichain` when `getActiveSession()` returns `null`. Both occurrences live inside `setTimeout` callbacks in the deeplink-open path: setTimeout(async () => { const session = await this.transport.getActiveSession(); if (!session) { throw new Error('No active session found'); // unhandled } ... }, 10); That throw is unhandled — it doesn't reach `#withAnalyticsTracking`'s try/catch and never produces a `mmconnect_wallet_action_failed` event. The classifier branch was dead from day one. The real pre-connect signal — `invokeMethod` called before `Transport not initialized, establish connection first` — is thrown from the `transport` getter, *outside* the `RequestRouter`'s analytics wrapper, so it's also never tracked. Wiring up pre-flight tracking is deliberately out of scope for this PR (see "Out of scope" in the PR body). Drop the bucket from the `FailureReason` union, the classifier branch, the test, the CHANGELOG entry, and the corresponding button in the playground analytics test bench. Pre-flight failures will still be diagnosable when we wire them up; we'll add the bucket back at the same time. Keeping it now just risks reviewers (or future us) trying to trigger it and finding it doesn't fire.

…cket - Inspect wallet JSON-RPC code BEFORE the transport_disconnect substring heuristic so a wallet `{ code: 4900, message: 'Disconnected' }` no longer leaks into transport_disconnect. - Narrow the transport_disconnect substring set ('transport disconnect', 'connection lost', 'socket closed', 'not connected') so it doesn't snag wallet errors that happen to mention "disconnect". - Drop session_expired from FailureReason — no producer in the monorepo throws an error matching its message regex, the branch was defensive-only. - Document connection-side error shapes in a new test block so the classifier's current behaviour on `transport.connect()` catch errors ('Transport not initialized', 'Existing connection is pending', etc.) is pinned and visible.

- analytics.test.ts: re-wrap two long lines per prettier (mechanical formatting only, no test behaviour changes). - analytics-echo-server.mjs: auto-fix curly, prefer-destructuring, prefer-template, hashbang, and other surface-level lint errors; inline-disable n/no-process-env for the documented PORT override; rename catch binding err -> parseError (id-denylist). - CHANGELOGs: append #290 PR links to every Unreleased bullet under packages/analytics and packages/connect-multichain (required by the check_changelog workflow) and add an Unreleased block to playground/browser-playground for the Analytics test bench + echo server additions. No production code changes.

Collapse iterative-development bullets in the Unreleased sections of @metamask/connect-multichain, @metamask/analytics, and @metamask/browser-playground down to what an atomic reader needs: - connect-multichain: 1 Added + 4 Fixed -> 1 Added + 3 Fixed. Dropped the session_expired-removal bullet (no producer ever emitted it, so it's not a user-facing change worth documenting) and the standalone 4900 / transport_disconnect ordering bullet (covered by the classifier description in the Added entry). - analytics: trim the schema-mirror bullet, drop the verbose taxonomy enumeration (it lives in the producer changelog and the schema PR). - browser-playground: collapse the two test-bench bullets into one. No code or behaviour change.

Narrow this PR to just the `failure_reason` taxonomy + classifier work. The two `isRejectionError` correctness fixes (unwrap RPCInvokeMethodErr, drop 4100 from the rejection set, tighten the "user" message heuristic) move to #292 so the two pieces can be reviewed independently. isRejectionError is now byte-identical to main on this branch. The new wallet_unauthorized classifier bucket is forward-compatible dead code until #292 lands and stops routing 4100 to _rejected. Drop the corresponding tests: the isRejectionError describe block in analytics.test.ts and the "code 4001 wrapped in RPCInvokeMethodErr lands in _rejected" regression test in requestRouter.test.ts (both belong to #292's surface area). CHANGELOG: drop the three isRejectionError Fixed bullets — they're covered by #292's changelog. No behavior change vs main for isRejectionError.

Same dead-bucket cleanup we did for no_active_session and session_expired. The classifier had RPCHttpErr / RPCReadonlyRequestErr / RPCReadonlyResponseErr branches mapping to rpc_node_http_error / rpc_node_request_error / rpc_node_response_error, but no producer emits any of them — handleWithRpcNode in RequestRouter throws without ever calling analytics.track. They were unreachable from day one. Drop the three FailureReason union members, the three classifier branches, the three matching tests, and the three error-class imports. Also drop the "Not currently tracked" disclaimer block in the playground test bench (used to call them out as supported-but- not-wired) and update the analytics package schema description to swap the misleading `rpc_node_error` example for `wallet_unauthorized`. When a producer wires up handleWithRpcNode tracking, we will add the buckets back at the same time. Keeping them now just risks reviewers (or future us) thinking they fire and being confused when they never do.

…addEthereumChain recovery When `wallet_switchEthereumChain` throws "Unrecognized chain ID" and the caller supplied a `chainConfiguration`, the shim catches the error and transparently calls `#addEthereumChain` — from the dapp/user's perspective this is still one logical chain change with one outcome. The previous behaviour fired `mmconnect_wallet_action_failed` for the swallowed switch attempt before the recovery ran, producing four Mixpanel events (`switch _requested`, `switch _failed`, `add _requested`, `add _succeeded`) for what should be three. Hoist the recovery-eligibility check above the `_failed` track call so the event only fires when there's no recovery to attempt: - No `chainConfiguration` provided → switch `_failed` still fires (caller sees the rejection too). - Error is not "Unrecognized chain ID" → switch `_failed`/`_rejected` still fires. - Error is "Unrecognized chain ID" + `chainConfiguration` is set → fall through to `#addEthereumChain`, whose own `_succeeded` / `_failed` represents the whole flow. Net effect: exactly one `_failed` per logical chain change, regardless of which RPC method the wallet rejected. Tests: - New `analytics.track` spy on the switchChain test block (the existing tests only asserted network calls, not analytics surface). - Asserts the suppression on the recovery-success path (3 events, not 4). - Asserts `_failed` still fires when no chainConfiguration is provided. - Asserts `_failed`/`_rejected` still fires for non-"Unrecognized chain" errors with chainConfiguration. - New test: when recovery itself fails, suppression of the switch event is preserved and only the add's `_failed` fires (1 `_failed`, not 2). Mock core extended with `options.dapp`, `versions`, `transportType`, and `storage.getAnonId()` so `getWalletActionAnalyticsProperties` can run end-to-end instead of being swallowed by `#trackWalletAction*`'s defensive try/catch. Related: WAPI-1253. The closed #291 conflated this fix with a non-existent shim-vs-router duplication; this PR addresses only the real intra-shim double-fire.

The base branch was changed.

adonesky1 added 10 commits May 11, 2026 12:37

adonesky1 requested a review from a team as a code owner May 11, 2026 21:43

wenfix previously approved these changes May 13, 2026

View reviewed changes

Base automatically changed from wapi-1253-failure-reason-and-tracking-fixes to main May 14, 2026 20:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(connect-evm): suppress switchChain `_failed` event on successful addEthereumChain recovery#294

fix(connect-evm): suppress switchChain `_failed` event on successful addEthereumChain recovery#294
adonesky1 wants to merge 10 commits into
mainfrom
wapi-1253-chain-switch-recovery-dedupe

adonesky1 commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

adonesky1 commented May 11, 2026

Summary

Behavioural matrix

Why not just remove the shim tracking entirely?

Recovering the lost signal

Test plan

Related

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants