ADR: fix mobile Supabase session restore race on cold-start

gym_app users with a valid persisted Supabase session intermittently land on /auth/welcome on cold-start, and cold-start push taps lose their destination. Four compounding defects in tktspace-mobile-app (RC-1..RC-4 in the spec, lines 32-55) cause the race: AuthService filters out AuthChangeEvent.initialSession; push_notifications_service never drains _pendingTap because that event is filtered; AuthInterceptor calls logout() on any 401, including transient ones fired during the hydration window, which trips Supabase’s single-use refresh-token rule and permanently destroys the session; and route rendering proceeds during the hydration window with stale auth state. This ADR locks the architecture for the fix.

Five coordinated changes across two packages and one app:

  1. Make hydration deterministic: AuthService.init() awaits SupabaseAuth.instance.initialSession (the documented Dart-SDK hydration Future) and flips isHydrated = true (a ValueNotifier<bool>) before returning.
  2. Stop filtering AuthChangeEvent.initialSession and AuthChangeEvent.tokenRefreshed — emit notifyListeners() on both so refreshListenable: auth re-evaluates and downstream listeners (push) fire.
  3. AuthInterceptor queues 401s that arrive while isHydrated == false and replays them after hydration; only post-hydration 401s feed the existing refresh-or-logout flow.
  4. PushNotificationsService._onAuthChanged drains _pendingTap on the now-visible initialSession event (free side-effect of decision 2).
  5. Add a /_hydrating GoRoute and a top-level redirect that parks every cold-start request there until isHydrated flips, then routes to the original target via a next= query parameter.

A. Flip isHydrated on the first AuthChangeEvent after init() returns. Rejected — non-deterministic. If no session is persisted, gotrue does not emit any event, so isHydrated would never flip and the splash would hang.

B. Timer-based fallback (flip isHydrated after 2s if no event). Rejected as a band-aid: still races on slow disk / cold boot, and trades a permanent bug for an intermittent one.

C. Drop the 401 → logout behaviour entirely. Rejected — would let truly stale sessions fester. We need the existing logout flow once we are confident the session is no longer transient. The queue-and-replay design preserves it for post-hydration 401s.

D. Unify /_hydrating with the just-merged /auth/callback route (both show a spinner). Rejected — they encode two different state machines (disk-restore vs PKCE exchange) reached from different entry points. Merging them would force one screen to implement both flows. See D6 below.

E. Move the hydration splash into per-route loaders instead of a router-level redirect. Rejected — every authenticated route would need to duplicate the gate, and unauthenticated routes still need to wait before deciding whether to redirect.

N/A — spec frontmatter has surfaces: []; no contract change.

N/A — spec frontmatter has surfaces: []; contracts/*.openapi.yaml are not touched.

N/A — no schema change.

N/A — backend untouched.

N/A — tktspace-business, tktspace-web, and tktspace-landing are not affected.

Affected app: tktspace-mobile-app/apps/gym_app only. tickets_app is out of scope.

Affected shared packages: auth, notifications.

D1. Hydration signal — explicit initialSession await in AuthService.init()

After await Supabase.initialize(...) (currently at packages/auth/lib/src/auth_service.dart:60), await SupabaseAuth.instance.initialSession before init() returns. This is the documented Dart-SDK hydration Future on supabase_flutter 2.12.4 / gotrue 2.20.0 (see CHANGELOG.md:729 and supabase_flutter-2.12.4/test/initialization_test.dart). Then flip a new ValueNotifier<bool> isHydrated to true. This is deterministic: it does not race with the async AuthChangeEvent.initialSession event stream, and it works whether or not a session is persisted (the Future resolves either way; with no persisted session it resolves to null).

NOTE: the synchronous getter Supabase.instance.client.auth.currentSession returns the post-restored session AFTER initialSession has resolved, and is useful for synchronous reads later in the app’s lifecycle (e.g. inside _handleAuthChange). It does NOT, by itself, signal that hydration has completed — awaiting initialSession is what unblocks the hydration gate.

The await is wrapped in a 5-second hard timeout to defend against a flaky network / disk read hanging the splash forever:

final restored = await SupabaseAuth.instance.initialSession
    .timeout(const Duration(seconds: 5), onTimeout: () => null);
// `restored` is either the persisted Session, or null (no session / timeout).
// Either way, hydration is done and the router can make a routing decision.
isHydrated.value = true;

On timeout: isHydrated flips to true with Supabase.instance.client.auth.currentSession == null, so authGuard routes the user to /auth/welcome (the same fallback as “no session was ever persisted”). The user has to re-auth — acceptable degraded behaviour vs. an infinite splash.

AuthService continues to extend ChangeNotifier; isHydrated is exposed as a public ValueNotifier<bool> so GoRouter, the interceptor, and tests can read it without depending on AuthService’s listener semantics.
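
The unit tests for this decision need to drive both the "session restored" and the "timed out" branch without a real Supabase client. A minimal sketch of such a seam, assuming the restore Future is injected: `HydrationGate`, its type parameter `S` (standing in for Supabase's Session type) and the plain `bool` (standing in for `ValueNotifier<bool>`, to avoid a Flutter import) are all hypothetical names, not part of AuthService.

```dart
import 'dart:async';

/// Hypothetical stand-in for the hydration step of AuthService.init().
/// `S` plays the role of the Session type; a plain bool replaces
/// ValueNotifier<bool> so the sketch runs without Flutter.
class HydrationGate<S> {
  bool isHydrated = false;
  S? restored;

  /// Awaits [restore] for at most [limit], then opens the gate.
  /// A timeout is treated exactly like "no persisted session" (null).
  Future<void> hydrate(
    Future<S?> restore, {
    Duration limit = const Duration(seconds: 5),
  }) async {
    restored = await restore.timeout(limit, onTimeout: () => null);
    isHydrated = true;
  }
}
```

Passing a Future that never completes, together with a short `limit`, exercises the timeout branch; passing `Future.value(...)` exercises the restore branch. In both cases the gate ends up open.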

D2. Stop filtering initialSession and tokenRefreshed in _handleAuthChange

The current switch (event.event) at auth_service.dart:68-82 has default: break; swallowing AuthChangeEvent.initialSession and AuthChangeEvent.tokenRefreshed. Replace with explicit cases that call notifyListeners() (and nothing else). initialSession is the post-restore signal; tokenRefreshed is the rotation signal — both should rebuild the router guard so a freshly-refreshed JWT propagates to interceptors and listeners (PushNotificationsService._onAuthChanged).

The existing comment at auth_service.dart:62-65 (“Filter events: tokenRefreshed/initialSession/passwordRecovery don’t change auth/UI state — notifying on them churns GoRouter’s refreshListenable mid-frame and triggers a GlobalKey collision”) needs to be reconciled. The GlobalKey collision was caused by two parallel fetchMe() calls (the de-dup at auth_service.dart:19 already fixes that), not by the notify itself. We will:

  • Keep the _fetchMeFuture de-dup.
  • NOT call fetchMe() on initialSession (only signedIn triggers a fetch — initialSession reuses whatever is cached).
  • Emit notifyListeners() on initialSession and tokenRefreshed. This is safe because the hydration splash (D5) ensures the router is parked on /_hydrating during the first notify, so no shell navigator is mounted yet.
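
The intended per-event behaviour can be summarised as a small decision table. This is a sketch only: `AuthEvent` is a local stand-in for gotrue's AuthChangeEvent that lists just the values discussed here, `reactionFor` is a hypothetical helper, and the `signedOut` row is an assumption (the ADR does not describe that case).

```dart
/// Local stand-in for gotrue's AuthChangeEvent (subset only).
enum AuthEvent {
  initialSession,
  signedIn,
  signedOut,
  tokenRefreshed,
  passwordRecovery,
}

/// What _handleAuthChange does for one event (hypothetical shape).
typedef AuthReaction = ({bool notify, bool fetchMe});

AuthReaction reactionFor(AuthEvent event) => switch (event) {
      // Only signedIn triggers fetchMe(); the de-dup still guards it.
      AuthEvent.signedIn => (notify: true, fetchMe: true),
      // Assumption: sign-out notifies so the guard can redirect.
      AuthEvent.signedOut => (notify: true, fetchMe: false),
      // D2: no longer filtered, but reuses whatever profile is cached.
      AuthEvent.initialSession => (notify: true, fetchMe: false),
      AuthEvent.tokenRefreshed => (notify: true, fetchMe: false),
      // Still filtered, as in the existing comment.
      AuthEvent.passwordRecovery => (notify: false, fetchMe: false),
    };
```

Because the switch is exhaustive over the enum, adding a new event value forces an explicit decision instead of falling into a silent `default: break;`.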

D3. AuthInterceptor request queue during cold-start window

packages/auth/lib/src/auth_interceptor.dart:30-32 currently calls _auth.logout() on any 401 while authenticated. Replace with:

  • If _auth.isHydrated.value == false AND the response is 401: enqueue the request into an in-memory list owned by the interceptor. Each entry is the Chopper counterpart of Dio's RequestOptions: the original Request plus a Completer<Response<BodyType>>.
  • Register a one-shot listener on _auth.isHydrated: when it flips to true:
    • If _auth.isAuthenticated == true → drain the queue by re-issuing each request via chain.proceed(request) with the now-fresh Authorization header re-applied (see below).
    • If _auth.isAuthenticated == false (hydration finished but no session — fresh install, timeout, or rejected session) → DROP the queue: complete each queued Completer with a synthetic Response carrying status 408 Request Timeout and a tktspace_auth: not_ready reason header. Do NOT replay: every replayed request would 401 again and uselessly trigger the logout flow, churning the UI.
  • If the replayed request STILL 401s after hydration → that’s a legitimate auth failure → follow the existing logout() flow.

Re-applying the token on replay: the replay must read _auth.accessToken AT REPLAY TIME, not capture it at enqueue time (the whole point is that the token rehydrates between enqueue and replay). The interceptor will re-construct the request with applyHeader(request, 'Authorization', 'Bearer ${_auth.accessToken}').

Queue cap: 50 entries. If exceeded, the oldest entry completes with a synthetic Response carrying status 408 Request Timeout (the request couldn’t complete within a reasonable window because auth wasn’t ready — semantic match for 408; 503 would imply the server is unavailable, which is not true) and a tktspace_auth: not_ready reason header so the UI can render a graceful error instead of hanging.

isAuthenticated short-circuit: preserved for post-hydration 401s (matches current line 30 semantics).
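
The queue semantics above (cap with oldest-first eviction, replay when authenticated, drop with a synthetic response when not) can be modelled independently of Chopper. The sketch below is illustrative: `HydrationQueue`, `notReady` and the `Req`/`Res` type parameters are hypothetical names standing in for Chopper's Request/Response types, and the interceptor wiring is left out.

```dart
import 'dart:async';
import 'dart:collection';

/// One parked request plus the completer its caller is awaiting.
class _Parked<Req, Res> {
  _Parked(this.request);
  final Req request;
  final Completer<Res> completer = Completer<Res>();
}

/// Hypothetical in-memory queue for 401s seen before hydration.
class HydrationQueue<Req, Res> {
  HydrationQueue({required this.notReady, this.cap = 50});

  /// Builds the synthetic "auth not ready" response (408 in this ADR).
  final Res Function(Req request) notReady;
  final int cap;
  final Queue<_Parked<Req, Res>> _items = Queue();

  int get length => _items.length;

  /// Parks [request]; once the cap is exceeded the oldest entry is evicted.
  Future<Res> enqueue(Req request) {
    final parked = _Parked<Req, Res>(request);
    _items.addLast(parked);
    if (_items.length > cap) {
      final oldest = _items.removeFirst();
      oldest.completer.complete(notReady(oldest.request));
    }
    return parked.completer.future;
  }

  /// Authenticated after hydration: replay every entry in order. [replay]
  /// must read the access token when it runs, not when the request was parked.
  Future<void> drain(Future<Res> Function(Req request) replay) async {
    while (_items.isNotEmpty) {
      final parked = _items.removeFirst();
      try {
        parked.completer.complete(await replay(parked.request));
      } catch (error, stack) {
        parked.completer.completeError(error, stack);
      }
    }
  }

  /// Not authenticated after hydration: fail every entry without replaying.
  void drop() {
    while (_items.isNotEmpty) {
      final parked = _items.removeFirst();
      parked.completer.complete(notReady(parked.request));
    }
  }
}
```

In the interceptor, the one-shot isHydrated listener would call `drain` when `_auth.isAuthenticated` is true and `drop` otherwise. Keeping the token lookup inside the `replay` callback is what guarantees the header is built at replay time.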

D4. Push pendingTap drain on initialSession

packages/notifications/lib/src/push_notifications_service.dart:201-219 (_onAuthChanged) is wired via _auth.addListener(...) at line 170. Once D2 flips notifyListeners() for initialSession, the existing _onAuthChanged already drains _pendingTap when _auth.isAuthenticated == true. No code change in PushNotificationsService is strictly required — but the listener registration MUST happen before AuthService.init() returns so the first notify is not missed.

Boot order invariant (Order A — chosen): PushNotificationsService.init() MUST run and subscribe its listener BEFORE AuthService.init() is awaited. This guarantees the listener is attached before initialSession notify fires inside await authService.init(). No caching of “did initialSession fire” is needed in AuthService.

Canonical snippet for apps/gym_app/lib/main.dart:

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  final authService = AuthService();
  final pushService = PushNotificationsService(authService);
  // Order A: push subscribes FIRST, then auth hydrates.
  await pushService.init(); // registers _onAuthChanged via authService.addListener
  await authService.init(); // awaits SupabaseAuth.instance.initialSession,
                            // emits notifyListeners() on initialSession,
                            // which fires _onAuthChanged with the listener already attached.
  runApp(MyApp(auth: authService, push: pushService));
}

Order B (AuthService caches “did initialSession fire” as a flag, late subscribers receive it on attach) was considered and rejected as more complex with no upside given the boot is fully controlled by us.

Idempotency: the drain is already guarded on _pendingTap == true (line 204), so no edit is needed.

D5. Add to apps/gym_app/lib/router/app_router.dart:

  • A new GoRoute(path: '/_hydrating', builder: ...) rendering a Scaffold with a centered CircularProgressIndicator (same visual as the existing /auth/callback builder at line 149-159, intentionally — see D6).
  • GoRouter.refreshListenable must trigger on isHydrated changes too. Wrap auth and auth.isHydrated into a Listenable.merge([...]) and pass that to refreshListenable.

Redirect chain — canonical order (Pattern A: normalise first). Pattern A is chosen over Pattern B because normalisation runs uniformly in a single place (step 1), and the post-hydration unparking (step 3) only deals with bare paths in next=, never scheme-prefixed URIs. Pattern B (post-hydration re-normalise) was rejected — it duplicates normalisation logic across two redirect stages and lengthens the chain.

The composed redirect (replacing _composedRedirect or wrapping it) runs in this order:

redirect(BuildContext, GoRouterState state):
  1. SCHEME NORMALISER (always first, regardless of hydration state)
     If state.uri.toString() matches the scheme-prefixed pattern
     (e.g. 'com.fitspace.client.app://auth/callback?code=...')
       → return normalised bare path (e.g. '/auth/callback?code=...')
     // After this step, `next=` and all subsequent comparisons see bare paths only.
  2. HYDRATION PARK
     If !auth.isHydrated.value AND state.uri.path != '/_hydrating'
       → return '/_hydrating?next=' + Uri.encodeQueryComponent(state.uri.toString())
     // state.uri is now guaranteed bare (step 1), so `next=` is bare too.
  3. HYDRATION UNPARK
     If auth.isHydrated.value AND state.uri.path == '/_hydrating':
       final next = state.uri.queryParameters['next'];
       if (next != null && next.isNotEmpty) return next;
       if (auth.isAuthenticated) return '/home/main';
       return '/auth/welcome';
  4. AUTH GUARD (existing logic, unchanged)
     Delegates to authGuard(state) for protected routes.
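
The four steps can also be expressed as a pure function over a `Uri`, which makes the chain testable without a GoRouter instance. This is a sketch under assumptions: `composedRedirect` here takes plain booleans instead of the real AuthService, the `authGuard` callback signature is hypothetical, and the normaliser assumes that any URI with a scheme is the app's custom scheme with the first path segment in the host position (as in the example URI).

```dart
/// Hypothetical pure-Dart model of the four-step chain. Returns the redirect
/// target, or null to stay on the current location (GoRouter's convention).
String? composedRedirect(
  Uri uri, {
  required bool isHydrated,
  required bool isAuthenticated,
  String? Function(Uri uri, bool isAuthenticated)? authGuard,
}) {
  const hydrating = '/_hydrating';

  // 1. Scheme normaliser: 'scheme://auth/callback?code=x' becomes
  //    '/auth/callback?code=x' (the host holds the first path segment).
  if (uri.hasScheme) {
    return Uri(
      path: '/${uri.host}${uri.path}',
      query: uri.hasQuery ? uri.query : null,
    ).toString();
  }

  // 2. Hydration park: remember the bare target in `next=`.
  if (!isHydrated && uri.path != hydrating) {
    return '$hydrating?next=${Uri.encodeQueryComponent(uri.toString())}';
  }

  // 3. Hydration unpark: replay `next=` or fall back by auth state.
  if (isHydrated && uri.path == hydrating) {
    final next = uri.queryParameters['next'];
    if (next != null && next.isNotEmpty) return next;
    return isAuthenticated ? '/home/main' : '/auth/welcome';
  }

  // 4. Auth guard (existing logic); null means "no redirect".
  return authGuard?.call(uri, isAuthenticated);
}
```

Calling it repeatedly with each returned location reproduces the redirect re-runs: a not-yet-hydrated call on `/home/checkin` parks at `/_hydrating?next=%2Fhome%2Fcheckin`, a second call on that location returns null (the spinner stays), and the same call with `isHydrated: true` returns `/home/checkin`.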

Walk-through — scheme-prefixed cold-start:

Incoming URI: com.fitspace.client.app://auth/callback?code=abc123. Hydration: not yet complete.

  • Step 1 normalises to /auth/callback?code=abc123. Redirect returned.
  • Redirect chain re-runs with state.uri == '/auth/callback?code=abc123'.
  • Step 1 no-op (already bare).
  • Step 2 fires: !isHydrated && path != '/_hydrating' → redirect to /_hydrating?next=%2Fauth%2Fcallback%3Fcode%3Dabc123.
  • Redirect chain re-runs with state.uri == '/_hydrating?next=...'.
  • Step 1 no-op. Step 2 no-op (path IS /_hydrating). Step 3 no-op (not hydrated yet). Step 4 no-op (route is public).
  • User sees the spinner. await SupabaseAuth.instance.initialSession resolves.
  • isHydrated flips → refreshListenable fires → redirect chain re-runs.
  • Step 1 no-op. Step 2 no-op (path IS /_hydrating). Step 3 fires: pops next=/auth/callback?code=abc123 → redirect.
  • Redirect chain re-runs with state.uri == '/auth/callback?code=abc123'. Step 1 no-op. Step 2 no-op (hydrated). Step 3 no-op (path is not /_hydrating). Step 4 evaluates — /auth/callback is a public route. Route renders the PKCE-exchange spinner.

Walk-through — bare-path cold-start (push notification deep link, app launched from a tap on /home/checkin):

Incoming URI: /home/checkin. Hydration: not yet complete.

  • Step 1 no-op (already bare).
  • Step 2 fires: redirect to /_hydrating?next=%2Fhome%2Fcheckin.
  • Spinner renders. Hydration resolves.
  • isHydrated flips → redirect re-runs.
  • Step 3 pops next=/home/checkin → redirect.
  • Redirect re-runs on /home/checkin. Step 4 (authGuard) evaluates: if authenticated, allow; if not, redirect to /auth/welcome.

D6. Coordination with Bug B (just-merged fix-mobile-auth-callback-route)

The /auth/callback GoRoute (already in app_router.dart:149-159, post Bug B / MR !26) renders a CircularProgressIndicator while supabase_flutter does the PKCE exchange. With D5 in place, /_hydrating and /auth/callback render identical UI but are logically different screens:

  • /_hydrating — cold-start entry point, parked here before any auth state is known, drained after SupabaseAuth.instance.initialSession resolves.
  • /auth/callback — reached mid-OAuth flow after /auth/login, parked here while supabase_flutter consumes the PKCE code query parameter and emits signedIn.

These MUST stay separate. Merging them would conflate two state machines and require one route to know about both entry conditions. The visual duplication is intentional and cheap.

packages/auth/lib/src/auth_guard.dart already lost the /auth/callback short-circuit in Bug B (see the comment at line 17-22). With D5’s /_hydrating redirect running at the GoRouter redirect: level BEFORE the guard, the guard itself does not need changes — the hydration redirect captures every cold-start route before authGuard evaluates auth.isAuthenticated. AC-7 (spec line 100-104) is automatically satisfied.

D8. Unit tests (no simulator):

  • packages/auth/test/auth_service_test.dart
    • notifyListeners() fires when the mocked Supabase auth stream emits AuthChangeEvent.initialSession.
    • notifyListeners() fires on AuthChangeEvent.tokenRefreshed.
    • isHydrated.value is false immediately after construction and true after init() resolves. Test stub mocks Supabase.initialize and stubs SupabaseAuth.instance.initialSession to a Future<Session?> (either resolving to a fake Session or to null) — both branches must flip isHydrated to true.
    • Timeout branch: stub SupabaseAuth.instance.initialSession to a Future that never resolves; assert isHydrated.value == true and currentSession == null after the 5-second timeout (use fakeAsync to fast-forward).
  • packages/auth/test/auth_interceptor_test.dart
    • 401 with isHydrated == false enqueues the request and does NOT call _auth.logout().
    • Flipping isHydrated to true with isAuthenticated == true drains the queue: each queued request is re-proceeded with the current _auth.accessToken.
    • Flipping isHydrated to true with isAuthenticated == false DROPS the queue: each queued Completer completes with a synthetic 408 response, and _auth.logout() is NOT called.
    • 401 after hydration completes triggers _auth.logout() (existing behaviour preserved).
    • Queue cap of 50: 51st enqueue completes the oldest with status 408 Request Timeout.
  • packages/notifications/test/push_notifications_service_test.dart
    • _pendingTap == true + a fake AuthService flipping isAuthenticated from false to true via notifyListeners() (simulating initialSession) calls onNotificationTap.

Integration test (simulator, may be CI-only):

  • apps/gym_app/integration_test/cold_start_session_restore_test.dart
    • Pre-populates SharedPreferences with a valid Supabase session blob. supabase_flutter persists the session under the key supabase.auth.token with a JSON-stringified value containing refresh_token, access_token, expires_at, token_type, user, etc. Canonical seed:

      final testSession = {
        'access_token': 'fake-jwt-...',
        'refresh_token': 'fake-refresh-...',
        'expires_at': DateTime.now().add(const Duration(hours: 1)).millisecondsSinceEpoch ~/ 1000,
        'token_type': 'bearer',
        'user': {'id': 'fake-uuid', /* ... */},
      };
      SharedPreferences.setMockInitialValues({
        'supabase.auth.token': jsonEncode(testSession),
      });
    • Boots gym_app via main_tktspace.dart.

    • Asserts the user lands on /home/main (not /auth/welcome).

    • Tagged @Tags(['simulator']) per spec line 154.

Router redirect-loop guard:

  • apps/gym_app/test/router/hydration_redirect_test.dart — assert that with isHydrated == false, navigating to /_hydrating does NOT re-redirect to itself (covers the risk mitigation listed below).
  • packages/auth/lib/src/auth_service.dart — add ValueNotifier<bool> isHydrated, await SupabaseAuth.instance.initialSession (with 5s timeout) at the end of init(), emit notifyListeners() for initialSession and tokenRefreshed.
  • packages/auth/lib/src/auth_interceptor.dart — add queue + drain on hydration with branching on isAuthenticated (drain-replay if authed, drop-with-408 if not authed), preserve post-hydration logout flow, cap queue at 50 (synthetic 408 on overflow).
  • packages/auth/lib/src/auth_guard.dart — no change. Documented here so reviewers know this is intentional.
  • packages/notifications/lib/src/push_notifications_service.dart — no functional change (D4 lands free with D2). Listener-registration ordering documented in gym_app boot path.
  • apps/gym_app/lib/router/app_router.dart — add /_hydrating GoRoute, replace _composedRedirect with the Pattern-A four-step chain (scheme normaliser → hydration park → hydration unpark → authGuard), merge auth.isHydrated into refreshListenable.
  • apps/gym_app/lib/main.dart — Order A boot sequence: await pushService.init() BEFORE await authService.init() so the push listener is attached when initialSession notify fires.
  • Tests as listed in D8.
  • Risk: the new initialSession await blocks init() longer on cold-start (perceptible UI delay). Mitigation: /_hydrating renders a spinner so the UX is honest; measure latency in dev — if initialSession exceeds ~200 ms p95 we can promote the splash to the OS-level launch screen (separate ticket if measurable).

  • Risk: SupabaseAuth.instance.initialSession hangs indefinitely on a flaky network or stuck disk read, so the splash never clears. Mitigation: wrap the await in .timeout(Duration(seconds: 5), onTimeout: () => null) (D1). On timeout, isHydrated flips with currentSession == null, so authGuard routes to /auth/welcome. The user re-auths, which is acceptable degraded behaviour compared with an infinite splash. Covered by a fakeAsync test in D8.

  • Risk: request queue grows unbounded if hydration never completes (e.g. supabase init throws). Mitigation: cap at 50 entries; oldest evicted with synthetic 408 response. Hydration completion is also guarded by init()’s normal error handling — a thrown Supabase.initialize propagates up and the app crashes loudly, which is preferable to hanging. Worst case for slow hydration: 5s timeout (above) flips isHydrated and the queue drains via the unauthed-drop path (D3).

  • Risk: /_hydrating recursive redirect loop. Mitigation: the state.uri.path != '/_hydrating' check breaks it. Covered by hydration_redirect_test.dart (D8).

  • Risk: deep links (push notifications from getInitialMessage(), the /auth/callback OAuth re-entry) arrive during the hydration window. Mitigation: the next= query parameter on /_hydrating captures the original URI and the redirect replays after hydration. Push notification handling already stashes into _pendingTap and drains via D4 — orthogonal path.

  • Risk: notifyListeners() on initialSession triggers the original GlobalKey collision the existing comment warned about. Mitigation: the splash (D5) means no shell navigator is mounted when the first notify happens — by the time the router unparks /_hydrating, hydration is finished and the notify cascade has settled. The _fetchMeFuture de-dup (line 19, already in place) covers the parallel-fetch case.

  • Risk: tokenRefreshed now triggers notifyListeners() on every background rotation, which churns refreshListenable listeners. Mitigation: rotation is rare (Supabase default ~1h). The notify is cheap; the guard re-evaluates and short-circuits because isAuthenticated doesn’t change. No-op for the user.

  • Single PR against tktspace-mobile-app covering all five package/app touches and the tests.
  • No feature flag — the change is a strict superset of correct behaviour; gating it would leave half-fixed cold-start paths in production.
  • No backfill — purely client-side.
  • Verify in dev simulator with the integration test (D8) before merge.
  • After merge, monitor crash + auth-error rates in Firebase Crashlytics / Sentry for one release cycle. Specifically watch for an UPTICK in “session lost” reports (would indicate the queue cap is too low) or a DOWNTICK (expected, spec lines 23-26).

N/A — spec frontmatter has surfaces: []; no contract or schema change.

N/A — spec frontmatter has surfaces: []; no contract or schema change.

STATUS: READY_FOR_REVIEW