ADR: fix mobile Supabase session restore race on cold-start
Context
gym_app users with a valid persisted Supabase session intermittently land on /auth/welcome on cold-start, and cold-start push taps lose their destination. Four compounding defects in tktspace-mobile-app (RC-1..RC-4 in the spec, lines 32-55) cause the race: AuthService filters AuthChangeEvent.initialSession; push_notifications_service never drains _pendingTap because that event is filtered; AuthInterceptor calls logout() on any 401, including transient ones fired during the hydration window, which trips Supabase’s single-use refresh-token rule and permanently destroys the session; and route rendering proceeds during the hydration window with stale auth state. This ADR locks the architecture for the fix.
Decision
Five coordinated changes across two packages and one app:
1. Make hydration deterministic: AuthService.init() awaits SupabaseAuth.instance.initialSession (the documented Dart-SDK hydration Future) and flips isHydrated = true (a ValueNotifier<bool>) before returning.
2. Stop filtering AuthChangeEvent.initialSession and AuthChangeEvent.tokenRefreshed: emit notifyListeners() on both so refreshListenable: auth re-evaluates and downstream listeners (push) fire.
3. AuthInterceptor queues 401s that arrive while isHydrated == false and replays them after hydration; only post-hydration 401s feed the existing refresh-or-logout flow.
4. PushNotificationsService._onAuthChanged drains _pendingTap on the now-visible initialSession event (free side-effect of decision 2).
5. Add a /_hydrating GoRoute and a top-level redirect that parks every cold-start request there until isHydrated flips, then routes to the original target via a next= query parameter.
Considered alternatives
A. Flip isHydrated on the first AuthChangeEvent after init() returns. Rejected as non-deterministic: if no session is persisted, gotrue does not emit any event, so isHydrated would never flip and the splash would hang.
B. Timer-based fallback (flip isHydrated after 2s if no event). Rejected as a band-aid: still races on slow disk / cold boot, and trades a permanent bug for an intermittent one.
C. Drop the 401 → logout behaviour entirely. Rejected — would let truly stale sessions fester. We need the existing logout flow once we are confident the session is no longer transient. The queue-and-replay design preserves it for post-hydration 401s.
D. Unify /_hydrating with the just-merged /auth/callback route (both show a spinner). Rejected — they encode two different state machines (disk-restore vs PKCE exchange) reached from different entry points. Merging them would force one screen to implement both flows. See D6 below.
E. Move the hydration splash into per-route loaders instead of a router-level redirect. Rejected — every authenticated route would need to duplicate the gate, and unauthenticated routes still need to wait before deciding whether to redirect.
API design (per surface)
N/A. Spec frontmatter has surfaces: []; no contract change.
Surface impact
N/A. Spec frontmatter has surfaces: []; contracts/*.openapi.yaml are not touched.
Data model
N/A. No schema change.
Backend module placement
N/A. Backend untouched.
Frontend implications (per app)
N/A. tktspace-business, tktspace-web, and tktspace-landing are not affected.
Mobile implications
Affected app: tktspace-mobile-app/apps/gym_app only. tickets_app is out of scope.
Affected shared packages: auth, notifications.
D1. Hydration signal — explicit initialSession await in AuthService.init()
After await Supabase.initialize(...) (currently at packages/auth/lib/src/auth_service.dart:60), await SupabaseAuth.instance.initialSession before init() returns. This is the documented Dart-SDK hydration Future on supabase_flutter 2.12.4 / gotrue 2.20.0 (see CHANGELOG.md:729 and supabase_flutter-2.12.4/test/initialization_test.dart). Then flip a new ValueNotifier<bool> isHydrated to true. This is deterministic: there is no race with the async AuthChangeEvent.initialSession event stream, and it works whether or not a session is persisted (the Future resolves either way; with no persisted session it resolves to null).
NOTE: the synchronous getter Supabase.instance.client.auth.currentSession returns the post-restored session AFTER initialSession has resolved, and is useful for synchronous reads later in the app’s lifecycle (e.g. inside _handleAuthChange). It does NOT, by itself, signal that hydration has completed — awaiting initialSession is what unblocks the hydration gate.
The await is wrapped in a 5-second hard timeout to defend against a flaky network / disk read hanging the splash forever:
    final restored = await SupabaseAuth.instance.initialSession
        .timeout(const Duration(seconds: 5), onTimeout: () => null);
    // `restored` is either the persisted Session, or null (no session / timeout).
    // Either way, hydration is done, so the router can make a routing decision.
    isHydrated.value = true;

On timeout: isHydrated flips to true with Supabase.instance.client.auth.currentSession == null, so authGuard routes the user to /auth/welcome (the same fallback as “no session was ever persisted”). The user has to re-auth, which is acceptable degraded behaviour compared with an infinite splash.
AuthService continues to extend ChangeNotifier; isHydrated is exposed as a public ValueNotifier<bool> so GoRouter, the interceptor, and tests can read it without depending on AuthService’s listener semantics.
D2. Stop filtering initialSession and tokenRefreshed in _handleAuthChange
The current switch (event.event) at auth_service.dart:68-82 has default: break; swallowing AuthChangeEvent.initialSession and AuthChangeEvent.tokenRefreshed. Replace it with explicit cases that call notifyListeners() (and nothing else). initialSession is the post-restore signal and tokenRefreshed is the rotation signal; both should rebuild the router guard so a freshly-refreshed JWT propagates to interceptors and listeners (PushNotificationsService._onAuthChanged).
The existing comment at auth_service.dart:62-65 (“Filter events: tokenRefreshed/initialSession/passwordRecovery don’t change auth/UI state — notifying on them churns GoRouter’s refreshListenable mid-frame and triggers a GlobalKey collision”) needs to be reconciled. The GlobalKey collision was caused by two parallel fetchMe() calls (the de-dup at auth_service.dart:19 already fixes that), not by the notify itself. We will:
- Keep the _fetchMeFuture de-dup.
- NOT call fetchMe() on initialSession (only signedIn triggers a fetch; initialSession reuses whatever is cached).
- Emit notifyListeners() on initialSession and tokenRefreshed. This is safe because the hydration splash (D5) ensures the router is parked on /_hydrating during the first notify, so no shell navigator is mounted yet.
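As a rough illustration of the three bullets above, the sketch below models the event handling in plain Dart. AuthEvent is a hypothetical stand-in for the SDK's AuthChangeEvent, and the two counters stand in for notifyListeners() and fetchMe(); the real AuthService is not reproduced here. The signedOut and passwordRecovery branches are assumptions (the ADR only changes initialSession and tokenRefreshed).

```dart
/// Hypothetical stand-in for the SDK's AuthChangeEvent (subset only).
enum AuthEvent {
  initialSession,
  signedIn,
  signedOut,
  tokenRefreshed,
  passwordRecovery,
}

/// Counts what a real AuthService would do via notifyListeners() / fetchMe().
class AuthChangeSketch {
  int notifyCount = 0;
  int fetchMeCount = 0;

  void handleAuthChange(AuthEvent event) {
    switch (event) {
      case AuthEvent.signedIn:
        fetchMeCount++; // only signedIn triggers a profile fetch
        notifyCount++;
        break;
      case AuthEvent.signedOut:
        notifyCount++; // assumption: sign-out already notifies today
        break;
      case AuthEvent.initialSession:
      case AuthEvent.tokenRefreshed:
        notifyCount++; // D2: notify, and nothing else
        break;
      case AuthEvent.passwordRecovery:
        break; // assumption: stays filtered
    }
  }
}
```

The point of the explicit cases is that initialSession and tokenRefreshed reach listeners without ever starting a second fetchMe().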
D3. AuthInterceptor request queue during cold-start window
packages/auth/lib/src/auth_interceptor.dart:30-32 currently calls _auth.logout() on any 401 while authenticated. Replace with:
- If _auth.isHydrated.value == false AND response is 401: enqueue the request (RequestOptions equivalent for Chopper: store the original Request + a Completer<Response<BodyType>>) into an in-memory list owned by the interceptor.
- Register a one-shot listener on _auth.isHydrated: when it flips to true:
  - If _auth.isAuthenticated == true → drain the queue by re-issuing each request via chain.proceed(request) with the now-fresh Authorization header re-applied (see below).
  - If _auth.isAuthenticated == false (hydration finished but no session: fresh install, timeout, or rejected session) → DROP the queue: complete each queued Completer with a synthetic Response carrying status 408 Request Timeout and a tktspace_auth: not_ready reason header. Do NOT replay: every replayed request would 401 again and uselessly trigger the logout flow, churning the UI.
- If the replayed request STILL 401s after hydration → that’s a legitimate auth failure → follow the existing logout() flow.
Re-applying the token on replay: the replay must read _auth.accessToken AT REPLAY TIME, not capture it at enqueue time (the whole point is that the token rehydrates between enqueue and replay). The interceptor will re-construct the request with applyHeader(request, 'Authorization', 'Bearer ${_auth.accessToken}').
Queue cap: 50 entries. If exceeded, the oldest entry completes with a synthetic Response carrying status 408 Request Timeout (the request couldn’t complete within a reasonable window because auth wasn’t ready — semantic match for 408; 503 would imply the server is unavailable, which is not true) and a tktspace_auth: not_ready reason header so the UI can render a graceful error instead of hanging.
isAuthenticated short-circuit: preserved for post-hydration 401s (matches current line 30 semantics).
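The queue behaviour described in D3 can be sketched in plain Dart without Chopper. Everything named here (Outcome, Replay, HydrationQueue) is hypothetical and only models the park / drain / drop / cap rules; the real interceptor would hold Chopper Request objects and Completer<Response<BodyType>> instead.

```dart
import 'dart:async';
import 'dart:collection';

/// Hypothetical result type standing in for a Chopper Response.
class Outcome {
  const Outcome(this.status, {this.reason});
  final int status;
  final String? reason;
}

/// Re-issues a parked request with the token that is current at replay time.
typedef Replay = Future<Outcome> Function(String accessToken);

class _Parked {
  _Parked(this.replay);
  final Replay replay;
  final Completer<Outcome> completer = Completer<Outcome>();
}

class HydrationQueue {
  HydrationQueue({this.cap = 50});

  final int cap;
  final Queue<_Parked> _parked = Queue<_Parked>();

  /// Synthetic 408 with the not_ready reason described in D3.
  static const Outcome notReady = Outcome(408, reason: 'not_ready');

  int get length => _parked.length;

  /// Parks a request that got a 401 before hydration finished.
  /// Over the cap, the oldest entry is completed with the synthetic 408.
  Future<Outcome> park(Replay replay) {
    final entry = _Parked(replay);
    _parked.addLast(entry);
    if (_parked.length > cap) {
      _parked.removeFirst().completer.complete(notReady);
    }
    return entry.completer.future;
  }

  /// Call once when hydration completes. [currentToken] is read per entry
  /// at replay time; a null token means "hydrated, but no session".
  void onHydrated(String? Function() currentToken) {
    while (_parked.isNotEmpty) {
      final entry = _parked.removeFirst();
      final token = currentToken();
      if (token == null) {
        entry.completer.complete(notReady); // drop, do not replay
      } else {
        // Completer.complete adopts the result of the replay Future.
        entry.completer.complete(entry.replay(token));
      }
    }
  }
}
```

Passing a token getter rather than a token value keeps the "read at replay time, never at enqueue time" rule visible in the signature. A replay that still returns 401 would flow into the existing logout() path, which this sketch deliberately leaves out.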
D4. Push pendingTap drain on initialSession
packages/notifications/lib/src/push_notifications_service.dart:201-219 (_onAuthChanged) is wired via _auth.addListener(...) at line 170. Once D2 enables notifyListeners() for initialSession, the existing _onAuthChanged already drains _pendingTap when _auth.isAuthenticated == true. No code change in PushNotificationsService is strictly required, but the listener registration MUST happen before AuthService.init() returns so the first notify is not missed.
Boot order invariant (Order A — chosen): PushNotificationsService.init() MUST run and subscribe its listener BEFORE AuthService.init() is awaited. This guarantees the listener is attached before initialSession notify fires inside await authService.init(). No caching of “did initialSession fire” is needed in AuthService.
Canonical snippet for apps/gym_app/lib/main.dart:
    Future<void> main() async {
      WidgetsFlutterBinding.ensureInitialized();

      final authService = AuthService();
      final pushService = PushNotificationsService(authService);

      // Order A: push subscribes FIRST, then auth hydrates.
      await pushService.init(); // registers _onAuthChanged via authService.addListener
      await authService.init(); // awaits SupabaseAuth.instance.initialSession,
                                // emits notifyListeners() on initialSession,
                                // which fires _onAuthChanged with the listener already attached.

      runApp(MyApp(auth: authService, push: pushService));
    }

Order B (AuthService caches “did initialSession fire” as a flag, late subscribers receive it on attach) was considered and rejected as more complex with no upside given the boot is fully controlled by us.
Optional defensive change (recommended): make the drain idempotent by guarding on _pendingTap == true (already in place at line 204). No edit needed.
D5. Hydration splash route
Add to apps/gym_app/lib/router/app_router.dart:
- A new GoRoute(path: '/_hydrating', builder: ...) rendering a Scaffold with a centered CircularProgressIndicator (same visual as the existing /auth/callback builder at line 149-159, intentionally; see D6).
- GoRouter.refreshListenable must trigger on isHydrated changes too. Wrap auth and auth.isHydrated into a Listenable.merge([...]) and pass that to refreshListenable.
Redirect chain — canonical order (Pattern A: normalise first). Pattern A is chosen over Pattern B because normalisation runs uniformly in a single place (step 1), and the post-hydration unparking (step 3) only deals with bare paths in next=, never scheme-prefixed URIs. Pattern B (post-hydration re-normalise) was rejected — it duplicates normalisation logic across two redirect stages and lengthens the chain.
The composed redirect (replacing _composedRedirect or wrapping it) runs in this order:
    redirect(BuildContext, GoRouterState state):
      1. SCHEME NORMALISER (always first, regardless of hydration state)
         If state.uri.toString() matches the scheme-prefixed pattern
         (e.g. 'com.fitspace.client.app://auth/callback?code=...')
           → return normalised bare path (e.g. '/auth/callback?code=...')
         // After this step, `next=` and all subsequent comparisons see bare paths only.

      2. HYDRATION PARK
         If !auth.isHydrated.value AND state.uri.path != '/_hydrating'
           → return '/_hydrating?next=' + Uri.encodeQueryComponent(state.uri.toString())
         // state.uri is now guaranteed bare (step 1), so `next=` is bare too.

      3. HYDRATION UNPARK
         If auth.isHydrated.value AND state.uri.path == '/_hydrating':
           final next = state.uri.queryParameters['next'];
           if (next != null && next.isNotEmpty) return next;
           if (auth.isAuthenticated) return '/home/main';
           return '/auth/welcome';

      4. AUTH GUARD (existing logic, unchanged)
         Delegates to authGuard(state) for protected routes.

Walk-through, scheme-prefixed cold-start:
Incoming URI: com.fitspace.client.app://auth/callback?code=abc123. Hydration: not yet complete.
- Step 1 normalises to /auth/callback?code=abc123. Redirect returned.
- Redirect chain re-runs with state.uri == '/auth/callback?code=abc123'.
- Step 1 no-op (already bare).
- Step 2 fires: !isHydrated && path != '/_hydrating' → redirect to /_hydrating?next=%2Fauth%2Fcallback%3Fcode%3Dabc123.
- Redirect chain re-runs with state.uri == '/_hydrating?next=...'.
- Step 1 no-op. Step 2 no-op (path IS /_hydrating). Step 3 no-op (not hydrated yet). Step 4 no-op (route is public).
- User sees the spinner.
- await SupabaseAuth.instance.initialSession resolves.
- isHydrated flips → refreshListenable fires → redirect chain re-runs.
- Step 1 no-op. Step 2 no-op (path IS /_hydrating). Step 3 fires: pops next=/auth/callback?code=abc123 → redirect.
- Redirect chain re-runs with state.uri == '/auth/callback?code=abc123'. Step 1 no-op. Step 2 no-op (hydrated). Step 3 no-op (path is not /_hydrating). Step 4 evaluates: /auth/callback is a public route. Route renders the PKCE-exchange spinner.
Walk-through — bare-path cold-start (push notification deep link, app launched from a tap on /home/checkin):
Incoming URI: /home/checkin. Hydration: not yet complete.
- Step 1 no-op (already bare).
- Step 2 fires: redirect to /_hydrating?next=%2Fhome%2Fcheckin.
- Spinner renders. Hydration resolves.
- isHydrated flips → redirect re-runs.
- Step 3 pops next=/home/checkin → redirect.
- Redirect re-runs on /home/checkin. Step 4 (authGuard) evaluates: if authenticated, allow; if not, redirect to /auth/welcome.
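The four-step chain and both walk-throughs can be checked with a small pure-Dart model that needs no GoRouter. AuthSnapshot, composedRedirect and resolve are hypothetical names. The scheme normaliser assumes the host and path of a custom-scheme URI join into the bare path (which matches the com.fitspace.client.app example), and the step 4 guard is a simplified assumption that treats everything under /auth/ as public.

```dart
/// Hypothetical snapshot of the two flags the real chain reads from AuthService.
class AuthSnapshot {
  const AuthSnapshot({required this.isHydrated, required this.isAuthenticated});
  final bool isHydrated;
  final bool isAuthenticated;
}

const String hydratingPath = '/_hydrating';

/// Returns the redirect target, or null to stay on [uri].
String? composedRedirect(Uri uri, AuthSnapshot auth) {
  // 1. Scheme normaliser: custom-scheme deep links become bare paths.
  if (uri.hasScheme) {
    final query = uri.hasQuery ? '?${uri.query}' : '';
    return '/${uri.host}${uri.path}$query';
  }
  // 2. Hydration park: hold every request until hydration completes.
  if (!auth.isHydrated && uri.path != hydratingPath) {
    return '$hydratingPath?next=${Uri.encodeQueryComponent(uri.toString())}';
  }
  // 3. Hydration unpark: replay the original target, or pick a default.
  if (auth.isHydrated && uri.path == hydratingPath) {
    final next = uri.queryParameters['next'];
    if (next != null && next.isNotEmpty) return next;
    return auth.isAuthenticated ? '/home/main' : '/auth/welcome';
  }
  // 4. Auth guard (simplified assumption, not the real authGuard).
  final isPublic = uri.path == hydratingPath || uri.path.startsWith('/auth/');
  if (!isPublic && !auth.isAuthenticated) return '/auth/welcome';
  return null;
}

/// Re-runs the chain until it settles, roughly as a router would.
String resolve(String location, AuthSnapshot auth, {int maxHops = 5}) {
  var current = location;
  for (var hop = 0; hop < maxHops; hop++) {
    final next = composedRedirect(Uri.parse(current), auth);
    if (next == null) return current;
    current = next;
  }
  throw StateError('redirect loop at $current');
}
```

With an unhydrated snapshot, resolve settles on /_hydrating with the encoded next= value and does not loop, which is the property the redirect-loop guard test in D8 is meant to pin down.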
D6. Coordination with Bug B (just-merged fix-mobile-auth-callback-route)
The /auth/callback GoRoute (already in app_router.dart:149-159, post Bug B / MR !26) renders a CircularProgressIndicator while supabase_flutter does the PKCE exchange. With D5 in place, /_hydrating and /auth/callback render identical UI but are logically different screens:
- /_hydrating: cold-start entry point, parked here before any auth state is known, drained after SupabaseAuth.instance.initialSession resolves.
- /auth/callback: reached mid-OAuth flow after /auth/login, parked here while supabase_flutter consumes the PKCE code query parameter and emits signedIn.
These MUST stay separate. Merging them would conflate two state machines and require one route to know about both entry conditions. The visual duplication is intentional and cheap.
D7. AuthGuard interaction
packages/auth/lib/src/auth_guard.dart already lost the /auth/callback short-circuit in Bug B (see the comment at line 17-22). With D5’s /_hydrating redirect running at the GoRouter redirect: level BEFORE the guard, the guard itself does not need changes: the hydration redirect captures every cold-start route before authGuard evaluates auth.isAuthenticated. AC-7 (spec line 100-104) is automatically satisfied.
D8. Test approach
Unit tests (no simulator):
- packages/auth/test/auth_service_test.dart
  - notifyListeners() fires when the mocked Supabase auth stream emits AuthChangeEvent.initialSession.
  - notifyListeners() fires on AuthChangeEvent.tokenRefreshed.
  - isHydrated.value is false immediately after construction and true after init() resolves. Test stub mocks Supabase.initialize and stubs SupabaseAuth.instance.initialSession to a Future<Session?> (either resolving to a fake Session or to null); both branches must flip isHydrated to true.
  - Timeout branch: stub SupabaseAuth.instance.initialSession to a Future that never resolves; assert isHydrated.value == true and currentSession == null after the 5-second timeout (use fakeAsync to fast-forward).
- packages/auth/test/auth_interceptor_test.dart
  - 401 with isHydrated == false enqueues the request and does NOT call _auth.logout().
  - Flipping isHydrated to true with isAuthenticated == true drains the queue: each queued request is re-proceeded with the current _auth.accessToken.
  - Flipping isHydrated to true with isAuthenticated == false DROPS the queue: each queued Completer completes with a synthetic 408 response, and _auth.logout() is NOT called.
  - 401 after hydration completes triggers _auth.logout() (existing behaviour preserved).
  - Queue cap of 50: 51st enqueue completes the oldest with status 408 Request Timeout.
- packages/notifications/test/push_notifications_service_test.dart
  - _pendingTap == true + a fake AuthService flipping isAuthenticated from false to true via notifyListeners() (simulating initialSession) calls onNotificationTap.
Integration test (simulator, may be CI-only):
- apps/gym_app/integration_test/cold_start_session_restore_test.dart
  - Pre-populates SharedPreferences with a valid Supabase session blob. supabase_flutter persists the session under the key supabase.auth.token with a JSON-stringified value containing refresh_token, access_token, expires_at, token_type, user, etc. Canonical seed:

        final testSession = {
          'access_token': 'fake-jwt-...',
          'refresh_token': 'fake-refresh-...',
          'expires_at': DateTime.now().add(const Duration(hours: 1)).millisecondsSinceEpoch ~/ 1000,
          'token_type': 'bearer',
          'user': {'id': 'fake-uuid', /* ... */},
        };
        SharedPreferences.setMockInitialValues({
          'supabase.auth.token': jsonEncode(testSession),
        });

  - Boots gym_app via main_tktspace.dart.
  - Asserts the user lands on /home/main (not /auth/welcome).
  - Tagged @Tags(['simulator']) per spec line 154.
- Router redirect-loop guard: apps/gym_app/test/router/hydration_redirect_test.dart asserts that with isHydrated == false, navigating to /_hydrating does NOT re-redirect to itself (covers the risk mitigation listed below).
Touch list (files)
- packages/auth/lib/src/auth_service.dart: add ValueNotifier<bool> isHydrated, await SupabaseAuth.instance.initialSession (with 5s timeout) at the end of init(), emit notifyListeners() for initialSession and tokenRefreshed.
- packages/auth/lib/src/auth_interceptor.dart: add queue + drain on hydration with branching on isAuthenticated (drain-replay if authed, drop-with-408 if not authed), preserve post-hydration logout flow, cap queue at 50 (synthetic 408 on overflow).
- packages/auth/lib/src/auth_guard.dart: no change. Documented here so reviewers know this is intentional.
- packages/notifications/lib/src/push_notifications_service.dart: no functional change (D4 lands free with D2). Listener-registration ordering documented in gym_app boot path.
- apps/gym_app/lib/router/app_router.dart: add /_hydrating GoRoute, replace _composedRedirect with the Pattern-A four-step chain (scheme normaliser → hydration park → hydration unpark → authGuard), merge auth.isHydrated into refreshListenable.
- apps/gym_app/lib/main.dart: Order A boot sequence, with await pushService.init() BEFORE await authService.init() so the push listener is attached when the initialSession notify fires.
- Tests as listed in D8.

- Risk: the new initialSession await blocks init() longer on cold-start (perceptible UI delay). Mitigation: /_hydrating renders a spinner so the UX is honest; measure latency in dev, and if initialSession exceeds ~200 ms p95 we can promote the splash to the OS-level launch screen (separate ticket if measurable).
- Risk: SupabaseAuth.instance.initialSession hangs indefinitely on a flaky network or stuck disk read, so the splash hangs forever. Mitigation: wrap the await in .timeout(Duration(seconds: 5), onTimeout: () => null) (D1). On timeout, isHydrated flips with currentSession == null → authGuard routes to /auth/welcome. The user re-auths, which is acceptable degraded behaviour compared with an infinite splash. Covered by a fakeAsync test in D8.
- Risk: request queue grows unbounded if hydration never completes (e.g. supabase init throws). Mitigation: cap at 50 entries; oldest evicted with synthetic 408 response. Hydration completion is also guarded by init()’s normal error handling: a thrown Supabase.initialize propagates up and the app crashes loudly, which is preferable to hanging. Worst case for slow hydration: 5s timeout (above) flips isHydrated and the queue drains via the unauthed-drop path (D3).
- Risk: /_hydrating recursive redirect loop. Mitigation: the state.uri.path != '/_hydrating' check breaks it. Covered by hydration_redirect_test.dart (D8).
- Risk: deep links (push notifications from getInitialMessage(), the /auth/callback OAuth re-entry) arrive during the hydration window. Mitigation: the next= query parameter on /_hydrating captures the original URI and the redirect replays after hydration. Push notification handling already stashes into _pendingTap and drains via D4, which is an orthogonal path.
- Risk: notifyListeners() on initialSession triggers the original GlobalKey collision the existing comment warned about. Mitigation: the splash (D5) means no shell navigator is mounted when the first notify happens; by the time the router unparks /_hydrating, hydration is finished and the notify cascade has settled. The _fetchMeFuture de-dup (line 19, already in place) covers the parallel-fetch case.
- Risk: tokenRefreshed now triggers notifyListeners() on every background rotation, which churns refreshListenable listeners. Mitigation: rotation is rare (Supabase default ~1h). The notify is cheap; the guard re-evaluates and short-circuits because isAuthenticated doesn’t change. No-op for the user.
Rollout plan
- Single PR against tktspace-mobile-app covering all five package/app touches and the tests.
- No feature flag: the change is a strict superset of correct behaviour; gating it would leave half-fixed cold-start paths in production.
- No backfill: purely client-side.
- Verify in dev simulator with the integration test (D8) before merge.
- After merge, monitor crash + auth-error rates in Firebase Crashlytics / Sentry for one release cycle. Specifically watch for an UPTICK in “session lost” reports (would indicate the queue cap is too low) or a DOWNTICK (expected, spec lines 23-26).
OpenAPI sections
N/A. Spec frontmatter has surfaces: []; no contract or schema change.
DB / migration sections
N/A. Spec frontmatter has surfaces: []; no contract or schema change.
STATUS: READY_FOR_REVIEW