ADR: Re-bucket SERVICE Activities from SPORT to SERVICES Sphere

Status

PROPOSED

AC ↔ D# mapping

AC	Primary decisions	Notes
AC-1	D1 (CTE UPDATE+audit), D2 (system actor), D3 (jsonb shape)	Single atomic transaction; audit captures full activity_ids[] for rollback.
AC-2	D4 (pre-flight DO block)	RAISE if SERVICES sphere missing.
AC-3	D1 (CTE — `HAVING count(*) > 0` guard, v2)	Audit INSERT skipped if updated CTE yields zero rows. (v2: WHERE EXISTS was wrong — see D1.)
AC-4	D3 (activity_ids[] in after.jsonb)	Rollback inverse UPDATE targets only this run’s set.
AC-5	D5 (e2e suite shape)	4 cases: happy, idempotent, pre-flight, selective.

Decisions

D1 — Single CTE: UPDATE + audit INSERT atomic (v2: HAVING guard)

Decision: combine the activity UPDATE and the audit-log INSERT in a single SQL CTE statement, with HAVING count(*) > 0 on the audit INSERT’s aggregate SELECT so the audit row only fires when rows actually moved.

v2 amendment: the originally proposed WHERE EXISTS (SELECT 1 FROM updated) guard is incorrect for an aggregate SELECT. An aggregate query without GROUP BY emits exactly one row even when WHERE filters out all input — verified against local Postgres: the WHERE EXISTS variant emits a row with ids = [], the HAVING count(*) > 0 variant emits zero rows. With WHERE EXISTS, a no-op second run would insert a zero-count audit row, violating AC-3. The spec’s SQL sketch (specs/services-sphere-rebucket.md, Migration section) carries the same flaw; backend-dev must use HAVING count(*) > 0 instead. The spec’s AC-3 wording already permits this (“or fold both into a CTE” — the binding requirement is the behaviour, not the exact guard syntax).

Corrected audit CTE shape:

audit_insert AS (
  INSERT INTO activities.sphere_audit_log
    (sphere_id, sphere_code, actor_user_id, action, before, after)
  SELECT
    (SELECT id FROM services_id),
    'SERVICES',
    '00000000-0000-0000-0000-000000000000'::uuid,
    'UPDATE',
    jsonb_build_object('rule', 'SERVICE-in-SPORT', 'count', count(*)),
    jsonb_build_object('rule', 'moved-to-SERVICES',
                       'activity_ids', jsonb_agg(id))
  FROM updated
  HAVING count(*) > 0
  RETURNING id
)

(count(*) over updated equals the affected count, so the separate (SELECT count(*) FROM affected) subquery is unnecessary; COALESCE on jsonb_agg is also unnecessary since HAVING guarantees ≥1 input row.)

Rationale:

Atomicity within a single statement — no race window where activities update without an audit trail (or vice-versa).
Idempotency is automatic: second-run sees empty affected → empty updated → HAVING count(*) > 0 eliminates the aggregate group → audit skipped.
Single-pass SQL avoids reading count(*) twice (once for guard, once for audit metadata).

Alternative rejected: two separate statements (audit first, then UPDATE). Requires a count-and-skip wrapper for idempotency and duplicates the audit row in retry scenarios. (This IS the pattern 0044 uses — acceptable there because 0044’s WHERE clauses self-filter identically on both statements; here the batched single-row audit makes the CTE form strictly safer.)

Idempotency scope (v2 clarification): Drizzle’s migration journal tracks applied migrations — pnpm db:migrate will not re-execute 0045 after a clean run, so journal-level re-application is already blocked. The SQL-body idempotency required by AC-3 matters for: (a) manual psql execution of the file, (b) e2e tests applying the migration programmatically (AC-5.2 runs the SQL body twice, not the journal), and (c) the partial-failure/retry case. Both layers are intentional and complementary.

D2 — System actor UUID `00000000-0000-0000-0000-000000000000`

Decision: identical pattern to P0 #1 — use the nil UUID as actor_user_id on the audit row. The activities.sphere_audit_log.actor_user_id column is intentionally non-FK (activities.schema.ts:102-104), so the nil UUID is a stable convention for “system” actor without seeding any user row.

Rationale: consistency with the only other migration-driven sphere audit row (the MEMBERSHIP cleanup from P0 #1). Operators inspecting the audit log learn one convention.

D3 — Audit jsonb shape: rule + count (before), activity_ids[] (after)

Decision: structure the audit jsonb as:

before: { "rule": "SERVICE-in-SPORT", "count": <N> }
after:  { "rule": "moved-to-SERVICES", "activity_ids": [<uuid>, ...] }

Rationale:

rule field documents the migration intent in-row (operators don’t need to dig out the migration SQL).
activity_ids[] in after is the rollback primitive — sufficient to scope an inverse UPDATE without re-deriving from any other source.
count in before is human-readable summary for at-a-glance audit reads.

Rejected alternatives:

Storing before.sphere_id/after.sphere_id for each activity (verbose, N rows worth of data; activity_ids[] is sufficient since both spheres are known by code).
Inserting one audit row per affected activity (clutters the log; the single batched row matches the “this was one migration step” mental model).

D4 — Pre-flight DO block RAISEs on missing SERVICES sphere

Decision: start the migration with a DO $$ block that selects SERVICES.id and RAISE EXCEPTION if NULL.

Rationale:

If SERVICES sphere is missing (e.g. seed migration didn’t run, or the row was manually deleted), the CTE would silently no-op (UPDATE matches zero because the WHERE clause selects from missing sphere) and the migration would appear successful while doing nothing.
Fail-loud is cheaper than silent data corruption.
PROD action item: before deployment, verify SELECT count(*) FROM activities.spheres WHERE code IN ('SPORT', 'SERVICES') returns 2.

D5 — Test suite: 4 e2e cases covering happy / idempotent / pre-flight / selective

Decision: integration tests in a new e2e spec file apps/api-e2e/src/spheres/sphere-rebucket-0045.spec.ts (migration number 0045 confirmed in v2 — head was 0044).

Rationale: mirrors the pattern from P0 #1’s migration test (sphere-migration-0044.spec.ts). Tests use direct pg client connections to a test DB; the migration is applied programmatically.

Out-of-scope for this test file:

gym_app sphere filter behavior — that’s tested in P3 #7’s own test plan.
Category mismatch detection — P3 #20 handles that, this ticket explicitly leaves the drift in place.

Out of scope (explicitly deferred)

Activity-category sphere consistency — categories these activities link to still reference SPORT-sphere parents. Acknowledged drift; cleaned in P3 #20 follow-up.
Removing SERVICE from SPORT.allowed_activity_types — separate ADR call. Keeping it permissive for now means future admin-created SERVICE-in-SPORT is allowed (a product decision, not in this ticket).
Backfilling audit log for the original 0037 backfill — historical, no operational value.
CI guard to prevent future re-introduction of legacy backfill rules — over-engineering for what should be a one-time cleanup.

Risks (post-mitigation)

None blocking. Small-scale data UPDATE with audit trail; pre-flight RAISE on missing dependencies; idempotent re-runs.
Display-level mismatch with categories — activities now in SERVICES sphere may still link to SPORT-sphere categories. Backend filter logic (activities.service.ts) is generic and doesn’t break on this. UI display (admin grid, web explore) shows the activity’s sphere by id, so the activity-side is consistent. Category-side mismatch is the P3 #20 concern.

Post-merge audit (P3 #20)

Followed up after merge to verify activity-category sphere consistency drift assumption (Non-Goals item):

SELECT count(*) AS mismatched_links
FROM activities.activities a
JOIN activities.activity_categories ac ON ac.activity_id = a.id
JOIN activities.categories c ON c.id = ac.category_id
WHERE a.sphere_id != c.sphere_id;

Dev DB result (2026-06-13): 0 mismatched links, 0 total activity-category links (activity_categories table empty). 536 activities exist with sphere_id assigned, but no category links populated. The category mismatch risk has not materialized on dev.

Recommendation: run the same audit on prod before any concern surfaces. If prod also returns 0 mismatches, no migration is needed; close P3 #20 follow-up without action. If prod finds N mismatches, open a small data-fix ticket using the P3 #8 / P0 #1 migration pattern (drizzle --custom + CTE with HAVING count(*) > 0 audit guard).

Verification

Apply migration locally against fresh DB (pnpm run docker:up && pnpm db:migrate).
Run the e2e test (pnpm exec jest --testPathPattern=sphere-rebucket-0045) — all 4 cases pass.

Note on idempotency verification: pnpm db:migrate re-run only tests the Drizzle journal (which blocks re-application by design), NOT the SQL body. Body idempotency must be tested by manually re-executing the file: psql "$DATABASE_URL" -f libs/shared/data-access-db/migrations/0045_*.sql — audit count should stay constant. 3. Manually check via psql:

SELECT count(*) FROM activities.activities WHERE type = 'SERVICE' AND sphere_id = (SELECT id FROM activities.spheres WHERE code = 'SPORT') → 0.
SELECT after FROM activities.sphere_audit_log WHERE sphere_code = 'SERVICES' AND action = 'UPDATE' ORDER BY created_at DESC LIMIT 1 → jsonb with activity_ids[] populated.

Test SQL-body idempotency by manually re-executing the migration file: psql "$DATABASE_URL" -f libs/shared/data-access-db/migrations/0045_*.sql → audit count stays constant. (pnpm db:migrate re-run is a no-op — Drizzle journal blocks it — and does NOT test the SQL body; see note above.)

STATUS: READY_FOR_REVIEW

ADR: Re-bucket SERVICE Activities from SPORT to SERVICES Sphere

ADR: Re-bucket SERVICE Activities from SPORT to SERVICES Sphere

Status

AC ↔ D# mapping

Decisions

D1 — Single CTE: UPDATE + audit INSERT atomic (v2: HAVING guard)

D2 — System actor UUID 00000000-0000-0000-0000-000000000000

D3 — Audit jsonb shape: rule + count (before), activity_ids[] (after)

D4 — Pre-flight DO block RAISEs on missing SERVICES sphere

D5 — Test suite: 4 e2e cases covering happy / idempotent / pre-flight / selective

Out of scope (explicitly deferred)

Risks (post-mitigation)

Post-merge audit (P3 #20)

Verification

D2 — System actor UUID `00000000-0000-0000-0000-000000000000`