Improvement backlog

A consolidated catalogue of forward-looking improvements surfaced while filling the lead-entity docs (Phase 2 of the domain documentation initiative, May 2026). Each item names its source entity (or entities), describes the gap, and proposes a direction.

This list is not a planning document — it is a discovery snapshot. Use it as input to ticket creation, not as a roadmap.

Every item here is also listed in the Recommendations section of its source entity doc. This page groups them by theme so they can be triaged as a single concern (e.g. all idempotency gaps planned together).

How to read

  • Source — entity (or entities) where the gap was observed.
  • Gap — what’s missing or fragile today.
  • Direction — suggested fix, not a final design.
  • Risk if ignored — what can go wrong if the gap stays open.

1. Idempotency & race conditions (high risk — financial)

| Source | Gap | Direction |
| --- | --- | --- |
| Payment | externalId (gateway id) is not UNIQUE; webhook re-delivery can produce duplicate processing | Partial UNIQUE (external_id) WHERE external_id IS NOT NULL |
| Payment | createPayment is not idempotent; a client retry produces two gateway checkouts | Idempotency key in the create DTO |
| Customer pass | LiqPay webhook into pass purchase has no DB-level dedup | Same partial UNIQUE pattern as Payment |
| Company subscription | Billing webhook (Mono) has no DB-level dedup | Same partial UNIQUE pattern |
| Wallet | Wallet-to-wallet transfers (future) cannot rely on compare-and-swap UPDATE; they need SELECT FOR UPDATE | Explicit row-level locking when multi-row mutations land |
| Session | syncStatus is not transactionally isolated against booking writes; there is a race window between count-read and status-write | SELECT ... FOR UPDATE on the session row or pg_advisory_lock |
| Booking | No lock on (sessionId, customerId) during booking insert; concurrent requests can create duplicate bookings | Unique index + ON CONFLICT, or row-level lock on session before insert |
| Message | Fan-out is Promise.all after the transaction commits; partial fan-out on crash, duplicates on retry | Either transactional fan-out OR UNIQUE (message_id, customer_id) with ON CONFLICT DO NOTHING |

Risk if ignored: double-charging, double-issuing, double-notification, lost revenue, customer-trust events.
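
As an illustration of the idempotency-key direction for createPayment, here is a minimal TypeScript sketch. The names PaymentsSketch, the idempotencyKey field, amountMinor, and the openCheckout callback are assumptions made for this example, not the existing API; the in-memory Map stands in for a table with a UNIQUE index on the key.

```typescript
// Hypothetical shapes: the real create DTO and Payment entity may differ.
interface CreatePaymentDto {
  idempotencyKey: string; // assumed new field, generated once per purchase attempt by the client
  customerId: string;
  amountMinor: number; // amount in minor currency units
}

interface Payment {
  id: string;
  idempotencyKey: string;
  customerId: string;
  amountMinor: number;
  status: "pending" | "succeeded" | "failed";
}

class PaymentsSketch {
  // Stands in for a payments table with UNIQUE (idempotency_key).
  private readonly byKey = new Map<string, Payment>();
  // Stands in for the call that opens a gateway checkout.
  private readonly openCheckout: (payment: Payment) => void;
  private nextId = 1;

  constructor(openCheckout: (payment: Payment) => void) {
    this.openCheckout = openCheckout;
  }

  createPayment(dto: CreatePaymentDto): Payment {
    const existing = this.byKey.get(dto.idempotencyKey);
    if (existing !== undefined) {
      // Retry of a request already seen: return the first result, open no second checkout.
      return existing;
    }
    const payment: Payment = {
      id: `pay_${this.nextId++}`,
      idempotencyKey: dto.idempotencyKey,
      customerId: dto.customerId,
      amountMinor: dto.amountMinor,
      status: "pending",
    };
    this.byKey.set(dto.idempotencyKey, payment);
    this.openCheckout(payment);
    return payment;
  }
}
```

In a real service the lookup-then-insert shown here would itself race under concurrency, which is why the Map should be read as a placeholder for a UNIQUE index with ON CONFLICT handling rather than as an application-level check.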


2. Missing DB constraints (medium risk — schema integrity)

| Source | Gap | Direction |
| --- | --- | --- |
| Company member | One OWNER per company is enforced only in CompanyMembersService | Partial UNIQUE (company_id) WHERE role = 'OWNER' |
| Company member | (company_id, user_id) not UNIQUE; the same user can theoretically appear twice in one company | UNIQUE constraint |
| Pass entitlement template | (pass_id, activity_id) not UNIQUE; duplicates allowed today | UNIQUE constraint (already OPEN) |
| Activity | activity.company_id == category.company_id (or category is platform-wide) is not validated when linking | Assertion in activity-category.service before insert |
| Payment settings | (company_id, platform) not UNIQUE; multiple LiqPay/Mono configs per company are possible | UNIQUE constraint |
| Sphere | default_activity_type ∈ allowed_activity_types not CHECK’d | DB CHECK |
| Time slot | start_time is text without format CHECK | CHECK (start_time ~ '^([01]?\d\|2[0-3]):[0-5]\d$') |
| Time slot | day_of_week has no range CHECK | CHECK (day_of_week BETWEEN 0 AND 6) |
| Session | starts_at < ends_at not CHECK’d | DB CHECK |
| Pass | validity_days > 0 not CHECK’d | DB CHECK |
| Customer pass | valid_until > activated_at not CHECK’d when both set | DB CHECK with NULL handling |
| Message | (target_type, target_id) consistency not CHECK’d | CHECK ((target_type IN ('ALL','MANUAL') AND target_id IS NULL) OR (target_type IN ('ACTIVITY','SESSION') AND target_id IS NOT NULL)) |
| Company | owner_id IS NULL is allowed indefinitely; only the initial creation transaction needs it | Service-level guard OR DB trigger that prevents update-to-NULL after creation |
| User | users.scope is nullable; this allows gradual backfill but leaves orphan rows ambiguous | Flip to NOT NULL once count(*) WHERE scope IS NULL = 0 is stable for a release cycle; track via cron or one-shot ticket (auth-sync writes the value on every login, so orphans heal naturally) |

Risk if ignored: data drift, integrity violations that only surface at use time, hard-to-reproduce production bugs.
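
Until the CHECK constraints land, the same rules can be mirrored in the service layer. The sketch below restates three of the proposed checks (start_time format, day_of_week range, and Message target consistency) as TypeScript predicates. The function names are hypothetical, and the predicates are meant as a complement to the DB constraints, not a replacement.

```typescript
// Mirrors the proposed CHECK on time slot start_time (H:MM or HH:MM, 24-hour clock).
const START_TIME_PATTERN = /^([01]?\d|2[0-3]):[0-5]\d$/;

function isValidStartTime(value: string): boolean {
  return START_TIME_PATTERN.test(value);
}

// Mirrors CHECK (day_of_week BETWEEN 0 AND 6).
function isValidDayOfWeek(value: number): boolean {
  return Number.isInteger(value) && value >= 0 && value <= 6;
}

type TargetType = "ALL" | "MANUAL" | "ACTIVITY" | "SESSION";

// Mirrors the proposed (target_type, target_id) consistency CHECK on Message:
// ALL / MANUAL must have no target_id, ACTIVITY / SESSION must have one.
function isTargetConsistent(targetType: TargetType, targetId: string | null): boolean {
  const needsTarget = targetType === "ACTIVITY" || targetType === "SESSION";
  return needsTarget ? targetId !== null : targetId === null;
}
```

Note that the pattern accepts single-digit hours such as 9:30, exactly as the proposed regex does; if only zero-padded values are wanted, the pattern would need tightening in both places.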


3. State-machine enforcement (medium risk)

Most status enums have no DB-level transition rules and no audit log of who/when changed status. Transitions ride generic PATCH update endpoints, scattered across services.

| Source | State machine | Where transitions live today |
| --- | --- | --- |
| Activity | DRAFT, PUBLISHED; * → CANCELLED; * → ARCHIVED | Generic PATCH; only → ARCHIVED has a dedicated path |
| Booking | 5-state machine (PENDING, CONFIRMED, CANCELLED, REFUNDED, PENDING_PAYMENT) | Scattered across bookings.service.ts and bookings-client.service.ts |
| Company subscription | trialing → active → past_due → cancelled | Three sources (admin controller, webhook, three cron methods) |
| Customer pass | 6-state machine (AWAITING_PAYMENT, PENDING, ACTIVE, PAUSED, EXPIRED / CANCELLED) | Mixed: service + webhook + scheduler |
| Session | AVAILABLE, BOOKED (auto-sync) / * → CANCELLED (admin) | syncStatus plus admin PATCH |
| Payment | One-shot pending → succeeded / failed | processWebhook |

Direction:

  • Typed state-machine helpers per entity (TS) that validate source status before any UPDATE.
  • DB-level enforcement of allowed transitions (PostgreSQL BEFORE UPDATE trigger or stored function; a plain CHECK constraint sees only the new row, so it cannot compare old and new status).
  • Dedicated transition endpoints (/publish, /cancel, /transfer-ownership) instead of generic PATCH — each with explicit DTO and audit logging.

Risk if ignored: illegal transitions slip through (CANCELLED → CONFIRMED, cancelled subscription → active), hard-to-debug status corruption, no record for disputes.
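
A typed state-machine helper of the kind proposed above could look roughly like this. The Payment transitions (pending → succeeded / failed) come from the table; the helper names TransitionMap, canTransition, and assertTransition are hypothetical, and other entities would need their own maps agreed with the entity owners.

```typescript
type PaymentStatus = "pending" | "succeeded" | "failed";

// Allowed target statuses, keyed by source status.
type TransitionMap<S extends string> = Record<S, readonly S[]>;

// One-shot machine from the table: pending -> succeeded / failed, both terminal.
const PAYMENT_TRANSITIONS: TransitionMap<PaymentStatus> = {
  pending: ["succeeded", "failed"],
  succeeded: [],
  failed: [],
};

function canTransition<S extends string>(map: TransitionMap<S>, from: S, to: S): boolean {
  return map[from].indexOf(to) !== -1;
}

// Intended to be called in the service before any UPDATE of the status column.
function assertTransition<S extends string>(
  entity: string,
  map: TransitionMap<S>,
  from: S,
  to: S,
): void {
  if (!canTransition(map, from, to)) {
    throw new Error(`${entity}: illegal transition ${from} -> ${to}`);
  }
}
```

Because the map is typed as Record over the status union, adding a new status without listing its outgoing transitions fails at compile time, which is the main reason to prefer this over ad hoc if-chains.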


4. Audit logging gaps (high risk — disputes, compliance)

No entity except Sphere (which has the Sphere audit log) has a per-row mutation log. Disputes today rely on inferring “what happened” from updated_at timestamps and wallet_transactions snapshots.

| Source | Missing log | Why it matters |
| --- | --- | --- |
| Booking | Status-transition log | “Why was my booking cancelled?” |
| Customer pass | Status-transition log | “When and why did my pass expire / get cancelled?” |
| Pass | Template-change log | “I bought it under different terms” |
| Company subscription | Status + plan-change log | Financial audit |
| Payment | Status-transition log | Financial audit, reconciliation |
| Payment settings | Credential-change log | Security incident response: “who rotated the LiqPay key when” |
| Company member | Role-change / transfer-ownership log | “Who promoted whom OWNER” |
| Notification | (optional) read/unread history | Lower priority; UX feature |

Direction: a generic audit_log table or per-entity sibling table — {entity, entity_id, actor_user_id, before, after, reason, timestamp}. Wire into the service layer via a generic decorator.
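
One possible shape for that wiring is sketched below. The entry fields follow the tuple listed above; everything else (AuditContext, withAudit, the in-memory sink array) is hypothetical. The sketch uses a plain higher-order function instead of a TypeScript decorator so that it runs without decorator configuration; a decorator could wrap the same logic.

```typescript
// Field names follow the {entity, entity_id, actor_user_id, before, after, reason, timestamp} shape.
interface AuditEntry<T> {
  entity: string;
  entity_id: string;
  actor_user_id: string;
  before: T;
  after: T;
  reason: string;
  timestamp: string; // ISO 8601
}

interface AuditContext {
  entity: string;
  actorUserId: string;
  reason: string;
}

// Wraps a mutation so that every call appends exactly one audit entry to the sink.
// The sink array stands in for an audit_log table written in the same transaction.
function withAudit<T extends { id: string }>(
  sink: AuditEntry<T>[],
  ctx: AuditContext,
  mutate: (row: T) => T,
  now: () => Date = () => new Date(),
): (row: T) => T {
  return (row: T): T => {
    const before = { ...row }; // shallow copy taken before the mutation runs
    const after = mutate(row);
    sink.push({
      entity: ctx.entity,
      entity_id: row.id,
      actor_user_id: ctx.actorUserId,
      before,
      after: { ...after },
      reason: ctx.reason,
      timestamp: now().toISOString(),
    });
    return after;
  };
}
```

The copies are shallow, so nested objects would need a deep clone or JSON serialisation before being stored as before/after snapshots.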


5. Cross-schema reference strategy (revise ADR)

The original ADR claimed “cross-schema references are never FK’d”. Reality found during Phase 2 is mixed:

  • No FK (intentional decoupling): activities.activities.company_id, activities.locations.company_id, activities.categories.company_id, payments.payment_settings.company_id, notifications.message.target_id (polymorphic), payments.payments.source_id (polymorphic).
  • FK with RESTRICT (template-tier ownership): passes.pass.company_id, passes.customer_pass.customer_id, passes.customer_pass.pass_id, passes.pass_entitlement_template.pass_id, passes.pass_entitlement_template.activity_id.
  • FK with CASCADE (tight booking-side coupling): bookings.bookings.session_id, bookings.bookings.customer_id, wallet.wallet.customer_id, wallet.wallet.company_id, wallet.refund_request.booking_id, notifications.message.company_id, notifications.message.sent_by.

Direction: revise the ADR to document the three-way pattern explicitly: when to omit FK, when to use RESTRICT, when to use CASCADE. Add a checklist to the architect agent.

Risk if ignored: new developers assume “no cross-schema FK ever” and miss the dependency between passes and companies, or create accidental cross-schema CASCADE chains.


6. Polymorphic columns are plain text (low risk, but tech-debt)

| Source | Column | Known values |
| --- | --- | --- |
| Payment | source_type | BOOKING, ORDER (default), TOPUP, PASS_PURCHASE, SUBSCRIPTION_PAYMENT |
| Wallet transaction | source_type | BOOKING, MANUAL_DEBIT, MANUAL_CREDIT, REFUND, LIQPAY, … |
| Company subscription | renewal_gateway | implied: liqpay, mono |
| Wallet | currency | UAH (only one in use today) |

Direction: promote all of these to Postgres enums. Schema migrations are cheap; the visibility gain at insert/select is real (no typos, IDE autocomplete, exhaustiveness checks in TS).
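
On the TypeScript side, the exhaustiveness benefit could look like the sketch below. The value list is the one given for Payment source_type in the table; parsePaymentSourceType, assertNever, describeSource, and the labels it returns are hypothetical and exist only to demonstrate the compile-time check.

```typescript
// Values as listed for Payment.source_type in the table above.
const PAYMENT_SOURCE_TYPES = [
  "BOOKING",
  "ORDER",
  "TOPUP",
  "PASS_PURCHASE",
  "SUBSCRIPTION_PAYMENT",
] as const;

type PaymentSourceType = (typeof PAYMENT_SOURCE_TYPES)[number];

// Rejects typos at the boundary instead of letting them reach the database.
function parsePaymentSourceType(raw: string): PaymentSourceType {
  const match = PAYMENT_SOURCE_TYPES.find((value) => value === raw);
  if (match === undefined) {
    throw new Error(`Unknown source_type: ${raw}`);
  }
  return match;
}

function assertNever(value: never): never {
  throw new Error(`Unhandled value: ${String(value)}`);
}

// Hypothetical labels. If a new source type is added to the list above and
// not handled here, the default branch stops compiling.
function describeSource(sourceType: PaymentSourceType): string {
  switch (sourceType) {
    case "BOOKING":
      return "Session booking";
    case "ORDER":
      return "Order";
    case "TOPUP":
      return "Wallet top-up";
    case "PASS_PURCHASE":
      return "Pass purchase";
    case "SUBSCRIPTION_PAYMENT":
      return "Subscription payment";
    default:
      return assertNever(sourceType);
  }
}
```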


7. Soft-delete & retention (medium risk — compliance, disputes)

Most entities use hard delete + CASCADE. Several need historical preservation for finance / compliance.

| Source | Today | Why soft-delete matters |
| --- | --- | --- |
| Company | Hard delete, cascade everywhere | Tenant churn, GDPR right-to-be-forgotten vs invoice retention |
| Company customer | Hard delete, cascade through bookings/wallets | Finance / support need history |
| User | No soft-disable | Admin needs to disable users without Supabase hard-delete |
| Wallet | Hard delete via CASCADE | Financial history loss on customer/company delete |
| Notification | Hard delete via CASCADE on User | Broadcast delivery proof lost when recipient deleted |

Direction: introduce archived_at / deleted_at columns (or per-entity status enum extensions) and replace CASCADE with soft-archive workflows. Hard delete becomes an explicit admin operation, not a side effect.
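
A minimal sketch of the soft-archive idea, assuming an archivedAt field that maps to the proposed archived_at column. The Archivable interface and the archive and onlyActive helpers are hypothetical names for illustration.

```typescript
interface Archivable {
  id: string;
  archivedAt: Date | null; // null means the row is live
}

// Marks a row as archived instead of deleting it.
// Idempotent: archiving an already-archived row keeps the original timestamp.
function archive<T extends Archivable>(row: T, now: Date): T {
  return row.archivedAt === null ? { ...row, archivedAt: now } : row;
}

// Default read path: hide archived rows while leaving them available
// for finance, support, and dispute lookups.
function onlyActive<T extends Archivable>(rows: readonly T[]): T[] {
  return rows.filter((row) => row.archivedAt === null);
}
```

With this shape, dependent rows such as wallets and bookings stay in place when a company or customer is archived, and hard delete remains a separate, explicit operation.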

Scheduled drops (TTL’d backup tables)

These tables exist deliberately, with a hard drop date. They are NOT permanent state — they’re a one-release-cycle safety window after a destructive migration.

| Table | Created by | Drop target | Follow-up ticket |
| --- | --- | --- | --- |
| companies.company_member_legacy_identity | Migration 0040_global_user_identity.sql (ticket 869dee6n5) | 2026-07-15 | 869denpn8 |

If a row in this table is touched (read/update) between creation and drop, that is a signal the migration’s conflict-resolved backfill picked the wrong winner for that user — investigate before the drop date.


8. Cron / retry / timeout policies missing

Several flows have producers but no “what if this never resolves” cleanup.

| Source | Stuck state | Direction |
| --- | --- | --- |
| Payment | pending payments wait forever for a webhook that may never arrive | Cron: after N hours mark as failed with failureReason='webhook_timeout' |
| Booking | SLOT_ATTENDEE post-fact billing failure path is undefined | Add a BILLING_OVERDUE state (or use PENDING_PAYMENT) after N retry attempts |
| Customer pass | AWAITING_PAYMENT waits for LiqPay webhook indefinitely | Cron + timeout, same pattern as Payment |
| Customer pass | Total paused time can be infinite | Cap on cumulative pause days, OR auto-resume after N days |
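
The Payment timeout rule could be expressed as a pure function that a cron job calls, roughly as follows. PaymentRow, expireStalePayments, and the createdAt field are assumptions for this sketch; the failureReason value is the one proposed in the table, and N is passed in as timeoutHours because no value has been decided.

```typescript
// Hypothetical row shape: only the fields the sweep needs.
interface PaymentRow {
  id: string;
  status: "pending" | "succeeded" | "failed";
  createdAt: Date;
  failureReason: string | null;
}

const HOUR_MS = 60 * 60 * 1000;

// Returns a new list in which pending payments older than the timeout are
// marked failed with failureReason 'webhook_timeout'. Other rows pass through unchanged.
function expireStalePayments(
  payments: readonly PaymentRow[],
  now: Date,
  timeoutHours: number,
): PaymentRow[] {
  const cutoff = now.getTime() - timeoutHours * HOUR_MS;
  return payments.map((payment): PaymentRow =>
    payment.status === "pending" && payment.createdAt.getTime() <= cutoff
      ? { ...payment, status: "failed", failureReason: "webhook_timeout" }
      : payment,
  );
}
```

Passing now as a parameter keeps the rule testable without a clock; the same function shape would fit the Customer pass AWAITING_PAYMENT case.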

9. Performance indices (low risk — readiness)

Hot read paths that would benefit from purposeful indices:

| Source | Query | Suggested index |
| --- | --- | --- |
| Notification | Unread count per user | CREATE INDEX ... ON notifications (user_id) WHERE read_at IS NULL |
| Session | Active sessions by activity / sphere | (already has sphere_id); consider (activity_id, starts_at) for upcoming-sessions query |
| Booking | Customer’s upcoming bookings | (customer_id, status) WHERE status IN ('CONFIRMED','PENDING_PAYMENT') |

10. Naming clarity (low risk — onboarding)

Confusing enum names that consistently trip new developers and product folks:

| Source | Today | Better |
| --- | --- | --- |
| Session status | AVAILABLE / BOOKED / CANCELLED | OPEN / FULL / CANCELLED (“BOOKED” wrongly implies “has bookings”) |
| Booking status | PENDING / PENDING_PAYMENT | AWAITING_GATEWAY / AWAITING_TOPUP (current names look like duplicates) |
| Notification | Mixed concerns in one notification_type enum | (Acceptable, but consider tagging events by “kind” if it grows past ~30 values) |

11. Snapshot semantics & user expectations

Several places snapshot a template at purchase time and never re-sync. This is by design but consistently surprises product/CX folks:

  • Customer pass: snapshots Pass at purchase. Template edits do not affect already-bought passes.
  • Booking: price snapshots Session price at booking. Later session price changes do not refund or upcharge.
  • Subscription payment: snapshot of plan + period for that one payment attempt.
  • Wallet transaction: balanceBefore / balanceAfter snapshot per transaction.

Direction: document this pattern explicitly in Domain overview under a heading like “Snapshot semantics”. Today the convention is consistent but undocumented.
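
A short example that such a “Snapshot semantics” section could carry, using the Booking price case. The SessionLike and BookingLike shapes and the bookSession function are hypothetical simplifications; the point is only that the value is copied at booking time and never re-read.

```typescript
// Simplified, hypothetical shapes for illustration.
interface SessionLike {
  id: string;
  price: number;
}

interface BookingLike {
  id: string;
  sessionId: string;
  price: number; // snapshot of the session price at booking time
}

function bookSession(session: SessionLike, bookingId: string): BookingLike {
  // Copy the value; the booking keeps no live link to the session price.
  return { id: bookingId, sessionId: session.id, price: session.price };
}

// Later edits to the session do not change existing bookings.
const yoga: SessionLike = { id: "s1", price: 300 };
const earlyBooking = bookSession(yoga, "b1");
yoga.price = 350;
const lateBooking = bookSession(yoga, "b2");
// earlyBooking.price is still 300; lateBooking.price is 350.
```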


12. Credentials & secrets

| Source | Today | Direction |
| --- | --- | --- |
| Payment settings sibling tables (liqpay, mono_settings) | Plain-text storage of privateKey and token | Secrets manager (Vault / AWS Secrets Manager / GCP Secret Manager) OR pgcrypto + KMS-managed key |
| Company subscription | mono_wallet_id (saved-card token) plain text | Same: encrypt at rest |

Risk if ignored: a DB dump exposes live payment credentials.


13. Recurrence model — weekly only

Time slot supports only dayOfWeek + startTime. No daily / monthly / specific-dates / interval-based recurrence.

Direction: add a nullable rrule text column (RFC 5545 RRULE). Falls back to the existing weekly fields when NULL; materialisation pipeline reads rrule first.
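
The fallback could be resolved in one place, as in the sketch below. It assumes that dayOfWeek uses 0 = Sunday (the source does not state the convention) and that startTime is already a valid HH:MM string; TimeSlotLike and effectiveRrule are hypothetical names. The weekly fields are translated into RFC 5545 rule parts (FREQ, BYDAY, BYHOUR, BYMINUTE) so that downstream code only ever handles an RRULE string.

```typescript
interface TimeSlotLike {
  dayOfWeek: number; // 0-6; ASSUMPTION: 0 = Sunday
  startTime: string; // "HH:MM"
  rrule: string | null; // proposed nullable column
}

// RFC 5545 weekday codes, indexed by the assumed dayOfWeek convention.
const BYDAY_CODES = ["SU", "MO", "TU", "WE", "TH", "FR", "SA"] as const;

// Returns the stored rrule when present, otherwise derives a weekly rule
// from the legacy dayOfWeek + startTime fields.
function effectiveRrule(slot: TimeSlotLike): string {
  if (slot.rrule !== null) {
    return slot.rrule;
  }
  const day = BYDAY_CODES[slot.dayOfWeek];
  if (day === undefined) {
    throw new Error(`dayOfWeek out of range: ${slot.dayOfWeek}`);
  }
  const parts = slot.startTime.split(":").map(Number);
  const hour = parts[0];
  const minute = parts[1];
  if (parts.length !== 2 || !Number.isInteger(hour) || !Number.isInteger(minute)) {
    throw new Error(`startTime is not HH:MM: ${slot.startTime}`);
  }
  return `FREQ=WEEKLY;BYDAY=${day};BYHOUR=${hour};BYMINUTE=${minute}`;
}
```

For example, a slot with dayOfWeek 1, startTime "09:30", and a NULL rrule resolves to FREQ=WEEKLY;BYDAY=MO;BYHOUR=9;BYMINUTE=30 under the stated assumption.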


14. Documentation & observability gaps

  • BullMQ queues: session generation, broadcast fan-out, and potentially others run through BullMQ. Queue names, payload shapes, and retry/dedup config exist only in code. Document them in the Catalog and Notifications context indexes.
  • Push delivery status: Device token doesn’t track per-notification delivery success/failure.
  • Broadcast delivery quality: Message sends to a count, but offline customers are silently skipped; the admin UI doesn’t show “20% of audience didn’t receive”.
  • Subscription messages limit: assertMessagesLimit counts all resolved customers, including offline ones who never receive the message, so companies are billed for ghosts.

How to use this list

  1. Triaging: group items by theme and propose them as separate spec tickets (/new-ticket). Each theme is its own spec — don’t bundle “all DB constraints” into one giant migration.
  2. Per-spec impact: before opening a ticket, check the source entity’s Recommendations section for the latest framing — items here may have additional context the original entity owner added.
  3. Priority hint: themes 1, 4, 12 are high-risk (financial / compliance / security). Themes 2, 3, 7 are medium. Themes 6, 9, 10 are low-priority polish.

Snapshot date: 2026-05-22. Items will be moved to “done” or “deferred” as tickets are completed; see git history of this file for the journey.