Improvement backlog
A consolidated catalogue of forward-looking improvements surfaced while filling the lead-entity docs (Phase 2 of the domain documentation initiative, May 2026). Each item names its source entity (or entities), describes the gap, and proposes a direction.
This list is not a planning document — it is a discovery snapshot. Use it as input to ticket creation, not as a roadmap.
Every item here is also listed in the Recommendations section of its source entity doc. This page groups them by theme so they can be triaged as a single concern (e.g. all idempotency gaps planned together).
How to read
- Source — entity (or entities) where the gap was observed.
- Gap — what’s missing or fragile today.
- Direction — suggested fix, not a final design.
- Risk if ignored — what can go wrong if the gap stays open.
1. Idempotency & race conditions (high risk — financial)
| Source | Gap | Direction |
|---|---|---|
| Payment | externalId (gateway id) is not UNIQUE — webhook re-delivery can produce duplicate processing | Partial UNIQUE (external_id) WHERE external_id IS NOT NULL |
| Payment | createPayment is not idempotent — client retry produces two gateway checkouts | Idempotency key in the create DTO |
| Customer pass | LiqPay webhook into pass purchase has no DB-level dedup | Same partial UNIQUE pattern as Payment |
| Company subscription | Billing webhook (Mono) has no DB-level dedup | Same partial UNIQUE pattern |
| Wallet | Wallet-to-wallet transfers (future) cannot rely on compare-and-swap UPDATE — need SELECT FOR UPDATE | Explicit row-level locking when multi-row mutations land |
| Session | syncStatus is not transactionally isolated against booking writes — race window between count-read and status-write | SELECT ... FOR UPDATE on the session row or pg_advisory_lock |
| Booking | No lock on (sessionId, customerId) during booking insert — concurrent requests can create duplicate bookings | Unique index + ON CONFLICT, or row-level lock on session before insert |
| Message | Fan-out is Promise.all after the transaction commits — partial fan-out on crash, duplicates on retry | Either transactional fan-out OR UNIQUE (message_id, customer_id) with ON CONFLICT DO NOTHING |
Risk if ignored: double-charging, double-issuing, double-notification, lost revenue, customer-trust events.
2. Missing DB constraints (medium risk — schema integrity)
| Source | Gap | Direction |
|---|---|---|
| Company member | One OWNER per company is enforced only in CompanyMembersService | Partial UNIQUE (company_id) WHERE role = 'OWNER' |
| Company member | (company_id, user_id) not UNIQUE — same user can theoretically appear twice in one company | UNIQUE constraint |
| Pass entitlement template | (pass_id, activity_id) not UNIQUE — duplicates allowed today | UNIQUE constraint (already OPEN) |
| Activity | activity.company_id == category.company_id (or category is platform-wide) is not validated when linking | Assertion in activity-category.service before insert |
| Payment settings | (company_id, platform) not UNIQUE — multiple LiqPay/Mono configs per company are possible | UNIQUE constraint |
| Sphere | default_activity_type ∈ allowed_activity_types not CHECK’d | DB CHECK |
| Time slot | start_time is text without format CHECK | CHECK (start_time ~ '^([01]?\d|2[0-3]):[0-5]\d$') |
| Time slot | day_of_week has no range CHECK | CHECK (day_of_week BETWEEN 0 AND 6) |
| Session | starts_at < ends_at not CHECK’d | DB CHECK |
| Pass | validity_days > 0 not CHECK’d | DB CHECK |
| Customer pass | valid_until > activated_at not CHECK’d when both set | DB CHECK with NULL handling |
| Message | (target_type, target_id) consistency not CHECK’d | CHECK ((target_type IN ('ALL','MANUAL') AND target_id IS NULL) OR (target_type IN ('ACTIVITY','SESSION') AND target_id IS NOT NULL)) |
| Company | owner_id IS NULL is allowed indefinitely — only the initial creation transaction needs it | Service-level guard OR DB trigger that prevents update-to-NULL after creation |
| User | users.scope is nullable — allows gradual backfill but leaves orphan rows ambiguous | Flip to NOT NULL once count(*) WHERE scope IS NULL = 0 is stable for a release cycle; track via cron or one-shot ticket (auth-sync writes the value on every login, so orphans heal naturally) |
Risk if ignored: data drift, integrity violations that only surface at use time, hard-to-reproduce production bugs.
3. State-machine enforcement (medium risk)
Most status enums have no DB-level transition rules and no audit log of who/when changed status. Transitions ride generic PATCH update endpoints, scattered across services.
| Source | State machine | Where transitions live today |
|---|---|---|
| Activity | DRAFT ↔ PUBLISHED, * → CANCELLED, * → ARCHIVED | Generic PATCH; only → ARCHIVED has a dedicated path |
| Booking | 5-state machine (PENDING, CONFIRMED, CANCELLED, REFUNDED, PENDING_PAYMENT) | Scattered across bookings.service.ts and bookings-client.service.ts |
| Company subscription | trialing → active → past_due → cancelled | Three sources (admin controller, webhook, three cron methods) |
| Customer pass | 6-state machine (AWAITING_PAYMENT → PENDING → ACTIVE ↔ PAUSED → EXPIRED / CANCELLED) | Mixed: service + webhook + scheduler |
| Session | AVAILABLE ↔ BOOKED (auto-sync) / * → CANCELLED (admin) | syncStatus plus admin PATCH |
| Payment | One-shot pending → succeeded / failed | processWebhook |
Direction:
- Typed state-machine helpers per entity (TS) that validate source status before any UPDATE.
- DB-level CHECK constraints on allowed transitions (PostgreSQL
BEFORE UPDATEtrigger or stored function). - Dedicated transition endpoints (
/publish,/cancel,/transfer-ownership) instead of generic PATCH — each with explicit DTO and audit logging.
Risk if ignored: illegal transitions slip through (CANCELLED → CONFIRMED, cancelled subscription → active), hard-to-debug status corruption, no record for disputes.
4. Audit logging gaps (high risk — disputes, compliance)
No entity except Sphere (which has Sphere audit log) has a per-row mutation log. Disputes today rely on inferring “what happened” from updated_at timestamps and wallet_transactions snapshots.
| Source | Missing log | Why it matters |
|---|---|---|
| Booking | Status-transition log | ”Why was my booking cancelled?” |
| Customer pass | Status-transition log | ”When and why did my pass expire / get cancelled?” |
| Pass | Template-change log | ”I bought it under different terms” |
| Company subscription | Status + plan-change log | Financial audit |
| Payment | Status-transition log | Financial audit, reconciliation |
| Payment settings | Credential-change log | Security incident response — “who rotated the LiqPay key when” |
| Company member | Role-change / transfer-ownership log | ”Who promoted whom OWNER” |
| Notification | (optional) read/unread history | Lower priority — UX feature |
Direction: a generic audit_log table or per-entity sibling table — {entity, entity_id, actor_user_id, before, after, reason, timestamp}. Wire into the service layer via a generic decorator.
5. Cross-schema reference strategy (revise ADR)
The original ADR claimed “cross-schema references are never FK’d”. Reality found during Phase 2 is mixed:
- No FK (intentional decoupling):
activities.activities.company_id,activities.locations.company_id,activities.categories.company_id,payments.payment_settings.company_id,notifications.message.target_id(polymorphic),payments.payments.source_id(polymorphic). - FK with RESTRICT (template-tier ownership):
passes.pass.company_id,passes.customer_pass.customer_id,passes.customer_pass.pass_id,passes.pass_entitlement_template.pass_id,passes.pass_entitlement_template.activity_id. - FK with CASCADE (tight booking-side coupling):
bookings.bookings.session_id,bookings.bookings.customer_id,wallet.wallet.customer_id,wallet.wallet.company_id,wallet.refund_request.booking_id,notifications.message.company_id,notifications.message.sent_by.
Direction: revise the ADR to document the three-way pattern explicitly: when to omit FK, when to use RESTRICT, when to use CASCADE. Add a checklist to the architect agent.
Risk if ignored: new developers assume “no cross-schema FK ever” and miss the dependency between passes and companies, or create accidental cross-schema CASCADE chains.
6. Polymorphic columns are plain text (low risk, but tech-debt)
| Source | Column | Known values |
|---|---|---|
| Payment | source_type | BOOKING, ORDER (default), TOPUP, PASS_PURCHASE, SUBSCRIPTION_PAYMENT |
| Wallet transaction | source_type | BOOKING, MANUAL_DEBIT, MANUAL_CREDIT, REFUND, LIQPAY, … |
| Company subscription | renewal_gateway | implied: liqpay, mono |
| Wallet | currency | UAH (only one in use today) |
Direction: promote all of these to Postgres enums. Schema migrations are cheap; the visibility gain at insert/select is real (no typos, IDE autocomplete, exhaustiveness checks in TS).
7. Soft-delete & retention (medium risk — compliance, disputes)
Most entities use hard delete + CASCADE. Several need historical preservation for finance / compliance.
| Source | Today | Why soft-delete matters |
|---|---|---|
| Company | Hard delete, cascade everywhere | Tenant churn, GDPR right-to-be-forgotten vs invoice retention |
| Company customer | Hard delete, cascade through bookings/wallets | Finance / support need history |
| User | No soft-disable | Admin needs to disable users without Supabase hard-delete |
| Wallet | Hard delete via CASCADE | Financial history loss on customer/company delete |
| Notification | Hard delete via CASCADE on User | Broadcast delivery proof lost when recipient deleted |
Direction: introduce archived_at / deleted_at columns (or per-entity status enum extensions) and replace CASCADE with soft-archive workflows. Hard delete becomes an explicit admin operation, not a side effect.
Scheduled drops (TTL’d backup tables)
These tables exist deliberately, with a hard drop date. They are NOT permanent state — they’re a one-release-cycle safety window after a destructive migration.
| Table | Created by | Drop target | Follow-up ticket |
|---|---|---|---|
companies.company_member_legacy_identity | Migration 0040_global_user_identity.sql (ticket 869dee6n5) | 2026-07-15 | 869denpn8 |
If a row in this table is touched (read/update) between creation and drop, that is a signal the migration’s conflict-resolved backfill picked the wrong winner for that user — investigate before the drop date.
8. Cron / retry / timeout policies missing
Several flows have producers but no “what if this never resolves” cleanup.
| Source | Stuck state | Direction |
|---|---|---|
| Payment | pending payments wait forever for a webhook that may never arrive | Cron: after N hours mark as failed with failureReason='webhook_timeout' |
| Booking | SLOT_ATTENDEE post-fact billing failure path is undefined | Add a BILLING_OVERDUE state (or use PENDING_PAYMENT) after N retry attempts |
| Customer pass | AWAITING_PAYMENT waits for LiqPay webhook indefinitely | Cron + timeout, same pattern as Payment |
| Customer pass | Total paused time can be infinite | Cap on cumulative pause days, OR auto-resume after N days |
9. Performance indices (low risk — readiness)
Hot read paths that would benefit from purposeful indices:
| Source | Query | Suggested index |
|---|---|---|
| Notification | Unread count per user | CREATE INDEX ... ON notifications (user_id) WHERE read_at IS NULL |
| Session | Active sessions by activity / sphere | (already has sphere_id); consider (activity_id, starts_at) for upcoming-sessions query |
| Booking | Customer’s upcoming bookings | (customer_id, status) WHERE status IN ('CONFIRMED','PENDING_PAYMENT') |
10. Naming clarity (low risk — onboarding)
Confusing enum names that consistently trip new developers and product folks:
| Source | Today | Better |
|---|---|---|
Session status | AVAILABLE / BOOKED / CANCELLED | OPEN / FULL / CANCELLED — “BOOKED” wrongly implies “has bookings” |
Booking status | PENDING / PENDING_PAYMENT | AWAITING_GATEWAY / AWAITING_TOPUP — current names look like duplicates |
| Notification | Mixed concerns in one notification_type enum | (Acceptable, but consider tagging events by “kind” if it grows past ~30 values) |
11. Snapshot semantics & user expectations
Several places snapshot a template at purchase time and never re-sync. This is by design but consistently surprises product/CX folks:
- Customer pass — snapshots Pass at purchase. Template edits do not affect already-bought passes.
- Booking —
pricesnapshots Session price at booking. Later session price changes do not refund or upcharge. - Subscription payment — snapshot of plan + period for that one payment attempt.
- Wallet transaction —
balanceBefore/balanceAftersnapshot per transaction.
Direction: document this pattern explicitly in Domain overview under a heading like “Snapshot semantics”. Today the convention is consistent but undocumented.
12. Credentials & secrets
| Source | Today | Direction |
|---|---|---|
Payment settings sibling tables (liqpay, mono_settings) | Plain-text storage of privateKey and token | Secrets manager (Vault / AWS Secrets / GCP Secret Manager) OR pgcrypto + KMS-managed key |
| Company subscription | mono_wallet_id (saved-card token) plain text | Same — encrypt at rest |
Risk if ignored: a DB dump exposes live payment credentials.
13. Recurrence model — weekly only
Time slot supports only dayOfWeek + startTime. No daily / monthly / specific-dates / interval-based recurrence.
Direction: add a nullable rrule text column (RFC 5545 RRULE). Falls back to the existing weekly fields when NULL; materialisation pipeline reads rrule first.
14. Documentation & observability gaps
- BullMQ queues — session generation, broadcast fan-out, and potentially others run through BullMQ. Queue names, payload shapes, retry/dedup config are only in code. Document in Catalog, Notifications context indexes.
- Push delivery status — Device token doesn’t track per-notification delivery success/failure.
- Broadcast delivery quality — Message sends to a count but offline customers are silently skipped; admin UI doesn’t show “20% of audience didn’t receive”.
- Subscription messages limit —
assertMessagesLimitcounts all resolved customers, including offline ones who never receive — billing for ghosts.
How to use this list
- Triaging: group items by theme and propose them as separate spec tickets (
/new-ticket). Each theme is its own spec — don’t bundle “all DB constraints” into one giant migration. - Per-spec impact: before opening a ticket, check the source entity’s
Recommendationssection for the latest framing — items here may have additional context the original entity owner added. - Priority hint: themes 1, 4, 12 are high-risk (financial / compliance / security). Themes 2, 3, 7 are medium. Themes 6, 9, 10 are low-priority polish.
Snapshot date: 2026-05-22. Items will be moved to “done” or “deferred” as tickets are completed; see git history of this file for the journey.