
10.03 Async Event Flow And Pub/Sub

Status: draft for discussion

1. Purpose of the document

This document records product-level decisions on the asynchronous event flow for the MVP.

The document does not define exact Pub/Sub topic names, subscriptions, retry config, or message schema.

It records:

  • which business events are needed;
  • which flows must be asynchronous;
  • what must appear in the transaction timeline;
  • how the system must behave when events fail;
  • which decisions are left to the development team.

2. Core principle

The MVP must be event-driven.

Gateways must accept external requests/callbacks quickly and hand off processing to a durable async pipeline.

Core processing may take time, but merchant/provider/customer entry points must not depend on synchronous full processing.

3. Merchant deposit create flow

The merchant calls:

http
POST /api/v2/deposit

Merchant Gateway must return the hosted payment URL once the request/session has been safely accepted.

Product decision:

  • Merchant Gateway must generate the payment session immediately and return the hosted payment URL.
  • How exactly the transaction record is created is a development team decision.
  • The transaction ID generation boundary is also a development team decision.

Возможные technical options:

  1. Merchant Gateway creates a minimal transaction/session record and publishes an event.
  2. Merchant Gateway creates a request/session record, and Orchestrator Service creates the transaction record.
  3. Merchant Gateway calls Orchestrator synchronously only for minimal accepted-state creation, then processing continues async.

Product requirement:

  • the merchant receives a response only after safe durable acceptance;
  • hosted payment URL available immediately;
  • full routing/provider execution happens asynchronously;
  • accepted request must not be lost;
  • duplicate merchant transactionId must not create duplicate transaction.
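
The requirements above can be sketched as a minimal in-memory model. Everything in it is an assumption made for illustration: the class name, the `(merchant_id, transaction_id)` key, the `pay.example.test` URL, and the plain Python list standing in for the durable queue. It corresponds loosely to technical option 1 and does not prescribe where the transaction record is created.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryMerchantGateway:
    """Hypothetical stand-in for Merchant Gateway storage plus a durable queue."""
    sessions: dict = field(default_factory=dict)   # (merchant_id, transaction_id) -> session_id
    published: list = field(default_factory=list)  # events handed to the async pipeline

    def _url(self, session_id: str) -> str:
        # Placeholder host; the real hosted payment URL format is not defined here.
        return f"https://pay.example.test/s/{session_id}"

    def accept_deposit(self, merchant_id: str, transaction_id: str, amount: str) -> dict:
        key = (merchant_id, transaction_id)
        # Duplicate merchant transactionId: reuse the existing session, publish nothing.
        if key in self.sessions:
            session_id = self.sessions[key]
            return {"paymentUrl": self._url(session_id), "duplicate": True}

        session_id = uuid.uuid4().hex
        # Record the session and hand off the event before responding.
        # A real system would need these two steps to be atomic or recoverable.
        self.sessions[key] = session_id
        self.published.append({
            "type": "DepositRequested",
            "merchantId": merchant_id,
            "transactionId": transaction_id,
            "sessionId": session_id,
            "amount": amount,
        })
        return {"paymentUrl": self._url(session_id), "duplicate": False}


gateway = InMemoryMerchantGateway()
first = gateway.accept_deposit("merchant-1", "tx-1", "10.00")
second = gateway.accept_deposit("merchant-1", "tx-1", "10.00")
```

In this sketch the second call returns the same hosted payment URL and leaves exactly one `DepositRequested` event in the queue. Whether a duplicate request should return the existing URL or an error is not decided in this document.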

4. Hosted form while processing is not ready

If the hosted form is open but Orchestrator Service has not yet created or prepared the transaction state, the customer must see a loading state.

The hosted form must not show technical statuses to the customer.

Customer-facing behavior:

  • show loading;
  • poll session/transaction state;
  • when missing fields are required, show dynamic form;
  • when provider redirect URL is ready, redirect customer;
  • when transaction is rejected/unavailable, show generic customer-facing error.
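
The customer-facing behavior above amounts to a mapping from internal state to a small set of views. The sketch below is illustrative only: the status names (`REJECTED`, `UNAVAILABLE`), the field names (`redirect_url`, `missing_fields`), and the error text are assumptions, not part of this document.

```python
from typing import Optional

# Hypothetical internal status names; the real lifecycle is defined elsewhere.
TERMINAL_ERROR_STATUSES = {"REJECTED", "UNAVAILABLE"}


def customer_view(state: Optional[dict]) -> dict:
    """Map internal session/transaction state to what the hosted form shows."""
    if state is None:
        # Orchestrator has not created or prepared the state yet.
        return {"view": "loading"}
    if state.get("status") in TERMINAL_ERROR_STATUSES:
        # Generic message only: no technical status or reason is exposed.
        return {"view": "error", "message": "Payment is unavailable. Please try again later."}
    if state.get("redirect_url"):
        return {"view": "redirect", "url": state["redirect_url"]}
    if state.get("missing_fields"):
        return {"view": "form", "fields": list(state["missing_fields"])}
    # Anything else is still in progress: keep polling.
    return {"view": "loading"}
```

The hosted form would call something like this on every poll and render only the returned view.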

5. Provider callback flow

Provider Gateway must accept a callback even if the transaction is not ready yet.

Product decision:

  • Provider Gateway receives callback;
  • Provider Gateway queues callback first;
  • Provider Gateway responds OK only after durable queue publish;
  • Orchestrator Service processes callback later;
  • callback can be stored/processed later if transaction is not ready.

Question: "What is the alternative?"

In theory, the gateway could synchronously look up the transaction and validate the callback before responding to the provider, but this is undesirable for the MVP because:

  • provider endpoint would depend on core availability;
  • callback could timeout during internal processing;
  • provider retry behavior can be unpredictable;
  • callback acceptance should be resilient.

Therefore callback gateway should be async-oriented.
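
A minimal sketch of the "queue first, respond after" rule follows. It assumes a hypothetical `publish` callable that returns `True` only once the message is durably stored, and it uses HTTP 200 and 503 as example response codes; neither is specified by this document.

```python
from typing import Callable


def handle_provider_callback(provider: str, raw_body: str,
                             publish: Callable[[dict], bool]) -> int:
    """Return the HTTP status Provider Gateway would send back to the provider."""
    event = {
        "type": "ProviderCallbackReceived",
        "provider": provider,
        "body": raw_body,  # stored as received; validation happens later in Orchestrator
    }
    # No transaction lookup here: the callback is accepted even if the
    # transaction is not ready yet.
    if publish(event):
        return 200  # OK only after durable queue publish
    return 503      # not queued, so not acknowledged; the provider may retry


queue = []
ok_status = handle_provider_callback("provider-x", '{"id": "abc"}',
                                     lambda e: queue.append(e) or True)
failed_status = handle_provider_callback("provider-x", '{"id": "abc"}',
                                         lambda e: False)
```

The gateway never touches core services in this path, so provider-facing availability depends only on the queue.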

6. MVP business events

The MVP must support the following business events:

  • DepositRequested;
  • PaymentSessionCreated;
  • RoutingStarted;
  • RoutingCompleted;
  • ProviderRequestStarted;
  • ProviderRequestSucceeded;
  • ProviderRequestFailed;
  • ProviderCallbackReceived;
  • ProviderCallbackProcessed;
  • TransactionStatusChanged;
  • MerchantWebhookRequested;
  • MerchantWebhookDelivered;
  • MerchantWebhookFailed;
  • MerchantWebhookRetryScheduled;
  • MerchantWebhookRetryExhausted.

Exact event names can be changed by development team.

Product requirement is not the exact name, but the fact that these business moments are traceable and processable.

7. Transaction timeline

Every important business event must be reflected in the transaction timeline.

Timeline should include:

  • deposit requested;
  • payment session created;
  • routing started;
  • routing completed;
  • routing failed/rejected;
  • provider request started;
  • provider request succeeded;
  • provider request failed;
  • provider callback received;
  • provider callback ignored/rejected/duplicate;
  • provider callback processed;
  • status changed;
  • merchant webhook requested;
  • merchant webhook delivered;
  • merchant webhook failed;
  • merchant webhook retry scheduled;
  • merchant webhook retry exhausted;
  • failed event processing, if it affects transaction investigation.

Timeline visibility rules still apply.

Merchant user should see merchant-safe interpretation.

Platform user with permission can see technical/internal details.
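
One way to picture the two visibility levels is a single stored entry rendered differently per viewer. The field names and the boolean permission flag below are hypothetical; the actual visibility rules are defined outside this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimelineEntry:
    event: str             # business event name, e.g. "ProviderRequestFailed"
    merchant_message: str  # merchant-safe interpretation
    internal_detail: str   # technical detail for permitted platform users


def render_timeline(entries, can_see_internal: bool) -> list:
    """Render timeline rows for a merchant user or a permitted platform user."""
    rows = []
    for entry in entries:
        row = {"message": entry.merchant_message}
        if can_see_internal:
            # Technical/internal details are added only for permitted viewers.
            row["event"] = entry.event
            row["detail"] = entry.internal_detail
        rows.append(row)
    return rows


entries = [
    TimelineEntry("ProviderRequestFailed", "Payment attempt was not completed",
                  "provider timeout on first candidate"),
]
merchant_rows = render_timeline(entries, can_see_internal=False)
platform_rows = render_timeline(entries, can_see_internal=True)
```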

8. Failed event processing

If an event is processed with an error, the failure must be recorded.

Failed processing should be visible:

  • in technical logs/metrics;
  • in the transaction timeline, if the event relates to a transaction and helps investigation;
  • in alerting/monitoring, if failure affects processing.

Product reason:

  • support/platform must understand where transaction got stuck;
  • failed async processing must not disappear silently;
  • retry/fallback behavior must be explainable.

9. Dead-letter queue explanation

A dead-letter queue (DLQ) is a technical quarantine for messages/events that could not be processed after retries.

Example:

text
Event failed 10 times
System stops retrying it forever
Event goes to dead-letter queue
Ops/dev investigate why it failed

Why it is needed:

  • so that a broken event does not loop forever;
  • so that the queue is not clogged by a single bad message;
  • so that ops/dev can investigate the failed event;
  • so that the system has a safe place for poison messages.

The product owner does not need to design a DLQ UI or the exact retry count.

Product requirement:

  • failed events must not be lost silently;
  • failed events must be observable;
  • system must have a technical way to isolate events that cannot be processed after retries.

The MVP does not need a separate Back Office UI for the DLQ.

Development/ops team should decide whether to use Pub/Sub dead-letter topics, custom failed-event table, alerting, or another mechanism.
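
To make the concept concrete, here is a small sketch of "retry a bounded number of times, then isolate". It is a hand-rolled illustration with a hypothetical `max_attempts` value and an in-memory list as the dead-letter store; a managed mechanism such as Pub/Sub dead-letter topics would normally handle this through configuration rather than application code.

```python
def consume(event: dict, handler, dead_letters: list, max_attempts: int = 3) -> bool:
    """Try to process an event; after max_attempts failures, move it to dead_letters."""
    last_error = None
    for _attempt in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:  # broad on purpose: any failure counts as a failed attempt
            last_error = repr(exc)
    # Retries exhausted: isolate the event with enough context to investigate.
    dead_letters.append({"event": event, "attempts": max_attempts, "error": last_error})
    return False


def always_fails(event: dict) -> None:
    """Example poison-message handler used only for this demonstration."""
    raise ValueError("malformed payload")


dead_letters = []
processed = consume({"id": "evt-1"}, always_fails, dead_letters)
healthy = consume({"id": "evt-2"}, lambda e: None, dead_letters)
```

The failed event is kept together with its attempt count and last error, which covers the "not lost silently" and "observable" requirements at the smallest possible scale.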

10. Merchant webhook sending

Which component sends merchant webhooks is a development team decision.

Allowed options:

  • Orchestrator Service module/worker;
  • dedicated webhook worker inside same deployable service;
  • separate lightweight worker, if development team justifies it.

Product requirements:

  • webhook sending is async;
  • webhook retry is supported;
  • webhook delivery attempts are stored;
  • webhook timeline events are stored;
  • webhook resend from Back Office uses original transaction webhook URLs;
  • webhook retry does not block transaction creation.

11. Merchant webhook retry events

Webhook retry should be represented as events.

Minimum events:

  • MerchantWebhookRequested;
  • MerchantWebhookFailed;
  • MerchantWebhookRetryScheduled;
  • MerchantWebhookDelivered;
  • MerchantWebhookRetryExhausted.

Retry behavior:

  • exponential backoff;
  • exact intervals/max duration defined by development team;
  • any non-2xx response is failed attempt;
  • any 2xx response is delivered.
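
The retry rules above can be illustrated with a short simulation. The base delay, factor, and attempt limit are placeholder values, since exact intervals and maximum duration are left to the development team; the function names are hypothetical.

```python
def is_delivered(status_code: int) -> bool:
    """Any 2xx response counts as delivered; everything else is a failed attempt."""
    return 200 <= status_code <= 299


def backoff_delays(base_seconds: int, factor: int, retries: int) -> list:
    """Delay before each retry: base, base*factor, base*factor**2, ..."""
    return [base_seconds * factor ** i for i in range(retries)]


def simulate_delivery(status_codes: list, max_attempts: int) -> list:
    """Return the webhook events produced for a sequence of merchant responses."""
    events = ["MerchantWebhookRequested"]
    for attempt, status in enumerate(status_codes[:max_attempts], start=1):
        if is_delivered(status):
            events.append("MerchantWebhookDelivered")
            return events
        events.append("MerchantWebhookFailed")
        if attempt < max_attempts:
            events.append("MerchantWebhookRetryScheduled")
        else:
            events.append("MerchantWebhookRetryExhausted")
    return events


delays = backoff_delays(base_seconds=30, factor=2, retries=4)  # [30, 60, 120, 240]
recovered = simulate_delivery([500, 302, 204], max_attempts=5)
exhausted = simulate_delivery([500, 500], max_attempts=2)
```

Note that the 302 response in the example is treated as a failed attempt, in line with the "any non-2xx" rule.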

12. Provider request retry

Provider request retry is not required in MVP as separate retry mechanism.

MVP behavior:

  • if provider/SUB MID execution fails because of provider error/timeout/velocity/inactive config, system uses routing fallback;
  • fallback goes to next candidate according to routing rules and group fallback configuration;
  • if no fallback path remains, transaction becomes REJECTED or appropriate failed internal state according to transaction lifecycle rules.

Provider-specific retry may be added later if needed, but it is not MVP requirement.
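
The fallback behavior can be sketched as a loop over routing candidates with no per-provider retry. The candidate names and the `execute` callable (returning `True` on success) are hypothetical, and the ordered candidate list is assumed to come from routing rules and group fallback configuration defined elsewhere.

```python
from typing import Callable, Optional, Tuple


def execute_with_fallback(candidates: list,
                          execute: Callable[[str], bool]) -> Tuple[Optional[str], list]:
    """Try each candidate once, in order; return the one that succeeded and the timeline."""
    timeline = []
    for candidate in candidates:
        timeline.append(("ProviderRequestStarted", candidate))
        if execute(candidate):
            timeline.append(("ProviderRequestSucceeded", candidate))
            return candidate, timeline
        # Provider error/timeout/velocity/inactive config: no retry, move to next candidate.
        timeline.append(("ProviderRequestFailed", candidate))
    # No fallback path remains: the caller moves the transaction to REJECTED
    # or another failed state according to the lifecycle rules.
    return None, timeline


chosen, timeline = execute_with_fallback(
    ["sub-mid-a", "sub-mid-b", "sub-mid-c"],
    lambda candidate: candidate == "sub-mid-b",
)
none_left, failed_timeline = execute_with_fallback(["sub-mid-a"], lambda candidate: False)
```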

13. Idempotency

Exact idempotency implementation is development team decision.

Product requirements:

  • duplicate merchant transactionId must not create duplicate transaction;
  • duplicate provider callbacks must not create duplicate status changes/webhooks;
  • duplicate queue events must not create duplicate side effects;
  • repeated webhook sending should be controlled by delivery attempt records/retry logic;
  • event processing should be traceable.

Development team can choose:

  • event ID based idempotency;
  • inbox table;
  • unique constraints;
  • idempotency keys;
  • message deduplication;
  • another reliable approach.
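
As one illustration of the options above, the sketch below combines an inbox table with a unique constraint, using an in-memory SQLite database. The table and column names are hypothetical, and this is only one possible approach, not a recommendation over the others.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The PRIMARY KEY on event_id is the unique constraint that rejects duplicates.
    conn.execute("CREATE TABLE inbox (event_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE status_changes (transaction_id TEXT, status TEXT)")
    return conn


def apply_status_change(conn: sqlite3.Connection, event_id: str,
                        transaction_id: str, status: str) -> bool:
    """Apply the side effect once per event_id; return False for a duplicate event."""
    try:
        with conn:  # one transaction: inbox row and side effect commit or roll back together
            conn.execute("INSERT INTO inbox (event_id) VALUES (?)", (event_id,))
            conn.execute("INSERT INTO status_changes VALUES (?, ?)",
                         (transaction_id, status))
    except sqlite3.IntegrityError:
        return False  # event_id already processed: no second status change
    return True


store = make_store()
first_apply = apply_status_change(store, "evt-42", "tx-1", "APPROVED")
second_apply = apply_status_change(store, "evt-42", "tx-1", "APPROVED")
change_count = store.execute("SELECT COUNT(*) FROM status_changes").fetchone()[0]
```

Writing the inbox row and the side effect in the same database transaction is what prevents a redelivered event from producing a second status change or webhook.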

14. Event ownership

Exact publisher/consumer ownership is technical design.

Product-level expectation:

| Event | Likely publisher | Likely consumer |
| --- | --- | --- |
| DepositRequested | Merchant Gateway | Orchestrator Service |
| PaymentSessionCreated | Merchant Gateway or Orchestrator Service | Forms Service / Orchestrator Service |
| RoutingStarted | Orchestrator Service | Timeline / monitoring |
| RoutingCompleted | Orchestrator Service | Timeline / provider execution |
| ProviderRequestStarted | Orchestrator Service | Timeline / monitoring |
| ProviderCallbackReceived | Provider Gateway | Orchestrator Service |
| ProviderCallbackProcessed | Orchestrator Service | Transaction lifecycle / timeline |
| TransactionStatusChanged | Orchestrator Service | Merchant webhook pipeline / timeline |
| MerchantWebhookRequested | Orchestrator Service | Webhook worker/module |
| MerchantWebhookRetryScheduled | Webhook worker/module | Webhook worker/module |

This table is indicative, not final technical contract.

15. Acceptance Criteria

Async Event Flow is considered agreed for the MVP if:

  • Merchant Gateway returns hosted payment URL after durable acceptance.
  • Transaction/session creation boundary is development team decision.
  • Hosted form shows loading while transaction/session processing is not ready.
  • Provider Gateway accepts callbacks asynchronously.
  • Provider Gateway queues callback before OK response.
  • MVP business events are listed.
  • Important business events are reflected in transaction timeline.
  • Failed event processing is saved and visible for investigation.
  • DLQ concept is explained as technical mechanism.
  • Separate DLQ UI is not required in MVP.
  • Merchant webhook retry events are required.
  • Provider request retry is not required; routing fallback is MVP behavior.
  • Event idempotency implementation is development team decision.

16. Open Questions

There are no open product questions.

Technical/design follow-up:

  1. The development team must define the exact Pub/Sub topics/subscriptions.
  2. The development team must define the exact event schemas.
  3. The development team must define the retry/dead-letter strategy.
  4. The development team must define the idempotency approach.
  5. The development team must define the exact publisher/consumer ownership.
  6. The development team must define how the timeline is written from async events.