Topic
10.03 Async Event Flow And Pub/Sub
Status: draft for discussion
1. Purpose of the document
This document records product-level decisions on the asynchronous event flow for the MVP.
It does not define exact Pub/Sub topic names, subscriptions, retry configuration, or message schemas.
It records:
- which business events are needed;
- which flows must be asynchronous;
- what must appear in the transaction timeline;
- how the system must behave when events fail;
- which decisions are left to the development team.
2. Core principle
The MVP must be event-driven.
Gateways must accept external requests and callbacks quickly and hand processing off to a durable async pipeline.
Core processing may take time, but merchant, provider, and customer entry points must not depend on synchronous full processing.
3. Merchant deposit create flow
Merchant calls `POST /api/v2/deposit`.
Merchant Gateway must return the hosted payment URL once the request/session has been safely accepted.
Product decision:
- Merchant Gateway must generate the payment session immediately and return the hosted payment URL.
- How exactly the transaction record is created is decided by the development team.
- The transaction ID generation boundary is also decided by the development team.
Возможные technical options:
- Merchant Gateway creates a minimal transaction/session record and publishes an event.
- Merchant Gateway creates a request/session record, and Orchestrator Service creates the transaction record.
- Merchant Gateway calls Orchestrator synchronously only to create the minimal accepted state, then processing continues asynchronously.
Product requirement:
- the merchant receives a response only after safe, durable acceptance;
- the hosted payment URL is available immediately;
- full routing/provider execution happens asynchronously;
- an accepted request must not be lost;
- a duplicate merchant `transactionId` must not create a duplicate transaction.
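One way to read "safe durable acceptance" together with the duplicate `transactionId` rule is sketched below. It is only an illustration under stated assumptions: the table names, the `HOSTED_BASE_URL` value, the outbox approach, and the choice to return the existing session for a duplicate request are all hypothetical, and SQLite stands in for whatever durable store the development team selects.

```python
import sqlite3
import uuid

# Hypothetical base URL of the hosted payment page (not defined in this document).
HOSTED_BASE_URL = "https://pay.example.com/session"


def init_db(conn: sqlite3.Connection) -> None:
    """Create the illustrative session and outbox tables."""
    conn.executescript(
        """
        CREATE TABLE payment_session (
            session_id TEXT PRIMARY KEY,
            merchant_id TEXT NOT NULL,
            merchant_transaction_id TEXT NOT NULL,
            UNIQUE (merchant_id, merchant_transaction_id)
        );
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            session_id TEXT NOT NULL
        );
        """
    )


def accept_deposit(conn: sqlite3.Connection, merchant_id: str,
                   merchant_transaction_id: str) -> str:
    """Persist the session and its event atomically, then return the hosted URL."""
    session_id = uuid.uuid4().hex
    try:
        with conn:  # commits on success, rolls back if any statement fails
            conn.execute(
                "INSERT INTO payment_session VALUES (?, ?, ?)",
                (session_id, merchant_id, merchant_transaction_id),
            )
            conn.execute(
                "INSERT INTO outbox (event_type, session_id) VALUES (?, ?)",
                ("DepositRequested", session_id),
            )
    except sqlite3.IntegrityError:
        # Duplicate merchant transactionId: reuse the existing session (assumption).
        row = conn.execute(
            "SELECT session_id FROM payment_session "
            "WHERE merchant_id = ? AND merchant_transaction_id = ?",
            (merchant_id, merchant_transaction_id),
        ).fetchone()
        session_id = row[0]
    return f"{HOSTED_BASE_URL}/{session_id}"
```

In this sketch the URL is returned only after the commit, so an accepted request cannot be lost, and the unique constraint guarantees that a repeated `transactionId` never produces a second session or a second `DepositRequested` event.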
4. Hosted form while processing is not ready
If the hosted form is open but Orchestrator Service has not yet created or prepared the transaction state, the customer must see a loading state.
The hosted form must not show technical statuses to the customer.
Customer-facing behavior:
- show loading;
- poll session/transaction state;
- when missing fields are required, show dynamic form;
- when provider redirect URL is ready, redirect customer;
- when transaction is rejected/unavailable, show generic customer-facing error.
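The customer-facing behavior above can be sketched as a pure mapping from internal state to a view. The status names (`PENDING`, `NEEDS_FIELDS`, `REDIRECT_READY`, `REJECTED`, `UNAVAILABLE`), the `SessionState` shape, and the error text are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical generic message; the real wording is a product/UX decision.
GENERIC_ERROR = "Payment is currently unavailable. Please try again later."


@dataclass(frozen=True)
class SessionState:
    status: str                        # hypothetical internal status name
    missing_fields: Tuple[str, ...] = ()
    redirect_url: Optional[str] = None


def customer_view(state: Optional[SessionState]) -> dict:
    """Translate internal state into what the hosted form may show."""
    if state is None or state.status == "PENDING":
        # Transaction not created or not prepared yet: keep polling.
        return {"view": "loading"}
    if state.status == "NEEDS_FIELDS":
        return {"view": "form", "fields": list(state.missing_fields)}
    if state.status == "REDIRECT_READY" and state.redirect_url:
        return {"view": "redirect", "url": state.redirect_url}
    if state.status in ("REJECTED", "UNAVAILABLE"):
        # Both outcomes collapse into one generic customer-facing error.
        return {"view": "error", "message": GENERIC_ERROR}
    # Unknown internal status: never expose it, keep showing loading (assumption).
    return {"view": "loading"}
```

The point of the mapping is that no internal status string ever reaches the customer; the form only sees one of four views.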
5. Provider callback flow
Provider Gateway must accept a callback even if the transaction is not ready yet.
Product decision:
- Provider Gateway receives callback;
- Provider Gateway queues callback first;
- Provider Gateway responds OK only after durable queue publish;
- Orchestrator Service processes callback later;
- callback can be stored/processed later if transaction is not ready.
Question: "What is the alternative?"
In theory, the gateway could synchronously look up the transaction and validate the callback before responding to the provider, but this is undesirable for the MVP because:
- provider endpoint would depend on core availability;
- callback could timeout during internal processing;
- provider retry behavior can be unpredictable;
- callback acceptance should be resilient.
Therefore callback gateway should be async-oriented.
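A minimal sketch of the "queue first, respond OK after durable publish" rule follows. The `publish` callable is a stand-in for whatever durable queue client the development team chooses, and the 503 response on publish failure is an assumption, not a decision recorded in this document.

```python
from typing import Callable


def handle_provider_callback(raw_body: bytes,
                             publish: Callable[[bytes], None]) -> int:
    """Return the HTTP status code to send back to the provider.

    `publish` must return only after the message is durably stored and must
    raise on failure. The gateway does not look up the transaction here.
    """
    try:
        publish(raw_body)
    except Exception:
        # Not durably queued: do not acknowledge, so the provider can retry.
        return 503
    # Durably queued: acknowledge; Orchestrator Service processes it later.
    return 200
```

Because the handler never touches transaction state, it behaves the same whether or not the transaction is ready, which is what keeps the provider endpoint independent of core availability.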
6. MVP business events
The MVP must support the following business events:
- `DepositRequested`;
- `PaymentSessionCreated`;
- `RoutingStarted`;
- `RoutingCompleted`;
- `ProviderRequestStarted`;
- `ProviderRequestSucceeded`;
- `ProviderRequestFailed`;
- `ProviderCallbackReceived`;
- `ProviderCallbackProcessed`;
- `TransactionStatusChanged`;
- `MerchantWebhookRequested`;
- `MerchantWebhookDelivered`;
- `MerchantWebhookFailed`;
- `MerchantWebhookRetryScheduled`;
- `MerchantWebhookRetryExhausted`.
Exact event names can be changed by development team.
Product requirement is not the exact name, but the fact that these business moments are traceable and processable.
7. Transaction timeline
Every important business event must be reflected in the transaction timeline.
Timeline should include:
- deposit requested;
- payment session created;
- routing started;
- routing completed;
- routing failed/rejected;
- provider request started;
- provider request succeeded;
- provider request failed;
- provider callback received;
- provider callback ignored/rejected/duplicate;
- provider callback processed;
- status changed;
- merchant webhook requested;
- merchant webhook delivered;
- merchant webhook failed;
- merchant webhook retry scheduled;
- merchant webhook retry exhausted;
- failed event processing, if it affects transaction investigation.
Timeline visibility rules still apply.
Merchant user should see merchant-safe interpretation.
Platform user with permission can see technical/internal details.
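The visibility rule can be illustrated with a timeline entry that carries both interpretations and a renderer that picks one. The field names, the `"platform"` role string, and the permission flag are hypothetical; the actual permission model is defined elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TimelineEntry:
    event_type: str        # e.g. "ProviderRequestFailed"
    merchant_text: str     # merchant-safe interpretation
    internal_details: str  # technical details for permitted platform users


def render_timeline(entries: List[TimelineEntry], viewer_role: str,
                    can_view_internal: bool = False) -> List[Dict[str, str]]:
    """Return timeline items appropriate for the viewer."""
    rendered = []
    for entry in entries:
        item = {"event": entry.event_type, "text": entry.merchant_text}
        # Internal details require both the platform role and the permission.
        if viewer_role == "platform" and can_view_internal:
            item["details"] = entry.internal_details
        rendered.append(item)
    return rendered
```

Storing both interpretations on the same entry keeps one source of truth for the timeline while the renderer decides what each audience sees.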
8. Failed event processing
If event processing fails, the failure must be recorded.
Failed processing should be visible:
- in technical logs/metrics;
- in the transaction timeline, if the event relates to a transaction and helps investigation;
- in alerting/monitoring, if failure affects processing.
Product reason:
- support/platform must understand where transaction got stuck;
- failed async processing must not disappear silently;
- retry/fallback behavior must be explainable.
9. Dead-letter queue explanation
A dead-letter queue (DLQ) is a technical quarantine for messages/events that could not be processed after retries.
Example: an event fails 10 times, the system stops retrying it, the event goes to the dead-letter queue, and ops/dev investigate why it failed.
Why this is needed:
- so that a broken event is not retried endlessly;
- so that the queue is not clogged by a single bad message;
- so that ops/dev can investigate the failed event;
- so that the system has a safe place for poison messages.
The product owner does not need to design a DLQ UI or the exact retry count.
Product requirement:
- failed events must not be lost silently;
- failed events must be observable;
- system must have a technical way to isolate events that cannot be processed after retries.
A separate Back Office UI for the DLQ is not needed in the MVP.
Development/ops team should decide whether to use Pub/Sub dead-letter topics, custom failed-event table, alerting, or another mechanism.
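To make the DLQ idea concrete, here is a small in-memory sketch of "retry, then isolate". The function name, the `dead_letters` list, and the default of 10 attempts are illustrative assumptions only; a real implementation would normally rely on the broker's dead-letter support or a failed-event table, and would wait between attempts.

```python
from typing import Any, Callable, Dict, List


def process_with_dead_letter(event: Any,
                             handler: Callable[[Any], None],
                             dead_letters: List[Dict[str, Any]],
                             max_attempts: int = 10) -> bool:
    """Try the handler up to max_attempts times, then quarantine the event."""
    last_error = None
    for _attempt in range(max_attempts):
        try:
            handler(event)
            return True  # processed successfully, nothing to quarantine
        except Exception as exc:
            last_error = exc  # remember why the latest attempt failed
    # Retries exhausted: keep the event and the failure reason for investigation.
    dead_letters.append(
        {"event": event, "attempts": max_attempts, "error": repr(last_error)}
    )
    return False
```

The event is never dropped: it either succeeds or ends up in `dead_letters` with the last error attached, which satisfies the "not lost silently" and "observable" requirements in miniature.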
10. Merchant webhook sending
Which component sends merchant webhooks is decided by the development team.
Allowed options:
- Orchestrator Service module/worker;
- dedicated webhook worker inside same deployable service;
- separate lightweight worker, if development team justifies it.
Product requirements:
- webhook sending is async;
- webhook retry is supported;
- webhook delivery attempts are stored;
- webhook timeline events are stored;
- webhook resend from Back Office uses original transaction webhook URLs;
- webhook retry does not block transaction creation.
11. Merchant webhook retry events
Webhook retry should be represented as events.
Minimum events:
- `MerchantWebhookRequested`;
- `MerchantWebhookFailed`;
- `MerchantWebhookRetryScheduled`;
- `MerchantWebhookDelivered`;
- `MerchantWebhookRetryExhausted`.
Retry behavior:
- exponential backoff;
- exact intervals/max duration defined by development team;
- any non-2xx response is failed attempt;
- any 2xx response is delivered.
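The retry rules above can be sketched as a small decision function. The base delay, cap, and maximum attempt count are placeholder values (the document leaves them to the development team), and treating a missing response such as a timeout as a failed attempt is an assumption.

```python
from typing import List, Optional, Tuple


def is_delivered(status_code: Optional[int]) -> bool:
    """Any 2xx response counts as delivered; None means no response at all."""
    return status_code is not None and 200 <= status_code < 300


def backoff_delay(attempt: int, base_seconds: float = 30.0,
                  cap_seconds: float = 3600.0) -> float:
    """Exponential backoff: base, 2x base, 4x base, ... limited by the cap."""
    return min(cap_seconds, base_seconds * 2 ** (attempt - 1))


def next_webhook_events(status_code: Optional[int], attempt: int,
                        max_attempts: int = 5) -> Tuple[List[str], Optional[float]]:
    """Return the events to record and the delay before the next attempt."""
    if is_delivered(status_code):
        return (["MerchantWebhookDelivered"], None)
    if attempt >= max_attempts:
        return (["MerchantWebhookFailed", "MerchantWebhookRetryExhausted"], None)
    return (["MerchantWebhookFailed", "MerchantWebhookRetryScheduled"],
            backoff_delay(attempt))
```

For example, with these placeholder values a 500 response on the first attempt yields `MerchantWebhookFailed` followed by `MerchantWebhookRetryScheduled` with a 30 second delay, while the same response on the fifth attempt yields `MerchantWebhookRetryExhausted`.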
12. Provider request retry
Provider request retry is not required in MVP as separate retry mechanism.
MVP behavior:
- if provider/SUB MID execution fails because of provider error/timeout/velocity/inactive config, system uses routing fallback;
- fallback goes to next candidate according to routing rules and group fallback configuration;
- if no fallback path remains, the transaction becomes `REJECTED` or an appropriate failed internal state according to the transaction lifecycle rules.
Provider-specific retry may be added later if needed, but it is not MVP requirement.
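The difference between retry and fallback can be shown in a few lines: each candidate is tried once, in routing order, and a failure moves on to the next candidate instead of repeating the same one. `ProviderExecutionError`, the candidate strings, and the `PROVIDER_ACCEPTED` status are hypothetical names; only `REJECTED` comes from this document.

```python
from typing import Callable, Optional, Sequence, Tuple


class ProviderExecutionError(Exception):
    """Hypothetical error for provider error/timeout/velocity/inactive config."""


def execute_with_fallback(candidates: Sequence[str],
                          execute: Callable[[str], None]
                          ) -> Tuple[str, Optional[str]]:
    """Try each routing candidate once, in order; no per-provider retry."""
    for candidate in candidates:
        try:
            execute(candidate)
            return ("PROVIDER_ACCEPTED", candidate)
        except ProviderExecutionError:
            continue  # fall back to the next candidate in routing order
    # No fallback path remains.
    return ("REJECTED", None)
```

The ordered `candidates` list is assumed to be already resolved from the routing rules and group fallback configuration.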
13. Idempotency
Exact idempotency implementation is development team decision.
Product requirements:
- a duplicate merchant `transactionId` must not create a duplicate transaction;
- duplicate provider callbacks must not create duplicate status changes/webhooks;
- duplicate queue events must not create duplicate side effects;
- repeated webhook sending should be controlled by delivery attempt records/retry logic;
- event processing should be traceable.
Development team can choose:
- event ID based idempotency;
- inbox table;
- unique constraints;
- idempotency keys;
- message deduplication;
- another reliable approach.
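As one illustration of the options listed above, here is a minimal inbox-table sketch combined with event ID based idempotency. The table name, the function names, and the use of SQLite are assumptions made for the example, not a recommendation over the other options.

```python
import sqlite3
from typing import Callable


def init_inbox(conn: sqlite3.Connection) -> None:
    """Create the illustrative inbox table keyed by event ID."""
    conn.execute("CREATE TABLE inbox (event_id TEXT PRIMARY KEY)")


def handle_once(conn: sqlite3.Connection, event_id: str,
                side_effect: Callable[[], None]) -> bool:
    """Run side_effect only the first time this event_id is seen.

    Returns True if the event was processed, False if it was a duplicate.
    If side_effect raises, the inbox row is rolled back and the exception
    propagates, so the event can be delivered and processed again.
    """
    try:
        with conn:  # one transaction: inbox insert plus side effect
            conn.execute("INSERT INTO inbox (event_id) VALUES (?)", (event_id,))
            side_effect()
    except sqlite3.IntegrityError:
        return False  # primary key conflict: event already handled
    return True
```

The rollback only protects side effects written through the same database transaction; an external side effect such as an HTTP call would need its own safeguard, for example the delivery attempt records mentioned above.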
14. Event ownership
Exact publisher/consumer ownership is technical design.
Product-level expectation:
| Event | Likely publisher | Likely consumer |
|---|---|---|
| `DepositRequested` | Merchant Gateway | Orchestrator Service |
| `PaymentSessionCreated` | Merchant Gateway or Orchestrator Service | Forms Service / Orchestrator Service |
| `RoutingStarted` | Orchestrator Service | Timeline / monitoring |
| `RoutingCompleted` | Orchestrator Service | Timeline / provider execution |
| `ProviderRequestStarted` | Orchestrator Service | Timeline / monitoring |
| `ProviderCallbackReceived` | Provider Gateway | Orchestrator Service |
| `ProviderCallbackProcessed` | Orchestrator Service | Transaction lifecycle / timeline |
| `TransactionStatusChanged` | Orchestrator Service | Merchant webhook pipeline / timeline |
| `MerchantWebhookRequested` | Orchestrator Service | Webhook worker/module |
| `MerchantWebhookRetryScheduled` | Webhook worker/module | Webhook worker/module |
This table is indicative, not final technical contract.
15. Acceptance Criteria
The Async Event Flow is considered agreed for the MVP if:
- Merchant Gateway returns hosted payment URL after durable acceptance.
- Transaction/session creation boundary is development team decision.
- Hosted form shows loading while transaction/session processing is not ready.
- Provider Gateway accepts callbacks asynchronously.
- Provider Gateway queues callback before OK response.
- MVP business events are listed.
- Important business events are reflected in transaction timeline.
- Failed event processing is saved and visible for investigation.
- DLQ concept is explained as technical mechanism.
- Separate DLQ UI is not required in MVP.
- Merchant webhook retry events are required.
- Provider request retry is not required; routing fallback is MVP behavior.
- Event idempotency implementation is development team decision.
16. Open Questions
There are no open product questions.
Technical/design follow-up:
- The development team must define the exact Pub/Sub topics/subscriptions.
- The development team must define the exact event schemas.
- The development team must define the retry/dead-letter strategy.
- The development team must define the idempotency approach.
- The development team must define the exact publisher/consumer ownership.
- The development team must define how the timeline is written from async events.