zitadel/internal/repository
Livio Spring 8537805ea5
feat(notification): use event worker pool (#8962)
# Which Problems Are Solved

Notifications are currently handled with the same pattern as all other
projections:
A handler processes created events sequentially (based on "position").
While doing so, it aggregates a lot of information (user, texts,
templates, ...).
This causes back pressure on the projection, since handling an event
can take longer than the time until the next event (to be handled) is
created.

# How the Problems Are Solved

- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all required information that is available at
that point, including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request, including the current state of the user (this
prevents race conditions where the user is changed in the meantime and
the retried notification would pick up the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt.
- If the configured number of attempts is reached or the message has
expired (based on config), a cancel event is created, letting the workers
know that the notification must no longer be handled.
- In case of a successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object are stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml

Notifications:
  # The number of workers processing the notification request events.
  # If set to 0, no notification request events will be handled. This can be useful when running in
  # a multi binary / pod setup and allowing only certain executables to process the events.
  Workers: 1 # ZITADEL_NOTIFICATIONS_WORKERS
  # The number of events a single worker will process in a run.
  BulkLimit: 10 # ZITADEL_NOTIFICATIONS_BULKLIMIT
  # Time interval between scheduled notifications for request events
  RequeueEvery: 2s # ZITADEL_NOTIFICATIONS_REQUEUEEVERY
  # The number of workers processing the notification retry events.
  # If set to 0, no notification retry events will be handled. This can be useful when running in
  # a multi binary / pod setup and allowing only certain executables to process the events.
  RetryWorkers: 1 # ZITADEL_NOTIFICATIONS_RETRYWORKERS
  # Time interval between scheduled notifications for retry events
  RetryRequeueEvery: 2s # ZITADEL_NOTIFICATIONS_RETRYREQUEUEEVERY
  # Only instances are projected, for which at least a projection-relevant event exists within the timeframe
  # from HandleActiveInstances duration in the past until the projection's current time
  # If set to 0 (default), every instance is always considered active
  HandleActiveInstances: 0s # ZITADEL_NOTIFICATIONS_HANDLEACTIVEINSTANCES
  # The maximum duration a transaction remains open
  # before it stops left folding additional events
  # and updates the table.
  TransactionDuration: 1m # ZITADEL_NOTIFICATIONS_TRANSACTIONDURATION
  # Automatically cancel the notification after this number of failed attempts
  MaxAttempts: 3 # ZITADEL_NOTIFICATIONS_MAXATTEMPTS
  # Automatically cancel the notification if it cannot be handled within a specific time
  MaxTtl: 5m  # ZITADEL_NOTIFICATIONS_MAXTTL
  # Failed attempts are retried after a configured delay (with exponential backoff).
  # Set a minimum and maximum delay and a factor for the backoff
  MinRetryDelay: 1s  # ZITADEL_NOTIFICATIONS_MINRETRYDELAY
  MaxRetryDelay: 20s # ZITADEL_NOTIFICATIONS_MAXRETRYDELAY
  # Any factor below 1 will be set to 1
  RetryDelayFactor: 1.5 # ZITADEL_NOTIFICATIONS_RETRYDELAYFACTOR
```


# Additional Changes

None

# Additional Context

- closes #8931
2024-11-27 15:01:17 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| action | fix(setup): init projections (#7194) | 2024-01-25 17:28:20 +01:00 |
| asset | refactor: rename package errors to zerrors (#7039) | 2023-12-08 15:30:55 +01:00 |
| authrequest | fix(oidc): store requested response_mode (#8145) | 2024-06-17 09:50:12 +00:00 |
| debug_events | feat: add debug events API (#8533) | 2024-09-11 08:24:00 +00:00 |
| deviceauth | feat(oidc): sid claim for id_tokens issued through login V1 (#8525) | 2024-09-03 13:19:00 +00:00 |
| execution | feat(v3alpha): write actions (#8225) | 2024-07-31 14:42:12 +02:00 |
| feature | feat(OIDC): add back channel logout (#8837) | 2024-10-31 15:57:17 +01:00 |
| flow | refactor: rename package errors to zerrors (#7039) | 2023-12-08 15:30:55 +01:00 |
| idp | feat(saml): allow setting nameid-format and alternative mapping for transient format (#7979) | 2024-05-23 05:04:07 +00:00 |
| idpconfig | refactor: rename package errors to zerrors (#7039) | 2023-12-08 15:30:55 +01:00 |
| idpintent | fix(setup): init projections (#7194) | 2024-01-25 17:28:20 +01:00 |
| instance | feat: Add Twilio Verification Service (#8678) | 2024-09-26 09:14:33 +02:00 |
| keypair | feat(v3alpha): web key resource (#8262) | 2024-08-14 14:18:14 +00:00 |
| limits | fix(setup): init projections (#7194) | 2024-01-25 17:28:20 +01:00 |
| member | refactor: rename package errors to zerrors (#7039) | 2023-12-08 15:30:55 +01:00 |
| metadata | refactor: rename package errors to zerrors (#7039) | 2023-12-08 15:30:55 +01:00 |
| milestone | fix(milestones): use previous spelling for milestone types (#8886) | 2024-11-11 11:28:27 +00:00 |
| notification | feat(notification): use event worker pool (#8962) | 2024-11-27 15:01:17 +00:00 |
| oidcsession | perf(oidc): optimize token creation (#7822) | 2024-05-16 07:07:56 +02:00 |
| org | fix(eventstore): cleanup org fields on remove (#8946) | 2024-11-26 15:26:41 +00:00 |
| policy | feat(cnsl): docs link can be customized and custom button is available (#7840) | 2024-05-13 16:01:50 +02:00 |
| project | feat(OIDC): add back channel logout (#8837) | 2024-10-31 15:57:17 +01:00 |
| pseudo | feat: push telemetry (#6027) | 2023-07-06 08:38:13 +02:00 |
| quota | fix(setup): init projections (#7194) | 2024-01-25 17:28:20 +01:00 |
| restrictions | fix(setup): init projections (#7194) | 2024-01-25 17:28:20 +01:00 |
| session | feat(OIDC): add back channel logout (#8837) | 2024-10-31 15:57:17 +01:00 |
| sessionlogout | feat(OIDC): add back channel logout (#8837) | 2024-10-31 15:57:17 +01:00 |
| settings | refactor: rename package errors to zerrors (#7039) | 2023-12-08 15:30:55 +01:00 |
| target | feat: add action v2 execution on requests and responses (#7637) | 2024-05-04 11:55:57 +02:00 |
| user | feat(OIDC): add back channel logout (#8837) | 2024-10-31 15:57:17 +01:00 |
| usergrant | refactor(fmt): run gci on complete project (#7557) | 2024-04-03 10:43:43 +00:00 |
| webkey | feat(v3alpha): web key resource (#8262) | 2024-08-14 14:18:14 +00:00 |