Commit Graph

11 Commits

Author SHA1 Message Date
Livio Spring
8537805ea5
feat(notification): use event worker pool (#8962)
# Which Problems Are Solved

The current handling of notification follows the same pattern as all
other projections:
Created events are handled sequentially (based on "position") by a
handler. During the process, a lot of information is aggregated (user,
texts, templates, ...).
This leads to back pressure on the projection since the handling of
events might take longer than the time before a new event (to be
handled) is created.

# How the Problems Are Solved

- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all the present and required information
including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request including the current state of the user (this
prevents race conditions, where a user is changed in the meantime and
the notification already gets the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt.
- If the configured amount of attempts is reached or the message expired
(based on config), a cancel event is created, letting the workers know,
the notification must no longer be handled.
- In case of successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object is stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml

Notifications:
  # The amount of workers processing the notification request events.
  # If set to 0, no notification request events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  Workers: 1 # ZITADEL_NOTIFIACATIONS_WORKERS
  # The amount of events a single worker will process in a run.
  BulkLimit: 10 # ZITADEL_NOTIFIACATIONS_BULKLIMIT
  # Time interval between scheduled notifications for request events
  RequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_REQUEUEEVERY
  # The amount of workers processing the notification retry events.
  # If set to 0, no notification retry events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  RetryWorkers: 1 # ZITADEL_NOTIFIACATIONS_RETRYWORKERS
  # Time interval between scheduled notifications for retry events
  RetryRequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_RETRYREQUEUEEVERY
  # Only instances are projected, for which at least a projection-relevant event exists within the timeframe
  # from HandleActiveInstances duration in the past until the projection's current time
  # If set to 0 (default), every instance is always considered active
  HandleActiveInstances: 0s # ZITADEL_NOTIFIACATIONS_HANDLEACTIVEINSTANCES
  # The maximum duration a transaction remains open
  # before it spots left folding additional events
  # and updates the table.
  TransactionDuration: 1m # ZITADEL_NOTIFIACATIONS_TRANSACTIONDURATION
  # Automatically cancel the notification after the amount of failed attempts
  MaxAttempts: 3 # ZITADEL_NOTIFIACATIONS_MAXATTEMPTS
  # Automatically cancel the notification if it cannot be handled within a specific time
  MaxTtl: 5m  # ZITADEL_NOTIFIACATIONS_MAXTTL
  # Failed attempts are retried after a confogired delay (with exponential backoff).
  # Set a minimum and maximum delay and a factor for the backoff
  MinRetryDelay: 1s  # ZITADEL_NOTIFIACATIONS_MINRETRYDELAY
  MaxRetryDelay: 20s # ZITADEL_NOTIFIACATIONS_MAXRETRYDELAY
  # Any factor below 1 will be set to 1
  RetryDelayFactor: 1.5 # ZITADEL_NOTIFIACATIONS_RETRYDELAYFACTOR
```


# Additional Changes

None

# Additional Context

- closes #8931
2024-11-27 15:01:17 +00:00
Livio Spring
041af26917
feat(OIDC): add back channel logout (#8837)
# Which Problems Are Solved

Currently ZITADEL supports RP-initiated logout for clients. Back-channel
logout ensures that user sessions are terminated across all connected
applications, even if the user closes their browser or loses
connectivity providing a more secure alternative for certain use cases.

# How the Problems Are Solved

If the feature is activated and the client used for the authentication
has a back_channel_logout_uri configured, a
`session_logout.back_channel` will be registered. Once a user terminates
their session, a (notification) handler will send a SET (form POST) to
the registered uri containing a logout_token (with the user's ID and
session ID).

- A new feature "back_channel_logout" is added on system and instance
level
- A `back_channel_logout_uri` can be managed on OIDC applications
- Added a `session_logout` aggregate to register and inform about sent
`back_channel` notifications
- Added a `SecurityEventToken` channel and `Form`message type in the
notification handlers
- Added `TriggeredAtOrigin` fields to `HumanSignedOut` and
`TerminateSession` events for notification handling
- Exported various functions and types in the `oidc` package to be able
to reuse for token signing in the back_channel notifier.
- To prevent that current existing session termination events will be
handled, a setup step is added to set the `current_states` for the
`projections.notifications_back_channel_logout` to the current position

- [x] requires https://github.com/zitadel/oidc/pull/671

# Additional Changes

- Updated all OTEL dependencies to v1.29.0, since OIDC already updated
some of them to that version.
- Single Session Termination feature is correctly checked (fixed feature
mapping)

# Additional Context

- closes https://github.com/zitadel/zitadel/issues/8467
- TODO:
  - Documentation
  - UI to be done: https://github.com/zitadel/zitadel/issues/8469

---------

Co-authored-by: Hidde Wieringa <hidde@hiddewieringa.nl>
2024-10-31 15:57:17 +01:00
Silvan
2243306ef6
feat(cmd): mirror (#7004)
# Which Problems Are Solved

Adds the possibility to mirror an existing database to a new one. 

For that a new command was added `zitadel mirror`. Including it's
subcommands for a more fine grained mirror of the data.

Sub commands:

* `zitadel mirror eventstore`: copies only events and their unique
constraints
* `zitadel mirror system`: mirrors the data of the `system`-schema
*  `zitadel mirror projections`: runs all projections
*  `zitadel mirror auth`: copies auth requests
* `zitadel mirror verify`: counts the amount of rows in the source and
destination database and prints the diff.

The command requires one of the following flags:
* `--system`: copies all instances of the system
* `--instance <instance-id>`, `--instance <comma separated list of
instance ids>`: copies only the defined instances

The command is save to execute multiple times by adding the
`--replace`-flag. This replaces currently existing data except of the
`events`-table

# Additional Changes

A `--for-mirror`-flag was added to `zitadel setup` to prepare the new
database. The flag skips the creation of the first instances and initial
run of projections.

It is now possible to skip the creation of the first instance during
setup by setting `FirstInstance.Skip` to true in the steps
configuration.

# Additional info

It is currently not possible to merge multiple databases. See
https://github.com/zitadel/zitadel/issues/7964 for more details.

It is currently not possible to use files. See
https://github.com/zitadel/zitadel/issues/7966 for more information.

closes https://github.com/zitadel/zitadel/issues/7586
closes https://github.com/zitadel/zitadel/issues/7486

### Definition of Ready

- [x] I am happy with the code
- [x] Short description of the feature/issue is added in the pr
description
- [x] PR is linked to the corresponding user story
- [x] Acceptance criteria are met
- [x] All open todos and follow ups are defined in a new ticket and
justified
- [x] Deviations from the acceptance criteria and design are agreed with
the PO and documented.
- [x] No debug or dead code
- [x] My code has no repetitions
- [x] Critical parts are tested automatically
- [ ] Where possible E2E tests are implemented
- [x] Documentation/examples are up-to-date
- [x] All non-functional requirements are met
- [x] Functionality of the acceptance criteria is checked manually on
the dev system.

---------

Co-authored-by: Livio Spring <livio.a@gmail.com>
2024-05-30 09:35:30 +00:00
Silvan
17953e9040
fix(setup): init projections (#7194)
Even though this is a feature it's released as fix so that we can back port to earlier revisions.

As reported by multiple users startup of ZITADEL after leaded to downtime and worst case rollbacks to the previously deployed version.

The problem starts rising when there are too many events to process after the start of ZITADEL. The root cause are changes on projections (database tables) which must be recomputed. This PR solves this problem by adding a new step to the setup phase which prefills the projections. The step can be enabled by adding the `--init-projections`-flag to `setup`, `start-from-init` and `start-from-setup`. Setting this flag results in potentially longer duration of the setup phase but reduces the risk of the problems mentioned in the paragraph above.
2024-01-25 17:28:20 +01:00
Elio Bischof
dd33538c0a
feat: restrict languages (#6931)
* feat: return 404 or 409 if org reg disallowed

* fix: system limit permissions

* feat: add iam limits api

* feat: disallow public org registrations on default instance

* add integration test

* test: integration

* fix test

* docs: describe public org registrations

* avoid updating docs deps

* fix system limits integration test

* silence integration tests

* fix linting

* ignore strange linter complaints

* review

* improve reset properties naming

* redefine the api

* use restrictions aggregate

* test query

* simplify and test projection

* test commands

* fix unit tests

* move integration test

* support restrictions on default instance

* also test GetRestrictions

* self review

* lint

* abstract away resource owner

* fix tests

* configure supported languages

* fix allowed languages

* fix tests

* default lang must not be restricted

* preferred language must be allowed

* change preferred languages

* check languages everywhere

* lint

* test command side

* lint

* add integration test

* add integration test

* restrict supported ui locales

* lint

* lint

* cleanup

* lint

* allow undefined preferred language

* fix integration tests

* update main

* fix env var

* ignore linter

* ignore linter

* improve integration test config

* reduce cognitive complexity

* compile

* check for duplicates

* remove useless restriction checks

* review

* revert restriction renaming

* fix language restrictions

* lint

* generate

* allow custom texts for supported langs for now

* fix tests

* cleanup

* cleanup

* cleanup

* lint

* unsupported preferred lang is allowed

* fix integration test

* finish reverting to old property name

* finish reverting to old property name

* load languages

* refactor(i18n): centralize translators and fs

* lint

* amplify no validations on preferred languages

* fix integration test

* lint

* fix resetting allowed languages

* test unchanged restrictions
2023-12-05 11:12:01 +00:00
Silvan
b5564572bc
feat(eventstore): increase parallel write capabilities (#5940)
This implementation increases parallel write capabilities of the eventstore.
Please have a look at the technical advisories: [05](https://zitadel.com/docs/support/advisory/a10005) and  [06](https://zitadel.com/docs/support/advisory/a10006).
The implementation of eventstore.push is rewritten and stored events are migrated to a new table `eventstore.events2`.
If you are using cockroach: make sure that the database user of ZITADEL has `VIEWACTIVITY` grant. This is used to query events.
2023-10-19 12:19:10 +02:00
Elio Bischof
8f6cb47567
fix: use triggering origin for notification links (#6628)
* take baseurl if saved on event

* refactor: make es mocks reusable

* Revert "refactor: make es mocks reusable"

This reverts commit 434ce12a6a.

* make messages testable

* test asset url

* fmt

* fmt

* simplify notification.Start

* test url combinations

* support init code added

* support password changed

* support reset pw

* support user domain claimed

* support add pwless login

* support verify phone

* Revert "support verify phone"

This reverts commit e40503303e.

* save trigger origin from ctx

* add ready for review check

* camel

* test email otp

* fix variable naming

* fix DefaultOTPEmailURLV2

* Revert "fix DefaultOTPEmailURLV2"

This reverts commit fa34d4d2a8.

* fix email otp challenged test

* fix email otp challenged test

* pass origin in login and gateway requests

* take origin from header

* take x-forwarded if present

* Update internal/notification/handlers/queries.go

Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>

* Update internal/notification/handlers/commands.go

Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>

* move origin header to ctx if available

* generate

* cleanup

* use forwarded header

* support X-Forwarded-* headers

* standardize context handling

* fix linting

---------

Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
2023-10-10 13:20:53 +00:00
Livio Spring
bb40e173bd
feat(api): add otp (sms and email) checks in session api (#6422)
* feat: add otp (sms and email) checks in session api

* implement sending

* fix tests

* add tests

* add integration tests

* fix merge main and add tests

* put default OTP Email url into config

---------

Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
2023-08-24 09:41:52 +00:00
Elio Bischof
9b768003b7
feat: improve milestone format (#6150)
* feat: milestone format

* feat: push external domain

* cleanup

* Revert "remove prerelease"

This reverts commit 7417fdbeb3.

* fix branch

* remove prerelease
2023-07-06 19:31:08 +02:00
Elio Bischof
bb756482c7
feat: push telemetry (#6027)
* document analytics config

* rework configuration and docs

* describe HandleActiveInstances better

* describe active instances on quotas better

* only projected events are considered

* cleanup

* describe changes at runtime

* push milestones

* stop tracking events

* calculate and push 4 in 6 milestones

* reduce milestone pushed

* remove docs

* fix scheduled pseudo event projection

* push 5 in 6 milestones

* push 6 in 6 milestones

* ignore client ids

* fix text array contains

* push human readable milestone type

* statement unit tests

* improve dev and db performance

* organize imports

* cleanup

* organize imports

* test projection

* check rows.Err()

* test search query

* pass linting

* review

* test 4 milestones

* simplify milestone by instance ids query

* use type NamespacedCondition

* cleanup

* lint

* lint

* dont overwrite original error

* no opt-in in examples

* cleanup

* prerelease

* enable request headers

* make limit configurable

* review fixes

* only requeue special handlers secondly

* include integration tests

* Revert "include integration tests"

This reverts commit 96db9504ec.

* pass reducers

* test handlers

* fix unit test

* feat: increment version

* lint

* remove prerelease

* fix integration tests
2023-07-06 08:38:13 +02:00
Elio Bischof
cccccd005c
feat: call webhooks at least once (#5454)
* feat: call webhooks at least once

* self review

* feat: improve notification observability

* feat: add notification tracing

* test(e2e): test at-least-once webhook delivery

* fix webhook notifications

* dedicated quota notifications handler

* fix linting

* fix e2e test

* wait less in e2e test

* fix: don't ignore failed events in handlers

* fix: don't ignore failed events in handlers

* faster requeues

* question

* fix retries

* fix retries

* retry

* don't instance ids query

* revert handler_projection

* statements can be nil

* cleanup

* make unit tests pass

* add comments

* add comments

* lint

* spool only active instances

* feat(config): handle inactive instances

* customizable HandleInactiveInstances

* call inactive instances quota webhooks

* test: handling with and w/o inactive instances

* omit retrying noop statements

* docs: describe projection options

* enable global handling of inactive instances

* self review

* requeue quota notifications every 5m

* remove caos_errors reference

* fix comment styles

* make handlers package flat

* fix linting

* fix repeating quota notifications

* test with more usage

* debug log channel init failures
2023-03-28 22:09:06 +00:00