# Which Problems Are Solved
The current handling of notification follows the same pattern as all
other projections:
Created events are handled sequentially (based on "position") by a
handler. During the process, a lot of information is aggregated (user,
texts, templates, ...).
This leads to back pressure on the projection since the handling of
events might take longer than the time before a new event (to be
handled) is created.
# How the Problems Are Solved
- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all the present and required information
including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request including the current state of the user (this
prevents race conditions, where a user is changed in the meantime and
the notification already gets the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt.
- If the configured amount of attempts is reached or the message expired
(based on config), a cancel event is created, letting the workers know,
the notification must no longer be handled.
- In case of successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object is stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml
Notifications:
# The amount of workers processing the notification request events.
# If set to 0, no notification request events will be handled. This can be useful when running in
# multi binary / pod setup and allowing only certain executables to process the events.
Workers: 1 # ZITADEL_NOTIFIACATIONS_WORKERS
# The amount of events a single worker will process in a run.
BulkLimit: 10 # ZITADEL_NOTIFIACATIONS_BULKLIMIT
# Time interval between scheduled notifications for request events
RequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_REQUEUEEVERY
# The amount of workers processing the notification retry events.
# If set to 0, no notification retry events will be handled. This can be useful when running in
# multi binary / pod setup and allowing only certain executables to process the events.
RetryWorkers: 1 # ZITADEL_NOTIFIACATIONS_RETRYWORKERS
# Time interval between scheduled notifications for retry events
RetryRequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_RETRYREQUEUEEVERY
# Only instances are projected, for which at least a projection-relevant event exists within the timeframe
# from HandleActiveInstances duration in the past until the projection's current time
# If set to 0 (default), every instance is always considered active
HandleActiveInstances: 0s # ZITADEL_NOTIFIACATIONS_HANDLEACTIVEINSTANCES
# The maximum duration a transaction remains open
# before it spots left folding additional events
# and updates the table.
TransactionDuration: 1m # ZITADEL_NOTIFIACATIONS_TRANSACTIONDURATION
# Automatically cancel the notification after the amount of failed attempts
MaxAttempts: 3 # ZITADEL_NOTIFIACATIONS_MAXATTEMPTS
# Automatically cancel the notification if it cannot be handled within a specific time
MaxTtl: 5m # ZITADEL_NOTIFIACATIONS_MAXTTL
# Failed attempts are retried after a confogired delay (with exponential backoff).
# Set a minimum and maximum delay and a factor for the backoff
MinRetryDelay: 1s # ZITADEL_NOTIFIACATIONS_MINRETRYDELAY
MaxRetryDelay: 20s # ZITADEL_NOTIFIACATIONS_MAXRETRYDELAY
# Any factor below 1 will be set to 1
RetryDelayFactor: 1.5 # ZITADEL_NOTIFIACATIONS_RETRYDELAYFACTOR
```
# Additional Changes
None
# Additional Context
- closes#8931
# Which Problems Are Solved
Add a cache implementation using Redis single mode. This does not add
support for Redis Cluster or sentinel.
# How the Problems Are Solved
Added the `internal/cache/redis` package. All operations occur
atomically, including setting of secondary indexes, using LUA scripts
where needed.
The [`miniredis`](https://github.com/alicebob/miniredis) package is used
to run unit tests.
# Additional Changes
- Move connector code to `internal/cache/connector/...` and remove
duplicate code from `query` and `command` packages.
- Fix a missed invalidation on the restrictions projection
# Additional Context
Closes#8130
# Which Problems Are Solved
We identified the need of caching.
Currently we have a number of places where we use different ways of
caching, like go maps or LRU.
We might also want shared chaches in the future, like Redis-based or in
special SQL tables.
# How the Problems Are Solved
Define a generic Cache interface which allows different implementations.
- A noop implementation is provided and enabled as.
- An implementation using go maps is provided
- disabled in defaults.yaml
- enabled in integration tests
- Authz middleware instance objects are cached using the interface.
# Additional Changes
- Enabled integration test command raceflag
- Fix a race condition in the limits integration test client
- Fix a number of flaky integration tests. (Because zitadel is super
fast now!) 🎸🚀
# Additional Context
Related to https://github.com/zitadel/zitadel/issues/8648
# Which Problems Are Solved
To have more insight on the performance, CPU and memory usage of
ZITADEL, we want to enable profiling.
# How the Problems Are Solved
- Allow profiling by configuration.
- Provide Google Cloud Profiler as first implementation
# Additional Changes
None.
# Additional Context
There were possible memory leaks reported:
https://discord.com/channels/927474939156643850/1273210227918897152
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
Implement a new API service that allows management of OIDC signing web
keys.
This allows users to manage rotation of the instance level keys. which
are currently managed based on expiry.
The API accepts the generation of the following key types and
parameters:
- RSA keys with 2048, 3072 or 4096 bit in size and:
- Signing with SHA-256 (RS256)
- Signing with SHA-384 (RS384)
- Signing with SHA-512 (RS512)
- ECDSA keys with
- P256 curve
- P384 curve
- P512 curve
- ED25519 keys
# How the Problems Are Solved
Keys are serialized for storage using the JSON web key format from the
`jose` library. This is the format that will be used by OIDC for
signing, verification and publication.
Each instance can have a number of key pairs. All existing public keys
are meant to be used for token verification and publication the keys
endpoint. Keys can be activated and the active private key is meant to
sign new tokens. There is always exactly 1 active signing key:
1. When the first key for an instance is generated, it is automatically
activated.
2. Activation of the next key automatically deactivates the previously
active key.
3. Keys cannot be manually deactivated from the API
4. Active keys cannot be deleted
# Additional Changes
- Query methods that later will be used by the OIDC package are already
implemented. Preparation for #8031
- Fix indentation in french translation for instance event
- Move user_schema translations to consistent positions in all
translation files
# Additional Context
- Closes#8030
- Part of #7809
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
ZITADEL currently selects the instance context based on a HTTP header
(see https://github.com/zitadel/zitadel/issues/8279#issue-2399959845 and
checks it against the list of instance domains. Let's call it instance
or API domain.
For any context based URL (e.g. OAuth, OIDC, SAML endpoints, links in
emails, ...) the requested domain (instance domain) will be used. Let's
call it the public domain.
In cases of proxied setups, all exposed domains (public domains) require
the domain to be managed as instance domain.
This can either be done using the "ExternalDomain" in the runtime config
or via system API, which requires a validation through CustomerPortal on
zitadel.cloud.
# How the Problems Are Solved
- Two new headers / header list are added:
- `InstanceHostHeaders`: an ordered list (first sent wins), which will
be used to match the instance.
(For backward compatibility: the `HTTP1HostHeader`, `HTTP2HostHeader`
and `forwarded`, `x-forwarded-for`, `x-forwarded-host` are checked
afterwards as well)
- `PublicHostHeaders`: an ordered list (first sent wins), which will be
used as public host / domain. This will be checked against a list of
trusted domains on the instance.
- The middleware intercepts all requests to the API and passes a
`DomainCtx` object with the hosts and protocol into the context
(previously only a computed `origin` was passed)
- HTTP / GRPC server do not longer try to match the headers to instances
themself, but use the passed `http.DomainContext` in their interceptors.
- The `RequestedHost` and `RequestedDomain` from authz.Instance are
removed in favor of the `http.DomainContext`
- When authenticating to or signing out from Console UI, the current
`http.DomainContext(ctx).Origin` (already checked by instance
interceptor for validity) is used to compute and dynamically add a
`redirect_uri` and `post_logout_redirect_uri`.
- Gateway passes all configured host headers (previously only did
`x-zitadel-*`)
- Admin API allows to manage trusted domain
# Additional Changes
None
# Additional Context
- part of #8279
- open topics:
- "single-instance" mode
- Console UI
* feat(api): feature API proto definitions
* update proto based on discussion with @livio-a
* cleanup old feature flag stuff
* authz instance queries
* align defaults
* projection definitions
* define commands and event reducers
* implement system and instance setter APIs
* api getter implementation
* unit test repository package
* command unit tests
* unit test Get queries
* grpc converter unit tests
* migrate the V1 features
* migrate oidc to dynamic features
* projection unit test
* fix instance by host
* fix instance by id data type in sql
* fix linting errors
* add system projection test
* fix behavior inversion
* resolve proto file comments
* rename SystemDefaultLoginInstanceEventType to SystemLoginDefaultOrgEventType so it's consistent with the instance level event
* use write models and conditional set events
* system features integration tests
* instance features integration tests
* error on empty request
* documentation entry
* typo in feature.proto
* fix start unit tests
* solve linting error on key case switch
* remove system defaults after discussion with @eliobischof
* fix system feature projection
* resolve comments in defaults.yaml
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
Even though this is a feature it's released as fix so that we can back port to earlier revisions.
As reported by multiple users startup of ZITADEL after leaded to downtime and worst case rollbacks to the previously deployed version.
The problem starts rising when there are too many events to process after the start of ZITADEL. The root cause are changes on projections (database tables) which must be recomputed. This PR solves this problem by adding a new step to the setup phase which prefills the projections. The step can be enabled by adding the `--init-projections`-flag to `setup`, `start-from-init` and `start-from-setup`. Setting this flag results in potentially longer duration of the setup phase but reduces the risk of the problems mentioned in the paragraph above.
* define roles and permissions
* support system user memberships
* don't limit system users
* cleanup permissions
* restrict memberships to aggregates
* default to SYSTEM_OWNER
* update unit tests
* test: system user token test (#6778)
* update unit tests
* refactor: make authz testable
* move session constants
* cleanup
* comment
* comment
* decode member type string to enum (#6780)
* decode member type string to enum
* handle all membership types
* decode enums where necessary
* decode member type in steps config
* update system api docs
* add technical advisory
* tweak docs a bit
* comment in comment
* lint
* extract token from Bearer header prefix
* review changes
* fix tests
* fix: add fix for activityhandler
* add isSystemUser
* remove IsSystemUser from activity info
* fix: add fix for activityhandler
---------
Co-authored-by: Stefan Benz <stefan@caos.ch>
* start feature flags
* base feature events on domain const
* setup default features
* allow setting feature in system api
* allow setting feature in admin api
* set settings in login based on feature
* fix rebasing
* unit tests
* i18n
* update policy after domain discovery
* some changes from review
* check feature and value type
* check feature and value type
* pipeline runs on ubuntu instead of docker
* added Makefile to build zitadel core (backend) and console (frontend)
* pipeline runs in parallel where possible
* pipeline is split into multiple jobs
* removed goreleaser
* added command to check if zitadel instance is running
* feat(instance): implement create instance with direct machine user and credentials
* fix: deprecated add endpoint and variable declaration
* fix(instance): update logic for pats and machinekeys
* fix(instance): unit test corrections and additional unit test for pats and machinekeys
* fix(instance-create): include review changes
* fix(instance-create): linter fixes
* move iframe usage to solution scenarios configurations
* Revert "move iframe usage to solution scenarios configurations"
This reverts commit 9db31f3808e6dfcae9907bc574c072436a19865a.
* fix merge
* fix: add review suggestions
Co-authored-by: Livio Spring <livio.a@gmail.com>
* fix: add review changes
* fix: add review changes for default definitions
* fix: add review changes for machinekey details
* fix: add machinekey output when setup with machineuser
* fix: add changes from review
* fix instance converter for machine and allow overwriting of further machine fields
Co-authored-by: Livio Spring <livio.a@gmail.com>