# Which Problems Are Solved
The recently introduced notification queue have potential race conditions.
# How the Problems Are Solved
Current code is refactored to use the queue package, which is safe in
regards of concurrency.
# Additional Changes
- the queue is included in startup
- improved code quality of queue
# Additional Context
- closes https://github.com/zitadel/zitadel/issues/9278
# Which Problems Are Solved
Setup fails to push all role permission events when running Zitadel with
CockroachDB. `TransactionRetryError`s were visible in logs which finally
times out the setup job with `timeout: context deadline exceeded`
# How the Problems Are Solved
As suggested in the [Cockroach documentation](timeout: context deadline
exceeded), _"break down larger transactions"_. The commands to be pushed
for the role permissions are chunked in 50 events per push. This
chunking is only done with CockroachDB.
# Additional Changes
- gci run fixed some unrelated imports
- access to `command.Commands` for the setup job, so we can reuse the
sync logic.
# Additional Context
Closes#9293
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
Some OAuth2 and OIDC providers require the use of PKCE for all their
clients. While ZITADEL already recommended the same for its clients, it
did not yet support the option on the IdP configuration.
# How the Problems Are Solved
- A new boolean `use_pkce` is added to the add/update generic OAuth/OIDC
endpoints.
- A new checkbox is added to the generic OAuth and OIDC provider
templates.
- The `rp.WithPKCE` option is added to the provider if the use of PKCE
has been set.
- The `rp.WithCodeChallenge` and `rp.WithCodeVerifier` options are added
to the OIDC/Auth BeginAuth and CodeExchange function.
- Store verifier or any other persistent argument in the intent or auth
request.
- Create corresponding session object before creating the intent, to be
able to store the information.
- (refactored session structs to use a constructor for unified creation
and better overview of actual usage)
Here's a screenshot showing the URI including the PKCE params:

# Additional Changes
None.
# Additional Context
- Closes#6449
- This PR replaces the existing PR (#8228) of @doncicuto. The base he
did was cherry picked. Thank you very much for that!
---------
Co-authored-by: Miguel Cabrerizo <doncicuto@gmail.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
The OAuth2 Device Authorization Grant could not yet been handled through
the new login UI, resp. using the session API.
This PR adds the ability for the login UI to get the required
information to display the user and handle their decision (approve with
authorization or deny) using the OIDC Service API.
# How the Problems Are Solved
- Added a `GetDeviceAuthorizationRequest` endpoint, which allows getting
the `id`, `client_id`, `scope`, `app_name` and `project_name` of the
device authorization request
- Added a `AuthorizeOrDenyDeviceAuthorization` endpoint, which allows to
approve/authorize with the session information or deny the request. The
identification of the request is done by the `device_authorization_id` /
`id` returned in the previous request.
- To prevent leaking the `device_code` to the UI, but still having an
easy reference, it's encrypted and returned as `id`, resp. decrypted
when used.
- Fixed returned error types for device token responses on token
endpoint:
- Explicitly return `access_denied` (without internal error) when user
denied the request
- Default to `invalid_grant` instead of `access_denied`
- Explicitly check on initial state when approving the reqeust
- Properly handle done case (also relates to initial check)
- Documented the flow and handling in custom UIs (according to OIDC /
SAML)
# Additional Changes
- fixed some typos and punctuation in the corresponding OIDC / SAML
guides.
- added some missing translations for auth and saml request
# Additional Context
- closes#6239
---------
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Which Problems Are Solved
New integration tests can't use command side to simulate successful
intents.
# How the Problems Are Solved
Add endpoints to only in integration tests available sink to create
already successful intents.
# Additional Changes
None
# Additional Context
Closes#8557
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Performance issue for GRPC call `zitadel.user.v2.UserService.ListUsers`
due to lack of org filtering on `ListUsers`
# Additional Context
Replace this example with links to related issues, discussions, discord
threads, or other sources with more context.
Use the Closing #issue syntax for issues that are resolved with this PR.
- Closes https://github.com/zitadel/zitadel/issues/9191
---------
Co-authored-by: Iraq Jaber <IraqJaber@gmail.com>
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Which Problems Are Solved
OIDC applications can configure the used login version, which is
currently not possible for SAML applications.
# How the Problems Are Solved
Add the same functionality dependent on the feature-flag for SAML
applications.
# Additional Changes
None
# Additional Context
Closes#9267
Follow up issue for frontend changes #9354
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
To integrate river as a task queue we need to ensure the migrations of
river are executed.
# How the Problems Are Solved
- A new schema was added to the Zitadel database called "queue"
- Added a repeatable setup step to Zitadel which executes the
[migrations of
river](https://riverqueue.com/docs/migrations#go-migration-api).
# Additional Changes
- Added more hooks to the databases to properly set the schema for the
task queue
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/9280
# Which Problems Are Solved
Systems running with PostgreSQL before Zitadel v2.39 are likely to have
a wrong type for the `in_tx_order` column in the `eventstore.event2`
table. The migration at the time used the `event_sequence` as default
value without typecast, which results in a `bigint` type for that
column. However, when creating the table from scratch, we explicitly
specify the type to be `integer`.
Starting from Zitadel v2.67 we use a Pl/PgSQL function to push events.
The function requires the types from `eventstore.events2` to the same as
the `select` destinations used in the function. In the function
`in_tx_order` is also expected to by of `integer` type.
CochroachDB systems are not affected because `bigint` is an alias to the
`int` type. In other words, CockroachDB uses `int8` when specifying type
`int`. Therefore the types already match.
# How the Problems Are Solved
Retrieve the actual column type currently in use. A template is used to
assign the type to the `ordinality` column returned as `in_tx_order`.
# Additional Changes
- Detailed logging on migration failure
# Additional Context
- Closes#9180
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
The login by email was not possible anymore. This was due to a newly
generated user projection because of #9255 .
Internal logs showed that the computed lower case column for verified
email was missing.
# How the Problems Are Solved
Update name of setup step 25 to rerun the step, since the underlying sql
changed.
# Additional Changes
None
# Additional Context
- relates to #9255
# Which Problems Are Solved
After updating to version 2.69.0, my zitadel instance refuse to start
with this error log :
```
time="2025-02-03T19:46:47Z" level=info msg="starting migration" caller="/home/runner/work/zitadel/zitadel/internal/migration/migration.go:66" name=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=info msg="execute statement" caller="/home/runner/work/zitadel/zitadel/cmd/setup/46.go:29" file=01-role_permissions_view.sql migration=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=info msg="execute statement" caller="/home/runner/work/zitadel/zitadel/cmd/setup/46.go:29" file=02-instance_orgs_view.sql migration=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=info msg="execute statement" caller="/home/runner/work/zitadel/zitadel/cmd/setup/46.go:29" file=03-instance_members_view.sql migration=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=info msg="execute statement" caller="/home/runner/work/zitadel/zitadel/cmd/setup/46.go:29" file=04-org_members_view.sql migration=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=info msg="execute statement" caller="/home/runner/work/zitadel/zitadel/cmd/setup/46.go:29" file=05-project_members_view.sql migration=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=info msg="execute statement" caller="/home/runner/work/zitadel/zitadel/cmd/setup/46.go:29" file=06-permitted_orgs_function.sql migration=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=error msg="migration failed" caller="/home/runner/work/zitadel/zitadel/internal/migration/migration.go:68" error="46_init_permission_functions 06-permitted_orgs_function.sql: ERROR: subquery in FROM must have an alias (SQLSTATE 42601)" name=46_init_permission_functions
time="2025-02-03T19:46:47Z" level=fatal msg="migration failed" caller="/home/runner/work/zitadel/zitadel/cmd/setup/setup.go:274" error="46_init_permission_functions 06-permitted_orgs_function.sql: ERROR: subquery in FROM must have an alias (SQLSTATE 42601)" name=46_init_permission_functions
```
# How the Problems Are Solved
I used the original sql script on my database which gave me the same
error.
So i added an alias for the subquery and the error cas gone
# Additional Context
I was migrating from version 2.58.3
Closes https://github.com/zitadel/zitadel/issues/9300
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Which Problems Are Solved
* Adds support for the service provider configuration SCIM v2 endpoints
# How the Problems Are Solved
* Adds support for the service provider configuration SCIM v2 endpoints
* `GET /scim/v2/{orgId}/ServiceProviderConfig`
* `GET /scim/v2/{orgId}/ResourceTypes`
* `GET /scim/v2/{orgId}/ResourceTypes/{name}`
* `GET /scim/v2/{orgId}/Schemas`
* `GET /scim/v2/{orgId}/Schemas/{id}`
# Additional Context
Part of #8140
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
Add the ability to update the timestamp when MFA initialization was last
skipped.
Get User By ID now also returns the timestamps when MFA setup was last
skipped.
# How the Problems Are Solved
- Add a `HumanMFAInitSkipped` method to the `users/v2` API.
- MFA skipped was already projected in the `auth.users3` table. In this
PR the same column is added to the users projection. Event handling is
kept the same as in the `UserView`:
<details>
62804ca45f/internal/user/repository/view/model/user.go (L243-L377)
</details>
# Additional Changes
- none
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/9197
# Which Problems Are Solved
* Adds support for the bulk SCIM v2 endpoint
# How the Problems Are Solved
* Adds support for the bulk SCIM v2 endpoint under `POST
/scim/v2/{orgID}/Bulk`
# Additional Context
Part of #8140
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
Paths for setup steps are joined with "\" when binary is started under
Windows, which results in wrongly joined paths.
# How the Problems Are Solved
Replace the usage of "filepath" with "path" package, which does only
join with "/" and nothing OS specific.
# Additional Changes
None
# Additional Context
Closes#9227
# Which Problems Are Solved
The membership fields migration timed out in certain cases. It also
tried to migrate instances which were already removed.
# How the Problems Are Solved
Revert the previous fix that combined the repeatable step for multiple
fill triggers. The membeship migration is now single-run as it might
take a lot of time. It is not worth making it repeatable. Instance IDs
of removed instances are skipped.
# Additional Changes
None
# Additional Context
Introduced in https://github.com/zitadel/zitadel/pull/9199
# Which Problems Are Solved
Memberships did not have a fields table fill migration.
# How the Problems Are Solved
Add filling of membership fields to the repeatable steps.
# Additional Changes
- Use the same repeatable step for multiple fill fields handlers.
- Fix an error for PostgreSQL 15 where a subquery in a `FROM` clause
needs an alias ing the `permitted_orgs` function.
# Additional Context
- Part of https://github.com/zitadel/zitadel/issues/9188
- Introduced in https://github.com/zitadel/zitadel/pull/9152
# Which Problems Are Solved
https://github.com/zitadel/zitadel/pull/9186 introduced the new `push`
sql function for cockroachdb. The function used the wrong database
function to generate the position of the event and would therefore
insert events at a position before events created with an old Zitadel
version.
# How the Problems Are Solved
Instead of `EXTRACT(EPOCH FROM NOW())`, `cluster_logical_timestamp()` is
used to calculate the position of an event.
# Additional Context
- Introduced in https://github.com/zitadel/zitadel/pull/9186
- Affected versions:
https://github.com/zitadel/zitadel/releases/tag/v2.67.3
# Which Problems Are Solved
Zitadel currently uses 3 database pool, 1 for queries, 1 for pushing
events and 1 for scheduled projection updates. This defeats the purpose
of a connection pool which already handles multiple connections.
During load tests we found that the current structure of connection
pools consumes a lot of database resources. The resource usage dropped
after we reduced the amount of database pools to 1 because existing
connections can be used more efficiently.
# How the Problems Are Solved
Removed logic to handle multiple connection pools and use a single one.
# Additional Changes
none
# Additional Context
part of https://github.com/zitadel/zitadel/issues/8352
# Which Problems Are Solved
Currently ZITADEL defines organization and instance member roles and
permissions in defaults.yaml. The permission check is done on API call
level. For example: "is this user allowed to make this call on this
org". This makes sense on the V1 API where the API is permission-level
shaped. For example, a search for users always happens in the context of
the organization. (Either the organization the calling user belongs to,
or through member ship and the x-zitadel-orgid header.
However, for resource based APIs we must be able to resolve permissions
by object. For example, an IAM_OWNER listing users should be able to get
all users in an instance based on the query filters. Alternatively a
user may have user.read permissions on one or more orgs. They should be
able to read just those users.
# How the Problems Are Solved
## Role permission mapping
The role permission mappings defined from `defaults.yaml` or local
config override are synchronized to the database on every run of
`zitadel setup`:
- A single query per **aggregate** builds a list of `add` and `remove`
actions needed to reach the desired state or role permission mappings
from the config.
- The required events based on the actions are pushed to the event
store.
- Events define search fields so that permission checking can use the
indices and is strongly consistent for both query and command sides.
The migration is split in the following aggregates:
- System aggregate for for roles prefixed with `SYSTEM`
- Each instance for roles not prefixed with `SYSTEM`. This is in
anticipation of instance level management over the API.
## Membership
Current instance / org / project membership events now have field table
definitions. Like the role permissions this ensures strong consistency
while still being able to use the indices of the fields table. A
migration is provided to fill the membership fields.
## Permission check
I aimed keeping the mental overhead to the developer to a minimal. The
provided implementation only provides a permission check for list
queries for org level resources, for example users. In the `query`
package there is a simple helper function `wherePermittedOrgs` which
makes sure the underlying database function is called as part of the
`SELECT` query and the permitted organizations are part of the `WHERE`
clause. This makes sure results from non-permitted organizations are
omitted. Under the hood:
- A Pg/PlSQL function searches for a list of organization IDs the passed
user has the passed permission.
- When the user has the permission on instance level, it returns early
with all organizations.
- The functions uses a number of views. The views help mapping the
fields entries into relational data and simplify the code use for the
function. The views provide some pre-filters which allow proper index
usage once the final `WHERE` clauses are set by the function.
# Additional Changes
# Additional Context
Closes#9032
Closes https://github.com/zitadel/zitadel/issues/9014https://github.com/zitadel/zitadel/issues/9188 defines follow-ups for
the new permission framework based on this concept.
# Which Problems Are Solved
The performance of the initial push function can further be increased
# How the Problems Are Solved
`eventstore.push`- and `eventstore.commands_to_events`-functions were
rewritten
# Additional Changes
none
# Additional Context
same optimizations as for postgres:
https://github.com/zitadel/zitadel/pull/9092
# Which Problems Are Solved
In versions previous to v2.66 it was possible to set a different
resource owner on project grants. This was introduced with the new
resource based API. The resource owner was possible to overwrite using
the x-zitadel-org header.
Because of this issue project grants got the wrong resource owner,
instead of the owner of the project it got the granted org which is
wrong because a resource owner of an aggregate is not allowed to change.
# How the Problems Are Solved
- The wrong owners of the events are set to the original owner of the
project.
- A new event is pushed to these aggregates `project.owner.corrected`
- The projection updates the owners of the user grants if that event was
written
# Additional Changes
The eventstore push function (replaced in version 2.66) writes the
correct resource owner.
# Additional Context
closes https://github.com/zitadel/zitadel/issues/9072
# Which Problems Are Solved
ListSessions only works to list the sessions that you are the creator
of.
# How the Problems Are Solved
Add options to search for sessions created by other users, sessions
belonging to the same useragent and sessions belonging to your user.
Possible through additional search parameters which as default use the
information contained in your session token but can also be filled with
specific IDs.
# Additional Changes
Remodel integration tests, to separate the Create and Get of sessions
correctly.
# Additional Context
Closes#8301
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
- Adds infrastructure code (basic implementation, error handling,
middlewares, ...) to implement the SCIM v2 interface
- Adds support for the user create SCIM v2 endpoint
# How the Problems Are Solved
- Adds support for the user create SCIM v2 endpoint under `POST
/scim/v2/{orgID}/Users`
# Additional Context
Part of #8140
# Which Problems Are Solved
On Zitadel cloud we found changing the order of columns in the
`eventstore.events2_current_sequence` index improved CPU usage for the
`SELECT ... FOR UPDATE` query the pusher executes.
# How the Problems Are Solved
`eventstore.events2_current_sequence`-index got replaced
# Additional Context
closes https://github.com/zitadel/zitadel/issues/9082
# Which Problems Are Solved
We were seeing high query costs in a the lateral join executed in the
commands_to_events procedural function in the database. The high cost
resulted in incremental CPU usage as a load test continued and less
req/sec handled, sarting at 836 and ending at 130 req/sec.
# How the Problems Are Solved
1. Set `PARALLEL SAFE`. I noticed that this option defaults to `UNSAFE`.
But it's actually safe if the function doesn't `INSERT`
2. Set the returned `ROWS 10` parameter.
3. Function is re-written in Pl/PgSQL so that we eliminate expensive
joins.
4. Introduced an intermediate state that does `SELECT DISTINCT` for the
aggregate so that we don't have to do an expensive lateral join.
# Additional Changes
Use a `COALESCE` to get the owner from the last event, instead of a
`CASE` switch.
# Additional Context
- Function was introduced in
https://github.com/zitadel/zitadel/pull/8816
- Closes https://github.com/zitadel/zitadel/issues/8352
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
It was possible to set a diffent algorithm for the legacy signer. This
is not supported howerver and breaks the token endpoint.
# How the Problems Are Solved
Remove the OIDC.SigningKeyAlgorithm config option and hard-code RS256
for the legacy signer.
# Additional Changes
- none
# Additional Context
Only RS256 is supported by the legacy signer. It was mentioned in the
comment of the config not to use it and use the webkeys resource
instead.
- closes#9121
# Which Problems Are Solved
get instance by domain cannot provide an instance id because it is not
known at that time. This causes a full table scan on the fields table
because current indexes always include the `instance_id` column.
# How the Problems Are Solved
Added a specific index for this query.
# Additional Context
If a system has many fields and there is no cache hit for the given
domain this query can heaviuly influence database CPU usage, the newly
added resolves this problem.
# Which Problems Are Solved
The console has no information about where and how to send PostHog
events.
# How the Problems Are Solved
A PostHog API URL and token are passed through as plain text from the
Zitadel runtime config to the environment.json. By default, no values
are configured and the keys in the environment.json are omitted.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/9070
- Complements https://github.com/zitadel/zitadel/pull/9077
# Which Problems Are Solved
Some IdP callbacks use HTTP form POST to return their data on callbacks.
For handling CSRF in the login after such calls, a 302 Found to the
corresponding non form callback (in ZITADEL) is sent. Depending on the
size of the initial form body, this could lead to ZITADEL terminating
the connection, resulting in the user not getting a response or an
intermediate proxy to return them an HTTP 502.
# How the Problems Are Solved
- the form body is parsed and stored into the ZITADEL cache (using the
configured database by default)
- the redirect (302 Found) is performed with the request id
- the callback retrieves the data from the cache instead of the query
parameters (will fallback to latter to handle open uncached requests)
# Additional Changes
- fixed a typo in the default (cache) configuration: `LastUsage` ->
`LastUseAge`
# Additional Context
- reported by a customer
- needs to be backported to current cloud version (2.66.x)
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
It is currently not possible to use SAML with the Session API.
# How the Problems Are Solved
Add SAML service, to get and resolve SAML requests.
Add SAML session and SAML request aggregate, which can be linked to the
Session to get back a SAMLResponse from the API directly.
# Additional Changes
Update of dependency zitadel/saml to provide all functionality for
handling of SAML requests and responses.
# Additional Context
Closes#6053
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
To be able to migrate or test the new login UI, admins might want to
(temporarily) switch individual apps.
At a later point admin might want to make sure all applications use the
new login UI.
# How the Problems Are Solved
- Added a feature flag `` on instance level to require all apps to use
the new login and provide an optional base url.
- if the flag is enabled, all (OIDC) applications will automatically use
the v2 login.
- if disabled, applications can decide based on their configuration
- Added an option on OIDC apps to use the new login UI and an optional
base url.
- Removed the requirement to use `x-zitadel-login-client` to be
redirected to the login V2 and retrieve created authrequest and link
them to SSO sessions.
- Added a new "IAM_LOGIN_CLIENT" role to allow management of users,
sessions, grants and more without `x-zitadel-login-client`.
# Additional Changes
None
# Additional Context
closes https://github.com/zitadel/zitadel/issues/8702
# Which Problems Are Solved
When downgrading zitadel and upgrading it again, it might be that orgs
deleted in this period still have stale entries in the fields table.
# How the Problems Are Solved
- Make the cleanup repeatable
- Scope the query by instance so that an index is used.
# Which Problems Are Solved
setup step 41 cannot handle downgrades at the moment. This step writes
the instance domain to the fields table. If there are new instances
created during the downgraded version is running there would be domain
missing in the fields afterwards.
# How the Problems Are Solved
Make step 41 repeatable for each version
# Which Problems Are Solved
Migration step 39 is supposed to cleanup stale organization entries in
the eventstore.fields table. In order to do this it used the projection
to check which orgs still exist.
During initial setup of ZITADEL the first instance with the organization
is created. Howevet, the projections are filled after all migrations are
done. With the organization projection empty, the fields of the first
org would be deleted.
This was discovered during development of a new field type. The
accosiated events did not yet have any projection based filled assigned.
It seems fields with a pre-fill projection are somehow restored.
Therefore a restoration migration isn't required IMO.
# How the Problems Are Solved
Query the event store for `org.removed` events instead. This has the
drawback of using a sequential scan on the eventstore, making the
migration more expensive.
# Additional Changes
- none
# Additional Context
- Introduced in https://github.com/zitadel/zitadel/pull/8946
# Which Problems Are Solved
Scheduled handlers use `eventstore.InstanceIDs` to get the all active
instances within a given timeframe. This function scrapes through all
events written within that time frame which can cause heavy load on the
database.
# How the Problems Are Solved
A new query cache `activeInstances` is introduced which caches the ids
of all instances queried by id or host within the configured timeframe.
# Additional Changes
- Changed `default.yaml`
- Removed `HandleActiveInstances` from custom handler configs
- Added `MaxActiveInstances` to define the maximal amount of cached
instance ids
- fixed start-from-init and start-from-setup to start auth and admin
projections twice
- fixed org cache invalidation to use correct index
# Additional Context
- part of #8999
# Which Problems Are Solved
There are some problems related to the use of CockroachDB with the new
notification handling (#8931).
See #9002 for details.
# How the Problems Are Solved
- Brought back the previous notification handler as legacy mode.
- Added a configuration to choose between legacy mode and new parallel
workers.
- Enabled legacy mode by default to prevent issues.
# Additional Changes
None
# Additional Context
- closes https://github.com/zitadel/zitadel/issues/9002
- relates to #8931
# Which Problems Are Solved
While running the latest RC / main, we noticed some errors including
context timeouts and rollback issues.
# How the Problems Are Solved
- The transaction context is passed and used for any event being written
and for handling savepoints to be able to handle context timeouts.
- The user projection is not triggered anymore. This will reduce
unnecessary load and potential timeouts if lot of workers are running.
In case a user would not be projected yet, the request event will log an
error and then be skipped / retried on the next run.
- Additionally, the context is checked if being closed after each event
process.
- `latestRetries` now correctly only returns the latest retry events to
be processed
- Default values for notifications have been changed to run workers less
often, more retry delay, but less transaction duration.
# Additional Changes
None
# Additional Context
relates to #8931
---------
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Which Problems Are Solved
Instance domains are only computed on read side. This can cause missing
domains if calls are executed shortly after a instance domain (or
instance) was added.
# How the Problems Are Solved
The instance domain is added to the fields table which is filled on
command side.
# Additional Changes
- added setup step to compute instance domains
- instance by host uses fields table instead of instance_domains table
# Additional Context
- part of https://github.com/zitadel/zitadel/issues/8999
# Which Problems Are Solved
If many events are written to the same aggregate id it can happen that
zitadel [starts to retry the push
transaction](48ffc902cc/internal/eventstore/eventstore.go (L101))
because [the locking
behaviour](48ffc902cc/internal/eventstore/v3/sequence.go (L25))
during push does compute the wrong sequence because newly committed
events are not visible to the transaction. These events impact the
current sequence.
In cases with high command traffic on a single aggregate id this can
have severe impact on general performance of zitadel. Because many
connections of the `eventstore pusher` database pool are blocked by each
other.
# How the Problems Are Solved
To improve the performance this locking mechanism was removed and the
business logic of push is moved to sql functions which reduce network
traffic and can be analyzed by the database before the actual push. For
clients of the eventstore framework nothing changed.
# Additional Changes
- after a connection is established prefetches the newly added database
types
- `eventstore.BaseEvent` now returns the correct revision of the event
# Additional Context
- part of https://github.com/zitadel/zitadel/issues/8931
---------
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
Co-authored-by: Miguel Cabrerizo <30386061+doncicuto@users.noreply.github.com>
Co-authored-by: Joakim Lodén <Loddan@users.noreply.github.com>
Co-authored-by: Yxnt <Yxnt@users.noreply.github.com>
Co-authored-by: Stefan Benz <stefan@caos.ch>
Co-authored-by: Harsha Reddy <harsha.reddy@klaviyo.com>
Co-authored-by: Zach H <zhirschtritt@gmail.com>
# Which Problems Are Solved
The action v2 messages were didn't contain anything providing security
for the sent content.
# How the Problems Are Solved
Each Target now has a SigningKey, which can also be newly generated
through the API and returned at creation and through the Get-Endpoints.
There is now a HTTP header "Zitadel-Signature", which is generated with
the SigningKey and Payload, and also contains a timestamp to check with
a tolerance if the message took to long to sent.
# Additional Changes
The functionality to create and check the signature is provided in the
pkg/actions package, and can be reused in the SDK.
# Additional Context
Closes#7924
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
The current handling of notification follows the same pattern as all
other projections:
Created events are handled sequentially (based on "position") by a
handler. During the process, a lot of information is aggregated (user,
texts, templates, ...).
This leads to back pressure on the projection since the handling of
events might take longer than the time before a new event (to be
handled) is created.
# How the Problems Are Solved
- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all the present and required information
including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request including the current state of the user (this
prevents race conditions, where a user is changed in the meantime and
the notification already gets the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt.
- If the configured amount of attempts is reached or the message expired
(based on config), a cancel event is created, letting the workers know,
the notification must no longer be handled.
- In case of successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object is stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml
Notifications:
# The amount of workers processing the notification request events.
# If set to 0, no notification request events will be handled. This can be useful when running in
# multi binary / pod setup and allowing only certain executables to process the events.
Workers: 1 # ZITADEL_NOTIFIACATIONS_WORKERS
# The amount of events a single worker will process in a run.
BulkLimit: 10 # ZITADEL_NOTIFIACATIONS_BULKLIMIT
# Time interval between scheduled notifications for request events
RequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_REQUEUEEVERY
# The amount of workers processing the notification retry events.
# If set to 0, no notification retry events will be handled. This can be useful when running in
# multi binary / pod setup and allowing only certain executables to process the events.
RetryWorkers: 1 # ZITADEL_NOTIFIACATIONS_RETRYWORKERS
# Time interval between scheduled notifications for retry events
RetryRequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_RETRYREQUEUEEVERY
# Only instances are projected, for which at least a projection-relevant event exists within the timeframe
# from HandleActiveInstances duration in the past until the projection's current time
# If set to 0 (default), every instance is always considered active
HandleActiveInstances: 0s # ZITADEL_NOTIFIACATIONS_HANDLEACTIVEINSTANCES
# The maximum duration a transaction remains open
# before it spots left folding additional events
# and updates the table.
TransactionDuration: 1m # ZITADEL_NOTIFIACATIONS_TRANSACTIONDURATION
# Automatically cancel the notification after the amount of failed attempts
MaxAttempts: 3 # ZITADEL_NOTIFIACATIONS_MAXATTEMPTS
# Automatically cancel the notification if it cannot be handled within a specific time
MaxTtl: 5m # ZITADEL_NOTIFIACATIONS_MAXTTL
# Failed attempts are retried after a confogired delay (with exponential backoff).
# Set a minimum and maximum delay and a factor for the backoff
MinRetryDelay: 1s # ZITADEL_NOTIFIACATIONS_MINRETRYDELAY
MaxRetryDelay: 20s # ZITADEL_NOTIFIACATIONS_MAXRETRYDELAY
# Any factor below 1 will be set to 1
RetryDelayFactor: 1.5 # ZITADEL_NOTIFIACATIONS_RETRYDELAYFACTOR
```
# Additional Changes
None
# Additional Context
- closes#8931
# Which Problems Are Solved
When an org is removed, the corresponding fields are not deleted. This
creates issues, such as recreating a new org with the same verified
domain.
# How the Problems Are Solved
Remove the search fields by the org aggregate, instead of just setting
the removed state.
# Additional Changes
- Cleanup migration script that removed current stale fields.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/8943
- Related to https://github.com/zitadel/zitadel/pull/8790
---------
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
Organizations are ofter searched for by ID or primary domain. This
results in many redundant queries, resulting in a performance impact.
# How the Problems Are Solved
Cache Organizaion objects by ID and primary domain.
# Additional Changes
- Adjust integration test config to use all types of cache.
- Adjust integration test lifetimes so the pruner has something to do
while the tests run.
# Additional Context
- Closes#8865
- After #8902
# Which Problems Are Solved
By having default entries in the `Username` and `ClientName` fields, it
was not possible to unset there parameters. Unsetting them is required
for GCP connections
# How the Problems Are Solved
Set the fields to empty strings.
# Additional Changes
- none
# Additional Context
- none
# Which Problems Are Solved
If a redis cache has connection issues or any other type of permament
error,
it tanks the responsiveness of ZITADEL.
We currently do not support things like Redis cluster or sentinel. So
adding a simple redis cache improves performance but introduces a single
point of failure.
# How the Problems Are Solved
Implement a [circuit
breaker](https://learn.microsoft.com/en-us/previous-versions/msp-n-p/dn589784(v=pandp.10)?redirectedfrom=MSDN)
as
[`redis.Limiter`](https://pkg.go.dev/github.com/redis/go-redis/v9#Limiter)
by wrapping sony's [gobreaker](https://github.com/sony/gobreaker)
package. This package is picked as it seems well maintained and we
already use their `sonyflake` package
# Additional Changes
- The unit tests constructed an unused `redis.Client` and didn't cleanup
the connector. This is now fixed.
# Additional Context
Closes#8864
# Which Problems Are Solved
Fixes 'column "instance_id" does not exist' errors from #8558.
# How the Problems Are Solved
The instanceClause / WHERE clause in the query for the respective tables
is excluded.
I have successfully created a mirror with this change.
# Which Problems Are Solved
Fixes small typo in email body during user creation & verification. The
change also includes the removal of some unnecessary white space in the
same yaml file.
# How the Problems Are Solved
Replaces din't with didn't.

Co-authored-by: jtaylor@dingo.com <jtaylor@dingo.com>
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
https://github.com/zitadel/zitadel/pull/8788 accidentally changed the
spelling of milestone types from PascalCase to snake_case. This breaks
systems where `milestone.pushed` events already exist.
# How the Problems Are Solved
- Use PascalCase again
- Prefix event types with v2. (Previous pushed event type was anyway
ignored).
- Create `milstones3` projection
# Additional Changes
None
# Additional Context
relates to #8788