# Which Problems Are Solved
Scheduled handlers use `eventstore.InstanceIDs` to get all active
instances within a given timeframe. This function scrapes through all
events written within that time frame which can cause heavy load on the
database.
# How the Problems Are Solved
A new query cache `activeInstances` is introduced which caches the ids
of all instances queried by id or host within the configured timeframe.
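The idea can be sketched as a bounded, time-aware set of instance IDs. This is a minimal sketch and not the actual implementation: the type `activeInstances`, its methods and the eviction policy (drop the least recently seen ID once the limit is reached) are assumptions made for illustration; only the name `MaxActiveInstances` comes from this change.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// activeInstances is a hypothetical, bounded record of recently queried instance IDs.
type activeInstances struct {
	mu       sync.Mutex
	max      int                  // illustrative counterpart of MaxActiveInstances, expected to be > 0
	now      func() time.Time     // clock, replaceable for testing
	lastSeen map[string]time.Time // instance ID -> time of the last query
}

func newActiveInstances(max int) *activeInstances {
	return &activeInstances{max: max, now: time.Now, lastSeen: make(map[string]time.Time)}
}

// Touch records that an instance was just queried by ID or host.
func (a *activeInstances) Touch(id string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, known := a.lastSeen[id]; !known && len(a.lastSeen) >= a.max {
		a.evictOldest()
	}
	a.lastSeen[id] = a.now()
}

// evictOldest drops the least recently seen ID to make room for a new one.
func (a *activeInstances) evictOldest() {
	oldestID, found := "", false
	var oldest time.Time
	for id, seen := range a.lastSeen {
		if !found || seen.Before(oldest) {
			oldestID, oldest, found = id, seen, true
		}
	}
	if found {
		delete(a.lastSeen, oldestID)
	}
}

// Active returns the IDs that were queried within the given timeframe.
func (a *activeInstances) Active(within time.Duration) []string {
	a.mu.Lock()
	defer a.mu.Unlock()
	now := a.now()
	ids := make([]string, 0, len(a.lastSeen))
	for id, seen := range a.lastSeen {
		if now.Sub(seen) <= within {
			ids = append(ids, id)
		}
	}
	return ids
}

func main() {
	c := newActiveInstances(2)
	c.Touch("instance-a")
	c.Touch("instance-b")
	c.Touch("instance-c") // the limit is reached, so one older ID is evicted
	fmt.Println(len(c.Active(time.Hour)))
}
```

A scheduled handler would then iterate over `Active(...)` instead of scanning the events table for instance IDs.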
# Additional Changes
- Changed `default.yaml`
- Removed `HandleActiveInstances` from custom handler configs
- Added `MaxActiveInstances` to define the maximal amount of cached
instance ids
- fixed start-from-init and start-from-setup to start auth and admin
projections twice
- fixed org cache invalidation to use correct index
# Additional Context
- part of #8999
# Which Problems Are Solved
There are some problems related to the use of CockroachDB with the new
notification handling (#8931).
See #9002 for details.
# How the Problems Are Solved
- Brought back the previous notification handler as legacy mode.
- Added a configuration to choose between legacy mode and new parallel
workers.
- Enabled legacy mode by default to prevent issues.
# Additional Changes
None
# Additional Context
- closes https://github.com/zitadel/zitadel/issues/9002
- relates to #8931
# Which Problems Are Solved
While running the latest RC / main, we noticed some errors including
context timeouts and rollback issues.
# How the Problems Are Solved
- The transaction context is passed and used for any event being written
and for handling savepoints to be able to handle context timeouts.
- The user projection is not triggered anymore. This will reduce
unnecessary load and potential timeouts if a lot of workers are running.
In case a user would not be projected yet, the request event will log an
error and then be skipped / retried on the next run.
- Additionally, the context is checked if being closed after each event
process.
- `latestRetries` now correctly only returns the latest retry events to
be processed
- Default values for notifications have been changed to run workers less
often, more retry delay, but less transaction duration.
# Additional Changes
None
# Additional Context
relates to #8931
---------
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Which Problems Are Solved
Instance domains are only computed on read side. This can cause missing
domains if calls are executed shortly after an instance domain (or
instance) was added.
# How the Problems Are Solved
The instance domain is added to the fields table which is filled on
command side.
# Additional Changes
- added setup step to compute instance domains
- instance by host uses fields table instead of instance_domains table
# Additional Context
- part of https://github.com/zitadel/zitadel/issues/8999
# Which Problems Are Solved
If many events are written to the same aggregate id it can happen that
zitadel [starts to retry the push
transaction](48ffc902cc/internal/eventstore/eventstore.go (L101))
because [the locking
behaviour](48ffc902cc/internal/eventstore/v3/sequence.go (L25))
during push does compute the wrong sequence because newly committed
events are not visible to the transaction. These events impact the
current sequence.
In cases with high command traffic on a single aggregate id this can
have severe impact on general performance of zitadel. Because many
connections of the `eventstore pusher` database pool are blocked by each
other.
# How the Problems Are Solved
To improve the performance this locking mechanism was removed and the
business logic of push is moved to sql functions which reduce network
traffic and can be analyzed by the database before the actual push. For
clients of the eventstore framework nothing changed.
# Additional Changes
- after a connection is established prefetches the newly added database
types
- `eventstore.BaseEvent` now returns the correct revision of the event
# Additional Context
- part of https://github.com/zitadel/zitadel/issues/8931
---------
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
Co-authored-by: Miguel Cabrerizo <30386061+doncicuto@users.noreply.github.com>
Co-authored-by: Joakim Lodén <Loddan@users.noreply.github.com>
Co-authored-by: Yxnt <Yxnt@users.noreply.github.com>
Co-authored-by: Stefan Benz <stefan@caos.ch>
Co-authored-by: Harsha Reddy <harsha.reddy@klaviyo.com>
Co-authored-by: Zach H <zhirschtritt@gmail.com>
# Which Problems Are Solved
The action v2 messages didn't contain anything providing security
for the sent content.
# How the Problems Are Solved
Each Target now has a SigningKey, which can also be newly generated
through the API and returned at creation and through the Get-Endpoints.
There is now an HTTP header "Zitadel-Signature", which is generated with
the SigningKey and Payload, and also contains a timestamp to check with
a tolerance if the message took too long to send.
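A receiver-side check could look roughly like the following. This is a sketch under stated assumptions and not the code from `pkg/actions`: the header layout `t=<unix seconds>,v1=<hex HMAC-SHA256>`, the signed string `<timestamp>.<payload>` and the function names `computeSignature` and `validatePayload` are all hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strconv"
	"strings"
	"time"
)

var (
	errMalformed = errors.New("malformed signature header")
	errExpired   = errors.New("message is older than the tolerance")
	errMismatch  = errors.New("signature does not match")
)

// computeSignature builds a header value of the assumed form "t=<unix>,v1=<hex hmac>".
func computeSignature(sentAt int64, payload []byte, signingKey string) string {
	mac := hmac.New(sha256.New, []byte(signingKey))
	// Sign the timestamp together with the payload so neither can be swapped.
	mac.Write([]byte(strconv.FormatInt(sentAt, 10)))
	mac.Write([]byte("."))
	mac.Write(payload)
	return fmt.Sprintf("t=%d,v1=%s", sentAt, hex.EncodeToString(mac.Sum(nil)))
}

// validatePayload verifies the header against the payload and rejects
// messages whose timestamp is older than the tolerance (now is in unix seconds).
func validatePayload(payload []byte, header, signingKey string, now int64, tolerance time.Duration) error {
	if !strings.HasPrefix(header, "t=") {
		return errMalformed
	}
	timestamp, _, ok := strings.Cut(strings.TrimPrefix(header, "t="), ",v1=")
	if !ok {
		return errMalformed
	}
	sentAt, err := strconv.ParseInt(timestamp, 10, 64)
	if err != nil {
		return errMalformed
	}
	if time.Duration(now-sentAt)*time.Second > tolerance {
		return errExpired
	}
	// Recompute the full header and compare in constant time.
	expected := computeSignature(sentAt, payload, signingKey)
	if !hmac.Equal([]byte(expected), []byte(header)) {
		return errMismatch
	}
	return nil
}

func main() {
	payload := []byte(`{"event":"example"}`)
	now := time.Now().Unix()
	header := computeSignature(now, payload, "signing-key")
	fmt.Println(validatePayload(payload, header, "signing-key", now, 5*time.Minute))
}
```

Binding the timestamp into the signed string is what makes the tolerance check meaningful: a replayed message cannot simply carry a fresh timestamp.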
# Additional Changes
The functionality to create and check the signature is provided in the
pkg/actions package, and can be reused in the SDK.
# Additional Context
Closes #7924
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
The current handling of notifications follows the same pattern as all
other projections:
Created events are handled sequentially (based on "position") by a
handler. During the process, a lot of information is aggregated (user,
texts, templates, ...).
This leads to back pressure on the projection since the handling of
events might take longer than the time before a new event (to be
handled) is created.
# How the Problems Are Solved
- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all the present and required information
including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request including the current state of the user (this
prevents race conditions, where a user is changed in the meantime and
the notification already gets the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt.
- If the configured amount of attempts is reached or the message expired
(based on config), a cancel event is created, letting the workers know,
the notification must no longer be handled.
- In case of successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object are stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml
Notifications:
  # The amount of workers processing the notification request events.
  # If set to 0, no notification request events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  Workers: 1 # ZITADEL_NOTIFICATIONS_WORKERS
  # The amount of events a single worker will process in a run.
  BulkLimit: 10 # ZITADEL_NOTIFICATIONS_BULKLIMIT
  # Time interval between scheduled notifications for request events
  RequeueEvery: 2s # ZITADEL_NOTIFICATIONS_REQUEUEEVERY
  # The amount of workers processing the notification retry events.
  # If set to 0, no notification retry events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  RetryWorkers: 1 # ZITADEL_NOTIFICATIONS_RETRYWORKERS
  # Time interval between scheduled notifications for retry events
  RetryRequeueEvery: 2s # ZITADEL_NOTIFICATIONS_RETRYREQUEUEEVERY
  # Only instances are projected, for which at least a projection-relevant event exists within the timeframe
  # from HandleActiveInstances duration in the past until the projection's current time
  # If set to 0 (default), every instance is always considered active
  HandleActiveInstances: 0s # ZITADEL_NOTIFICATIONS_HANDLEACTIVEINSTANCES
  # The maximum duration a transaction remains open
  # before it stops folding additional events
  # and updates the table.
  TransactionDuration: 1m # ZITADEL_NOTIFICATIONS_TRANSACTIONDURATION
  # Automatically cancel the notification after the amount of failed attempts
  MaxAttempts: 3 # ZITADEL_NOTIFICATIONS_MAXATTEMPTS
  # Automatically cancel the notification if it cannot be handled within a specific time
  MaxTtl: 5m # ZITADEL_NOTIFICATIONS_MAXTTL
  # Failed attempts are retried after a configured delay (with exponential backoff).
  # Set a minimum and maximum delay and a factor for the backoff
  MinRetryDelay: 1s # ZITADEL_NOTIFICATIONS_MINRETRYDELAY
  MaxRetryDelay: 20s # ZITADEL_NOTIFICATIONS_MAXRETRYDELAY
  # Any factor below 1 will be set to 1
  RetryDelayFactor: 1.5 # ZITADEL_NOTIFICATIONS_RETRYDELAYFACTOR
```
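How the three retry settings could interact is sketched below. The formula (minimum delay multiplied by the factor once per previous attempt, capped at the maximum) is an assumption about the exponential backoff described in the comments, and `retryDelay` is a hypothetical helper, not a function from the codebase.

```go
package main

import (
	"fmt"
	"time"
)

// retryDelay returns the assumed delay before the given attempt (1-based):
// minDelay * factor^(attempt-1), never more than maxDelay.
func retryDelay(attempt int, minDelay, maxDelay time.Duration, factor float64) time.Duration {
	if factor < 1 {
		factor = 1 // mirrors "Any factor below 1 will be set to 1"
	}
	delay := float64(minDelay)
	for i := 1; i < attempt; i++ {
		delay *= factor
		if delay >= float64(maxDelay) {
			return maxDelay
		}
	}
	if delay > float64(maxDelay) {
		return maxDelay
	}
	return time.Duration(delay)
}

func main() {
	// Values taken from the defaults above: 1s minimum, 20s maximum, factor 1.5.
	for attempt := 1; attempt <= 5; attempt++ {
		fmt.Println(attempt, retryDelay(attempt, time.Second, 20*time.Second, 1.5))
	}
}
```

With `MaxAttempts: 3` only the first few delays are ever used; `MaxRetryDelay` matters once more attempts are configured.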
# Additional Changes
None
# Additional Context
- closes #8931
# Which Problems Are Solved
When an org is removed, the corresponding fields are not deleted. This
creates issues, such as recreating a new org with the same verified
domain.
# How the Problems Are Solved
Remove the search fields by the org aggregate, instead of just setting
the removed state.
# Additional Changes
- Cleanup migration script that removes current stale fields.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/8943
- Related to https://github.com/zitadel/zitadel/pull/8790
---------
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
Organizations are often searched for by ID or primary domain. This
results in many redundant queries, resulting in a performance impact.
# How the Problems Are Solved
Cache Organization objects by ID and primary domain.
# Additional Changes
- Adjust integration test config to use all types of cache.
- Adjust integration test lifetimes so the pruner has something to do
while the tests run.
# Additional Context
- Closes #8865
- After #8902
# Which Problems Are Solved
By having default entries in the `Username` and `ClientName` fields, it
was not possible to unset these parameters. Unsetting them is required
for GCP connections.
# How the Problems Are Solved
Set the fields to empty strings.
# Additional Changes
- none
# Additional Context
- none
# Which Problems Are Solved
If a redis cache has connection issues or any other type of permanent
error, it tanks the responsiveness of ZITADEL.
We currently do not support things like Redis cluster or sentinel. So
adding a simple redis cache improves performance but introduces a single
point of failure.
# How the Problems Are Solved
Implement a [circuit
breaker](https://learn.microsoft.com/en-us/previous-versions/msp-n-p/dn589784(v=pandp.10)?redirectedfrom=MSDN)
as
[`redis.Limiter`](https://pkg.go.dev/github.com/redis/go-redis/v9#Limiter)
by wrapping sony's [gobreaker](https://github.com/sony/gobreaker)
package. This package is picked as it seems well maintained and we
already use their `sonyflake` package.
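The shape of such a limiter is sketched below without external dependencies. It is not the gobreaker-based implementation: `breaker` is a hypothetical, deliberately small circuit breaker that only shares the method set of go-redis' `Limiter` interface (`Allow() error` and `ReportResult(result error)`); the thresholds and field names are illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errCircuitOpen = errors.New("circuit breaker is open")

// breaker opens after a number of consecutive failures and lets a trial
// command through once the timeout has elapsed.
type breaker struct {
	mu          sync.Mutex
	maxFailures int           // consecutive failures that open the circuit
	timeout     time.Duration // how long the circuit stays open
	failures    int
	open        bool
	openedAt    time.Time
}

func newBreaker(maxFailures int, timeout time.Duration) *breaker {
	return &breaker{maxFailures: maxFailures, timeout: timeout}
}

// Allow is called before each command; it fails fast while the circuit is open.
func (b *breaker) Allow() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.open && time.Since(b.openedAt) < b.timeout {
		return errCircuitOpen
	}
	// Closed, or the timeout elapsed: let the command through as a trial.
	return nil
}

// ReportResult is called after each allowed command with its outcome.
func (b *breaker) ReportResult(result error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if result == nil {
		b.failures, b.open = 0, false
		return
	}
	b.failures++
	if b.failures >= b.maxFailures {
		b.open, b.openedAt = true, time.Now()
	}
}

func main() {
	b := newBreaker(2, time.Minute)
	for i := 0; i < 3; i++ {
		if err := b.Allow(); err != nil {
			fmt.Println("skipping cache:", err)
			continue
		}
		b.ReportResult(errors.New("connection refused")) // simulated failure
	}
}
```

While the circuit is open, callers skip the cache immediately instead of waiting for a connection timeout on every request.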
# Additional Changes
- The unit tests constructed an unused `redis.Client` and didn't cleanup
the connector. This is now fixed.
# Additional Context
Closes #8864
# Which Problems Are Solved
Fixes 'column "instance_id" does not exist' errors from #8558.
# How the Problems Are Solved
The instanceClause / WHERE clause in the query for the respective tables
is excluded.
I have successfully created a mirror with this change.
# Which Problems Are Solved
Fixes small typo in email body during user creation & verification. The
change also includes the removal of some unnecessary white space in the
same yaml file.
# How the Problems Are Solved
Replaces din't with didn't.
![image](https://github.com/user-attachments/assets/48abf38b-4deb-42b7-a85b-91009e19f27f)
Co-authored-by: jtaylor@dingo.com <jtaylor@dingo.com>
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
https://github.com/zitadel/zitadel/pull/8788 accidentally changed the
spelling of milestone types from PascalCase to snake_case. This breaks
systems where `milestone.pushed` events already exist.
# How the Problems Are Solved
- Use PascalCase again
- Prefix event types with v2. (Previous pushed event type was anyway
ignored).
- Create `milestones3` projection
# Additional Changes
None
# Additional Context
relates to #8788
# Which Problems Are Solved
Add a cache implementation using Redis single mode. This does not add
support for Redis Cluster or sentinel.
# How the Problems Are Solved
Added the `internal/cache/redis` package. All operations occur
atomically, including setting of secondary indexes, using LUA scripts
where needed.
The [`miniredis`](https://github.com/alicebob/miniredis) package is used
to run unit tests.
# Additional Changes
- Move connector code to `internal/cache/connector/...` and remove
duplicate code from `query` and `command` packages.
- Fix a missed invalidation on the restrictions projection
# Additional Context
Closes #8130
# Which Problems Are Solved
Currently ZITADEL supports RP-initiated logout for clients. Back-channel
logout ensures that user sessions are terminated across all connected
applications, even if the user closes their browser or loses
connectivity, providing a more secure alternative for certain use cases.
# How the Problems Are Solved
If the feature is activated and the client used for the authentication
has a back_channel_logout_uri configured, a
`session_logout.back_channel` will be registered. Once a user terminates
their session, a (notification) handler will send a SET (form POST) to
the registered uri containing a logout_token (with the user's ID and
session ID).
- A new feature "back_channel_logout" is added on system and instance
level
- A `back_channel_logout_uri` can be managed on OIDC applications
- Added a `session_logout` aggregate to register and inform about sent
`back_channel` notifications
- Added a `SecurityEventToken` channel and `Form` message type in the
notification handlers
- Added `TriggeredAtOrigin` fields to `HumanSignedOut` and
`TerminateSession` events for notification handling
- Exported various functions and types in the `oidc` package to be able
to reuse for token signing in the back_channel notifier.
- To prevent existing session termination events from being handled, a
setup step is added to set the `current_states` for the
`projections.notifications_back_channel_logout` to the current position
- [x] requires https://github.com/zitadel/oidc/pull/671
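For orientation, the content of such a logout token and the form POST body can be sketched as follows. The claim names and the event identifier follow the OpenID Connect Back-Channel Logout specification; the struct, the helper names and all values are hypothetical, and signing of the token is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// backChannelLogoutEvent is the event identifier defined by the
// OpenID Connect Back-Channel Logout specification.
const backChannelLogoutEvent = "http://schemas.openid.net/event/backchannel-logout"

// logoutTokenClaims lists the claims of a logout token before it is signed.
type logoutTokenClaims struct {
	Issuer    string              `json:"iss"`
	Subject   string              `json:"sub"` // the user's ID
	Audience  []string            `json:"aud"` // the client the token is sent to
	IssuedAt  int64               `json:"iat"`
	Expiry    int64               `json:"exp"`
	TokenID   string              `json:"jti"`
	SessionID string              `json:"sid"`
	Events    map[string]struct{} `json:"events"`
}

// newLogoutTokenClaims fills the claims for one terminated session.
func newLogoutTokenClaims(issuer, clientID, userID, sessionID, tokenID string, issuedAt, lifetimeSeconds int64) logoutTokenClaims {
	return logoutTokenClaims{
		Issuer:    issuer,
		Subject:   userID,
		Audience:  []string{clientID},
		IssuedAt:  issuedAt,
		Expiry:    issuedAt + lifetimeSeconds,
		TokenID:   tokenID,
		SessionID: sessionID,
		Events:    map[string]struct{}{backChannelLogoutEvent: {}},
	}
}

// formBody encodes the signed token as the form POST body that is sent to
// the client's back_channel_logout_uri.
func formBody(signedToken string) string {
	return url.Values{"logout_token": {signedToken}}.Encode()
}

func main() {
	claims := newLogoutTokenClaims("https://issuer.example", "client-1", "user-1", "session-1", "token-1", 1700000000, 900)
	payload, err := json.Marshal(claims)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
	fmt.Println(formBody("header.payload.signature"))
}
```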
# Additional Changes
- Updated all OTEL dependencies to v1.29.0, since OIDC already updated
some of them to that version.
- Single Session Termination feature is correctly checked (fixed feature
mapping)
# Additional Context
- closes https://github.com/zitadel/zitadel/issues/8467
- TODO:
- Documentation
- UI to be done: https://github.com/zitadel/zitadel/issues/8469
---------
Co-authored-by: Hidde Wieringa <hidde@hiddewieringa.nl>
# Which Problems Are Solved
Milestones used existing events from a number of aggregates. OIDC
session is one of them. We noticed in load-tests that the reduction of
the oidc_session.added event into the milestone projection is a costly
business with payload based conditionals. A milestone is reached once,
but even then we remain subscribed to the OIDC events. This requires the
projections.current_states to be updated continuously.
# How the Problems Are Solved
The milestone creation is refactored to use dedicated events instead.
The command side decides when a milestone is reached and creates the
reached event once for each milestone when required.
# Additional Changes
In order to prevent reached milestones being created twice, a migration
script is provided. When the old `projections.milestones` table exists,
the state is read from there and `v2` milestone aggregate events are
created, with the original reached and pushed dates.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/8800
# Which Problems Are Solved
System administrators can block hosts and IPs for HTTP calls in actions.
Using DNS, blocked IPs could be bypassed.
# How the Problems Are Solved
- Hosts are resolved (DNS lookup) to check whether their corresponding
IP is blocked.
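A minimal sketch of the check, assuming a deny list made of CIDR ranges: `checkHost`, `lookupFunc`, `staticLookup` and `mustParseCIDRs` are hypothetical helpers for illustration, and the resolver is injected so the example runs without network access. In a real setup `net.LookupIP` has the matching signature.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

var errBlocked = errors.New("destination is blocked")

// lookupFunc matches the signature of net.LookupIP so it can be swapped out.
type lookupFunc func(host string) ([]net.IP, error)

// mustParseCIDRs turns CIDR strings into networks and panics on invalid input.
func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	networks := make([]*net.IPNet, 0, len(cidrs))
	for _, cidr := range cidrs {
		_, network, err := net.ParseCIDR(cidr)
		if err != nil {
			panic(err)
		}
		networks = append(networks, network)
	}
	return networks
}

// staticLookup is a fake resolver backed by a fixed host -> IPs table.
func staticLookup(table map[string][]string) lookupFunc {
	return func(host string) ([]net.IP, error) {
		entries, ok := table[host]
		if !ok {
			return nil, fmt.Errorf("no such host: %s", host)
		}
		ips := make([]net.IP, 0, len(entries))
		for _, entry := range entries {
			ips = append(ips, net.ParseIP(entry))
		}
		return ips, nil
	}
}

// checkHost rejects a host if it is, or resolves to, an IP in a denied network.
func checkHost(host string, denied []*net.IPNet, lookup lookupFunc) error {
	ips := []net.IP{net.ParseIP(host)}
	if ips[0] == nil {
		// Not a literal IP: resolve the name before checking the deny list.
		var err error
		if ips, err = lookup(host); err != nil {
			return err
		}
	}
	for _, ip := range ips {
		for _, network := range denied {
			if network.Contains(ip) {
				return fmt.Errorf("%w: %s resolves to %s", errBlocked, host, ip)
			}
		}
	}
	return nil
}

func main() {
	denied := mustParseCIDRs("127.0.0.0/8", "::1/128")
	lookup := staticLookup(map[string][]string{"internal.example": {"127.0.0.1"}})
	fmt.Println(checkHost("internal.example", denied, lookup))
}
```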
# Additional Changes
- Added complete loopback IP address range and "unspecified" address to
the default `DenyList`
# Which Problems Are Solved
The primary issue addressed in this PR is that the defaults.yaml file
contains escaped characters (like `&lt;` for < and `&gt;` for >) in
message texts, which prevents valid HTML rendering in certain parts of
the Zitadel platform.
These escaped characters are used in user-facing content (e.g., email
templates or notifications), resulting in improperly displayed text,
where the HTML elements like line breaks or bold text don't render
correctly.
# How the Problems Are Solved
The solution involves replacing the escaped characters with their
corresponding HTML tags in the defaults.yaml file, ensuring that the
HTML renders correctly in the emails or user interfaces where these
messages are displayed.
This update ensures that:
- The HTML in these message templates is rendered properly, improving
the user experience.
- The content looks professional and adheres to web standards for
displaying HTML content.
# Additional Changes
N/A
# Additional Context
N/A
- Closes #8531
Co-authored-by: Max Peintner <max@caos.ch>
# Which Problems Are Solved
Optimize the query that checks for terminated sessions in the access
token verifier. The verifier is used in auth middleware, userinfo and
introspection.
# How the Problems Are Solved
The previous implementation built a query for certain events and then
appended a single `PositionAfter` clause. This caused the PostgreSQL
planner to use indexes only for the instance ID, aggregate IDs,
aggregate types and event types, followed by an expensive sequential
scan for the position. This resulted in internal over-fetching of rows
before the final filter was applied.
![Screenshot_20241007_105803](https://github.com/user-attachments/assets/f2d91976-be87-428b-b604-a211399b821c)
Furthermore, the query was searching for events which are not always
applicable. For example, there was always a session ID search and if
there was a user ID, we would also search for a browser fingerprint in
event payload (expensive), even if those argument strings were empty.
This PR changes:
1. Nest the position query, so that a full `instance_id, aggregate_id,
aggregate_type, event_type, "position"` index can be matched.
2. Redefine the `es_wm` index to include the `position` column.
3. Only search for events for the IDs that actually have a value. Do not
search (noop) if none of session ID, user ID or fingerprint ID are set.
New query plan:
![Screenshot_20241007_110648](https://github.com/user-attachments/assets/c3234c33-1b76-4b33-a4a9-796f69f3d775)
# Additional Changes
- cleanup how we load multi-statement migrations and make that a bit
more reusable.
# Additional Context
- Related to https://github.com/zitadel/zitadel/issues/7639
# Which Problems Are Solved
Cache implementation using a PGX connection pool.
# How the Problems Are Solved
Defines a new schema `cache` in the zitadel database.
A table for string keys and a table for objects is defined.
For PostgreSQL, tables are unlogged and partitioned by cache name for
performance.
Cockroach does not have unlogged tables and partitioning is an
enterprise feature that uses alternative syntax combined with sharding.
Regular tables are used here.
# Additional Changes
- `postgres.Config` can return a pgx pool. See following discussion
# Additional Context
- Part of https://github.com/zitadel/zitadel/issues/8648
- Closes https://github.com/zitadel/zitadel/issues/8647
---------
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
Twilio supports a robust, multi-channel verification service that
notably supports multi-region SMS sender numbers required for our use
case. Currently, Zitadel does much of the work of the Twilio Verify (e.g.
localization, code generation, messaging) but doesn't support the pool
of sender numbers that Twilio Verify does.
# How the Problems Are Solved
To support this API, we need to be able to store the Twilio Service ID
and send that in a verification request where appropriate: phone number
verification and SMS 2FA code paths.
This PR does the following:
- Adds the ability to use Twilio Verify instead of standard messaging
through Twilio
- Adds support for international numbers and more reliable verification
messages sent from multiple numbers
- Adds a new Twilio configuration option to support Twilio Verify in the
admin console
- Sends verification SMS messages through Twilio Verify
- Implements Twilio Verification Checks for codes generated through the
same
# Additional Changes
# Additional Context
- base was implemented by @zhirschtritt in
https://github.com/zitadel/zitadel/pull/8268 ❤️
- closes https://github.com/zitadel/zitadel/issues/8581
---------
Co-authored-by: Zachary Hirschtritt <zachary.hirschtritt@klaviyo.com>
Co-authored-by: Joey Biscoglia <joey.biscoglia@klaviyo.com>
# Which Problems Are Solved
We identified the need for caching.
Currently we have a number of places where we use different ways of
caching, like go maps or LRU.
We might also want shared caches in the future, like Redis-based or in
special SQL tables.
# How the Problems Are Solved
Define a generic Cache interface which allows different implementations.
- A noop implementation is provided and enabled as default.
- An implementation using go maps is provided
- disabled in defaults.yaml
- enabled in integration tests
- Authz middleware instance objects are cached using the interface.
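The approach can be illustrated with a small generic interface and the two kinds of implementation mentioned above. This is a simplified sketch: the `Cache` interface shown here, `noopCache`, `mapCache` and `getOrLoad` are assumptions for illustration and omit what a production cache would need, such as secondary indices, expiry and context handling.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical minimal interface that different backends can implement.
type Cache[K comparable, V any] interface {
	Get(key K) (V, bool)
	Set(key K, value V)
	Delete(key K)
}

// noopCache never stores anything, so callers always fall through to the source.
type noopCache[K comparable, V any] struct{}

func (noopCache[K, V]) Get(K) (V, bool) {
	var zero V
	return zero, false
}
func (noopCache[K, V]) Set(K, V) {}
func (noopCache[K, V]) Delete(K) {}

// mapCache keeps entries in a mutex-guarded Go map.
type mapCache[K comparable, V any] struct {
	mu      sync.RWMutex
	entries map[K]V
}

func newMapCache[K comparable, V any]() *mapCache[K, V] {
	return &mapCache[K, V]{entries: make(map[K]V)}
}

func (c *mapCache[K, V]) Get(key K) (V, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	value, ok := c.entries[key]
	return value, ok
}

func (c *mapCache[K, V]) Set(key K, value V) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = value
}

func (c *mapCache[K, V]) Delete(key K) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, key)
}

// getOrLoad returns the cached value or loads and stores it on a miss.
func getOrLoad[K comparable, V any](c Cache[K, V], key K, load func(K) V) V {
	if value, ok := c.Get(key); ok {
		return value
	}
	value := load(key)
	c.Set(key, value)
	return value
}

func main() {
	var c Cache[string, string] = newMapCache[string, string]()
	load := func(id string) string { return "instance:" + id }
	fmt.Println(getOrLoad(c, "1", load))
	fmt.Println(getOrLoad(c, "1", load)) // second call is served from the cache
}
```

Because callers only depend on the interface, switching between the noop and the map-backed variant is a configuration decision rather than a code change.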
# Additional Changes
- Enabled integration test command race flag
- Fix a race condition in the limits integration test client
- Fix a number of flaky integration tests. (Because zitadel is super
fast now!) 🎸🚀
# Additional Context
Related to https://github.com/zitadel/zitadel/issues/8648
# Which Problems Are Solved
Endpoints to maintain email and phone contact on user v3 are not
implemented.
# How the Problems Are Solved
Add 3 endpoints with SetContactEmail, VerifyContactEmail and
ResendContactEmailCode.
Add 3 endpoints with SetContactPhone, VerifyContactPhone and
ResendContactPhoneCode.
Refactor the logic how contact is managed in the user creation and
update.
# Additional Changes
None
# Additional Context
- part of https://github.com/zitadel/zitadel/issues/6433
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Reduce the chance for projection dead-locks. Increasing or disabling the
projection transaction duration solved dead-locks in all reported cases.
# How the Problems Are Solved
Increase the default transaction duration to 1 minute.
Due to the high value it is functionally similar to disabling,
however it still provides a safety net for transactions that do freeze,
perhaps due to connection issues with the database.
# Additional Changes
- Integration test uses default.
- Technical advisory
# Additional Context
- Related to https://github.com/zitadel/zitadel/issues/8517
---------
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
As an administrator I want to be able to invite users to my application
with the API V2. Some user data I will already prefill; the user should
add the authentication method themselves (password, passkey, sso).
# How the Problems Are Solved
- A user can now be created with an email explicitly set to false.
- If a user has no verified email and no authentication method, an
`InviteCode` can be created through the User V2 API.
- the code can be returned or sent through email
- additionally `URLTemplate` and an `ApplicationName` can be provided for
the email
- The code can be resent and verified through the User V2 API
- The V1 login allows users to verify and resend the code and set a
password (analogous to user initialization)
- The message text for the user invitation can be customized
# Additional Changes
- `verifyUserPasskeyCode` directly uses `crypto.VerifyCode` (instead of
`verifyEncryptedCode`)
- `verifyEncryptedCode` is removed (unnecessarily queried for the code
generator)
# Additional Context
- closes #8310
- TODO: login V2 will have to implement invite flow:
https://github.com/zitadel/typescript/issues/166
# Which Problems Are Solved
Add a debug API which allows pushing a set of events to be reduced in a
dedicated projection.
The events can carry a sleep duration which simulates a slow query
during projection handling.
# How the Problems Are Solved
- `CreateDebugEvents` allows pushing multiple events which simulate the
lifecycle of a resource. Each event has a `projectionSleep` field, which
issues a `pg_sleep()` statement query in the projection handler:
- Add
- Change
- Remove
- `ListDebugEventsStates` lists the current state of the projection,
optionally with a Trigger
- `GetDebugEventsStateByID` gets the current state of the aggregate ID in
the projection, optionally with a Trigger
# Additional Changes
- none
# Additional Context
- Allows reproduction of https://github.com/zitadel/zitadel/issues/8517
# Which Problems Are Solved
Use a single server instance for API integration tests. This optimizes
the time taken for the integration test pipeline,
because it allows running tests on multiple packages in parallel. Also,
it saves time by not starting and stopping a zitadel server for every
package.
# How the Problems Are Solved
- Build a binary with `go build -race -cover ....`
- Integration tests only construct clients. The server remains running
in the background.
- The integration package and tested packages now fully utilize the API.
No more direct database access through `query` and `command` packages.
- Use Makefile recipes to setup, start and stop the server in the
background.
- The binary has the race detector enabled
- Init and setup jobs are configured to halt immediately on race
condition
- Because the server runs in the background, races are only logged. When
the server is stopped and race logs exist, the Makefile recipe will
throw an error and print the logs.
- Makefile recipes include logic to print logs and convert coverage
reports after the server is stopped.
- Some tests need a downstream HTTP server to make requests, like quota
and milestones. A new `integration/sink` package creates an HTTP server
and uses websockets to forward HTTP requests back to the test packages.
The package API uses Go channels for abstraction and easy usage.
# Additional Changes
- Integration test files already used the `//go:build integration`
directive. In order to properly split integration from unit tests,
integration test files need to be in an `integration_test` subdirectory
of their package.
- `UseIsolatedInstance` used to overwrite the `Tester.Client` for each
instance. Now an `Instance` object is returned with a gRPC client that is
connected to the isolated instance's hostname.
- The `Tester` type is now `Instance`. The object is created for the
first instance, used by default in any test. Isolated instances are also
`Instance` objects and therefore benefit from the same methods and
values. The first instance and any other are capable of creating an
isolated instance over the system API.
- All test packages run in an isolated instance by calling
`NewInstance()`
- Individual tests that use an isolated instance use `t.Parallel()`
# Additional Context
- Closes #6684
- https://go.dev/doc/articles/race_detector
- https://go.dev/doc/build-cover
---------
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
Float64 which was used for the event.Position field is [not precise in
Go and gets rounded](https://github.com/golang/go/issues/47300). This
can lead to imprecise position tracking of events and therefore
projections, especially on CockroachDB as the position used there is a
big number.
Example of an imprecise position:
exact: 1725257931223002628
float64: 1725257931223002624.000000
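The rounding can be reproduced with the standard library alone. The sketch below only demonstrates the loss of precision; `roundTrip` is a hypothetical helper and the example does not use the decimal type that replaces the float64.

```go
package main

import "fmt"

// roundTrip converts a position to float64 and back, which is effectively
// what storing the position in a float64 field did.
func roundTrip(position int64) int64 {
	return int64(float64(position))
}

func main() {
	const exact int64 = 1725257931223002628
	fmt.Println(exact)            // 1725257931223002628
	fmt.Println(roundTrip(exact)) // 1725257931223002624
}
```

A float64 has a 53-bit significand, so for values of this magnitude (between 2^60 and 2^61) only multiples of 256 are representable and the last digits are lost.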
# How the Problems Are Solved
The float64 was replaced by
[github.com/jackc/pgx-shopspring-decimal](https://github.com/jackc/pgx-shopspring-decimal).
# Additional Changes
Correct behaviour of makefile for load tests.
Rename `latestSequence`-queries to `latestPosition`
# Which Problems Are Solved
id_tokens issued for auth requests created through the login UI
currently do not provide a sid claim.
This is due to the fact that (SSO) sessions for the login UI do not have
one and are only computed by the userAgent(ID), the user(ID) and the
authentication checks of the latter.
This prevents clients from tracking sessions and terminating specific
sessions on the end_session_endpoint.
# How the Problems Are Solved
- An `id` column is added to the `auth.user_sessions` table.
- The `id` (prefixed with `V1_`) is set whenever a session is added or
updated to active (from terminated)
- The id is passed to the `oidc session` (as v2 sessionIDs), to expose
it as `sid` claim
# Additional Changes
- refactored `getUpdateCols` to handle different column value types and
add arguments for query
# Additional Context
- closes #8499
- relates to #8501
# Which Problems Are Solved
Added functionality that user with a userschema can be created and
removed.
# How the Problems Are Solved
Added logic and moved APIs so that everything is API v3 conform.
# Additional Changes
- move of user and userschema API to resources folder
- changed testing and parameters
- some renaming
# Additional Context
closes #7308
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
Use web keys, managed by the `resources/v3alpha/web_keys` API, for OIDC
token signing and verification,
as well as serving the public web keys on the jwks / keys endpoint.
Response header on the keys endpoint now allows caching of the response.
This is now "safe" to do since keys can be created ahead of time and
caches have sufficient time to pickup the change before keys get
enabled.
# How the Problems Are Solved
- The web key format is used in the `getSignerOnce` function in the
`api/oidc` package.
- The public key cache is changed to get and store web keys.
- The jwks / keys endpoint returns the combined set of valid "legacy"
public keys and all available web keys.
- Cache-Control max-age defaults to 5 minutes and is configured in
`defaults.yaml`.
When the web keys feature is enabled, fallback mechanisms are in place
to obtain and convert "legacy" `query.PublicKey` as web keys when
needed. This allows transitioning to the feature without invalidating
existing tokens. A small performance overhead may be noticed on the keys
endpoint, because 2 queries need to be run sequentially. This will
disappear once the feature is stable and the legacy code gets cleaned
up.
# Additional Changes
- Extend legacy key lifetimes so that tests can be run on an existing
database with more than 6 hours apart.
- Discovery endpoint returns all supported algorithms when the Web Key
feature is enabled.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/8031
- Part of https://github.com/zitadel/zitadel/issues/7809
- After https://github.com/zitadel/oidc/pull/637
- After https://github.com/zitadel/oidc/pull/638
# Which Problems Are Solved
To have more insight on the performance, CPU and memory usage of
ZITADEL, we want to enable profiling.
# How the Problems Are Solved
- Allow profiling by configuration.
- Provide Google Cloud Profiler as first implementation
# Additional Changes
None.
# Additional Context
There were possible memory leaks reported:
https://discord.com/channels/927474939156643850/1273210227918897152
Co-authored-by: Silvan <silvan.reusser@gmail.com>
# Which Problems Are Solved
GetIDPByID is needed as an endpoint in the API v2 so that it is available
for the new login.
# How the Problems Are Solved
Create the GetIDPByID endpoint in the IDP v2 API, based on the
GetProviderByID implementation from the admin and management APIs.
# Additional Changes
- Remove the OwnerType attribute from the response, as the information
is available through the resourceOwner.
- correct refs to messages in proto which are used for doc generation
- renaming of elements for API v3
# Additional Context
Closes #8337
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Implement a new API service that allows management of OIDC signing web
keys.
This allows users to manage rotation of the instance level keys, which
are currently managed based on expiry.
The API accepts the generation of the following key types and
parameters:
- RSA keys with 2048, 3072 or 4096 bit in size and:
- Signing with SHA-256 (RS256)
- Signing with SHA-384 (RS384)
- Signing with SHA-512 (RS512)
- ECDSA keys with
- P256 curve
- P384 curve
- P512 curve
- ED25519 keys
# How the Problems Are Solved
Keys are serialized for storage using the JSON web key format from the
`jose` library. This is the format that will be used by OIDC for
signing, verification and publication.
Each instance can have a number of key pairs. All existing public keys
are meant to be used for token verification and publication on the keys
endpoint. Keys can be activated and the active private key is meant to
sign new tokens. There is always exactly 1 active signing key:
1. When the first key for an instance is generated, it is automatically
activated.
2. Activation of the next key automatically deactivates the previously
active key.
3. Keys cannot be manually deactivated from the API
4. Active keys cannot be deleted
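
The four rules above can be modelled with a small in-memory state holder. This is only a sketch of the described behavior; the `keySet` type, its methods and the error values are hypothetical and do not reflect the actual command or storage implementation.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errKeyNotFound = errors.New("key not found")
	errKeyActive   = errors.New("active key cannot be deleted")
)

// keySet is a hypothetical in-memory model of the web keys of one instance.
type keySet struct {
	keys     map[string]struct{}
	activeID string
}

func newKeySet() *keySet {
	return &keySet{keys: make(map[string]struct{})}
}

// Add stores a key. Rule 1: the first key is activated automatically.
func (s *keySet) Add(id string) {
	if len(s.keys) == 0 {
		s.activeID = id
	}
	s.keys[id] = struct{}{}
}

// Activate makes id the signing key. Rule 2: the previously active key is
// implicitly deactivated, because only one active ID is tracked.
// Rule 3 holds because no deactivate method exists.
func (s *keySet) Activate(id string) error {
	if _, ok := s.keys[id]; !ok {
		return errKeyNotFound
	}
	s.activeID = id
	return nil
}

// Delete removes a key. Rule 4: the active key cannot be deleted.
func (s *keySet) Delete(id string) error {
	if _, ok := s.keys[id]; !ok {
		return errKeyNotFound
	}
	if id == s.activeID {
		return errKeyActive
	}
	delete(s.keys, id)
	return nil
}

func main() {
	s := newKeySet()
	s.Add("key-1")
	s.Add("key-2")
	fmt.Println("active:", s.activeID)
	fmt.Println("delete active:", s.Delete("key-1"))
	fmt.Println("activate next:", s.Activate("key-2"))
	fmt.Println("delete previous:", s.Delete("key-1"))
}
```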
# Additional Changes
- Query methods that later will be used by the OIDC package are already
implemented. Preparation for #8031
- Fix indentation in french translation for instance event
- Move user_schema translations to consistent positions in all
translation files
# Additional Context
- Closes #8030
- Part of #7809
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
The current v3alpha actions APIs don't exactly adhere to the [new
resources API
design](https://zitadel.com/docs/apis/v3#standard-resources).
# How the Problems Are Solved
- **Improved ID access**: The aggregate ID is added to the resource
details object, so accessing resource IDs and constructing proto
messages for resources is easier
- **Explicit Instances**: Optionally, the instance can be explicitly
given in each request
- **Pagination**: A default search limit and a max search limit are
added to the defaults.yaml. They apply to the new v3 APIs (currently
only actions). The search query defaults are changed to ascending by
creation date, because this makes the pagination results the most
deterministic. The creation date is also added to the object details.
The bug with updated creation dates is fixed for executions and targets.
- **Removed Sequences**: Removed Sequence from object details and
ProcessedSequence from search details
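
To illustrate how a default and a maximum search limit might interact, here is a minimal sketch. The `searchLimits` type and its field names are hypothetical, and capping an oversized request at the maximum (rather than rejecting it) is an assumption of this example; the description above does not state which behavior is used.

```go
package main

import "fmt"

// searchLimits holds the two hypothetical settings from the configuration.
type searchLimits struct {
	Default uint32
	Max     uint32
}

// effective returns the limit to apply to a search request.
// A requested limit of 0 is treated as "not set" and falls back to the default.
// Assumption: requests above the maximum are capped instead of rejected.
func (l searchLimits) effective(requested uint32) uint32 {
	switch {
	case requested == 0:
		return l.Default
	case requested > l.Max:
		return l.Max
	default:
		return requested
	}
}

func main() {
	limits := searchLimits{Default: 100, Max: 1000}
	fmt.Println(limits.effective(0), limits.effective(50), limits.effective(5000))
}
```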
# Additional Changes
Object details IDs are checked in unit test only if an empty ID is
expected. Centralizing the details check also makes this internal object
more flexible for future evolutions.
# Additional Context
- Closes #8169
- Depends on https://github.com/zitadel/zitadel/pull/8225
---------
Co-authored-by: Silvan <silvan.reusser@gmail.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
The mirror command used the wrong position to filter for events if
different database technologies for source and destination were used.
# How the Problems Are Solved
The statements which diverge are stored on the client so that different
technologies can use different statements.
# Additional Context
- https://discord.com/channels/927474939156643850/1256396896243552347
# Which Problems Are Solved
There was no default configuration for `DeviceAuth`, which makes it
impossible to override by environment variables.
Additionally, a custom `CharAmount` value would also overwrite the
`DashInterval`.
# How the Problems Are Solved
- added to defaults.yaml
- fixed customization
# Additional Changes
None.
# Additional Context
- noticed during a customer request
# Which Problems Are Solved
ZITADEL currently selects the instance context based on a HTTP header
(see https://github.com/zitadel/zitadel/issues/8279#issue-2399959845) and
checks it against the list of instance domains. Let's call it instance
or API domain.
For any context based URL (e.g. OAuth, OIDC, SAML endpoints, links in
emails, ...) the requested domain (instance domain) will be used. Let's
call it the public domain.
In proxied setups, every exposed domain (public domain) has to be
managed as an instance domain.
This can either be done using the "ExternalDomain" in the runtime config
or via system API, which requires a validation through CustomerPortal on
zitadel.cloud.
# How the Problems Are Solved
- Two new header lists are added:
- `InstanceHostHeaders`: an ordered list (first sent wins), which will
be used to match the instance.
(For backward compatibility: the `HTTP1HostHeader`, `HTTP2HostHeader`
and `forwarded`, `x-forwarded-for`, `x-forwarded-host` are checked
afterwards as well)
- `PublicHostHeaders`: an ordered list (first sent wins), which will be
used as public host / domain. This will be checked against a list of
trusted domains on the instance.
- The middleware intercepts all requests to the API and passes a
`DomainCtx` object with the hosts and protocol into the context
(previously only a computed `origin` was passed)
- HTTP / GRPC servers no longer try to match the headers to instances
themselves, but use the passed `http.DomainContext` in their interceptors.
- The `RequestedHost` and `RequestedDomain` from authz.Instance are
removed in favor of the `http.DomainContext`
- When authenticating to or signing out from Console UI, the current
`http.DomainContext(ctx).Origin` (already checked by instance
interceptor for validity) is used to compute and dynamically add a
`redirect_uri` and `post_logout_redirect_uri`.
- Gateway passes all configured host headers (previously only did
`x-zitadel-*`)
- Admin API allows managing trusted domains
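
The "first sent wins" lookup over an ordered header list could look roughly like the sketch below. The function `firstHost` and the header name `x-custom-instance-host` are hypothetical; only `x-forwarded-host` appears in the description above.

```go
package main

import (
	"fmt"
	"net/http"
)

// firstHost walks the configured header names in order and returns the value
// of the first one that was actually sent. It returns "" if none was sent,
// in which case a caller would fall back to other means of host detection.
func firstHost(h http.Header, names []string) string {
	for _, name := range names {
		if v := h.Get(name); v != "" {
			return v
		}
	}
	return ""
}

func main() {
	// Hypothetical configuration: a custom header is preferred over
	// x-forwarded-host.
	instanceHostHeaders := []string{"x-custom-instance-host", "x-forwarded-host"}

	h := http.Header{}
	h.Set("X-Forwarded-Host", "proxy.example.com")
	fmt.Println(firstHost(h, instanceHostHeaders))

	h.Set("X-Custom-Instance-Host", "instance.example.com")
	fmt.Println(firstHost(h, instanceHostHeaders))
}
```

The same helper could serve both lists, since they differ only in the configured names and in what the result is validated against (instance domains or trusted domains).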
# Additional Changes
None
# Additional Context
- part of #8279
- open topics:
- "single-instance" mode
- Console UI
# Which Problems Are Solved
The current v3alpha actions APIs don't exactly adhere to the [new
resources API
design](https://zitadel.com/docs/apis/v3#standard-resources).
# How the Problems Are Solved
- **Breaking**: The current v3alpha actions APIs are removed. This is
breaking.
- **Resource Namespace**: New v3alpha actions APIs for targets and
executions are added under the namespace /resources.
- **Feature Flag**: New v3alpha actions APIs still have to be activated
using the actions feature flag
- **Reduced Executions Overhead**: Executions are managed similar to
settings according to the new API design: an empty list of targets
basically makes an execution a Noop. So a single method, SetExecution is
enough to cover all use cases. Noop executions are not returned in
future search requests.
- **Compatibility**: The executions created with previous v3alpha APIs
are still available to be managed with the new executions API.
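
The single-method design can be pictured with a small in-memory model: setting an empty target list turns the execution into a Noop, and Noop executions no longer show up in searches. The `executionStore` type is hypothetical, and representing a Noop by removing the map entry is an assumption of this sketch, not a statement about how executions are persisted.

```go
package main

import (
	"fmt"
	"sort"
)

// executionStore maps a condition to its targets (hypothetical model).
type executionStore map[string][]string

// SetExecution covers create, update and removal in one call.
// An empty target list makes the execution a Noop.
func (s executionStore) SetExecution(condition string, targets []string) {
	if len(targets) == 0 {
		delete(s, condition)
		return
	}
	s[condition] = targets
}

// Search returns the conditions of all executions that still have targets,
// sorted for a stable result.
func (s executionStore) Search() []string {
	conditions := make([]string, 0, len(s))
	for c := range s {
		conditions = append(conditions, c)
	}
	sort.Strings(conditions)
	return conditions
}

func main() {
	store := executionStore{}
	store.SetExecution("request:/example.Service/Method", []string{"target-1"})
	fmt.Println(store.Search())

	store.SetExecution("request:/example.Service/Method", nil)
	fmt.Println(store.Search())
}
```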
# Additional Changes
- Removed integration tests which test executions but rely on readable
targets. They are added again with #8169
# Additional Context
Closes #8168
# Which Problems Are Solved
The v2beta services are stable but not GA.
# How the Problems Are Solved
The v2beta services are copied to v2. The corresponding v1 and v2beta
services are deprecated.
# Additional Context
Closes #7236
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
The default terms of service and privacy policy links are applied to all
new ZITADEL instances, including self-hosted ones. However, the linked
content does not apply to self-hosters.
# How the Problems Are Solved
The links are removed from the DefaultInstance section in the
*defaults.yaml* file.
By default, the links are not shown anymore in the hosted login pages.
They can still be configured using the privacy policy.
# Additional Context
- Found because of a support request
This reverts commit e126ccc9aa.
# Which Problems Are Solved
#8295 introduced the possibility to handle idps on a single callback,
but broke current setups.
# How the Problems Are Solved
- Revert the change until a proper solution is found. Revert is needed
as docs were also changed.
# Additional Changes
None.
# Additional Context
- relates to #8295
# Which Problems Are Solved
The mirror command read the configurations in the wrong order
# How the Problems Are Solved
The pre-execution run of `mirror` reads the default config first and
then applies the custom configs.
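
Why the order matters can be seen in a tiny layered merge, where later layers override earlier ones. This is a generic sketch with a hypothetical `mergeConfig` helper and made-up keys; it does not show how `mirror` actually loads its configuration.

```go
package main

import "fmt"

// mergeConfig applies the layers in order; values from later layers
// override values from earlier ones.
func mergeConfig(layers ...map[string]string) map[string]string {
	merged := make(map[string]string)
	for _, layer := range layers {
		for k, v := range layer {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	defaults := map[string]string{"Source.Port": "5432", "Log.Level": "info"}
	custom := map[string]string{"Source.Port": "26257"}

	// Defaults first, custom config second: the custom port wins.
	fmt.Println(mergeConfig(defaults, custom))
}
```

If the layers were passed in the opposite order, the defaults would silently overwrite the custom values.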
# Which Problems Are Solved
The connection pool of Go's standard library uses a high number of
database connections.
# How the Problems Are Solved
The standard lib connection pool was replaced by `pgxpool.Pool`
# Additional Changes
The `db.BeginTx`-spans are removed because they cause too much noise in
the traces.
# Additional Context
- part of https://github.com/zitadel/zitadel/issues/7639
# Which Problems Are Solved
Bigger systems need to process many events during the initialisation
phase of the `eventstore.fields`-table. During setup these calls can
time out.
# How the Problems Are Solved
Changed the default behaviour of these projections to not time out and
increased the bulk limit.
# Which Problems Are Solved
Both the login UI and the IdP intent flow have their own IdP callback
endpoints.
This makes configuration hard to impossible (e.g. GitHub only allows one
endpoint) for customers.
# How the Problems Are Solved
- The login UI prefixes the `state` parameter when creating an auth /
SAML request.
- All requests now use the `/idp/callback` or the corresponding
variation (e.g. SAML)
- On callback, the state (specifically its prefix) is checked. In case of the
login UI prefix, the request will be forwarded to the existing login UI
handler without the prefix state.
Existing setups will therefore not be affected and also requests started
before this release can be handled without any impact.
- Console only lists the "new" endpoint(s). Any
`/login/externalidp/callback` is removed.
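
The prefix-based dispatch on the shared callback can be sketched as follows. The prefix value `login-ui:` and the handler labels are hypothetical placeholders chosen for this example; the actual prefix used by the login UI is not stated above.

```go
package main

import (
	"fmt"
	"strings"
)

// loginUIStatePrefix is a hypothetical marker added by the login UI
// when it creates an auth / SAML request.
const loginUIStatePrefix = "login-ui:"

// routeCallback decides which handler should process a callback.
// For login UI requests the prefix is stripped, so the existing handler
// receives the state it originally generated.
func routeCallback(state string) (handler, cleanState string) {
	if rest, ok := strings.CutPrefix(state, loginUIStatePrefix); ok {
		return "login-ui", rest
	}
	return "idp-intent", state
}

func main() {
	fmt.Println(routeCallback("login-ui:abc123"))
	fmt.Println(routeCallback("xyz789"))
}
```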
# Additional Changes
- Cleaned up some images from the IdP documentation.
- fix the error handling in `handleExternalNotFoundOptionCheck`
# Additional Context
- closes #8236