# Which Problems Are Solved
The session API allowed any authenticated user to update sessions by their ID without any further check.
This was unintentionally introduced with version 2.53.0 when the requirement of providing the latest session token on every session update was removed and no other permission check (e.g. session.write) was ensured.
# How the Problems Are Solved
- Granted `session.write` to `IAM_OWNER` and `IAM_LOGIN_CLIENT` in the defaults.yaml
- Granted `session.read` to `IAM_ORG_MANAGER`, `IAM_USER_MANAGER` and `ORG_OWNER` in the defaults.yaml
- Pass the session token to the UpdateSession command.
- Check for `session.write` permission on session creation and update.
- Alternatively, the (latest) sessionToken can be used to update the session.
- Setting an auth request to failed on the OIDC Service `CreateCallback` endpoint now ensures it's either the same user as used to create the auth request (for backwards compatibilty) or requires `session.link` permission.
- Setting an device auth request to failed on the OIDC Service `AuthorizeOrDenyDeviceAuthorization` endpoint now requires `session.link` permission.
- Setting an auth request to failed on the SAML Service `CreateResponse` endpoint now requires `session.link` permission.
# Additional Changes
none
# Additional Context
none
(cherry picked from commit 4c942f3477)
# Which Problems Are Solved
The commands for the resource based v2beta AuthorizationService API are
added.
Authorizations, previously knows as user grants, give a user in a
specific organization and project context roles.
The project can be owned or granted.
The given roles can be used to restrict access within the projects
applications.
The commands for the resource based v2beta InteralPermissionService API
are added.
Administrators, previously knows as memberships, give a user in a
specific organization and project context roles.
The project can be owned or granted.
The give roles give the user permissions to manage different resources
in Zitadel.
API definitions from https://github.com/zitadel/zitadel/issues/9165 are
implemented.
Contains endpoints for user metadata.
# How the Problems Are Solved
### New Methods
- CreateAuthorization
- UpdateAuthorization
- DeleteAuthorization
- ActivateAuthorization
- DeactivateAuthorization
- ListAuthorizations
- CreateAdministrator
- UpdateAdministrator
- DeleteAdministrator
- ListAdministrators
- SetUserMetadata to set metadata on a user
- DeleteUserMetadata to delete metadata on a user
- ListUserMetadata to query for metadata of a user
## Deprecated Methods
### v1.ManagementService
- GetUserGrantByID
- ListUserGrants
- AddUserGrant
- UpdateUserGrant
- DeactivateUserGrant
- ReactivateUserGrant
- RemoveUserGrant
- BulkRemoveUserGrant
### v1.AuthService
- ListMyUserGrants
- ListMyProjectPermissions
# Additional Changes
- Permission checks for metadata functionality on query and command side
- correct existence checks for resources, for example you can only be an
administrator on an existing project
- combined all member tables to singular query for the administrators
- add permission checks for command an query side functionality
- combined functions on command side where necessary for easier
maintainability
# Additional Context
Closes#9165
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
The current maintained gRPC server in combination with a REST (grpc)
gateway is getting harder and harder to maintain. Additionally, there
have been and still are issues with supporting / displaying `oneOf`s
correctly.
We therefore decided to exchange the server implementation to
connectRPC, which apart from supporting connect as protocol, also also
"standard" gRCP clients as well as HTTP/1.1 / rest like clients, e.g.
curl directly call the server without any additional gateway.
# How the Problems Are Solved
- All v2 services are moved to connectRPC implementation. (v1 services
are still served as pure grpc servers)
- All gRPC server interceptors were migrated / copied to a corresponding
connectRPC interceptor.
- API.ListGrpcServices and API. ListGrpcMethods were changed to include
the connect services and endpoints.
- gRPC server reflection was changed to a `StaticReflector` using the
`ListGrpcServices` list.
- The `grpc.Server` interfaces was split into different combinations to
be able to handle the different cases (grpc server and prefixed gateway,
connect server with grpc gateway, connect server only, ...)
- Docs of services serving connectRPC only with no additional gateway
(instance, webkey, project, app, org v2 beta) are changed to expose that
- since the plugin is not yet available on buf, we download it using
`postinstall` hook of the docs
# Additional Changes
- WebKey service is added as v2 service (in addition to the current
v2beta)
# Additional Context
closes#9483
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
The production endpoint of the service ping was wrong.
Additionally we discussed in the sprint review, that we could randomize
the default interval to prevent all systems to report data at the very
same time and also require a minimal interval.
# How the Problems Are Solved
- fixed the endpoint
- If the interval is set to @daily (default), we generate a random time
(minute, hour) as a cron format.
- Check if the interval is more than 30min and return an error if not.
- Fixed yaml indent on `ResourceCount`
# Additional Changes
None
# Additional Context
as discussed internally
This PR is still WIP and needs changes to at least the tests.
# Which Problems Are Solved
To be able to report analytical / telemetry data from deployed Zitadel
systems back to a central endpoint, we designed a "service ping"
functionality. See also https://github.com/zitadel/zitadel/issues/9706.
This PR adds the first implementation to allow collection base data as
well as report amount of resources such as organizations, users per
organization and more.
# How the Problems Are Solved
- Added a worker to handle the different `ReportType` variations.
- Schedule a periodic job to start a `ServicePingReport`
- Configuration added to allow customization of what data will be
reported
- Setup step to generate and store a `systemID`
# Additional Changes
None
# Additional Context
relates to #9869
# Which Problems Are Solved
We provide a seamless way to initialize Zitadel and the login together.
# How the Problems Are Solved
Additionally to the `IAM_OWNER` role, a set up admin user also gets the
`IAM_LOGIN_CLIENT` role if it is a machine user with a PAT.
# Additional Changes
- Simplifies the load balancing example, as the intermediate
configuration step is not needed anymore.
# Additional Context
- Depends on #10116
- Contributes to https://github.com/zitadel/zitadel-charts/issues/332
- Contributes to https://github.com/zitadel/zitadel/issues/10016
---------
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
Remove the feature flag that allowed triggers in introspection. This
option was a fallback in case introspection would not function properly
without triggers. The API documentation asked for anyone using this flag
to raise an issue. No such issue was received, hence we concluded it is
safe to remove it.
# How the Problems Are Solved
- Remove flags from the system and instance level feature APIs.
- Remove trigger functions that are no longer used
- Adjust tests that used the flag.
# Additional Changes
- none
# Additional Context
- Closes#10026
- Flag was introduced in #7356
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
This PR *partially* addresses #9450 . Specifically, it implements the
resource based API for the apps. APIs for app keys ARE not part of this
PR.
# How the Problems Are Solved
- `CreateApplication`, `PatchApplication` (update) and
`RegenerateClientSecret` endpoints are now unique for all app types:
API, SAML and OIDC apps.
- All new endpoints have integration tests
- All new endpoints are using permission checks V2
# Additional Changes
- The `ListApplications` endpoint allows to do sorting (see protobuf for
details) and filtering by app type (see protobuf).
- SAML and OIDC update endpoint can now receive requests for partial
updates
# Additional Context
Partially addresses #9450
# Which Problems Are Solved
Stabilize the usage of webkeys.
# How the Problems Are Solved
- Remove all legacy signing key code from the OIDC API
- Remove the webkey feature flag from proto
- Remove the webkey feature flag from console
- Cleanup documentation
# Additional Changes
- Resolved some canonical header linter errors in OIDC
- Use the constant for `projections.lock` in the saml package.
# Additional Context
- Closes#10029
- After #10105
- After #10061
# Which Problems Are Solved
Stabilize the optimized introspection code and cleanup unused code.
# How the Problems Are Solved
- `oidc_legacy_introspection` feature flag is removed and reserved.
- `OPStorage` which are no longer needed have their bodies removed.
- The method definitions need to remain in place so the interface
remains implemented.
- A panic is thrown in case any such method is still called
# Additional Changes
- A number of `OPStorage` methods related to token creation were already
unused. These are also cleaned up.
# Additional Context
- Closes#10027
- #7822
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
We are preparing to roll-out and stabilize webkeys in the next version
of Zitadel. Before removing legacy signing-key code, we must ensure all
existing instances have their webkeys generated.
# How the Problems Are Solved
Add a setup step which generate 2 webkeys for each existing instance
that didn't have webkeys yet.
# Additional Changes
Return an error from the config type-switch, when the type is unknown.
# Additional Context
- Part 1/2 of https://github.com/zitadel/zitadel/issues/10029
- Should be back-ported to v3
# Which Problems Are Solved
The resource usage to query user(s) on the database was high and
therefore could have performance impact.
# How the Problems Are Solved
Database queries involving the users and loginnames table were improved
and an index was added for user by email query.
# Additional Changes
- spellchecks
- updated apis on load tests
# additional info
needs cherry pick to v3
# Which Problems Are Solved
This pull request addresses a significant gap in the user service v2
API, which currently lacks methods for managing machine users.
# How the Problems Are Solved
This PR adds new API endpoints to the user service v2 to manage machine
users including their secret, keys and personal access tokens.
Additionally, there's now a CreateUser and UpdateUser endpoints which
allow to create either a human or machine user and update them. The
existing `CreateHumanUser` endpoint has been deprecated along the
corresponding management service endpoints. For details check the
additional context section.
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/9349
## More details
- API changes: https://github.com/zitadel/zitadel/pull/9680
- Implementation: https://github.com/zitadel/zitadel/pull/9763
- Tests: https://github.com/zitadel/zitadel/pull/9771
## Follow-ups
- Metadata: support managing user metadata using resource API
https://github.com/zitadel/zitadel/pull/10005
- Machine token type: support managing the machine token type (migrate
to new enum with zero value unspecified?)
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Add the ability to keep track of the current counts of projection
resources. We want to prevent calling `SELECT COUNT(*)` on tables, as
that forces a full scan and sudden spikes of DB resource uses.
# How the Problems Are Solved
- A resource_counts table is added
- Triggers that increment and decrement the counted values on inserts
and deletes
- Triggers that delete all counts of a table when the source table is
TRUNCATEd. This is not in the business logic, but prevents wrong counts
in case someone want to force a re-projection.
- Triggers that delete all counts if the parent resource is deleted
- Script to pre-populate the resource_counts table when a new source
table is added.
The triggers are reusable for any type of resource, in case we choose to
add more in the future.
Counts are aggregated by a given parent. Currently only `instance` and
`organization` are defined as possible parent. This can later be
extended to other types, such as `project`, should the need arise.
I deliberately chose to use `parent_id` to distinguish from the
de-factor `resource_owner` which is usually an organization ID. For
example:
- For users the parent is an organization and the `parent_id` matches
`resource_owner`.
- For organizations the parent is an instance, but the `resource_owner`
is the `org_id`. In this case the `parent_id` is the `instance_id`.
- Applications would have a similar problem, where the parent is a
project, but the `resource_owner` is the `org_id`
# Additional Context
Closes https://github.com/zitadel/zitadel/issues/9957
# Which Problems Are Solved
if Zitadel was started using `start-from-init` or `start-from-setup`
there were rare cases where a panic occured when
`Notifications.LegacyEnabled` was set to false. The cause was a list
which was not reset before refilling.
# How the Problems Are Solved
The list is now reset before each time it gets filled.
# Additional Changes
Ensure all contexts are canceled for the init and setup functions for
`start-from-init- or `start-from-setup` commands.
# Additional Context
none
# Eventstore fixes
- `event.Position` used float64 before which can lead to [precision
loss](https://github.com/golang/go/issues/47300). The type got replaced
by [a type without precision
loss](https://github.com/jackc/pgx-shopspring-decimal)
- the handler reported the wrong error if the current state was updated
and therefore took longer to retry failed events.
# Mirror fixes
- max age of auth requests can be configured to speed up copying data
from `auth.auth_requests` table. Auth requests last updated before the
set age will be ignored. Default is 1 month
- notification projections are skipped because notifications should be
sent by the source system. The projections are set to the latest
position
- ensure that mirror can be executed multiple times
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Currently if a user signs in using an IdP, once they sign out of
Zitadel, the corresponding IdP session is not terminated. This can be
the desired behavior. In some cases, e.g. when using a shared computer
it results in a potential security risk, since a follower user might be
able to sign in as the previous using the still open IdP session.
# How the Problems Are Solved
- Admins can enabled a federated logout option on SAML IdPs through the
Admin and Management APIs.
- During the termination of a login V1 session using OIDC end_session
endpoint, Zitadel will check if an IdP was used to authenticate that
session.
- In case there was a SAML IdP used with Federated Logout enabled, it
will intercept the logout process, store the information into the shared
cache and redirect to the federated logout endpoint in the V1 login.
- The V1 login federated logout endpoint checks every request on an
existing cache entry. On success it will create a SAML logout request
for the used IdP and either redirect or POST to the configured SLO
endpoint. The cache entry is updated with a `redirected` state.
- A SLO endpoint is added to the `/idp` handlers, which will handle the
SAML logout responses. At the moment it will check again for an existing
federated logout entry (with state `redirected`) in the cache. On
success, the user is redirected to the initially provided
`post_logout_redirect_uri` from the end_session request.
# Additional Changes
None
# Additional Context
- This PR merges the https://github.com/zitadel/zitadel/pull/9841 and
https://github.com/zitadel/zitadel/pull/9854 to main, additionally
updating the docs on Entra ID SAML.
- closes#9228
- backport to 3.x
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
Co-authored-by: Zach Hirschtritt <zachary.hirschtritt@klaviyo.com>
# Which Problems Are Solved
Resource management of projects and sub-resources was before limited by
the context provided by the management API, which would mean you could
only manage resources belonging to a specific organization.
# How the Problems Are Solved
With the addition of a resource-based API, it is now possible to manage
projects and sub-resources on the basis of the resources themselves,
which means that as long as you have the permission for the resource,
you can create, read, update and delete it.
- CreateProject to create a project under an organization
- UpdateProject to update an existing project
- DeleteProject to delete an existing project
- DeactivateProject and ActivateProject to change the status of a
project
- GetProject to query for a specific project with an identifier
- ListProject to query for projects and granted projects
- CreateProjectGrant to create a project grant with project and granted
organization
- UpdateProjectGrant to update the roles of a project grant
- DeactivateProjectGrant and ActivateProjectGrant to change the status
of a project grant
- DeleteProjectGrant to delete an existing project grant
- ListProjectGrants to query for project grants
- AddProjectRole to add a role to an existing project
- UpdateProjectRole to change texts of an existing role
- RemoveProjectRole to remove an existing role
- ListProjectRoles to query for project roles
# Additional Changes
- Changes to ListProjects, which now contains granted projects as well
- Changes to messages as defined in the
[API_DESIGN](https://github.com/zitadel/zitadel/blob/main/API_DESIGN.md)
- Permission checks for project functionality on query and command side
- Added testing to unit tests on command side
- Change update endpoints to no error returns if nothing changes in the
resource
- Changed all integration test utility to the new service
- ListProjects now also correctly lists `granted projects`
- Permission checks for project grant and project role functionality on
query and command side
- Change existing pre checks so that they also work resource specific
without resourceowner
- Added the resourceowner to the grant and role if no resourceowner is
provided
- Corrected import tests with project grants and roles
- Added testing to unit tests on command side
- Change update endpoints to no error returns if nothing changes in the
resource
- Changed all integration test utility to the new service
- Corrected some naming in the proto files to adhere to the API_DESIGN
# Additional Context
Closes#9177
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
<!--
Please inform yourself about the contribution guidelines on submitting a
PR here:
https://github.com/zitadel/zitadel/blob/main/CONTRIBUTING.md#submit-a-pull-request-pr.
Take note of how PR/commit titles should be written and replace the
template texts in the sections below. Don't remove any of the sections.
It is important that the commit history clearly shows what is changed
and why.
Important: By submitting a contribution you agree to the terms from our
Licensing Policy as described here:
https://github.com/zitadel/zitadel/blob/main/LICENSING.md#community-contributions.
-->
# Which Problems Are Solved
These changes introduce resource-based API endpoints for managing
instances and custom domains.
There are 4 types of changes:
- Endpoint implementation: consisting of the protobuf interface and the
implementation of the endpoint. E.g:
606439a172
- (Integration) Tests: testing the implemented endpoint. E.g:
cdfe1f0372
- Fixes: Bugs found during development that are being fixed. E.g:
acbbeedd32
- Miscellaneous: code needed to put everything together or that doesn't
fit any of the above categories. E.g:
529df92abc or
6802cb5468
# How the Problems Are Solved
_Ticked checkboxes indicate that the functionality is complete_
- [x] Instance
- [x] Create endpoint
- [x] Create endpoint tests
- [x] Update endpoint
- [x] Update endpoint tests
- [x] Get endpoint
- [x] Get endpoint tests
- [x] Delete endpoint
- [x] Delete endpoint tests
- [x] Custom Domains
- [x] Add custom domain
- [x] Add custom domain tests
- [x] Remove custom domain
- [x] Remove custom domain tests
- [x] List custom domains
- [x] List custom domains tests
- [x] Trusted Domains
- [x] Add trusted domain
- [x] Add trusted domain tests
- [x] Remove trusted domain
- [x] Remove trusted domain tests
- [x] List trusted domains
- [x] List trusted domains tests
# Additional Changes
When looking for instances (through the `ListInstances` endpoint)
matching a given query, if you ask for the results to be order by a
specific column, the query will fail due to a syntax error. This is
fixed in acbbeedd32 . Further explanation
can be found in the commit message
# Additional Context
- Relates to #9452
- CreateInstance has been excluded:
https://github.com/zitadel/zitadel/issues/9930
- Permission checks / instance retrieval (middleware) needs to be
changed to allow context based permission checks
(https://github.com/zitadel/zitadel/issues/9929), required for
ListInstances
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
- Allow users to use SHA-256 and SHA-512 hashing algorithms. These
algorithms are used by Linux's crypt(3) function.
- Allow users to import passwords using the PHPass algorithm. This
algorithm is used by older PHP systems, WordPress in particular.
# How the Problems Are Solved
- Upgrade passwap to
[v0.9.0](https://github.com/zitadel/passwap/releases/tag/v0.9.0)
- Add sha2 and phpass as a new verifier option in defaults.yaml
# Additional Changes
- Updated docs to explain the two algorithms
# Additional Context
Implements the changes in the passwap library from
https://github.com/zitadel/passwap/pull/59 and
https://github.com/zitadel/passwap/pull/60
# Which Problems Are Solved
We saw high CPU usage if many events were created on the database. This
was caused by the new actions which query for all event types and
aggregate types.
# How the Problems Are Solved
- the handler of action execution does not filter for aggregate and
event types.
- the index for `instance_id` and `position` is reenabled.
# Additional Changes
none
# Additional Context
none
# Which Problems Are Solved
#9837 added a new index `es_instance_position` on the events table with
the idea to improve performance for some projections. Unfortunately, it
makes it worse for almost all projections and would only improve the
situation for the events handler of the actions V2 subscriptions.
# How the Problems Are Solved
Remove the index again.
# Additional Changes
None
# Additional Context
relates to #9837
relates to #9863
# Which Problems Are Solved
The execution handler projection handles all events to check if an
execution has to be provided to the worker to execute.
In this logic all events would be processed from the beginning which is
not necessary.
# How the Problems Are Solved
Add the current state to the execution handler projection, to avoid
processing all existing events.
# Additional Changes
Add custom configuration to the default, so that the transactions are
limited to some events.
# Additional Context
None
# Which Problems Are Solved
Step 54 was not executed during setup.
# How the Problems Are Solved
Added the step to setup jobs
# Additional Changes
none
# Additional Context
- the step was added in https://github.com/zitadel/zitadel/pull/9837
- thanks to @zhirschtritt for raising this.
# Which Problems Are Solved
Some projection queries took a long time to run. It seems that 1 or more
queries couldn't make proper use of the `es_projection` index. This
might be because of a specific complexity aggregate_type and event_type
arguments, making the index unfeasible for postgres.
# How the Problems Are Solved
Following the index recommendation, add and index that covers just
instance_id and position.
# Additional Changes
- none
# Additional Context
- Related to https://github.com/zitadel/zitadel/issues/9832
# Which Problems Are Solved
Instance that had improved performance flags set, got event errors when
getting instance features. This is because the improved performance
flags were marshalled using the enumerated integers, but now needed to
be unmashalled using the added UnmarshallText method.
# How the Problems Are Solved
- Remove emnumer generation
# Additional Changes
- none
# Additional Context
- reported on QA
- Backport to next-rc / v3
# Which Problems Are Solved
The `auth.auth_requests` table is not cleaned up so long running Zitadel
installations can contain many rows.
The mirror command can take long because a the data are first copied
into memory (or disk) on cockroach and users do not get any output from
mirror. This is unfortunate because people don't know if Zitadel got
stuck.
# How the Problems Are Solved
Enhance logging throughout the projection processes and introduce a
configuration option for the maximum age of authentication requests.
# Additional Changes
None
# Additional Context
closes https://github.com/zitadel/zitadel/issues/9764
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Webkeys were not generated with new instances when the webkey feature
flag was enabled for instance defaults. This would cause a redirect loop
with console for new instances on QA / coud.
# How the Problems Are Solved
- uncomment the webkeys section on defaults.yaml
- Fix field naming of webkey config
# Additional Changes
- Add all available features as comments.
- Make the improved performance type enum parsable from the config,
untill now they were just ints.
- Running of the enumer command created missing enum entries for feature
keys.
# Additional Context
- Needs to be back-ported to v3 / next-rc
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
If I start a fresh instance and do not overwrite `SystemAPIUsers` I get
an error during startup `error="decoding failed due to the following
error(s):\n\n'SystemAPIUsers[0][path]' expected a map, got
'string'\n'SystemAPIUsers[0][memberships]' expected a map, got 'slice'"`
# How the Problems Are Solved
the configuration is commented so that the example is still there
# Additional Changes
-
# Additional Context
was added in https://github.com/zitadel/zitadel/pull/9757
<!--
Please inform yourself about the contribution guidelines on submitting a
PR here:
https://github.com/zitadel/zitadel/blob/main/CONTRIBUTING.md#submit-a-pull-request-pr.
Take note of how PR/commit titles should be written and replace the
template texts in the sections below. Don't remove any of the sections.
It is important that the commit history clearly shows what is changed
and why.
Important: By submitting a contribution you agree to the terms from our
Licensing Policy as described here:
https://github.com/zitadel/zitadel/blob/main/LICENSING.md#community-contributions.
-->
# Which Problems Are Solved
A customer reached out that after an upgrade, actions would always fail
with the error "host is denied" when calling an external API.
This is due to a security fix
(https://github.com/zitadel/zitadel/security/advisories/GHSA-6cf5-w9h3-4rqv),
where a DNS lookup was added to check whether the host name resolves to
a denied IP or subnet.
If the lookup fails due to the internal DNS setup, the action fails as
well. Additionally, the lookup was also performed when the deny list was
empty.
# How the Problems Are Solved
- Prevent DNS lookup when deny list is empty
- Properly initiate deny list and prevent empty entries
# Additional Changes
- Log the reason for blocked address (domain, IP, subnet)
# Additional Context
- reported by a customer
- needs backport to 2.70.x, 2.71.x and 3.0.0 rc
# Which Problems Are Solved
When running a long-running Zitadel Setup, Kubernetes might decide to
move a pod to a new node automatically. Currently, this puts any
migrations into a broken state that an operator needs to manually run
the "cleanup" command on - assuming they catch the error.
The only super long running commands are typically projection pre-fill
operations, which depending on the size of the event table for that
projection, can take many hours - plenty of time for Kubernetes to make
unexpected decisions, especially in a busy cluster.
# How the Problems Are Solved
This change listens on `os.Interrupt` and `syscall.SIGTERM`, cancels the
current Setup context, and runs the `Cleanup` command. The logs then
look something like this:
```shell
...
INFO[0000] verify migration caller="/Users/zach/src/zitadel/internal/migration/migration.go:43" name=repeatable_delete_stale_org_fields
INFO[0000] starting migration caller="/Users/zach/src/zitadel/internal/migration/migration.go:66" name=repeatable_delete_stale_org_fields
INFO[0000] execute delete query caller="/Users/zach/src/zitadel/cmd/setup/39.go:37" instance_id=281297936179003398 migration=repeatable_delete_stale_org_fields progress=1/1
INFO[0000] verify migration caller="/Users/zach/src/zitadel/internal/migration/migration.go:43" name=repeatable_fill_fields_for_instance_domains
INFO[0000] starting migration caller="/Users/zach/src/zitadel/internal/migration/migration.go:66" name=repeatable_fill_fields_for_instance_domains
----- SIGTERM signal issued -----
INFO[0000] received interrupt signal, shutting down: interrupt caller="/Users/zach/src/zitadel/cmd/setup/setup.go:121"
INFO[0000] query failed caller="/Users/zach/src/zitadel/internal/eventstore/repository/sql/query.go:135" error="timeout: context already done: context canceled"
DEBU[0000] filter eventstore failed caller="/Users/zach/src/zitadel/internal/eventstore/handler/v2/field_handler.go:155" error="ID=SQL-KyeAx Message=unable to filter events Parent=(timeout: context already done: context canceled)" projection=instance_domain_fields
DEBU[0000] unable to rollback tx caller="/Users/zach/src/zitadel/internal/eventstore/handler/v2/field_handler.go:110" error="sql: transaction has already been committed or rolled back" projection=instance_domain_fields
INFO[0000] process events failed caller="/Users/zach/src/zitadel/internal/eventstore/handler/v2/field_handler.go:72" error="ID=SQL-KyeAx Message=unable to filter events Parent=(timeout: context already done: context canceled)" projection=instance_domain_fields
DEBU[0000] trigger iteration caller="/Users/zach/src/zitadel/internal/eventstore/handler/v2/field_handler.go:73" iteration=0 projection=instance_domain_fields
ERRO[0000] migration failed caller="/Users/zach/src/zitadel/internal/migration/migration.go:68" error="ID=SQL-KyeAx Message=unable to filter events Parent=(timeout: context already done: context canceled)" name=repeatable_fill_fields_for_instance_domains
ERRO[0000] migration finish failed caller="/Users/zach/src/zitadel/internal/migration/migration.go:71" error="context canceled" name=repeatable_fill_fields_for_instance_domains
----- Cleanup before exiting -----
INFO[0000] cleanup started caller="/Users/zach/src/zitadel/cmd/setup/cleanup.go:30"
INFO[0000] cleanup migration caller="/Users/zach/src/zitadel/cmd/setup/cleanup.go:47" name=repeatable_fill_fields_for_instance_domains
```
# Additional Changes
* `mustExecuteMigration` -> `executeMigration`: **must**Execute logged a
Fatal error previously which calls os.Exit so no cleanup was possible.
Instead, this PR returns an error and assigns it to a shared error in
the Setup closure that defer can check.
* `initProjections` now returns an error instead of exiting
# Additional Context
This behavior might be unwelcome or at least unexpected in some cases.
Putting it behind a feature flag or config setting is likely a good
followup.
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
Add the possibility to filter project resources based on project member
roles.
# How the Problems Are Solved
Extend and refactor existing Pl/PgSQL functions to implement the
following:
- Solve O(n) complexity in returned resources IDs by returning a boolean
filter for instance level permissions.
- Individually permitted orgs are returned only if there was no instance
permission
- Individually permitted projects are returned only if there was no
instance permission
- Because of the multiple filter terms, use `INNER JOIN`s instead of
`WHERE` clauses.
# Additional Changes
- system permission function no longer query the organization view and
therefore can be `immutable`, giving big performance benefits for
frequently reused system users. (like our hosted login in Zitadel cloud)
- The permitted org and project functions are now defined as `stable`
because the don't modify on-disk data. This might give a small
performance gain
- The Pl/PgSQL functions are now tested using Go unit tests.
# Additional Context
- Depends on https://github.com/zitadel/zitadel/pull/9677
- Part of https://github.com/zitadel/zitadel/issues/9188
- Closes https://github.com/zitadel/zitadel/issues/9190
# Which Problems Are Solved
With the change of #9561, the `mirror` command panics as there's no
metrics provider configured.
# How the Problems Are Solved
Correctly initialize the provider (no-op by default) for the mirror
command.
# Additional Changes
None
# Additional Context
relates to #9561 -> needs backports to 2.66.x - 2.71.x and 3.0.0-rc
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
With v2.71.0 the `idp_templates6_ldap3` projection was created but never
filled, as it was a subtable. To fix this we altered the
`idp_templates6_ldap3` to `idp_templates6_ldap2` with v2.71.5.
This was unfortunately without a check that the `idp_templates_ldap2`was
already existing, which resulted in an error in the setup step.
# How the Problems Are Solved
Add check if `idp_templates6_ldap2` is already existing, before renaming
`idp_templates6_ldap3` -> `idp_templates6_ldap2`.
# Additional Changes
None
# Additional Context
Closes#9669
# Which Problems Are Solved
With current provided telemetry it's difficult to predict when a
projection handler is under increased load until it's too late and
causes downstream issues. Importantly, projection updating is in the
critical path for many login flows and increased latency there can
result in system downtime for users.
# How the Problems Are Solved
This PR adds three new prometheus-style metrics:
1. **projection_events_processed** (_labels: projection, success_) -
This metric gives us a counter of the number of events processed per
projection update run and whether they we're processed without error. A
high number of events being processed can let us know how busy a
particular projection handler is.
2. **projection_handle_timer** _(labels: projection)_ - This is the time
it takes to process a projection update given a batch of events - time
to take the current_states lock, query for new events, reduce,
update_the projection, and update current_states.
3. **projection_state_latency** _(labels: projection)_ - This is the
time from the last event processed in the current_states table for a
given projection. It tells us how old was the last event you processed?
Or, how far behind are you running for this projection? Higher latencies
could mean high load or stalled projection handling.
# Additional Changes
I also had to initialize the global otel metrics provider (`metrics.M`)
in the `setup` step additionally to `start` since projection handlers
are initialized at setup. The initialization checks if a metrics
provider is already set (in case of `start-from-setup` or
`start-from-init` to prevent overwriting, which causes the otel metrics
provider to stop working.
# Additional Context
## Example Dashboards


---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
Zitadel setup with v2.71.0 could result in errors regarding the
idp_templates6_ldap3 subtable.
# How the Problems Are Solved
Rename the subtable idp_templates6_ldap3 to idp_templates6_ldap2 if no
idp_templates6_ldap2 is existing and rename column `rootCA` to
`root_ca`.
# Additional Changes
None
# Additional Context
Related PR #9292
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
Allow verification of imported salted passwords hashed with plain md5.
# How the Problems Are Solved
- Upgrade passwap to
[v0.7.0](https://github.com/zitadel/passwap/releases/tag/v0.7.0)
- Add md5salted as a new verifier option in `defaults.yaml`
# Additional Changes
- go version and libraries updated (required by passkey v0.7.0)
- secrets.md verifiers updated
- configuration verifiers updated
- added MD5salted and missing MD5Plain to test cases
# Which Problems Are Solved
The service name is hardcoded in the metrics code. Making the service
name to be configurable helps when running multiple instances of
Zitadel.
The defaults remain unchanged, the service name will be defaulted to
ZITADEL.
# How the Problems Are Solved
Add a config option to override the name in defaults.yaml and pass it
down to the corresponding metrics or tracing module (google or otel)
# Additional Changes
NA
# Additional Context
NA
# Which Problems Are Solved
If configuration `notifications.LegacyEnabled` is set to false when
using cockroachdb as a database Zitadel start does not work and prints
the following error: `level=fatal msg="unable to start zitadel"
caller="github.com/zitadel/zitadel/cmd/start/start_from_init.go:44"
error="can't scan into dest[0]: cannot scan NULL into *string"`
# How the Problems Are Solved
The combination of the setting and cockraochdb are checked and a better
error is provided to the user.
# Additional Context
- introduced with https://github.com/zitadel/zitadel/pull/9321
# Which Problems Are Solved
SQL error in `cmd/setup/49/01-permitted_orgs_function.sql`
# How the Problems Are Solved
Updating `cmd/setup/49/01-permitted_orgs_function.sql`
# Additional Context
- Closes https://github.com/zitadel/zitadel/issues/9461
Co-authored-by: Iraq Jaber <IraqJaber@gmail.com>
# Which Problems Are Solved
Actions v2 are not executed in different functions, as provided by the
actions v1.
# How the Problems Are Solved
Add functionality to call actions v2 through OIDC and SAML logic to
complement tokens and SAMLResponses.
# Additional Changes
- Corrected testing for retrieved intent information
- Added testing for IDP types
- Corrected handling of context for issuer in SAML logic
# Additional Context
- Closes#7247
- Dependent on https://github.com/zitadel/saml/pull/97
- docs for migration are done in separate issue:
https://github.com/zitadel/zitadel/issues/9456
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
Currently I am not able to run the new login with a service account with
an IAM_OWNER role.
As the role is missing some permissions which the LOGIN_CLIENT role does
have
# How the Problems Are Solved
Added session permissions to the IAM_OWNER
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
The recently introduced notification queue have potential race conditions.
# How the Problems Are Solved
Current code is refactored to use the queue package, which is safe in
regards of concurrency.
# Additional Changes
- the queue is included in startup
- improved code quality of queue
# Additional Context
- closes https://github.com/zitadel/zitadel/issues/9278
# Which Problems Are Solved
Setup fails to push all role permission events when running Zitadel with
CockroachDB. `TransactionRetryError`s were visible in logs which finally
times out the setup job with `timeout: context deadline exceeded`
# How the Problems Are Solved
As suggested in the [Cockroach documentation](timeout: context deadline
exceeded), _"break down larger transactions"_. The commands to be pushed
for the role permissions are chunked in 50 events per push. This
chunking is only done with CockroachDB.
# Additional Changes
- gci run fixed some unrelated imports
- access to `command.Commands` for the setup job, so we can reuse the
sync logic.
# Additional Context
Closes#9293
---------
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
# Which Problems Are Solved
Some OAuth2 and OIDC providers require the use of PKCE for all their
clients. While ZITADEL already recommended the same for its clients, it
did not yet support the option on the IdP configuration.
# How the Problems Are Solved
- A new boolean `use_pkce` is added to the add/update generic OAuth/OIDC
endpoints.
- A new checkbox is added to the generic OAuth and OIDC provider
templates.
- The `rp.WithPKCE` option is added to the provider if the use of PKCE
has been set.
- The `rp.WithCodeChallenge` and `rp.WithCodeVerifier` options are added
to the OIDC/Auth BeginAuth and CodeExchange function.
- Store verifier or any other persistent argument in the intent or auth
request.
- Create corresponding session object before creating the intent, to be
able to store the information.
- (refactored session structs to use a constructor for unified creation
and better overview of actual usage)
Here's a screenshot showing the URI including the PKCE params:

# Additional Changes
None.
# Additional Context
- Closes#6449
- This PR replaces the existing PR (#8228) of @doncicuto. The base he
did was cherry picked. Thank you very much for that!
---------
Co-authored-by: Miguel Cabrerizo <doncicuto@gmail.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>