# Which Problems Are Solved
The broken login image is fixed.
# How the Problems Are Solved
The most important learnings from
https://github.com/zitadel/zitadel/pull/10318 are applied:
- Path in entrypoint is fixed: `exec node /runtime/apps/login/server.js`
- .dockerignore is updated so CSS styles are built into the image
- `source: .` is passed to the docker-bake action. Without this,
docker-bake builds from a remote context, which seems to be slow and not
updated on new PR commits. It looks like the bake action uploads an
artifact that [conflicts with the compile
workflow](https://github.com/zitadel/zitadel/actions/runs/16620417216/job/47023478437).
Therefore, a pattern is added to the compile workflow so that only
relevant artifacts are selected.
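For illustration, a minimal sketch of such a pattern in the compile
workflow, assuming `actions/download-artifact@v4`; the artifact naming
convention shown here is hypothetical, not the one actually used:
```yaml
- name: Download compile artifacts
  uses: actions/download-artifact@v4
  with:
    # Only pick up artifacts produced by the compile jobs,
    # ignoring whatever the bake action uploads alongside them.
    pattern: zitadel-compile-*
    merge-multiple: true
    path: .artifacts
```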
# Which Problems Are Solved
Building the login container in the pipeline without build layer caches
takes about 20 minutes.
# How the Problems Are Solved
We use cache-from and cache-to arguments to leverage the GitHub cache API.
Compare the 1st and 2nd run [of this PR's
pipeline](https://github.com/zitadel/zitadel/actions/runs/16590808510/job/46927893304).
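For reference, a minimal sketch of what such caching looks like with the
GitHub Actions cache backend, assuming a `docker/build-push-action` step;
whether these arguments live in the workflow or in the bake definition
may differ from the actual pipeline:
```yaml
- name: Build login image
  uses: docker/build-push-action@v6
  with:
    context: .
    # Reuse build layers from the GitHub cache API and write new ones back.
    cache-from: type=gha
    cache-to: type=gha,mode=max
```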
# Additional Context
- Follows up on #10343
# Which Problems Are Solved
- The previous monorepo-in-monorepo structure for the login app and its
related packages was fragmented, complicated and buggy.
- The process for building and testing the login container was
inconsistent between local development and CI.
- There was a lack of clear documentation and of easy, reliable ways for
non-frontend developers to reproduce and fix failing PR checks locally.
# How the Problems Are Solved
- Consolidated the login app and its related npm packages by moving the
main package to `apps/login/apps/login` and merging
`apps/login/packages/integration` and `apps/login/packages/acceptance`
into the main `apps/login` package.
- Migrated from Docker Compose-based test setups to dev container-based
setups, adding support for multiple dev container configurations:
- `.devcontainer/base`
- `.devcontainer/turbo-lint-unit`
- `.devcontainer/turbo-lint-unit-debug`
- `.devcontainer/login-integration`
- `.devcontainer/login-integration-debug`
- Added npm scripts to run the new dev container setups, enabling exact
reproduction of GitHub PR checks locally, and updated the pipeline to
use these containers.
- Cleaned up Dockerfiles and docker-bake.hcl files to only build the
production image for the login app.
- Cleaned up compose files to focus on dev environments in dev
containers.
- Updated `CONTRIBUTING.md` with guidance on running and debugging PR
checks locally using the new dev container approach.
- Introduced separate Dockerfiles for the login app to distinguish
between using published client packages and building clients from local
protos.
- Ensured the login container is always built in the pipeline for use in
integration and acceptance tests.
- Updated Makefile and GitHub Actions workflows to use
`--frozen-lockfile` for installing pnpm packages, ensuring reproducible
installs (see the sketch after this list).
- Disabled GitHub release creation by the changeset action.
- Refactored the `/build` directory structure for clarity and
maintainability.
- Added a `clean` command to `docs/package.json`.
- Experimentally added `knip` to the `zitadel-client` package for
improved linting of dependencies and exports.
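As referenced above, a minimal sketch of a reproducible install step,
assuming `pnpm/action-setup`; the surrounding workflow details are
illustrative:
```yaml
- uses: pnpm/action-setup@v4
- name: Install dependencies
  # Fails instead of updating pnpm-lock.yaml when it is out of sync with
  # package.json, so CI and local installs resolve exactly the same versions.
  run: pnpm install --frozen-lockfile
```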
# Additional Changes
- Fixed Makefile commands for consistency and reliability.
- Improved the structure and clarity of the `/build` directory to
support seamless integration of the login build.
- Enhanced documentation and developer experience for running and
debugging CI checks locally.
# Additional Context
- See updated `CONTRIBUTING.md` for new local development and debugging
instructions.
- These changes are a prerequisite for further improvements to the CI
pipeline and local development workflow.
- Closes #10276
# Which Problems Are Solved
Action runs on PRs from forks can't authenticate with depot.
# How the Problems Are Solved
- The GitHub secret DEPOT_TOKEN is statically passed as an env variable
to the steps that use the depot CLI, as described
[here](https://github.com/depot/setup-action#authentication) (see the
sketch after this list).
- Removed the `oidc` argument from the depot/setup-action, as we pass the
env statically to the relevant steps.
- The `id-token: write` permission is removed from all workflows, as
it's not needed anymore.
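A minimal sketch of the static authentication, following the linked
setup-action documentation; the bake step itself is illustrative:
```yaml
- uses: depot/setup-action@v1
- name: Bake images
  run: depot bake --file docker-bake.hcl
  env:
    # Statically provided token instead of OIDC-based authentication,
    # as described in the depot setup-action docs.
    DEPOT_TOKEN: ${{ secrets.DEPOT_TOKEN }}
```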
# Additional Changes
Removed the obsolete comment
```yaml
# latest if branch is main, otherwise image version which is the pull request number
```
# Additional Context
Required by these approved PRs so their checks can be executed:
- https://github.com/zitadel/zitadel/pull/9982
- https://github.com/zitadel/zitadel/pull/9958
# Which Problems Are Solved
The currently maintained gRPC server in combination with a REST (grpc)
gateway is getting harder and harder to maintain. Additionally, there
have been and still are issues with supporting / displaying `oneOf`s
correctly.
We therefore decided to exchange the server implementation for
connectRPC, which, apart from supporting Connect as a protocol, also lets
"standard" gRPC clients as well as HTTP/1.1 / REST-like clients (e.g.
curl) call the server directly without any additional gateway.
# How the Problems Are Solved
- All v2 services are moved to the connectRPC implementation (v1 services
are still served as pure gRPC servers).
- All gRPC server interceptors were migrated / copied to a corresponding
connectRPC interceptor.
- API.ListGrpcServices and API.ListGrpcMethods were changed to include
the connect services and endpoints.
- gRPC server reflection was changed to a `StaticReflector` using the
`ListGrpcServices` list.
- The `grpc.Server` interface was split into different combinations to
be able to handle the different cases (grpc server and prefixed gateway,
connect server with grpc gateway, connect server only, ...).
- Docs of services serving connectRPC only with no additional gateway
(instance, webkey, project, app, org v2 beta) are changed to expose that.
- Since the plugin is not yet available on buf, we download it using the
`postinstall` hook of the docs.
# Additional Changes
- WebKey service is added as v2 service (in addition to the current
v2beta)
# Additional Context
closes #9483
---------
Co-authored-by: Elio Bischof <elio@zitadel.com>
# Which Problems Are Solved
Fixes the releasing of multi-architecture login images.
# How the Problems Are Solved
- The login-container workflow extends the bake definition with a file
docker-bake-release.hcl which adds the platforms linux/arm64 and
linux/amd64 to all relevant build targets (see the sketch below). The
technique used is similar to how the docker metadata action allows
extending bake definitions.
- The local login tag is moved to the metadata bake target, which is
always inherited and overwritten in the pipeline.
- The `packages: write` permission is added.
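A minimal sketch of layering the release file onto the default bake
definition, assuming a `docker/bake-action` step; inputs other than the
file names are illustrative:
```yaml
- name: Build and push login images
  uses: docker/bake-action@v5
  with:
    # The release file is only added here: it widens the platforms of the
    # relevant targets to the multi-architecture set.
    files: |
      ./docker-bake.hcl
      ./docker-bake-release.hcl
    push: true
```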
# Additional Changes
- The MIT license is noted in container labels and annotations.
- The image is built from the repository root so that the local proto
files are used.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
# Which Problems Are Solved
Build and test workflows are currently running on specific GitHub-hosted
runners. This is not needed for most workflows and just costs more.
# How the Problems Are Solved
Moved all the steps apart from integration-tests to public runners.
# Additional Changes
None
# Additional Context
None
# Which Problems Are Solved
The current "stable" release tag was no longer maintained.
# How the Problems Are Solved
Remove the tag from the docs.
# Additional Changes
Update the docs to reflect that tests run on Ubuntu 22.04 instead of
20.04.
# Additional Context
- relates to https://github.com/zitadel/zitadel/issues/8884
# Which Problems Are Solved
The current handling of notifications follows the same pattern as all
other projections:
Created events are handled sequentially (based on "position") by a
handler. During the process, a lot of information is aggregated (user,
texts, templates, ...).
This leads to back pressure on the projection, since handling an event
might take longer than the time until the next event (to be handled) is
created.
# How the Problems Are Solved
- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all the present and required information
including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request, including the current state of the user (this
prevents race conditions where a user is changed in the meantime and
the notification would otherwise get the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt (a worked example with the default values
follows the configuration below).
- If the configured amount of attempts is reached or the message expired
(based on config), a cancel event is created, letting the workers know
that the notification must no longer be handled.
- In case of a successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object are stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml
Notifications:
  # The amount of workers processing the notification request events.
  # If set to 0, no notification request events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  Workers: 1 # ZITADEL_NOTIFIACATIONS_WORKERS
  # The amount of events a single worker will process in a run.
  BulkLimit: 10 # ZITADEL_NOTIFIACATIONS_BULKLIMIT
  # Time interval between scheduled notifications for request events
  RequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_REQUEUEEVERY
  # The amount of workers processing the notification retry events.
  # If set to 0, no notification retry events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  RetryWorkers: 1 # ZITADEL_NOTIFIACATIONS_RETRYWORKERS
  # Time interval between scheduled notifications for retry events
  RetryRequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_RETRYREQUEUEEVERY
  # Only instances are projected, for which at least a projection-relevant event exists within the timeframe
  # from HandleActiveInstances duration in the past until the projection's current time
  # If set to 0 (default), every instance is always considered active
  HandleActiveInstances: 0s # ZITADEL_NOTIFIACATIONS_HANDLEACTIVEINSTANCES
  # The maximum duration a transaction remains open
  # before it stops folding additional events
  # and updates the table.
  TransactionDuration: 1m # ZITADEL_NOTIFIACATIONS_TRANSACTIONDURATION
  # Automatically cancel the notification after the amount of failed attempts
  MaxAttempts: 3 # ZITADEL_NOTIFIACATIONS_MAXATTEMPTS
  # Automatically cancel the notification if it cannot be handled within a specific time
  MaxTtl: 5m # ZITADEL_NOTIFIACATIONS_MAXTTL
  # Failed attempts are retried after a configured delay (with exponential backoff).
  # Set a minimum and maximum delay and a factor for the backoff
  MinRetryDelay: 1s # ZITADEL_NOTIFIACATIONS_MINRETRYDELAY
  MaxRetryDelay: 20s # ZITADEL_NOTIFIACATIONS_MAXRETRYDELAY
  # Any factor below 1 will be set to 1
  RetryDelayFactor: 1.5 # ZITADEL_NOTIFIACATIONS_RETRYDELAYFACTOR
```
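To illustrate the backoff with the defaults above (assuming the factor is
applied multiplicatively to each successive retry, which is a
simplification of the actual implementation): the first retry would be
scheduled after roughly the MinRetryDelay of 1s, the second after
1s × 1.5 = 1.5s, and a third failure reaches MaxAttempts of 3, so a
cancel event is created; MaxRetryDelay of 20s and MaxTtl of 5m would cap
longer-running retry sequences.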
# Additional Changes
None
# Additional Context
- closes #8931
# Which Problems Are Solved
Add a cache implementation using Redis single mode. This does not add
support for Redis Cluster or sentinel.
# How the Problems Are Solved
Added the `internal/cache/redis` package. All operations occur
atomically, including the setting of secondary indexes, using Lua scripts
where needed.
The [`miniredis`](https://github.com/alicebob/miniredis) package is used
to run unit tests.
# Additional Changes
- Move connector code to `internal/cache/connector/...` and remove
duplicate code from `query` and `command` packages.
- Fix a missed invalidation on the restrictions projection
# Additional Context
Closes #8130
# Which Problems Are Solved
Upload the integration test server logs as artifacts, even if the tests
fail.
Before this change, logs were printed through the Makefile.
However, if a test failed, the logs wouldn't get printed.
# How the Problems Are Solved
- Add an extra build step that pushes `tmp/zitadel.log` and
`tmp/race.log.$pid` to artifact storage (see the sketch after this list).
- Logs are no longer printed in the `core_integration_reports` Makefile
recipe.
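A minimal sketch of the upload step, assuming `actions/upload-artifact@v4`;
the step and artifact names are illustrative. `if: always()` makes it run
even when a previous test step fails:
```yaml
- name: Upload integration test server logs
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: integration-server-logs
    path: |
      tmp/zitadel.log
      tmp/race.log.*
```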
# Additional Changes
Do not remove coverage data when generating the coverage report in
`core_integration_reports`. This is to prevent future "File not found"
errors when running the command repeatedly.
# Additional Context
Reported as internal feedback
# Which Problems Are Solved
Use a single server instance for API integration tests. This optimizes
the time taken for the integration test pipeline,
because it allows running tests on multiple packages in parallel. Also,
it saves time by not starting and stopping a ZITADEL server for every
package.
# How the Problems Are Solved
- Build a binary with `go build -race -cover ....`
- Integration tests only construct clients. The server remains running
in the background.
- The integration package and tested packages now fully utilize the API.
No more direct database access through the `query` and `command` packages.
- Use Makefile recipes to set up, start and stop the server in the
background.
- The binary has the race detector enabled
- Init and setup jobs are configured to halt immediately on race
condition
- Because the server runs in the background, races are only logged. When
the server is stopped and race logs exist, the Makefile recipe will
throw an error and print the logs.
- Makefile recipes include logic to print logs and convert coverage
reports after the server is stopped.
- Some tests need a downstream HTTP server to make requests, like quota
and milestones. A new `integration/sink` package creates an HTTP server
and uses websockets to forward HTTP requests back to the test packages.
The package API uses Go channels for abstraction and easy usage.
# Additional Changes
- Integration test files already used the `//go:build integration`
directive. In order to properly split integration from unit tests,
integration test files need to be in an `integration_test` subdirectory
of their package.
- `UseIsolatedInstance` used to overwrite the `Tester.Client` for each
instance. Now an `Instance` object is returned with a gRPC client that is
connected to the isolated instance's hostname.
- The `Tester` type is now `Instance`. The object is created for the
first instance and used by default in any test. Isolated instances are
also `Instance` objects and therefore benefit from the same methods and
values. The first instance and any other are capable of creating an
isolated instance over the system API.
- All test packages run in an isolated instance by calling
`NewInstance()`.
- Individual tests that use an isolated instance use `t.Parallel()`
# Additional Context
- Closes#6684
- https://go.dev/doc/articles/race_detector
- https://go.dev/doc/build-cover
---------
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
# Which Problems Are Solved
Bumping charts needs a manual trigger.
# How the Problems Are Solved
The charts bump workflow is run after every ZITADEL release.
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
The navigation in the console default settings is flaky. Sometimes it
arbitrarily jumps to the organizations page.
# How the Problems Are Solved
The lifecycle hooks were extended to react differently to changes that
come from 'outside' and from the component itself.
# Additional Changes
The e2e tests are supposed to run against Firefox and Chrome. However,
they were run twice against Electron. Fixing this revealed the console
navigation flakiness that was less visible on Electron.
The following issues are also fixed with this PR to reduce flakiness in
e2e tests.
- The custom command in the pipeline is removed from the e2e action
step, so the browser argument is respected.
- The npm packages of the e2e tests are updated to their latest version.
- Notification tests run against a clean state now so they don't depend
on each other anymore. This resolved some flakiness and improved
debuggability of the tests.
- E2E page load timeout is increased, reducing flakiness.
- E2E tests wait on some elements to be enabled before they interact
with them, reducing flakiness.
# Additional Context
- Closes#8404
- Follow-up: https://github.com/zitadel/zitadel/issues/8471
The e2e tests ran three times in a row successfully in the pipeline
against both browsers.
---------
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
# Fallback to Vercel CI
Since we cannot share the vercel_token on forks, we cannot deploy via the
Vercel CLI.
This PR reverts to the last working state by using Vercel CI.
I will look into a fix with an npm script or a turbo config to ignore
builds on folder changes.
# Which Problems Are Solved
This allows us to build multiple docs in parallel and only runs when docs
or protos are changed.
# Additional Changes
- [ ] Change "required" in GitHub from Vercel to the docs flow
---------
Co-authored-by: Livio Spring <livio.a@gmail.com>
# Which Problems Are Solved
It is not very clear if the author or the reviewer of a PR should tick
the boxes.
# How the Problems Are Solved
The author of the PR is tagged in the comment, because the author should
tick the boxes before marking it as ready for review.
If the feature is enabled, the new packages are used to query the org by
ID.
Part of: https://github.com/zitadel/zitadel/issues/7639
### Definition of Ready
- [x] I am happy with the code
- [x] Short description of the feature/issue is added in the pr
description
- [x] PR is linked to the corresponding user story
- [ ] Acceptance criteria are met
- [ ] All open todos and follow ups are defined in a new ticket and
justified
- [ ] Deviations from the acceptance criteria and design are agreed with
the PO and documented.
- [x] No debug or dead code
- [x] My code has no repetitions
- [ ] Critical parts are tested automatically
- [ ] Where possible E2E tests are implemented
- [ ] Documentation/examples are up-to-date
- [ ] All non-functional requirements are met
- [x] Functionality of the acceptance criteria is checked manually on
the dev system.