Commit Graph

25 Commits

Author SHA1 Message Date
Livio Spring
8537805ea5
feat(notification): use event worker pool (#8962)
# Which Problems Are Solved

The current handling of notification follows the same pattern as all
other projections:
Created events are handled sequentially (based on "position") by a
handler. During the process, a lot of information is aggregated (user,
texts, templates, ...).
This leads to back pressure on the projection since the handling of
events might take longer than the time before a new event (to be
handled) is created.

# How the Problems Are Solved

- The current user notification handler creates separate notification
events based on the user / session events.
- These events contain all the present and required information
including the userID.
- These notification events get processed by notification workers, which
gather the necessary information (recipient address, texts, templates)
to send out these notifications.
- If a notification fails, a retry event is created based on the current
notification request including the current state of the user (this
prevents race conditions, where a user is changed in the meantime and
the notification already gets the new state).
- The retry event will be handled after a backoff delay. This delay
increases with every attempt.
- If the configured amount of attempts is reached or the message expired
(based on config), a cancel event is created, letting the workers know,
the notification must no longer be handled.
- In case of successful send, a sent event is created for the
notification aggregate and the existing "sent" events for the user /
session object is stored.
- The following is added to the defaults.yaml to allow configuration of
the notification workers:
```yaml

Notifications:
  # The amount of workers processing the notification request events.
  # If set to 0, no notification request events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  Workers: 1 # ZITADEL_NOTIFIACATIONS_WORKERS
  # The amount of events a single worker will process in a run.
  BulkLimit: 10 # ZITADEL_NOTIFIACATIONS_BULKLIMIT
  # Time interval between scheduled notifications for request events
  RequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_REQUEUEEVERY
  # The amount of workers processing the notification retry events.
  # If set to 0, no notification retry events will be handled. This can be useful when running in
  # multi binary / pod setup and allowing only certain executables to process the events.
  RetryWorkers: 1 # ZITADEL_NOTIFIACATIONS_RETRYWORKERS
  # Time interval between scheduled notifications for retry events
  RetryRequeueEvery: 2s # ZITADEL_NOTIFIACATIONS_RETRYREQUEUEEVERY
  # Only instances are projected, for which at least a projection-relevant event exists within the timeframe
  # from HandleActiveInstances duration in the past until the projection's current time
  # If set to 0 (default), every instance is always considered active
  HandleActiveInstances: 0s # ZITADEL_NOTIFIACATIONS_HANDLEACTIVEINSTANCES
  # The maximum duration a transaction remains open
  # before it spots left folding additional events
  # and updates the table.
  TransactionDuration: 1m # ZITADEL_NOTIFIACATIONS_TRANSACTIONDURATION
  # Automatically cancel the notification after the amount of failed attempts
  MaxAttempts: 3 # ZITADEL_NOTIFIACATIONS_MAXATTEMPTS
  # Automatically cancel the notification if it cannot be handled within a specific time
  MaxTtl: 5m  # ZITADEL_NOTIFIACATIONS_MAXTTL
  # Failed attempts are retried after a confogired delay (with exponential backoff).
  # Set a minimum and maximum delay and a factor for the backoff
  MinRetryDelay: 1s  # ZITADEL_NOTIFIACATIONS_MINRETRYDELAY
  MaxRetryDelay: 20s # ZITADEL_NOTIFIACATIONS_MAXRETRYDELAY
  # Any factor below 1 will be set to 1
  RetryDelayFactor: 1.5 # ZITADEL_NOTIFIACATIONS_RETRYDELAYFACTOR
```


# Additional Changes

None

# Additional Context

- closes #8931
2024-11-27 15:01:17 +00:00
Tim Möhlmann
250f2344c8
feat(cache): redis cache (#8822)
# Which Problems Are Solved

Add a cache implementation using Redis single mode. This does not add
support for Redis Cluster or sentinel.

# How the Problems Are Solved

Added the `internal/cache/redis` package. All operations occur
atomically, including setting of secondary indexes, using LUA scripts
where needed.

The [`miniredis`](https://github.com/alicebob/miniredis) package is used
to run unit tests.

# Additional Changes

- Move connector code to `internal/cache/connector/...` and remove
duplicate code from `query` and `command` packages.
- Fix a missed invalidation on the restrictions projection

# Additional Context

Closes #8130
2024-11-04 10:44:51 +00:00
Tim Möhlmann
4eaa3163b6
feat(storage): generic cache interface (#8628)
# Which Problems Are Solved

We identified the need of caching.
Currently we have a number of places where we use different ways of
caching, like go maps or LRU.
We might also want shared chaches in the future, like Redis-based or in
special SQL tables.

# How the Problems Are Solved

Define a generic Cache interface which allows different implementations.

- A noop implementation is provided and enabled as.
- An implementation using go maps is provided
  - disabled in defaults.yaml
  - enabled in integration tests
- Authz middleware instance objects are cached using the interface.

# Additional Changes

- Enabled integration test command raceflag
- Fix a race condition in the limits integration test client
- Fix a number of flaky integration tests. (Because zitadel is super
fast now!) 🎸 🚀

# Additional Context

Related to https://github.com/zitadel/zitadel/issues/8648
2024-09-25 21:40:21 +02:00
Livio Spring
c8e2a3bd49
feat: enable application performance profiling (#8442)
# Which Problems Are Solved

To have more insight on the performance, CPU and memory usage of
ZITADEL, we want to enable profiling.

# How the Problems Are Solved

- Allow profiling by configuration.
- Provide Google Cloud Profiler as first implementation

# Additional Changes

None.

# Additional Context

There were possible memory leaks reported:
https://discord.com/channels/927474939156643850/1273210227918897152

Co-authored-by: Silvan <silvan.reusser@gmail.com>
2024-08-16 13:26:53 +00:00
Tim Möhlmann
64a3bb3149
feat(v3alpha): web key resource (#8262)
# Which Problems Are Solved

Implement a new API service that allows management of OIDC signing web
keys.
This allows users to manage rotation of the instance level keys. which
are currently managed based on expiry.

The API accepts the generation of the following key types and
parameters:

- RSA keys with 2048, 3072 or 4096 bit in size and:
  - Signing with SHA-256 (RS256)
  - Signing with SHA-384 (RS384)
  - Signing with SHA-512 (RS512)
- ECDSA keys with
  - P256 curve
  - P384 curve
  - P512 curve
- ED25519 keys

# How the Problems Are Solved

Keys are serialized for storage using the JSON web key format from the
`jose` library. This is the format that will be used by OIDC for
signing, verification and publication.

Each instance can have a number of key pairs. All existing public keys
are meant to be used for token verification and publication the keys
endpoint. Keys can be activated and the active private key is meant to
sign new tokens. There is always exactly 1 active signing key:

1. When the first key for an instance is generated, it is automatically
activated.
2. Activation of the next key automatically deactivates the previously
active key.
3. Keys cannot be manually deactivated from the API
4. Active keys cannot be deleted

# Additional Changes

- Query methods that later will be used by the OIDC package are already
implemented. Preparation for #8031
- Fix indentation in french translation for instance event
- Move user_schema translations to consistent positions in all
translation files

# Additional Context

- Closes #8030
- Part of #7809

---------

Co-authored-by: Elio Bischof <elio@zitadel.com>
2024-08-14 14:18:14 +00:00
Livio Spring
3d071fc505
feat: trusted (instance) domains (#8369)
# Which Problems Are Solved

ZITADEL currently selects the instance context based on a HTTP header
(see https://github.com/zitadel/zitadel/issues/8279#issue-2399959845 and
checks it against the list of instance domains. Let's call it instance
or API domain.
For any context based URL (e.g. OAuth, OIDC, SAML endpoints, links in
emails, ...) the requested domain (instance domain) will be used. Let's
call it the public domain.
In cases of proxied setups, all exposed domains (public domains) require
the domain to be managed as instance domain.
This can either be done using the "ExternalDomain" in the runtime config
or via system API, which requires a validation through CustomerPortal on
zitadel.cloud.

# How the Problems Are Solved

- Two new headers / header list are added:
- `InstanceHostHeaders`: an ordered list (first sent wins), which will
be used to match the instance.
(For backward compatibility: the `HTTP1HostHeader`, `HTTP2HostHeader`
and `forwarded`, `x-forwarded-for`, `x-forwarded-host` are checked
afterwards as well)
- `PublicHostHeaders`: an ordered list (first sent wins), which will be
used as public host / domain. This will be checked against a list of
trusted domains on the instance.
- The middleware intercepts all requests to the API and passes a
`DomainCtx` object with the hosts and protocol into the context
(previously only a computed `origin` was passed)
- HTTP / GRPC server do not longer try to match the headers to instances
themself, but use the passed `http.DomainContext` in their interceptors.
- The `RequestedHost` and `RequestedDomain` from authz.Instance are
removed in favor of the `http.DomainContext`
- When authenticating to or signing out from Console UI, the current
`http.DomainContext(ctx).Origin` (already checked by instance
interceptor for validity) is used to compute and dynamically add a
`redirect_uri` and `post_logout_redirect_uri`.
- Gateway passes all configured host headers (previously only did
`x-zitadel-*`)
- Admin API allows to manage trusted domain

# Additional Changes

None

# Additional Context

- part of #8279 
- open topics: 
  - "single-instance" mode
  - Console UI
2024-07-31 18:00:38 +03:00
Elio Bischof
6c0e7c402d
fix(setup): decode complex config strings (#7854)
* fix(setup): decode complex config strings

* decode complex types before string lists

* lint

* fix hasher env docs

* hasher defaults

* cleanup

* cleanup

---------

Co-authored-by: Livio Spring <livio.a@gmail.com>
2024-05-01 12:17:27 +02:00
Tim Möhlmann
26d1563643
feat(api): feature flags (#7356)
* feat(api): feature API proto definitions

* update proto based on discussion with @livio-a

* cleanup old feature flag stuff

* authz instance queries

* align defaults

* projection definitions

* define commands and event reducers

* implement system and instance setter APIs

* api getter implementation

* unit test repository package

* command unit tests

* unit test Get queries

* grpc converter unit tests

* migrate the V1 features

* migrate oidc to dynamic features

* projection unit test

* fix instance by host

* fix instance by id data type in sql

* fix linting errors

* add system projection test

* fix behavior inversion

* resolve proto file comments

* rename SystemDefaultLoginInstanceEventType to SystemLoginDefaultOrgEventType so it's consistent with the instance level event

* use write models and conditional set events

* system features integration tests

* instance features integration tests

* error on empty request

* documentation entry

* typo in feature.proto

* fix start unit tests

* solve linting error on key case switch

* remove system defaults after discussion with @eliobischof

* fix system feature projection

* resolve comments in defaults.yaml

---------

Co-authored-by: Livio Spring <livio.a@gmail.com>
2024-02-28 10:55:54 +02:00
Elio Bischof
19af2f7372
feat: support whole config as env (#6336)
* fix existing env vars

* feat: support all config by env

* cleanup

* remove system users hook

* decode system users in setup
2024-02-16 16:04:42 +00:00
Silvan
17953e9040
fix(setup): init projections (#7194)
Even though this is a feature it's released as fix so that we can back port to earlier revisions.

As reported by multiple users startup of ZITADEL after leaded to downtime and worst case rollbacks to the previously deployed version.

The problem starts rising when there are too many events to process after the start of ZITADEL. The root cause are changes on projections (database tables) which must be recomputed. This PR solves this problem by adding a new step to the setup phase which prefills the projections. The step can be enabled by adding the `--init-projections`-flag to `setup`, `start-from-init` and `start-from-setup`. Setting this flag results in potentially longer duration of the setup phase but reduces the risk of the problems mentioned in the paragraph above.
2024-01-25 17:28:20 +01:00
Elio Bischof
4980cd6a0c
feat: add SYSTEM_OWNER role (#6765)
* define roles and permissions

* support system user memberships

* don't limit system users

* cleanup permissions

* restrict memberships to aggregates

* default to SYSTEM_OWNER

* update unit tests

* test: system user token test (#6778)

* update unit tests

* refactor: make authz testable

* move session constants

* cleanup

* comment

* comment

* decode member type string to enum (#6780)

* decode member type string to enum

* handle all membership types

* decode enums where necessary

* decode member type in steps config

* update system api docs

* add technical advisory

* tweak docs a bit

* comment in comment

* lint

* extract token from Bearer header prefix

* review changes

* fix tests

* fix: add fix for activityhandler

* add isSystemUser

* remove IsSystemUser from activity info

* fix: add fix for activityhandler

---------

Co-authored-by: Stefan Benz <stefan@caos.ch>
2023-10-25 15:10:45 +00:00
Livio Spring
68bfab2fb3
feat(login): use default org for login without provided org context (#6625)
* start feature flags

* base feature events on domain const

* setup default features

* allow setting feature in system api

* allow setting feature in admin api

* set settings in login based on feature

* fix rebasing

* unit tests

* i18n

* update policy after domain discovery

* some changes from review

* check feature and value type

* check feature and value type
2023-09-29 08:21:32 +00:00
Elio Bischof
1a49b7d298
perf: project quotas and usages (#6441)
* project quota added

* project quota removed

* add periods table

* make log record generic

* accumulate usage

* query usage

* count action run seconds

* fix filter in ReportQuotaUsage

* fix existing tests

* fix logstore tests

* fix typo

* fix: add quota unit tests command side

* fix: add quota unit tests command side

* fix: add quota unit tests command side

* move notifications into debouncer and improve limit querying

* cleanup

* comment

* fix: add quota unit tests command side

* fix remaining quota usage query

* implement InmemLogStorage

* cleanup and linting

* improve test

* fix: add quota unit tests command side

* fix: add quota unit tests command side

* fix: add quota unit tests command side

* fix: add quota unit tests command side

* action notifications and fixes for notifications query

* revert console prefix

* fix: add quota unit tests command side

* fix: add quota integration tests

* improve accountable requests

* improve accountable requests

* fix: add quota integration tests

* fix: add quota integration tests

* fix: add quota integration tests

* comment

* remove ability to store logs in db and other changes requested from review

* changes requested from review

* changes requested from review

* Update internal/api/http/middleware/access_interceptor.go

Co-authored-by: Silvan <silvan.reusser@gmail.com>

* tests: fix quotas integration tests

* improve incrementUsageStatement

* linting

* fix: delete e2e tests as intergation tests cover functionality

* Update internal/api/http/middleware/access_interceptor.go

Co-authored-by: Silvan <silvan.reusser@gmail.com>

* backup

* fix conflict

* create rc

* create prerelease

* remove issue release labeling

* fix tracing

---------

Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Stefan Benz <stefan@caos.ch>
Co-authored-by: adlerhurst <silvan.reusser@gmail.com>
2023-09-15 16:58:45 +02:00
Silvan
1c354ca977
ci: improve performance (#5953)
* pipeline runs on ubuntu instead of docker
* added Makefile to build zitadel core (backend) and console (frontend)
* pipeline runs in parallel where possible
* pipeline is split into multiple jobs
* removed goreleaser
* added command to check if zitadel instance is running
2023-07-17 10:08:20 +02:00
Elio Bischof
bb756482c7
feat: push telemetry (#6027)
* document analytics config

* rework configuration and docs

* describe HandleActiveInstances better

* describe active instances on quotas better

* only projected events are considered

* cleanup

* describe changes at runtime

* push milestones

* stop tracking events

* calculate and push 4 in 6 milestones

* reduce milestone pushed

* remove docs

* fix scheduled pseudo event projection

* push 5 in 6 milestones

* push 6 in 6 milestones

* ignore client ids

* fix text array contains

* push human readable milestone type

* statement unit tests

* improve dev and db performance

* organize imports

* cleanup

* organize imports

* test projection

* check rows.Err()

* test search query

* pass linting

* review

* test 4 milestones

* simplify milestone by instance ids query

* use type NamespacedCondition

* cleanup

* lint

* lint

* dont overwrite original error

* no opt-in in examples

* cleanup

* prerelease

* enable request headers

* make limit configurable

* review fixes

* only requeue special handlers secondly

* include integration tests

* Revert "include integration tests"

This reverts commit 96db9504ec.

* pass reducers

* test handlers

* fix unit test

* feat: increment version

* lint

* remove prerelease

* fix integration tests
2023-07-06 08:38:13 +02:00
Elio Bischof
681541f41b
feat: add quotas (#4779)
adds possibilities to cap authenticated requests and execution seconds of actions on a defined intervall
2023-02-15 02:52:11 +01:00
Livio Spring
d21bb902f1
fix: push timeout (#4882) (#4885)
* push with timeout

* test: config for eventstore

(cherry picked from commit b9156da76d)

Co-authored-by: Silvan <silvan.reusser@gmail.com>
2022-12-15 09:40:13 +00:00
Stefan Benz
47ffa52f0f
feat: Instance create (#4502)
* feat(instance): implement create instance with direct machine user and credentials

* fix: deprecated add endpoint and variable declaration

* fix(instance): update logic for pats and machinekeys

* fix(instance): unit test corrections and additional unit test for pats and machinekeys

* fix(instance-create): include review changes

* fix(instance-create): linter fixes

* move iframe usage to solution scenarios configurations

* Revert "move iframe usage to solution scenarios configurations"

This reverts commit 9db31f3808.

* fix merge

* fix: add review suggestions

Co-authored-by: Livio Spring <livio.a@gmail.com>

* fix: add review changes

* fix: add review changes for default definitions

* fix: add review changes for machinekey details

* fix: add machinekey output when setup with machineuser

* fix: add changes from review

* fix instance converter for machine and allow overwriting of further machine fields

Co-authored-by: Livio Spring <livio.a@gmail.com>
2022-12-09 14:04:33 +01:00
Silvan
43fb3fd1a6
feat(actions): add token customization flow and extend functionally with modules (#4337)
* fix: potential memory leak

* feat(actions): possibility to parse json
feat(actions): possibility to perform http calls

* add query call

* feat(api): list flow and trigger types
fix(api): switch flow and trigger types to dynamic objects

* fix(translations): add action translations

* use `domain.FlowType`

* localizers

* localization

* trigger types

* options on `query.Action`

* add functions for actions

* feat: management api: add list flow and trigger  (#4352)

* console changes

* cleanup

* fix: wrong localization

Co-authored-by: Max Peintner <max@caos.ch>

* id token works

* check if claims not nil

* feat(actions): metadata api

* refactor(actions): modules

* fix: allow prerelease

* fix: test

* feat(actions): deny list for http hosts

* feat(actions): deny list for http hosts

* refactor: actions

* fix: different error ids

* fix: rename statusCode to status

* Actions objects as options (#4418)

* fix: rename statusCode to status

* fix(actions): objects as options

* fix(actions): objects as options

* fix(actions): set fields

* add http client to old actions

* fix(actions): add log module

* fix(actions): add user to context where possible

* fix(actions): add user to ctx in external authorization/pre creation

* fix(actions): query correct flow in claims

* test: actions

* fix(id-generator): panic if no machine id

* tests

* maybe this?

* fix linting

* refactor: improve code

* fix: metadata and usergrant usage in actions

* fix: appendUserGrant

* fix: allowedToFail and timeout in action execution

* fix: allowed to fail in token complement flow

* docs: add action log claim

* Update defaults.yaml

* fix log claim

* remove prerelease build

Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Livio Spring <livio.a@gmail.com>
2022-10-06 14:23:59 +02:00
Stefan Benz
7a5f7f82cf
feat(saml): implementation of saml for ZITADEL v2 (#3618) 2022-09-12 18:18:08 +02:00
Silvan
f71d158fb0
fix(cli): no more panics during startup (#4333)
* fix(cli): configure id generator

* fix(cli): use correct viper environment
2022-09-07 08:35:12 +00:00
Livio Spring
f610d48569
feat: prepare for multiple database types (#4068)
BREAKING CHANGE: the database and admin user config has changed.
2022-07-28 16:25:42 +02:00
Livio Spring
9b6dad18cb
feat: provide metrics endpoint (#3902)
* feat: provide metrics endpoint

* config

* enable otel metrics by default

Co-authored-by: Florian Forster <florian@caos.ch>
2022-07-18 10:42:32 +02:00
Livio Spring
a1d404291d
fix(notify): notify user in projection (#3889)
* start implement notify user in projection

* fix(stmt): add copy to multi stmt

* use projections for notify users

* feat: notifications from projections

* feat: notifications from projections

* cleanup

* pre-release

* fix tests

* fix types

* fix command

* fix queryNotifyUser

* fix: build version

* fix: HumanPasswordlessInitCodeSent

Co-authored-by: adlerhurst <silvan.reusser@gmail.com>
2022-07-06 14:09:49 +02:00
Livio Spring
12d4d3ea0b
fix: enable env vars in setup steps (and deprecate admin subcommand) (#3871)
* fix: enable env vars in setup steps (and deprecate admin subcommand)

* fix tests and error text
2022-06-27 10:32:34 +00:00