Commit Graph

7 Commits

Author SHA1 Message Date
Silvan
23d98e9d11 fix(projection): Prevent race condition with event push (#10676)
A timing issue (a race condition) was identified in our event processing
system. Under specific circumstances, it was possible for the system to
skip processing certain events, leading to potential data
inconsistencies.

## Which problems are solved

The system tracks its progress through the event log using timestamps.
The issue occurred because we were using the timestamp from the start of
a database transaction. If a query to read new events began after the
transaction started but before the new event was committed, the query
would not see the new event and would fail to process it.

## How the problems are solved

The fix is to change which timestamp is used for tracking. We now use
the precise timestamp of when the event is actually written to the
database. This ensures that the event's timestamp is always correctly
ordered, closing the timing gap and preventing the race condition.

This change enhances the reliability and integrity of our event
processing pipeline. It guarantees that all events are processed in the
correct order and eliminates the risk of skipped events, ensuring data
is always consistent across the system.

## Additional information

original fix: https://github.com/zitadel/zitadel/pull/10560

(cherry picked from commit 136363deda)
2025-09-15 09:41:50 +02:00
Fabienne Bühler
07ce3b6905 chore!: Introduce ZITADEL v3 (#9645)
This PR summarizes multiple changes specifically only available with
ZITADEL v3:

- feat: Web Keys management
(https://github.com/zitadel/zitadel/pull/9526)
- fix(cmd): ensure proper working of mirror
(https://github.com/zitadel/zitadel/pull/9509)
- feat(Authz): system user support for permission check v2
(https://github.com/zitadel/zitadel/pull/9640)
- chore(license): change from Apache to AGPL
(https://github.com/zitadel/zitadel/pull/9597)
- feat(console): list v2 sessions
(https://github.com/zitadel/zitadel/pull/9539)
- fix(console): add loginV2 feature flag
(https://github.com/zitadel/zitadel/pull/9682)
- fix(feature flags): allow reading "own" flags
(https://github.com/zitadel/zitadel/pull/9649)
- feat(console): add Actions V2 UI
(https://github.com/zitadel/zitadel/pull/9591)

BREAKING CHANGE
- feat(webkey): migrate to v2beta API
(https://github.com/zitadel/zitadel/pull/9445)
- chore!: remove CockroachDB Support
(https://github.com/zitadel/zitadel/pull/9444)
- feat(actions): migrate to v2beta API
(https://github.com/zitadel/zitadel/pull/9489)

---------

Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
Co-authored-by: Ramon <mail@conblem.me>
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Kenta Yamaguchi <56732734+KEY60228@users.noreply.github.com>
Co-authored-by: Harsha Reddy <harsha.reddy@klaviyo.com>
Co-authored-by: Livio Spring <livio@zitadel.com>
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Iraq <66622793+kkrime@users.noreply.github.com>
Co-authored-by: Florian Forster <florian@zitadel.com>
Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Max Peintner <peintnerm@gmail.com>
2025-04-02 16:53:06 +02:00
Tim Möhlmann
bcc6a689fa fix(setup): use template for in_tx_order type (#9346)
# Which Problems Are Solved

Systems running with PostgreSQL before Zitadel v2.39 are likely to have
a wrong type for the `in_tx_order` column in the `eventstore.event2`
table. The migration at the time used the `event_sequence` as default
value without typecast, which results in a `bigint` type for that
column. However, when creating the table from scratch, we explicitly
specify the type to be `integer`.

Starting from Zitadel v2.67 we use a Pl/PgSQL function to push events.
The function requires the types from `eventstore.events2` to the same as
the `select` destinations used in the function. In the function
`in_tx_order` is also expected to by of `integer` type.

CochroachDB systems are not affected because `bigint` is an alias to the
`int` type. In other words, CockroachDB uses `int8` when specifying type
`int`. Therefore the types already match.

# How the Problems Are Solved

Retrieve the actual column type currently in use. A template is used to
assign the type to the `ordinality` column returned as `in_tx_order`.

# Additional Changes

- Detailed logging on migration failure

# Additional Context

- Closes #9180

---------

Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
2025-02-12 11:06:34 +00:00
Silvan
9532c9bea5 fix(eventstore): correct sql push function (#9201)
# Which Problems Are Solved

https://github.com/zitadel/zitadel/pull/9186 introduced the new `push`
sql function for cockroachdb. The function used the wrong database
function to generate the position of the event and would therefore
insert events at a position before events created with an old Zitadel
version.

# How the Problems Are Solved

Instead of `EXTRACT(EPOCH FROM NOW())`, `cluster_logical_timestamp()` is
used to calculate the position of an event.

# Additional Context

- Introduced in https://github.com/zitadel/zitadel/pull/9186
- Affected versions:
https://github.com/zitadel/zitadel/releases/tag/v2.67.3
2025-01-17 15:32:05 +01:00
Silvan
690147b30e perf(eventstore): fast push on crdb (#9186)
# Which Problems Are Solved

The performance of the initial push function can further be increased

# How the Problems Are Solved

`eventstore.push`- and `eventstore.commands_to_events`-functions were
rewritten

# Additional Changes

none

# Additional Context

same optimizations as for postgres:
https://github.com/zitadel/zitadel/pull/9092
2025-01-15 14:55:48 +00:00
Tim Möhlmann
df2c6f1d4c perf(eventstore): optimize commands to events function (#9092)
# Which Problems Are Solved

We were seeing high query costs in a the lateral join executed in the
commands_to_events procedural function in the database. The high cost
resulted in incremental CPU usage as a load test continued and less
req/sec handled, sarting at 836 and ending at 130 req/sec.

# How the Problems Are Solved

1. Set `PARALLEL SAFE`. I noticed that this option defaults to `UNSAFE`.
But it's actually safe if the function doesn't `INSERT`
2. Set the returned `ROWS 10` parameter.
3. Function is re-written in Pl/PgSQL so that we eliminate expensive
joins.
4. Introduced an intermediate state that does `SELECT DISTINCT` for the
aggregate so that we don't have to do an expensive lateral join.

# Additional Changes

Use a `COALESCE` to get the owner from the last event, instead of a
`CASE` switch.

# Additional Context

- Function was introduced in
https://github.com/zitadel/zitadel/pull/8816
- Closes https://github.com/zitadel/zitadel/issues/8352

---------

Co-authored-by: Silvan <27845747+adlerhurst@users.noreply.github.com>
2025-01-08 11:59:44 +00:00
Silvan
dab5d9e756 refactor(eventstore): move push logic to sql (#8816)
# Which Problems Are Solved

If many events are written to the same aggregate id it can happen that
zitadel [starts to retry the push
transaction](48ffc902cc/internal/eventstore/eventstore.go (L101))
because [the locking
behaviour](48ffc902cc/internal/eventstore/v3/sequence.go (L25))
during push does compute the wrong sequence because newly committed
events are not visible to the transaction. These events impact the
current sequence.

In cases with high command traffic on a single aggregate id this can
have severe impact on general performance of zitadel. Because many
connections of the `eventstore pusher` database pool are blocked by each
other.

# How the Problems Are Solved

To improve the performance this locking mechanism was removed and the
business logic of push is moved to sql functions which reduce network
traffic and can be analyzed by the database before the actual push. For
clients of the eventstore framework nothing changed.

# Additional Changes

- after a connection is established prefetches the newly added database
types
- `eventstore.BaseEvent` now returns the correct revision of the event

# Additional Context

- part of https://github.com/zitadel/zitadel/issues/8931

---------

Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
Co-authored-by: Miguel Cabrerizo <30386061+doncicuto@users.noreply.github.com>
Co-authored-by: Joakim Lodén <Loddan@users.noreply.github.com>
Co-authored-by: Yxnt <Yxnt@users.noreply.github.com>
Co-authored-by: Stefan Benz <stefan@caos.ch>
Co-authored-by: Harsha Reddy <harsha.reddy@klaviyo.com>
Co-authored-by: Zach H <zhirschtritt@gmail.com>
2024-12-04 13:51:40 +00:00