mirror of
https://github.com/zitadel/zitadel.git
synced 2025-12-23 09:08:02 +00:00
fix(projections): overhaul the event projection system (#10560)
This PR overhauls our event projection system to make it more robust and prevent skipped events under high load. The core change replaces our custom, transaction-based locking with standard PostgreSQL advisory locks. We also introduce a worker pool to manage concurrency and prevent database connection exhaustion. ### Key Changes * **Advisory Locks for Projections:** Replaces exclusive row locks and inspection of `pg_stat_activity` with PostgreSQL advisory locks for managing projection state. This is a more reliable and standard approach to distributed locking. * **Simplified Await Logic:** Removes the complex logic for awaiting open transactions, simplifying it to a more straightforward time-based filtering of events. * **Projection Worker Pool:** Implements a worker pool to limit concurrent projection triggers, preventing connection exhaustion and improving stability under load. A new `MaxParallelTriggers` configuration option is introduced. ### Problem Solved Under high throughput, a race condition could cause projections to miss events from the eventstore. This led to inconsistent data in projection tables (e.g., a user grant might be missing). This PR fixes the underlying locking and concurrency issues to ensure all events are processed reliably. ### How it Works 1. **Event Writing:** When writing events, a *shared* advisory lock is taken. This signals that a write is in progress. 2. **Event Handling (Projections):** * A projection worker attempts to acquire an *exclusive* advisory lock for that specific projection. If the lock is already held, it means another worker is on the job, so the current one backs off. * Once the lock is acquired, the worker briefly acquires and releases the same *shared* lock used by event writers. This acts as a barrier, ensuring it waits for any in-flight writes to complete. * Finally, it processes all events that occurred before its transaction began. ### Additional Information * ZITADEL no longer modifies the `application_name` PostgreSQL variable during event writes. * The lock on the `current_states` table is now `FOR NO KEY UPDATE`. * Fixes https://github.com/zitadel/zitadel/issues/8509 --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
This commit is contained in:
@@ -130,7 +130,7 @@ func (q *Queries) checkAndLock(tx *sql.Tx, projectionName string) (name string,
|
||||
From(currentStateTable.identifier()).
|
||||
Where(sq.Eq{
|
||||
CurrentStateColProjectionName.identifier(): projectionName,
|
||||
}).Suffix("FOR UPDATE").
|
||||
}).Suffix("FOR NO KEY UPDATE").
|
||||
PlaceholderFormat(sq.Dollar).
|
||||
ToSql()
|
||||
if err != nil {
|
||||
|
||||
Reference in New Issue
Block a user