12 Commits

Author SHA1 Message Date
Silvan
dab5d9e756
refactor(eventstore): move push logic to sql (#8816)
# Which Problems Are Solved

If many events are written to the same aggregate id it can happen that
zitadel [starts to retry the push
transaction](48ffc902cc/internal/eventstore/eventstore.go (L101))
because [the locking
behaviour](48ffc902cc/internal/eventstore/v3/sequence.go (L25))
during push does compute the wrong sequence because newly committed
events are not visible to the transaction. These events impact the
current sequence.

In cases with high command traffic on a single aggregate id this can
have severe impact on general performance of zitadel. Because many
connections of the `eventstore pusher` database pool are blocked by each
other.

# How the Problems Are Solved

To improve the performance this locking mechanism was removed and the
business logic of push is moved to sql functions which reduce network
traffic and can be analyzed by the database before the actual push. For
clients of the eventstore framework nothing changed.

# Additional Changes

- after a connection is established prefetches the newly added database
types
- `eventstore.BaseEvent` now returns the correct revision of the event

# Additional Context

- part of https://github.com/zitadel/zitadel/issues/8931

---------

Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
Co-authored-by: Miguel Cabrerizo <30386061+doncicuto@users.noreply.github.com>
Co-authored-by: Joakim Lodén <Loddan@users.noreply.github.com>
Co-authored-by: Yxnt <Yxnt@users.noreply.github.com>
Co-authored-by: Stefan Benz <stefan@caos.ch>
Co-authored-by: Harsha Reddy <harsha.reddy@klaviyo.com>
Co-authored-by: Zach H <zhirschtritt@gmail.com>
2024-12-04 13:51:40 +00:00
Silvan
1ee7a1ab7c
feat(eventstore): accept transaction in push (#8945)
# Which Problems Are Solved

Push is not capable of external transactions.

# How the Problems Are Solved

A new function `PushWithClient` is added to the eventstore framework
which allows to pass a client which can either be a `*sql.Client` or
`*sql.Tx` and is used during push.

# Additional Changes

Added interfaces to database package.

# Additional Context

- part of https://github.com/zitadel/zitadel/issues/8931

---------

Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2024-11-22 17:25:28 +01:00
Silvan
522c82876f
fix(eventstore): set application name during push to instance id (#8918)
# Which Problems Are Solved

Noisy neighbours can introduce projection latencies because the
projections only query events older than the start timestamp of the
oldest push transaction.

# How the Problems Are Solved

During push we set the application name to
`zitadel_es_pusher_<instance_id>` instead of `zitadel_es_pusher` which
is used to query events by projections.
2024-11-18 15:30:12 +00:00
Silvan
9c3e5e467b
perf(query): remove transactions for queries (#8614)
# Which Problems Are Solved

Queries currently execute 3 statements, begin, query, commit

# How the Problems Are Solved

remove transaction handling from query methods in database package

# Additional Changes

- Bump versions of `core_grpc_dependencies`-receipt in Makefile

# Additional info

During load tests we saw a lot of idle transactions of `zitadel_queries`
application name which is the connection pool used to query data in
zitadel. Executed query:

`select query_start - xact_start, pid, application_name, backend_start,
xact_start, query_start, state_change, wait_event_type,
wait_event,substring(query, 1, 200) query from pg_stat_activity where
datname = 'zitadel' and state <> 'idle';`

Mostly the last query executed was `begin isolation level read committed
read only`.

example: 

```
    ?column?     |  pid  |      application_name      |         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type |  wait_event  |                                                                                                  query                                                                                                   
-----------------+-------+----------------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 00:00:00        | 33030 | zitadel_queries            | 2024-10-16 16:25:53.906036+00 | 2024-10-16 16:30:19.191661+00 | 2024-10-16 16:30:19.191661+00 | 2024-10-16 16:30:19.19169+00  | Client          | ClientRead   | begin isolation level read committed read only
 00:00:00        | 33035 | zitadel_queries            | 2024-10-16 16:25:53.909629+00 | 2024-10-16 16:30:19.19179+00  | 2024-10-16 16:30:19.19179+00  | 2024-10-16 16:30:19.191805+00 | Client          | ClientRead   | begin isolation level read committed read only
 00:00:00.00412  | 33028 | zitadel_queries            | 2024-10-16 16:25:53.904247+00 | 2024-10-16 16:30:19.187734+00 | 2024-10-16 16:30:19.191854+00 | 2024-10-16 16:30:19.191964+00 | Client          | ClientRead   | SELECT created_at, event_type, "sequence", "position", payload, creator, "owner", instance_id, aggregate_type, aggregate_id, revision FROM eventstore.events2 WHERE instance_id = $1 AND aggregate_type 
 00:00:00.084662 | 33134 | zitadel_es_pusher          | 2024-10-16 16:29:54.979692+00 | 2024-10-16 16:30:19.178578+00 | 2024-10-16 16:30:19.26324+00  | 2024-10-16 16:30:19.263267+00 | Client          | ClientRead   | RELEASE SAVEPOINT cockroach_restart
 00:00:00.084768 | 33139 | zitadel_es_pusher          | 2024-10-16 16:29:54.979585+00 | 2024-10-16 16:30:19.180762+00 | 2024-10-16 16:30:19.26553+00  | 2024-10-16 16:30:19.265531+00 | LWLock          | WALWriteLock | commit
 00:00:00.077377 | 33136 | zitadel_es_pusher          | 2024-10-16 16:29:54.978582+00 | 2024-10-16 16:30:19.187883+00 | 2024-10-16 16:30:19.26526+00  | 2024-10-16 16:30:19.265431+00 | Client          | ClientRead   | WITH existing AS (                                                                                                                                                                                      +
                 |       |                            |                               |                               |                               |                               |                 |              |     (SELECT instance_id, aggregate_type, aggregate_id, "sequence" FROM eventstore.events2 WHERE instance_id = $1 AND aggregate_type = $2 AND aggregate_id = $3 ORDER BY "sequence" DE
 00:00:00.012309 | 33123 | zitadel_es_pusher          | 2024-10-16 16:29:54.963484+00 | 2024-10-16 16:30:19.175066+00 | 2024-10-16 16:30:19.187375+00 | 2024-10-16 16:30:19.187376+00 | IO              | WalSync      | commit
 00:00:00        | 33034 | zitadel_queries            | 2024-10-16 16:25:53.90791+00  | 2024-10-16 16:30:19.262921+00 | 2024-10-16 16:30:19.262921+00 | 2024-10-16 16:30:19.263133+00 | Client          | ClientRead   | begin isolation level read committed read only
 00:00:00        | 33039 | zitadel_queries            | 2024-10-16 16:25:53.914106+00 | 2024-10-16 16:30:19.191676+00 | 2024-10-16 16:30:19.191676+00 | 2024-10-16 16:30:19.191687+00 | Client          | ClientRead   | begin isolation level read committed read only
 00:00:00.24539  | 33083 | zitadel_projection_spooler | 2024-10-16 16:27:49.895548+00 | 2024-10-16 16:30:19.020058+00 | 2024-10-16 16:30:19.265448+00 | 2024-10-16 16:30:19.26546+00  | Client          | ClientRead   | SAVEPOINT exec_stmt
 00:00:00        | 33125 | zitadel_es_pusher          | 2024-10-16 16:29:54.963859+00 | 2024-10-16 16:30:19.191715+00 | 2024-10-16 16:30:19.191715+00 | 2024-10-16 16:30:19.191729+00 | Client          | ClientRead   | begin
 00:00:00.004292 | 33032 | zitadel_queries            | 2024-10-16 16:25:53.906624+00 | 2024-10-16 16:30:19.187713+00 | 2024-10-16 16:30:19.192005+00 | 2024-10-16 16:30:19.192062+00 | Client          | ClientRead   | SELECT created_at, event_type, "sequence", "position", payload, creator, "owner", instance_id, aggregate_type, aggregate_id, revision FROM eventstore.events2 WHERE instance_id = $1 AND aggregate_type 
 00:00:00        | 33031 | zitadel_queries            | 2024-10-16 16:25:53.906422+00 | 2024-10-16 16:30:19.191625+00 | 2024-10-16 16:30:19.191625+00 | 2024-10-16 16:30:19.191645+00 | Client          | ClientRead   | begin isolation level read committed read only

```

The amount of idle transactions is significantly less if the query
transactions are removed:

example: 

```
    ?column?     |  pid  |      application_name      |         backend_start         |          xact_start           |          query_start          |         state_change          | wait_event_type | wait_event |                                                                                                  query                                                                                                   
-----------------+-------+----------------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+-----------------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 00:00:00.000094 | 32741 | zitadel_queries            | 2024-10-16 16:23:49.73935+00  | 2024-10-16 16:24:59.785589+00 | 2024-10-16 16:24:59.785683+00 | 2024-10-16 16:24:59.785684+00 |                 |            | SELECT created_at, event_type, "sequence", "position", payload, creator, "owner", instance_id, aggregate_type, aggregate_id, revision FROM eventstore.events2 WHERE instance_id = $1 AND aggregate_type 
 00:00:00        | 32762 | zitadel_es_pusher          | 2024-10-16 16:24:02.275136+00 | 2024-10-16 16:24:59.784586+00 | 2024-10-16 16:24:59.784586+00 | 2024-10-16 16:24:59.784607+00 | Client          | ClientRead | begin
 00:00:00.000167 | 32742 | zitadel_queries            | 2024-10-16 16:23:49.740489+00 | 2024-10-16 16:24:59.784274+00 | 2024-10-16 16:24:59.784441+00 | 2024-10-16 16:24:59.784442+00 |                 |            | with usr as (                                                                                                                                                                                           +
                 |       |                            |                               |                               |                               |                               |                 |            |         select u.id, u.creation_date, u.change_date, u.sequence, u.state, u.resource_owner, u.username, n.login_name as preferred_login_name                                                            +
                 |       |                            |                               |                               |                               |                               |                 |            |         from projections.users13 u                                                                                                                                                                      +
                 |       |                            |                               |                               |                               |                               |                 |            |         left join projections.l
 00:00:00.256014 | 32759 | zitadel_projection_spooler | 2024-10-16 16:24:01.418429+00 | 2024-10-16 16:24:59.52959+00  | 2024-10-16 16:24:59.785604+00 | 2024-10-16 16:24:59.785649+00 | Client          | ClientRead | UPDATE projections.milestones SET reached_date = $1 WHERE (instance_id = $2) AND (type = $3) AND (reached_date IS NULL)
 00:00:00.014199 | 32773 | zitadel_es_pusher          | 2024-10-16 16:24:02.320404+00 | 2024-10-16 16:24:59.769509+00 | 2024-10-16 16:24:59.783708+00 | 2024-10-16 16:24:59.783709+00 | IO              | WalSync    | commit
 00:00:00        | 32765 | zitadel_es_pusher          | 2024-10-16 16:24:02.28173+00  | 2024-10-16 16:24:59.780413+00 | 2024-10-16 16:24:59.780413+00 | 2024-10-16 16:24:59.780426+00 | Client          | ClientRead | begin
 00:00:00.012729 | 32777 | zitadel_es_pusher          | 2024-10-16 16:24:02.339737+00 | 2024-10-16 16:24:59.767432+00 | 2024-10-16 16:24:59.780161+00 | 2024-10-16 16:24:59.780195+00 | Client          | ClientRead | RELEASE SAVEPOINT cockroach_restart
```

---------

Co-authored-by: Tim Möhlmann <tim+github@zitadel.com>
Co-authored-by: Livio Spring <livio.a@gmail.com>
Co-authored-by: Max Peintner <max@caos.ch>
Co-authored-by: Elio Bischof <elio@zitadel.com>
Co-authored-by: Stefan Benz <46600784+stebenz@users.noreply.github.com>
Co-authored-by: Miguel Cabrerizo <30386061+doncicuto@users.noreply.github.com>
Co-authored-by: Joakim Lodén <Loddan@users.noreply.github.com>
Co-authored-by: Yxnt <Yxnt@users.noreply.github.com>
Co-authored-by: Stefan Benz <stefan@caos.ch>
Co-authored-by: Harsha Reddy <harsha.reddy@klaviyo.com>
Co-authored-by: Zach H <zhirschtritt@gmail.com>
2024-11-04 10:06:14 +01:00
Silvan
99c645cc60
refactor(database): exchange connection pool (#8325)
# Which Problems Are Solved

The connection pool of go uses a high amount of database connections.

# How the Problems Are Solved

The standard lib connection pool was replaced by `pgxpool.Pool`

# Additional Changes

The `db.BeginTx`-spans are removed because they cause to much noise in
the traces.

# Additional Context

- part of https://github.com/zitadel/zitadel/issues/7639
2024-07-17 15:16:02 +00:00
Silvan
1d84635836
feat(eventstore): add search table (#8191)
# Which Problems Are Solved

To improve performance a new table and method is implemented on
eventstore. The goal of this table is to index searchable fields on
command side to use it on command and query side.

The table allows to store one primitive value (numeric, text) per row.

The eventstore framework is extended by the `Search`-method which allows
to search for objects.
The `Command`-interface is extended by the `SearchOperations()`-method
which does manipulate the the `search`-table.

# How the Problems Are Solved

This PR adds the capability of improving performance for command and
query side by using the `Search`-method of the eventstore instead of
using one of the `Filter`-methods.

# Open Tasks

- [x] Add feature flag
- [x] Unit tests
- [ ] ~~Benchmarks if needed~~
- [x] Ensure no behavior change
- [x] Add setup step to fill table with current data
- [x] Add projection which ensures data added between setup and start of
the new version are also added to the table

# Additional Changes

The `Search`-method is currently used by `ProjectGrant`-command side.

# Additional Context

- Closes https://github.com/zitadel/zitadel/issues/8094
2024-07-03 15:00:56 +00:00
Tim Möhlmann
029a6d393a
fix(crdb): obtain latest sequences when the tx is retried (#7795) 2024-04-18 13:07:05 +00:00
Tim Möhlmann
093dd57a78
fix(db): wrap BeginTx in spans to get acquire metrics (#7689)
feat(db): wrap BeginTx in spans to get acquire metrics

This changes adds a span around most db.BeginTx calls so we can get tracings about the connection pool acquire process.
This might help us pinpoint why sometimes some query package traces show longer execution times, while this was not reflected on database side execution times.

Co-authored-by: Silvan <silvan.reusser@gmail.com>
2024-04-03 11:48:24 +03:00
Silvan
56df515e5f
chore: use pgx v5 (#7577)
* chore: use pgx v5

* chore: update go version

* remove direct pq dependency

* remove unnecessary type

* scan test

* map scanner

* converter

* uint8 number array

* duration

* most unit tests work

* unit tests work

* chore: coverage

* go 1.21

* linting

* int64 gopfertammi

* retry go 1.22

* retry go 1.22

* revert to go v1.21.5

* update go toolchain to 1.21.8

* go 1.21.8

* remove test flag

* go 1.21.5

* linting

* update toolchain

* use correct array

* use correct array

* add byte array

* correct value

* correct error message

* go 1.21 compatible
2024-03-27 15:48:22 +02:00
Silvan
b7d027e2fd
fix(db): always use begin tx (#7142)
* fix(db): always use begin tx

* fix(handler): timeout for begin
2024-01-04 16:12:20 +00:00
Tim Möhlmann
f680dd934d
refactor: rename package errors to zerrors (#7039)
* chore: rename package errors to zerrors

* rename package errors to gerrors

* fix error related linting issues

* fix zitadel error assertion

* fix gosimple linting issues

* fix deprecated linting issues

* resolve gci linting issues

* fix import structure

---------

Co-authored-by: Elio Bischof <elio@zitadel.com>
2023-12-08 15:30:55 +01:00
Silvan
b5564572bc
feat(eventstore): increase parallel write capabilities (#5940)
This implementation increases parallel write capabilities of the eventstore.
Please have a look at the technical advisories: [05](https://zitadel.com/docs/support/advisory/a10005) and  [06](https://zitadel.com/docs/support/advisory/a10006).
The implementation of eventstore.push is rewritten and stored events are migrated to a new table `eventstore.events2`.
If you are using cockroach: make sure that the database user of ZITADEL has `VIEWACTIVITY` grant. This is used to query events.
2023-10-19 12:19:10 +02:00