docs: init benchmarks (#8894)

# Which Problems Are Solved

Adds initial benchmarks.

# How the Problems Are Solved

Added section `apis/benchmarks`

# Additional Changes

Update Makefile dependencies

# Additional Context

- Part of https://github.com/zitadel/zitadel/issues/8023
- Part of https://github.com/zitadel/zitadel/issues/8352
2025-08-11 20:47:32 +00:00 · 2024-11-15 22:44:22 +01:00
parent 45cf38e08f
commit fbebe0f183
9 changed files with 2142 additions and 0 deletions
--- a/docs/docs/apis/benchmarks/_template.mdx
+++ b/docs/docs/apis/benchmarks/_template.mdx
@@ -0,0 +1,77 @@
+<!--
+query data from output.csv:
+
+Note: you might need to adjust the WHERE clause to only filter the required trends and the current placeholders
+Warning: it's currently only possible to show data of one endpoint
+
+```
+copy (SELECT
+      metric_name
+      , to_timestamp(timestamp::DOUBLE) as timestamp
+      , approx_quantile(metric_value, 0.50) AS p50
+      , approx_quantile(metric_value, 0.95) AS p95
+      , approx_quantile(metric_value, 0.99) AS p99
+  FROM 
+      read_csv('/path/to/k6-output.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, comment='', header=true, columns={'metric_name': 'VARCHAR', 'timestamp': 'BIGINT', 'metric_value': 'DOUBLE', 'check': 'VARCHAR', 'error': 'VARCHAR', 'error_code': 'VARCHAR', 'expected_response': 'BOOLEAN', 'group': 'VARCHAR', 'method': 'VARCHAR', 'name': 'VARCHAR', 'proto': 'VARCHAR', 'scenario': 'VARCHAR', 'service': 'VARCHAR', 'status': 'BIGINT', 'subproto': 'VARCHAR', 'tls_version': 'VARCHAR', 'url': 'VARCHAR', 'extra_tags': 'VARCHAR', 'metadata': 'VARCHAR'})
+  WHERE
+      metric_name LIKE '%_duration'
+  GROUP BY
+      metric_name
+      , timestamp
+  ORDER BY
+      metric_name
+      , timestamp
+  ) to 'output.json' (ARRAY);
+```
+
+-->
+
+## Summary
+
+TODO: describe the outcome of the test?
+
+## Performance test results
+
+| Metric                                | Value |
+|:--------------------------------------|:------|
+| Baseline                              | none  |
+| Purpose                               |       |
+| Test start                            | UTC |
+| Test duration                         | 30min |
+| Executed test                         |       |
+| k6 version                            |       |
+| VUs                                   |       |
+| Client location                       |       |
+| Client machine specification          | vCPU: <br/> Memory: GB |
+| ZITADEL location                      |       |
+| ZITADEL container specification       | vCPU: <br/> Memory: GB <br/>Container count: |
+| ZITADEL Version                       |       |
+| ZITADEL Configuration                 |       |
+| ZITADEL feature flags                 |       |
+| Database                              | type: crdb / psql<br />version: |
+| Database location                     |       |
+| Database specification                | vCPU: <br/> Memory: GB |
+| ZITADEL metrics during test           |       |
+| Observed errors                       |       |
+| Top 3 most expensive database queries |       |
+| Database metrics during test          |       |
+| k6 Iterations per second              |       |
+| k6 overview                           |       |
+| k6 output                             |       |
+| flowchart outcome                     |       |
+
+
+## Endpoint latencies
+
+import OutputSource from "!!raw-loader!./output.json";
+
+import { BenchmarkChart } from '/src/components/benchmark_chart';
+
+<BenchmarkChart testResults={OutputSource} />
+
+## k6 output {#k6-output}
+
+```bash
+TODO: add summary of k6
+```
+
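
The SQL in the template comment above groups k6 samples by metric name and timestamp and derives p50/p95/p99 per group. For readers without the SQL tooling at hand, the same aggregation can be sketched in plain JavaScript. This is only an illustration under assumptions: `percentile` and `aggregateDurations` are hypothetical helpers that are not part of this commit, the input is assumed to be already-parsed CSV rows with `metric_name`, `timestamp` (Unix seconds) and `metric_value`, and the sketch uses exact nearest-rank percentiles, so values may differ slightly from `approx_quantile`.

```javascript
// Nearest-rank percentile of an ascending-sorted, non-empty array.
function percentile(sorted, q) {
  const rank = Math.ceil(q * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Groups samples by (metric_name, timestamp) and computes p50/p95/p99.
// Only metrics ending in "_duration" are kept, which approximates the
// LIKE '%_duration' filter of the query.
function aggregateDurations(samples) {
  const groups = new Map();
  for (const s of samples) {
    if (!s.metric_name.endsWith("_duration")) continue;
    const key = `${s.metric_name}\u0000${s.timestamp}`;
    if (!groups.has(key)) {
      groups.set(key, { metric_name: s.metric_name, timestamp: s.timestamp, values: [] });
    }
    groups.get(key).values.push(s.metric_value);
  }
  return [...groups.values()]
    .sort((a, b) => a.metric_name.localeCompare(b.metric_name) || a.timestamp - b.timestamp)
    .map(({ metric_name, timestamp, values }) => {
      const sorted = values.slice().sort((x, y) => x - y);
      return {
        metric_name,
        // Assumption: ISO 8601 strings are acceptable as chart timestamps.
        timestamp: new Date(timestamp * 1000).toISOString(),
        p50: percentile(sorted, 0.5),
        p95: percentile(sorted, 0.95),
        p99: percentile(sorted, 0.99),
      };
    });
}
```

The returned array has the same fields (`timestamp`, `p50`, `p95`, `p99`) that the chart component reads from `output.json`.
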
--- a/docs/docs/apis/benchmarks/index.mdx
+++ b/docs/docs/apis/benchmarks/index.mdx
@@ -0,0 +1,111 @@
+---
+title: Benchmarks
+sidebar_label: Benchmarks
+---
+
+import DocCardList from '@theme/DocCardList';
+
+Benchmarks are crucial for understanding whether ZITADEL can handle your expected workload and what resources it needs to do so.
+
+This document explains the process and goals of load-testing ZITADEL in a cloud environment.
+
+The results can be found on the subpages.
+
+## Goals
+
+The primary goal is to assess whether ZITADEL can scale to the required proportions. The goals might change over time as ZITADEL matures. At the moment, the goal is to assess how the application’s performance scales. There are some concrete goals we have to meet:
+
+1. [https://github.com/zitadel/zitadel/issues/8352](https://github.com/zitadel/zitadel/issues/8352) defines 1000 JWT profile auth/sec  
+2. [https://github.com/zitadel/zitadel/issues/4424](https://github.com/zitadel/zitadel/issues/4424) defines 1200 logins / sec.
+
+## Procedure
+
+First we determine the “target” of our load-test. The target is expressed as a make recipe in the load-test [Makefile](https://github.com/zitadel/zitadel/blob/main/load-test/Makefile). See also the load-test [readme](https://github.com/zitadel/zitadel/blob/main/load-test/README.md) on how to configure and run load-tests.  
+A target should be tested for longer periods of time, as it might take time for certain metrics to show up. For example, Cloud SQL only samples queries for its query insights. A runtime of at least **30 minutes** is advised at the moment.
+
+After each load-test iteration, we consult the [After test procedure](#after-test-procedure) to decide on one of the following outcomes:
+
+1. Scale  
+2. Log potential issue and scale  
+3. Terminate testing and resolve issues
+
+
+## Methodology
+
+### Benchmark definition
+
+Tests are implemented in the ecosystem of [k6](https://k6.io). The tests are publicly available in the [zitadel repository](https://github.com/zitadel/zitadel/tree/main/load-test). Custom extensions of k6 are implemented in the [xk6-modules repository](https://github.com/zitadel/xk6-modules).  
+The tests must at least measure the request duration for each API call. This gives an indication of how ZITADEL behaves over the duration of the load test.
+
+### Metrics
+
+The following metrics must be collected for each test iteration. The metrics are used to follow the decision path of the [After test procedure](#after-test-procedure):
+
+| Metric | Type | Description | Unit |
+| :---- | :---- | :---- | :---- |
+| Baseline | Comparison | Defines the baseline the test is compared against. If not specified the baseline defined in this document is used. | Link to test result |
+| Purpose | Description | Describes what this test run should prove | text |
+| Test start | Setup | Timestamp when the test started. This is useful for gathering additional data like metrics or logs later | Date |
+| Test duration | Setup | Duration of the test | Duration |
+| Executed test | Setup | Name of the make recipe executed. Further information about specific test cases can be found in the [load-test readme](https://github.com/zitadel/zitadel/blob/main/load-test/README.md#test). | Name of the make recipe |
+| k6 version | Setup | Version of the test client (k6) used | semantic version |
+| VUs | Setup | Virtual Users which execute the test scenario in parallel | Number |
+| Client location | Setup | Region or location of the machine which executed the test client. If not further specified, the hosting provider is Google Cloud | Location / Region |
+| Client machine specification | Setup | Definition of the machine the test client ran on. The resources of the machine could be maxed out during tests, so we collect this metric as well. The description must at least specify vCPU, memory and egress bandwidth | **vCPU**: Number of threads ([additional info](https://cloud.google.com/compute/docs/cpu-platforms))<br/>**memory**: GB<br/>**egress bandwidth**: Gbps |
+| ZITADEL location | Setup | Region or location of the ZITADEL deployment. If not further specified, the hosting provider is Google Cloud | Location / Region |
+| ZITADEL container specification | Setup | As ZITADEL is mainly run in cloud environments, it should also be run as a container during the load tests. The description must at least specify vCPU, memory, egress bandwidth and scale | **vCPU**: Number of threads ([additional info](https://cloud.google.com/compute/docs/cpu-platforms))<br/>**memory**: GB<br/>**egress bandwidth**: Gbps<br/>**scale**: The number of containers running during the test. The number must not vary during the test |
+| ZITADEL Version | Setup | The version of ZITADEL deployed | Semantic version or commit |
+| ZITADEL Configuration | Setup | Configuration of ZITADEL which deviates from the defaults and is not secret | yaml |
+| ZITADEL feature flags | Setup | Changed feature flags | yaml |
+| Database | Setup | Database type and version | **type**: crdb / psql<br/>**version**: semantic version |
+| Database location | Setup | Region or location of the database deployment. If not further specified, the hosting provider is Google Cloud SQL | Location / Region |
+| Database specification | Setup | The description must at least specify vCPU, memory and egress bandwidth (and scale, if applicable) | **vCPU**: Number of threads ([additional info](https://cloud.google.com/compute/docs/cpu-platforms))<br/>**memory**: GB<br/>**egress bandwidth**: Gbps<br/>**scale**: Number of crdb nodes if crdb is used |
+| ZITADEL metrics during test | Result | This metric helps to understand the bottlenecks of the executed test. At least CPU usage and memory usage must be provided | **CPU usage** in percent<br/>**Memory usage** in percent |
+| Observed errors | Result | Errors worth mentioning, mostly unexpected errors | description |
+| Top 3 most expensive database queries | Result | The execution plan of the top 3 most expensive database queries during the test execution | database execution plan |
+| Database metrics during test | Result | This metric helps to understand the bottlenecks of the executed test. At least CPU usage and memory usage must be provided | **CPU usage** in percent<br/>**Memory usage** in percent |
+| k6 Iterations per second | Result | How many test iterations were done per second | Number |
+| k6 overview | Result | Shows some basic metrics aggregated over the test run. At least the following metrics must be included: duration per request (min, max, avg, p50, p95, p99) and VUs. For simplicity, just add the whole test result printed to the terminal | terminal output |
+| k6 output | Result | Trends and metrics generated during the test, this contains detailed information for each step executed during each iteration | csv |
+
+### Test setup
+
+#### Make recipes
+
+Details about the tests implemented can be found in [this readme](https://github.com/zitadel/zitadel/blob/main/load-test/README.md#test).
+
+### Test conclusion
+
+After each load-test iteration, we consult the [Flowchart](#after-test-procedure) to decide on one of the following outcomes:
+
+1. [Scale](#scale)  
+2. [Log potential issue and scale](#potential-issues)  
+3. [Terminate testing](#termination) and resolve issues
+
+#### Scale {#scale}
+
+An outcome of scale means that the service hit some kind of resource limit, like CPU or RAM, which can be increased. In such cases we increase the suggested parameter and rerun the load-test for the same target. On the next test we should analyse whether the increase in scale resulted in a performance improvement proportional to the scale parameter. For example, if we scale from 1 to 2 containers, it might be reasonable to expect a doubling of iterations / sec. If such an increase is not observed, there might be another bottleneck or underlying issue, such as locking.
+
+#### Potential issues {#potential-issues}
+
+A potential issue has an impact on performance, but does not prevent us from scaling. Such issues must be logged as GitHub issues, and load-testing can continue. The issue can be resolved at a later time and the load-tests repeated once it is. This is primarily for issues which require big changes to ZITADEL.
+
+#### Termination {#termination}
+
+An outcome of termination means that scaling no longer improves iterations / second, or that some kind of critical error or bug was encountered. The root cause of the issue must be resolved before we can continue with increasing scale.
+
+### After test procedure
+
+This flowchart shows the procedure after running a test.
+
+![Flowchart](/img/benchmark/Flowchart.svg)
+
+## Baseline
+
+A baseline will be established as soon as the goals described above are reached.
+
+## Test results
+
+This chapter links to the detailed test results.
+
+<DocCardList />
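
The Scale section of `index.mdx` above asks whether an increase in scale produced a proportional increase in iterations per second. A minimal sketch of that check follows. It assumes a hypothetical helper `scalingEfficiency` and made-up input numbers, neither of which comes from this commit or from a real test run.

```javascript
// Ratio of the observed throughput gain to the scale factor.
// 1.0 means perfectly proportional scaling; lower values hint at
// another bottleneck (for example locking).
function scalingEfficiency(before, after) {
  const scaleFactor = after.containers / before.containers;
  const throughputFactor = after.iterationsPerSecond / before.iterationsPerSecond;
  return throughputFactor / scaleFactor;
}

// Hypothetical numbers for illustration only.
const efficiency = scalingEfficiency(
  { containers: 1, iterationsPerSecond: 100 },
  { containers: 2, iterationsPerSecond: 150 },
);
console.log(efficiency); // 0.75
```

What efficiency still counts as "proportional" is a judgment call that the document does not define; any threshold would have to be agreed on separately.
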
--- a/docs/docs/apis/benchmarks/v2.65.0/machine_jwt_profile_grant/index.mdx
+++ b/docs/docs/apis/benchmarks/v2.65.0/machine_jwt_profile_grant/index.mdx
@@ -0,0 +1,75 @@
+---
+title: machine jwt profile grant benchmark of zitadel v2.65.0
+sidebar_label: machine jwt profile grant
+---
+
+## Summary
+
+Tests are halted after this test run because of too many [client read events](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/wait-event.clientread.html) on the database.
+
+## Performance test results
+
+| Metric | Value |
+| :---- | :---- |
+| Baseline | none |
+| Test start | 22-10-2024 16:20 UTC |
+| Test duration | 30min |
+| Executed test | machine\_jwt\_profile\_grant |
+| k6 version | v0.54.0 |
+| VUs | 50 |
+| Client location | US1 |
+| Client machine specification | e2-high-cpu-4 |
+| Zitadel location | US1 |
+| Zitadel container specification | vCPUs: 2<br/>Memory: 512 MiB<br/>Container count: 2 |
+| Zitadel feature flags | none |
+| Database | postgres v15 |
+| Database location | US1 |
+| Database specification | vCPUs: 4<br/>Memory: 16 GiB |
+| Zitadel metrics during test |  |
+| Observed errors | Many client read events during push |
+| Top 3 most expensive database queries | 1: Query events `instance_id = $1 AND aggregate_type = $2 AND aggregate_id = $3 AND event_type = ANY($4)`<br/>2: latest sequence query during push events<br/>3: writing events during push (caused lock wait events) |
+| k6 iterations per second | 193 |
+| k6 overview | [output](#k6-output) |
+| flowchart outcome | Halt tests, must resolve an issue |
+
+## /token endpoint latencies
+
+import OutputSource from "!!raw-loader!./output.json";
+
+import { BenchmarkChart } from '/src/components/benchmark_chart';
+
+<BenchmarkChart testResults={OutputSource} />
+
+## k6 output {#k6-output}
+
+```bash
+checks...............................: 100.00% ✓ 695739 ✗ 0   
+data_received........................: 479 MB 265 kB/s  
+data_sent............................: 276 MB 153 kB/s  
+http_req_blocked.....................: min=178ns avg=5µs max=119.8ms p(50)=460ns p(95)=702ns p(99)=921ns   
+http_req_connecting..................: min=0s avg=1.24µs max=43.45ms p(50)=0s p(95)=0s p(99)=0s   
+http_req_duration....................: min=18ms avg=255.3ms max=1.22s p(50)=241.56ms p(95)=479.19ms p(99)=600.92ms  
+{ expected_response:true }.........: min=18ms avg=255.3ms max=1.22s p(50)=241.56ms p(95)=479.19ms p(99)=600.92ms  
+http_req_failed......................: 0.00% ✓ 0 ✗ 347998  
+http_req_receiving...................: min=25.92µs avg=536.96µs max=401.94ms p(50)=89.44µs p(95)=2.39ms p(99)=11.12ms   
+http_req_sending.....................: min=24.01µs avg=63.86µs max=4.48ms p(50)=60.97µs p(95)=88.69µs p(99)=141.74µs  
+http_req_tls_handshaking.............: min=0s avg=2.8µs max=51.05ms p(50)=0s p(95)=0s p(99)=0s   
+http_req_waiting.....................: min=17.65ms avg=254.7ms max=1.22s p(50)=240.88ms p(95)=478.6ms p(99)=600.6ms   
+http_reqs............................: 347998 192.80552/s  
+iteration_duration...................: min=33.86ms avg=258.77ms max=1.22s p(50)=245ms p(95)=482.61ms p(99)=604.32ms  
+iterations...........................: 347788 192.689171/s  
+login_ui_enter_login_name_duration...: min=218.61ms avg=218.61ms max=218.61ms p(50)=218.61ms p(95)=218.61ms p(99)=218.61ms  
+login_ui_enter_password_duration.....: min=18ms avg=18ms max=18ms p(50)=18ms p(95)=18ms p(99)=18ms   
+login_ui_init_login_duration.........: min=90.96ms avg=90.96ms max=90.96ms p(50)=90.96ms p(95)=90.96ms p(99)=90.96ms   
+login_ui_token_duration..............: min=140.02ms avg=140.02ms max=140.02ms p(50)=140.02ms p(95)=140.02ms p(99)=140.02ms  
+oidc_token_duration..................: min=29.85ms avg=255.38ms max=1.22s p(50)=241.61ms p(95)=479.23ms p(99)=600.95ms  
+org_create_org_duration..............: min=64.51ms avg=64.51ms max=64.51ms p(50)=64.51ms p(95)=64.51ms p(99)=64.51ms   
+user_add_machine_key_duration........: min=44.93ms avg=87.89ms max=159.52ms p(50)=84.43ms p(95)=144.59ms p(99)=155.54ms  
+user_create_machine_duration.........: min=65.75ms avg=266.53ms max=421.58ms p(50)=276.59ms p(95)=380.84ms p(99)=414.43ms  
+vus..................................: 0 min=0 max=50   
+vus_max..............................: 50 min=50 max=50 
+
+running (30m04.9s), 00/50 VUs, 347788 complete and 0 interrupted iterations  
+default ✓ [======================================] 50 VUs 30m0s
+```
+
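
The "k6 iterations per second" value of 193 in the results table can be cross-checked against the k6 summary above: 347788 completed iterations over a runtime of 30m04.9s. The sketch below redoes that arithmetic. `parseK6Duration` is a hypothetical helper written for this example, not a k6 API.

```javascript
// Converts a k6-style duration such as "30m04.9s" into seconds.
function parseK6Duration(text) {
  const match = /^(?:(\d+)h)?(?:(\d+)m)?(?:([\d.]+)s)?$/.exec(text);
  if (!match || text === "") throw new Error(`unsupported duration: ${text}`);
  const [, h = "0", m = "0", s = "0"] = match;
  return Number(h) * 3600 + Number(m) * 60 + Number(s);
}

const seconds = parseK6Duration("30m04.9s"); // 1804.9
const iterationsPerSecond = 347788 / seconds;
console.log(iterationsPerSecond.toFixed(2)); // "192.69"
```

The result rounds to the 193 reported in the table. It differs from k6's own `192.689171/s` only in later decimals because the summary prints the runtime rounded to one decimal place.
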
--- a/docs/docs/apis/benchmarks/v2.65.0/machine_jwt_profile_grant/output.json
+++ b/docs/docs/apis/benchmarks/v2.65.0/machine_jwt_profile_grant/output.json
--- a/docs/package.json
+++ b/docs/package.json
@@ -44,6 +44,7 @@
    "react": "^18.2.0",
    "react-copy-to-clipboard": "^5.1.0",
    "react-dom": "^18.2.0",
+    "react-google-charts": "^5.2.1",
    "react-player": "^2.15.1",
    "sitemap": "7.1.1",
    "swc-loader": "^0.2.3",
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -841,6 +841,30 @@ module.exports = {
      label: "Rate Limits (Cloud)", // The link label
      href: "/legal/policies/rate-limit-policy", // The internal path
    },
+    {
+      type: "category",
+      label: "Benchmarks",
+      collapsed: false,      
+      link: {
+        type: "doc",
+        id: "apis/benchmarks/index",
+      },
+      items: [
+        {
+          type: "category",
+          label: "v2.65.0",
+          link: {
+            title: "v2.65.0",
+            slug: "/apis/benchmarks/v2.65.0",
+            description:
+              "Benchmark results of Zitadel v2.65.0\n"
+          },
+          items: [
+            "apis/benchmarks/v2.65.0/machine_jwt_profile_grant/index",
+          ],
+        },
+      ],
+    },
  ],
  selfHosting: [
    {
--- a/docs/src/components/benchmark_chart.jsx
+++ b/docs/src/components/benchmark_chart.jsx
@@ -0,0 +1,45 @@
+import React from "react";
+import Chart from "react-google-charts";
+
+export function BenchmarkChart({ testResults = "[]", height = "500px" }) {
+    // testResults is the raw JSON string loaded via raw-loader in the MDX page.
+    const options = {
+        legend: { position: 'bottom' },
+        focusTarget: 'category',
+        hAxis: {
+            title: 'timestamp',
+        },
+        vAxis: {
+            title: 'latency (ms)',
+        },
+    };
+
+    const data = [
+        [
+            {type:"datetime", label: "timestamp"},
+            {type:"number", label: "p50"},
+            {type:"number", label: "p95"},
+            {type:"number", label: "p99"},
+        ],
+    ]
+
+    JSON.parse(testResults).forEach((result) => {
+        data.push([
+            new Date(result.timestamp),
+            result.p50,
+            result.p95,
+            result.p99,
+        ])
+    });
+
+    return (
+        <Chart
+            chartType="LineChart"
+            width="100%"
+            height={height}
+            options={options}
+            data={data}
+            legendToggle
+        />
+            options={options}
+            data={data}
+            legendToggle
+        />
+  );
+}
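
The data preparation inside `BenchmarkChart` can be exercised without React by pulling it into a pure function. The sketch below does that. `toChartRows` is a hypothetical name, not part of this commit, and the sample record is made up to match the fields the component reads (`timestamp`, `p50`, `p95`, `p99`).

```javascript
// Builds the row array for a line chart: one header row followed by
// one [Date, p50, p95, p99] row per record in the JSON string.
function toChartRows(rawJson) {
  const header = [
    { type: "datetime", label: "timestamp" },
    { type: "number", label: "p50" },
    { type: "number", label: "p95" },
    { type: "number", label: "p99" },
  ];
  const rows = JSON.parse(rawJson).map((result) => [
    new Date(result.timestamp),
    result.p50,
    result.p95,
    result.p99,
  ]);
  return [header, ...rows];
}

// Made-up sample record for illustration only.
const chartRows = toChartRows(
  '[{"timestamp": "2024-10-22T16:20:00Z", "p50": 1, "p95": 2, "p99": 3}]',
);
console.log(chartRows.length); // 2
```

Keeping the transformation separate from rendering makes it possible to check a new `output.json` for malformed timestamps before the docs are built.
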
--- a/docs/static/img/benchmark/Flowchart.svg
+++ b/docs/static/img/benchmark/Flowchart.svg
--- a/docs/yarn.lock
+++ b/docs/yarn.lock
@@ -9479,6 +9479,11 @@ react-fast-compare@^3.0.1, react-fast-compare@^3.2.0, react-fast-compare@^3.2.2:
  resolved "https://registry.yarnpkg.com/react-fast-compare/-/react-fast-compare-3.2.2.tgz#929a97a532304ce9fee4bcae44234f1ce2c21d49"
  integrity sha512-nsO+KSNgo1SbJqJEYRE9ERzo7YtYbou/OqjSQKxV7jcKox7+usiUVZOAC+XnDOABXggQTno0Y1CpVnuWEc1boQ==

+react-google-charts@^5.2.1:
+  version "5.2.1"
+  resolved "https://registry.yarnpkg.com/react-google-charts/-/react-google-charts-5.2.1.tgz#d9cbe8ed45d7c0fafefea5c7c3361bee76648454"
+  integrity sha512-mCbPiObP8yWM5A9ogej7Qp3/HX4EzOwuEzUYvcfHtL98Xt4V/brD14KgfDzSNNtyD48MNXCpq5oVaYKt0ykQUQ==
+
 react-helmet-async@*:
  version "2.0.5"
  resolved "https://registry.yarnpkg.com/react-helmet-async/-/react-helmet-async-2.0.5.tgz#cfc70cd7bb32df7883a8ed55502a1513747223ec"