chore: service ping api design (#9984)

# Which Problems Are Solved

Add the ability to report information and analytical data from
(self-hosted) ZITADEL systems to a central endpoint.
To make this possible, an API has to be designed that receives the different
reports and information.

# How the Problems Are Solved

- Telemetry service definition added, which currently has two endpoints:
  - ReportBaseInformation: To gather the zitadel version and instance
    information such as id and creation date
  - ReportResourceCounts: Dynamically report (based on #9979) different
    resources (orgs, users per org, ...)
- To be able to paginate and send multiple pages to the endpoint a
`report_id` is returned on the first page / request from the server,
which needs to be passed by the client on the following pages.
- Base error handling is described in the proto and is based on gRPC
standards and best practices.
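
The `report_id` handshake described above could look roughly like this on the client side. This is a minimal Python sketch under stated assumptions, not the ZITADEL implementation: `ResourceCountsRequest`, `report_resource_counts`, the `send` callable and the `page_size` parameter are hypothetical stand-ins for the generated gRPC types and client, and passing on the most recently returned ID is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Sequence


@dataclass
class ResourceCountsRequest:
    """Hypothetical plain stand-in for ReportResourceCountsRequest."""
    system_id: str
    resource_counts: List[dict]
    report_id: Optional[str] = None  # unset on the first page


def pages(items: Sequence[dict], page_size: int) -> Iterator[List[dict]]:
    """Split the resource counts into pages of at most page_size entries."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(items), page_size):
        yield list(items[start:start + page_size])


def report_resource_counts(
    send: Callable[[ResourceCountsRequest], str],
    system_id: str,
    counts: Sequence[dict],
    page_size: int,
) -> Optional[str]:
    """Send all pages; `send` is assumed to return the response's report_id."""
    report_id: Optional[str] = None
    for page in pages(counts, page_size):
        request = ResourceCountsRequest(
            system_id=system_id,
            resource_counts=page,
            report_id=report_id,  # None on the first request only
        )
        # Assumption: the previously returned ID is passed on the next page.
        report_id = send(request)
    return report_id
```

In this sketch the first request carries no `report_id`, and every later request reuses the ID from the preceding response, so the server can group the pages into one report.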

# Additional Changes

none

# Additional Context

Public documentation of the behaviour / error handling, of what data is
collected, and of how to configure it will be provided in
https://github.com/zitadel/zitadel/issues/9869.

Closes https://github.com/zitadel/zitadel/issues/9872
Livio Spring
2025-06-05 12:13:26 +02:00
committed by GitHub
parent 7df4f76f3c
commit 63c92104ba
2 changed files with 127 additions and 0 deletions


@@ -0,0 +1,48 @@
syntax = "proto3";

package zitadel.analytics.v2beta;

import "google/protobuf/timestamp.proto";

option go_package = "github.com/zitadel/zitadel/pkg/grpc/analytics/v2beta;analytics";

message InstanceInformation {
  // The unique identifier of the instance.
  string id = 1;
  // The custom domains (incl. generated ones) of the instance.
  repeated string domains = 2;
  // The creation date of the instance.
  google.protobuf.Timestamp created_at = 3;
}

message ResourceCount {
  // The ID of the instance for which the resource counts are reported.
  string instance_id = 3;
  // The parent type of the resource counts (e.g. organization or instance).
  // For example, reporting the amount of users per organization would use
  // `COUNT_PARENT_TYPE_ORGANIZATION` as parent type and the organization ID as parent ID.
  CountParentType parent_type = 4;
  // The parent ID of the resource counts (e.g. organization or instance ID).
  // For example, reporting the amount of users per organization would use
  // `COUNT_PARENT_TYPE_ORGANIZATION` as parent type and the organization ID as parent ID.
  string parent_id = 5;
  // The resource counts to report, e.g. amount of `users`, `organizations`, etc.
  string resource_name = 6;
  // The name of the table in the database, which was used to calculate the counts.
  // This can be used to deduplicate counts in case of multiple reports.
  // For example, if the counts were calculated from the `users14` table,
  // the table name would be `users14`, where there could also be a `users15` table
  // reported at the same time as the system is rolling out a new version.
  string table_name = 7;
  // The timestamp when the count was last updated.
  google.protobuf.Timestamp updated_at = 8;
  // The actual amount of the resource.
  uint32 amount = 9;
}

enum CountParentType {
  COUNT_PARENT_TYPE_UNSPECIFIED = 0;
  COUNT_PARENT_TYPE_INSTANCE = 1;
  COUNT_PARENT_TYPE_ORGANIZATION = 2;
}


@@ -0,0 +1,79 @@
syntax = "proto3";

package zitadel.analytics.v2beta;

import "google/protobuf/timestamp.proto";
import "zitadel/analytics/v2beta/telemetry.proto";

option go_package = "github.com/zitadel/zitadel/pkg/grpc/analytics/v2beta;analytics";

// The TelemetryService is used to report telemetry such as usage statistics of the ZITADEL instance(s)
// back to a central storage.
// It is used to collect anonymized data about the usage of ZITADEL features, capabilities, and configurations.
// ZITADEL acts as a client of the TelemetryService.
//
// Reports are sent periodically based on the system's runtime configuration.
// The content of the reports, i.e. the data collected, can be configured in the system's runtime configuration.
//
// All endpoints follow the same error and retry handling:
// In case of a failure to report the usage, ZITADEL will retry to report the usage
// based on the configured retry policy and error type:
// - Client side errors will not be retried, as they indicate a misconfiguration or an invalid request:
//   - `INVALID_ARGUMENT`: The request was malformed.
//   - `NOT_FOUND`: The TelemetryService's endpoint is likely misconfigured.
// - Connection / transfer errors will be retried based on the retry policy configured in the system's runtime configuration:
//   - `DEADLINE_EXCEEDED`: The request took too long to complete, it will be retried.
//   - `RESOURCE_EXHAUSTED`: The request was rejected due to resource exhaustion, it will be retried after a backoff period.
//   - `UNAVAILABLE`: The TelemetryService is currently unavailable, it will be retried after a backoff period.
// - Server side errors will also be retried based on the information provided by the server:
//   - `FAILED_PRECONDITION`: The request failed due to a precondition, e.g. the report ID does not exist,
//     does not correspond to the same system ID or previous reporting is too old, do not retry.
//   - `INTERNAL`: An internal error occurred. Check details and logs.
service TelemetryService {
  // ReportBaseInformation is used to report the base information of the ZITADEL system,
  // including the version, instances, their creation date and domains.
  // The response contains a report ID to link it to the resource counts or other reports.
  // The report ID is only valid for the same system ID.
  rpc ReportBaseInformation (ReportBaseInformationRequest) returns (ReportBaseInformationResponse) {}

  // ReportResourceCounts is used to report the resource counts such as amount of organizations
  // or users per organization and much more.
  // Since the resource counts can be reported in multiple batches,
  // the response contains a report ID to continue reporting.
  // The report ID is only valid for the same system ID.
  rpc ReportResourceCounts (ReportResourceCountsRequest) returns (ReportResourceCountsResponse) {}
}

message ReportBaseInformationRequest {
  // The system ID is a unique identifier for the ZITADEL system.
  string system_id = 1;
  // The current version of the ZITADEL system.
  string version = 2;
  // A list of instances in the ZITADEL system and their information.
  repeated InstanceInformation instances = 3;
}

message ReportBaseInformationResponse {
  // The report ID is a unique identifier for the report.
  // It is used to identify the report to be able to link it to the resource counts or other reports.
  // Note that the report ID is only valid for the same system ID.
  string report_id = 1;
}

message ReportResourceCountsRequest {
  // The system ID is a unique identifier for the ZITADEL system.
  string system_id = 1;
  // The previously returned report ID from the server to continue reporting.
  // Note that the report ID is only valid for the same system ID.
  optional string report_id = 2;
  // A list of resource counts to report.
  repeated ResourceCount resource_counts = 3;
}

message ReportResourceCountsResponse {
  // The report ID is a unique identifier for the report.
  // It is used to identify the report in case of additional data / pagination.
  // Note that the report ID is only valid for the same system ID.
  string report_id = 1;
}
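
The error and retry handling described in the TelemetryService comment can be illustrated with a small client-side sketch. It is written in Python with status codes as plain strings, and it is only an illustration under assumptions: `ReportError`, `backoff_delay`, `send_with_retry`, the exponential backoff formula and the default of three attempts are hypothetical, since the actual retry policy comes from the system's runtime configuration. Treating `INTERNAL` as retryable with backoff and giving up on unlisted codes are also assumptions.

```python
import time
from typing import Callable, Optional

# Codes the service comment lists as retried right away or after a backoff.
RETRY_IMMEDIATELY = frozenset({"DEADLINE_EXCEEDED"})
# Assumption: INTERNAL is grouped with the backoff codes in this sketch.
RETRY_AFTER_BACKOFF = frozenset({"RESOURCE_EXHAUSTED", "UNAVAILABLE", "INTERNAL"})
# INVALID_ARGUMENT, NOT_FOUND and FAILED_PRECONDITION are not retried;
# any code not listed above is also treated as final (assumption).


class ReportError(Exception):
    """Hypothetical error carrying a gRPC status code name."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def backoff_delay(code: str, attempt: int, base_delay: float = 1.0) -> Optional[float]:
    """Seconds to wait after failed attempt `attempt` (1-based), or None to give up."""
    if code in RETRY_IMMEDIATELY:
        return 0.0
    if code in RETRY_AFTER_BACKOFF:
        # Exponential backoff is an assumption; the source only says "backoff period".
        return base_delay * 2 ** (attempt - 1)
    return None


def send_with_retry(
    send: Callable[[], str],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    base_delay: float = 1.0,
) -> str:
    """Call `send` until it succeeds, the error is final, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return send()
        except ReportError as err:
            delay = backoff_delay(err.code, attempt, base_delay)
            if delay is None or attempt >= max_attempts:
                raise
            sleep(delay)
            attempt += 1
```

Injecting `sleep` keeps the sketch testable without real waiting. A production client would more likely map the codes from its gRPC library's own status type instead of strings.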