tailscale/docs/k8s/operator-architecture.md
Tom Proctor b3d4ffe168
docs/k8s: add some high-level operator architecture diagrams (#13915)
This is an experiment to see how useful we will find it to have some
text-based diagrams to document how various components of the operator
work. There are no plans to link to this from elsewhere yet, but
hopefully it will be a useful reference internally.

Updates #cleanup

Change-Id: If5911ed39b09378fec0492e87738ec0cc3d8731e
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-12-17 15:36:57 +00:00

16 KiB

Operator architecture diagrams

The Tailscale Kubernetes operator has a collection of use-cases that can be mixed and matched as required. The following diagrams illustrate how the operator implements each use-case.

In each diagram, the "tailscale" namespace is entirely managed by the operator once the operator itself has been deployed.

Tailscale devices are highlighted as black nodes. The salient devices for each use-case are marked as "src" or "dst" to denote which node is a source or a destination in the context of ACL rules that will apply to network traffic.

Note, in some cases, the config and the state Secret may be the same Kubernetes Secret.

API server proxy

Documentation

The operator runs the API server proxy in-process. If the proxy is running in "noauth" mode, it forwards HTTP requests unmodified. If the proxy is running in "auth" mode, it deletes any existing auth headers and adds impersonation headers to the request before forwarding to the API server. A request with impersonation headers will look something like:

GET /api/v1/namespaces/default/pods HTTP/1.1
Host: k8s-api.example.com
Authorization: Bearer <operator-service-account-token>
Impersonate-Group: tailnet-readers
Accept: application/json
%%{ init: { 'theme':'neutral' } }%%
flowchart LR
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph k8s[Kubernetes cluster]
        subgraph tailscale-ns[namespace=tailscale]
            operator(("operator (dst)")):::tsnode
        end

        subgraph controlplane["Control plane"]
            api[kube-apiserver]
        end
    end

    client["client (src)"]:::tsnode --> operator
    operator -->|"proxy (maybe with impersonation headers)"| api

    linkStyle 0 stroke:red;
    linkStyle 2 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 3 stroke:blue;

L3 ingress

Documentation

The user deploys an app to the default namespace, and creates a normal Service that selects the app's Pods. Either add the annotation tailscale.com/expose: "true" or specify .spec.type as Loadbalancer and .spec.loadBalancerClass as tailscale. The operator will create an ingress proxy that allows devices anywhere on the tailnet to access the Service.

The proxy Pod uses iptables or nftables rules to DNAT traffic bound for the proxy's tailnet IP to the Service's internal Cluster IP instead.

%%{ init: { 'theme':'neutral' } }%%
flowchart TD
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph k8s[Kubernetes cluster]
        subgraph tailscale-ns[namespace=tailscale]
            operator((operator)):::tsnode
            ingress-sts["StatefulSet"]
            ingress(("ingress proxy (dst)")):::tsnode
            config-secret["config Secret"]
            state-secret["state Secret"]
        end

        subgraph defaultns[namespace=default]
            svc[annotated Service]
            svc --> pod1((pod1))
            svc --> pod2((pod2))
        end
    end

    client["client (src)"]:::tsnode --> ingress
    ingress -->|forwards traffic| svc
    operator -.->|creates| ingress-sts
    ingress-sts -.->|manages| ingress
    operator -.->|reads| svc
    operator -.->|creates| config-secret
    config-secret -.->|mounted| ingress
    ingress -.->|stores state| state-secret

    linkStyle 0 stroke:red;
    linkStyle 4 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 2 stroke:blue;
    linkStyle 3 stroke:blue;
    linkStyle 5 stroke:blue;

L7 ingress

Documentation

L7 ingress is relatively similar to L3 ingress. It is configured via an Ingress object instead of a Service, and uses tailscale serve to accept traffic instead of configuring iptables or nftables rules. Note that we use tailscaled's local API (SetServeConfig) to set serve config, not the tailscale serve command.

%%{ init: { 'theme':'neutral' } }%%
flowchart TD
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph k8s[Kubernetes cluster]
        subgraph tailscale-ns[namespace=tailscale]
            operator((operator)):::tsnode
            ingress-sts["StatefulSet"]
            ingress-pod(("ingress proxy (dst)")):::tsnode
            config-secret["config Secret"]
            state-secret["state Secret"]
        end

        subgraph defaultns[namespace=default]
            ingress[tailscale Ingress]
            svc["Service"]
            svc --> pod1((pod1))
            svc --> pod2((pod2))
        end
    end

    client["client (src)"]:::tsnode --> ingress-pod
    ingress-pod -->|forwards /api prefix traffic| svc
    operator -.->|creates| ingress-sts
    ingress-sts -.->|manages| ingress-pod
    operator -.->|reads| ingress
    operator -.->|creates| config-secret
    config-secret -.->|mounted| ingress-pod
    ingress-pod -.->|stores state| state-secret
    ingress -.->|/api prefix| svc

    linkStyle 0 stroke:red;
    linkStyle 4 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 2 stroke:blue;
    linkStyle 3 stroke:blue;
    linkStyle 5 stroke:blue;

L3 egress

Documentation

  1. The user deploys a Service with type: ExternalName and an annotation tailscale.com/tailnet-fqdn: db.tails-scales.ts.net.
  2. The operator creates a proxy Pod managed by a single replica StatefulSet, and a headless Service pointing at the proxy Pod.
  3. The operator updates the ExternalName Service's spec.externalName field to point at the headless Service it created in the previous step.

(Optional) If the user also adds the tailscale.com/proxy-group: egress-proxies annotation to their ExternalName Service, the operator will skip creating a proxy Pod and instead point the headless Service at the existing ProxyGroup's pods. In this case, ports are also required in the ExternalName Service spec. See below for a more representative diagram.

%%{ init: { 'theme':'neutral' } }%%

flowchart TD
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph k8s[Kubernetes cluster]
        subgraph tailscale-ns[namespace=tailscale]
            operator((operator)):::tsnode
            egress(("egress proxy (src)")):::tsnode
            egress-sts["StatefulSet"]
            headless-svc[headless Service]
            cfg-secret["config Secret"]
            state-secret["state Secret"]
        end

        subgraph defaultns[namespace=default]
            svc[ExternalName Service]
            pod1((pod1)) --> svc
            pod2((pod2)) --> svc
        end
    end

    node["db.tails-scales.ts.net (dst)"]:::tsnode

    svc -->|DNS points to| headless-svc
    headless-svc -->|selects egress Pod| egress
    egress -->|forwards traffic| node
    operator -.->|creates| egress-sts
    egress-sts -.->|manages| egress
    operator -.->|creates| headless-svc
    operator -.->|creates| cfg-secret
    operator -.->|watches & updates| svc
    cfg-secret -.->|mounted| egress
    egress -.->|stores state| state-secret

    linkStyle 0 stroke:red;
    linkStyle 6 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 2 stroke:blue;
    linkStyle 3 stroke:blue;
    linkStyle 4 stroke:blue;
    linkStyle 5 stroke:blue;

ProxyGroup

Documentation

The ProxyGroup custom resource manages a collection of proxy Pods that can be configured to egress traffic out of the cluster via ExternalName Services. A ProxyGroup is both a high availability (HA) version of L3 egress, and a mechanism to serve multiple ExternalName Services on a single set of Tailscale devices (coalescing).

In this diagram, the ProxyGroup is named pg. The Secrets associated with the ProxyGroup Pods are omitted for simplicity. They are similar to the L3 egress case above, but there is a pair of config + state Secrets per Pod.

Each ExternalName Service defines which ports should be mapped to their defined egress target. The operator maps from these ports to randomly chosen ephemeral ports via the ClusterIP Service and its EndpointSlice. The operator then generates the egress ConfigMap that tells the ProxyGroup Pods which incoming ports map to which egress targets.

ProxyGroups currently only support egress.

%%{ init: { 'theme':'neutral' } }%%

flowchart LR
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph k8s[Kubernetes cluster]
        subgraph tailscale-ns[namespace=tailscale]
            operator((operator)):::tsnode
            pg-sts[StatefulSet]
            pg-0(("pg-0 (src)")):::tsnode
            pg-1(("pg-1 (src)")):::tsnode
            db-cluster-ip[db ClusterIP Service]
            api-cluster-ip[api ClusterIP Service]
            egress-cm["egress ConfigMap"]
        end

        subgraph cluster-scope["Cluster scoped resources"]
            pg["ProxyGroup 'pg'"]
        end

        subgraph defaultns[namespace=default]
            db-svc[db ExternalName Service]
            api-svc[api ExternalName Service]
            pod1((pod1)) --> db-svc
            pod2((pod2)) --> db-svc
            pod1((pod1)) --> api-svc
            pod2((pod2)) --> api-svc
        end
    end

    db["db.tails-scales.ts.net (dst)"]:::tsnode
    api["api.tails-scales.ts.net (dst)"]:::tsnode

    db-svc -->|DNS points to| db-cluster-ip
    api-svc -->|DNS points to| api-cluster-ip
    db-cluster-ip -->|maps to ephemeral db ports| pg-0
    db-cluster-ip -->|maps to ephemeral db ports| pg-1
    api-cluster-ip -->|maps to ephemeral api ports| pg-0
    api-cluster-ip -->|maps to ephemeral api ports| pg-1
    pg-0 -->|forwards db port traffic| db
    pg-0 -->|forwards api port traffic| api
    pg-1 -->|forwards db port traffic| db
    pg-1 -->|forwards api port traffic| api
    operator -.->|creates & populates endpointslice| db-cluster-ip
    operator -.->|creates & populates endpointslice| api-cluster-ip
    operator -.->|stores port mapping| egress-cm
    egress-cm -.->|mounted| pg-0
    egress-cm -.->|mounted| pg-1
    operator -.->|watches| pg
    operator -.->|creates| pg-sts
    pg-sts -.->|manages| pg-0
    pg-sts -.->|manages| pg-1
    operator -.->|watches| db-svc
    operator -.->|watches| api-svc

    linkStyle 0 stroke:red;
    linkStyle 12 stroke:red;
    linkStyle 13 stroke:red;
    linkStyle 14 stroke:red;
    linkStyle 15 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 2 stroke:blue;
    linkStyle 3 stroke:blue;
    linkStyle 4 stroke:blue;
    linkStyle 5 stroke:blue;
    linkStyle 6 stroke:blue;
    linkStyle 7 stroke:blue;
    linkStyle 8 stroke:blue;
    linkStyle 9 stroke:blue;
    linkStyle 10 stroke:blue;
    linkStyle 11 stroke:blue;

Connector

Subnet router and exit node documentation

App connector documentation

The Connector Custom Resource can deploy either a subnet router, an exit node, or an app connector. The following diagram shows all 3, but only one workflow can be configured per Connector resource.

%%{ init: { 'theme':'neutral' } }%%

flowchart TD
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;
    classDef hidden display:none;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph grouping[" "]
        subgraph k8s[Kubernetes cluster]
            subgraph tailscale-ns[namespace=tailscale]
                operator((operator)):::tsnode
                cn-sts[StatefulSet]
                cn-pod(("tailscale (dst)")):::tsnode
                cfg-secret["config Secret"]
                state-secret["state Secret"]
            end

            subgraph cluster-scope["Cluster scoped resources"]
                cn["Connector"]
            end

            subgraph defaultns["namespace=default"]
                pod1
            end
        end

        client["client (src)"]:::tsnode
        Internet
    end

    client --> cn-pod
    cn-pod -->|app connector or exit node routes| Internet
    cn-pod -->|subnet route| pod1
    operator -.->|watches| cn
    operator -.->|creates| cn-sts
    cn-sts -.->|manages| cn-pod
    operator -.->|creates| cfg-secret
    cfg-secret -.->|mounted| cn-pod
    cn-pod -.->|stores state| state-secret

    class grouping hidden

    linkStyle 0 stroke:red;
    linkStyle 2 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 3 stroke:blue;
    linkStyle 4 stroke:blue;

Recorder nodes

Documentation

The Recorder custom resource makes it easier to deploy tsrecorder to a cluster. It currently only supports a single replica.

%%{ init: { 'theme':'neutral' } }%%

flowchart TD
    classDef tsnode color:#fff,fill:#000;
    classDef pod fill:#fff;
    classDef hidden display:none;

    subgraph Key
        ts[Tailscale device]:::tsnode
        pod((Pod)):::pod
        blank[" "]-->|WireGuard traffic| blank2[" "]
        blank3[" "]-->|Other network traffic| blank4[" "]
    end

    subgraph grouping[" "]
        subgraph k8s[Kubernetes cluster]
            api["kube-apiserver"]

            subgraph tailscale-ns[namespace=tailscale]
                operator(("operator (dst)")):::tsnode
                rec-sts[StatefulSet]
                rec-0(("tsrecorder")):::tsnode
                cfg-secret-0["config Secret"]
                state-secret-0["state Secret"]
            end

            subgraph cluster-scope["Cluster scoped resources"]
                rec["Recorder"]
            end
        end

        client["client (src)"]:::tsnode
        kubectl-exec["kubectl exec (src)"]:::tsnode
        server["server (dst)"]:::tsnode
        s3["S3-compatible storage"]
    end

    kubectl-exec -->|exec session| operator
    operator -->|exec session recording| rec-0
    operator -->|exec session| api
    client -->|ssh session| server
    server -->|ssh session recording| rec-0
    rec-0 -->|session recordings| s3
    operator -.->|watches| rec
    operator -.->|creates| rec-sts
    rec-sts -.->|manages| rec-0
    operator -.->|creates| cfg-secret-0
    cfg-secret-0 -.->|mounted| rec-0
    rec-0 -.->|stores state| state-secret-0

    class grouping hidden

    linkStyle 0 stroke:red;
    linkStyle 2 stroke:red;
    linkStyle 3 stroke:red;
    linkStyle 5 stroke:red;
    linkStyle 6 stroke:red;

    linkStyle 1 stroke:blue;
    linkStyle 4 stroke:blue;
    linkStyle 7 stroke:blue;