cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies
This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return the Pod's IP address
in a response header, if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements a preshutdown endpoint in the proxies. The preshutdown logic
waits until, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer served by the shutting-down Pod
(determined via the new Pod IP header); see the sketch after this list
- ensures that kubelet is configured to call the preshutdown endpoint
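A minimal sketch of that wait loop, assuming a hypothetical Pod IP header
name and helper wiring (the actual containerboot names may differ):

    package containerboot

    import (
        "context"
        "net/http"
        "time"
    )

    // podIPHeader is an assumed name for the header that the health check
    // endpoint sets to the serving Pod's IP.
    const podIPHeader = "Pod-IP"

    // waitEgressDrained polls the ClusterIP Service health check endpoints
    // for all currently configured egress services and returns once none of
    // them is served by this Pod anymore, i.e. kube proxy has stopped
    // routing cluster traffic to the shutting-down replica.
    func waitEgressDrained(ctx context.Context, healthURLs []string, podIP string) error {
        tick := time.NewTicker(time.Second)
        defer tick.Stop()
        for {
            if !servedByThisPod(ctx, healthURLs, podIP) {
                return nil
            }
            select {
            case <-ctx.Done():
                return ctx.Err()
            case <-tick.C:
            }
        }
    }

    func servedByThisPod(ctx context.Context, healthURLs []string, podIP string) bool {
        for _, u := range healthURLs {
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
            if err != nil {
                continue
            }
            resp, err := http.DefaultClient.Do(req)
            if err != nil {
                continue // unreachable; rules may already exclude this Pod
            }
            resp.Body.Close()
            if resp.Header.Get(podIPHeader) == podIP {
                return true // cluster traffic still reaches this Pod
            }
        }
        return false
    }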
This reduces the possibility that, as replicas are terminated during an update,
cluster traffic is still routed via the ClusterIP Service to a replica that is
being terminated, because kube proxy has not yet updated its routing rules.
This is not a perfect check: in practice it only verifies that the kube proxy
on the node where the proxy runs has updated its rules. Overall, however, this
should be good enough.
The preshutdown logic is disabled if users have configured a custom health check
port via the TS_LOCAL_ADDR_PORT env var. The change logs a warning in that case;
in the future, setting that env var for operator proxies may be disallowed (users
shouldn't need to configure this for a Pod directly).
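A sketch of that guard, with hypothetical wiring:

    package containerboot

    import (
        "log"
        "os"
    )

    // preshutdownEnabled reports whether the preshutdown endpoint should be
    // wired up. A sketch only; containerboot's real plumbing differs.
    func preshutdownEnabled() bool {
        if v, ok := os.LookupEnv("TS_LOCAL_ADDR_PORT"); ok {
            log.Printf("warning: TS_LOCAL_ADDR_PORT is set to %q; disabling the egress preshutdown check. Operator proxies should not need this setting, and it may be disallowed in the future.", v)
            return false
        }
        return true
    }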
This is backwards compatible with earlier proxy versions.
Updates tailscale/tailscale#14326
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode
- Wrap the Tailscale API client used by the Kubernetes Operator in a client
that knows how to manage VIPServices (see the interface sketch after this list).
- Create/Delete VIPServices and update serve config for L7 Ingresses
exposed via a ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.
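A rough interface sketch of that wrapper; the method names and the
pared-down VIPService type are assumptions for illustration, not the
actual API surface:

    package operator

    import "context"

    // VIPService is a hypothetical, pared-down representation of a
    // VIPService, for illustration only.
    type VIPService struct {
        Name  string
        Tags  []string
        Ports []string
    }

    // vipServiceClient wraps the operator's Tailscale API client with
    // VIPService management.
    type vipServiceClient interface {
        GetVIPService(ctx context.Context, name string) (*VIPService, error)
        CreateOrUpdateVIPService(ctx context.Context, svc *VIPService) error
        DeleteVIPService(ctx context.Context, name string) error
    }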
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/{k8s-operator,containerboot}: reload tailscaled configfile when its contents have changed
Instead of restarting the Kubernetes Operator's proxies each time the
tailscaled config changes, dynamically reload the config file using the
new reload endpoint.
The older annotation-based mechanism will be supported until 1.84 to
ensure that proxy versions prior to 1.80 keep working with operator
1.80 and newer. A sketch of the reload loop follows.
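A minimal sketch, assuming a polling watcher and a caller-supplied reload
func standing in for the call to tailscaled's reload endpoint:

    package containerboot

    import (
        "context"
        "crypto/sha256"
        "os"
        "time"
    )

    // watchConfig polls the mounted tailscaled config file and invokes
    // reload whenever the file contents change. The first iteration
    // always reloads, which is harmless.
    func watchConfig(ctx context.Context, path string, reload func(context.Context) error) error {
        var last [sha256.Size]byte
        tick := time.NewTicker(5 * time.Second)
        defer tick.Stop()
        for {
            b, err := os.ReadFile(path)
            if err != nil {
                return err
            }
            if sum := sha256.Sum256(b); sum != last {
                if err := reload(ctx); err != nil {
                    return err
                }
                last = sum
            }
            select {
            case <-ctx.Done():
                return ctx.Err()
            case <-tick.C:
            }
        }
    }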
Updates tailscale/tailscale#13032
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Currently this does not yet do anything apart from creating the
ProxyGroup's resources, such as its StatefulSet.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Every so often, the ProxyGroup and other controllers lose an optimistic locking
race against other controllers that update the objects they create. Stop treating
this as an error event and instead log an info-level line for it.
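The pattern, sketched with controller-runtime and the apimachinery
conflict check:

    package operator

    import (
        "go.uber.org/zap"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        ctrl "sigs.k8s.io/controller-runtime"
    )

    // handleUpdateErr downgrades optimistic locking conflicts from error
    // events to info logs; the reconcile is requeued and retried against
    // the fresh object version.
    func handleUpdateErr(logger *zap.SugaredLogger, err error) (ctrl.Result, error) {
        if apierrors.IsConflict(err) {
            logger.Infof("optimistic lock error, retrying: %v", err)
            return ctrl.Result{Requeue: true}, nil
        }
        return ctrl.Result{}, err
    }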
Fixes #14072
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor
Adds a new spec.metrics.serviceMonitor field to ProxyClass.
If it is set to true (and metrics are enabled), the operator
will create a Prometheus ServiceMonitor for each proxy to which
the ProxyClass applies.
Additionally, create a metrics Service for each proxy that has
metrics enabled.
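A sketch of the ServiceMonitor the operator might build, using
unstructured to avoid depending on the Prometheus operator's Go types;
the port name and label wiring are illustrative:

    package operator

    import "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"

    // serviceMonitor builds a minimal ServiceMonitor selecting a proxy's
    // metrics Service by label.
    func serviceMonitor(name, ns string, matchLabels map[string]any) *unstructured.Unstructured {
        return &unstructured.Unstructured{Object: map[string]any{
            "apiVersion": "monitoring.coreos.com/v1",
            "kind":       "ServiceMonitor",
            "metadata": map[string]any{
                "name":      name,
                "namespace": ns,
            },
            "spec": map[string]any{
                "selector":  map[string]any{"matchLabels": matchLabels},
                "endpoints": []any{map[string]any{"port": "metrics"}}, // assumed port name
            },
        }}
    }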
Updates tailscale/tailscale#11292
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We currently annotate Pods with a hash of the tailscaled config so that
we can trigger Pod restarts whenever it changes. However, the hash
updates more frequently than necessary, causing more restarts than
necessary. This commit removes two of the causes: scaling up/down, and
removing the auth key after Pods have initially authed to control. Note,
however, that Pods will still restart on scale-up/down because of the
updated set of volumes mounted into each Pod. Hopefully we can fix that
in a planned follow-up PR.
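The fix can be sketched as hashing a copy of the config with the
restart-irrelevant fields zeroed; proxyConfig here is a hypothetical
stand-in for the real tailscaled config struct:

    package operator

    import (
        "crypto/sha256"
        "encoding/json"
        "fmt"
    )

    // proxyConfig is a hypothetical stand-in for the tailscaled config.
    type proxyConfig struct {
        AuthKey  string
        Hostname string
    }

    // configHash zeroes fields that should not trigger a Pod restart
    // (such as the auth key, which is removed after initial auth) before
    // hashing, so the annotation stays stable.
    func configHash(c proxyConfig) (string, error) {
        c.AuthKey = "" // never let auth key changes alter the hash
        b, err := json.Marshal(c)
        if err != nil {
            return "", err
        }
        return fmt.Sprintf("%x", sha256.Sum256(b)), nil
    }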
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Sets a custom hostinfo app type for ProxyGroup replicas, similar to how
we do it for all other components managed by the Kubernetes Operator.
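Elsewhere this is done via hostinfo.SetApp; a sketch, with the
ProxyGroup app-type string assumed for illustration:

    package containerboot

    import "tailscale.com/hostinfo"

    func init() {
        // The exact app-type string for ProxyGroup replicas is an
        // assumption here; it is reported to control as part of Hostinfo.
        hostinfo.SetApp("k8s-operator-proxygroup")
    }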
Updates tailscale/tailscale#13406,tailscale/corp#22920
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We don't need to error out and continuously reconcile if the ProxyClass
has not (yet) been created; once it is created, the ProxyGroup
reconciler will be triggered.
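The pattern, sketched generically against the controller-runtime client:

    package operator

    import (
        "context"

        "go.uber.org/zap"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        "k8s.io/apimachinery/pkg/types"
        "sigs.k8s.io/controller-runtime/pkg/client"
    )

    // getProxyClass fetches the referenced ProxyClass into `into`. A
    // missing ProxyClass is not an error: the reconciler watches
    // ProxyClasses and will be re-triggered once it is created.
    func getProxyClass(ctx context.Context, c client.Client, logger *zap.SugaredLogger, name string, into client.Object) (found bool, err error) {
        err = c.Get(ctx, types.NamespacedName{Name: name}, into)
        if apierrors.IsNotFound(err) {
            logger.Infof("ProxyClass %q not yet created, waiting", name)
            return false, nil // no error, no endless requeue
        }
        return err == nil, err
    }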
Updates tailscale/tailscale#13406
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml.
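A sketch of the approach, assuming the manifest is embedded at the path
shown:

    package operator

    import (
        _ "embed"
        "fmt"

        appsv1 "k8s.io/api/apps/v1"
        "sigs.k8s.io/yaml"
    )

    //go:embed deploy/manifests/proxy.yaml
    var proxyYaml []byte

    // baseStatefulSet parses the shared proxy.yaml scaffolding into a
    // StatefulSet that the ProxyGroup controller then customizes per
    // replica.
    func baseStatefulSet() (*appsv1.StatefulSet, error) {
        var ss appsv1.StatefulSet
        if err := yaml.Unmarshal(proxyYaml, &ss); err != nil {
            return nil, fmt.Errorf("failed to unmarshal proxy spec: %w", err)
        }
        return &ss, nil
    }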
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The default ProxyClass can be set via the Helm chart or an env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies.
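Resolution can be sketched as a simple fallback; the env var name is an
assumption:

    package operator

    import "os"

    // proxyClassName returns the ProxyClass to apply: an explicit one set
    // on the resource wins, otherwise the operator-wide default (plumbed
    // in via the Helm chart as an env var) is used.
    func proxyClassName(explicit string) string {
        if explicit != "" {
            return explicit
        }
        return os.Getenv("PROXY_DEFAULT_CLASS") // assumed env var name
    }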
Nearby but unrelated changes:
* Fix up double error logs (controller-runtime already logs returned errors)
* Fix a couple of variable names
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high-availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.
We currently mount all of the config secrets into the container, but
will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.
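The per-replica layout can be sketched as follows; the naming scheme is
illustrative:

    package operator

    import "fmt"

    // pgSecretNames returns the per-replica config and state Secret names
    // for a ProxyGroup. Each replica has its own pair, and therefore its
    // own tailscale node identity.
    func pgSecretNames(pgName string, replicas int32) (cfg, state []string) {
        for i := int32(0); i < replicas; i++ {
            cfg = append(cfg, fmt.Sprintf("%s-%d-config", pgName, i))
            state = append(state, fmt.Sprintf("%s-%d", pgName, i))
        }
        return cfg, state
    }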
Updates #13406
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>