11 Commits

Author SHA1 Message Date
Irbe Krumina
b406f209c3
cmd/{k8s-operator,containerboot},kube: ensure egress ProxyGroup proxies don't terminate while cluster traffic is still routed to them (#14436)
cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies

This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return Pod IP in headers,
if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements preshutdown endpoint in proxies. The preshutdown endpoint
logic waits till, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer returned by the shutting-down Pod
(by looking at the new Pod IP header).
- ensures that kubelet is configured to call the preshutdown endpoint

This reduces the possibility that, as replicas are terminated during an update,
a replica gets terminated to which cluster traffic is still being routed via
the ClusterIP Service because kube proxy has not yet updated routig rules.
This is not a perfect check as in practice, it only checks that the kube
proxy on the node on which the proxy runs has updated rules. However, overall
this might be good enough.

The preshutdown logic is disabled if users have configured a custom health check
port via TS_LOCAL_ADDR_PORT env var. This change throws a warnign if so and in
future setting of that env var for operator proxies might be disallowed (as users
shouldn't need to configure this for a Pod directly).
This is backwards compatible with earlier proxy versions.

Updates tailscale/tailscale#14326


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-29 07:35:50 +00:00
Irbe Krumina
817ba1c300
cmd/{k8s-operator,containerboot},kube/kubetypes: parse Ingresses for ingress ProxyGroup (#14583)
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode

- Wrap the Tailscale API client used by the Kubernetes Operator
into a client that knows how to manage VIPServices.
- Create/Delete VIPServices and update serve config for L7 Ingresses
for ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.

Updates tailscale/corp#24795


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-21 05:21:03 +00:00
Irbe Krumina
48a95c422a
cmd/containerboot,cmd/k8s-operator: reload tailscaled config (#14342)
cmd/{k8s-operator,containerboot}: reload tailscaled configfile when its contents have changed

Instead of restarting the Kubernetes Operator proxies each time
tailscaled config has changed, this dynamically reloads the configfile
using the new reload endpoint.
Older annotation based mechanism will be supported till 1.84
to ensure that proxy versions prior to 1.80 keep working with
operator 1.80 and newer.

Updates tailscale/tailscale#13032
Updates tailscale/corp#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-10 07:29:11 +00:00
Irbe Krumina
8d4ca13cf8
cmd/k8s-operator,k8s-operator: support ingress ProxyGroup type (#14548)
Currently this does not yet do anything apart from creating
the ProxyGroup resources like StatefulSet.

Updates tailscale/corp#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-01-08 13:43:17 +00:00
Irbe Krumina
00517c8189
kube/{kubeapi,kubeclient},ipn/store/kubestore,cmd/{containerboot,k8s-operator}: emit kube store Events (#14112)
Adds functionality to kube client to emit Events.
Updates kube store to emit Events when tailscaled state has been loaded, updated or if any errors where
encountered during those operations.
This should help in cases where an error related to state loading/updating caused the Pod to crash in a loop-
unlike logs of the originally failed container instance, Events associated with the Pod will still be
accessible even after N restarts.

Updates tailscale/tailscale#14080

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-19 13:07:19 +00:00
Tom Proctor
d8a3683fdf
cmd/k8s-operator: restart ProxyGroup pods less (#14045)
We currently annotate pods with a hash of the tailscaled config so that
we can trigger pod restarts whenever it changes. However, the hash
updates more frequently than is necessary causing more restarts than is
necessary. This commit removes two causes; scaling up/down and removing
the auth key after pods have initially authed to control. However, note
that pods will still restart on scale-up/down because of the updated set
of volumes mounted into each pod. Hopefully we can fix that in a planned
follow-up PR.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-11-12 14:18:19 +00:00
Irbe Krumina
8ba9b558d2
envknob,kube/kubetypes,cmd/k8s-operator: add app type for ProxyGroup (#14029)
Sets a custom hostinfo app type for ProxyGroup replicas, similarly
to how we do it for all other Kubernetes Operator managed components.

Updates tailscale/tailscale#13406,tailscale/corp#22920

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-11-07 12:42:29 +00:00
Tom Proctor
07c157ee9f
cmd/k8s-operator: base ProxyGroup StatefulSet on common proxy.yaml definition (#13714)
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-08 20:05:08 +01:00
Irbe Krumina
861dc3631c
cmd/{k8s-operator,containerboot},kube/egressservices: fix Pod IP check for dual stack clusters (#13721)
Currently egress Services for ProxyGroup only work for Pods and Services
with IPv4 addresses. Ensure that it works on dual stack clusters by reading
proxy Pod's IP from the .status.podIPs list that always contains both
IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that
could contain IPv6 only for a dual stack cluster.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-08 18:35:23 +01:00
Irbe Krumina
7f016baa87
cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715)
cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2024-10-07 20:12:56 +01:00
Tom Proctor
e48cddfbb3
cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684)
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.

We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2024-10-07 14:58:45 +01:00