k8s-proxy was reliably crashing after a long but variable period of
time. On debugging, this was due to the watcher getting stopped when the
connection was severed (for example if my laptop was asleep for a
while), and the switch-case failed to account for the fact the result
chan may have been closed. Fix by checking for channel being closed, and
restarting the watcher in that case. Also set a timeout on the watch so
we're not relying on the watcher being quick to react to severed
connections and regularly exercise the logic to re-watch. Finally, pull
the loadConfigFromSecret call inside the added/created case to ensure
it's never possible to call it with a nil secret.
Change-Id: Ibf0cf971cbe41524cf6db3c0644c29803ad69fc4
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Adds a new reconciler for ProxyGroups of type kube-apiserver that will
provision a Tailscale Service for each replica to advertise. Adds two
new condition types to the ProxyGroup, TailscaleServiceValid and
TailscaleServiceConfigured, to post updates on the state of that
reconciler in a way that's consistent with the service-pg reconciler.
The created Tailscale Service name is configurable via a new ProxyGroup
field spec.kubeAPISserver.ServiceName, which expects a string of the
form "svc:<dns-label>".
Lots of supporting changes were needed to implement this in a way that's
consistent with other operator workflows, including:
* Pulled containerboot's ensureServicesUnadvertised and certManager into
kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to
aid Service cert sharing between replicas and graceful Service shutdown.
* For certManager, add an initial wait to the cert loop to wait until
the domain appears in the devices's netmap to avoid a guaranteed error
on the first issue attempt when it's quick to start.
* Made several methods in ingress-for-pg.go and svc-for-pg.go into
functions to share with the new reconciler
* Added a Resource struct to the owner refs stored in Tailscale Service
annotations to be able to distinguish between Ingress- and ProxyGroup-
based Services that need cleaning up in the Tailscale API.
* Added a ListVIPServices method to the internal tailscale client to aid
cleaning up orphaned Services
* Support for reading config from a kube Secret, and partial support for
config reloading, to prevent us having to force Pod restarts when
config changes.
* Fixed up the zap logger so it's possible to set debug log level.
Updates #13358
Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
A trusted peer relay path is always better than an untrusted direct or
peer relay path.
Updates tailscale/corp#30412
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This package promises more performance, but was never used.
The intent of the package is somewhat moot as "encoding/json"
in Go 1.25 (while under GOEXPERIMENT=jsonv2) has been
completely re-implemented using "encoding/json/v2"
such that unmarshal is dramatically faster.
Updates #cleanup
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
udpRelayEndpointReady used to write into the peerMap, which required
holding Conn.mu, but this changed in f9e7131.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
To write the init script.
And fix the JetKVM detection to work during early boot while the filesystem
and modules are still being loaded; it wasn't being detected on early boot
and then tailscaled was failing to start because it didn't know it was on JetKVM
and didn't modprobe tun.
Updates #16524
Change-Id: I0524ca3abd7ace68a69af96aab4175d32c07e116
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When `tailscale exit-node suggest` contacts the LocalAPI for a
suggested exit node, the client consults its netmap for peers that
contain the `suggest-exit-node` peercap. It currently uses a series of
heuristics to determine the exit node to suggest.
When the `traffic-steering` feature flag is enabled on its tailnet,
the client will defer to Control’s priority scores for a particular
peer. These scores, in `tailcfg.Hostinfo.Location.Priority`, were
historically only used for Mullvad exit nodes, but they have now been
extended to score any peer that could host a redundant resource.
Client capability version 119 is the earliest client that understands
these traffic steering scores. Control tells the client to switch to
rely on these scores by adding `tailcfg.NodeAttrTrafficSteering` to
its `AllCaps`.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In this PR, we make ExitNode.AllowOverride configurable as part of the Exit Node ADMX policy setting,
similarly to Always On w/ "Disconnect with reason" option.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates the output for "tailscale ping" to indicate if a peer relay was traversed, just like the output for DERP or direct connections.
Fixestailscale/corp#30034
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
To signal when a tailnet has the `traffic-steering` feature flag,
Control will send a `traffic-steering` NodeCapability in netmap’s
AllCaps.
This patch adds `tailcfg.NodeAttrTrafficSteering` so that it can be
used in the control plane. Future patches will implement the actual
steering mechanisms.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit modifies the k8s-operator and k8s-proxy to support passing down
the accept-routes configuration from the proxy class as a configuration value
read and used by the k8s-proxy when ran as a distinct container managed by
the operator.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the k8s proxy application configuration to include a
new field named `ServerURL` which, when set, modifies
the tailscale coordination server used by the proxy. This works in the same
way as the operator and the proxies it deploys.
If unset, the default coordination server is used.
Updates https://github.com/tailscale/tailscale/issues/13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the operator to detect the usage of k8s-apiserver
type proxy groups that wish to use the letsencrypt staging directory and
apply the appropriate environment variable to the statefulset it
produces.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
Errors were mashalled without the correct newlines. Also, they could
generally be mashalled with more data, so an intermediate was introduced
to make them slightly nicer to look at.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
If the NodeAttrDisableRelayClient node attribute is set, ensures that a node cannot allocate endpoints on a UDP relay server itself, and cannot use newly-discovered paths (via disco/CallMeMaybeVia) that traverse a UDP relay server.
Fixestailscale/corp#30180
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
If the GUI receives a new exit node ID before the new netmap, it may treat the node as offline or invalid
if the previous netmap didn't include the peer at all, or if the peer was offline or not advertised as an exit node.
This may result in briefly issuing and dismissing a warning, or a similar issue, which isn't ideal.
In this PR, we change the operation order to send the new netmap to clients first before selecting the new exit node
and notifying them of the Exit Node change.
Updates tailscale/corp#30252 (an old issue discovered during testing this)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We already check this for cases where ipn.Prefs.AutoExitNode is configured via syspolicy.
Configuring it directly through EditPrefs should behave the same, so we add a test for that as well.
Additionally, we clarify the implementation and future extensibility in (*LocalBackend).resolveAutoExitNodeLocked,
where the AutoExitNode is actually enforced.
Updates tailscale/corp#29969
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
If the specified exit node string starts with "auto:" (i.e., can be parsed as an ipn.ExitNodeExpression),
we update ipn.Prefs.AutoExitNode instead of ipn.Prefs.ExitNodeID.
Fixes#16459
Signed-off-by: Nick Khyl <nickk@tailscale.com>
So it can be used from the CLI without importing ipnlocal.
While there, also remove isAutoExitNodeID, a wrapper around parseAutoExitNodeID
that's no longer used.
Updates tailscale/corp#29969
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The observed generation was set to always 0 in #16429, but this had the
knock-on effect of other controllers considering ProxyGroups never ready
because the observed generation is never up to date in
proxyGroupCondition. Make sure the ProxyGroupAvailable function does not
requires the observed generation to be up to date, and add testing
coverage to catch regressions.
Updates #16327
Change-Id: I42f50ad47dd81cc2d3c3ce2cd7b252160bb58e40
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Adds a new k8s-proxy command to convert operator's in-process proxy to
a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy
reads in a new config file written by the operator, modelled on tailscaled's
conffile but with some modifications to ensure multiple versions of the
config can co-exist within a file. This should make it much easier to
support reading that config file from a Kube Secret with a stable file name.
To avoid needing to give the operator ClusterRole{,Binding} permissions,
the helm chart now optionally deploys a new static ServiceAccount for
the API Server proxy to use if in auth mode.
Proxies deployed by kube-apiserver ProxyGroups currently work the same as
the operator's in-process proxy. They do not yet leverage Tailscale Services
for presenting a single HA DNS name.
Updates #13358
Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Based on feedback that it wasn't clear what the user is meant to do with
the output of the last command, clarify that it's an optional command to
explore what got created.
Updates #13427
Change-Id: Iff64ec6d02dc04bf4bbebf415d7ed1a44e7dd658
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When running `tailscale exit-node list`, an empty city or country name
should be displayed as a hyphen "-". However, this only happened when
there was no location at all. If a node provides a Hostinfo.Location,
then the list would display exactly what was provided.
This patch changes the listing so that empty cities and countries will
either render the provided name or "-".
Fixes#16500
Signed-off-by: Simon Law <sfllaw@tailscale.com>
When the policy setting is enabled, it allows users to override the exit node enforced by the ExitNodeID
or ExitNodeIP policy. It's primarily intended for use when ExitNodeID is set to auto:any, but it can also
be used with specific exit nodes. It does not allow disabling exit node usage entirely.
Once the exit node policy is overridden, it will not be enforced again until the policy changes,
the user connects or disconnects Tailscale, switches profiles, or disables the override.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that applySysPolicy is only called by (*LocalBackend).reconcilePrefsLocked,
we can make it a method to avoid passing state via parameters and to support
future extensibility.
Also factor out exit node-specific logic into applyExitNodeSysPolicyLocked.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that resolveExitNodeInPrefsLocked is the only caller of setExitNodeID,
and setExitNodeID is the only caller of resolveExitNodeIP, we can restructure
the code with resolveExitNodeInPrefsLocked now calling both
resolveAutoExitNodeLocked and resolveExitNodeIPLocked directly.
This prepares for factoring out resolveAutoExitNodeLocked and related
auto-exit-node logic into an ipnext extension in a future commit.
While there, we also update exit node by IP lookup to use (*nodeBackend).NodeByAddr
and (*nodeBackend).NodeByID instead of iterating over all peers in the most recent netmap.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we start passing a LocalAPI actor to (*LocalBackend).Logout to make it subject
to the same access check as disconnects made via tailscale down or the GUI.
We then update the CLI to allow `tailscale logout` to accept a reason, similar to `tailscale down`.
Updates tailscale/corp#26249
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Since a [*lazyEndpoint] makes wireguard-go responsible for peer ID, but
wireguard-go may not yet be configured for said peer, we need a JIT hook
around initiation message reception to call what is usually called from
an [*endpoint].
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We can't relay a packet received over the IPv4 socket back out the same
socket if destined to an IPv6 address, and vice versa.
Updates tailscale/corp#30206
Signed-off-by: Jordan Whited <jordan@tailscale.com>