This commit adds a Grafana dashboard for monitoring Tailscale health, connectivity, and performance in Kubernetes environments. The dashboard provides visibility into subnet routers, health messages, and network traffic for Tailscale proxies deployed by the Kubernetes operator.
Signed-off-by: Raj Singh <raj@tailscale.com>
Nobody calls SetTailscaleInterfaceName yet, so this is a
no-op. I checked oss, android, and the macOS/iOS client. Nobody calls
this, or ever did.
But I want to in the future.
Updates #15408
Updates #9040
Change-Id: I05dfabe505174f9067b929e91c6e0d8bc42628d7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To let you easily run multiple tailscaled instances for development
and let you route CLI commands to the right one.
Updates #15145
Change-Id: I06b6a7bf024f341c204f30705b4c3068ac89b1a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed logs on one of my machines where it can't auto-update with
scary log spam about "failed to apply tailnet-wide default for
auto-updates".
This avoids trying to do the EditPrefs if we know it's just going to
fail anyway.
Updates #282
Change-Id: Ib7db3b122185faa70efe08b60ebd05a6094eed8c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Noticed while working on a dev tool that uses local.Client.
Updates #cleanup
Change-Id: I981efff74a5cac5f515755913668bd0508a4aa14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Switch from using the Comment field to a ts-scoped annotation for
tracking which operators are cooperating over ownership of a
VIPService.
Updates tailscale/corp#24795
Change-Id: I72d4a48685f85c0329aa068dc01a1a3c749017bf
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/k8s-operator,k8s-operator: allow using LE staging endpoint for Ingress
Allow optionally using the Let's Encrypt staging endpoint to issue
certs for Ingress/HA Ingress, so that it is easier to
experiment with initial Ingress setup without hitting rate limits.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
(*LocalBackend).setControlClientLocked() is called to both set and reset b.cc.
We shouldn't attempt to start the audit logger when b.cc is being reset (i.e., cc is nil).
However, it's fine to start the audit logger if b.cc implements auditlog.Transport, even if it's not a controlclient.Auto but a mock control client.
In this PR, we fix both issues and add an assertion that controlclient.Auto is an auditlog.Transport. This ensures a compile-time failure if controlclient.Auto ever stops being a valid transport due to future interface or implementation changes.
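A minimal sketch of the two checks described above, using stand-in types. The `Transport` method set, `mockControl`, and `shouldStartAuditLogger` are hypothetical and exist only for illustration; they are not the real auditlog or controlclient definitions.

```go
package main

import "fmt"

// Transport stands in for auditlog.Transport. The method here is hypothetical.
type Transport interface {
	SendAuditLog(msg string) error
}

// Auto stands in for controlclient.Auto.
type Auto struct{}

func (*Auto) SendAuditLog(msg string) error { return nil }

// mockControl stands in for a mock control client used in tests.
type mockControl struct{ sent []string }

func (m *mockControl) SendAuditLog(msg string) error {
	m.sent = append(m.sent, msg)
	return nil
}

// Compile-time assertion: the build fails if *Auto ever stops
// satisfying Transport.
var _ Transport = (*Auto)(nil)

// shouldStartAuditLogger reports whether the audit logger should be started
// for cc. It is false when cc is being reset (nil) and true for any value
// that implements Transport, whether or not it is an *Auto.
func shouldStartAuditLogger(cc any) bool {
	if cc == nil {
		return false
	}
	_, ok := cc.(Transport)
	return ok
}

func main() {
	fmt.Println(shouldStartAuditLogger(nil))
	fmt.Println(shouldStartAuditLogger(&Auto{}))
	fmt.Println(shouldStartAuditLogger(&mockControl{}))
	fmt.Println(shouldStartAuditLogger(struct{}{}))
}
```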
Updates tailscale/corp#26435
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Resetting LocalBackend's netmap without also unconfiguring wgengine to reset routes, DNS, and the killswitch
firewall rules may cause connectivity issues until a new netmap is received.
In some cases, such as when bootstrap DNS servers are inaccessible due to network restrictions or other reasons,
or if the control plane is experiencing issues, this can result in a complete loss of connectivity until the user disconnects
and reconnects to Tailscale.
As LocalBackend handles state resets in (*LocalBackend).resetForProfileChangeLockedOnEntry(), and this includes
resetting the netmap, resetting the current netmap in (*LocalBackend).Start() is not necessary.
Moreover, it's harmful if (*LocalBackend).Start() is called more than once for the same profile.
In this PR, we update resetForProfileChangeLockedOnEntry() to reset the packet filter and remove
the redundant resetting of the netmap and packet filter from Start(). We also update the state machine
tests and revise comments that became inaccurate due to previous test updates.
Updates tailscale/corp#27173
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds a portable way to do a raw LocalAPI request without worrying
about the Unix-vs-macOS-vs-Windows ways of hitting the LocalAPI server.
(It was already possible but tedious with 'tailscale debug local-creds')
Updates tailscale/corp#24690
Change-Id: I0828ca55edaedf0565c8db192c10f24bebb95f1b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If conffile is used to configure tailscaled, always update
currently advertised services from conffile, even if they
are empty in the conffile, to ensure that it is possible
to transition to a state where no services are advertised.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This makes the web server running inside tailscaled on 100.100.100.100:80 support requests with `Host: 100.100.100.100:80` and its IPv6 equivalent.
Prior to this commit, the web server replied to such requests with a redirect to the node's Tailscale IP:5252.
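As a rough illustration of the Host matching involved, the sketch below accepts the service IP with or without port 80. The helper name `isServiceHost` is hypothetical, and the use of fd7a:115c:a1e0::53 as the IPv6 equivalent and the restriction to port 80 are assumptions, not taken from this commit.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"strings"
)

// serviceIPs are the addresses the in-tailscaled web server is assumed to
// answer on. The IPv6 value is an assumption for this sketch.
var serviceIPs = []netip.Addr{
	netip.MustParseAddr("100.100.100.100"),
	netip.MustParseAddr("fd7a:115c:a1e0::53"),
}

// isServiceHost reports whether a Host header value names one of the
// service IPs, either bare or with an explicit port 80.
func isServiceHost(hostHeader string) bool {
	host, port, err := net.SplitHostPort(hostHeader)
	if err != nil {
		// No port present: use the whole value as the host, dropping the
		// brackets of a bare IPv6 literal such as "[fd7a:115c:a1e0::53]".
		host, port = strings.Trim(hostHeader, "[]"), ""
	}
	if port != "" && port != "80" {
		return false
	}
	ip, err := netip.ParseAddr(host)
	if err != nil {
		return false
	}
	for _, s := range serviceIPs {
		if ip == s {
			return true
		}
	}
	return false
}

func main() {
	for _, h := range []string{
		"100.100.100.100",
		"100.100.100.100:80",
		"[fd7a:115c:a1e0::53]:80",
		"100.64.0.1:5252",
	} {
		fmt.Println(h, isServiceHost(h))
	}
}
```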
Fixes https://github.com/tailscale/tailscale/issues/14415
Signed-off-by: Alex Klyubin <klyubin@gmail.com>
There was a flaky failure case where renaming a TLS hostname for an
ingress might leave the old hostname dangling in tailscaled config. This
happened when the proxygroup reconciler loop had an outdated resource
version of the config Secret in its cache after the
ingress-pg-reconciler loop had very recently written it to delete the
old hostname. As the proxygroup reconciler then did a patch, there was
no conflict and it reinstated the old hostname.
This commit updates the patch to an update operation so that if the
resource version is out of date it will fail with an optimistic lock
error. It also checks for equality to reduce the likelihood that we make
the update API call in the first place, because most of the time the
proxygroup reconciler is not even making an update to the Secret in the
case that the hostname has changed.
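The race and the fix can be sketched with a toy in-memory API server. All names below (`apiServer`, `secret`, `writeConfig`, the config strings) are hypothetical and only model the resource-version behaviour; this is not the operator's code or the Kubernetes client API.

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("conflict: resource version is out of date")

// secret models a config Secret with a resource version.
type secret struct {
	resourceVersion int
	config          string
}

// apiServer is a toy stand-in for the Kubernetes API server.
type apiServer struct{ cur secret }

func (a *apiServer) get() secret { return a.cur }

// patch writes unconditionally, so a writer holding a stale copy can
// silently overwrite newer data.
func (a *apiServer) patch(config string) {
	a.cur = secret{a.cur.resourceVersion + 1, config}
}

// update writes only if the caller's resource version is current.
func (a *apiServer) update(s secret) error {
	if s.resourceVersion != a.cur.resourceVersion {
		return errConflict
	}
	a.cur = secret{a.cur.resourceVersion + 1, s.config}
	return nil
}

// writeConfig skips the API call when nothing changed and otherwise uses
// update, so a stale cache surfaces as a conflict instead of a lost write.
func writeConfig(a *apiServer, cached secret, desired string) error {
	if cached.config == desired {
		return nil
	}
	cached.config = desired
	return a.update(cached)
}

func main() {
	a := &apiServer{cur: secret{resourceVersion: 1, config: "hosts=old"}}
	stale := a.get() // one reconciler caches the Secret here

	// Another reconciler removes the old hostname.
	if err := writeConfig(a, a.get(), "hosts=new"); err != nil {
		panic(err)
	}

	// The first reconciler now writes from its stale copy.
	err := writeConfig(a, stale, "hosts=old;debug=true")
	fmt.Println(errors.Is(err, errConflict), a.cur.config)
}
```

With `patch` in place of `update`, the second write would succeed and the old hostname would reappear in the stored config.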
Updates tailscale/corp#24795
Change-Id: Ie23a97440063976c9a8475d24ab18253e1f89050
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
updates tailscale/corp#27145
We require a means to trigger a recompilation of the DNS configuration
to pick up new nameservers for platforms where we blend the interface
nameservers from the OS into our DNS config.
Notably, on Darwin, the only API we have at our disposal will, in rare instances,
return a transient error when querying the interface nameservers on a link change if
they have not been set when we get the AF_ROUTE messages for the link
update.
There's a corresponding change in corp for Darwin clients, to track
the interface nameservers during NEPathMonitor events, and call this
when the nameservers change.
This will also fix the slightly more obscure bug of changing nameservers
while tailscaled is running. That change can now be reflected in
magicDNS without having to stop the client.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
cmd/k8s-operator: configure HA Ingress replicas to share certs
Creates TLS certs Secret and RBAC that allows HA Ingress replicas
to read/write to the Secret.
Configures HA Ingress replicas to run in read-only mode.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Update the HA Ingress controller to wait until it sees AdvertisedServices
config propagated into at least 1 Pod's prefs before it updates the status
on the Ingress, to ensure the ProxyGroup Pods are ready to serve traffic
before indicating that the Ingress is ready.
Updates tailscale/corp#24795
Change-Id: I1b8ce23c9e312d08f9d02e48d70bdebd9e1a4757
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The use of html/template causes reflect-based linker bloat. Longer
term we have options to bring the UI back to iOS, but for now, cut
it out.
Updates #15297
Signed-off-by: David Anderson <dave@tailscale.com>
Allows the use of tsweb without pulling in all of the heavy prometheus
client libraries, protobuf and so on.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
This PR adds some custom logic for reading and writing
kube store values that are TLS certs and keys:
1) when store is initialized, lookup additional
TLS Secrets for this node and if found, load TLS certs
from there
2) if the node runs in certs 'read only' mode and
TLS cert and key are not found in the in-memory store,
look those up in a Secret
3) if the node runs in certs 'read only' mode, run
a daily TLS certs reload to memory to get any
renewed certs
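The three steps above can be sketched as follows. The `kubeStore` type, the `secretLookup` callback, and the key names are hypothetical stand-ins for the real store and for Kubernetes Secret reads; the daily reload is represented by calling `reloadCerts` directly rather than from a timer.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("state not found")

// secretLookup is a hypothetical stand-in for reading a TLS cert or key
// from a Kubernetes Secret.
type secretLookup func(key string) ([]byte, bool)

type kubeStore struct {
	certsReadOnly bool
	mem           map[string][]byte
	lookup        secretLookup
	tlsKeys       []string
}

// newKubeStore performs step 1: load any TLS data already present in
// Secrets when the store is initialized.
func newKubeStore(readOnly bool, lookup secretLookup, tlsKeys ...string) *kubeStore {
	s := &kubeStore{
		certsReadOnly: readOnly,
		mem:           map[string][]byte{},
		lookup:        lookup,
		tlsKeys:       tlsKeys,
	}
	s.reloadCerts()
	return s
}

// reloadCerts is step 3: refresh the in-memory copies so renewed certs are
// picked up. A real implementation would run this once a day.
func (s *kubeStore) reloadCerts() {
	for _, k := range s.tlsKeys {
		if v, ok := s.lookup(k); ok {
			s.mem[k] = v
		}
	}
}

// read is step 2: in read-only mode, fall back to the Secret when the
// value is missing from memory.
func (s *kubeStore) read(key string) ([]byte, error) {
	if v, ok := s.mem[key]; ok {
		return v, nil
	}
	if s.certsReadOnly {
		if v, ok := s.lookup(key); ok {
			s.mem[key] = v
			return v, nil
		}
	}
	return nil, errNotFound
}

func main() {
	secrets := map[string][]byte{"example.ts.net.crt": []byte("cert-v1")}
	lookup := func(k string) ([]byte, bool) { v, ok := secrets[k]; return v, ok }

	s := newKubeStore(true, lookup, "example.ts.net.crt")
	v, _ := s.read("example.ts.net.crt")
	fmt.Printf("%s\n", v)

	secrets["example.ts.net.crt"] = []byte("cert-v2") // cert renewed elsewhere
	s.reloadCerts()
	v, _ = s.read("example.ts.net.crt")
	fmt.Printf("%s\n", v)

	_, err := s.read("missing.key")
	fmt.Println(errors.Is(err, errNotFound))
}
```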
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
When the Ingress is updated to a new hostname, the controller does not
currently clean up the old VIPService from control. Fix this up to parse
the ownership comment correctly and write a test to enforce the improved
behaviour.
Updates tailscale/corp#24795
Change-Id: I792ae7684807d254bf2d3cc7aa54aa04a582d1f5
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This adds support for using ACL Grants to configure a role for the
auto-provisioned user.
Fixes tailscale/corp#14567
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
cmd/containerboot: manage HA Ingress TLS certs from containerboot
When run as an HA Ingress node, containerboot can now determine
whether it should manage TLS certs for the HA Ingress replicas
and call the LocalAPI cert endpoint to ensure initial issuance
and renewal of the shared TLS certs.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Shovel small events through the pipeline as fast as possible in a few basic
configurations, to establish some baseline performance numbers.
Updates #15160
Change-Id: I1dcbbd1109abb7b93aa4dcb70da57f183eb0e60e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
ipn/ipnlocal,envknob: add some primitives for HA replica cert share
Add an envknob for configuring
an instance's cert store as read-only, so that it
does not attempt to issue or renew TLS credentials,
only reads them from its cert store.
This will be used by the Kubernetes Operator's HA Ingress
to enable multiple replicas serving the same HTTPS endpoint
to be able to share the same cert.
Also some minor refactor to allow adding more tests
for cert retrieval logic.
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Allow customizing the title on the debug index page. Also add methods
for registering http.HandlerFunc to make it a little easier on callers.
Updates tailscale/corp#27058
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
The demo program generates a stream of made up bus events between
a number of bus actors, as a way to generate some interesting activity
to show on the bus debug page.
Signed-off-by: David Anderson <dave@tailscale.com>
This adds a new helper to the netmon package that allows us to
rate-limit log messages, so that they only print once per (major)
LinkChange event. We then use this when constructing the portmapper, so
that we don't keep spamming logs forever on the same network.
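A minimal sketch of the idea, assuming messages are deduplicated by format string and the limiter is reset explicitly on a link change. The `linkLimiter` type and its methods are hypothetical and are not the helper added to the netmon package.

```go
package main

import (
	"fmt"
	"sync"
)

// linkLimiter lets each distinct format string through once, then drops
// repeats until LinkChanged is called.
type linkLimiter struct {
	mu   sync.Mutex
	seen map[string]bool
	logf func(format string, args ...any)
}

func newLinkLimiter(logf func(format string, args ...any)) *linkLimiter {
	return &linkLimiter{seen: map[string]bool{}, logf: logf}
}

// Logf logs the message unless the same format was already logged since
// the last link change.
func (l *linkLimiter) Logf(format string, args ...any) {
	l.mu.Lock()
	dup := l.seen[format]
	l.seen[format] = true
	l.mu.Unlock()
	if !dup {
		l.logf(format, args...)
	}
}

// LinkChanged forgets what was logged, so messages print again on the
// new network.
func (l *linkLimiter) LinkChanged() {
	l.mu.Lock()
	l.seen = map[string]bool{}
	l.mu.Unlock()
}

func main() {
	var lines []string
	lim := newLinkLimiter(func(format string, args ...any) {
		lines = append(lines, fmt.Sprintf(format, args...))
	})
	for i := 0; i < 3; i++ {
		lim.Logf("portmapper: probe failed: %v", "timeout")
	}
	lim.LinkChanged()
	lim.Logf("portmapper: probe failed: %v", "timeout")
	fmt.Println(len(lines))
}
```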
Updates #13145
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6e7162509148abea674f96efd76be9dffb373ae4
updates tailscale/corp#26435
Adds client support for sending audit logs to control via /machine/audit-log.
Specifically implements audit logging for user initiated disconnections.
This will require further work to optimize the persistent storage and exclusion
via build tags for mobile:
tailscale/corp#27011
tailscale/corp#27012
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Ensure that the src address for a connection is one of the primary
addresses assigned by Tailscale. Not, for example, a virtual IP address.
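One way to express such a check, assuming the primary addresses are available as single-IP prefixes. The function names and the example addresses are hypothetical and not taken from this change.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isPrimaryAddr reports whether src is exactly one of the node's own
// single-IP addresses. Wider prefixes and any other address, such as a
// virtual IP, are rejected.
func isPrimaryAddr(src netip.Addr, nodeAddrs []netip.Prefix) bool {
	for _, p := range nodeAddrs {
		if p.IsSingleIP() && p.Addr() == src {
			return true
		}
	}
	return false
}

// check is a string-based convenience wrapper for the demo below. It
// panics on malformed input.
func check(src string, nodeAddrs ...string) bool {
	var pfxs []netip.Prefix
	for _, a := range nodeAddrs {
		pfxs = append(pfxs, netip.MustParsePrefix(a))
	}
	return isPrimaryAddr(netip.MustParseAddr(src), pfxs)
}

func main() {
	addrs := []string{"100.64.0.7/32", "fd7a:115c:a1e0::7/128"}
	fmt.Println(check("100.64.0.7", addrs...))   // a primary address
	fmt.Println(check("100.100.50.1", addrs...)) // some other address
}
```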
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
Add support for Cross-Origin XHR requests to the openid-configuration
endpoint to enable clients like Grafana's auto-population of OIDC setup
data from its contents.
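For illustration, a handler along these lines would let a browser-based client fetch the discovery document cross-origin. The wildcard origin, the exact header set, the issuer URL, and the `probe` helper are assumptions made for this sketch and may differ from what the change actually sends.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// serveOpenIDConfig serves a placeholder discovery document with CORS
// headers and answers preflight requests.
func serveOpenIDConfig(w http.ResponseWriter, r *http.Request) {
	h := w.Header()
	h.Set("Access-Control-Allow-Origin", "*")
	h.Set("Access-Control-Allow-Methods", "GET, OPTIONS")
	h.Set("Access-Control-Allow-Headers", "*")
	if r.Method == http.MethodOptions {
		// Preflight: headers only, no body.
		w.WriteHeader(http.StatusNoContent)
		return
	}
	h.Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{
		"issuer": "https://idp.example.ts.net", // placeholder value
	})
}

// probe sends one in-process request and returns the status code and the
// Access-Control-Allow-Origin header of the response.
func probe(method string) (status int, allowOrigin string) {
	req := httptest.NewRequest(method, "/.well-known/openid-configuration", nil)
	req.Header.Set("Origin", "https://grafana.example.com")
	rec := httptest.NewRecorder()
	serveOpenIDConfig(rec, req)
	return rec.Code, rec.Header().Get("Access-Control-Allow-Origin")
}

func main() {
	for _, m := range []string{http.MethodOptions, http.MethodGet} {
		code, origin := probe(m)
		fmt.Println(m, code, origin)
	}
}
```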
Updates https://github.com/tailscale/tailscale/issues/10263
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
fixes tailscale/tailscale#15269
Fixes the CLIs for all of the various flavors of tailscaled on
darwin. The logic in version is updated so that we have methods that
return true only for the actual GUI app (which can be a CLI), and the
order of the checks in localTCPPortAndTokenDarwin is corrected so
that the logic works with all 5 combinations of CLI and tailscaled.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
PR #14771 added support for getting certs from alternate ACME servers, but the
certStore caching mechanism breaks unless you install the CA in system roots,
because we check the validity of the cert before allowing a cache hit, which
includes checking for a valid chain back to a trusted CA. For ease of testing,
allow cert cache hits when the chain is unknown to avoid re-issuing the cert
on every TLS request served. We will still get a cache miss when the cert has
expired, as enforced by a test, and this makes it much easier to test against
non-prod ACME servers compared to having to manage the installation of non-prod
CAs on clients.
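The cache-hit rule can be sketched with crypto/x509: verification against an empty root pool fails with an unknown-authority error for a cert that is still in date, and with an expiry error for one that is not. The function names and the self-signed test certificates are hypothetical; this is not the certStore code.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// usableFromCache reports whether a cached leaf may be served. A chain to
// an unknown CA is tolerated; any other failure, including expiry, is a
// cache miss.
func usableFromCache(leaf *x509.Certificate, roots *x509.CertPool, now time.Time) bool {
	_, err := leaf.Verify(x509.VerifyOptions{Roots: roots, CurrentTime: now})
	if err == nil {
		return true
	}
	var unknownCA x509.UnknownAuthorityError
	return errors.As(err, &unknownCA)
}

// selfSigned creates a throwaway certificate that no root pool trusts.
func selfSigned(notBefore, notAfter time.Time) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "node.example.ts.net"},
		DNSNames:     []string{"node.example.ts.net"},
		NotBefore:    notBefore,
		NotAfter:     notAfter,
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// demo checks one in-date and one expired cert against an empty root pool.
func demo() (fresh, expired bool, err error) {
	now := time.Now()
	emptyRoots := x509.NewCertPool()
	good, err := selfSigned(now.Add(-time.Hour), now.Add(time.Hour))
	if err != nil {
		return false, false, err
	}
	old, err := selfSigned(now.Add(-48*time.Hour), now.Add(-24*time.Hour))
	if err != nil {
		return false, false, err
	}
	return usableFromCache(good, emptyRoots, now), usableFromCache(old, emptyRoots, now), nil
}

func main() {
	fresh, expired, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("in-date cert, unknown CA:", fresh)
	fmt.Println("expired cert, unknown CA:", expired)
}
```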
Updates #14771
Change-Id: I74fe6593fe399bd135cc822195155e99985ec08a
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
And don't return a comma-separated string. That's kinda weird
signature-wise, and not needed by half the callers anyway. The callers
that care can do the join themselves.
Updates #cleanup
Change-Id: Ib5ad51a3c6b663d868eba14fe9dc54b2609cfb0d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
natc itself can't immediately fix the problem, but it can more correctly
return an error than return bad addresses.
Updates tailscale/corp#26968
Signed-off-by: James Tucker <james@tailscale.com>