Commit Graph

549 Commits

Author SHA1 Message Date
Brad Fitzpatrick
e968b0ecd7 cmd/tailscale,controlclient,ipnlocal: fix 'up', deflake tests more
The CLI's "up" is kinda chaotic and LocalBackend.Start is kinda
chaotic and they both need to be redone/deleted (respectively), but
this fixes some buggy behavior meanwhile. We were previously calling
StartLoginInteractive (to start the controlclient's RegisterRequest)
redundantly in some cases, causing test flakes depending on timing and
up's weird state machine.

We only need to call StartLoginInteractive in the client if Start itself
doesn't. But Start doesn't tell us that. So cheat a bit and a put the
information about whether there's a current NodeKey in the ipn.Status.
It used to be accessible over LocalAPI via GetPrefs as a private key but
we removed that for security. But a bool is fine.

So then only call StartLoginInteractive if that bool is false and don't
do it in the WatchIPNBus loop.

Fixes #12028
Updates #12042

Change-Id: I0923c3f704a9d6afd825a858eb9a63ca7c1df294
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-05-07 22:34:45 -07:00
Maisem Ali
32bc596062 ipn/ipnlocal: acquire b.mu once in Start
We used to Lock, Unlock, Lock, Unlock quite a few
times in Start resulting in all sorts of weird race
conditions. Simplify it all and only Lock/Unlock once.

Updates #11649

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-05-07 20:29:59 -07:00
Brad Fitzpatrick
80df8ffb85 control/controlclient: early return and outdent some code
I found this too hard to read before.

This is pulled out of #12033 as it's unrelated cleanup in retrospect.

Updates #12028

Change-Id: I727c47e573217e3d1973c5b66a76748139cf79ee
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-05-07 11:02:55 -07:00
Maisem Ali
af97e7a793 tailcfg,all: add/plumb Node.IsJailed
This adds a new bool that can be sent down from control
to do jailing on the client side. Previously this would
only be done from control by modifying the packet filter
we sent down to clients. This would result in a lot of
additional work/CPU on control, we could instead just
do this on the client. This has always been a TODO which
we keep putting off, might as well do it now.

Updates tailscale/corp#19623

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2024-05-06 15:32:22 -07:00
Nick Khyl
f62e678df8 net/dns/resolver, control/controlknobs, tailcfg: use UserDial instead of SystemDial to dial DNS servers
Now that tsdial.Dialer.UserDial has been updated to honor the configured routes
and dial external network addresses without going through Tailscale, while also being
able to dial a node/subnet router on the tailnet, we can start using UserDial to forward
DNS requests. This is primarily needed for DNS over TCP when forwarding requests
to internal DNS servers, but we also update getKnownDoHClientForProvider to use it.

Updates tailscale/corp#18725

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2024-05-06 17:29:24 -05:00
Brad Fitzpatrick
e26f76a1c4 tstest/integration: add more debugging, logs to catch flaky test
Updates #11962

Change-Id: I1ab0db69bdf8d1d535aa2cef434c586311f0fe18
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-05-06 15:03:06 -07:00
Fran Bull
1bd1b387b2 appc: add flag shouldStoreRoutes and controlknob for it
When an app connector is reconfigured and domains to route are removed,
we would like to no longer advertise routes that were discovered for
those domains. In order to do this we plan to store which routes were
discovered for which domains.

Add a controlknob so that we can enable/disable the new behavior.

Updates #11008
Signed-off-by: Fran Bull <fran@tailscale.com>
2024-04-29 11:40:04 -07:00
Brad Fitzpatrick
6b95219e3a net/netmon, add: add netmon.State type alias of interfaces.State
... in prep for merging the net/interfaces package into net/netmon.

This is a no-op change that updates a bunch of the API signatures ahead of
a future change to actually move things (and remove the type alias)

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I477613388f09389214db0d77ccf24a65bff2199c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-28 07:34:52 -07:00
Brad Fitzpatrick
3672f29a4e net/netns, net/dns/resolver, etc: make netmon required in most places
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.

Some notable bits:

* tsdial.NewDialer is added, taking a now-required netmon

* because a tsdial.Dialer always has a netmon, anything taking both
  a Dialer and a NetMon is now redundant; take only the Dialer and
  get the NetMon from that if/when needed.

* netmon.NewStatic is added, primarily for tests

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-27 12:17:45 -07:00
Brad Fitzpatrick
745931415c health, all: remove health.Global, finish plumbing health.Tracker
Updates #11874
Updates #4136

Change-Id: I414470f71d90be9889d44c3afd53956d9f26cd61
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-26 12:03:11 -07:00
Brad Fitzpatrick
a4a282cd49 control/controlclient: plumb health.Tracker
Updates #11874
Updates #4136

Change-Id: Ia941153bd83523f0c8b56852010f5231d774d91a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-26 10:12:33 -07:00
Brad Fitzpatrick
723c775dbb tsd, ipnlocal, etc: add tsd.System.HealthTracker, start some plumbing
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.

In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.

The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).

Updates #11874
Updates #4136

Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 22:13:04 -07:00
Brad Fitzpatrick
ebc552d2e0 health: add Tracker type, in prep for removing global variables
This moves most of the health package global variables to a new
`health.Tracker` type.

But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.

A future change will eliminate that global.

Updates #11874
Updates #4136

Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 13:46:22 -07:00
Brad Fitzpatrick
5100bdeba7 types/persist: remove unused field Persist.Provider
It was only obviously unused after the previous change, c39cde79d.

Updates #19334

Change-Id: I9896d5fa692cb4346c070b4a339d0d12340c18f7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 10:48:25 -07:00
Brad Fitzpatrick
c39cde79d2 tailcfg: remove some unused fields from RegisterResponseAuth
Fixes #19334

Change-Id: Id6463f28af23078a7bc25b9280c99d4491bd9651
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 10:29:19 -07:00
Brad Fitzpatrick
05bfa022f2 tailcfg: pointerify RegisterRequest.Auth, omitemptify RegisterResponseAuth
We were storing server-side lots of:

    "Auth":{"Provider":"","LoginName":"","Oauth2Token":null,"AuthKey":""},

That was about 7% of our total storage of pending RegisterRequest
bodies.

Updates tailscale/corp#19327

Change-Id: Ib73842759a2b303ff5fe4c052a76baea0d68ae7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 07:10:43 -07:00
Brad Fitzpatrick
7c1d6e35a5 all: use Go 1.22 range-over-int
Updates #11058

Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-16 15:32:38 -07:00
Brad Fitzpatrick
2409661a0d control/controlclient: delete old naclbox code, require ts2021 Noise
Updates #11585
Updates tailscale/corp#18882

Change-Id: I90e2e4a211c58d429e2b128604614dde18986442
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-03 09:17:27 -07:00
James Tucker
9401b09028 control/controlclient: move client watchdog to cover initial request
The initial control client request can get stuck in the event that a
connection is established but then lost part way through, without any
ICMP or RST. Ensure that the control client will be restarted by timing
out that initial request as well.

Fixes #11542

Signed-off-by: James Tucker <james@tailscale.com>
2024-03-27 16:02:52 -07:00
Brad Fitzpatrick
7b34154df2 all: deprecate Node.Capabilities (more), remove PeerChange.Capabilities [capver 89]
First we had Capabilities []string. Then
https://tailscale.com/blog/acl-grants (#4217) brought CapMap, a
superset of Capabilities. Except we never really finished the
transition inside the codebase to go all-in on CapMap. This does so.

Notably, this coverts Capabilities on the wire early to CapMap
internally so the code can only deal in CapMap, even against an old
control server.

In the process, this removes PeerChange.Capabilities support, which no
known control plane sent anyway. They can and should use
PeerChange.CapMap instead.

Updates #11508
Updates #4217

Change-Id: I872074e226b873f9a578d9603897b831d50b25d9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-24 21:08:46 -07:00
Brad Fitzpatrick
b104688e04 ipn/ipnlocal, types/netmap: replace hasCapability with set lookup on NetworkMap
When node attributes were super rare, the O(n) slice scans looking for
node attributes was more acceptable. But now more code and more users
are using increasingly more node attributes. Time to make it a map.

Noticed while working on tailscale/corp#17879

Updates #cleanup

Change-Id: Ic17c80341f418421002fbceb47490729048756d2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-22 15:30:46 -07:00
Brad Fitzpatrick
f45594d2c9 control/controlclient: free memory on iOS before full netmap work
Updates tailscale/corp#18514

Change-Id: I8d0330334b030ed8692b25549a0ee887ac6d7188
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-22 09:02:19 -07:00
Brad Fitzpatrick
8444937c89 control/controlclient: fix panic regression from earlier load balancer hint header
In the recent 20e9f3369 we made HealthChangeRequest machine requests
include a NodeKey, as it was the oddball machine request that didn't
include one. Unfortunately, that code was sometimes being called (at
least in some of our integration tests) without a node key due to its
registration with health.RegisterWatcher(direct.ReportHealthChange).

Fortunately tests in corp caught this before we cut a release. It's
possible this only affects this particular integration test's
environment, but still worth fixing.

Updates tailscale/corp#1297

Change-Id: I84046779955105763dc1be5121c69fec3c138672
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-21 12:54:58 -07:00
Joe Tsai
85febda86d
all: use zstdframe where sensible (#11491)
Use the zstdframe package where sensible instead of plumbing
around our own zstd.Encoder just for stateless operations.

This causes logtail to have a dependency on zstd,
but that's arguably okay since zstd support is implicit
to the protocol between a client and the logging service.
Also, virtually every caller to logger.NewLogger was
manually setting up a zstd.Encoder anyways,
meaning that zstd was functionally always a dependency.

Updates #cleanup
Updates tailscale/corp#18514

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-03-21 12:20:38 -07:00
Adrian Dewhurst
2f7e7be2ea control/controlclient: do not alias peer CapMap
Updates #cleanup

Change-Id: I10fd5e04310cdd7894a3caa3045b86eb0a06b6a0
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2024-03-20 15:42:44 -04:00
Brad Fitzpatrick
20e9f3369d control/controlclient: send load balancing hint HTTP request header
Updates tailscale/corp#1297

Change-Id: I0b102081e81dfc1261f4b05521ab248a2e4a1298
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-20 09:39:41 -07:00
Claire Wang
221de01745
control/controlclient: fix sending peer capmap changes (#11457)
Instead of just checking if a peer capmap is nil, compare the previous
state peer capmap with the new peer capmap.
Updates tailscale/corp#17516

Signed-off-by: Claire Wang <claire@tailscale.com>
2024-03-19 18:56:06 -04:00
Brad Fitzpatrick
e1bd7488d0 all: remove LenIter, use Go 1.22 range-over-int instead
Updates #11058
Updates golang/go#65685

Change-Id: Ibb216b346e511d486271ab3d84e4546c521e4e22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-02-25 12:29:45 -08:00
Joe Tsai
94a4f701c2
all: use reflect.TypeFor now available in Go 1.22 (#11078)
Updates #cleanup

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-02-08 17:34:22 -08:00
Brad Fitzpatrick
2bd3c1474b util/cmpx: delete now that we're using Go 1.22
Updates #11058

Change-Id: I09dea8e86f03ec148b715efca339eab8b1f0f644
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-02-07 18:10:15 -08:00
Jordan Whited
8b47322acc
wgengine/magicsock: implement probing of UDP path lifetime (#10844)
This commit implements probing of UDP path lifetime on the tail end of
an active direct connection. Probing configuration has two parts -
Cliffs, which are various timeout cliffs of interest, and
CycleCanStartEvery, which limits how often a probing cycle can start,
per-endpoint. Initially a statically defined default configuration will
be used. The default configuration has cliffs of 10s, 30s, and 60s,
with a CycleCanStartEvery of 24h. Probing results are communicated via
clientmetric counters. Probing is off by default, and can be enabled
via control knob. Probing is purely informational and does not yet
drive any magicsock behaviors.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-01-23 09:37:32 -08:00
James Tucker
38a1cf748a control/controlclient,util/execqueue: extract execqueue into a package
This is a useful primitive for asynchronous execution of ordered work I
want to use in another change.

Updates tailscale/corp#16833
Signed-off-by: James Tucker <james@tailscale.com>
2024-01-18 12:08:13 -08:00
James 'zofrex' Sanderson
124dc10261
controlclient,tailcfg,types: expose MaxKeyDuration via localapi (#10401)
Updates tailscale/corp#16016

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2024-01-05 12:06:12 +00:00
James 'zofrex' Sanderson
10c595d962
ipn/ipnlocal: refresh node key without blocking if cap enabled (#10529)
Updates tailscale/corp#16016

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
2024-01-04 17:29:04 +00:00
Andrew Lytvynov
2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2023-12-21 17:40:03 -08:00
Andrew Lytvynov
945cf836ee
ipn: apply tailnet-wide default for auto-updates (#10508)
When auto-update setting in local Prefs is unset, apply the tailnet
default value from control. This only happens once, when we apply the
default (or when the user manually overrides it), tailnet default no
longer affects the node.

Updates #16244

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2023-12-18 14:57:03 -08:00
Naman Sood
0a59754eda linuxfw,wgengine/route,ipn: add c2n and nodeattrs to control linux netfilter
Updates tailscale/corp#14029.

Signed-off-by: Naman Sood <mail@nsood.in>
2023-12-05 14:22:02 -05:00
Matt Layher
a217f1fccf all: fix nilness issues
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2023-12-05 11:43:14 -05:00
Brad Fitzpatrick
fb829ea7f1 control/controlclient: support incremental packet filter updates [capver 81]
Updates #10299

Change-Id: I87e4235c668a1db7de7ef1abc743f0beecb86d3d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-11-17 10:07:12 -08:00
Jordan Whited
e848736927
control/controlknobs,wgengine/magicsock: implement SilentDisco toggle (#10195)
This change exposes SilentDisco as a control knob, and plumbs it down to
magicsock.endpoint. No changes are being made to magicsock.endpoint
disco behavior, yet.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-11-13 10:05:04 -08:00
Brad Fitzpatrick
44c6909c92 control/controlclient: move watchdog out of mapSession
In prep for making mapSession's lifetime not be 1:1 with a single HTTP
response's lifetime, this moves the inactivity timer watchdog out of
mapSession and into the caller that owns the streaming HTTP response.

(This is admittedly closer to how it was prior to the mapSession type
existing, but that was before we connected some dots which were
impossible to even see before the mapSession type broke the code up.)

Updates #7175

Change-Id: Ia108dac84a4953db41cbd30e73b1de4a2a676c11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-11-05 10:57:36 -08:00
Brad Fitzpatrick
c87d58063a control/controlclient: move lastPrintMap field from Direct to mapSession
It was a really a mutable field owned by mapSession that we didn't move
in earlier commits.

Once moved, it's then possible to de-func-ify the code and turn it into
a regular method rather than an installed optional hook.

Noticed while working to move map session lifetimes out of
Direct.sendMapRequest's single-HTTP-connection scope.

Updates #7175
Updates #cleanup

Change-Id: I6446b15793953d88d1cabf94b5943bb3ccac3ad9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-11-05 08:53:47 -08:00
Adrian Dewhurst
5347e6a292 control/controlclient: support certstore without cgo
We no longer build Windows releases with cgo enabled, which
automatically turned off certstore support. Rather than re-enabling cgo,
we updated our fork of the certstore package to no longer require cgo.
This updates the package, cleans up how the feature is configured, and
removes the cgo build tag requirement.

Fixes tailscale/corp#14797
Fixes tailscale/coral#118

Change-Id: Iaea34340761c0431d759370532c16a48c0913374
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2023-10-20 15:17:32 -04:00
Claire Wang
754fb9a8a8
tailcfg: add tailnet field to register request (#9675)
Updates tailscale/corp#10967

Signed-off-by: Claire Wang <claire@tailscale.com>
2023-10-13 14:13:41 -04:00
Maisem Ali
3655fb3ba0 control/controlclient: fix deadlock in shutdown
Fixes a deadlock observed in a different repo.
Regressed in 5b3f5eabb5.

Updates tailscale/corp#14950
Updates tailscale/corp#14515
Updates tailscale/corp#14139
Updates tailscale/corp#13175

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-10-03 18:02:55 -07:00
Brad Fitzpatrick
425cf9aa9d tailcfg, all: use []netip.AddrPort instead of []string for Endpoints
It's JSON wire compatible.

Updates #cleanup

Change-Id: Ifa5c17768fec35b305b06d75eb5f0611c8a135a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-10-01 18:23:02 -07:00
Brad Fitzpatrick
246e0ccdca tsnet: add a test for restarting a tsnet server, fix Windows
Thanks to @qur and @eric for debugging!

Fixes #6973

Change-Id: Ib2cf8f030cf595cc73dd061c72e78ac19f5fae5d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-29 09:05:45 -07:00
Brad Fitzpatrick
5b3f5eabb5 control/controlclient: fix leaked http2 goroutines on shutdown
If a noise dial was happening concurrently with shutdown, the
http2 goroutines could leak.

Updates tailscale/corp#14950
Updates tailscale/corp#14515
Updates tailscale/corp#14139
Updates tailscale/corp#13175

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
2023-09-28 11:16:46 -07:00
Claire Wang
e3d6236606
winutil: refactor methods to get values from registry to also return (#9536)
errors
Updates tailscale/corp#14879

Signed-off-by: Claire Wang <claire@tailscale.com>
2023-09-26 13:15:11 -04:00
Andrew Dunham
530aaa52f1 net/dns: retry forwarder requests over TCP
We weren't correctly retrying truncated requests to an upstream DNS
server with TCP. Instead, we'd return a truncated request to the user,
even if the user was querying us over TCP and thus able to handle a
large response.

Also, add an envknob and controlknob to allow users/us to disable this
behaviour if it turns out to be buggy ( DNS ).

Updates #9264

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifb04b563839a9614c0ba03e9c564e8924c1a2bfd
2023-09-25 16:42:07 -04:00