Commit Graph

292 Commits

Author SHA1 Message Date
Brad Fitzpatrick
745931415c health, all: remove health.Global, finish plumbing health.Tracker
Updates #11874
Updates #4136

Change-Id: I414470f71d90be9889d44c3afd53956d9f26cd61
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-26 12:03:11 -07:00
Brad Fitzpatrick
a4a282cd49 control/controlclient: plumb health.Tracker
Updates #11874
Updates #4136

Change-Id: Ia941153bd83523f0c8b56852010f5231d774d91a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-26 10:12:33 -07:00
Brad Fitzpatrick
723c775dbb tsd, ipnlocal, etc: add tsd.System.HealthTracker, start some plumbing
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.

In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.

The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).

Updates #11874
Updates #4136

Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 22:13:04 -07:00
Brad Fitzpatrick
ebc552d2e0 health: add Tracker type, in prep for removing global variables
This moves most of the health package global variables to a new
`health.Tracker` type.

But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.

A future change will eliminate that global.

Updates #11874
Updates #4136

Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-25 13:46:22 -07:00
Brad Fitzpatrick
5100bdeba7 types/persist: remove unused field Persist.Provider
It was only obviously unused after the previous change, c39cde79d.

Updates #19334

Change-Id: I9896d5fa692cb4346c070b4a339d0d12340c18f7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 10:48:25 -07:00
Brad Fitzpatrick
c39cde79d2 tailcfg: remove some unused fields from RegisterResponseAuth
Fixes #19334

Change-Id: Id6463f28af23078a7bc25b9280c99d4491bd9651
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 10:29:19 -07:00
Brad Fitzpatrick
05bfa022f2 tailcfg: pointerify RegisterRequest.Auth, omitemptify RegisterResponseAuth
We were storing server-side lots of:

    "Auth":{"Provider":"","LoginName":"","Oauth2Token":null,"AuthKey":""},

That was about 7% of our total storage of pending RegisterRequest
bodies.

Updates tailscale/corp#19327

Change-Id: Ib73842759a2b303ff5fe4c052a76baea0d68ae7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-21 07:10:43 -07:00
Brad Fitzpatrick
2409661a0d control/controlclient: delete old naclbox code, require ts2021 Noise
Updates #11585
Updates tailscale/corp#18882

Change-Id: I90e2e4a211c58d429e2b128604614dde18986442
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-04-03 09:17:27 -07:00
James Tucker
9401b09028 control/controlclient: move client watchdog to cover initial request
The initial control client request can get stuck in the event that a
connection is established but then lost part way through, without any
ICMP or RST. Ensure that the control client will be restarted by timing
out that initial request as well.

Fixes #11542

Signed-off-by: James Tucker <james@tailscale.com>
2024-03-27 16:02:52 -07:00
Brad Fitzpatrick
8444937c89 control/controlclient: fix panic regression from earlier load balancer hint header
In the recent 20e9f3369 we made HealthChangeRequest machine requests
include a NodeKey, as it was the oddball machine request that didn't
include one. Unfortunately, that code was sometimes being called (at
least in some of our integration tests) without a node key due to its
registration with health.RegisterWatcher(direct.ReportHealthChange).

Fortunately tests in corp caught this before we cut a release. It's
possible this only affects this particular integration test's
environment, but still worth fixing.

Updates tailscale/corp#1297

Change-Id: I84046779955105763dc1be5121c69fec3c138672
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-21 12:54:58 -07:00
Joe Tsai
85febda86d
all: use zstdframe where sensible (#11491)
Use the zstdframe package where sensible instead of plumbing
around our own zstd.Encoder just for stateless operations.

This causes logtail to have a dependency on zstd,
but that's arguably okay since zstd support is implicit
to the protocol between a client and the logging service.
Also, virtually every caller to logger.NewLogger was
manually setting up a zstd.Encoder anyways,
meaning that zstd was functionally always a dependency.

Updates #cleanup
Updates tailscale/corp#18514

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2024-03-21 12:20:38 -07:00
Brad Fitzpatrick
20e9f3369d control/controlclient: send load balancing hint HTTP request header
Updates tailscale/corp#1297

Change-Id: I0b102081e81dfc1261f4b05521ab248a2e4a1298
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2024-03-20 09:39:41 -07:00
James 'zofrex' Sanderson
124dc10261
controlclient,tailcfg,types: expose MaxKeyDuration via localapi (#10401)
Updates tailscale/corp#16016

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2024-01-05 12:06:12 +00:00
Andrew Lytvynov
945cf836ee
ipn: apply tailnet-wide default for auto-updates (#10508)
When auto-update setting in local Prefs is unset, apply the tailnet
default value from control. This only happens once, when we apply the
default (or when the user manually overrides it), tailnet default no
longer affects the node.

Updates #16244

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2023-12-18 14:57:03 -08:00
Matt Layher
a217f1fccf all: fix nilness issues
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2023-12-05 11:43:14 -05:00
Brad Fitzpatrick
44c6909c92 control/controlclient: move watchdog out of mapSession
In prep for making mapSession's lifetime not be 1:1 with a single HTTP
response's lifetime, this moves the inactivity timer watchdog out of
mapSession and into the caller that owns the streaming HTTP response.

(This is admittedly closer to how it was prior to the mapSession type
existing, but that was before we connected some dots which were
impossible to even see before the mapSession type broke the code up.)

Updates #7175

Change-Id: Ia108dac84a4953db41cbd30e73b1de4a2a676c11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-11-05 10:57:36 -08:00
Brad Fitzpatrick
c87d58063a control/controlclient: move lastPrintMap field from Direct to mapSession
It was a really a mutable field owned by mapSession that we didn't move
in earlier commits.

Once moved, it's then possible to de-func-ify the code and turn it into
a regular method rather than an installed optional hook.

Noticed while working to move map session lifetimes out of
Direct.sendMapRequest's single-HTTP-connection scope.

Updates #7175
Updates #cleanup

Change-Id: I6446b15793953d88d1cabf94b5943bb3ccac3ad9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-11-05 08:53:47 -08:00
Claire Wang
754fb9a8a8
tailcfg: add tailnet field to register request (#9675)
Updates tailscale/corp#10967

Signed-off-by: Claire Wang <claire@tailscale.com>
2023-10-13 14:13:41 -04:00
Brad Fitzpatrick
425cf9aa9d tailcfg, all: use []netip.AddrPort instead of []string for Endpoints
It's JSON wire compatible.

Updates #cleanup

Change-Id: Ifa5c17768fec35b305b06d75eb5f0611c8a135a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-10-01 18:23:02 -07:00
Brad Fitzpatrick
3b32d6c679 wgengine/magicsock, controlclient, net/dns: reduce some logspam
Updates #cleanup

Change-Id: I78b0697a01e94baa33f3de474b591e616fa5e6af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-23 11:52:47 -07:00
Brad Fitzpatrick
9203916a4a control/controlknobs: move more controlknobs code from controlclient
Updates #cleanup

Change-Id: I2b8b6ac97589270f307bfb20e33674894ce873b5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-12 12:44:35 -07:00
Brad Fitzpatrick
3af051ea27 control/controlclient, types/netmap: start plumbing delta netmap updates
Currently only the top four most popular changes: endpoints, DERP
home, online, and LastSeen.

Updates #1909

Change-Id: I03152da176b2b95232b56acabfb55dcdfaa16b79
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-12 12:23:24 -07:00
Brad Fitzpatrick
42072683d6 control/controlknobs: move ForceBackgroundSTUN to controlknobs.Knobs
This is both more efficient (because the knobs' bool is only updated
whenever Node is changed, rarely) and also gets us one step closer to
removing a case of storing a netmap.NetworkMap in
magicsock. (eventually we want to phase out much of the use of that
type internally)

Updates #1909

Change-Id: I37e81789f94133175064fdc09984e4f3a431f1a1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-11 18:11:09 -07:00
Brad Fitzpatrick
4e91cf20a8 control/controlknobs, all: add plumbed Knobs type, not global variables
Previously two tsnet nodes in the same process couldn't have disjoint
sets of controlknob settings from control as both would overwrite each
other's global variables.

This plumbs a new controlknobs.Knobs type around everywhere and hangs
the knobs sent by control on that instead.

Updates #9351

Change-Id: I75338646d36813ed971b4ffad6f9a8b41ec91560
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-11 12:44:03 -07:00
Brad Fitzpatrick
9a86aa5732 all: depend on zstd unconditionally, remove plumbing to make it optional
All platforms use it at this point, including iOS which was the
original hold out for memory reasons. No more reason to make it
optional.

Updates #9332

Change-Id: I743fbc2f370921a852fbcebf4eb9821e2bdd3086
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-09-10 08:36:05 -07:00
Maisem Ali
d06a75dcd0 ipn/ipnlocal: fix deadlock in resetControlClientLocked
resetControlClientLocked is called while b.mu was held and
would call cc.Shutdown which would wait for the observer queue
to drain.
However, there may be active callbacks from cc already waiting for
b.mu resulting in a deadlock.

This makes it so that resetControlClientLocked does not call
Shutdown, and instead just returns the value.
It also makes it so that any status received from previous cc
are ignored.

Updates tailscale/corp#12827

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-09-02 13:47:32 -07:00
Brad Fitzpatrick
f7b7ccf835 control/controlclient, ipn/ipnlocal: unplumb a bool true literal opt
Updates #cleanup

Change-Id: I664f280a2e06b9875942458afcaf6be42a5e462a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-31 22:34:21 -07:00
Brad Fitzpatrick
313a129fe5 control/controlclient: use slices package more
Updates #cleanup

Change-Id: Ic17384266dc59bc4e710efdda311d6e0719529da
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-30 13:45:20 -07:00
Brad Fitzpatrick
55bb7314f2 control/controlclient: replace a status func with Observer interface
For now the method has only one interface (the same as the func it's
replacing) but it will grow, eventually with the goal to remove the
controlclient.Status type for most purposes.

Updates #1909

Change-Id: I715c8bf95e3f5943055a94e76af98d988558a2f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-28 21:07:04 -07:00
Brad Fitzpatrick
84b94b3146 types/netmap, all: make NetworkMap.SelfNode a tailcfg.NodeView
Updates #1909

Change-Id: I8c470cbc147129a652c1d58eac9b790691b87606
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-21 13:34:49 -07:00
Brad Fitzpatrick
b5ff68a968 control/controlclient: flesh out mapSession to break up gigantic method
Now mapSession has a bunch more fields and methods, rather than being
just one massive func with a ton of local variables.

So far there are no major new optimizations, though. It should behave
the same as before.

This has been done with an eye towards testability (so tests can set
all the callback funcs as needed, or not, without a huge Direct client
or long-running HTTP requests), but this change doesn't add new tests
yet. That will follow in the changes which flesh out the NetmapUpdater
interface.

Updates #1909

Change-Id: Iad4e7442d5bbbe2614bd4b1dc4b02e27504898df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-21 10:38:32 -07:00
Brad Fitzpatrick
165f0116f1 types/netmap: move some mutations earlier, remove, document some fields
And optimize the Persist setting a bit, allocating later and only mutating
fields when there's been a Node change.

Updates #1909

Change-Id: Iaddfd9e88ef76e1d18e8d0a41926eb44d0955312
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-20 16:26:11 -07:00
Brad Fitzpatrick
21170fb175 control/controlclient: scope a variable tighter, de-pointer a *time.Time
Just misc cleanups.

Updates #1909

Change-Id: I9d64cb6c46d634eb5fdf725c13a6c5e514e02e9a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-20 15:06:24 -07:00
Brad Fitzpatrick
7dec09d169 control/controlclient: remove Opts.KeepSharerAndUserSplit
It was added 2.5 years ago in c1dabd9436 but was never used.
Clearly that migration didn't matter.

We can attempt this again later if/when this matters.

Meanwhile this simplifies the code and thus makes working on other
current efforts in these parts of the code easier.

Updates #1909

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-19 15:06:05 -07:00
Brad Fitzpatrick
121d1d002c tailcfg: add nodeAttrs for forcing OneCGNAT on/off [capver 71]
Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-17 13:32:12 -07:00
Brad Fitzpatrick
25663b1307 tailcfg: remove most Debug fields, move bulk to nodeAttrs [capver 70]
Now a nodeAttr: ForceBackgroundSTUN, DERPRoute, TrimWGConfig,
DisableSubnetsIfPAC, DisableUPnP.

Kept support for, but also now a NodeAttr: RandomizeClientPort.

Removed: SetForceBackgroundSTUN, SetRandomizeClientPort (both never
used, sadly... never got around to them. But nodeAttrs are better
anyway), EnableSilentDisco (will be a nodeAttr later when that effort
resumes).

Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-17 10:52:47 -07:00
Brad Fitzpatrick
239ad57446 tailcfg: move LogHeapPprof from Debug to c2n [capver 69]
And delete Debug.GoroutineDumpURL, which was already in c2n.

Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-16 20:35:04 -07:00
Brad Fitzpatrick
2398993804 control/controlclient: refactor in prep for optimized delta handling
See issue. This is a baby step towards passing through deltas
end-to-end from node to control back to node and down to the various
engine subsystems, not computing diffs from two full netmaps at
various levels. This will then let us support larger netmaps without
burning CPU.

But this change itself changes no behavior. It just changes a func
type to an interface with one method. That paves the way for future
changes to then add new NetmapUpdater methods that do more
fine-grained work than updating the whole world.

Updates #1909

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-08-12 16:09:58 -07:00
Maisem Ali
d16946854f control/controlclient: add Auto.updateRoutine
Instead of having updates replace the map polls, create
a third goroutine which is solely responsible for making
sure that control is aware of the latest client state.

This also makes it so that the streaming map polls are only
broken when there are auth changes, or the client is paused.

Updates tailscale/corp#5761

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-08-10 12:23:08 -07:00
Maisem Ali
734928d3cb control/controlclient: make Direct own all changes to Persist
It was being modified in two places in Direct for the auth routine
and then in LocalBackend when a new NetMap was received. This was
confusing, so make Direct also own changes to Persist when a new
NetMap is received.

Updates #7726

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-08-08 13:43:37 -06:00
Maisem Ali
6aaf1d48df types/persist: drop duplicated Persist.LoginName
It was duplicated from Persist.UserProfile.LoginName, drop it.

Updates #7726

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-08-08 13:43:37 -06:00
salman aljammaz
25a7204bb4
wgengine,ipn,cmd/tailscale: add size option to ping (#8739)
This adds the capability to pad disco ping message payloads to reach a
specified size. It also plumbs it through to the tailscale ping -size
flag.

Disco pings used for actual endpoint discovery do not use this yet.

Updates #311.

Signed-off-by: salman <salman@tailscale.com>
Co-authored-by: Val <valerie@tailscale.com>
2023-08-08 13:11:28 +01:00
Claire Wang
a17c45fd6e
control: use tstime instead of time (#8595)
Updates #8587
Signed-off-by: Claire Wang <claire@tailscale.com>
2023-08-04 19:29:44 -04:00
Maisem Ali
60e5761d60 control/controlclient: reset backoff in mapRoutine on netmap recv
We were never resetting the backoff in streaming mapResponses.
The call to `PollNetMap` always returns with an error. Changing that contract
is harder, so manually reset backoff when a netmap is received.

Updates tailscale/corp#12894

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-07-13 21:08:28 -07:00
Andrew Dunham
42fd964090 control/controlclient: use dnscache.Resolver for Noise client
This passes the *dnscache.Resolver down from the Direct client into the
Noise client and from there into the controlhttp client. This retains
the Resolver so that it can share state across calls instead of creating
a new resolver.

Updates #4845
Updates #6110

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia5d6af1870f3b5b5d7dd5685d775dcf300aec7af
2023-05-01 13:22:10 -07:00
Mihai Parparita
7330aa593e all: avoid repeated default interface lookups
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).

Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.

Fixes #7850
Updates #7621

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-20 15:46:01 -07:00
Mihai Parparita
4722f7e322 all: move network monitoring from wgengine/monitor to net/netmon
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).

Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.

Updates #7621
Updates #7850

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-20 10:15:59 -07:00
Mihai Parparita
9a655a1d58 net/dnsfallback: more explicitly pass through logf function
Redoes the approach from #5550 and #7539 to explicitly pass in the logf
function, instead of having global state that can be overridden.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-17 12:06:23 -07:00
Andrew Dunham
83fa17d26c various: pass logger.Logf through to more places
Updates #7537

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id89acab70ea678c8c7ff0f44792d54c7223337c6
2023-03-12 12:38:38 -04:00
Maisem Ali
be027a9899 control/controlclient: improve handling of concurrent lite map requests
This reverts commit 6eca47b16c and fixes forward.

Previously the first ever streaming MapRequest that a client sent would also
set ReadOnly to true as it didn't have any endpoints and expected/relied on the
map poll to restart as soon as it got endpoints. However with 48f6c1eba4,
we would no longer restart MapRequests as frequently as we used to, so control
would only ever get the first streaming MapRequest which had ReadOnly=true.

Control would treat this as an uninteresting request and would not send it
any further netmaps, while the client would happily stay in the map poll forever
while litemap updates happened in parallel.

This makes it so that we never set `ReadOnly=true` when we are doing a streaming
MapRequest. This is no longer necessary either as most endpoint discovery happens
over disco anyway.

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-03-09 11:36:44 -08:00