LocalBackend can talk to magicsock on its own to do this without
the "Engine" being involved.
(Continuing a little side quest of cleaning up the Engine
interface...)
Updates #cleanup
Change-Id: I8654acdca2b883b1bd557fdc0cfb90cd3a418a62
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently only the top four most popular changes: endpoints, DERP
home, online, and LastSeen.
Updates #1909
Change-Id: I03152da176b2b95232b56acabfb55dcdfaa16b79
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We're trying to start using that monster type less and eventually get
rid of it.
Updates #1909
Change-Id: I8e1e725bce5324fb820a9be6c7952767863e6542
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is both more efficient (because the knobs' bool is only updated
whenever Node is changed, rarely) and also gets us one step closer to
removing a case of storing a netmap.NetworkMap in
magicsock. (eventually we want to phase out much of the use of that
type internally)
Updates #1909
Change-Id: I37e81789f94133175064fdc09984e4f3a431f1a1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously two tsnet nodes in the same process couldn't have disjoint
sets of controlknob settings from control as both would overwrite each
other's global variables.
This plumbs a new controlknobs.Knobs type around everywhere and hangs
the knobs sent by control on that instead.
Updates #9351
Change-Id: I75338646d36813ed971b4ffad6f9a8b41ec91560
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for incremental netmap update plumbing (#1909), make peerMap
also keyed by NodeID, as all the netmap node mutations passed around
later will be keyed by NodeID.
In the process, also:
* add envknob.InDevMode, as a signal that we can panic more aggressively
in unexpected cases.
* pull two moderately large blocks of code in Conn.SetNetworkMap out
into their own methods
* convert a few more sets from maps to set.Set
Updates #1909
Change-Id: I7acdd64452ba58e9d554140ee7a8760f9043f961
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I didn't clean up the more idiomatic map[T]bool with true values, at
least yet. I just converted the relatively awkward struct{}-valued
maps.
Updates #cleanup
Change-Id: I758abebd2bb1f64bc7a9d0f25c32298f4679c14f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Replace %w verb with %v verb when logging errors.
Use %w only for wrapping errors with fmt.Errorf()
Fixes: #9213
Signed-off-by: Craig Rodrigues <rodrigc@crodrigues.org>
Upcoming work on incremental netmap change handling will require some
replumbing of which subsystems get notified about what. Done naively,
it could break "tailscale status --json" visibility later. To make sure
I understood the flow of all the updates I was rereading the status code
and realized parts of ipnstate.Status were being populated by the wrong
subsystems.
The engine (wireguard) and magicsock (data plane, NAT traveral) should
only populate the stuff that they uniquely know. The WireGuard bits
were fine but magicsock was populating stuff stuff that LocalBackend
could've better handled, so move it there.
Updates #1909
Change-Id: I6d1b95d19a2d1b70fbb3c875fac8ea1e169e8cb0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we don't have the ICMP hint available, such as on Android, we can use
the signal of rx traffic to bias toward a particular endpoint.
We don't want to stick to a particular endpoint for a very long time
without any signals, so the sticky time is reduced to 1 second, which is
large enough to avoid excessive packet reordering in the common case,
but should be small enough that either rx provides a strong signal, or
we rotate in a user-interactive schedule to another endpoint, improving
the feel of failover to other endpoints.
Updates #8999
Co-authored-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
When sending a ping from the CLI, only accept a pong that is in reply
to the specific CLI ping we sent.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
Now a nodeAttr: ForceBackgroundSTUN, DERPRoute, TrimWGConfig,
DisableSubnetsIfPAC, DisableUPnP.
Kept support for, but also now a NodeAttr: RandomizeClientPort.
Removed: SetForceBackgroundSTUN, SetRandomizeClientPort (both never
used, sadly... never got around to them. But nodeAttrs are better
anyway), EnableSilentDisco (will be a nodeAttr later when that effort
resumes).
Updates #8923
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To record wether user is using iptables or nftables after we add support to nftables on linux, we
are adding a field FirewallMode to NetInfo in HostInfo to reflect what firewall mode the host is
running, and form metrics. The information is gained from a global constant in hostinfo.go. We
set it when selection heuristic made the decision, and magicsock reports this to control.
Updates: tailscale/corp#13943
Signed-off-by: KevinLiang10 <kevinliang@tailscale.com>
If a node is flapping or otherwise generating lots of STUN endpoints, we
can end up caching a ton of useless values and sending them to peers.
Instead, let's apply a fixed per-Addr limit of endpoints that we cache,
so that we're only sending peers up to the N most recent.
Updates tailscale/corp#13890
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8079a05b44220c46da55016c0e5fc96dd2135ef8
Netcheck no longer performs I/O itself, instead it makes requests via
SendPacket and expects users to route reply traffic to
ReceiveSTUNPacket.
Netcheck gains a Standalone function that stands up sockets and
goroutines to implement I/O when used in a standalone fashion.
Magicsock now unconditionally routes STUN traffic to the netcheck.Client
that it hosts, and plumbs the send packet sink.
The CLI is updated to make use of the Standalone mode.
Fixes#8723
Signed-off-by: James Tucker <james@tailscale.com>
This sets the Don't Fragment flag, for now behind the
TS_DEBUG_ENABLE_PMTUD envknob.
Updates #311.
Signed-off-by: Val <valerie@tailscale.com>
Signed-off-by: salman <salman@tailscale.com>
This adds the capability to pad disco ping message payloads to reach a
specified size. It also plumbs it through to the tailscale ping -size
flag.
Disco pings used for actual endpoint discovery do not use this yet.
Updates #311.
Signed-off-by: salman <salman@tailscale.com>
Co-authored-by: Val <valerie@tailscale.com>
The nonce value is not read by anything, and di.sharedKey.Seal()
a few lines below generates its own. #cleanup
Signed-off-by: salman <salman@tailscale.com>
This change removes the noV4/noV6 check from addrForSendWireGuardLocked.
On Android, the client panics when reaching `rand.Intn()`, likely due to
the candidates list being containing no candidates. The suspicion is
that the `noV4` and the `noV6` are both being triggered causing the
loop to continue.
Updates tailscale/corp#12938
Updates #7826
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Switch our best address selection to use a scoring-based approach, where
we boost each address based on whether it's a private IP or IPv6.
For users in cloud environments, this biases endpoint selection towards
using an endpoint that is less likely to cost the user money, and should
be less surprising to users.
This also involves updating the tests to not use private IPv4 addresses;
other than that change, the behaviour should be identical for existing
endpoints.
Updates #8097
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I069e3b399daea28be66b81f7e44fc27b2943d8af
This change introduces address selection for wireguard only endpoints.
If a endpoint has not been used before, an address is randomly selected
to be used based on information we know about, such as if they are able
to use IPv4 or IPv6. When an address is initially selected, we also
initiate a new ICMP ping to the endpoints addresses to determine which
endpoint offers the best latency. This information is then used to
update which endpoint we should be using based on the best possible
route. If the latency is the same for a IPv4 and an IPv6 address, IPv6
will be used.
Updates #7826
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Avoid selecting an endpoint as "better" than the current endpoint if the
total latency improvement is less than 1%. This adds some hysteresis to
avoid flapping between endpoints for a minimal improvement in latency.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If8312e1768ea65c4b4d4e13d8de284b3825d7a73
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).
Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.
Fixes#7850
Updates #7621
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).
Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.
Updates #7621
Updates #7850
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
This is a follow-up to #7905 that adds two more linters and fixes the corresponding findings. As per the previous PR, this only flags things that are "obviously" wrong, and fixes the issues found.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8739bdb7bc4f75666a7385a7a26d56ec13741b7c
Previously, when updating endpoints we would immediately stop
advertising any endpoint that wasn't discovered during
determineEndpoints. This could result in, for example, a case where we
performed an incremental netcheck, didn't get any of our three STUN
packets back, and then dropped our STUN endpoint from the set of
advertised endpoints... which would result in clients falling back to a
DERP connection until the next call to determineEndpoints.
Instead, let's cache endpoints that we've discovered and continue
reporting them to clients until a timeout expires. In the above case
where we temporarily don't have a discovered STUN endpoint, we would
continue reporting the old value, then re-discover the STUN endpoint
again and continue reporting it as normal, so clients never see a
withdrawal.
Updates tailscale/coral#108
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I42de72e7418ab328a6c732bdefc74549708cf8b9
The comment still said *magicsock.Conn implemented wireguard-go conn.Bind.
That wasn't accurate anymore.
A doc #cleanup.
Change-Id: I7fd003b939497889cc81147bfb937b93e4f6865c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So we're staying within the netip.Addr/AddrPort consistently and
avoiding allocs/conversions to the legacy net addr types.
Updates #5162
Change-Id: I59feba60d3de39f773e68292d759766bac98c917
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This also adds a bunch of tests for this function to ensure that we're
returning the proper IP(s) in all cases.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0d9d57170dbab5f2bf07abdf78ecd17e0e635399
Using log.Printf may end up being printed out to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).
However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.
Updates tailscale/corp#9230
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
The lazy initialization of the disco key is not necessary, and
contributes to unnecessary locking and state checking.
Updates #cleanup
Signed-off-by: James Tucker <james@tailscale.com>
A peer can have IsWireGuardOnly, which means it will not support DERP or
Disco, and it must have Endpoints filled in order to be usable.
In the present implementation only the first Endpoint will be used as
the bestAddr.
Updates tailscale/corp#10351
Co-authored-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Identified in review in #7821 endpoint.discoKey and endpoint.discoShort
are often accessed without first taking endpoint.mu. The arrangement
with endpoint.mu is inconvenient for a good number of those call-sites,
so it is instead replaced with an atomic pointer to carry both pieces of
disco info. This will also help with #7821 that wants to add explicit
checks/guards to disable disco behaviors when disco keys are missing
which is necessarily implicitly mostly covered by this change.
Updates #7821
Signed-off-by: James Tucker <james@tailscale.com>
Make developing derp easier by:
1. Creating an envknob telling clients to use HTTP to connect to derp
servers, so devs don't have to acquire a valid TLS cert.
2. Creating an envknob telling clients which derp server to connect
to, so devs don't have to edit the ACLs in the admin console to add a
custom DERP map.
3. Explaining how the -dev and -a command lines args to derper
interact.
To use this:
1. Run derper with -dev.
2. Run tailscaled with TS_DEBUG_USE_DERP_HTTP=1 and
TS_DEBUG_USE_DERP_ADDR=localhost
This will result in the client connecting to derp via HTTP on port
3340.
Fixes#7700
Signed-off-by: Val <valerie@tailscale.com>
This commit implements UDP offloading for Linux. GSO size is passed to
and from the kernel via socket control messages. Support is probed at
runtime.
UDP GSO is dependent on checksum offload support on the egress netdev.
UDP GSO will be disabled in the event sendmmsg() returns EIO, which is
a strong signal that the egress netdev does not support checksum
offload.
Updates tailscale/corp#8734
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This adds the util/sysresources package, which currently only contains a
function to return the total memory size of the current system.
Then, we modify magicsock to scale the number of buffered DERP messages
based on the system's available memory, ensuring that we never use a
value lower than the previous constant of 32.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib763c877de4d0d4ee88869078e7d512f6a3a148d