Also, bit of behavior change: on non-nil err but expired context,
don't reset the consecutive failure count. I don't think the old
behavior was intentional.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
NetworkMap text diffs being empty were currently used to short-circuit
calling magicsock's SetNetworkMap (via Engine.SetNetworkMap), but that
went away in c7582dc2 (0.100.0-230)
Prior to c7582dc2 (notably, in 0.100.0-225 and below, down to
0.100.0), a change in only disco key (as when a node restarts) but
without endpoint changes (as would happen for a client not behind a
NAT with random ports) could result in a "netmap diff: (none)" being
printed, as well as Engine.SetNetworkMap being skipped, leading to
broken discovery endpoints.
c7582dc2 fixed the Engine.SetNetworkMap skippage.
This change fixes the "netmap diff: (none)" print so we'll actually see when a peer
restarts with identical endpoints but a new discovery key.
This code is currently racy due to an incorrect assumption
that goal is never modified in-place, so does not require extra locking.
This change makes the assumption correct.
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
This adds a new magicsock endpoint type only used when both sides
support discovery (that is, are advertising a discovery
key). Otherwise the old code is used.
So far the new code only communicates over DERP as proof that the new
code paths are wired up. None of the actually discovery messaging is
implemented yet.
Support for discovery (generating and advertising a key) are still
behind an environment variable for now.
Updates #483
As part of disabling background STUN packets when idle, we want an
emergency override switch to turn it back on, in case it interacts
poorly in the wild. We'll send that via control, but we'll want to
plumb it down to magicsock via NetworkMap.
Updates tailscale/corp#320
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It can only be built with corp deps anyway, and having it split
from the control code makes our lives harder.
Signed-off-by: David Anderson <danderson@tailscale.com>
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.
And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)
This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.
Also fix some test flakes & races.
Fixes#387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes#399 (wgengine: flaky tests)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
By default, nothing differentiates errors or fatals from regular logs, so they just
blend into the rest of the logs.
As a bonus, if you run a test using t.Run(), the log messages printed
via the sub-t.Run() are printed at a different time from log messages
printed via the parent t.Run(), making debugging almost impossible.
This doesn't actually fix the test flake I'm looking for, but at least
I can find it in the logs now.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.
This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.
To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>