It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.
Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.
This drops netcheck from ~5 seconds to ~250-500ms.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Failure to do this leads to fd exhaustion at -count=10000,
and increasingly poor execution north of -count=100.
Signed-off-by: David Anderson <danderson@tailscale.com>
Failure to do so triggers either a data race or a panic
in the testing package, due to racey use of t.Logf.
Signed-off-by: David Anderson <danderson@tailscale.com>
Basically, don't trust the OS-level link monitor to only tell you
interesting things. Sanity check it.
Also, move the interfaces package into the net directory now that we
have it.
This was (presumably) missing from wgengine because the
interactions between magicsock and wireguard-go meant that the
shutdown never worked. Now those are fixed, actually shut down.
Fixes occasional flake in expanded ipn/e2e_test.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
The device name "tailscale0" will be used for all platforms except for
OpenBSD where "tun" is enforced by the kernel. `CreateTUN()` in
`wireguard-go` will select the next available "tunX" device name on the
OpenBSD system.
Signed-off-by: Martin Baillie <martin@baillie.email>
The UDP reader goroutine was clobbering `n` and `err` from the
main goroutine, whose accesses are not synchronized the way `b` is.
Signed-off-by: David Anderson <danderson@tailscale.com>
wireguard-go closes magicsock, and expects this to unblock reads
so that its internal goroutines can wind down. We were incorrectly
blocking the read indefinitey and breaking this contract.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's extremely flaky in several dimensions, as well as very slow.
It's making the CI completely red all the time without telling us
useful information.
Set RUN_CURSED_TESTS=1 to run locally.
This change just alters the semantics of the one flaky test, without
trying to speed up timeouts on the others. Empirically, speeding up
the timeouts causes _more_ flakes right now :(
The remaining flake occurs due to a mysterious packet loss. This
doesn't affect normal tailscaled operations, so until I track down
where the loss occurs and fix it, the flaky test is going to be
lenient about packet loss (but not about whether the spray logic
worked).
Signed-off-by: David Anderson <danderson@tailscale.com>