Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.
And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)
This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.
Also fix some test flakes & races.
Fixes#387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes#399 (wgengine: flaky tests)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If Australia's far away and not going to be used, it's still going to
be far away a minute later. No need to send backup
just-in-case-UDP-gets-lost STUN packets to the known far away
destinations. Those are the ones most likely to trigger retries due to
delay anyway (in random 50-250ms, currently). But we'll keep sending 1
packet to them, just in case our airplane landed.
Likewise, be less aggressive with IPv6. The main point is just to see
whether IPv6 works. No need to send up to 10 packets every round. Max
two is enough (except for the first round). This does mean our STUN
traffic graphs for IPv4-vs-IPv6 will change shape. Oh well. It was a
weird eyeball metric for IPv6 connectivity anyway and we have better
metrics.
We can tweak this policy over time. It's factored out and has tests
now.
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.
Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.
This drops netcheck from ~5 seconds to ~250-500ms.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>