This adds new fields (currently unused) to discoInfo to track what the
last verified (unambiguous) NodeKey a DiscoKey last mapped to, and
when.
Then on CallMeMaybe, Pong and on most Pings, we update the mapping
from DiscoKey to the current NodeKey for that DiscoKey.
Updates #3088
Change-Id: Idc4261972084dec71cf8ec7f9861fb9178eb0a4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit a6d02dc122b635fe008587d1c40100134456f297)
This lets clients quickly (sub-millisecond within a local LAN) map
from an ambiguous disco key to a node key without waiting for a
CallMeMaybe (over relatively high latency DERP).
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 75a7779b42164e9de250ab83158409e72054db13)
https://github.com/tailscale/tailscale/pull/3014 added a
rebind on STUN failure, which means there can now be a
tailscale.com/wgengine/magicsock.(*RebindingUDPConn).ReadFromNetaddr
in progress at the end of the test waiting for a STUN
response which will never arrive.
This causes a test flake due to the resource leak in those
cases where the Conn decided to rebind. For whatever reason,
it mostly flakes with Windows.
If the Conn is closed, don't Rebind after a send error.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
(cherry picked from commit def650b3e8486ad513cf59a53abd40dfbd3ec4cd)
Renames only; continuation of earlier 8049063d35159c8c9d6f1984604fec73af1d79dd
These kept confusing me while working on #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 569f70abfdfdf762122ab4a34c3683eee8b9ee97)
The one remaining caller of peerMap.endpointForDiscoKey was making the
improper assumption that there's exactly 1 node with a given DiscoKey
in the network. That was the cause of #3088.
Now that all the other callers have been updated to not use
endpointForDiscoKey, there's no need to try to keep maintaining that
prone-to-misuse index.
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 695df497ba6ac9a231cababaf6f4a31baa4cccf6)
A DiscoKey maps 1:n to endpoints. When we get a disco pong, we don't
necessarily know which endpoint sent it to us. Ask them all. There
will only usually be 1 (and in rare circumstances 2). So it's easier
to ask all two rather than building new maps from the random ping TxID
to its endpoint.
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 04fd94acd6f764ba7b8e96b9aa9cae5e28ae3b85)
We can reply to a ping without knowing which exact node it's from. As
long as it's in our netmap, it's safe to reply. If there's more than
one node with that discokey, it doesn't matter who we're relpying to.
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 151b4415ca88a24fb883e95edad0cf4b6c450543)
As more prep for removing the false assumption that you're able to
map from DiscoKey to a single peer, move the lastPingFrom and lastPingTime
fields from the endpoint type to a new discoInfo type, effectively upgrading
the old sharedDiscoKey map (which only held a *[32]byte nacl precomputed key
as its value) to discoInfo which then includes that naclbox key.
Then start plumbing it into handlePing in prep for removing the need
for handlePing to take an endpoint parameter.
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit d86081f3535168921cd9b4ddf1dd127f3639bd00)
The pass just after in this method handles cleaning up sharedDiscoKey.
No need to do it wrong (assuming DiscoKey => 1 node) earlier.
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 36a07089ee6a49a7a178db486c9716de1e77201a)
It's not valid to assume that a discokey is globally unique.
This removes the first two of the four callers.
Updates #3088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 82fa15fa3b3c80f4e42ec0f261b02d2d6dab5303)
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.
If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.
This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.
Fixes#2994
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
And add health check errors to ipnstate.Status (tailscale status --json).
Updates #2746
Updates #2775
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Over time, other magicsock refactors have made Start effectively a
no-op, except that some other functions choose to panic if called
before Start.
Signed-off-by: David Anderson <danderson@tailscale.com>
We were returning an error almost, but not quite like errConnClosed in
a single codepath, which could still trip the panic on reconfig in the
test logic.
Signed-off-by: David Anderson <danderson@tailscale.com>
It only existed to override one test-only behavior with a
different test-only behavior, in both cases working around
an annoying feature of our CI environments. Instead, handle
that weirdness entirely in the test code, with a tweaked
TestOnlyPacketListener that gets injected.
Signed-off-by: David Anderson <danderson@tailscale.com>
The docstring said it was meant for use in tests, but it's specifically a
special codepath that is _only_ used in tests, so make the claim stronger.
Signed-off-by: David Anderson <danderson@tailscale.com>
Instead of using the legacy codepath, teach discoEndpoint to handle
peers that have a home DERP, but no disco key. We can still communicate
with them, but only over DERP.
Signed-off-by: David Anderson <danderson@tailscale.com>
magicsock makes multiple calls to Now per packet.
Move to mono.Now. Changing some of the calls to
use package mono has a cascading effect,
causing non-per-packet call sites to also switch.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Pull in the latest version of wireguard-windows.
Switch to upstream wireguard-go.
This requires reverting all of our import paths.
Unfortunately, this has to happen at the same time.
The wireguard-go change is very low risk,
as that commit matches our fork almost exactly.
(The only changes are import paths, CI files, and a go.mod entry.)
So if there are issues as a result of this commit,
the first place to look is wireguard-windows changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
magicsock.Conn.ParseEndpoint requires a peer's public key,
disco key, and legacy ip/ports in order to do its job.
We currently accomplish that by:
* adding the public key in our wireguard-go fork
* encoding the disco key as magic hostname
* using a bespoke comma-separated encoding
It's a bit messy.
Instead, switch to something simpler: use a json-encoded struct
containing exactly the information we need, in the form we use it.
Our wireguard-go fork still adds the public key to the
address when it passes it to ParseEndpoint, but now the code
compensating for that is just a couple of simple, well-commented lines.
Once this commit is in, we can remove that part of the fork
and remove the compensating code.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Fields rename only.
Part of the general effort to make our code agnostic about endpoint formatting.
It's just a name, but it will soon be a misleading one; be more generic.
Do this as a separate commit because it generates a lot of whitespace changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I missed some comments. Fix them.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>