Commit Graph

958 Commits

Author SHA1 Message Date
Josh Bleecher Snyder
cab2c9376f wgengine/magicsock: fix typo in comment 2021-11-10 14:02:06 -08:00
Brad Fitzpatrick
d972099c78 wgengine/magicsock: add a stress test
Updates tailscale/corp#3016

Change-Id: I23708e68ed44d81986d9e2be82029d4555547592
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-10 12:09:54 -08:00
Brad Fitzpatrick
dc9a2909ac wgengine/magicsock: remove peerMap.byDiscoKey map
No longer used.

Updates #3088

Change-Id: I0ced3f87baa4053d3838d3c4a828ed0293923825
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit c30fa5903d)
2021-10-19 12:22:26 -07:00
David Anderson
6c0723fbd6 wgengine/magicsock: track IP<>node mappings without relying on discokeys.
Updates #3088.

Signed-off-by: David Anderson <danderson@tailscale.com>
(cherry picked from commit b956139b0c)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
7fbbaff617 wgengine/magicsock: finish TODO to speed up peerMap.forEachEndpointWithDiscoKey
Now that peerMap tracks the set of nodes for a DiscoKey.

Updates #3088

Change-Id: I927bf2bdfd2b8126475f6b6acc44bc799fcb489f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 7a243ae5b1)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
b9983e6eb8 wgengine/magicsock: don't check always-non-nil endpoint for nil-ness
Continuation of 2aa5df7ac1, remove nil
check because it can never be nil. (It previously was able to be nil.)

Change-Id: I59cd9ad611dbdcbfba680ed9b22e841b00c9d5e6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 11fdb14c53)
2021-10-19 12:18:17 -07:00
David Anderson
89739e077c wgengine/magicsock: add an explicit else branch to peerMap update.
Clarifies that the replace+delete of peerinfo data is only when peerInfo
already exists.

Signed-off-by: David Anderson <danderson@tailscale.com>
(cherry picked from commit e7eb46bced)
2021-10-19 12:18:17 -07:00
Maisem Ali
4a531a0aed wgengine: don't try to delete legacy netfilter rules on synology.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
(cherry picked from commit 53199738fb)
2021-10-19 12:18:17 -07:00
David Anderson
5cf0619cb2 wgengine/magicsock: document and enforce that peerInfo.ep is non-nil.
Signed-off-by: David Anderson <danderson@tailscale.com>
(cherry picked from commit 2aa5df7ac1)
2021-10-19 12:18:17 -07:00
David Anderson
ee02c95259 wgengine/magicsock: move discoKey fields to the mutex-protected section.
Fixes #3106

Signed-off-by: David Anderson <danderson@tailscale.com>
(cherry picked from commit 521b44e653)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
cb0d784a79 wgengine/magicsock: track which NodeKey each DiscoKey was last for
This adds new fields (currently unused) to discoInfo to track what the
last verified (unambiguous) NodeKey a DiscoKey last mapped to, and
when.

Then on CallMeMaybe, Pong and on most Pings, we update the mapping
from DiscoKey to the current NodeKey for that DiscoKey.

Updates #3088

Change-Id: Idc4261972084dec71cf8ec7f9861fb9178eb0a4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit a6d02dc122)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
430d378f7d wgengine/magicsock: fix data race with sync.Pool in error+logging path
Fixes #3122

Change-Id: Ib52e84f9bd5813d6cf2e80ce5b2296912a48e064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit c759fcc7d3)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
ac4cda9303 disco, wgengine/magicsock: send self node key in disco pings
This lets clients quickly (sub-millisecond within a local LAN) map
from an ambiguous disco key to a node key without waiting for a
CallMeMaybe (over relatively high latency DERP).

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 75a7779b42)
2021-10-19 12:18:17 -07:00
Denton Gentry
76ad9d7a7a wgengine/magicsock: don't Rebind after STUN error if closed.
https://github.com/tailscale/tailscale/pull/3014 added a
rebind on STUN failure, which means there can now be a
tailscale.com/wgengine/magicsock.(*RebindingUDPConn).ReadFromNetaddr
in progress at the end of the test waiting for a STUN
response which will never arrive.

This causes a test flake due to the resource leak in those
cases where the Conn decided to rebind. For whatever reason,
it mostly flakes with Windows.

If the Conn is closed, don't Rebind after a send error.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
(cherry picked from commit def650b3e8)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
46fffa32ed wgengine/magicsock: don't call setAddrToDiscoLocked on DERP ping
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit f55c2bccf5)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
3e317852ce wgengine/magicsock: finish some renamings of discoEndpoint to endpoint
Renames only; continuation of earlier 8049063d35

These kept confusing me while working on #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 569f70abfd)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
f054e16451 wgengine/magicsock: delete peerMap.endpointForDiscoKey, remove remaining caller
The one remaining caller of peerMap.endpointForDiscoKey was making the
improper assumption that there's exactly 1 node with a given DiscoKey
in the network. That was the cause of #3088.

Now that all the other callers have been updated to not use
endpointForDiscoKey, there's no need to try to keep maintaining that
prone-to-misuse index.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 695df497ba)
2021-10-19 12:18:17 -07:00
Brad Fitzpatrick
0651845a2c wgengine/magicsock: remove endpointForDiscoKey call from handleDiscoMessage
A DiscoKey maps 1:n to endpoints. When we get a disco pong, we don't
necessarily know which endpoint sent it to us. Ask them all. There
will only usually be 1 (and in rare circumstances 2). So it's easier
to ask all two rather than building new maps from the random ping TxID
to its endpoint.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 04fd94acd6)
2021-10-19 12:18:16 -07:00
Brad Fitzpatrick
2d18624a8e wgengine/magicsock: remove endpoint parameter from handlePingLocked
We can reply to a ping without knowing which exact node it's from.  As
long as it's in our netmap, it's safe to reply. If there's more than
one node with that discokey, it doesn't matter who we're relpying to.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 151b4415ca)
2021-10-19 12:18:16 -07:00
Brad Fitzpatrick
07b569fe26 wgengine/magicsock: add new discoInfo type for DiscoKey state, move some fields
As more prep for removing the false assumption that you're able to
map from DiscoKey to a single peer, move the lastPingFrom and lastPingTime
fields from the endpoint type to a new discoInfo type, effectively upgrading
the old sharedDiscoKey map (which only held a *[32]byte nacl precomputed key
as its value) to discoInfo which then includes that naclbox key.

Then start plumbing it into handlePing in prep for removing the need
for handlePing to take an endpoint parameter.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit d86081f353)
2021-10-19 12:18:16 -07:00
Brad Fitzpatrick
fd85b3274e wgengine/magicsock: move temporary endpoint lookup later, add TODO to remove
Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit e5779f019e)
2021-10-19 12:18:16 -07:00
Brad Fitzpatrick
093ae70293 wgengine/magicsock: remove redundant/wrong sharedDiscoKey delete
The pass just after in this method handles cleaning up sharedDiscoKey.
No need to do it wrong (assuming DiscoKey => 1 node) earlier.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 36a07089ee)
2021-10-19 12:18:16 -07:00
Brad Fitzpatrick
e921482548 wgengine/magicsock: pass src NodeKey to handleDiscoMessage for DERP disco msgs
And then use it to avoid another lookup-by-DiscoKey.

Updates #3088

(cherry picked from commit 3e80806804)
2021-10-19 12:18:16 -07:00
Brad Fitzpatrick
7b7ff1f2e4 wgengine/magicsock: start removing endpointForDiscoKey
It's not valid to assume that a discokey is globally unique.

This removes the first two of the four callers.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 82fa15fa3b)
2021-10-19 12:18:16 -07:00
Maisem Ali
56095e9824 wgengine: only use AmbientCaps on DSM7+
Signed-off-by: Maisem Ali <maisem@tailscale.com>
(cherry picked from commit 27799a1a96)
2021-10-18 13:43:07 -04:00
Brad Fitzpatrick
9df2516f96 wgengine/router: ignore Linux ip route error adding dup route
Updates #3060
Updates #391

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(cherry picked from commit 14f9c75293)
2021-10-14 19:23:52 -04:00
Avery Pennarun
0d4a0bf60e magicsock: if STUN failed to send before, rebind before STUNning again.
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.

If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.

This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.

Fixes #2994

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-10-08 02:17:09 +09:00
David Anderson
830f641c6b wgengine/magicsock: update discokeys on netmap change.
Fixes #3008.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-06 14:52:47 -07:00
Brad Fitzpatrick
29a8fb45d3 wgengine/netstack: include DNS.ExtraRecords in DNSMap
So SOCKS5 dialer can dial HTTPS cert names, for instance.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-28 10:01:36 -07:00
Brad Fitzpatrick
52737c14ac wgengine/monitor: ignore ipsec link monitor events on iOS/macOS
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-27 20:45:51 -07:00
Denton Gentry
93c2882a2f wgengine: flush DNS cache after major link change.
Windows has a public dns.Flush used in router_windows.go.
However that won't work for platforms like Linux, where
we need a different flush mechanism for resolved versus
other implementations.

We're instead adding a FlushCaches method to the dns Manager,
which can be made to work on all platforms as needed.

Fixes https://github.com/tailscale/tailscale/issues/2132

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-09-19 22:58:53 -07:00
Josh Bleecher Snyder
d5ab18b2e6 cmd/cloner: add Clone context to regen struct assignments
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-17 16:46:08 -07:00
Josh Bleecher Snyder
a722e48cef wgengine/magicsock: skip alloc test with -race
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-17 09:56:32 -07:00
Josh Bleecher Snyder
7693d36aed all: close fake userspace engines when tests complete
We were leaking FDs.
In a few places, switch from defer to t.Cleanup.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-15 15:31:51 -07:00
Josh Bleecher Snyder
4bbf5a8636 cmd/cloner: reduce diff noise when changing command
Spelling out the command to run for every type
means that changing the command makes for a large, repetitive diff.
Stop doing that.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-15 10:58:12 -07:00
Brad Fitzpatrick
dabeda21e0 net/tstun: block looped disco traffic
Updates #1526 (maybe fixes? time will tell)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-13 16:00:28 -07:00
Brad Fitzpatrick
31c1331415 wgengine/magicsock: deflake TestReceiveFromAllocs
100 iterations isn't enough with background allocs happening
apparently. 1000 seems to be reliable.

Fixes #2826

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-09 11:49:44 -07:00
Brad Fitzpatrick
2238814b99 wgengine/magicsock: fix crash introduced in recent cleanups
Fixes #2801

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-08 08:27:51 -07:00
Brad Fitzpatrick
640134421e all: update tests to use tstest.MemLogger
And give MemLogger a mutex, as one caller had, which does match the logf
contract better.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-07 20:06:15 -07:00
Brad Fitzpatrick
4c68b7df7c tstest: add MemLogger bytes.Buffer wrapper with Logf method
We use it tons of places. Updated three at least in this PR.

Another use in next commit.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-07 15:33:45 -07:00
David Crawshaw
9502b515f1 net/dns: replace resolver IPs with type for DoH
We currently plumb full URLs for DNS resolvers from the control server
down to the client. But when we pass the values into the net/dns
package, we throw away any URL that isn't a bare IP. This commit
continues the plumbing, and gets the URL all the way to the built in
forwarder. (It stops before plumbing URLs into the OS configurations
that can handle them.)

For #2596

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-09-07 14:44:26 -07:00
Brad Fitzpatrick
7bfd4f521d cmd/tailscale: fix "tailscale ip $self-host-hostname"
And in the process, fix the related confusing error messages from
pinging your own IP or hostname.

Fixes #2803

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-07 11:57:23 -07:00
David Anderson
efe8020dfa wgengine/magicsock: fix race condition in tests.
AFAICT this was always present, the log read mid-execution was never safe.
But it seems like the recent magicsock refactoring made the race much
more likely.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-05 17:42:33 -07:00
Evan Anderson
000f90d4d7 wgengine/wglog: Fix docstring on wireguardGoString to match args
@danderson linked this on Twitter and I noticed the mismatch.

Signed-off-by: Evan Anderson <evan.k.anderson@gmail.com>
2021-09-05 15:52:16 -07:00
Brad Fitzpatrick
5bacbf3744 wgengine/magicsock, health, ipn/ipnstate: track DERP-advertised health
And add health check errors to ipnstate.Status (tailscale status --json).

Updates #2746
Updates #2775

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-02 10:20:25 -07:00
David Anderson
daf54d1253 control/controlclient: remove TS_DEBUG_USE_DISCO=only.
It was useful early in development when disco clients were the
exception and tailscale logs were noisier than today, but now
non-disco is the exception.

Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson
954064bdfe wgengine/wgcfg/nmcfg: don't configure peers who can't DERP or disco.
Fixes #2770

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson
f90ac11bd8 wgengine: remove unnecessary magicConnStarted channel.
Having removed magicconn.Start, there's no need to synchronize startup
of other things to it any more.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 18:11:32 -07:00
David Anderson
bb10443edf wgengine/wgcfg: use just the hexlified node key as the WireGuard endpoint.
The node key is all magicsock needs to find the endpoint that WireGuard
needs.

Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00
David Anderson
d00341360f wgengine/magicsock: remove unused debug knob.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-09-01 15:13:21 -07:00