Commit Graph

325 Commits

Author SHA1 Message Date
Brad Fitzpatrick
3fa86a8b23 wgengine/magicsock: use relatively new netaddr.IPPort.IsZero method 2021-01-15 19:21:10 -08:00
Brad Fitzpatrick
4811236189 wgengine/magicsock: speed up BenchmarkReceiveFrom, store context.Done chan
context.cancelCtx.Done involves a mutex and isn't as cheap as I
previously assumed. Convert the donec method into a struct field and
store the channel value once. Our one magicsock.Conn gets one pointer
larger, but it cuts ~1% of the CPU time of the ReceiveFrom benchmark
and removes a bubble from the --svg output :)
2021-01-15 19:19:27 -08:00
Josh Bleecher Snyder
63af950d8c wgengine/magicsock: adapt to wireguard-go without UpdateDst
22507adf54 stopped relying on
our fork of wireguard-go's UpdateDst callback.
As a result, we can unwind that code,
and the extra return value of ReceiveIPv{4,6}.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-01-15 17:13:58 -08:00
David Anderson
57d95dd005 wgengine/magicsock: default legacy networking to off for some tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-01-15 15:54:45 -08:00
David Anderson
a2463e8948 wgengine/magicsock: add an option to disable legacy peer handling.
Used in tests to ensure we're not relying on behavior we're going
to remove eventually.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-01-15 15:01:33 -08:00
David Anderson
d456bfdc6d wgengine/magicsock: fix BenchmarkReceiveFrom.
Previously, this benchmark relied on behavior of the legacy
receive codepath, which I changed in 22507adf. With this
change, the benchmark instead relies on the new active discovery
path.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-01-15 15:01:33 -08:00
Josh Bleecher Snyder
08baa17d9a wgengine/magicsock: shortcircuit discoEndpoint.heartbeat when its connection is closed
This prevents us from continuing to do unnecessary work
(including logging) after the connection has closed.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-01-15 14:44:56 -08:00
Josh Bleecher Snyder
7c76435bf7 wgengine/magicsock: simplify
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-01-15 14:44:56 -08:00
Josh Bleecher Snyder
654b5f1570 all: convert from []wgcfg.Endpoint to string
This eliminates a dependency on wgcfg.Endpoint,
as part of the effort to eliminate our wireguard-go fork.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-01-14 13:54:07 -08:00
David Anderson
22507adf54 wgengine/magicsock: stop depending on UpdateDst in legacy codepaths.
This makes connectivity between ancient and new tailscale nodes slightly
worse in some cases, but only in cases where the ancient version would
likely have failed to get connectivity anyway.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-01-14 12:56:48 -08:00
Denton Gentry
0aa55bffce magicsock: test error case in derpWriteChanOfAddr
In derpWriteChanOfAddr when we call derphttp.NewRegionClient(),
there is a check of whether the connection is already errored and
if so it returns before grabbing the lock. The lock might already
be held and would be a deadlock.

This corner case is not being reliably exercised by other tests.
This shows up in code coverage reports, the lines of code in
derpWriteChanOfAddr are alternately added and subtracted from
code coverage.

Add a test to specifically exercise this code path, and verify that
it doesn't deadlock.

This is the best tradeoff I could come up with:
+ the moment code calls Err() to check if there is an error, we
  grab the lock to make sure it would deadlock if it tries to grab
  the lock itself.
+ if a new call to Err() is added in this code path, only the
  first one will be covered and the rest will not be tested.
+ this test doesn't verify whether code is checking for Err() in
  the right place, which ideally I guess it would.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-01-12 04:29:28 -08:00
Brad Fitzpatrick
85e54af0d7 wgengine: on TCP connect fail/timeout, log some clues about why it failed
So users can see why things aren't working.

A start. More diagnostics coming.

Updates #1094
2021-01-11 22:09:09 -08:00
Brad Fitzpatrick
f85769b1ed wgengine/magicsock: drop netaddr.IPPort cache
netaddr.IP no longer allocates, so don't need a cache or all its associated
code/complexity.

This totally removes groupcache/lru from the deps.

Also go mod tidy.
2021-01-11 13:23:04 -08:00
Brad Fitzpatrick
5aa5db89d6 cmd/tailscaled, wgengine/netstack: add start of gvisor userspace netstack work
Not usefully functional yet (mostly a proof of concept), but getting
it submitted for some work @namansood is going to do atop this.

Updates #707
Updates #634
Updates #48
Updates #835
2021-01-11 09:31:14 -08:00
Brad Fitzpatrick
5efb0a8bca cmd/tailscale: change formatting of "tailscale status"
* show DNS name over hostname, removing domain's common MagicDNS suffix.
  only show hostname if there's no DNS name.
  but still show shared devices' MagicDNS FQDN.

* remove nerdy low-level details by default: endpoints, DERP relay,
  public key.  They're available in JSON mode still for those who need
  them.

* only show endpoint or DERP relay when it's active with the goal of
  making debugging easier. (so it's easier for users to understand
  what's happening) The asterisks are gone.

* remove Tx/Rx numbers by default for idle peers; only show them when
  there's traffic.

* include peers' owner login names

* add CLI option to not show peers (matching --self=true, --peers= also
  defaults to true)

* sort by DNS/host name, not public key

* reorder columns
2021-01-10 12:11:22 -08:00
Brad Fitzpatrick
b5b9866ba2 wgengine/magicsock: copy self DNS name to PeerStatus, re-fill OS
The OS used to be sent back from the server but that has since
been removed as being redundant.
2021-01-08 20:55:57 -08:00
David Anderson
86fe22a1b1 Update netaddr, and adjust wgengine/magicsock due to API change.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-30 17:36:03 -08:00
Josh Bleecher Snyder
56a7652dc9 wgkey: new package
This is a replacement for the key-related parts
of the wireguard-go wgcfg package.

This is almost a straight copy/paste from the wgcfg package.
I have slightly changed some of the exported functions and types
to avoid stutter, added and tweaked some comments,
and removed some now-unused code.

To avoid having wireguard-go depend on this new package,
wgcfg will keep its key types.

We translate into and out of those types at the last minute.
These few remaining uses will be eliminated alongside
the rest of the wgcfg package.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2020-12-30 17:33:02 -08:00
Josh Bleecher Snyder
2fe770ed72 all: replace wgcfg.IP and wgcfg.CIDR with netaddr types
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2020-12-28 13:00:42 -08:00
Brad Fitzpatrick
053a1d1340 all: annotate log verbosity levels on most egregiously spammy log prints
Fixes #924
Fixes #282

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-12-21 12:59:33 -08:00
David Anderson
294ceb513c ipn, wgengine/magicsock: fix tailscale status display.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-19 13:50:44 -08:00
David Anderson
c8c493f3d9 wgengine/magicsock: make ReceiveIPv4 a little easier to follow.
The previous code used a lot of whole-function variables and shared
behavior that only triggered based on prior action from a single codepath.
Instead of that, move the small amounts of "shared" code into each switch
case.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-18 01:15:53 -08:00
David Anderson
0ad109f63d wgengine/magicsock: move legacy endpoint creation into legacy.go.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-18 01:15:53 -08:00
David Anderson
f873da5b16 wgengine/magicsock: move more legacy endpoint handling.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-18 01:15:53 -08:00
David Anderson
58fcd103c4 wgengine/magicsock: move legacy sending code to legacy.go.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-18 01:15:53 -08:00
David Anderson
65ae66260f wgengine/magicsock: unexport AddrSet.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-18 01:15:53 -08:00
David Anderson
c9b9afd761 wgengine/magicsock: move most legacy nat traversal bits to another file.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-18 01:15:53 -08:00
David Anderson
554a20becb wgengine/magicsock: only log about lazy config when actually doing lazy config.
Before, tailscaled would log every 10 seconds when the periodic noteRecvActivity
call happens. This is noisy, but worse it's misleading, because the message
suggests that the disco code is starting a lazy config run for a missing peer,
whereas in fact it's just an internal piece of keepalive logic.

With this change, we still log when going from 0->1 tunnel for the peer, but
not every 10s thereafter.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-17 12:11:36 -08:00
Brad Fitzpatrick
fa412c8760 wgengine/filter, wgengine/magicsock: use new IP.BitLen to simplify some code 2020-12-15 12:12:56 -08:00
David Anderson
9cee0bfa8c wgengine/magicsock: sprinkle more docstrings.
Magicsock is too damn big, but this might help me page it back
in faster next time.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-14 23:59:17 -08:00
Brad Fitzpatrick
713cbe84c1 wgengine/magicsock: use net.JoinHostPort when host might have colons (udp6)
Only affected tests. (where it just generated log spam)
2020-12-02 20:19:28 -08:00
Brad Fitzpatrick
450cfedeba wgengine/magicsock: quiet an IPv6 warning in tests
In tests, we force binding to localhost to avoid OS firewall warning
dialogs.

But for IPv6, we were trying (and failing) to bind to 127.0.0.1.

You'd think we'd just say "localhost", but that's apparently ill
defined. See
https://tools.ietf.org/html/draft-ietf-dnsop-let-localhost-be-localhost
and golang/go#22826. (It's bitten me in the past, but I can't
remember specific bugs.)

So use "::1" explicitly for "udp6", which makes the test quieter.
2020-11-10 09:14:29 -08:00
Brad Fitzpatrick
fd2a30cd32 wgengine/magicsock: make test pass on Windows and without firewall dialog box
Updates #50
2020-10-28 09:02:08 -07:00
Brad Fitzpatrick
ac866054c7 wgengine/magicsock: add a backoff on DERP reconnects
Fixes #808
2020-10-19 15:15:40 -07:00
Brad Fitzpatrick
105a820622 wgengine/magicsock: skip an endpoint update at start-up
At startup the client doesn't yet have the DERP map so can't do STUN
queries against DERP servers, so it only knows it local interface
addresses, not its STUN-mapped addresses.

We were reporting the interface-local addresses to control, getting
the DERP map, and then immediately reporting the full set of
updates. That was an extra HTTP request to control, but worse: it was
an extra broadcast from control out to all the peers in the network.

Now, skip the initial update if there are no stun results and we don't
have a DERP map.

More work remains optimizing start-up requests/map updates, but this
is a start.

Updates tailscale/corp#557
2020-10-14 11:01:19 -07:00
Brad Fitzpatrick
2076a50862 wgengine/magicsock: finish a comment sentence that ended prematurely 2020-10-13 12:10:51 -07:00
Brad Fitzpatrick
3e4c46259d wgengine/magicsock: don't do netchecks either when network is down
A continuation of 6ee219a25d

Updates #640
2020-10-06 20:24:10 -07:00
Brad Fitzpatrick
6ee219a25d ipn, wgengine, magicsock, tsdns: be quieter and less aggressive when offline
If no interfaces are up, calm down and stop spamming so much. It was
noticed as especially bad on Windows, but probably was bad
everywhere. I just have the best network conditions testing on a
Windows VM.

Updates #604
2020-10-06 15:26:53 -07:00
Christina Wen
48fbe93e72
wgengine/magicsock: clarify pre-disco 'tailscale ping' error message
This change clarifies the error message when a user pings a peer that is using an outdated version of Tailscale.
2020-09-16 11:54:00 -04:00
Josh Bleecher Snyder
0c0239242c wgengine/magicsock: make discoPingPurpose a stringer
It was useful for debugging once, it'll probably be useful again.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2020-09-14 14:29:28 -07:00
Josh Bleecher Snyder
57e642648f wgengine/magicsock: fix typo in comment 2020-09-02 11:34:20 -07:00
Brad Fitzpatrick
756d6a72bd wgengine: lazily create peer wireguard configs more explicitly
Rather than consider bigs jumps in last-received-from activity as a
signal to possibly reconfigure the set of wireguard peers to have
configured, instead just track the set of peers that are currently
excluded from the configuration. Easier to reason about.

Also adds a bit more logging.

This might fix an error we saw on a machine running a recent unstable
build:

2020-08-26 17:54:11.528033751 +0000 UTC: 8.6M/92.6M magicsock: [unexpected] lazy endpoint not created for [UcppE], d:42a770f678357249
2020-08-26 17:54:13.691305296 +0000 UTC: 8.7M/92.6M magicsock: DERP packet received from idle peer [UcppE]; created=false
2020-08-26 17:54:13.691383687 +0000 UTC: 8.7M/92.6M magicsock: DERP packet from unknown key: [UcppE]

If it does happen again, though, we'll have more logs.
2020-08-26 12:26:06 -07:00
halulu
f27a57911b
cmd/tailscale: add derp and endpoints status (#703)
cmd/tailscale: add local node's information to status output (by default)

RELNOTE=yes

Updates #477

Signed-off-by: Halulu <lzjluzijie@gmail.com>
2020-08-25 16:26:10 -07:00
David Crawshaw
dd2c61a519 magicsock: call RequestStatus when DERP connects
Second attempt.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-08-25 16:35:28 -04:00
David Crawshaw
a67b174da1 Revert "magicsock: call RequestStatus when DERP connects"
Seems to break linux CI builder. Cannot reproduce locally,
so attempting a rollback.

This reverts commit cd7bc02ab1.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-08-25 15:15:37 -04:00
David Crawshaw
cd7bc02ab1 magicsock: call RequestStatus when DERP connects
Without this, a freshly started ipn client will be stuck in the
"Starting" state until something triggers a call to RequestStatus.
Usually a UI does this, but until then we can sit in this state
until poked by an external event, as is evidenced by our e2e tests
locking up when DERP is attached.

(This only recently became a problem when we enabled lazy handshaking
everywhere, otherwise the wireugard tunnel creation would also
trigger a RequestStatus.)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-08-25 10:38:02 -04:00
Brad Fitzpatrick
f6dc47efe4 tailcfg, controlclient, magicsock: add control feature flag to enable DRPO
Updates #150
2020-08-17 13:01:39 -07:00
Brad Fitzpatrick
85c3d17b3c wgengine/magicsock: use disco ping src as a candidate endpoint
Consider:

   Hard NAT (A) <---> Hard NAT w/ mapped port (B)

If A sends a packet to B's mapped port, A can disco ping B directly,
with low latency, without DERP.

But B couldn't establish a path back to A and needed to use DERP,
despite already logging about A's endpoint and adding a mapping to it
for other purposes (the wireguard conn.Endpoint lookup also needed
it).

This adds the tracking to discoEndpoint too so it'll be used for
finding a path back.

Fixes tailscale/corp#556

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-08-12 21:33:43 -07:00
Brad Fitzpatrick
0512fd89a1 wgengine/magicsock: simplify handlePingLocked
It's no longer true that 'de may be nil'
2020-08-12 19:25:38 -07:00
Brad Fitzpatrick
84dc891843 cmd/tailscale/cli: add ping subcommand
For example:

$ tailscale ping -h
USAGE
  ping <hostname-or-IP>

FLAGS
  -c 10                   max number of pings to send
  -stop-once-direct true  stop once a direct path is established
  -verbose false          verbose output

$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 65ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 252ms
pong from monitoring (100.88.178.64) via [2604:a880:2:d1::36:d001]:41641 in 33ms

Fixes #661

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-08-10 12:50:56 -07:00
Brad Fitzpatrick
9a346fd8b4 wgengine,magicsock: fix two lazy wireguard config issues
1) we weren't waking up a discoEndpoint that once existed and
   went idle for 5 minutes and then got a disco message again.

2) userspaceEngine.noteReceiveActivity had a buggy check; fixed
   and added a test
2020-08-06 15:02:29 -07:00
Brad Fitzpatrick
cff737786e wgengine/magicsock: fix lazy config deadlock, document more lock ordering
This removes the atomic bool that tried to track whether we needed to acquire
the lock on a future recursive call back into magicsock. Unfortunately that
hack doesn't work because we also had a lock ordering issue between magicsock
and userspaceEngine (see issue). This documents that too.

Fixes #644
2020-08-06 08:43:48 -07:00
Brad Fitzpatrick
2bd9ad4b40 wgengine: fix deadlock between engine and magicsock 2020-08-05 16:37:15 -07:00
Brad Fitzpatrick
7c38db0c97 wgengine/magicsock: don't deadlock on pre-disco Endpoints w/ lazy wireguard configs
Fixes tailscale/tailscale#637
2020-08-04 17:06:05 -07:00
Brad Fitzpatrick
4987a7d46c wgengine/magicsock: when hard NAT, add stun-ipv4:static-port as candidate
If a node is behind a hard NAT and is using an explicit local port
number, assume they might've mapped a port and add their public IPv4
address with the local tailscaled's port number as a candidate endpoint.
2020-08-04 09:48:34 -07:00
Brad Fitzpatrick
bfcb0aa0be wgengine/magicsock: deflake tests, Close deadlock again
Better fix than 37903a9056

Fixes tailscale/corp#533
2020-08-04 09:36:38 -07:00
Brad Fitzpatrick
cb970539a6 wgengine/magicsock: remove TODO comment that's no longer applicable 2020-07-30 21:33:37 -07:00
Brad Fitzpatrick
915f65ddae wgengine/magicsock: stop disco activity on IPN stop
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-07-30 14:01:33 -07:00
Brad Fitzpatrick
c180abd7cf wgengine/magicsock: merge errClosed and errConnClosed 2020-07-30 13:59:30 -07:00
Brad Fitzpatrick
d55fdd4669 wgengine/magicsock: update, flesh out a TODO 2020-07-29 12:59:25 -07:00
Brad Fitzpatrick
58b721f374 wgengine/magicsock: deflake some tests with an ugly hack
Starting with fe68841dc7, some e2e tests
got flaky. Rather than debug them (they're gnarly), just revert to the old
behavior as far as those tests are concerned. The tests were somehow
using magicsock without a private key and expecting it to do ... something.

My goal with fe68841dc7 was to stop log spam
and unnecessary work I saw on the iOS app when when stopping the app.

Instead, only stop doing that work on any transition from
once-had-a-private-key to no-longer-have-a-private-key. That fixes
what I wanted to fix while still making the mysterious e2e tests
happy.
2020-07-27 16:32:35 -07:00
David Anderson
0249236cc0 ipn/ipnstate: record assigned Tailscale IPs.
wgengine/magicsock: use ipnstate to find assigned Tailscale IPs.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-07-27 14:09:54 -07:00
David Anderson
f582eeabd1 wgengine/magicsock: add a test for active path discovery.
Uses natlab only, because the point of this active discovery test is going to be
that it should get through a lot of obstacles.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-07-27 14:09:54 -07:00
Brad Fitzpatrick
37903a9056 wgengine/magicsock: fix occasional deadlock on Conn.Close on c.derpStarted
The deadlock was:

* Conn.Close was called, which acquired c.mu
* Then this goroutine scheduled:

    if firstDerp {
        startGate = c.derpStarted
        go func() {
            dc.Connect(ctx)
            close(c.derpStarted)
        }()
    }

* The getRegion hook for that derphttp.Client then ran, which also
  tries to acquire c.mu.

This change makes that hook first see if we're already in a closing
state and then it can pretend that region doesn't exist.
2020-07-27 12:27:10 -07:00
Brad Fitzpatrick
fe68841dc7 wgengine/magicsock: log better with less spam on transition to stopped state
Required a minor test update too, which now needs a private key to get far
enough to test the thing being tested.
2020-07-27 10:19:17 -07:00
Brad Fitzpatrick
e298327ba8 wgengine/magicsock: remove overkill, slow reflect.DeepEqual of NetworkMap
No need to allocate or compare all the fields we don't care about.
2020-07-25 19:37:08 -07:00
Brad Fitzpatrick
16a9cfe2f4 wgengine: configure wireguard peers lazily, as needed
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).

Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.

This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).

We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-07-24 12:50:15 -07:00
Brad Fitzpatrick
5066b824a6 wgengine/magicsock: don't log about disco ping timeouts if we have a working address
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-07-24 11:21:50 -07:00
Brad Fitzpatrick
c06d2a8513 wgengine/magicsock: fix typo in comment 2020-07-18 13:57:26 -07:00
Brad Fitzpatrick
bf195cd3d8 wgengine/magicsock: reduce log verbosity of discovery messages
Don't log heartbeat pings & pongs. Track the reason for pings and then
only log the ping/pong traffic if it was for initial path discovery.
2020-07-18 13:54:00 -07:00
Brad Fitzpatrick
10ac066013 all: fix vet warnings 2020-07-16 08:39:38 -07:00
Brad Fitzpatrick
d74c9aa95b wgengine/magicsock: update comment, fix earlier commit
891898525c had a continue that meant the didCopy synchronization never ran.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-07-16 08:29:38 -07:00
Brad Fitzpatrick
c976264bd1 wgengine/magicsock: gofmt 2020-07-16 08:15:27 -07:00
Dmytro Shynkevych
f3e2b65637
wgengine/magicsock: time.Sleep -> time.After
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-07-16 11:04:53 -04:00
Dmytro Shynkevych
380ee76d00
wgengine/magicsock: make time.Sleep in runDerpReader respect cancellation.
Before this patch, the 250ms sleep would not be interrupted by context cancellation,
which would result in the goroutine sometimes lingering in tests (100ms grace period).

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-07-16 10:45:48 -04:00
Dmytro Shynkevych
891898525c
wgengine/magicsock: make receive from didCopy respect cancellation.
Very rarely, cancellation occurs between a successful send on derpRecvCh
and a call to copyBuf on the receiving side.
Without this patch, this situation results in <-copyBuf blocking indefinitely.

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-07-16 10:34:49 -04:00
Brad Fitzpatrick
a2267aae99 wgengine: only launch pingers for peers predating the discovery protocol
Peers advertising a discovery key know how to speak the discovery
protocol and do their own heartbeats to get through NATs and keep NATs
open. No need for the pinger except for with legacy peers.
2020-07-15 21:08:26 -07:00
Dmytro Shynkevych
2f15894a10 wgengine/magicsock: wait for derphttp client goroutine to exit
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-07-14 14:20:35 -04:00
Brad Fitzpatrick
6c74065053 wgengine/magicsock, tstest/natlab: start hooking up natlab to magicsock
Also adds ephemeral port support to natlab.

Work in progress.

Pairing with @danderson.
2020-07-10 14:32:58 -07:00
Brad Fitzpatrick
bd59bba8e6 wgengine/magicsock: stop discoEndpoint timers on Close
And add some defensive early returns on c.closed.
2020-07-08 16:51:17 -07:00
Brad Fitzpatrick
de875a4d87 wgengine/magicsock: remove DisableSTUNForTesting 2020-07-08 15:50:41 -07:00
Brad Fitzpatrick
5c6d8e3053 netcheck, tailcfg, interfaces, magicsock: survey UPnP, NAT-PMP, PCP
Don't do anything with UPnP, NAT-PMP, PCP yet, but see how common they
are in the wild.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-07-06 15:25:35 -07:00
Brad Fitzpatrick
6196b7e658 wgengine/magicsock: change API to not permit disco key changes
Generate the disco key ourselves and give out the public half instead.

Fixes #525
2020-07-06 12:10:39 -07:00
Brad Fitzpatrick
5132edacf7 wgengine/magicsock: fix data race from undocumented wireguard-go requirement
Endpoints need to be Stringers apparently.

Fixes tailscale/corp#422
2020-07-03 22:27:52 -07:00
Brad Fitzpatrick
630379a1d0 cmd/tailscale: add tailscale status region name, last write, consistently star
There's a lot of confusion around what tailscale status shows, so make it better:
show region names, last write time, and put stars around DERP too if active.

Now stars are always present if activity, and always somewhere.
2020-07-03 13:44:22 -07:00
Brad Fitzpatrick
9a8700b02a wgengine/magicsock: add discoEndpoint heartbeat
Updates #483
2020-07-03 12:43:39 -07:00
Brad Fitzpatrick
9f930ef2bf wgengine/magicsock: remove the discoEndpoint.timers map
It ended up being more complicated than it was worth.
2020-07-03 11:45:41 -07:00
Brad Fitzpatrick
f5f3885b5b wgengine/magicsock: bunch of misc discovery path cleanups
* fix tailscale status for peers using discovery
* as part of that, pull out disco address selection into reusable
  and testable discoEndpoint.addrForSendLocked
* truncate ping/pong logged hex txids in half to eliminate noise
* move a bunch of random time constants into named constants
  with docs
* track a history of per-endpoint pong replies for future use &
  status display
* add "send" and " got" prefix to discovery message logging
  immediately before the frame type so it's easier to read than
  searching for the "<-" or "->" arrows earlier in the line; but keep
  those as the more reasily machine readable part for later.

Updates #483
2020-07-03 11:26:22 -07:00
Brad Fitzpatrick
6c70cf7222 wgengine/magicsock: stop ping timeout timer on pong receipt, misc log cleanup
Updates #483
2020-07-02 22:54:57 -07:00
Brad Fitzpatrick
c52905abaa wgengine/magicsock: log less on no-op disco route switches
Also, renew trustBestAddrUntil even if latency isn't better.
2020-07-02 11:39:05 -07:00
Brad Fitzpatrick
0f0ed3dca0 wgengine/magicsock: clean up discovery logging
Updates #483
2020-07-02 10:48:13 -07:00
Brad Fitzpatrick
056fbee4ef wgengine/magicsock: add TS_DEBUG_OMIT_LOCAL_ADDRS knob to force STUN use only
For debugging.
2020-07-02 09:53:10 -07:00
Brad Fitzpatrick
e03cc2ef57 wgengine/magicsock: populate discoOfAddr upon receiving ping frames
Updates #483
2020-07-02 08:37:46 -07:00
Brad Fitzpatrick
275a20f817 wgengine/magicsock: keep discoOfAddr populated, use it for findEndpoint
Update the mapping from ip:port to discokey, so when we retrieve a
packet from the network, we can find the same conn.Endpoint that we
gave to wireguard-go previously, without making it think we've
roamed. (We did, but we're not using its roaming.)

Updates #483
2020-07-01 22:15:41 -07:00
Brad Fitzpatrick
77e89c4a72 wgengine/magicsock: handle CallMeMaybe discovery mesages
Roughly feature complete now. Testing and polish remains.

Updates #483
2020-07-01 15:30:25 -07:00
Brad Fitzpatrick
710ee88e94 wgengine/magicsock: add timeout on discovery pings, clean up state
Updates #483
2020-07-01 14:39:21 -07:00
Brad Fitzpatrick
77d3ef36f4 wgengine/magicsock: hook up discovery messages, upgrade to LAN works
Ping messages now go out somewhat regularly, pong replies are sent,
and pong replies are now partially handled enough to upgrade off DERP
to LAN.

CallMeMaybe packets are sent & received over DERP, but aren't yet
handled. That's next (and regular maintenance timers), and then WAN
should work.

Updates #483
2020-07-01 13:00:50 -07:00
Brad Fitzpatrick
9b8ca219a1 wgengine/magicsock: remove allocs in UDP write, use new netaddr.PutUDPAddr
The allocs were only introduced yesterday with a TODO. Now they're gone again.
2020-07-01 10:17:08 -07:00
Brad Fitzpatrick
7b3c0bb7f6 wgengine/magicsock: fix crash reading DERP packet
Starting at yesterday's e96f22e560 (convering some UDPAddrs to
IPPorts), Conn.ReceiveIPv4 could return a nil addr, which would make
its way through wireguard-go and blow up later. The DERP read path
wasn't initializing the addr result parameter any more, and wgRecvAddr
wasn't checking it either.

Fixes #515
2020-07-01 09:36:19 -07:00
Brad Fitzpatrick
47b4a19786 wgengine/magicsock: use netaddr.ParseIPPort instead of net.ResolveUDPAddr 2020-07-01 08:23:37 -07:00
Brad Fitzpatrick
f7124c7f06 wgengine/magicsock: start of discoEndpoint state tracking
Updates #483
2020-06-30 15:33:56 -07:00
Brad Fitzpatrick
92252b0988 wgengine/magicsock: add a little LRU cache for netaddr.IPPort lookups
And while plumbing, a bit of discovery work I'll need: the
endpointOfAddr map to map from validated paths to the discoEndpoint.
Not being populated yet.

Updates #483
2020-06-30 14:38:10 -07:00
Brad Fitzpatrick
2d6e84e19e net/netcheck, wgengine/magicsock: replace more UDPAddr with netaddr.IPPort 2020-06-30 13:25:13 -07:00
Brad Fitzpatrick
9070aacdee wgengine/magicsock: minor comments & logging & TODO changes 2020-06-30 13:14:41 -07:00
Brad Fitzpatrick
e96f22e560 wgengine/magicsock: start handling disco message, use netaddr.IPPort more
Updates #483
2020-06-30 12:24:23 -07:00
Brad Fitzpatrick
a83ca9e734 wgengine/magicsock: cache precomputed nacl/box shared keys
Updates #483
2020-06-29 14:26:25 -07:00
Brad Fitzpatrick
a975e86bb8 wgengine/magicsock: add new endpoint type used for discovery-supporting peers
This adds a new magicsock endpoint type only used when both sides
support discovery (that is, are advertising a discovery
key). Otherwise the old code is used.

So far the new code only communicates over DERP as proof that the new
code paths are wired up. None of the actually discovery messaging is
implemented yet.

Support for discovery (generating and advertising a key) are still
behind an environment variable for now.

Updates #483
2020-06-29 13:59:54 -07:00
Brad Fitzpatrick
103c06cc68 wgengine/magicsock: open discovery naclbox messages from known peers
And track known peers.

Doesn't yet do anything with the messages. (nor does it send any yet)

Start of docs on the message format. More will come in subsequent changes.

Updates #483
2020-06-26 14:57:12 -07:00
Brad Fitzpatrick
23e74a0f7a wgengine, magicsock, tstun: don't regularly STUN when idle (mobile only for now)
If there's been 5 minutes of inactivity, stop doing STUN lookups. That
means NAT mappings will expire, but they can resume later when there's
activity again.

We'll do this for all platforms later.

Updates tailscale/corp#320

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-06-25 19:14:24 -07:00
Brad Fitzpatrick
fe50cd0c48 ipn, wgengine: plumb NetworkMap down to magicsock
Now we can have magicsock make decisions based on tailcfg.Debug
settings sent by the server.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-06-25 19:14:24 -07:00
Brad Fitzpatrick
53fb25fc2f all: generate discovery key, plumb it around
Not actually used yet.

Updates #483
2020-06-19 12:12:00 -07:00
Brad Fitzpatrick
abd79ea368 derp: reduce DERP memory use; don't require callers to pass in memory to use
The magicsock derpReader was holding onto 65KB for each DERP
connection forever, just in case.

Make the derp{,http}.Client be in charge of memory instead. It can
reuse its bufio.Reader buffer space.
2020-06-15 10:26:50 -07:00
Brad Fitzpatrick
280e8884dd wgengine/magicsock: limit redundant log spam on packets from low-pri addresses
Fixes #407

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-06-11 09:40:55 -07:00
Brad Fitzpatrick
9e5d79e2f1 wgengine/magicsock: drop a bytes.Buffer sync.Pool, use logger.ArgWriter instead 2020-05-31 15:29:04 -07:00
Brad Fitzpatrick
db2a216561 wgengine/magicsock: don't log on UDP send errors if address family known missing
Fixes #376
2020-05-29 12:41:30 -07:00
Brad Fitzpatrick
9e3ad4f79f net/netns: add package for start of network namespace support
And plumb in netcheck STUN packets.

TODO: derphttp, logs, control.

Updates #144

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-28 16:20:16 -07:00
Brad Fitzpatrick
a428656280 wgengine/magicsock: don't report v4 localhost addresses on IPv6-only systems
Updates #376
2020-05-28 14:16:23 -07:00
Avery Pennarun
30e5c19214 magicsock: work around race condition initializing .Regions[].
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-28 03:42:03 -04:00
Brad Fitzpatrick
b0c10fa610 stun, netcheck: move under net 2020-05-25 09:18:24 -07:00
Brad Fitzpatrick
e6b84f2159 all: make client use server-provided DERP map, add DERP region support
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.

And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)

This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.

Also fix some test flakes & races.

Fixes #387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes #399 (wgengine: flaky tests)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-05-23 22:31:59 -07:00
Brad Fitzpatrick
e6d0c92b1d wgengine/magicsock: clean up earlier fix a bit
Move WaitReady from fc88e34f42 into the
test code, and keep the derp-reading goroutine named for debugging.
2020-05-14 10:01:48 -07:00
Avery Pennarun
fc88e34f42 wgengine/magicsock/tests: wait for home DERP connection before sending packets.
This fixes an elusive test flake. Fixes #161.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-13 23:50:25 -04:00
Avery Pennarun
08acb502e5 Add tstest.PanicOnLog(), and fix various problems detected by this.
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.

This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.

To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2020-05-13 23:12:35 -04:00
Brad Fitzpatrick
fefd7e10dc types/structs: add structs.Incomparable annotation, use it where applicable
Shotizam before and output queries:

sqlite> select sum(size) from bin where func like 'type..%';
129067
=>
120216
2020-05-03 14:05:32 -07:00
Brad Fitzpatrick
e1526b796e ipn: don't listen on the unspecified address in test
To avoid the Mac firewall dialog of (test) death.

See 4521a59f30
which I added to help debug this.
2020-04-28 19:20:02 -07:00
Brad Fitzpatrick
18017f7630 ipn, wgengine/magicsock: be more idle when in Stopped state with no peers
(Previously as #288, but with some more.)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-04-28 13:41:29 -07:00
David Anderson
9669b85b41 wgengine/magicsock: wait for endpoint updater goroutine when closing.
Fixes #204.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-04-27 14:46:10 -07:00
Brad Fitzpatrick
268d331cb5 wgengine/magicsock: prune key.Public-keyed on peer removals
Fixes #215
2020-04-18 08:48:01 -07:00
Brad Fitzpatrick
00d053e25a wgengine/magicsock: fix slow memory leak as peer endpoints move around
Updates #215
2020-04-18 08:28:10 -07:00
Brad Fitzpatrick
7fc97c5493 wgengine/magicsock: use netaddr more
In prep for deleting from the ever-growing maps.
2020-04-17 15:15:42 -07:00
Brad Fitzpatrick
6fb30ff543 wgengine/magicsock: start using inet.af/netaddr a bit 2020-04-17 13:51:52 -07:00
Brad Fitzpatrick
a7e7c7b548 wgengine/magicsock: close derp connections on rebind
Fixes #276

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-04-10 20:43:00 -07:00
Brad Fitzpatrick
614261d00d wgengine/magicsock: reset AddrSet states on Rebind
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-04-10 20:27:35 -07:00
Blake Gentry
e19287f60f wgengine/magicsock: fix Conn docs type reference
The docs on magicsock.Conn stated that they implemented the
wireguard/device.Bind interface, yet this type does not exist. In
reality, the Conn type implements the wireguard/conn.Bind interface.

I also fixed a small typo in the same file.

Signed-off-by: Blake Gentry <blakesgentry@gmail.com>
2020-04-06 15:11:56 -07:00
Brad Fitzpatrick
322499473e cmd/tailscaled, wgengine, ipn: add /debug/ipn handler with world state
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-26 14:26:24 -07:00
Brad Fitzpatrick
2d48f92a82 wgengine/magicsock: re-stun every [20,27] sec, not 28
28 is cutting it close, and we think jitter will help some spikes
we're seeing.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-25 14:25:33 -07:00
Brad Fitzpatrick
577f321c38 wgengine/magicsock: revise derp fallback logic
Revision to earlier 6284454ae5

Don't be sticky if we have no peers.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-25 13:09:18 -07:00
Brad Fitzpatrick
6284454ae5 wgengine/magicsock: if UDP blocked, pick DERP where most peers are
Updates #207

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-25 08:00:44 -07:00
Brad Fitzpatrick
d321190578 wgengine/magicsock: stringify [IPv6]:port normally in AddrSet.String 2020-03-24 13:40:43 -07:00
Brad Fitzpatrick
3c3ea8bc8a wgengine/magicsock: finish IPv6 transport support
DEBUG_INCLUDE_IPV6=1 is still required, but works now.

Updates #18 (fixes it, once env var gate is removed)
2020-03-24 10:56:22 -07:00
Brad Fitzpatrick
82ed7e527e wgengine/magicsock: remove log allocation
This was the whole point but I goofed at the last line.
2020-03-24 08:14:47 -07:00
Brad Fitzpatrick
8454bbbda5 wgengine/magicsock: more logging improvements
* remove endpoint discovery noise when results unchanged
* consistently spell derp nodes as "derp-N"
* replace "127.3.3.40:" with "derp-" in CreateEndpoint log output
* stop early DERP setup before SetPrivateKey is called;
  it just generates log nosie
* fix stringification of peer ShortStrings (it had an old %x on it,
  rendering it garbage)
* describe why derp routes are changing, with one of:
  shared home, their home, our home, alt
2020-03-24 08:12:55 -07:00
Brad Fitzpatrick
680311b3df wgengine/magicsock: fix few remaining logs without package prefix 2020-03-23 22:11:49 -07:00
Brad Fitzpatrick
c473927558 wgengine/magicsock: clean up, add, improve DERP logs 2020-03-23 21:57:58 -07:00
Brad Fitzpatrick
ea9310403d wgengine/magicsock: re-STUN on DERP connection death
Fixes #201
2020-03-23 13:19:33 -07:00
Brad Fitzpatrick
1ab5b31c4b derp, magicsock: send new "peer gone" frames when previous sender disconnects
Updates #150 (not yet enabled by default in magicsock)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-22 21:00:47 -07:00
Brad Fitzpatrick
b6f77cc48d wgengine/magicsock: return early, outdent in derpWriteChanOfAddr 2020-03-22 14:08:59 -07:00
Brad Fitzpatrick
dd31285ad4 wgengine/magicsock: send IPv6 using pconn6, if available
In prep for IPv6 support. Nothing should make it this far yet.
2020-03-20 14:30:12 -07:00
Brad Fitzpatrick
af277a6762 controlclient, magicsock: add debug knob to request IPv6 endpoints
Add opt-in method to request IPv6 endpoints from the control plane.
For now they should just be skipped. A previous version of this CL was
unconditional and reportedly had problems that I can't reproduce. So
make it a knob until the mystery is solved.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-20 14:27:24 -07:00
Brad Fitzpatrick
221e7d7767 wgengine/magicsock: make log message include DERP port (node) 2020-03-20 13:51:20 -07:00