Commit Graph

962 Commits

Author SHA1 Message Date
Brad Fitzpatrick
42e98d4edc Quiet two little log annoyances. 2020-03-13 09:42:09 -07:00
Brad Fitzpatrick
db2436c7ff wgengine/magicsock: don't interrupt endpoint updates, merge all mutex into one
Before, endpoint updates were constantly being interrupted and resumed
on Linux due to tons of LinkChange messages from over-zealous Linux
netlink messages (from router_linux.go)

Now that endpoint updates are fast and bounded in time anyway, just
let them run to completion, but note that another needs to be
scheduled after.

Now logs went from pages of noise to just:

root@taildoc:~# grep -i -E 'stun|endpoint update' log
2020/03/13 08:51:29 magicsock.Conn: starting endpoint update (initial)
2020/03/13 08:51:30 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:31 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:31 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:33 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:33 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:35 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:35 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")

Or, seen in another run:

2020/03/13 08:45:41 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:09 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:21 magicsock.Conn: starting endpoint update (link-change-major)
2020/03/13 08:46:37 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:47:05 magicsock.Conn: starting endpoint update (periodic)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-13 09:34:11 -07:00
Brad Fitzpatrick
db31550854 wgengine: don't Reconfig on boring link changes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-13 07:45:59 -07:00
Avery Pennarun
7dd63abaed tailcfg.NetInfo: add a .String() renderer.
For pretty printing purposes in logs.
2020-03-13 01:29:47 -04:00
Avery Pennarun
b23cb11eaf ipn: Prefs.String(): print the current derp setting. 2020-03-13 00:43:19 -04:00
David Anderson
aeb88864e0 ipn: don't clobber netinfo in Start(). 2020-03-12 21:39:01 -07:00
Avery Pennarun
8b8e3f08a0 Fix staticcheck complaint. 2020-03-12 23:33:51 -04:00
Avery Pennarun
b4897e7de8 controlclient/netmap: write our own b.ConciseDiffFrom(a) function.
This removes the need for go-cmp, which is extremely bloaty so we had
to leave it out of iOS. As a result, we had also left it out of macOS,
and so we didn't print netmap diffs at all on darwin-based platforms.
Oops.

As a bonus, the output format of the new function is way better.

Minor oddity: because I used the dumbest possible diff algorithm, the
sort order is a bit dumb. We print all "removed" lines and then print
all "added" lines, rather than doing the usual diff-like thing of
interspersing them. This probably doesn't matter (maybe it's an
improvement).
2020-03-12 23:01:08 -04:00
Avery Pennarun
96bb05ce2f controlclient: reformat netmap.Concise() and add DERP server info.
The .Concise() view had grown hard to read over time. Originally, we
assumed a peer almost always had just one endpoint and one-or-more
allowedips. With magicsock, we now almost always have multiple
endpoints per peer. And empirically, almost every peer has only one
allowedip.

Change their order so we can line up allowedips vertically. Also do
some tweaking to make multiple endpoints easier to read.

While we're here, add a column to show the home DERP server of each
peer, if any.
2020-03-12 22:29:24 -04:00
Avery Pennarun
f2e2ffa423 controlclient: log the entire netmap up to every 5 minutes.
We log it once upon receiving the first copy of the map, then
subsequently when a new one appears, but only if we haven't logged one
less than 5 minutes ago.

This avoids overly cluttering the log (as we did before, logging the
netmap every time one appeared, which could be hundreds of lines every
few seconds), but still gives the log enough context to help in
diagnosing problems retroactively.
2020-03-12 22:28:11 -04:00
Brad Fitzpatrick
b9c6d3ceb8 netcheck: work behind UDP-blocked networks again, add tests
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-12 14:49:06 -07:00
Brad Fitzpatrick
a87ee4168a stunner: quiet a harmless log warning 2020-03-12 14:14:23 -07:00
Brad Fitzpatrick
bc73dcf204 wgengine/magicsock: don't block in Send waiting for derphttp.Send
Fixes #137
Updates #109
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-12 12:19:12 -07:00
Brad Fitzpatrick
8807913be9 wgengine/magicsock: wait for previous DERP goroutines to end before new ones
Updates #109 (hopefully fixes, will wait for graphs to be happy)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-12 12:19:12 -07:00
Brad Fitzpatrick
eff6dcdb4e wgengine/magicsock: log more about why we're re-STUNing 2020-03-12 12:09:25 -07:00
David Crawshaw
5ad947c761 cmd/derper: set a write timeout
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 14:42:48 -04:00
David Crawshaw
72dbf26f63 derp: test that client a->b and a->c relaying do not interfere
Without the recent write deadline introduction, this test fails.

They still do interfere, but the interference is now bound by
the write deadline. Many improvements are possible.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 14:42:48 -04:00
David Crawshaw
e838b3fb59 derp: use a write timeout when sending to clients
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 14:42:48 -04:00
David Crawshaw
3df1b97ea8 derp: do not treat failure to relay as the fault of the sender
If Alice attempts to send a packet to Bob and the DERP server
encounters an error on the socket to Bob, we should not disconnect
Alice for that.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 14:42:48 -04:00
David Crawshaw
43aa8595dd derp: introduce Conn interface
This lets us test with something other than a net.Conn.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 14:42:48 -04:00
David Crawshaw
41ac4a79d6 net/nettest: new package with net-like testing primitives
This is a lot like wiring up a local UDP socket, read and write
deadlines work. The big difference is the Block feature, which
lets you stop the packet flow without breaking the connection.
This lets you emulate broken sockets and test timeouts actually
work.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 14:42:48 -04:00
Brad Fitzpatrick
52c0cb12fb stunner: return wrapped error (currently unused) 2020-03-12 11:21:19 -07:00
Brad Fitzpatrick
b4d02a251a syncs: add new package for extra sync types 2020-03-12 11:13:33 -07:00
David Crawshaw
57f220656c ipn: search for ErrStateNotExist with errors.Is
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-12 08:44:24 -04:00
Avery Pennarun
40c6f952c5 Merge branch 'master' of github.com:tailscale/tailscale into HEAD
* 'master' of github.com:tailscale/tailscale:
  netcheck: fix data races for staggler STUN packets arriving after GetReport
  wgengine/magicsock: add a pointer value for logging
  netcheck: ignore IPv4 STUN failures if we saw at least one reply
  netcheck: ignore IPv6 STUN failures
  derp: add clients_replaced counter
  version: bump OSS version datestamp.
2020-03-11 21:01:18 -04:00
Avery Pennarun
509247bf42 tailscale, tailscaled: update safesocket port number.
This makes them able to connect to each other on Windows.
2020-03-11 21:00:25 -04:00
Brad Fitzpatrick
afc3479d04 netcheck: fix data races for staggler STUN packets arriving after GetReport
Fixes #179
2020-03-11 15:35:12 -07:00
Brad Fitzpatrick
b3ddf51a15 wgengine/magicsock: add a pointer value for logging
Updates #109

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-11 15:12:19 -07:00
Brad Fitzpatrick
0d3f42e1d8 netcheck: ignore IPv4 STUN failures if we saw at least one reply
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-11 13:57:23 -07:00
Brad Fitzpatrick
ed7e088729 netcheck: ignore IPv6 STUN failures
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-11 12:44:59 -07:00
Brad Fitzpatrick
4fd29349b9 derp: add clients_replaced counter
Updates #109
2020-03-11 11:55:43 -07:00
David Anderson
b364a871bf version: bump OSS version datestamp. 2020-03-11 10:47:37 -07:00
David Anderson
72d9e1d633 go.mod: bump wireguard-go version. 2020-03-11 10:32:50 -07:00
Brad Fitzpatrick
b0f8931d26 wgengine/magicsock: make a test signature a bit more explicit 2020-03-11 09:51:33 -07:00
David Crawshaw
7ec54e0064 wgengine/magicsock: remove TODO
The TODO above derphttp.NewClient suggests it does network I/O,
but the derphttp client connects lazily and so creating one is
very cheap.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-11 12:17:37 -04:00
David Crawshaw
af58cfc476 go.mod: bump wireguard-go version
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-03-11 11:29:14 -04:00
Brad Fitzpatrick
01b4bec33f stunner: re-do how Stunner works
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.

Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.

This drops netcheck from ~5 seconds to ~250-500ms.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-11 08:08:48 -07:00
David Anderson
4affea2691 go.mod: bump wireguard-go version.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson
77af7e5436 wgengine/magicsock: mark test logfunc as a helper.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson
7eda3af034 wgengine/magicsock: clean up derp http servers on shutdown.
Failure to do this leads to fd exhaustion at -count=10000,
and increasingly poor execution north of -count=100.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson
d651715528 wgengine/magicsock: synchronize test STUN shutdown.
Failure to do so triggers either a data race or a panic
in the testing package, due to racey use of t.Logf.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
David Anderson
86baf60bd4 wgengine/magicsock: synchronize epUpdate cleanup on shutdown.
Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 18:00:37 -07:00
Brad Fitzpatrick
023df9239e Move linkstate boring change filtering to magicsock
So we can at least re-STUN on boring updates.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-10 12:50:03 -07:00
David Anderson
592fec7606 wgengine/magicsock: move device close to uncursed portion of test.
Device close used to suffer from deadlocks, but no longer.

Signed-off-by: David Anderson <danderson@tailscale.com>
2020-03-10 11:57:57 -07:00
Brad Fitzpatrick
a265d7cbff wgengine/magicsock: in STUN-disabled test mode, let endpoint discovery proceed 2020-03-10 11:35:43 -07:00
Brad Fitzpatrick
5c1e443d34 wgengine/monitor: don't call LinkChange when interfaces look unchanged
Basically, don't trust the OS-level link monitor to only tell you
interesting things. Sanity check it.

Also, move the interfaces package into the net directory now that we
have it.
2020-03-10 11:03:19 -07:00
Brad Fitzpatrick
39c0ae1dba derp/derpmap: new DERP config package, merge netcheck into magicsock more
Fixes #153
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2020-03-10 10:37:25 -07:00
Brad Fitzpatrick
bd0e20f351 net/dnscache: ignore annoying staticcheck check 2020-03-09 22:12:22 -07:00
Brad Fitzpatrick
d44325295e net/dnscache: initialize the single Resolver more directly 2020-03-09 21:05:01 -07:00
Brad Fitzpatrick
d07146aafb go.mod, go.sum: update 2020-03-09 21:01:08 -07:00