2889 Commits

Author SHA1 Message Date
Brad Fitzpatrick
b8fb8264a5 wgengine/netstack: avoid delivering incoming packets to both netstack + host
The earlier eb06ec172f1d984bb87c589da1dd2d3f15dc6d82 fixed
the flaky SSH issue (tailscale/corp#1725) by making sure that packets
addressed to Tailscale IPs in hybrid netstack mode weren't delivered
to netstack, but another issue remained:

All traffic handled by netstack was also potentially being handled by
the host networking stack, as the filter hook returned "Accept", which
made it keep processing. This could lead to various random racey chaos
as a function of OS/firewalls/routes/etc.

Instead, once we inject into netstack, stop our caller's packet
processing.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-06 06:43:16 -07:00
Brad Fitzpatrick
7f2eb1d87a net/tstun: fix TUN log spam when ACLs drop a packet
Whenever we dropped a packet due to ACLs, wireguard-go was logging:

Failed to write packet to TUN device: packet dropped by filter

Instead, just lie to wireguard-go and pretend everything is okay.

Fixes #1229

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-06 06:42:58 -07:00
Brad Fitzpatrick
2585edfaeb cmd/tailscale: fix tailscale up --advertise-exit-node validation
Fixes #1859

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 20:50:47 -07:00
Brad Fitzpatrick
1a1123d461 wgengine: fix pendopen debug to not track SYN+ACKs, show Node.Online state
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 15:25:11 -07:00
Brad Fitzpatrick
b2de34a45d version: bump date
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 14:49:20 -07:00
Brad Fitzpatrick
eb06ec172f wgengine/netstack: don't pass non-subnet traffic to netstack in hybrid mode
Fixes tailscale/corp#1725

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 13:38:55 -07:00
Brad Fitzpatrick
7629cd6120 net/tsaddr: add NewContainsIPFunc (move from wgengine)
I want to use this from netstack but it's not exported.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-05 13:15:50 -07:00
Josh Bleecher Snyder
78d4c561b5 types/logger: add key grinder stats lines to rate-limiting exemption list
Updates #1749

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-05-05 08:25:15 -07:00
Josh Bleecher Snyder
f116a4c44f types/logger: fix rate limiter allowlist
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I updated the log call site but not the allowlist.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 21:59:05 -07:00
Josh Bleecher Snyder
be56aa4962 workflows: execute benchmarks
#1817 removed the only place in our CI where we executed our benchmark code.
Fix that by executing it everywhere.

The benchmarks are generally cheap and fast, 
so this should add minimal overhead.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 20:21:03 -07:00
Brad Fitzpatrick
52e1031428 cmd/tailscale: gofmt
From 6d10655dc3887f1a161015514a8555c175802b4d

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-04 13:04:33 -07:00
Josh Bleecher Snyder
ac75958d2e workflows: run staticcheck on more platforms
To prevent issues like #1786, run staticcheck on the primary GOOSes:
linux, mac, and windows.

Windows also has a fair amount of GOARCH-specific code.
If we ever have GOARCH staticcheck failures on other GOOSes,
we can expand the test matrix further.

This requires installing the staticcheck binary so that
we can execute it with different GOOSes.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 12:50:13 -07:00
Avery Pennarun
6d10655dc3 ipnlocal: accept a new opts.UpdatePrefs field.
This is needed because the original opts.Prefs field was at some point
subverted for use in frontend->backend state migration for backward
compatibility on some platforms. We still need that feature, but we
also need the feature of providing the full set of prefs from
`tailscale up`, *not* including overwriting the prefs.Persist keys, so
we can't use the original field from `tailscale up`.

`tailscale up` had attempted to compensate for that by doing SetPrefs()
before Start(), but that violates the ipn.Backend contract, which says
you should call Start() before anything else (that's why it's called
Start()). As a result, doing SetPrefs({ControlURL=...,
WantRunning=true}) would cause a connection to the *previous* control
server (because WantRunning=true), and then connect to the *new*
control server only after running Start().

This problem may have been avoided before, but only by pure luck.

It turned out to be relatively harmless since the connection to the old
control server was immediately closed and replaced anyway, but it
created a race condition that could have caused spurious notifications
or rejected keys if the server responded quickly.

As already covered by existing TODOs, a better fix would be to have
Start() get out of the business of state migration altogether. But
we're approaching a release so I want to make the minimum possible fix.

Fixes #1840.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-05-04 15:19:25 -04:00
Josh Bleecher Snyder
7dbbe0c7c7 cmd/tailscale/cli: fix running from Xcode
We were over-eager in running tailscale in GUI mode.
f42ded7acf63e2f3711f6512b701ddeac0e2d7a6 fixed that by
checking for a variety of shell-ish env vars and using those
to force us into CLI mode.

However, for reasons I don't understand, those shell env vars
are present when Xcode runs Tailscale.app on my machine.
(I've changed no configs, modified nothing on a brand new machine.)
Work around that by adding an additional "only in GUI mode" check.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 11:37:02 -07:00
Brad Fitzpatrick
4066c606df ipn/ipnlocal: update peerapi logging of received PUTs
Clarify direction and add duration.

(per chat with Avery)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-04 11:09:02 -07:00
Josh Bleecher Snyder
d3ba860ffd syncs: stop running TestWatchMultipleValues on CI
It's flaky, and not just on Windows.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 10:21:21 -07:00
Brad Fitzpatrick
f5bccc0746 ipn/ipnlocal: redact more errors
Updates tailscale/corp#1636

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-04 09:58:09 -07:00
Josh Bleecher Snyder
47ebd1e9a2 wgengine/router: use net.IP.Equal instead of bytes.Equal to compare IPs
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Josh Bleecher Snyder
737151ea4a safesocket: delete unused function
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Josh Bleecher Snyder
f91c2dfaca wgengine/router: remove unused field
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Josh Bleecher Snyder
bfd2b71926 portlist: suppress staticcheck error
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Josh Bleecher Snyder
42c8b9ad53 net/tstun: remove unnecessary break statement
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Josh Bleecher Snyder
61e411344f logtail/filch: add staticcheck annotation
To work around a staticcheck bug when running with GOOS=windows.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Josh Bleecher Snyder
9360f36ebd all: use lower-case letters at the start of error message
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-04 08:54:50 -07:00
Brad Fitzpatrick
962bf74875 cmd/tailscale: fail if tailscaled closes the IPN connection
I was going to write a test for this using the tstest/integration test
stuff, but the testcontrol implementation isn't quite there yet (it
always registers nodes and doesn't provide AuthURLs). So, manually
tested for now.

Fixes #1843

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-04 07:51:23 -07:00
Brad Fitzpatrick
68fb51b833 tstest/integration: misc cleanups
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-03 14:22:18 -07:00
Brad Fitzpatrick
3237e140c4 tstest/integration: add testNode.AwaitListening, DERP+STUN, improve proxy trap
Updates #1840
2021-05-03 12:14:20 -07:00
David Crawshaw
1f48d3556f cmd/tailscale/cli: don't report outdated auth URL to web UI
This brings the web 'up' logic into line with 'tailscale up'.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-05-03 11:18:58 -07:00
David Crawshaw
1336ed8d9e cmd/tailscale/cli: skip new tab on web login
It doesn't work properly.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-05-03 11:18:58 -07:00
David Crawshaw
85beaa52b3 paths: add synology socket path
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-05-03 11:18:58 -07:00
Josh Bleecher Snyder
64047815b0 wgenengine/magicsock: delete cursed tests
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-03 11:09:44 -07:00
Brad Fitzpatrick
ca65c6cbdb cmd/tailscale: make 'file cp' have better error messages on bad targets
Say when target isn't owned by current user, and when target doesn't
exist in netmap.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-03 10:33:55 -07:00
Josh Bleecher Snyder
96ef8d34ef ipn/ipnlocal: switch from testify to quicktest
Per discussion, we want to have only one test assertion library,
and we want to start by exploring quicktest.

This was a mostly mechanical translation.
I think we could make this nicer by defining a few helper
closures at the beginning of the test. Later.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-05-03 10:09:13 -07:00
Brad Fitzpatrick
90002be6c0 cmd/tailscale: make pref-revert checks ignore OS-irrelevant prefs
This fixes #1833 in two ways:

* stop setting NoSNAT on non-Linux. It only matters on Linux and the flag
  is hidden on non-Linux, but the code was still setting it. Because of
  that, the new pref-reverting safety checks were failing when it was
  changing.

* Ignore the two Linux-only prefs changing on non-Linux.

Fixes #1833

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-03 09:37:50 -07:00
Brad Fitzpatrick
fb67d8311c cmd/tailscale: pull out, parameterize up FlagSet creation for tests
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-05-03 09:23:55 -07:00
Brad Fitzpatrick
98d7c28faa tstest/integration: start factoring test types out to clean things up
To enable easy multi-node testing (including inter-node traffic) later.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-04-30 20:27:05 -07:00
Brad Fitzpatrick
f6e3240dee cmd/tailscale/cli: add test to catch ipn.Pref additions 2021-04-30 13:29:06 -07:00
Avery Pennarun
6caa02428e cmd/tailscale/cli/up: "LoggedOut" pref is implicit.
There's no need to warn that it was not provided on the command line
after doing a sequence of up; logout; up --args. If you're asking for
tailscale to be up, you always mean that you prefer LoggedOut to become
false.

Fixes #1828

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 16:15:04 -04:00
Josh Bleecher Snyder
59026a291d wgengine/wglog: improve wireguard-go logging rate limiting
Prior to wireguard-go using printf-style logging,
all wireguard-go logging occurred using format string "%s".
We fixed that but continued to use %s when we rewrote
peer identifiers into Tailscale style.

This commit removes that %sl, which makes rate limiting work correctly.
As a happy side-benefit, it should generate less garbage.

Instead of replacing all wireguard-go peer identifiers
that might occur anywhere in a fully formatted log string,
assume that they only come from args.
Check all args for things that look like *device.Peers
and replace them with appropriately reformatted strings.

There is a variety of ways that this could go wrong
(unusual format verbs or modifiers, peer identifiers
occurring as part of a larger printed object, future API changes),
but none of them occur now, are likely to be added,
or would be hard to work around if they did.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-04-30 09:45:10 -07:00
Josh Bleecher Snyder
1f94d43b50 wgengine/wglog: delay formatting
The "stop phrases" we use all occur in wireguard-go in the format string.
We can avoid doing a bunch of fmt.Sprintf work when they appear.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-04-30 09:45:10 -07:00
Brad Fitzpatrick
544d8d0ab8 ipn/ipnlocal: remove NewLocalBackendWithClientGen
This removes the NewLocalBackendWithClientGen constructor added in
b4d04a065fd384ca7f57891a2bb87e1ff5205fb6 and instead adds
LocalBackend.SetControlClientGetterForTesting, mirroring
LocalBackend.SetHTTPTestClient. NewLocalBackendWithClientGen was
weird in being exported but taking an unexported type. This was noted
during code review:

https://github.com/tailscale/tailscale/pull/1818#discussion_r623155669

which ended in:

"I'll leave it for y'all to clean up if you find some way to do it elegantly."

This is more idiomatic.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-04-30 07:36:53 -07:00
Avery Pennarun
0181a4d0ac ipnlocal: don't pause the controlclient until we get at least one netmap.
Without this, macOS would fail to display its menu state correctly if you
started it while !WantRunning. It relies on the netmap in order to show
the logged-in username.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 09:18:13 -04:00
Avery Pennarun
4ef207833b ipn: !WantRunning + !LoggedOut should not be idle on startup.
There was logic that would make a "down" tailscale backend (ie.
!WantRunning) refuse to do any network activity. Unfortunately, this
makes the macOS and iOS UI unable to render correctly if they start
while !WantRunning.

Now that we have Prefs.LoggedOut, use that instead. So `tailscale down`
will still allow the controlclient to connect its authroutine, but
pause the maproutine. `tailscale logout` will entirely stop all
activity.

This new behaviour is not obviously correct; it's a bit annoying that
`tailsale down` doesn't terminate all activity like you might expect.
Maybe we should redesign the UI code to render differently when
disconnected, and then revert this change.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 09:18:13 -04:00
Avery Pennarun
4f3315f3da ipnlocal: setting WantRunning with EditPrefs was special.
EditPrefs should be just a wrapper around the action of changing prefs,
but someone had added a side effect of calling Login() sometimes. The
side effect happened *after* running the state machine, which would
sometimes result in us going into NeedsLogin immediately before calling
cc.Login().

This manifested as the macOS app not being able to Connect if you
launched it with LoggedOut=false and WantRunning=false. Trying to
Connect() would sent us to the NeedsLogin state instead.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 09:18:13 -04:00
Avery Pennarun
2a4d1cf9e2 Add prefs.LoggedOut to fix several state machine bugs.
Fixes: tailscale/corp#1660

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 09:18:13 -04:00
Avery Pennarun
b0382ca167 ipn/ipnlocal: some state_test cleanups.
This doesn't change the actual functionality. Just some additional
comments and fine tuning.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 09:18:12 -04:00
Avery Pennarun
ac9cd48c80 ipnlocal: fix deadlock when calling Shutdown() from Start().
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 09:17:47 -04:00
Avery Pennarun
ecdba913d0 Revert "ipn/ipnlocal: be authoritative for the entire MagicDNS record tree."
Unfortunately this broke MagicDNS almost entirely.

Updates: tailscale/corp#1706

This reverts commit 1d7e7b49eb8e16c31e41420deff527671a87dc0c.
2021-04-30 06:16:58 -04:00
Brad Fitzpatrick
5e9e11a77d tstest/integration/testcontrol: add start of test control server
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-04-29 22:51:22 -07:00
Avery Pennarun
19c3e6cc9e types/logger: rate limited: more hysteresis, better messages.
- Switch to our own simpler token bucket, since x/time/rate is missing
  necessary stuff (can't provide your own time func; can't check the
  current bucket contents) and it's overkill anyway.

- Add tests that actually include advancing time.

- Don't remove the rate limit on a message until there's enough room to
  print at least two more of them. When we do, we'll also print how
  many we dropped, as a contextual reminder that some were previously
  lost. (This is more like how the Linux kernel does it.)

- Reformat the [RATE LIMITED] messages to be shorter, and to not
  corrupt original message. Instead, we print the message, then print
  its format string.

- Use %q instead of \"%s\", for more accurate parsing later, if the
  format string contained quotes.

Fixes #1772

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-04-30 01:01:15 -04:00