Commit Graph

2904 Commits

Author SHA1 Message Date
Julian Knodt
525eb5ce41
Merge pull request #2092 from tailscale/queue_latency
derp: add pkt queue latency timer
2021-06-11 09:48:38 -07:00
julianknodt
fe54721e31 derp: add pkt queue latency timer
It would be useful to know the time that packets spend inside of a queue before they are sent
off, as that can be indicative of the load the server is handling (and there was also an
existing TODO). This adds a simple exponential moving average metric to track the average packet
queue duration.
Changes during review:
Add CAS loop for recording queue timing w/ expvar.Func, rm snake_case, annotate in milliseconds,
convert

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-06-11 09:41:06 -07:00
Brad Fitzpatrick
80a4052593 cmd/tailscale, wgengine, tailcfg: don't assume LastSeen is present [mapver 20]
Updates #2107

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-11 08:41:16 -07:00
Christine Dodrill
8b2b899989
tstest/integration: test Alpine Linux (#2098)
Alpine Linux[1] is a minimal Linux distribution built around musl libc.
It boots very quickly, requires very little ram and is as close as you
can get to an ideal citizen for testing Tailscale on musl. Alpine has a
Tailscale package already[2], but this patch also makes it easier for us
to provide an Alpine Linux package off of pkgs in the future.

Alpine only offers Tailscale on the rolling-release edge branch.

[1]: https://alpinelinux.org/
[2]: https://pkgs.alpinelinux.org/packages?name=tailscale&branch=edge

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-11 09:20:13 -04:00
Brad Fitzpatrick
0affcd4e12 tstest/integration: add some debugging for TestAddPingRequest flakes
This fails pretty reliably with a lot of output now showing what's
happening:

TS_DEBUG_MAP=1 go test --failfast -v -run=Ping -race -count=20 ./tstest/integration --verbose-tailscaled

I haven't dug into the details yet, though.

Updates #2079
2021-06-10 15:13:14 -07:00
Brad Fitzpatrick
ee3df2f720 tstest/integration: rename ambiguous --verbose test flag
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-10 11:24:01 -07:00
Fletcher Nichol
a49df5cfda wgenine/router: fix OpenBSD route creation
The route creation for the `tun` device was augmented in #1469 but
didn't account for adding IPv4 vs. IPv6 routes. There are 2 primary
changes as a result:

* Ensure that either `-inet` or `-inet6` was used in the
  [`route(8)`](https://man.openbsd.org/route) command
* Use either the `localAddr4` or `localAddr6` for the gateway argument
  depending which destination network is being added

The basis for the approach is based on the implementation from
`router_userspace_bsd.go`, including the `inet()` helper function.

Fixes #2048
References #1469

Signed-off-by: Fletcher Nichol <fnichol@nichol.ca>
2021-06-10 10:48:33 -07:00
Dave Anderson
144c68b80b
net/dns: avoid using NetworkManager as much as possible. (#1945)
Addresses #1699 as best as possible.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-06-10 10:46:08 -04:00
Maisem Ali
f944614c5c cmd/tailscale/web: add support for QNAP
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-06-10 19:06:05 +05:00
Adrian Dewhurst
8b11937eaf net/dns/resolver: permit larger max responses, signal truncation
This raises the maximum DNS response message size from 512 to 4095. This
should be large enough for almost all situations that do not need TCP.
We still do not recognize EDNS, so we will still forward requests that
claim support for a larger response size than 4095 (that will be solved
later). For now, when a response comes back that is too large to fit in
our receive buffer, we now set the truncation flag in the DNS header,
which is an improvement from before but will prompt attempts to use TCP
which isn't supported yet.

On Windows, WSARecvFrom into a buffer that's too small returns an error
in addition to the data. On other OSes, the extra data is silently
discarded. In this case, we prefer the latter so need to catch the error
on Windows.

Partially addresses #1123

Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2021-06-08 19:29:12 -04:00
Brad Fitzpatrick
fc5fba0fbf client/tailscale: document SetDNS more
Updates #1235

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-08 15:25:03 -07:00
Brad Fitzpatrick
796e222901 client/tailscale: add SetDNS func
Updates #1235

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-08 14:49:56 -07:00
Simeng He
f0121468f4 control/controlclient: add Pinger interface, Options.Pinger
Plumbs down a pinger to the direct to enable client to client Ping
functionality from control.

Signed-off-by: Simeng He <simeng@tailscale.com>
2021-06-08 16:30:06 -04:00
Matt Layher
6956645ec8 go.mod: bump github.com/mdlayher/netlink to v1.4.1
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2021-06-08 12:01:38 -07:00
Christine Dodrill
b402e76185
.github/workflows: add integration test with a custom runner (#2044)
This runner is in my homelab while we muse about a better, more
permanent home for these tests.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-08 12:49:23 -04:00
Christine Dodrill
622dc7b093
tstest/integration/vms: download images from s3 (#2035)
This makes integration tests pull pristine VM images from Amazon S3 if
they don't exist on disk. If the S3 fetch fails, it will fall back to
grabbing the image from the public internet. The VM images on the public
internet are known to be updated without warning and thusly change their
SHA256 checksum. This is not ideal for a test that we want to be able to
fire and forget, then run reliably for a very long time.

This requires an AWS profile to be configured at the default path. The
S3 bucket is rigged so that the requester pays. The VM images are
currently about 6.9 gigabytes. Please keep this in mind when running
these tests on your machine.

Documentation was added to the integration test folder to aid others in
running these tests on their machine.

Some wording in the logs of the tests was altered.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-08 12:47:24 -04:00
Christine Dodrill
3f1405fa2a
tstest/integration/vms: bump images, fix caching bug (#2052)
Before this redownloaded the image every time. Now it only redownloads
it when it needs to.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-08 10:15:59 -04:00
Brad Fitzpatrick
e29cec759a ipn/{ipnlocal,localapi}, control/controlclient: add SetDNS localapi
Updates #1235

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-07 20:35:56 -07:00
David Anderson
8236464252 packages/deb: add package to extract metadata from .deb files.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-06-07 16:22:23 -07:00
David Anderson
1c6946f971 cmd/mkpkg: allow zero files in a package.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-06-07 16:22:23 -07:00
David Anderson
7fab244614 net/dns/resolver: don't spam logs on EHOSTUNREACH.
Fixes #1719.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-06-07 10:45:29 -07:00
Simeng He
0141390365 tstest/integration/testcontrol: add Server.AddPingRequest
Signed-off-by: Simeng He <simeng@tailscale.com>
2021-06-07 13:40:35 -04:00
David Anderson
dfb1385fcc build_dist.sh: add a command to output the shell vars.
Some downstream distros eval'd version/version.sh to get at the shell variables
within their own build process. They can now `./build_dist.sh shellvars` to get
those.

Fixes #2058.

Signed-off-by: David Anderson <dave@natulte.net>
2021-06-05 19:02:42 -07:00
Josh Bleecher Snyder
e92fd19484 wgengine/wglog: match upstream wireguard-go's code for wireguardGoString
It is a bit faster.

But more importantly, it matches upstream byte-for-byte,
which ensures there'll be no corner cases in which we disagree.

name        old time/op    new time/op    delta
SetPeers-8    3.58µs ± 0%    3.16µs ± 2%  -11.74%  (p=0.016 n=4+5)

name        old alloc/op   new alloc/op   delta
SetPeers-8    2.53kB ± 0%    2.53kB ± 0%     ~     (all equal)

name        old allocs/op  new allocs/op  delta
SetPeers-8      99.0 ± 0%      99.0 ± 0%     ~     (all equal)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-06-04 13:06:28 -07:00
Christine Dodrill
adaecd83c8
tstest/integration/vms: add DownloadImages test to download images (#2039)
The image downloads can take a significant amount of time for the tests.
This creates a new test that will download every distro image into the
local cache in parallel, optionally matching the distribution regex.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-04 15:30:58 -04:00
Christine Dodrill
607b7ab692
tstest/integration/vms: aggressively re-verify shasums (#2050)
I've run into a couple issues where the tests time out while a VM image
is being downloaded, making the cache poisoned for the next run. This
moves the hash checking into its own function and calls it much sooner
in the testing chain. If the hash check fails, the OS is redownloaded.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-04 15:27:03 -04:00
David Anderson
df8a5d09c3 net/tstun: add a debug envvar to override tun MTU.
Signed-off-by: David Anderson <dave@natulte.net>
2021-06-04 11:55:11 -07:00
Christine Dodrill
6ce77b8eca
tstest/integration/vms: log qemu output (#2047)
Most of the time qemu will output nothing when it is running. This is
expected behavior. However when qemu is unable to start due to some
problem, it prints that to either stdout or stderr. Previously this
output wasn't being captured. This patch captures that output to aid in
debugging qemu issues.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-04 14:44:04 -04:00
Brad Fitzpatrick
58cc2cc921 tstest/integration/testcontrol: add Server.nodeLocked 2021-06-04 08:19:23 -07:00
David Anderson
aa6abc98f3 build_dist.sh: fix after the change to version stamping.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-06-03 13:14:32 -07:00
Brad Fitzpatrick
a573779c5c version: bump date
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-03 11:21:57 -07:00
Brad Fitzpatrick
5bf65c580d version: fix Short when link-stamped
And remove old SHORT, LONG deprecated variables.

Fixes tailscale/corp#1905

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-03 11:20:06 -07:00
Brad Fitzpatrick
ecfb2639cc ipn/ipnlocal: avoid initPeerAPIListener crash on certain concurrent actions
We were crashing on in initPeerAPIListener when called from
authReconfig when b.netMap is nil. But authReconfig already returns
before the call to initPeerAPIListener when b.netMap is nil, but it
releases the b.mu mutex before calling initPeerAPIListener which
reacquires it and assumes it's still nil.

The only thing that can be setting it to nil is setNetMapLocked, which
is called by ResetForClientDisconnect, Logout/logout, or Start, all of
which can happen during an authReconfig.

So be more defensive.

Fixes #1996

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-03 09:46:28 -07:00
Brad Fitzpatrick
713c5c9ab1 net/{interfaces,netns}: change which build tag means mac/ios Network/System Extension
We used to use "redo" for that, but it was pretty vague.

Also, fix the build tags broken in interfaces_default_route_test.go from
a9745a0b68, moving those Linux-specific
tests to interfaces_linux_test.go.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-03 08:29:22 -07:00
Christine Dodrill
0a655309c6
tstest/integration/vms: only build binaries once (#2042)
Previously this built the binaries for every distro. This is a bit
overkill given we are using static binaries. This patch makes us only
build once.

There was also a weird issue with how processes were being managed.
Previously we just killed qemu with Process.Kill(), however that was
leaving behind zombies. This has been mended to not only kill qemu but
also waitpid() the process so it doesn't become a zombie.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-03 10:58:35 -04:00
Christine Dodrill
a282819026
tstest/integration/vms: fix OpenSUSE Leap 15.1 (#2038)
The OpenSUSE 15.1 image we are using (and conseqentially the only one
that is really available easily given it is EOL) has cloud-init
hardcoded to use the OpenStack metadata thingy. Other OpenSUSE Leap
images function fine with the NoCloud backend, but this one seems to
just not work with it. No bother, we can just pretend to be OpenStack.

Thanks to Okami for giving me an example OpenStack configuration seed
image.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-03 09:29:07 -04:00
Christine Dodrill
4da5e79c39
tstest/integration/vms: test on Arch Linux (#2040)
Arch is a bit of a weirder distro, however as a side effect it is much
more of a systemd purist experience. Adding it to our test suite will
make sure that we are working in the systemd happy path.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-03 09:09:18 -04:00
Maisem Ali
95e296fd96 cmd/tailscale/web: restrict web access to synology admins.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-06-03 08:41:47 +05:00
David Anderson
5088af68cf version: remove all the redo stuff, only support embedding via go ldflags.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-06-02 14:17:46 -07:00
Brad Fitzpatrick
a321c24667 go.mod: update netaddr
Involves minor IPSetBuilder.Set API change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-02 09:05:06 -07:00
Brad Fitzpatrick
9794be375d tailcfg: add SetDNSRequest type
Updates #1235

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-01 20:05:01 -07:00
Christine Dodrill
ca96357d4b
tstest/integration/vms: add OpenSUSE Leap 15.3 (#2026)
This distro is about to be released. OpenSUSE has historically had the
least coverage for functional testing, so this may prove useful in the
future.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-01 11:08:45 -04:00
David Anderson
33bc06795b go.mod: update for corp resync. 2021-05-31 21:47:37 -07:00
David Anderson
c54cc24e87 util/dnsname: make ToFQDN take exactly 0 or 1 allocs for everything.
name                                    old time/op    new time/op    delta
ToFQDN/www.tailscale.com.-32              9.55ns ± 2%   12.13ns ± 3%  +27.03%  (p=0.000 n=10+10)
ToFQDN/www.tailscale.com-32               86.3ns ± 1%    40.7ns ± 1%  -52.86%  (p=0.000 n=10+9)
ToFQDN/.www.tailscale.com-32              86.5ns ± 1%    40.4ns ± 1%  -53.29%  (p=0.000 n=10+9)
ToFQDN/_ssh._tcp.www.tailscale.com.-32    12.8ns ± 2%    14.7ns ± 2%  +14.24%  (p=0.000 n=9+10)
ToFQDN/_ssh._tcp.www.tailscale.com-32      104ns ± 1%      45ns ± 0%  -57.16%  (p=0.000 n=10+9)

name                                    old alloc/op   new alloc/op   delta
ToFQDN/www.tailscale.com.-32               0.00B          0.00B          ~     (all equal)
ToFQDN/www.tailscale.com-32                72.0B ± 0%     24.0B ± 0%  -66.67%  (p=0.000 n=10+10)
ToFQDN/.www.tailscale.com-32               72.0B ± 0%     24.0B ± 0%  -66.67%  (p=0.000 n=10+10)
ToFQDN/_ssh._tcp.www.tailscale.com.-32     0.00B          0.00B          ~     (all equal)
ToFQDN/_ssh._tcp.www.tailscale.com-32       112B ± 0%       32B ± 0%  -71.43%  (p=0.000 n=10+10)

name                                    old allocs/op  new allocs/op  delta
ToFQDN/www.tailscale.com.-32                0.00           0.00          ~     (all equal)
ToFQDN/www.tailscale.com-32                 2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
ToFQDN/.www.tailscale.com-32                2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
ToFQDN/_ssh._tcp.www.tailscale.com.-32      0.00           0.00          ~     (all equal)
ToFQDN/_ssh._tcp.www.tailscale.com-32       2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-31 21:13:50 -07:00
David Anderson
d7f6ef3a79 util/dnsname: add a benchmark for ToFQDN.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-31 21:13:50 -07:00
David Anderson
caaefa00a0 util/dnsname: don't validate the contents of DNS labels.
DNS names consist of labels, but outside of length limits, DNS
itself permits any content within the labels. Some records require
labels to conform to hostname limitations (which is what we implemented
before), but not all.

Fixes #2024.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-05-31 21:13:50 -07:00
Christine Dodrill
2802a01b81
tstest/integration/vms: test vms as they are ready (#2022)
Instead of testing all the VMs at once when they are all ready, this
patch changes the testing logic so that the vms are tested as soon as
they register with testcontrol. Also limit the amount of VM ram used at
once with the `-ram-limit` flag. That uses a semaphore to guard resource
use.

Also document CentOS' sins.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-05-31 17:04:49 -04:00
Avery Pennarun
eaa6507cc9 ipnlocal: in Start() fast path, don't forget to send Prefs.
The resulting empty Prefs had AllowSingleHosts=false and
Routeall=false, so that on iOS if you did these steps:
- Login and leave running
- Terminate the frontend
- Restart the frontend (fast path restart, missing prefs)
- Set WantRunning=false
- Set WantRunning=true
...then you would have Tailscale running, but with no routes. You would
also accidentally disable the ExitNodeID/IP prefs (symptom: the current
exit node setting didn't appear in the UI), but since nothing
else worked either, you probably didn't notice.

The fix was easy enough. It turns out we already knew about the
problem, so this also fixes one of the BUG entries in state_test.

Fixes: #1918 (BUG-1) and some as-yet-unreported bugs with exit nodes.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-05-31 14:53:49 -04:00
Avery Pennarun
8a7d35594d ipnlocal: don't assume NeedsLogin immediately after StartLogout().
Previously, there was no server round trip required to log out, so when
you asked ipnlocal to Logout(), it could clear the netmap immediately
and switch to NeedsLogin state.

In v1.8, we added a true Logout operation. ipn.Logout() would trigger
an async cc.StartLogout() and *also* immediately switch to NeedsLogin.
Unfortunately, some frontends would see NeedsLogin and immediately
trigger a new StartInteractiveLogin() operation, before the
controlclient auth state machine actually acted on the Logout command,
thus accidentally invalidating the entire logout operation, retaining
the netmap, and violating the user's expectations.

Instead, add a new LogoutFinished signal from controlclient
(paralleling LoginFinished) and, upon starting a logout, don't update
the ipn state machine until it's received.

Updates: #1918 (BUG-2)
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-05-31 14:53:49 -04:00
Christine Dodrill
36cb69002a
tstest/integration/vms: regex-match distros using a flag (#2021)
If you set `-distro-regex` to match a subset of distros, only those
distros will be tested. Ex:

    $ go test -run-vm-tests -distro-regex='opensuse'

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-05-31 13:23:38 -04:00