2655 Commits

Author SHA1 Message Date
Christine Dodrill
1e83b97498
tstest/integration/vms: outgoing SSH test (#2349)
This does a few things:

1. Rewrites the tests so that we get a log of what individual tests
   failed at the end of a test run.
2. Adds a test that runs an HTTP server via the tester tailscale node and
   then has the VMs connect to that over Tailscale.
3. Dials the VM over Tailscale and ensures it answers SSH requests.
4. Other minor framework refactoring.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-08 11:38:01 -04:00
Christine Dodrill
97279a0fe0
tstest/integration/vms: add Oracle Linux image (#2328)
Oracle Linux[1] is a CentOS fork. It is not very special. I am adding it
to the integration jungle because I am adding it to pkgs and the website
directions.

[1]: https://www.oracle.com/linux/

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-08 10:26:20 -04:00
Brad Fitzpatrick
a9fc583211 cmd/tailscale/cli: document the web subcommand a bit more
Fixes #2326

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 21:16:33 -07:00
Josh Bleecher Snyder
0ad92b89a6 net/tstun: fix data races
To remove some multi-case selects, we intentionally allowed
sends on closed channels (cc23049cd286f137536d011b7840b3be5fdc8cb4).

However, we also introduced concurrent sends and closes,
which is a data race.

This commit fixes the data race. The mutexes here are uncontended,
and thus very cheap.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-07 16:15:29 -07:00
Brad Fitzpatrick
7d417586a8 tstest/integration: help bust cmd/go's test caching
It was caching too aggressively, as it didn't see our deps due to our
running "go install tailscaled" as a child process.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 13:14:21 -07:00
Brad Fitzpatrick
3dcd18b6c8 tailcfg: note RegionID 900-999 reservation 2021-07-07 12:23:41 -07:00
Brad Fitzpatrick
ddb8726c98 util/deephash: don't reflect.Copy if element type is a defined uint8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 11:58:04 -07:00
Brad Fitzpatrick
df176c82f5 util/deephash: skip alloc test under race detector
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 11:40:28 -07:00
Brad Fitzpatrick
6dc38ff25c util/deephash: optimize hashing of byte arrays, reduce allocs in Hash
name              old time/op    new time/op    delta
Hash-6               173µs ± 4%     101µs ± 3%   -41.69%  (p=0.000 n=10+9)
HashMapAcyclic-6     101µs ± 5%     105µs ± 3%    +3.52%  (p=0.001 n=9+10)
TailcfgNode-6       29.4µs ± 2%    16.4µs ± 3%   -44.25%  (p=0.000 n=8+10)

name              old alloc/op   new alloc/op   delta
Hash-6              3.60kB ± 0%    1.13kB ± 0%   -68.70%  (p=0.000 n=10+10)
HashMapAcyclic-6    2.53kB ± 0%    2.53kB ± 0%      ~     (p=0.137 n=10+8)
TailcfgNode-6         528B ± 0%        0B       -100.00%  (p=0.000 n=10+10)

name              old allocs/op  new allocs/op  delta
Hash-6                84.0 ± 0%      40.0 ± 0%   -52.38%  (p=0.000 n=10+10)
HashMapAcyclic-6       202 ± 0%       202 ± 0%      ~     (all equal)
TailcfgNode-6         11.0 ± 0%       0.0       -100.00%  (p=0.000 n=10+10)

Updates tailscale/corp#2130

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 11:30:49 -07:00
Brad Fitzpatrick
3962744450 util/deephash: prevent infinite loop on map cycle
Fixes #2340

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 10:57:46 -07:00
Brad Fitzpatrick
aceaa70b16 util/deephash: move funcs to methods
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-07 08:17:18 -07:00
Irshad Pananilath
9288e0d61c build_docker.sh: use build_dist.sh to inject version information
version.sh was removed in commit 5088af68. Use `build_dist.sh shellvars`
to provide version information instead.

Signed-off-by: Irshad Pananilath <pmirshad+code@gmail.com>
2021-07-07 06:38:04 -07:00
Christine Dodrill
a8360050e7
tstest/integration/vms: make first end to end test (#2332)
This makes sure `tailscale status` and `tailscale ping` works. It also
switches goexpect to use a batch instead of manually banging out each
line, which makes the tests so much easier to read.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-06 12:50:19 -04:00
David Crawshaw
805d5d3cde ipnlocal: move log line inside if statement
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-07-06 09:35:01 -07:00
Brad Fitzpatrick
14f901da6d util/deephash: fix sync.Pool usage
Whoops.

From yesterday's 9ae3bd09394c8d41bc065fdf8856739b468f51f7 (not yet
used by anything, fortunately)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-05 22:21:44 -07:00
Brad Fitzpatrick
e0258ffd92 util/deephash: use keyed struct literal, fix vet
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-05 21:31:30 -07:00
Brad Fitzpatrick
bf9f279768 util/deephash: optimize CPU a bit by by avoiding fmt in more places
name              old time/op    new time/op    delta
Hash-6               179µs ± 5%     173µs ± 4%   -3.12%  (p=0.004 n=10+10)
HashMapAcyclic-6     115µs ± 3%     101µs ± 5%  -11.51%  (p=0.000 n=9+9)
TailcfgNode-6       30.8µs ± 4%    29.4µs ± 2%   -4.51%  (p=0.000 n=10+8)

name              old alloc/op   new alloc/op   delta
Hash-6              3.60kB ± 0%    3.60kB ± 0%     ~     (p=0.445 n=9+10)
HashMapAcyclic-6    2.53kB ± 0%    2.53kB ± 0%     ~     (p=0.065 n=9+10)
TailcfgNode-6         528B ± 0%      528B ± 0%     ~     (all equal)

name              old allocs/op  new allocs/op  delta
Hash-6                84.0 ± 0%      84.0 ± 0%     ~     (all equal)
HashMapAcyclic-6       202 ± 0%       202 ± 0%     ~     (all equal)
TailcfgNode-6         11.0 ± 0%      11.0 ± 0%     ~     (all equal)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-05 21:28:54 -07:00
Brad Fitzpatrick
58f2ef6085 util/deephash: add a benchmark and some benchmark data
No code changes.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-05 21:21:52 -07:00
Brad Fitzpatrick
9ae3bd0939 util/deephash: export a Hash func for use by the control plane
name              old time/op    new time/op    delta
Hash-6              69.4µs ± 6%    68.4µs ± 4%     ~     (p=0.286 n=9+9)
HashMapAcyclic-6     115µs ± 5%     115µs ± 4%     ~     (p=1.000 n=10+10)

name              old alloc/op   new alloc/op   delta
Hash-6              2.29kB ± 0%    1.88kB ± 0%  -18.13%  (p=0.000 n=10+10)
HashMapAcyclic-6    2.53kB ± 0%    2.53kB ± 0%     ~     (all equal)

name              old allocs/op  new allocs/op  delta
Hash-6                58.0 ± 0%      54.0 ± 0%   -6.90%  (p=0.000 n=10+10)
HashMapAcyclic-6       202 ± 0%       202 ± 0%     ~     (all equal)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-05 11:41:44 -07:00
Brad Fitzpatrick
700badd8f8 util/deephash: move internal/deephash to util/deephash
No code changes. Just a minor package doc addition about lack of API
stability.
2021-07-02 21:33:02 -07:00
Josh Bleecher Snyder
7f095617f2 internal/deephash: 8 bits of output is not enough
Running hex.Encode(b, b) is a bad idea.
The first byte of input will overwrite the first two bytes of output.
Subsequent bytes have no impact on the output.

Not related to today's IPv6 bug, but...wh::ps.

This caused us to spuriously ignore some wireguard config updates.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-02 13:48:27 -07:00
Josh Bleecher Snyder
c35a832de6 net/tstun: add inner loop to poll
This avoids re-enqueuing to t.bufferConsumed,
which makes the code a bit clearer.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-02 11:02:12 -07:00
Josh Bleecher Snyder
a4cc7b6d54 net/tstun: simplify code
Calculate whether the packet is injected directly,
rather than via an else branch.

Unify the exit paths. It is easier here than duplicating them.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-02 11:02:12 -07:00
Josh Bleecher Snyder
cc23049cd2 net/tstun: remove multi-case selects from hot code
Every TUN Read went through several multi-case selects.
We know from past experience with wireguard-go that these are slow
and cause scheduler churn.

The selects served two purposes: they separated errors from data and
gracefully handled shutdown. The first is fairly easy to replace by sending
errors and data over a single channel. The second, less so.

We considered a few approaches: Intricate webs of channels,
global condition variables. They all get ugly fast.

Instead, let's embrace the ugly and handle shutdown ungracefully.
It's horrible, but the horror is simple and localized.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-02 11:02:12 -07:00
Denton Gentry
64ee6cf64b api.md: update preview example
The implementation of the preview function has changed since the
API was documented, update the document to match.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-07-02 08:24:19 -07:00
Brad Fitzpatrick
1e6d8a1043 version: don't allocate parsing unsupported versions, empty strings
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-01 14:25:50 -07:00
Josh Bleecher Snyder
f11a8928a6 ipn/ipnlocal: fix data race
We can't access b.netMap without holding b.mu.
We already grabbed it earlier in the function with the lock held.

Introduced in Nov 2020 in 7ea809897df1764ea420be2ff6ae58459b0e6902.
Discovered during stress testing.
Apparently it's a pretty rare?

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-01 12:29:02 -07:00
Christine Dodrill
5813da885c
tstest/integration/vms: verbosify nixos logs to fs, disable unstable (#2294)
This puts nix build logs on the filesystem so that we can debug them
later. This also disables nixos unstable until
https://github.com/NixOS/nixpkgs/issues/128783 is fixed.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-30 13:38:28 -04:00
David Crawshaw
6b9f8208f4 net/dns: do not run wsl.exe as LocalSystem
It doesn't work. It needs to run as the user.

	https://github.com/microsoft/WSL/issues/4803

The mechanism for doing this was extracted from:

	https://web.archive.org/web/20101009012531/http://blogs.msdn.com/b/winsdk/archive/2009/07/14/launching-an-interactive-process-from-windows-service-in-windows-vista-and-later.aspx

While here, we also reclaculate WSL distro set on SetDNS.
This accounts for:

	1. potential inability to access wsl.exe on startup
	2. WSL being installed while Tailscale is running
	3. A new WSL distrobution being installed

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-06-30 10:11:33 -07:00
Christine Dodrill
6f3a5802a6
experimental VM test: add -v
Apparently if you don't add -v the tests don't report anything useful when they break. Joy.

Signed-Off-By: Christine Dodrill <xe@tailscale.com>
2021-06-30 09:28:58 -04:00
Maisem Ali
ec52760a3d wgengine/router_windows: support toggling local lan access when using
exit nodes.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-06-29 09:22:10 -07:00
David Crawshaw
c37713b927 cmd/tailscale/cli: accept login server synonym
Fixes #2272

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-06-29 07:20:02 -07:00
julianknodt
e68d4d5805 cmd/tailscale: add debug flag to dump derp map
This adds a flag in tailscale debug for dumping the derp map to stdout.

Fixes #2249.

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-06-28 22:50:59 -07:00
Brad Fitzpatrick
fd7fddd44f control/controlclient: add debug knob to force node to only IPv6 self addr
Updates #2268

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-28 15:26:58 -07:00
Brad Fitzpatrick
722859b476 wgengine/netstack: make SOCKS5 resolve names to IPv6 if self node when no IPv4
For instance, ephemeral nodes with only IPv6 addresses can now
SOCKS5-dial out to names like "foo" and resolve foo's IPv6 address
rather than foo's IPv4 address and get a "no route"
(*tcpip.ErrNoRoute) error from netstack's dialer.

Per https://github.com/tailscale/tailscale/issues/2268#issuecomment-870027626
which is only part of the isuse.

Updates #2268

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-28 15:20:37 -07:00
David Crawshaw
1147c7fd4f net/dns: set WSL /etc/resolv.conf
We also have to make a one-off change to /etc/wsl.conf to stop every
invocation of wsl.exe clobbering the /etc/resolv.conf. This appears to
be a safe change to make permanently, as even though the resolv.conf is
constantly clobbered, it is always the same stable internal IP that is
set as a nameserver. (I believe the resolv.conf clobbering predates the
MS stub resolver.)

Tested on WSL2, should work for WSL1 too.

Fixes #775

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-06-28 14:18:15 -07:00
David Crawshaw
9b063b86c3 net/dns: factor directManager out over an FS interface
This is preliminary work for using the directManager as
part of a wslManager on windows, where in addition to configuring
windows we'll use wsl.exe to edit the linux file system and modify the
system resolv.conf.

The pinholeFS is a little funky, but it's designed to work through
simple unix tools via wsl.exe without invoking bash. I would not have
thought it would stand on its own like this, but it turns out it's
useful for writing a test for the directManager.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-06-28 14:18:15 -07:00
julianknodt
506c2fe8e2 cmd/tailscale: make netcheck use active DERP map, delete static copy
After allowing for custom DERP maps, it's convenient to be able to see their latency in
netcheck. This adds a query to the local tailscaled for the current DERPMap.

Updates #1264

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-06-28 14:08:47 -07:00
Brad Fitzpatrick
15677d8a0e net/socks5/tssocks: add a SOCKS5 dialer type, method-ifying code
https://twitter.com/bradfitz/status/1409605220376580097

Prep for #1970, #2264, #2268

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-28 13:12:42 -07:00
Brad Fitzpatrick
3910c1edaf net/socks5/tssocks: add new package, move SOCKS5 glue out of tailscaled
Prep for #1970, #2264, #2268

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-28 11:34:50 -07:00
Brad Fitzpatrick
5e19ac7adc tstest/integration: always run SOCK5 server, parse out its listening address
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-28 11:34:41 -07:00
David Crawshaw
54199d9d58 controlclient: log server key and URL
Turns out we never reliably log the control plane URL a client connects
to. Do it here, and include the server public key, which might
inadvertently tell us something interesting some day.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-06-28 09:38:23 -07:00
David Crawshaw
d6f4b5f5cb ipn, etc: use controlplane.tailscale.com
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-06-28 09:38:23 -07:00
Brad Fitzpatrick
82e15d3450 cmd/tailscaled: log SOCKS5 port when port 0 requested
Part of #2158

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-28 08:32:50 -07:00
Christine Dodrill
2adbfc920d
integration vm tests: run on every commit to main (#2159)
This is an experiment to see how often this test would fail if we run it
on every commit. This depends on #2145 to fix a flaky part of the test.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-28 10:01:30 -04:00
Christine Dodrill
b131a74f99
tstest/integration/vms: build and run NixOS (#2190)
Okay, so, at a high level testing NixOS is a lot different than
other distros due to NixOS' determinism. Normally NixOS wants packages to
be defined in either an overlay, a custom packageOverrides or even
yolo-inline as a part of the system configuration. This is going to have
us take a different approach compared to other distributions. The overall
plan here is as following:

1. make the binaries as normal
2. template in their paths as raw strings to the nixos system module
3. run `nixos-generators -f qcow -o $CACHE_DIR/tailscale/nixos/version -c generated-config.nix`
4. pass that to the steps that make the virtual machine

It doesn't really make sense for us to use a premade virtual machine image
for this as that will make it harder to deterministically create the image.

Nix commands generate a lot of output, so their output is hidden behind the
`-verbose-nix-output` flag.

This unfortunately makes this test suite have a hard dependency on
Nix/NixOS, however the test suite has only ever been run on NixOS (and I
am not sure if it runs on other distros at all), so this probably isn't too
big of an issue.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-28 09:45:45 -04:00
julianknodt
72a0b5f042 net/dns/resolver: fmt item
This has been bothering me for a while, but everytime I run format from the root directory
it also formats this file. I didn't want to add it to my other PRs but it's annoying to have to
revert it every time.

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-06-27 23:57:55 -07:00
Brad Fitzpatrick
10d7c2583c net/dnsfallback: don't depend on derpmap.Prod
Move derpmap.Prod to a static JSON file (go:generate'd) instead,
to make its role explicit. And add a TODO about making dnsfallback
use an update-over-time DERP map file instead of a baked-in one.

Updates #1264

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-06-27 22:07:40 -07:00
Christine Dodrill
194d5b8412
tstest/integration/vms: add in-process DERP server (#2108)
Previously this test would reach out to the public DERP servers in order
to help machines connect with eachother. This is not ideal given our
plans to run these tests completely disconnected from the internet. This
patch introduces an in-process DERP server running on its own randomly
assigned HTTP port.

Updates #1988

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-25 15:59:45 -04:00
Christine Dodrill
6b234323a0
tstest/integration/vms: fix flake when testing (#2145)
Occasionally the test framework would fail with a timeout due to a
virtual machine not phoning home in time. This seems to be happen
whenever qemu can't bind the VNC or SSH ports for a virtual machine.
This was fixed by taking the following actions:

1. Don't listen on VNC unless the `-use-vnc` flag is passed, this
   removes the need to listen on VNC at all in most cases. The option to
   use VNC is still left in for debugging virtual machines, but removing
   this makes it easier to deal with (VNC uses this odd system of
   "displays" that are mapped to ports above 5900, and qemu doesn't
   offer a decent way to use a normal port number, so we just disable
   VNC by default as a compromise).
2. Use a (hopefully) inactive port for SSH. In an ideal world I'd just
   have the VM's SSH port be exposed via a Unix socket, however the QEMU
   documentation doesn't really say if you can do this or not. While I
   do more research, this stopgap will have to make do.
3. Strictly tie more VM resource lifetimes to the tests themselves.
   Previously the disk image layers for virtual machines were only
   cleaned up at the end of the test and existed in the parent
   test-scoped temporary folder. This can make your tmpfs run out of
   space, which is not ideal. This should minimize the use of temporary
   storage as much as I know how to.
4. Strictly tie the qemu process lifetime to the lifetime of the test
   using testing.T#Cleanup. Previously it used a defer statement to
   clean up the qemu process, however if the tests timed out this defer
   was not run. This left around an orphaned qemu process that had to be
   killed manually. This change ensures that all qemu processes exit
   when their relevant tests finish.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-06-25 14:45:12 -04:00