This raises the maximum DNS response message size from 512 to 4095. This
should be large enough for almost all situations that do not need TCP.
We still do not recognize EDNS, so we will still forward requests that
claim support for a larger response size than 4095 (that will be solved
later). For now, when a response comes back that is too large to fit in
our receive buffer, we now set the truncation flag in the DNS header,
which is an improvement from before but will prompt attempts to use TCP
which isn't supported yet.
On Windows, WSARecvFrom into a buffer that's too small returns an error
in addition to the data. On other OSes, the extra data is silently
discarded. In this case, we prefer the latter so need to catch the error
on Windows.
Partially addresses #1123
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
This runner is in my homelab while we muse about a better, more
permanent home for these tests.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This makes integration tests pull pristine VM images from Amazon S3 if
they don't exist on disk. If the S3 fetch fails, it will fall back to
grabbing the image from the public internet. The VM images on the public
internet are known to be updated without warning and thusly change their
SHA256 checksum. This is not ideal for a test that we want to be able to
fire and forget, then run reliably for a very long time.
This requires an AWS profile to be configured at the default path. The
S3 bucket is rigged so that the requester pays. The VM images are
currently about 6.9 gigabytes. Please keep this in mind when running
these tests on your machine.
Documentation was added to the integration test folder to aid others in
running these tests on their machine.
Some wording in the logs of the tests was altered.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Some downstream distros eval'd version/version.sh to get at the shell variables
within their own build process. They can now `./build_dist.sh shellvars` to get
those.
Fixes#2058.
Signed-off-by: David Anderson <dave@natulte.net>
It is a bit faster.
But more importantly, it matches upstream byte-for-byte,
which ensures there'll be no corner cases in which we disagree.
name old time/op new time/op delta
SetPeers-8 3.58µs ± 0% 3.16µs ± 2% -11.74% (p=0.016 n=4+5)
name old alloc/op new alloc/op delta
SetPeers-8 2.53kB ± 0% 2.53kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
SetPeers-8 99.0 ± 0% 99.0 ± 0% ~ (all equal)
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The image downloads can take a significant amount of time for the tests.
This creates a new test that will download every distro image into the
local cache in parallel, optionally matching the distribution regex.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
I've run into a couple issues where the tests time out while a VM image
is being downloaded, making the cache poisoned for the next run. This
moves the hash checking into its own function and calls it much sooner
in the testing chain. If the hash check fails, the OS is redownloaded.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Most of the time qemu will output nothing when it is running. This is
expected behavior. However when qemu is unable to start due to some
problem, it prints that to either stdout or stderr. Previously this
output wasn't being captured. This patch captures that output to aid in
debugging qemu issues.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
We were crashing on in initPeerAPIListener when called from
authReconfig when b.netMap is nil. But authReconfig already returns
before the call to initPeerAPIListener when b.netMap is nil, but it
releases the b.mu mutex before calling initPeerAPIListener which
reacquires it and assumes it's still nil.
The only thing that can be setting it to nil is setNetMapLocked, which
is called by ResetForClientDisconnect, Logout/logout, or Start, all of
which can happen during an authReconfig.
So be more defensive.
Fixes#1996
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We used to use "redo" for that, but it was pretty vague.
Also, fix the build tags broken in interfaces_default_route_test.go from
a9745a0b68, moving those Linux-specific
tests to interfaces_linux_test.go.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously this built the binaries for every distro. This is a bit
overkill given we are using static binaries. This patch makes us only
build once.
There was also a weird issue with how processes were being managed.
Previously we just killed qemu with Process.Kill(), however that was
leaving behind zombies. This has been mended to not only kill qemu but
also waitpid() the process so it doesn't become a zombie.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
The OpenSUSE 15.1 image we are using (and conseqentially the only one
that is really available easily given it is EOL) has cloud-init
hardcoded to use the OpenStack metadata thingy. Other OpenSUSE Leap
images function fine with the NoCloud backend, but this one seems to
just not work with it. No bother, we can just pretend to be OpenStack.
Thanks to Okami for giving me an example OpenStack configuration seed
image.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Arch is a bit of a weirder distro, however as a side effect it is much
more of a systemd purist experience. Adding it to our test suite will
make sure that we are working in the systemd happy path.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
This distro is about to be released. OpenSUSE has historically had the
least coverage for functional testing, so this may prove useful in the
future.
Signed-off-by: Christine Dodrill <xe@tailscale.com>
DNS names consist of labels, but outside of length limits, DNS
itself permits any content within the labels. Some records require
labels to conform to hostname limitations (which is what we implemented
before), but not all.
Fixes#2024.
Signed-off-by: David Anderson <danderson@tailscale.com>
Instead of testing all the VMs at once when they are all ready, this
patch changes the testing logic so that the vms are tested as soon as
they register with testcontrol. Also limit the amount of VM ram used at
once with the `-ram-limit` flag. That uses a semaphore to guard resource
use.
Also document CentOS' sins.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
The resulting empty Prefs had AllowSingleHosts=false and
Routeall=false, so that on iOS if you did these steps:
- Login and leave running
- Terminate the frontend
- Restart the frontend (fast path restart, missing prefs)
- Set WantRunning=false
- Set WantRunning=true
...then you would have Tailscale running, but with no routes. You would
also accidentally disable the ExitNodeID/IP prefs (symptom: the current
exit node setting didn't appear in the UI), but since nothing
else worked either, you probably didn't notice.
The fix was easy enough. It turns out we already knew about the
problem, so this also fixes one of the BUG entries in state_test.
Fixes: #1918 (BUG-1) and some as-yet-unreported bugs with exit nodes.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Previously, there was no server round trip required to log out, so when
you asked ipnlocal to Logout(), it could clear the netmap immediately
and switch to NeedsLogin state.
In v1.8, we added a true Logout operation. ipn.Logout() would trigger
an async cc.StartLogout() and *also* immediately switch to NeedsLogin.
Unfortunately, some frontends would see NeedsLogin and immediately
trigger a new StartInteractiveLogin() operation, before the
controlclient auth state machine actually acted on the Logout command,
thus accidentally invalidating the entire logout operation, retaining
the netmap, and violating the user's expectations.
Instead, add a new LogoutFinished signal from controlclient
(paralleling LoginFinished) and, upon starting a logout, don't update
the ipn state machine until it's received.
Updates: #1918 (BUG-2)
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If you set `-distro-regex` to match a subset of distros, only those
distros will be tested. Ex:
$ go test -run-vm-tests -distro-regex='opensuse'
Signed-off-by: Christine Dodrill <xe@tailscale.com>
Don't try to do heuristics on the name. Use the net/interfaces package
which we already have to do this sort of stuff.
Fixes#2011
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of pulling packages from pkgs.tailscale.com, we should use the
tailscale binaries that are local to this git commit. This exposes a bit
of the integration testing stack in order to copy the binaries
correctly.
This commit also bumps our version of github.com/pkg/sftp to the latest
commit.
If you run into trouble with yaml, be sure to check out the
commented-out alpine linux image complete with instructions on how to
use it.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
netaddr allocated at the time this was written. No longer.
name old time/op new time/op delta
TailscaleServiceAddr-4 5.46ns ± 4% 1.83ns ± 3% -66.52% (p=0.008 n=5+5)
A bunch of the others can probably be simplified too, but this
was the only one with just an IP and not an IPPrefix.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we spewed a lot of output to stdout and stderr, even when
`-v` wasn't set. This is sub-optimal for various reasons. This patch
shunts that output to test logs so it only shows up when `-v` is set.
Updates #1988
Signed-off-by: Christine Dodrill <xe@tailscale.com>
The cyolosecurity fork of certstore did not update its module name and
thus can only be used with a replace directive. This interferes with
installing using `go install` so I created a tailscale fork with an
updated module name.
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>