3332 Commits

Author SHA1 Message Date
Josh Bleecher Snyder
9da4181606 tstime/rate: new package
This is a simplified rate limiter geared for exactly our needs:
A fast, mono.Time-based rate limiter for use in tstun.
It was generated by stripping down the x/time/rate rate limiter
to just our needs and switching it to use mono.Time.

It removes one time.Now call per packet.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder
f6e833748b wgengine: use mono.Time
Migrate wgengine to mono.Time for performance-sensitive call sites.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder
8a3d52e882 wgengine/magicsock: use mono.Time
magicsock makes multiple calls to Now per packet.
Move to mono.Now. Changing some of the calls to
use package mono has a cascading effect,
causing non-per-packet call sites to also switch.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder
c2202cc27c net/tstun: use mono.Time
There's a call to Now once per packet.
Move to mono.Now.

Though the current implementation provides high precision,
we document it to be coarse, to preserve the ability
to switch to a coarse monotonic time later.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder
142670b8c2 tstime/mono: new package
Package mono provides a fast monotonic time.

Its primary advantage is that it is fast:
It is approximately twice as fast as time.Now.
This is because time.Now uses two clock calls,
one for wall time and one for monotonic time.

We ask for the current time 4-6 times per network packet.
At ~50ns per call to time.Now, that's enough to show
up in CPU profiles.

Package mono is a first step towards addressing that.
It is designed to be a near drop-in replacement for package time.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:58 -07:00
Josh Bleecher Snyder
881bb8bcdc net/dns/resolver: allow an extra alloc for go closure allocation
Go 1.17 switches to a register ABI on amd64 platforms.
Part of that switch is that go and defer calls use an argument-less
closure, which allocates. This means that we have an extra
alloc in some DNS work. That's unfortunate but not a showstopper,
and I don't see a clear path to fixing it.
The other performance benefits from the register ABI will all
but certainly outweigh this extra alloc.

Fixes #2545

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-29 12:56:28 -07:00
Brad Fitzpatrick
b6179b9e83 net/dnsfallback: add new nodes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-29 10:50:49 -07:00
Pratik
2d35737a7a
Dockerfile: remove extra COPY step (#2355)
Signed-off-by: pratikbalar <pratik@improwised.com>
2021-07-28 11:07:50 -07:00
Aaron Bieber
c179b9b535 cmd/tsshd: switch from github.com/kr/pty to github.com/creack/pty
The kr/pty module moved to creack/pty per the kr/pty README[1].

creack/pty brings in support for a number of OS/arch combos that
are lacking in kr/pty.

Run `go mod tidy` while here.

[1] https://github.com/kr/pty/blob/master/README.md

Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
2021-07-28 09:14:47 -07:00
Brad Fitzpatrick
690ade4ee1 ipn/ipnlocal: add URL to IP forwarding error message
Updates #606

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-28 08:00:53 -07:00
David Crawshaw
f414a9cc01 net/dns/resolver: EDNS OPT record off-by-one
I don't know how to get access to a real packet. Basing this commit
entirely off:

       +------------+--------------+------------------------------+
       | Field Name | Field Type   | Description                  |
       +------------+--------------+------------------------------+
       | NAME       | domain name  | MUST be 0 (root domain)      |
       | TYPE       | u_int16_t    | OPT (41)                     |
       | CLASS      | u_int16_t    | requestor's UDP payload size |
       | TTL        | u_int32_t    | extended RCODE and flags     |
       | RDLEN      | u_int16_t    | length of all RDATA          |
       | RDATA      | octet stream | {attribute,value} pairs      |
       +------------+--------------+------------------------------+

From https://datatracker.ietf.org/doc/html/rfc6891#section-6.1.2

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-07-27 16:39:27 -07:00
Josh Bleecher Snyder
1034b17bc7 net/tstun: buffer outbound channel
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.

Making outbound buffered increases throughput.

It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-27 15:54:34 -07:00
Josh Bleecher Snyder
965dccd4fc net/tstun: buffer outbound channel
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.

Making outbound buffered increases throughput.

It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-27 15:54:34 -07:00
Brad Fitzpatrick
7b9f02fcb1 cmd/tailscale/cli: document that empty string disable exit nodes, routes
Updates #2529

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-27 13:00:50 -07:00
Brad Fitzpatrick
d8d9036dbb tailcfg: add Node.PrimaryRoutes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-27 12:09:40 -07:00
Brad Fitzpatrick
1b14e1d6bd version: bump date
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-27 08:05:17 -07:00
Denton Gentry
bf7ad05230 VERSION.txt: this is v1.13.0.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-07-27 07:15:59 -07:00
Brad Fitzpatrick
68df379a7d net/portmapper: rename ErrGatewayNotFound to ErrGatewayRange, reword text
It confused & scared people. And it was just bad.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 20:30:28 -07:00
Brad Fitzpatrick
aaf2df7ab1 net/{dnscache,interfaces}: use netaddr.IP.IsPrivate, delete copied code
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 20:30:28 -07:00
Christine Dodrill
dde8e28f00 disable vm tests on every commit to main
This experiment apparently failed.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-26 16:42:56 -07:00
Brad Fitzpatrick
c17d743886 net/dnscache: update a comment
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 16:16:08 -07:00
Brad Fitzpatrick
281d503626 net/dnscache: make Dialer try all resolved IPs
Tested manually with:

$ go test -v ./net/dnscache/ -dial-test=bogusplane.dev.tailscale.com:80

Where bogusplane has three A records, only one of which works.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 15:44:32 -07:00
Brad Fitzpatrick
dfa5e38fad control/controlclient: report whether we're in a snap package
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 15:16:40 -07:00
Brad Fitzpatrick
e299300b48 net/dnscache: cache all IPs per hostname
Not yet used in the dialer, but plumbed around.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 12:27:46 -07:00
Brad Fitzpatrick
7428ecfebd ipn/ipnlocal: populate Hostinfo.Package on Android
Fixes tailscale/corp#2266

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 10:35:37 -07:00
Brad Fitzpatrick
5c266bdb73 wgengine: re-set DNS config on Linux after a major link change
Updates #2458 (maybe fixes it)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-26 08:01:27 -07:00
julianknodt
3377089583 tsweb: add float64 to logged metrics
A previously added metric which was float64 was being ignored in tsweb, because it previously
only accepted int64 and ints. It can be handled in the same way as ints.

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-07-25 21:02:36 -07:00
Brad Fitzpatrick
53a2f63658 net/dns/resolver: race well-known resolvers less aggressively
Instead of blasting away at all upstream resolvers at the same time,
make a timing plan upon reconfiguration and have each upstream have an
associated start delay, depending on the overall forwarding config.

So now if you have two or four upstream Google or Cloudflare DNS
servers (e.g. two IPv4 and two IPv6), we now usually only send a
query, not four.

This is especially nice on iOS where we start fewer DoH queries and
thus fewer HTTP/1 requests (because we still disable HTTP/2 on iOS),
fewer sockets, fewer goroutines, and fewer associated HTTP buffers,
etc, saving overall memory burstiness.

Fixes #2436
Updates tailscale/corp#2250
Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-25 20:45:47 -07:00
Brad Fitzpatrick
e94ec448a7 net/dns/resolver: add forwardQuery type as race work prep
Add a place to hang state in a future change for #2436.
For now this just simplifies the send signature without
any functional change.

Updates #2436

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-25 15:43:49 -07:00
Brad Fitzpatrick
064b916b1a net/dns/resolver: fix func used as netaddr.IP in printf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-25 15:21:51 -07:00
Joe Tsai
d145c594ad
util/deephash: improve cycle detection (#2470)
The previous algorithm used a map of all visited pointers.
The strength of this approach is that it quickly prunes any nodes
that we have ever visited before. The detriment of the approach
is that pruning is heavily dependent on the order that pointers
were visited. This is especially relevant for hashing a map
where map entries are visited in a non-deterministic manner,
which would cause the map hash to be non-deterministic
(which defeats the point of a hash).

This new algorithm uses a stack of all visited pointers,
similar to how github.com/google/go-cmp performs cycle detection.
When we visit a pointer, we push it onto the stack, and when
we leave a pointer, we pop it from the stack.
Before visiting a pointer, we first check whether the pointer exists
anywhere in the stack. If yes, then we prune the node.
The detriment of this approach is that we may hash a node more often
than before since we do not prune as aggressively.

The set of visited pointers up until any node is only the
path of nodes up to that node and not any other pointers
that may have been visited elsewhere. This provides us
deterministic hashing regardless of visit order.
We can now delete hashMapFallback and associated complexity,
which only exists because the previous approach was non-deterministic
in the presence of cycles.

This fixes a failure of the old algorithm where obviously different
values are treated as equal because the pruning was too aggresive.
See https://github.com/tailscale/tailscale/issues/2443#issuecomment-883653534

The new algorithm is slightly slower since it prunes less aggresively:
	name              old time/op    new time/op    delta
	Hash-8              66.1µs ± 1%    68.8µs ± 1%   +4.09%        (p=0.000 n=19+19)
	HashMapAcyclic-8    63.0µs ± 1%    62.5µs ± 1%   -0.76%        (p=0.000 n=18+19)
	TailcfgNode-8       9.79µs ± 2%    9.88µs ± 1%   +0.95%        (p=0.000 n=19+17)
	HashArray-8          643ns ± 1%     653ns ± 1%   +1.64%        (p=0.000 n=19+19)
However, a slower but more correct algorithm seems
more favorable than a faster but incorrect algorithm.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-07-22 15:22:48 -07:00
Brad Fitzpatrick
7b295f3d21 net/portmapper: disable UPnP on iOS for now
Updates #2495

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 13:33:38 -07:00
Brad Fitzpatrick
4a2c3e2a0a control/controlclient: grow goroutine debug buffer as needed
To not allocate 1MB up front on iOS.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 13:18:05 -07:00
Brad Fitzpatrick
1986d071c3 control/controlclient: don't use regexp in goroutine stack scrubbing
To reduce binary size on iOS.

Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 13:18:05 -07:00
Christine Dodrill
60f34c70a2
tstest/integration/vms: disable rDNS for sshd on centos (#2492)
This prevents centos tests from timing out because sshd does reverse dns
lookups on every session being established instead of doing it once on
the acutal ssh connection being established. This is odd. Appending this
to the sshd config and restarting it seems to fix it though.

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-22 15:24:52 -04:00
Christine Dodrill
8db26a2261
tstest/integration/vms: disable nixos unstable (#2491)
cloud-init broke with the upgrade to python 3.9:
https://github.com/NixOS/nixpkgs/issues/131098

Signed-off-by: Christine Dodrill <xe@tailscale.com>
2021-07-22 15:16:11 -04:00
Brad Fitzpatrick
cecfc14875 net/dns: don't build init*.go on non-windows
To remove the regexp dep on iOS, notably.

Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 11:58:42 -07:00
Brad Fitzpatrick
2968893add net/dns/resolver: bound DoH usage on iOS
Updates tailscale/corp#2238

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-22 10:54:24 -07:00
Brad Fitzpatrick
95a9adbb97 wgengine/netstack: implement UDP relaying to advertised subnets
TCP was done in 662fbd4a09664e849f0b898d1e8df13325d36efa.

This does the same for UDP.

Tested by hand. Integration tests will have to come later. I'd wanted
to do it in this commit, but the SOCKS5 server needed for interop
testing between two userspace nodes doesn't yet support UDP and I
didn't want to invent some whole new userspace packet injection
interface at this point, as SOCKS seems like a better route, but
that's its own bug.

Fixes #2302

RELNOTE=netstack mode can now UDP relay to subnets

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 22:32:26 -07:00
Brad Fitzpatrick
3daf27eaad net/dns/resolver: fall back to IPv6 for well-known DoH servers if v4 fails
Should help with IPv6-only environments when the tailnet admin
only specified IPv4 DNS IPs.

See https://github.com/tailscale/tailscale/issues/2447#issuecomment-884188562

Co-Author: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 12:45:25 -07:00
Brad Fitzpatrick
74eee4de1c net/dns/resolver: use correct Cloudflare DoH hostnames
We were using the wrong ones for the malware & adult content
variants. Docs:

https://developers.cloudflare.com/1.1.1.1/1.1.1.1-for-families/setup-instructions/dns-over-https

Earlier commit which added them:
236eb4d04d33c43b0d73fb7372353cb26b62421b

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 12:24:36 -07:00
Joe Tsai
d666bd8533
util/deephash: disambiguate hashing of AppendTo (#2483)
Prepend size to AppendTo output.

Fixes #2443

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-07-21 11:29:08 -07:00
Joe Tsai
23ad028414
util/deephash: include type as part of hash for interfaces (#2476)
A Go interface may hold any number of different concrete types.
Just because two underlying values hash to the same thing
does not mean the two values are identical if they have different
concrete types. As such, include the type in the hash.
2021-07-21 10:26:04 -07:00
julianknodt
3a4201e773 net/portmapper: return correct upnp port
Previously, this was incorrectly returning the internal port, and using that with the external
exposed IP when it did not use WANIPConnection2. In the case when we must provide a port, we
return it instead.

Noticed this while implementing the integration test for upnp.

Signed-off-by: julianknodt <julianknodt@gmail.com>
2021-07-21 10:11:47 -07:00
Joe Tsai
a5fb8e0731
util/deephash: introduce deliberate instability (#2477)
Seed the hash upon first use with the current time.
This ensures that the stability of the hash is bounded within
the lifetime of one program execution.
Hopefully, this prevents future bugs where someone assumes that
this hash is stable.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-07-21 09:23:04 -07:00
Brad Fitzpatrick
ecac74bb65 wgengine/netstack: fix doc comment
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-21 08:25:05 -07:00
Brad Fitzpatrick
e4fecfe31d wgengine/{monitor,router}: restore Linux ip rules when systemd deletes them
Thanks.

Fixes #1591

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-20 15:52:22 -07:00
Josh Bleecher Snyder
0aa77ba80f tstest/integration: fix filch test flake
Filch doesn't like having multiple processes competing
for the same log files (#937).

Parallel integration tests were all using the same log files.

Add a TS_LOGS_DIR env var that the integration test can use
to use separate log files per test.

Fixes #2269

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-20 14:16:28 -07:00
Brad Fitzpatrick
ed8587f90d wgengine/router: take a link monitor
Prep for #1591 which will need to make Linux's router react to changes
that the link monitor observes.

The router package already depended on the monitor package
transitively. Now it's explicit.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-07-20 13:43:40 -07:00
Josh Bleecher Snyder
24db1a3c9b safesocket: print full lsof command on failure
This makes it easier to manually run the command
to discover why it is failing.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-07-20 13:35:31 -07:00