tailscaled was using 100% CPU on a machine with ~1M lines, 100MB+
of /proc/net/route data.
Two problems: in likelyHomeRouterIPLinux, we didn't stop reading the
file once we found the default route (which is on the first non-header
line when present), so it was finding the answer and then parsing
100MB over 1M lines unnecessarily. Second, if the default route isn't
present, it'd read to the end of the file looking for it. If it's not
in the first 1,000 lines, it ain't coming, or at least isn't worth
having. (It's only used for discovering a potential UPnP/PMP/PCP
server, which is very unlikely to be present in the environment of a
machine with a ton of routes.)
Change-Id: I2c4a291ab7f26aedc13885d79237b8f05c2fd8e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was broken on Windows:
Error: util\winutil\winutil_windows.go:15:7: regBase redeclared in this block
Error: D:\a\tailscale\tailscale\util\winutil\winutil_notwindows.go:7:17: previous declaration
Error: util\winutil\winutil_windows.go:29:6: getRegString redeclared in this block
Error: D:\a\tailscale\tailscale\util\winutil\winutil_notwindows.go:9:40: previous declaration
Error: util\winutil\winutil_windows.go:47:6: getRegInteger redeclared in this block
Error: D:\a\tailscale\tailscale\util\winutil\winutil_notwindows.go:11:48: previous declaration
Error: util\winutil\winutil_windows.go:77:6: isSIDValidPrincipal redeclared in this block
Error: D:\a\tailscale\tailscale\util\winutil\winutil_notwindows.go:13:38: previous declaration
Change-Id: Ib1ce4b647f5711547840c736b933a6c42bf09583
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Our current workaround made the user check too lax, thus allowing deleted
users. This patch adds a helper function to winutil that checks that the
uid's SID represents a valid Windows security principal.
Now if `lookupUserFromID` determines that the SID is invalid, we simply
propagate the error.
Updates https://github.com/tailscale/tailscale/issues/869
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
We're finding a bunch of host operating systems/firewalls interact poorly
with peerapi. We either get ICMP errors from the host or users need to run
commands to allow the peerapi port:
https://github.com/tailscale/tailscale/issues/3842#issuecomment-1025133727
... even though the peerapi should be an internal implementation detail.
Rather than fight the host OS & firewalls, this change handles the
server side of peerapi entirely in netstack (except on iOS), so it
never makes its way to the host OS where it might be messed with. Two
main downsides are:
1) netstack isn't as fast, but we don't really need speed for peerapi.
And with fewer trips to/from the kernel, we might make up for some
of the netstack performance loss by staying in userspace.
2) tcpdump / Wireshark etc packet captures will no longer see the peerapi
traffic. Oh well. Crawshaw's been wanting to add packet capture server
support to tailscaled, so we'll probably do that sooner now.
A future change might also then use netstack for the client side of
peerapi (except on iOS).
Updates #3842 (probably fixes, as well as many exit node issues I bet)
Change-Id: Ibc25edbb895dc083d1f07bd3cab614134705aa39
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Also fix a somewhat related printing bug in the process where
some paths would print "Success." inconsistently even
when there otherwise was no output (in the EditPrefs path)
Fixes #3830
Updates #3702 (which broke it once while trying to fix it)
Change-Id: Ic51e14526ad75be61ba00084670aa6a98221daa5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now that Go 1.17 has module graph pruning
(https://go.dev/doc/go1.17#go-command), we should be able to use
upstream netstack without breaking our private repo's build
that then depends on the tailscale.com Go module.
This is that experiment.
Updates #1518 (the original bug to break out netstack to own module)
Updates #2642 (this updates netstack, but doesn't remove workaround)
Change-Id: I27a252c74a517053462e5250db09f379de8ac8ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Salamanders also have no scales. I checked the interweb, and there
doesn't seem to be any subspecies that would let us claim that
*some* salamanders are scaley.
But they are tailey, for sure.
Signed-off-by: David Anderson <danderson@tailscale.com>
So you can run Caddy etc as a non-root user and let it have access to
get certs.
Updates caddyserver/caddy#4541
Change-Id: Iecc5922274530e2b00ba107d4b536580f374109b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So Linux/etc CLI users get helpful advice to run tailscale
with --operator=$USER when they try to 'tailscale file {cp,get}'
but are mysteriously forbidden.
Signed-off-by: David Eger <eger@google.com>
Signed-off-by: David Eger <david.eger@gmail.com>
Disabled by default.
To use, run tailscaled with:
TS_SSH_ALLOW_LOGIN=you@bar.com
And enable with:
$ TAILSCALE_USE_WIP_CODE=true tailscale up --ssh=true
Then ssh [any-user]@[your-tailscale-ip] for a root bash shell.
(both the "root" and "bash" part are temporary)
Updates #3802
Change-Id: I268f8c3c95c8eed5f3231d712a5dc89615a406f0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.
Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Surveying the fleet prior to turning off old/unused/insecure
TLS versions.
Updates tailscale/corp#3615
Signed-off-by: David Anderson <danderson@tailscale.com>
Mudpuppies are salamanders, and as such have tails but no scales.
The management apologizes for the error.
Signed-off-by: David Anderson <danderson@tailscale.com>
Currently only search domains are stored. This was an oversight
(under?) on my part.
As things are now, when MagicDNS is on and "Override local DNS" is
off, the DNS forwarder has to time out before names resolve. This
introduces a pretty annoying lag that makes everything feel
extremely slow. You will also see an error: "upstream nameservers
not set".
I tested with "Override local DNS" on and off. In both situations
things seem to function as expected (and quickly).
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
This fixes a deadlock on shutdown.
One goroutine is waiting to send on c.derpRecvCh before unlocking c.mu.
The other goroutine is waiting to lock c.mu before receiving from c.derpRecvCh.
#3736 has a more detailed explanation of the sequence of events.
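
For illustration, one common way to avoid this shape of deadlock is to
never block on a channel send while holding the mutex, and to give the
sender a way out on shutdown. The sketch below shows that general
pattern only; it is not necessarily how this change fixes it, and the
type and method names are hypothetical:

```go
package main

import "sync"

// conn is a hypothetical stand-in for a type with a mutex and a
// receive channel, as in the deadlock described above.
type conn struct {
	mu     sync.Mutex
	closed bool
	recvCh chan int
	done   chan struct{} // closed on shutdown
}

func newConn() *conn {
	return &conn{recvCh: make(chan int), done: make(chan struct{})}
}

// deliver hands v to a receiver. It releases mu before the potentially
// blocking send, and the send is abandoned if the conn shuts down.
func (c *conn) deliver(v int) bool {
	c.mu.Lock()
	closed := c.closed
	c.mu.Unlock()
	if closed {
		return false
	}
	select {
	case c.recvCh <- v:
		return true
	case <-c.done:
		return false
	}
}

// close marks the conn closed and wakes any blocked sender.
func (c *conn) close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		return
	}
	c.closed = true
	close(c.done)
}
```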
Fixes #3736
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
-W is milliseconds on darwin, not seconds, and empirically it's
milliseconds after a 1 second base.
Change-Id: I2520619e6699d9c505d9645ce4dfee4973555227
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
With this change, the client can obtain the initial handshake message
separately from the rest of the handshake, for embedding into another
protocol. This enables things like RTT reduction by stuffing the
handshake initiation message into an HTTP header.
Similarly, the server API optionally accepts a pre-read Noise initiation
message, in addition to reading the message directly off a net.Conn.
Updates #3488
Signed-off-by: David Anderson <danderson@tailscale.com>
This test set the bar too high.
Just a couple of missed timers was enough to fail.
Change the test to more of a sanity check.
While we're here, run it for just 1s instead of 5s.
Prior to this change, on a 13" M1 MBP, with
stress -p 512 ./rate.test -test.run=QPS
I saw 90%+ failures.
After this change, I'm at 30k runs with no failures yet.
Fixes #3733
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Go 1.17 added a HandshakeContext func to take care of timeouts during
TLS handshaking, so switch from our homegrown goroutine implementation
to the standard way.
Signed-off-by: David Anderson <danderson@tailscale.com>
Cancelling the context makes the timeout goroutine race with the write that
reports a successful TLS handshake, so you can end up with a successful TLS
handshake that mysteriously reports that it timed out after ~0s in flight.
The context is always canceled and cleaned up as the function exits, which
happens mere microseconds later, so just let the function exit clean up and
thereby avoid races.
Signed-off-by: David Anderson <danderson@tailscale.com>
This started as an attempt to placate GitHub's code scanner,
but it's also probably generally a good idea.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Turning this on at the beginning of the 1.21.x dev cycle, for 1.22.
Updates #150
Change-Id: I1de567cfe0be3df5227087de196ab88e60c9eb56
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>