I noticed that failed tests were leaving aroudn stray tailscaled processes
on macOS at least.
To repro, add this to tstest/integration:
func TestFailInFewSeconds(t *testing.T) {
t.Parallel()
time.Sleep(3 * time.Second)
os.Exit(1)
t.Fatal("boom")
}
Those three seconds let the other parallel tests (with all their
tailscaled child processes) start up and start running their tests,
but then we violently os.Exit(1) the test driver and all the children
were kept alive (and were spinning away, using all available CPU in
gvisor scheduler code, which is a separate scary issue)
Updates #cleanup
Change-Id: I9c891ed1a1ec639fb2afec2808c04dbb8a460e0e
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As a fallback to package managers, allow updating tailscale that was
self-installed in some way. There are some tricky bits around updating
the systemd unit (should we stick to local binary paths or to the ones
in tailscaled.service?), so leaving that out for now.
Updates #6995
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
They were entirely redundant and 1:1 with the status field
so this turns them into methods instead.
Updates #cleanup
Updates #1909
Change-Id: I7d939750749edf7dae4c97566bbeb99f2f75adbc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I'm not saying it works, but it compiles.
Updates #5794
Change-Id: I2f3c99732e67fe57a05edb25b758d083417f083e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Windows Security Center is a component that manages the registration of
security products on a Windows system. Only products that have obtained a
special cert from Microsoft may register themselves using the WSC API.
Practically speaking, most vendors do in fact sign up for the program as it
enhances their legitimacy.
From our perspective, this is useful because it gives us a high-signal
source of information to query for the security products installed on the
system. I've tied this query into the osdiag package and is run during
bugreports.
It uses COM bindings that were automatically generated by my prototype
metadata processor, however that program still has a few bugs, so I had
to make a few manual tweaks. I dropped those binding into an internal
package because (for the moment, at least) they are effectively
purpose-built for the osdiag use case.
We also update the wingoes dependency to pick up BSTR.
Fixes#10646
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Adds ability to start Funnel in the foreground and stream incoming
connections. When foreground process is stopped, Funnel is turned
back off for the port.
Exampe usage:
```
TAILSCALE_FUNNEL_V2=on tailscale funnel 8080
```
Updates #8489
Signed-off-by: Marwan Sulaiman <marwan@tailscale.com>
If a node is flapping or otherwise generating lots of STUN endpoints, we
can end up caching a ton of useless values and sending them to peers.
Instead, let's apply a fixed per-Addr limit of endpoints that we cache,
so that we're only sending peers up to the N most recent.
Updates tailscale/corp#13890
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8079a05b44220c46da55016c0e5fc96dd2135ef8
This removes the unsafe/linkname and only uses the standard library.
It's a bit slower, for now, but https://go.dev/cl/518336 should get us
back.
On darwin/arm64, without https://go.dev/cl/518336
pkg: tailscale.com/tstime/mono
│ before │ after │
│ sec/op │ sec/op vs base │
MonoNow-8 16.20n ± 0% 19.75n ± 0% +21.92% (p=0.000 n=10)
TimeNow-8 39.46n ± 0% 39.40n ± 0% -0.16% (p=0.002 n=10)
geomean 25.28n 27.89n +10.33%
And with it,
MonoNow-8 16.34n ± 1% 16.93n ± 0% +3.67% (p=0.001 n=10)
TimeNow-8 39.55n ± 15% 38.46n ± 1% -2.76% (p=0.000 n=10)
geomean 25.42n 25.52n +0.41%
Updates #8839
Updates tailscale/go#70
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* We update wingoes to pick up new version information functionality
(See pe/version.go in the https://github.com/dblohm7/wingoes repo);
* We move the existing LogSupportInfo code (including necessary syscall
stubs) out of util/winutil into a new package, util/osdiag, and implement
the public LogSupportInfo function may be implemented for other platforms
as needed;
* We add a new reason argument to LogSupportInfo and wire that into
localapi's bugreport implementation;
* We add module information to the Windows implementation of LogSupportInfo
when reason indicates a bugreport. We enumerate all loaded modules in our
process, and for each one we gather debug, authenticode signature, and
version information.
Fixes#7802
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The util/linuxfw/iptables.go had a bunch of code that wasn't yet used
(in prep for future work) but because of its imports, ended up
initializing code deep within gvisor that panicked on init on arm64
systems not using 4KB pages.
This deletes the unused code to delete the imports and remove the
panic. We can then cherry-pick this back to the branch and restore it
later in a different way.
A new test makes sure we don't regress in the future by depending on
the panicking package in question.
Fixes#8658
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This allows sending logs from the "logpolicy" package (and associated
callees) to something other than the log package. The behaviour for
tailscaled remains the same, passing in log.Printf
Updates #8249
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ie1d43b75fa7281933d9225bffd388462c08a5f31
When performing a fallback DNS query, run the recursive resolver in a
separate goroutine and compare the results returned by the recursive
resolver with the results we get from "regular" bootstrap DNS. This will
allow us to gather data about whether the recursive DNS resolver works
better, worse, or about the same as "regular" bootstrap DNS.
Updates #5853
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifa0b0cc9eeb0dccd6f7a3d91675fe44b3b34bd48
This change is introducing new netfilterRunner interface and moving iptables manipulation to a lower leveled iptables runner.
For #391
Signed-off-by: KevinLiang10 <kevinliang@tailscale.com>
In order to improve our ability to understand the state of policies and
registry settings when troubleshooting, we enumerate all values in all subkeys.
x/sys/windows does not already offer this, so we need to call RegEnumValue
directly.
For now we're just logging this during startup, however in a future PR I plan to
also trigger this code during a bugreport. I also want to log more than just
registry.
Fixes#8141
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The retry logic was pathological in the following ways:
* If we restarted the logging service, any pending uploads
would be placed in a retry-loop where it depended on backoff.Backoff,
which was too aggresive. It would retry failures within milliseconds,
taking at least 10 retries to hit a delay of 1 second.
* In the event where a logstream was rate limited,
the aggressive retry logic would severely exacerbate the problem
since each retry would also log an error message.
It is by chance that the rate of log error spam
does not happen to exceed the rate limit itself.
We modify the retry logic in the following ways:
* We now respect the "Retry-After" header sent by the logging service.
* Lacking a "Retry-After" header, we retry after a hard-coded period of
30 to 60 seconds. This avoids the thundering-herd effect when all nodes
try reconnecting to the logging service at the same time after a restart.
* We do not treat a status 400 as having been uploaded.
This is simply not the behavior of the logging service.
Updates #tailscale/corp#11213
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Signed-off-by: Chenyang Gao <gps949@outlook.com>
in commit 6e96744, the tsd system type has been added.
Which will cause the daemon will crash on some OSs (Windows, darwin and so on).
The root cause is that on those OSs, handleSubnetsInNetstack() will return true and set the conf.Router with a wrapper.
Later in NewUserspaceEngine() it will do subsystem set and found that early set router mismatch to current value, then panic.
This is part of an effort to clean up tailscaled initialization between
tailscaled, tailscaled Windows service, tsnet, and the mac GUI.
Updates #8036
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This holds back gvisor, kubernetes, goreleaser, and esbuild, which all
had breaking API changes.
Updates #8043
Updates #7381
Updates #8042 (updates u-root which adds deps)
Change-Id: I889759bea057cd3963037d41f608c99eb7466a5b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This change introduces address selection for wireguard only endpoints.
If a endpoint has not been used before, an address is randomly selected
to be used based on information we know about, such as if they are able
to use IPv4 or IPv6. When an address is initially selected, we also
initiate a new ICMP ping to the endpoints addresses to determine which
endpoint offers the best latency. This information is then used to
update which endpoint we should be using based on the best possible
route. If the latency is the same for a IPv4 and an IPv6 address, IPv6
will be used.
Updates #7826
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).
Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.
Fixes#7850
Updates #7621
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).
Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.
Updates #7621
Updates #7850
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
Redoes the approach from #5550 and #7539 to explicitly pass in the logf
function, instead of having global state that can be overridden.
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
This splits Prometheus metric handlers exposed by tsweb into two
modules:
- `varz.Handler` exposes Prometheus metrics generated by our expvar
converter;
- `promvarz.Handler` combines our expvar-converted metrics and native
Prometheus metrics.
By default, tsweb will use the promvarz handler, however users can keep
using only the expvar converter. Specifically, `tailscaled` now uses
`varz.Handler` explicitly, which avoids a dependency on the
(heavyweight) Prometheus client.
Updates https://github.com/tailscale/corp/issues/10205
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The handler will expose built-in process and Go metrics by default,
which currently duplicate some of the expvar-proxied metrics
(`goroutines` vs `go_goroutines`, `memstats` vs `go_memstats`), but as
long as their names are different, Prometheus server will just scrape
both.
This will change /debug/varz behaviour for most tsweb binaries, but
notably not for control, which configures a `tsweb.VarzHandler`
[explicitly](a5b5d5167f/cmd/tailcontrol/tailcontrol.go (L779))
Updates https://github.com/tailscale/corp/issues/10205
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
I realized that a lot of the problems that we're seeing around migration and
LocalBackend state can be avoided if we drive Windows pref migration entirely
from within tailscaled. By doing it this way, tailscaled can automatically
perform the migration as soon as the connection with the client frontend is
established.
Since tailscaled is already running as LocalSystem, it already has access to
the user's local AppData directory. The profile manager already knows which
user is connected, so we simply need to resolve the user's prefs file and read
it from there.
Of course, to properly migrate this information we need to also check system
policies. I moved a bunch of policy resolution code out of the GUI and into
a new package in util/winutil/policy.
Updates #7626
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This adds the util/sysresources package, which currently only contains a
function to return the total memory size of the current system.
Then, we modify magicsock to scale the number of buffered DERP messages
based on the system's available memory, ensuring that we never use a
value lower than the previous constant of 32.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib763c877de4d0d4ee88869078e7d512f6a3a148d
When running a SOCKS or HTTP proxy, configure the tshttpproxy package to
drop those addresses from any HTTP_PROXY or HTTPS_PROXY environment
variables.
Fixes#7407
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6cd7cad7a609c639780484bad521c7514841764b
This adds support to make exit nodes and subnet routers work
when in scenarios where NAT is required.
It also updates the NATConfig to be generated from a `wgcfg.Config` as
that handles merging prefs with the netmap, so it has the required information
about whether an exit node is already configured and whether routes are accepted.
Updates tailscale/corp#8020
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Since users can run tailscaled in a variety of ways (root, non-root,
non-root with process capabilities on Linux), this check will print the
current process permissions to the log to aid in debugging.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ida93a206123f98271a0c664775d0baba98b330c7
* wgengine/magicsock: add envknob to send CallMeMaybe to non-existent peer
For testing older client version responses to the PeerGone packet format change.
Updates #4326
Signed-off-by: Val <valerie@tailscale.com>
* derp: remove dead sclient struct member replaceLimiter
Leftover from an previous solution to the duplicate client problem.
Updates #2751
Signed-off-by: Val <valerie@tailscale.com>
* derp, derp/derphttp, wgengine/magicsock: add new PeerGone message type Not Here
Extend the PeerGone message type by adding a reason byte. Send a
PeerGone "Not Here" message when an endpoint sends a disco message to
a peer that this server has no record of.
Fixes#4326
Signed-off-by: Val <valerie@tailscale.com>
---------
Signed-off-by: Val <valerie@tailscale.com>
We were checking against the wrong directory, instead if we
have a custom store configured just use that.
Fixes#7588Fixes#7665
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This change focuses on the backend log ID, which is the mostly commonly
used in the client. Tests which don't seem to make use of the log ID
just use the zero value.
Signed-off-by: Will Norris <will@tailscale.com>