This is a fun one. Right now, when a client is connecting through a
subnet router, here's roughly what happens:
1. The client initiates a connection to an IP address behind a subnet
router, and sends a TCP SYN
2. The subnet router gets the SYN packet from netstack, and after
running through acceptTCP, starts DialContext-ing the destination IP,
without accepting the connection¹
3. The client retransmits the SYN packet a few times while the dial is
in progress, until either...
4. The subnet router successfully establishes a connection to the
destination IP and sends the SYN-ACK back to the client, or...
5. The subnet router times out and sends a RST to the client.
6. If the connection was successful, the client ACKs the SYN-ACK it
received, and traffic starts flowing
As a result, the notification code in forwardTCP never notices when a
new connection attempt is aborted, and it will wait until either the
connection is established, or until the OS-level connection timeout is
reached and it aborts.
To mitigate this, add a per-client limit on how many in-flight TCP
forwarding connections can be in-progress; after this, clients will see
a similar behaviour to the global limit, where new connection attempts
are aborted instead of waiting. This prevents a single misbehaving
client from blocking all other clients of a subnet router by ensuring
that it doesn't starve the global limiter.
Also, bump the global limit again to a higher value.
¹ We can't accept the connection before establishing a connection to the
remote server since otherwise we'd be opening the connection and then
immediately closing it, which breaks a bunch of stuff; see #5503 for
more details.
Updates tailscale/corp#12184
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76e7008ddd497303d75d473f534e32309c8a5144
This is so that if a backend Service gets created after the Ingress, it gets picked up by the operator.
Updates tailscale/tailscale#11251
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Anton Tolchanov <1687799+knyar@users.noreply.github.com>
Containerboot container created for operator's ingress and egress proxies
are now always configured by passing a configfile to tailscaled
(tailscaled --config <configfile-path>.
It does not run 'tailscale set' or 'tailscale up'.
Upgrading existing setups to this version as well as
downgrading existing setups at this version works.
Updates tailscale/tailscale#10869
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add logic to autogenerate CRD docs.
.github/workflows/kubemanifests.yaml CI workflow will fail if the doc is out of date with regard to the current CRDs.
Docs can be refreshed by running make kube-generate-all.
Updates tailscale/tailscale#11023
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
On Alpine, we add the tailscale service but fail to call start.
This means that tailscale does not start up until the user reboots the machine.
Fixes#11161
Signed-off-by: Keli Velazquez <keli@tailscale.com>
Not yet used. This is being made available so magicsock/wgengine can
use it to ignore certain sends (UDP + DERP) later on at least mobile,
letting wireguard-go think it's doing its full attempt schedule, but
we can cut it short conditionally based on what we know from the
control plane.
Updates #7617
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ia367cf6bd87b2aeedd3c6f4989528acdb6773ca7
Otherwise on OS retransmits, we'd make redundant timers in Go's timer
heap that upon firing just do nothing (well, grab a mutex and check a
map and see that there's nothing to do).
Updates #cleanup
Change-Id: Id30b8b2d629cf9c7f8133a3f7eca5dc79e81facb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No need to hold wgLock while using the device to LookupPeer;
that has its own mutex already.
Updates #cleanup
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib56049fcc7163cf5a2c2e7e12916f07b4f9d67cb
It's unnecessary. Returning an array value is already a copy.
Updates #cleanup
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: If7f350b61003ea08f16a531b7b4e8ae483617939
When reverse path filtering is in strict mode on Linux, using an exit
node blocks all network connectivity. This change adds a warning about
this to `tailscale status` and the logs.
Example in `tailscale status`:
```
- not connected to home DERP region 22
- The following issues on your machine will likely make usage of exit nodes impossible: [interface "eth0" has strict reverse-path filtering enabled], please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
```
Example in the logs:
```
2024/02/21 21:17:07 health("overall"): error: multiple errors:
not in map poll
The following issues on your machine will likely make usage of exit nodes impossible: [interface "eth0" has strict reverse-path filtering enabled], please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
```
Updates #3310
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Tailscaled becomes inoperative if the Tailscale Tunnel wintun adapter is abruptly removed.
wireguard-go closes the device in case of a read error, but tailscaled keeps running.
This adds detection of a closed WireGuard device, triggering a graceful shutdown of tailscaled.
It is then restarted by the tailscaled watchdog service process.
Fixes#11222
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The WinTun adapter may have been removed by the time we're closing
the dns.windowsManager, and its associated interface registry key might
also have been deleted. We shouldn't use winutil.OpenKeyWait and wait
for the interface key to appear when performing a cleanup as a part of
the windowsManager shutdown.
Updates #11222
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Starts using peer capabilities to restrict the management client
on a per-view basis. This change also includes a bulky cleanup
of the login-toggle.tsx file, which was getting pretty unwieldy
in its previous form.
Updates tailscale/corp#16695
Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
This change adds a new apiHandler struct for use from serveAPI
to aid with restricting endpoints to specific peer capabilities.
Updates tailscale/corp#16695
Signed-off-by: Sonia Appasamy <sonia@tailscale.com>
- add a clientmetric with a counter of TCP forwarder drops due to the
max attempts;
- fix varz metric types, as they are all counters.
Updates #8210
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Instead of modeling remote WebDAV servers as actual
webdav.FS instances, we now just proxy traffic to them.
This not only simplifies the code, but it also allows
WebDAV locking to work correctly by making sure locks are
handled by the servers that need to (i.e. the ones actually
serving the files).
Updates tailscale/corp#16827
Signed-off-by: Percy Wegmann <percy@tailscale.com>
That's already the default. Avoid the overhead of writing it on one
side and reading it on the other to do nothing.
Updates #cleanup (noticed while researching something else)
Change-Id: I449c88a022271afb9be5da876bfaf438fe5d3f58
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If a client socket is remotely lost but the client is not sent an RST in
response to the next request, the socket might sit in RTO for extended
lengths of time, resulting in "no internet" for users. Instead, timeout
after 10s, which will close the underlying socket, recovering from the
situation more promptly.
Updates #10967
Signed-off-by: James Tucker <james@tailscale.com>
I missed a case in the earlier patch, and so we're still sending 15s TCP
keepalive for TLS connections, now adjusted there too.
Updates tailscale/corp#17587
Updates #3363
Signed-off-by: James Tucker <james@tailscale.com>
This appears to be one of the contributors to this CI target regularly
entering a bad state with a partially written toolchain.
Updates #self
Signed-off-by: James Tucker <james@tailscale.com>
We don't need a log line every time defaultRoute is read in the good
case, and we now only log default interface updates that are actually
changes.
Updates #3363
Signed-off-by: James Tucker <james@tailscale.com>
Update vite to 5.1.4, and vitest to 1.3.1 (their latest versions). Also
remove vite-plugin-rewrite-all as this is no longer necessary with vite
5.x and has a dependency on vite 4.x.
Updates https://github.com/tailscale/corp/issues/17715
Signed-off-by: Mario Minardi <mario@tailscale.com>
This adds details on how to configure node attributes to allow
sharing and accessing shares.
Updates tailscale/corp#16827
Signed-off-by: Percy Wegmann <percy@tailscale.com>
An increasing number of users have very large subnet route
configurations, which can produce very large amounts of log data when
WireGuard is reconfigured. The logs don't contain the actual routes, so
they're largely useless for diagnostics, so we'll just suppress them.
Fixestailscale/corp#17532
Signed-off-by: James Tucker <james@tailscale.com>
Update plugin-react-swc to the latest version (3.6.0) ahead of updating vite to 5.x.
Updates https://github.com/tailscale/corp/issues/17715
Signed-off-by: Mario Minardi <mario@tailscale.com>
Update vite-plugin-svgr to the latest version (4.2.0) ahead of updating
vite to 5.x. This is a major version bump from our previous 3.x, and
requires changing the import paths used for SVGs.
Updates https://github.com/tailscale/corp/issues/17715
Signed-off-by: Mario Minardi <mario@tailscale.com>
The derper sends an in-protocol keepalive every 60-65s, so frequent TCP
keepalives are unnecessary. In this tuning TCP keepalives should never
occur for a DERP client connection, as they will send an L7 keepalive
often enough to always reset the TCP keepalive timer. If however a
connection does not receive an ACK promptly it will now be shutdown,
which happens sooner than it would with a normal TCP keepalive tuning.
This re-tuning reduces the frequency of network traffic from derp to
client, reducing battery cost.
Updates tailscale/corp#17587
Updates #3363
Signed-off-by: James Tucker <james@tailscale.com>
Updates ENG-2133. Adds the ResetToDefaults visibility policy currently only available on macOS, so that the Windows client can read its value.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
So derpers can check an external URL for whether to permit access
to a certain public key.
Updates tailscale/corp#17693
Change-Id: I8594de58f54a08be3e2dbef8bcd1ff9b728ab297
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This allows coverage from tests that hit multiple packages at once
to be reflected in all those packages' coverage.
Updates #cleanup
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Setting a user timeout will be a more practical tuning knob for a number
of endpoints, this provides a way to set it.
Updates tailscale/corp#17587
Signed-off-by: James Tucker <james@tailscale.com>
So we can probe load balancers by their unique DNS name but without
asking for that cert name.
Updates tailscale/corp#13050
Change-Id: Ie4c0a2f951328df64281ed1602b4e624e3c8cf2e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
From a packet trace we have seen log connections being closed
prematurely by the client, resulting in unnecessary extra TLS setup
traffic.
Updates #3363
Updates tailscale/corp#9230
Updates tailscale/corp#8564
Signed-off-by: James Tucker <james@tailscale.com>
This fixes an infinite loop caused by the configuration of
systemd-resolved on Amazon Linux 2023 and how that interacts with
Tailscale's "direct" mode. We now drop the Tailscale service IP from the
OS's "base configuration" when we detect this configuration.
Updates #7816
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I73a4ea8e65571eb368c7e179f36af2c049a588ee
Adds logic in gocross to detect environment variables and pass the right flags so that the backend can be built with the visionOS SDK.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Instead of constructing the `ip:port` string ourselves, use
netip.AddrPortFrom which handles IPv6 correctly.
Updates #11164
Signed-off-by: Will Norris <will@tailscale.com>
Small fix to make sure doctor API endpoint returns correctly - I spotted it when checking my tailscaled node and noticed it was handled slightly different compare to the rest
Signed-off-by: San <santrancisco@users.noreply.github.com>