This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.
A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.
TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.
The first result is from commit 57856fc without TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender
[ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver
The second result is from this commit with TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Previously, we were registering TCP and UDP connections in the same map,
which could result in erroneously removing a mapping if one of the two
connections completes while the other one is still active.
Add a "proto string" argument to these functions to avoid this.
Additionally, take the "proto" argument in LocalAPI, and plumb that
through from the CLI and add a new LocalClient method.
Updates tailscale/corp#20600
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I35d5efaefdfbf4721e315b8ca123f0c8af9125fb
This moves NewContainsIPFunc from tsaddr to new ipset package.
And wgengine/filter types gets split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.
Then add a test making sure the CLI deps don't regress.
Updates #1278
Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This refactors the logic for determining whether a packet should be sent
to the host or not into a function, and then adds tests for it.
Updates #11304
Updates #12448
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ief9afa98eaffae00e21ceb7db073c61b170355e5
Fix a bug where, for a subnet router that advertizes
4via6 route, all packets with a source IP matching
the 4via6 address were being sent to the host itself.
Instead, only send to host packets whose destination
address is host's local address.
Fixestailscale/tailscale#12448
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
This adds a new ListenPacket function on tsnet.Server
which acts mostly like `net.ListenPacket`.
Unlike `Server.Listen`, this requires listening on a
specific IP and does not automatically listen on both
V4 and V6 addresses of the Server when the IP is unspecified.
To test this, it also adds UDP support to tsdial.Dialer.UserDial
and plumbs it through the localapi. Then an associated test
to make sure the UDP functionality works from both sides.
Updates #12182
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Fixestailscale/tailscale#10393Fixestailscale/corp#15412Fixestailscale/corp#19808
On Apple platforms, exit nodes and subnet routers have been unable to relay pings from Tailscale devices to non-Tailscale devices due to sandbox restrictions imposed on our network extensions by Apple. The sandbox prevented the code in netstack.go from spawning the `ping` process which we were using.
Replace that exec call with logic to send an ICMP echo request directly, which appears to work in userspace, and not trigger a sandbox violation in the syslog.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Previously, a node that was advertising a 4via6 route wouldn't be able
to make use of that same route; the packet would be delivered to
Tailscale, but since we weren't accepting it in handleLocalPackets, the
packet wouldn't be delivered to netstack and would never hit the 4via6
logic. Let's add that support so that usage of 4via6 is consistent
regardless of where the connection is initiated from.
Updates #11304
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic28dc2e58080d76100d73b93360f4698605af7cb
I'm on a mission to simplify LocalBackend.Start and its locking
and deflake some tests.
I noticed this hasn't been used since March 2023 when it was removed
from the Windows client in corp 66be796d33c.
So, delete.
Updates #11649
Change-Id: I40f2cb75fb3f43baf23558007655f65a8ec5e1b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was used when we only supported subnet routers on linux
and would nil out the SubnetRoutes slice as no other router
worked with it, but now we support subnet routers on ~all platforms.
The field it was setting to nil is now only used for network logging
and nowhere else, so keep the field but drop the SubnetRouterWrapper
as it's not useful.
Updates #cleanup
Change-Id: Id03f9b6ec33e47ad643e7b66e07911945f25db79
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This change updates all tailfs functions and the majority of the tailfs
variables to use the new drive naming.
Updates tailscale/corp#16827
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
This change updates the tailfs file and package names to their new
naming convention.
Updates #tailscale/corp#16827
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
This fixes a bug that was introduced in #11258 where the handling of the
per-client limit didn't properly account for the fact that the gVisor
TCP forwarder will return 'true' to indicate that it's handled a
duplicate SYN packet, but not launch the handler goroutine.
In such a case, we neither decremented our per-client limit in the
wrapper function, nor did we do so in the handler function, leading to
our per-client limit table slowly filling up without bound.
Fix this by doing the same duplicate-tracking logic that the TCP
forwarder does so we can detect such cases and appropriately decrement
our in-flight counter.
Updates tailscale/corp#12184
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib6011a71d382a10d68c0802593f34b8153d06892
The `stack.PacketBufferPtr` type no longer exists; replace it with
`*stack.PacketBuffer` instead.
Updates #8043
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib56ceff09166a042aa3d9b80f50b2aa2d34b3683
This fixes a regression introduced with 993acf4 and released in
v1.60.0.
The regression caused us to intercept all userspace traffic to port
8080 which prevented users from exposing their own services to their
tailnet at port 8080.
Now, we only intercept traffic to port 8080 if it's bound for
100.100.100.100 or fd7a:115c:a1e0::53.
Fixes#11283
Signed-off-by: Percy Wegmann <percy@tailscale.com>
(cherry picked from commit 17cd0626f3)
This is a fun one. Right now, when a client is connecting through a
subnet router, here's roughly what happens:
1. The client initiates a connection to an IP address behind a subnet
router, and sends a TCP SYN
2. The subnet router gets the SYN packet from netstack, and after
running through acceptTCP, starts DialContext-ing the destination IP,
without accepting the connection¹
3. The client retransmits the SYN packet a few times while the dial is
in progress, until either...
4. The subnet router successfully establishes a connection to the
destination IP and sends the SYN-ACK back to the client, or...
5. The subnet router times out and sends a RST to the client.
6. If the connection was successful, the client ACKs the SYN-ACK it
received, and traffic starts flowing
As a result, the notification code in forwardTCP never notices when a
new connection attempt is aborted, and it will wait until either the
connection is established, or until the OS-level connection timeout is
reached and it aborts.
To mitigate this, add a per-client limit on how many in-flight TCP
forwarding connections can be in-progress; after this, clients will see
a similar behaviour to the global limit, where new connection attempts
are aborted instead of waiting. This prevents a single misbehaving
client from blocking all other clients of a subnet router by ensuring
that it doesn't starve the global limiter.
Also, bump the global limit again to a higher value.
¹ We can't accept the connection before establishing a connection to the
remote server since otherwise we'd be opening the connection and then
immediately closing it, which breaks a bunch of stuff; see #5503 for
more details.
Updates tailscale/corp#12184
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76e7008ddd497303d75d473f534e32309c8a5144
- add a clientmetric with a counter of TCP forwarder drops due to the
max attempts;
- fix varz metric types, as they are all counters.
Updates #8210
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
FileSystemForLocal was listening on the node's Tailscale address,
which potentially exposes the user's view of TailFS shares to other
Tailnet users. Remote nodes should connect to exported shares via
the peerapi.
This removes that code so that FileSystemForLocal is only avaialable
on 100.100.100.100:8080.
Updates tailscale/corp#16827
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Adds support for node attribute tailfs:access. If this attribute is
not present, Tailscale will not accept connections to the local TailFS
server at 100.100.100.100:8080.
Updates tailscale/corp#16827
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Add a WebDAV-based folder sharing mechanism that is exposed to local clients at
100.100.100.100:8080 and to remote peers via a new peerapi endpoint at
/v0/tailfs.
Add the ability to manage folder sharing via the new 'share' CLI sub-command.
Updates tailscale/corp#16827
Signed-off-by: Percy Wegmann <percy@tailscale.com>
When tailscaled is run with "-debug 127.0.0.1:12345", these metrics are
available at:
http://localhost:12345/debug/metrics
Updates #8210
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I19db6c445ac1f8344df2bc1066a3d9c9030606f8
Prior to an earlier netstack bump this code used a string conversion
path to cover multiple cases of behavior seemingly checking for
unspecified addresses, adding unspecified addresses to v6. The behavior
is now crashy in netstack, as it is enforcing address length in various
areas of the API, one in particular being address removal.
As netstack is now protocol specific, we must not create invalid
protocol addresses - an address is v4 or v6, and the address value
contained inside must match. If a control path attempts to do something
otherwise it is now logged and skipped rather than incorrect addressing
being added.
Fixestailscale/corp#15377
Signed-off-by: James Tucker <james@tailscale.com>
Prepare for path MTU discovery by splitting up the concept of
DefaultMTU() into the concepts of the Tailscale TUN MTU, MTUs of
underlying network interfaces, minimum "safe" TUN MTU, user configured
TUN MTU, probed path MTU to a peer, and maximum probed MTU. Add a set
of likely MTUs to probe.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
Use buffer pools for UDP packet forwarding to prepare for increasing the
forwarded UDP packet size for peer path MTU discovery.
Updates #311
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Val <valerie@tailscale.com>
We weren't correctly retrying truncated requests to an upstream DNS
server with TCP. Instead, we'd return a truncated request to the user,
even if the user was querying us over TCP and thus able to handle a
large response.
Also, add an envknob and controlknob to allow users/us to disable this
behaviour if it turns out to be buggy (✨ DNS ✨).
Updates #9264
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifb04b563839a9614c0ba03e9c564e8924c1a2bfd
Prepare for path MTU discovery by splitting up the concept of
DefaultMTU() into the concepts of the Tailscale TUN MTU, MTUs of
underlying network interfaces, minimum "safe" TUN MTU, user configured
TUN MTU, probed path MTU to a peer, and maximum probed MTU. Add a set
of likely MTUs to probe.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
Use buffer pools for UDP packet forwarding to prepare for increasing the
forwarded UDP packet size for peer path MTU discovery.
Updates #311
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Val <valerie@tailscale.com>
And convert all callers over to the methods that check SelfNode.
Now we don't have multiple ways to express things in tests (setting
fields on SelfNode vs NetworkMap, sometimes inconsistently) and don't
have multiple ways to check those two fields (often only checking one
or the other).
Updates #9443
Change-Id: I2d7ba1cf6556142d219fae2be6f484f528756e3c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It had exactly one user: netstack. Just have LocalBackend notify
netstack when here's a new netmap instead, simplifying the bloated
Engine interface that has grown a bunch of non-Engine-y things.
(plenty of rando stuff remains after this, but it's a start)
Updates #cleanup
Change-Id: I45e10ab48119e962fc4967a95167656e35b141d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The netstack code had a bunch of logic to figure out if the LocalBackend should handle an
incoming connection and then would call the function directly on LocalBackend. Move that
logic to LocalBackend and refactor the methods to return conn handlers.
Updates #cleanup
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Various BSD-derived operating systems including macOS and FreeBSD
require that ping6 be used for IPv6 destinations. The "ping" command
does not understand an IPv6 destination.
FreeBSD 13.x and later do handle IPv6 in the regular ping command,
but also retain a ping6 command. We use ping6 on all versions of
FreeBSD.
Fixes https://github.com/tailscale/tailscale/issues/8225
Signed-off-by: Denton Gentry <dgentry@tailscale.com>