When in tun mode on Linux, AllowedIPs are not automatically added to
netstack because the kernel is responsible for handling subnet routes.
This change ensures that virtual IPs are always added to netstack.
When in tun mode, pings were also not being handled, so this adds
explicit support for ping as well.
Fixes tailscale/corp#26387
Change-Id: I6af02848bf2572701288125f247d1eaa6f661107
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
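For illustration only, a minimal sketch of the split described above,
with made-up helper and parameter names (the real logic lives in
wgengine/netstack and is more involved):

package main

import (
	"fmt"
	"net/netip"
)

// netstackPrefixes is an illustrative helper, not the actual Tailscale code.
func netstackPrefixes(tunMode bool, virtualIPs, allowedIPs []netip.Prefix) []netip.Prefix {
	// The node's own (virtual) IPs always belong to netstack so that
	// things like ping and local services keep working.
	out := append([]netip.Prefix(nil), virtualIPs...)
	if !tunMode {
		// Without a kernel TUN device there is nothing else to route
		// subnet traffic, so AllowedIPs must go to netstack as well.
		out = append(out, allowedIPs...)
	}
	return out
}

func main() {
	self := []netip.Prefix{netip.MustParsePrefix("100.64.0.5/32")}
	routes := []netip.Prefix{netip.MustParsePrefix("10.0.0.0/24")}
	fmt.Println(netstackPrefixes(true, self, routes)) // tun mode: only 100.64.0.5/32
}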
Cubic performs better than Reno in higher bandwidth-delay product
(BDP) scenarios, and enables use of the HyStart++ implementation
contributed by Coder. This improves throughput on higher-BDP links
with a much faster ramp-up.
gVisor is bumped as well for some fixes related to send queue processing
and RTT tracking.
Updates #9707
Updates #10408
Updates #12393
Updates tailscale/corp#24483
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
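For reference, switching gVisor's congestion control is a transport
protocol option; a minimal sketch of the knob involved (not the exact
change, which also bumps gVisor and touches other settings):

package main

import (
	"log"

	"gvisor.dev/gvisor/pkg/tcpip"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv4"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv6"
	"gvisor.dev/gvisor/pkg/tcpip/stack"
	"gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
)

func main() {
	s := stack.New(stack.Options{
		NetworkProtocols:   []stack.NetworkProtocolFactory{ipv4.NewProtocol, ipv6.NewProtocol},
		TransportProtocols: []stack.TransportProtocolFactory{tcp.NewProtocol},
	})
	// Switch the TCP congestion control algorithm from the default
	// ("reno") to "cubic".
	cc := tcpip.CongestionControlOption("cubic")
	if err := s.SetTransportProtocolOption(tcp.ProtocolNumber, &cc); err != nil {
		log.Fatalf("setting congestion control: %v", err)
	}
}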
Originally identified by Coder and documented in their blog post. This
implementation differs slightly, as our link endpoint was introduced
for a different purpose, but the behavior is the same: apply
backpressure rather than dropping packets. This substantially reduces
the negative impact of large packet bursts. An alternative would be to
greatly enlarge the channel buffer, but that largely just moves where
buffering occurs and may reduce the signaling back to lower-layer or
upstream congestion controls.
Updates #9707
Updates #10408
Updates #12393
Updates tailscale/corp#24483
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
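The underlying idea, sketched with a plain bounded channel and made-up
names (the real link endpoint is more involved): a blocking send
applies backpressure, whereas a non-blocking send drops on overflow.

package main

// deliver sketches the two strategies for a bounded packet queue.
// Blocking (backpressure) is what the change above adopts; the drop
// variant is what a full channel with a non-blocking send would do.
func deliver(q chan []byte, pkt []byte, applyBackpressure bool) bool {
	if applyBackpressure {
		q <- pkt // block until the consumer catches up
		return true
	}
	select {
	case q <- pkt:
		return true
	default:
		return false // queue full: packet dropped, burst cost falls on retransmits
	}
}

func main() {
	q := make(chan []byte, 512) // a bigger buffer only moves where queuing happens
	go func() {
		for range q {
		}
	}()
	deliver(q, []byte("payload"), true)
	close(q)
}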
The gVisor RACK implementation appears to perform badly, particularly
in scenarios with higher BDP. This may have gone largely unnoticed
because it is gated on SACK, which is not enabled by default in
upstream gVisor but which itself has a strongly positive impact on
performance. Both the RACK and DACK implementations (which are now one)
have overlapping, unfinished work items in their work streams on the
public tracker.
Updates #9707
Signed-off-by: James Tucker <james@tailscale.com>
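For context, the SACK gating mentioned above is an ordinary gVisor
transport option; a sketch of enabling it on a stack (which is what
pulls the RACK code paths into play), not the change this commit makes:

package main

import (
	"gvisor.dev/gvisor/pkg/tcpip"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv4"
	"gvisor.dev/gvisor/pkg/tcpip/stack"
	"gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
)

func main() {
	s := stack.New(stack.Options{
		NetworkProtocols:   []stack.NetworkProtocolFactory{ipv4.NewProtocol},
		TransportProtocols: []stack.TransportProtocolFactory{tcp.NewProtocol},
	})
	// Upstream gVisor ships with SACK disabled; turning it on is what
	// also brings the RACK loss-recovery code into play, which is why
	// RACK's behavior can go unnoticed in default configurations.
	sack := tcpip.TCPSACKEnabled(true)
	s.SetTransportProtocolOption(tcp.ProtocolNumber, &sack)
}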
Some natc instances have been observed with excessive memory growth,
dominated by gVisor buffers. It is likely that the connection buffers
are sticking around for too long due to the long default segment
lifetime and the up-tuned buffer sizes applied by default in
wgengine/netstack. Apply configurations in natc specifically that are
a better match for the natc use case, most notably a 5s maximum
segment lifetime.
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
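A sketch of the kind of tuning described, assuming the segment-lifetime
knob maps onto gVisor's TIME-WAIT timeout option (the exact options and
values natc applies may differ):

package main

import (
	"time"

	"gvisor.dev/gvisor/pkg/tcpip"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv4"
	"gvisor.dev/gvisor/pkg/tcpip/stack"
	"gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
)

func main() {
	s := stack.New(stack.Options{
		NetworkProtocols:   []stack.NetworkProtocolFactory{ipv4.NewProtocol},
		TransportProtocols: []stack.TransportProtocolFactory{tcp.NewProtocol},
	})
	// Tear down closed-connection state (and its buffers) after 5s
	// instead of the much longer default, so that short-lived natc
	// flows free their memory promptly.
	tw := tcpip.TCPTimeWaitTimeoutOption(5 * time.Second)
	s.SetTransportProtocolOption(tcp.ProtocolNumber, &tw)
}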
This commit intends to provide support for TCP and web VIP services,
and also allows users to use TUN for VIP services if they want to.
The commit includes:
1. Setting the TCP intercept function for VIP services.
2. Updating netstack to send packets written from WireGuard to the
netstack handler for VIP services.
3. Returning the correct TCP handler for VIP services in netstack's
acceptTCP.
This commit also includes unit tests for whether the local backend's
setServeConfig sets the correct TCP intercept function, and whether a
handler gets returned when getting TCPHandlerForDst. The
shouldProcessInbound check is not unit tested, since the result would
depend only on mocked functions. There should be an integration test
covering shouldProcessInbound and verifying that the returned TCP
handler actually does what the serve config says.
Updates tailscale/corp#24604
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Fixes #14492
Change-Id: I6dc1068d34bbfa7477e7b7a56a4325b3868c92e1
Signed-off-by: Marc Paquette <marcphilippaquette@gmail.com>
We have several places where LocalBackend instances are created for testing, but they are rarely shut down
when the tests that created them exit.
In this PR, we update newTestLocalBackend and similar functions to use testing.TB.Cleanup(lb.Shutdown)
to ensure LocalBackend instances are properly shut down during test cleanup.
Updates #12687
Signed-off-by: Nick Khyl <nickk@tailscale.com>
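The pattern, sketched with a stand-in type rather than the real
ipnlocal.LocalBackend:

package example

import "testing"

// backend stands in for ipnlocal.LocalBackend in this sketch.
type backend struct{}

func (b *backend) Shutdown() { /* release listeners, goroutines, engine */ }

// newTestBackend mirrors the newTestLocalBackend pattern: every instance
// created for a test is shut down automatically when the test exits.
func newTestBackend(t testing.TB) *backend {
	b := &backend{}
	t.Cleanup(b.Shutdown) // runs even if the test fails partway through
	return b
}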
This gets close to all of the remaining ones.
Updates #12912
Change-Id: I9c672bbed2654a6c5cab31e0cbece6c107d8c6fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A filesystem was plumbed into netstack in 993acf4475b22d693
but hasn't been used since 2d5d6f5403f3. Remove it.
Noticed while rebasing a Tailscale fork elsewhere.
Updates tailscale/corp#16827
Change-Id: Ib76deeda205ffe912b77a59b9d22853ebff42813
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit changes usermetrics to be non-global. This is a building
block for correct metrics when a Go process runs multiple tsnets or
runs in tests.
Updates #13420
Updates tailscale/corp#22075
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
When the TS_DEBUG_NETSTACK_LOOPBACK_PORT environment variable is set,
netstack will loop back (DNAT to addressFamilyLoopback:loopbackPort)
TCP & UDP flows originally destined to localServicesIP:loopbackPort,
where localServicesIP is quad-100 (100.100.100.100) or its IPv6
equivalent.
Updates tailscale/corp#22713
Signed-off-by: Jordan Whited <jordan@tailscale.com>
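A sketch of the resulting rewrite decision; the helper name and
structure are illustrative, not the actual netstack code:

package main

import (
	"fmt"
	"net/netip"
	"os"
	"strconv"
)

var (
	quad100   = netip.MustParseAddr("100.100.100.100")
	quad100v6 = netip.MustParseAddr("fd7a:115c:a1e0::53")
)

// loopbackRewrite reports whether a flow destined to dst should be DNATed
// to the loopback address of the same family, per TS_DEBUG_NETSTACK_LOOPBACK_PORT.
func loopbackRewrite(dst netip.AddrPort) (netip.AddrPort, bool) {
	portStr := os.Getenv("TS_DEBUG_NETSTACK_LOOPBACK_PORT")
	if portStr == "" {
		return netip.AddrPort{}, false
	}
	port, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil || dst.Port() != uint16(port) {
		return netip.AddrPort{}, false
	}
	switch dst.Addr() {
	case quad100:
		return netip.AddrPortFrom(netip.MustParseAddr("127.0.0.1"), dst.Port()), true
	case quad100v6:
		return netip.AddrPortFrom(netip.MustParseAddr("::1"), dst.Port()), true
	}
	return netip.AddrPort{}, false
}

func main() {
	fmt.Println(loopbackRewrite(netip.MustParseAddrPort("100.100.100.100:8080")))
}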
This was previously disabled in 8e42510 due to missing GSO-awareness in
tstun, which was resolved in d097096.
Updates tailscale/corp#22511
Signed-off-by: Jordan Whited <jordan@tailscale.com>
net/tstun.Wrapper.InjectInboundPacketBuffer is not GSO-aware, which can
break quad-100 TCP streams as a result. Linux is the only platform where
gVisor GSO was previously enabled.
Updates tailscale/corp#22511
Updates #13211
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In df6014f1d7bf437adf239b75a62fd4c2f389ea2a we removed build tag
gating preventing importation, which tripped a NetworkExtension limit
test in corp. This was a reversal of
25f0a3fc8f6f9cf681bb5afda8e1762816c67a8b which actually made the
situation worse, hence the simplification.
This commit goes back to the strategy in
25f0a3fc8f6f9cf681bb5afda8e1762816c67a8b, and gets us back under the
limit in my local testing. Admittedly, we don't fully understand
the effects of importing or excluding importation of this package,
and have seen mixed results, but this commit allows us to move forward
again.
Updates tailscale/corp#22125
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In 2f27319baf71681e221904d3a3ffe1badedc8e2e we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.
This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.
In 25f0a3fc8f6f9cf681bb5afda8e1762816c67a8b we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.
Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
A SIGSEGV was observed around packet merging logic in gVisor's GRO
package.
Updates tailscale/corp#22353
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit increases gVisor's TCP max send (4->6MiB) and receive
(4->8MiB) buffer sizes on all platforms except iOS. These values are
biased towards higher throughput on high bandwidth-delay product paths.
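In gVisor terms this is a pair of transport options; a sketch of the
shape, reusing gVisor's exported defaults and raising only the maxima
as described above:

package main

import (
	"gvisor.dev/gvisor/pkg/tcpip"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv4"
	"gvisor.dev/gvisor/pkg/tcpip/stack"
	"gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
)

func main() {
	s := stack.New(stack.Options{
		NetworkProtocols:   []stack.NetworkProtocolFactory{ipv4.NewProtocol},
		TransportProtocols: []stack.TransportProtocolFactory{tcp.NewProtocol},
	})
	// Raise the ceiling the TCP window can grow to: 6 MiB for send and
	// 8 MiB for receive, favoring throughput on high-BDP paths.
	snd := tcpip.TCPSendBufferSizeRangeOption{Min: tcp.MinBufferSize, Default: tcp.DefaultSendBufferSize, Max: 6 << 20}
	s.SetTransportProtocolOption(tcp.ProtocolNumber, &snd)
	rcv := tcpip.TCPReceiveBufferSizeRangeOption{Min: tcp.MinBufferSize, Default: tcp.DefaultReceiveBufferSize, Max: 8 << 20}
	s.SetTransportProtocolOption(tcp.ProtocolNumber, &rcv)
}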
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. 100ms of RTT latency is
introduced via Linux's traffic control network emulator queue
discipline.
The first set of results is from commit f0230ce, prior to the TCP
buffer resizing.
gVisor write direction:
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 180 MBytes 151 Mbits/sec 0 sender
[ 5] 0.00-10.10 sec 179 MBytes 149 Mbits/sec receiver
gVisor read direction:
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 337 MBytes 280 Mbits/sec 20 sender
[ 5] 0.00-10.00 sec 323 MBytes 271 Mbits/sec receiver
The second set of results is from this commit, with the increased TCP
buffer sizes.
gVisor write direction:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 297 MBytes 249 Mbits/sec 0 sender
[ 5] 0.00-10.10 sec 297 MBytes 247 Mbits/sec receiver
gVisor read direction:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 501 MBytes 416 Mbits/sec 17 sender
[ 5] 0.00-10.00 sec 485 MBytes 407 Mbits/sec receiver
Updates #9707
Updates tailscale/corp#22119
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.
TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a follow-on
commit.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly 13us of
round-trip latency between them.
The first result is from commit 57856fc without TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec receiver
The second result is from this commit with TCP GRO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec 20 sender
[ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.
A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.
TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.
The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly 13us of
round-trip latency between them.
The first result is from commit 57856fc without TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender
[ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver
The second result is from this commit with TCP GSO.
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender
[ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver
Updates #6816
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Previously, we were registering TCP and UDP connections in the same map,
which could result in erroneously removing a mapping if one of the two
connections completes while the other one is still active.
Add a "proto string" argument to these functions to avoid this.
Additionally, take the "proto" argument in LocalAPI, and plumb that
through from the CLI and add a new LocalClient method.
Updates tailscale/corp#20600
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I35d5efaefdfbf4721e315b8ca123f0c8af9125fb
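A sketch of the shape of the fix, with made-up types (not the actual
code): key the map by protocol as well as destination, so TCP and UDP
flows to the same ip:port no longer collide.

package main

import (
	"net/netip"
	"sync"
)

type connKey struct {
	proto string // "tcp" or "udp"
	dst   netip.AddrPort
}

type connTracker struct {
	mu    sync.Mutex
	conns map[connKey]struct{}
}

func (t *connTracker) register(proto string, dst netip.AddrPort) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.conns == nil {
		t.conns = make(map[connKey]struct{})
	}
	t.conns[connKey{proto, dst}] = struct{}{}
}

// unregister removes only the (proto, dst) entry, so a completed UDP flow
// no longer erases the record of a still-active TCP flow to the same destination.
func (t *connTracker) unregister(proto string, dst netip.AddrPort) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.conns, connKey{proto, dst})
}

func main() {
	var t connTracker
	dst := netip.MustParseAddrPort("100.64.0.1:443")
	t.register("tcp", dst)
	t.register("udp", dst)
	t.unregister("udp", dst)
}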
This moves NewContainsIPFunc from tsaddr to a new ipset package.
And the wgengine/filter types get split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.
Then add a test making sure the CLI deps don't regress.
Updates #1278
Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
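For illustration, the general shape of a contains-IP helper built on
net/netip alone; this is a simplified stand-in, not the ipset
implementation (which uses more efficient lookups for larger sets):

package main

import (
	"fmt"
	"net/netip"
)

// containsIPFunc returns a func reporting whether an address falls inside
// any of the given prefixes. A linear scan is fine for a handful of prefixes.
func containsIPFunc(prefixes []netip.Prefix) func(netip.Addr) bool {
	return func(a netip.Addr) bool {
		for _, p := range prefixes {
			if p.Contains(a) {
				return true
			}
		}
		return false
	}
}

func main() {
	isLocal := containsIPFunc([]netip.Prefix{
		netip.MustParsePrefix("100.64.0.0/10"),
		netip.MustParsePrefix("fd7a:115c:a1e0::/48"),
	})
	fmt.Println(isLocal(netip.MustParseAddr("100.101.102.103"))) // true
}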
This refactors the logic for determining whether a packet should be sent
to the host or not into a function, and then adds tests for it.
Updates #11304
Updates #12448
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ief9afa98eaffae00e21ceb7db073c61b170355e5
Fix a bug where, for a subnet router that advertises a
4via6 route, all packets with a source IP matching
the 4via6 address were being sent to the host itself.
Instead, only send to the host packets whose destination
address is the host's local address.
Fixes tailscale/tailscale#12448
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
This adds a new ListenPacket function on tsnet.Server
which acts mostly like `net.ListenPacket`.
Unlike `Server.Listen`, this requires listening on a
specific IP and does not automatically listen on both
V4 and V6 addresses of the Server when the IP is unspecified.
To test this, it also adds UDP support to tsdial.Dialer.UserDial
and plumbs it through the localapi, along with an associated test
to make sure the UDP functionality works from both sides.
Updates #12182
Signed-off-by: Maisem Ali <maisem@tailscale.com>
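A hedged usage sketch, assuming the API shape described above (exact
signatures may differ):

package main

import (
	"context"
	"fmt"
	"log"
	"net/netip"

	"tailscale.com/tsnet"
)

func main() {
	srv := &tsnet.Server{Hostname: "udp-echo"}
	defer srv.Close()

	if _, err := srv.Up(context.Background()); err != nil {
		log.Fatal(err)
	}
	ip4, _ := srv.TailscaleIPs()
	// Unlike Listen, ListenPacket wants a specific IP; it won't fan out
	// to both the v4 and v6 addresses for an unspecified address.
	pc, err := srv.ListenPacket("udp", netip.AddrPortFrom(ip4, 5353).String())
	if err != nil {
		log.Fatal(err)
	}
	defer pc.Close()

	buf := make([]byte, 1500)
	n, from, _ := pc.ReadFrom(buf)
	fmt.Printf("got %d bytes from %v\n", n, from)
	pc.WriteTo(buf[:n], from) // echo it back
}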
Fixes tailscale/tailscale#10393
Fixes tailscale/corp#15412
Fixes tailscale/corp#19808
On Apple platforms, exit nodes and subnet routers have been unable to relay pings from Tailscale devices to non-Tailscale devices due to sandbox restrictions imposed on our network extensions by Apple. The sandbox prevented the code in netstack.go from spawning the `ping` process which we were using.
Replace that exec call with logic to send an ICMP echo request
directly, which appears to work in userspace and does not trigger a
sandbox violation in the syslog.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
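The userspace approach, sketched standalone with golang.org/x/net/icmp
rather than the actual netstack.go code: send an unprivileged ICMP echo
request and read the reply, no ping subprocess required. The target
address below is a placeholder.

package main

import (
	"log"
	"net"
	"os"
	"time"

	"golang.org/x/net/icmp"
	"golang.org/x/net/ipv4"
)

func main() {
	// "udp4" requests an unprivileged ICMP socket, which works in
	// sandboxes where spawning ping or opening raw sockets is not allowed.
	c, err := icmp.ListenPacket("udp4", "0.0.0.0")
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	msg := icmp.Message{
		Type: ipv4.ICMPTypeEcho,
		Body: &icmp.Echo{ID: os.Getpid() & 0xffff, Seq: 1, Data: []byte("tailscale ping")},
	}
	wb, err := msg.Marshal(nil)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := c.WriteTo(wb, &net.UDPAddr{IP: net.ParseIP("192.0.2.1")}); err != nil {
		log.Fatal(err)
	}

	c.SetReadDeadline(time.Now().Add(2 * time.Second))
	rb := make([]byte, 1500)
	n, peer, err := c.ReadFrom(rb)
	if err != nil {
		log.Fatal(err)
	}
	reply, _ := icmp.ParseMessage(1, rb[:n]) // 1 = protocol number for ICMPv4
	log.Printf("got %v from %v", reply.Type, peer)
}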
Previously, a node that was advertising a 4via6 route wouldn't be able
to make use of that same route; the packet would be delivered to
Tailscale, but since we weren't accepting it in handleLocalPackets, the
packet wouldn't be delivered to netstack and would never hit the 4via6
logic. Let's add that support so that usage of 4via6 is consistent
regardless of where the connection is initiated from.
Updates #11304
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic28dc2e58080d76100d73b93360f4698605af7cb
I'm on a mission to simplify LocalBackend.Start and its locking
and deflake some tests.
I noticed this hasn't been used since March 2023 when it was removed
from the Windows client in corp 66be796d33c.
So, delete.
Updates #11649
Change-Id: I40f2cb75fb3f43baf23558007655f65a8ec5e1b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was used when we only supported subnet routers on Linux
and would nil out the SubnetRoutes slice, since no other router
worked with it; but now we support subnet routers on ~all platforms.
The field it was setting to nil is now only used for network logging
and nowhere else, so keep the field but drop the SubnetRouterWrapper,
as it's no longer useful.
Updates #cleanup
Change-Id: Id03f9b6ec33e47ad643e7b66e07911945f25db79
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This change updates all tailfs functions and the majority of the tailfs
variables to use the new drive naming.
Updates tailscale/corp#16827
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
This change updates the tailfs file and package names to their new
naming convention.
Updates tailscale/corp#16827
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
This fixes a bug that was introduced in #11258 where the handling of the
per-client limit didn't properly account for the fact that the gVisor
TCP forwarder will return 'true' to indicate that it's handled a
duplicate SYN packet, but not launch the handler goroutine.
In such a case, we neither decremented our per-client limit in the
wrapper function, nor did we do so in the handler function, leading to
our per-client limit table slowly filling up without bound.
Fix this by doing the same duplicate-tracking logic that the TCP
forwarder does so we can detect such cases and appropriately decrement
our in-flight counter.
Updates tailscale/corp#12184
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib6011a71d382a10d68c0802593f34b8153d06892
The `stack.PacketBufferPtr` type no longer exists; replace it with
`*stack.PacketBuffer` instead.
Updates #8043
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib56ceff09166a042aa3d9b80f50b2aa2d34b3683
This fixes a regression introduced with 993acf4 and released in
v1.60.0.
The regression caused us to intercept all userspace traffic to port
8080 which prevented users from exposing their own services to their
tailnet at port 8080.
Now, we only intercept traffic to port 8080 if it's bound for
100.100.100.100 or fd7a:115c:a1e0::53.
Fixes #11283
Signed-off-by: Percy Wegmann <percy@tailscale.com>
(cherry picked from commit 17cd0626f35dbc7948a78665d06a5862fc3dfdab)
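The corrected predicate in sketch form; the names are illustrative, not
the actual wgengine/netstack identifiers:

package main

import (
	"fmt"
	"net/netip"
)

var serviceIPs = []netip.Addr{
	netip.MustParseAddr("100.100.100.100"),
	netip.MustParseAddr("fd7a:115c:a1e0::53"),
}

// interceptsPort8080 reports whether netstack should claim this destination
// for the built-in service on port 8080, leaving user services listening on
// other addresses' port 8080 alone.
func interceptsPort8080(dst netip.AddrPort) bool {
	if dst.Port() != 8080 {
		return false
	}
	for _, ip := range serviceIPs {
		if dst.Addr() == ip {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(interceptsPort8080(netip.MustParseAddrPort("100.100.100.100:8080"))) // true: ours
	fmt.Println(interceptsPort8080(netip.MustParseAddrPort("100.64.0.7:8080")))      // false: user's service
}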
This is a fun one. Right now, when a client is connecting through a
subnet router, here's roughly what happens:
1. The client initiates a connection to an IP address behind a subnet
router, and sends a TCP SYN
2. The subnet router gets the SYN packet from netstack, and after
running through acceptTCP, starts DialContext-ing the destination IP,
without accepting the connection¹
3. The client retransmits the SYN packet a few times while the dial is
in progress, until either...
4. The subnet router successfully establishes a connection to the
destination IP and sends the SYN-ACK back to the client, or...
5. The subnet router times out and sends a RST to the client.
6. If the connection was successful, the client ACKs the SYN-ACK it
received, and traffic starts flowing
As a result, the notification code in forwardTCP never notices when a
new connection attempt is aborted, and it will wait until either the
connection is established, or until the OS-level connection timeout is
reached and it aborts.
To mitigate this, add a per-client limit on how many in-flight TCP
forwarding connections can be in-progress; after this, clients will see
a similar behaviour to the global limit, where new connection attempts
are aborted instead of waiting. This prevents a single misbehaving
client from blocking all other clients of a subnet router by ensuring
that it doesn't starve the global limiter.
Also, bump the global limit again to a higher value.
¹ We can't accept the connection before establishing a connection to the
remote server since otherwise we'd be opening the connection and then
immediately closing it, which breaks a bunch of stuff; see #5503 for
more details.
Updates tailscale/corp#12184
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76e7008ddd497303d75d473f534e32309c8a5144
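A sketch of the per-client accounting described above (simplified;
names and the limit value are illustrative): each source IP gets a
bounded number of in-flight dials, and new SYNs beyond that are refused
immediately instead of queueing behind a slow dial.

package main

import (
	"net/netip"
	"sync"
)

const maxInFlightPerClient = 16 // illustrative value, not the shipped limit

type dialLimiter struct {
	mu       sync.Mutex
	inFlight map[netip.Addr]int
}

// acquire reports whether client may start another forwarded dial. When it
// returns false the SYN should be refused rather than left to queue.
func (l *dialLimiter) acquire(client netip.Addr) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight == nil {
		l.inFlight = make(map[netip.Addr]int)
	}
	if l.inFlight[client] >= maxInFlightPerClient {
		return false
	}
	l.inFlight[client]++
	return true
}

// release must be called exactly once per successful acquire, whether the
// dial succeeded, failed, or was never handed off to a handler goroutine.
func (l *dialLimiter) release(client netip.Addr) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inFlight[client] > 0 {
		l.inFlight[client]--
	}
	if l.inFlight[client] == 0 {
		delete(l.inFlight, client) // keep the table from filling up unbounded
	}
}

func main() {
	var l dialLimiter
	c := netip.MustParseAddr("100.64.0.9")
	if l.acquire(c) {
		defer l.release(c)
		// ... DialContext the destination, then accept or abort the client's connection.
	}
}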
- add a clientmetric with a counter of TCP forwarder drops due to
  hitting the max-attempts limit;
- fix varz metric types, as they are all counters.
Updates #8210
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
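A sketch of such a counter using tailscale.com/util/clientmetric; the
metric name here is illustrative, not necessarily the one added:

package main

import "tailscale.com/util/clientmetric"

// Incremented whenever a forwarded TCP connection attempt is dropped
// because the in-flight limit was reached. (Name chosen for illustration only.)
var metricTCPForwardDrops = clientmetric.NewCounter("netstack_tcp_forward_max_attempts_drop")

func onForwardDropped() {
	metricTCPForwardDrops.Add(1)
}

func main() {
	onForwardDropped()
}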