tailscale/wgengine
Andrew Dunham c5abbcd4b4 wgengine/netstack: add a per-client limit for in-flight TCP forwards
This is a fun one. Right now, when a client is connecting through a
subnet router, here's roughly what happens:

1. The client initiates a connection to an IP address behind a subnet
   router, and sends a TCP SYN
2. The subnet router gets the SYN packet from netstack, and after
   running through acceptTCP, starts DialContext-ing the destination IP,
   without accepting the connection¹
3. The client retransmits the SYN packet a few times while the dial is
   in progress, until either...
4. The subnet router successfully establishes a connection to the
   destination IP and sends the SYN-ACK back to the client, or...
5. The subnet router times out and sends a RST to the client.
6. If the connection was successful, the client ACKs the SYN-ACK it
   received, and traffic starts flowing

As a result, the notification code in forwardTCP never notices when a
client aborts a new connection attempt, so the forward stays in flight
until either the connection is established or the OS-level connection
timeout is reached.

To mitigate this, add a per-client limit on the number of in-flight TCP
forwards. Once a client reaches its limit, it sees behaviour similar to
the global limit: new connection attempts are aborted instead of
waiting. This prevents a single misbehaving client from starving the
global limiter and blocking all other clients of the subnet router.

Also, bump the global limit again to a higher value.

¹ We can't accept the connection before establishing a connection to the
remote server since otherwise we'd be opening the connection and then
immediately closing it, which breaks a bunch of stuff; see #5503 for
more details.

Updates tailscale/corp#12184

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76e7008ddd497303d75d473f534e32309c8a5144
2024-02-27 15:25:40 -05:00
Name | Last commit | Date
bench | tailcfg, all: use []netip.AddrPort instead of []string for Endpoints | 2023-10-01 18:23:02 -07:00
capture | various: add golangci-lint, fix issues (#7905) | 2023-04-17 18:38:24 -04:00
filter | wgengine/filter: add protocol-agnostic packet checker (#10446) | 2023-12-02 16:30:33 -06:00
magicsock | all: remove LenIter, use Go 1.22 range-over-int instead | 2024-02-25 12:29:45 -08:00
netlog | wgengine/netlog: fix nil pointer dereference in logtail (#8598) | 2023-07-13 08:54:29 -07:00
netstack | wgengine/netstack: add a per-client limit for in-flight TCP forwards | 2024-02-27 15:25:40 -05:00
router | wgengine/router: fix ip rule restoration | 2024-02-15 11:36:40 -05:00
wgcfg | all: remove LenIter, use Go 1.22 range-over-int instead | 2024-02-25 12:29:45 -08:00
wgint | ipn/ipnstate, wgengine/wgint: add handshake attempts accessors | 2024-02-26 19:09:12 -08:00
wglog | wgengine/wglog: add TS_DEBUG_RAW_WGLOG envknob for raw wg logs | 2024-02-24 14:59:48 -08:00
winnet | all: update copyright and license headers | 2023-01-27 15:36:29 -08:00
mem_ios.go | all: update copyright and license headers | 2023-01-27 15:36:29 -08:00
pendopen.go | wgengine: make pendOpen time later, after dup check | 2024-02-26 19:09:12 -08:00
userspace_ext_test.go | tsd: add package with System type to unify subsystem init, discovery | 2023-05-04 14:21:59 -07:00
userspace_test.go | control,tailcfg,wgengine/magicsock: add nodeAttr to enable/disable peer MTU | 2023-09-21 04:17:12 -07:00
userspace.go | ipn/ipnstate, wgengine/wgint: add handshake attempts accessors | 2024-02-26 19:09:12 -08:00
watchdog_js.go | all: update copyright and license headers | 2023-01-27 15:36:29 -08:00
watchdog_test.go | all: update copyright and license headers | 2023-01-27 15:36:29 -08:00
watchdog.go | cmd/tailscaled, ipn/ipnlocal, wgengine: shutdown tailscaled if wgdevice is closed | 2024-02-26 14:45:35 -06:00
wgengine.go | cmd/tailscaled, ipn/ipnlocal, wgengine: shutdown tailscaled if wgdevice is closed | 2024-02-26 14:45:35 -06:00