tailscale/net/tstun/wrap.go

1402 lines
41 KiB
Go
Raw Normal View History

// Copyright (c) Tailscale Inc & AUTHORS
// SPDX-License-Identifier: BSD-3-Clause
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
package tstun
import (
"errors"
"fmt"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
"io"
"net/netip"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
"os"
"reflect"
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
"runtime"
"slices"
"strings"
"sync"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
"sync/atomic"
"time"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
"github.com/gaissmai/bart"
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
"github.com/tailscale/wireguard-go/conn"
"github.com/tailscale/wireguard-go/device"
"github.com/tailscale/wireguard-go/tun"
"go4.org/mem"
"gvisor.dev/gvisor/pkg/tcpip/stack"
"tailscale.com/disco"
"tailscale.com/net/connstats"
"tailscale.com/net/packet"
"tailscale.com/net/packet/checksum"
"tailscale.com/net/tsaddr"
"tailscale.com/syncs"
"tailscale.com/tstime/mono"
"tailscale.com/types/ipproto"
"tailscale.com/types/key"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
"tailscale.com/types/logger"
"tailscale.com/util/clientmetric"
"tailscale.com/wgengine/capture"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
"tailscale.com/wgengine/filter"
"tailscale.com/wgengine/wgcfg"
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
)
const maxBufferSize = device.MaxMessageSize
// PacketStartOffset is the minimal amount of leading space that must exist
// before &packet[offset] in a packet passed to Read, Write, or InjectInboundDirect.
// This is necessary to avoid reallocation in wireguard-go internals.
const PacketStartOffset = device.MessageTransportHeaderSize
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// MaxPacketSize is the maximum size (in bytes)
// of a packet that can be injected into a tstun.Wrapper.
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
const MaxPacketSize = device.MaxContentSize
const tapDebug = false // for super verbose TAP debugging
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
var (
// ErrClosed is returned when attempting an operation on a closed Wrapper.
ErrClosed = errors.New("device closed")
// ErrFiltered is returned when the acted-on packet is rejected by a filter.
ErrFiltered = errors.New("packet dropped by filter")
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
)
var (
errPacketTooBig = errors.New("packet too big")
errOffsetTooBig = errors.New("offset larger than buffer length")
errOffsetTooSmall = errors.New("offset smaller than PacketStartOffset")
)
// parsedPacketPool holds a pool of Parsed structs for use in filtering.
// This is needed because escape analysis cannot see that parsed packets
// do not escape through {Pre,Post}Filter{In,Out}.
var parsedPacketPool = sync.Pool{New: func() any { return new(packet.Parsed) }}
// FilterFunc is a packet-filtering function with access to the Wrapper device.
// It must not hold onto the packet struct, as its backing storage will be reused.
type FilterFunc func(*packet.Parsed, *Wrapper) filter.Response
// Wrapper augments a tun.Device with packet filtering and injection.
//
// A Wrapper starts in a "corked" mode where Read calls are blocked
// until the Wrapper's Start method is called.
type Wrapper struct {
logf logger.Logf
limitedLogf logger.Logf // aggressively rate-limited logf used for potentially high volume errors
// tdev is the underlying Wrapper device.
tdev tun.Device
isTAP bool // whether tdev is a TAP device
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
started atomic.Bool // whether Start has been called
startCh chan struct{} // closed in Start
closeOnce sync.Once
// lastActivityAtomic is read/written atomically.
// On 32 bit systems, if the fields above change,
// you might need to add an align64 field here.
lastActivityAtomic mono.Time // time of last send or receive
destIPActivity syncs.AtomicValue[map[netip.Addr]func()]
//lint:ignore U1000 used in tap_linux.go
destMACAtomic syncs.AtomicValue[[6]byte]
discoKey syncs.AtomicValue[key.DiscoPublic]
// timeNow, if non-nil, will be used to obtain the current time.
timeNow func() time.Time
// peerConfig stores the current NAT configuration.
peerConfig atomic.Pointer[peerConfigTable]
// vectorBuffer stores the oldest unconsumed packet vector from tdev. It is
// allocated in wrap() and the underlying arrays should never grow.
vectorBuffer [][]byte
// bufferConsumedMu protects bufferConsumed from concurrent sends, closes,
// and send-after-close (by way of bufferConsumedClosed).
bufferConsumedMu sync.Mutex
// bufferConsumedClosed is true when bufferConsumed has been closed. This is
// read by bufferConsumed writers to prevent send-after-close.
bufferConsumedClosed bool
// bufferConsumed synchronizes access to vectorBuffer (shared by Read() and
// pollVector()).
//
// Close closes bufferConsumed and sets bufferConsumedClosed to true.
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
bufferConsumed chan struct{}
// closed signals poll (by closing) when the device is closed.
closed chan struct{}
// outboundMu protects outbound and vectorOutbound from concurrent sends,
// closes, and send-after-close (by way of outboundClosed).
outboundMu sync.Mutex
// outboundClosed is true when outbound or vectorOutbound have been closed.
// This is read by outbound and vectorOutbound writers to prevent
// send-after-close.
outboundClosed bool
// vectorOutbound is the queue by which packets leave the TUN device.
//
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// The directions are relative to the network, not the device:
// inbound packets arrive via UDP and are written into the TUN device;
// outbound packets are read from the TUN device and sent out via UDP.
// This queue is needed because although inbound writes are synchronous,
// the other direction must wait on a WireGuard goroutine to poll it.
//
// Empty reads are skipped by WireGuard, so it is always legal
// to discard an empty packet instead of sending it through vectorOutbound.
//
// Close closes vectorOutbound and sets outboundClosed to true.
vectorOutbound chan tunVectorReadResult
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// eventsUpDown yields up and down tun.Events that arrive on a Wrapper's events channel.
eventsUpDown chan tun.Event
// eventsOther yields non-up-and-down tun.Events that arrive on a Wrapper's events channel.
eventsOther chan tun.Event
wgengine/bench: speed test for channels, sockets, and wireguard-go. This tries to generate traffic at a rate that will saturate the receiver, without overdoing it, even in the event of packet loss. It's unrealistically more aggressive than TCP (which will back off quickly in case of packet loss) but less silly than a blind test that just generates packets as fast as it can (which can cause all the CPU to be absorbed by the transmitter, giving an incorrect impression of how much capacity the total system has). Initial indications are that a syscall about every 10 packets (TCP bulk delivery) is roughly the same speed as sending every packet through a channel. A syscall per packet is about 5x-10x slower than that. The whole tailscale wireguard-go + magicsock + packet filter combination is about 4x slower again, which is better than I thought we'd do, but probably has room for improvement. Note that in "full" tailscale, there is also a tundev read/write for every packet, effectively doubling the syscall overhead per packet. Given these numbers, it seems like read/write syscalls are only 25-40% of the total CPU time used in tailscale proper, so we do have significant non-syscall optimization work to do too. Sample output: $ GOMAXPROCS=2 go test -bench . -benchtime 5s ./cmd/tailbench goos: linux goarch: amd64 pkg: tailscale.com/cmd/tailbench cpu: Intel(R) Core(TM) i7-4785T CPU @ 2.20GHz BenchmarkTrivialNoAlloc/32-2 56340248 93.85 ns/op 340.98 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTrivialNoAlloc/124-2 57527490 99.27 ns/op 1249.10 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTrivialNoAlloc/1024-2 52537773 111.3 ns/op 9200.39 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTrivial/32-2 41878063 135.6 ns/op 236.04 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTrivial/124-2 41270439 138.4 ns/op 896.02 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTrivial/1024-2 36337252 154.3 ns/op 6635.30 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkBlockingChannel/32-2 12171654 494.3 ns/op 64.74 MB/s 0 %lost 1791 B/op 0 allocs/op BenchmarkBlockingChannel/124-2 12149956 507.8 ns/op 244.17 MB/s 0 %lost 1792 B/op 1 allocs/op BenchmarkBlockingChannel/1024-2 11034754 528.8 ns/op 1936.42 MB/s 0 %lost 1792 B/op 1 allocs/op BenchmarkNonlockingChannel/32-2 8960622 2195 ns/op 14.58 MB/s 8.825 %lost 1792 B/op 1 allocs/op BenchmarkNonlockingChannel/124-2 3014614 2224 ns/op 55.75 MB/s 11.18 %lost 1792 B/op 1 allocs/op BenchmarkNonlockingChannel/1024-2 3234915 1688 ns/op 606.53 MB/s 3.765 %lost 1792 B/op 1 allocs/op BenchmarkDoubleChannel/32-2 8457559 764.1 ns/op 41.88 MB/s 5.945 %lost 1792 B/op 1 allocs/op BenchmarkDoubleChannel/124-2 5497726 1030 ns/op 120.38 MB/s 12.14 %lost 1792 B/op 1 allocs/op BenchmarkDoubleChannel/1024-2 7985656 1360 ns/op 752.86 MB/s 13.57 %lost 1792 B/op 1 allocs/op BenchmarkUDP/32-2 1652134 3695 ns/op 8.66 MB/s 0 %lost 176 B/op 3 allocs/op BenchmarkUDP/124-2 1621024 3765 ns/op 32.94 MB/s 0 %lost 176 B/op 3 allocs/op BenchmarkUDP/1024-2 1553750 3825 ns/op 267.72 MB/s 0 %lost 176 B/op 3 allocs/op BenchmarkTCP/32-2 11056336 503.2 ns/op 63.60 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTCP/124-2 11074869 533.7 ns/op 232.32 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkTCP/1024-2 8934968 671.4 ns/op 1525.20 MB/s 0 %lost 0 B/op 0 allocs/op BenchmarkWireGuardTest/32-2 1403702 4547 ns/op 7.04 MB/s 14.37 %lost 467 B/op 3 allocs/op BenchmarkWireGuardTest/124-2 780645 7927 ns/op 15.64 MB/s 1.537 %lost 420 B/op 3 allocs/op BenchmarkWireGuardTest/1024-2 512671 11791 ns/op 86.85 MB/s 0.5206 %lost 411 B/op 3 allocs/op PASS ok tailscale.com/wgengine/bench 195.724s Updates #414. Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-03-23 21:35:35 -04:00
// filter atomically stores the currently active packet filter
filter atomic.Pointer[filter.Filter]
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// filterFlags control the verbosity of logging packet drops/accepts.
filterFlags filter.RunFlags
// jailedFilter is the packet filter for jailed nodes.
// Can be nil, which means drop all packets.
jailedFilter atomic.Pointer[filter.Filter]
// PreFilterPacketInboundFromWireGuard is the inbound filter function that runs before the main filter
// and therefore sees the packets that may be later dropped by it.
PreFilterPacketInboundFromWireGuard FilterFunc
// PostFilterPacketInboundFromWireGuard is the inbound filter function that runs after the main filter.
PostFilterPacketInboundFromWireGuard FilterFunc
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GRO for Linux (#12921) This commit implements TCP GRO for packets being written to gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported IP checksum functions. gVisor is updated in order to make use of newly exported stack.PacketBuffer GRO logic. TCP throughput towards gVisor, i.e. TUN write direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement, sometimes as high as 2x. High bandwidth-delay product paths remain receive window limited, bottlenecked by gVisor's default TCP receive socket buffer size. This will be addressed in a follow-on commit. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec 20 sender [ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec receiver The second result is from this commit with TCP GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec 20 sender [ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-02 10:41:10 -07:00
// EndPacketVectorInboundFromWireGuardFlush is a function that runs after all packets in a given vector
// have been handled by all filters. Filters may queue packets for the purposes of GRO, requiring an
// explicit flush.
EndPacketVectorInboundFromWireGuardFlush func()
// PreFilterPacketOutboundToWireGuardNetstackIntercept is a filter function that runs before the main filter
// for packets from the local system. This filter is populated by netstack to hook
// packets that should be handled by netstack. If set, this filter runs before
// PreFilterFromTunToEngine.
PreFilterPacketOutboundToWireGuardNetstackIntercept FilterFunc
// PreFilterPacketOutboundToWireGuardEngineIntercept is a filter function that runs before the main filter
// for packets from the local system. This filter is populated by wgengine to hook
// packets which it handles internally. If both this and PreFilterFromTunToNetstack
// filter functions are non-nil, this filter runs second.
PreFilterPacketOutboundToWireGuardEngineIntercept FilterFunc
// PostFilterPacketOutboundToWireGuard is the outbound filter function that runs after the main filter.
PostFilterPacketOutboundToWireGuard FilterFunc
// OnTSMPPongReceived, if non-nil, is called whenever a TSMP pong arrives.
OnTSMPPongReceived func(packet.TSMPPongReply)
// OnICMPEchoResponseReceived, if non-nil, is called whenever a ICMP echo response
// arrives. If the packet is to be handled internally this returns true,
// false otherwise.
OnICMPEchoResponseReceived func(*packet.Parsed) bool
// PeerAPIPort, if non-nil, returns the peerapi port that's
// running for the given IP address.
PeerAPIPort func(netip.Addr) (port uint16, ok bool)
// disableFilter disables all filtering when set. This should only be used in tests.
disableFilter bool
// disableTSMPRejected disables TSMP rejected responses. For tests.
disableTSMPRejected bool
// stats maintains per-connection counters.
stats atomic.Pointer[connstats.Statistics]
captureHook syncs.AtomicValue[capture.Callback]
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
// tunInjectedRead is an injected packet pretending to be a tun.Read().
type tunInjectedRead struct {
// Only one of packet or data should be set, and are read in that order of
// precedence.
packet *stack.PacketBuffer
data []byte
}
// tunVectorReadResult is the result of a tun.Read(), or an injected packet
// pretending to be a tun.Read().
type tunVectorReadResult struct {
// When err AND data are nil, injected will be set with meaningful data
// (injected packet). If either err OR data is non-nil, injected should be
// ignored (a "real" tun.Read).
err error
data [][]byte
injected tunInjectedRead
dataOffset int
}
type setWrapperer interface {
// setWrapper enables the underlying TUN/TAP to have access to the Wrapper.
// It MUST be called only once during initialization, other usage is unsafe.
setWrapper(*Wrapper)
}
// Start unblocks any Wrapper.Read calls that have already started
// and makes the Wrapper functional.
//
// Start must be called exactly once after the various Tailscale
// subsystems have been wired up to each other.
func (w *Wrapper) Start() {
w.started.Store(true)
close(w.startCh)
}
func WrapTAP(logf logger.Logf, tdev tun.Device) *Wrapper {
return wrap(logf, tdev, true)
}
func Wrap(logf logger.Logf, tdev tun.Device) *Wrapper {
return wrap(logf, tdev, false)
}
func wrap(logf logger.Logf, tdev tun.Device, isTAP bool) *Wrapper {
logf = logger.WithPrefix(logf, "tstun: ")
w := &Wrapper{
logf: logf,
limitedLogf: logger.RateLimitedFn(logf, 1*time.Minute, 2, 10),
isTAP: isTAP,
tdev: tdev,
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// bufferConsumed is conceptually a condition variable:
// a goroutine should not block when setting it, even with no listeners.
bufferConsumed: make(chan struct{}, 1),
closed: make(chan struct{}),
// vectorOutbound can be unbuffered; the buffer is an optimization.
vectorOutbound: make(chan tunVectorReadResult, 1),
eventsUpDown: make(chan tun.Event),
eventsOther: make(chan tun.Event),
// TODO(dmytro): (highly rate-limited) hexdumps should happen on unknown packets.
filterFlags: filter.LogAccepts | filter.LogDrops,
startCh: make(chan struct{}),
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
w.vectorBuffer = make([][]byte, tdev.BatchSize())
for i := range w.vectorBuffer {
w.vectorBuffer[i] = make([]byte, maxBufferSize)
}
go w.pollVector()
go w.pumpEvents()
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// The buffer starts out consumed.
w.bufferConsumed <- struct{}{}
w.noteActivity()
if sw, ok := w.tdev.(setWrapperer); ok {
sw.setWrapper(w)
}
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
return w
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
// now returns the current time, either by calling t.timeNow if set or time.Now
// if not.
func (t *Wrapper) now() time.Time {
if t.timeNow != nil {
return t.timeNow()
}
return time.Now()
}
// SetDestIPActivityFuncs sets a map of funcs to run per packet
// destination (the map keys).
//
// The map ownership passes to the Wrapper. It must be non-nil.
func (t *Wrapper) SetDestIPActivityFuncs(m map[netip.Addr]func()) {
net/packet: remove the custom IP4/IP6 types in favor of netaddr.IP. Upstream netaddr has a change that makes it alloc-free, so it's safe to use in hot codepaths. This gets rid of one of the many IP types in our codebase. Performance is currently worse across the board. This is likely due in part to netaddr.IP being a larger value type (4b -> 24b for IPv4, 16b -> 24b for IPv6), and in other part due to missing low-hanging fruit optimizations in netaddr. However, the regression is less bad than it looks at first glance, because we'd micro-optimized packet.IP* in the past few weeks. This change drops us back to roughly where we were at the 1.2 release, but with the benefit of a significant code and architectural simplification. name old time/op new time/op delta pkg:tailscale.com/net/packet goos:linux goarch:amd64 Decode/tcp4-8 12.2ns ± 5% 29.7ns ± 2% +142.32% (p=0.008 n=5+5) Decode/tcp6-8 12.6ns ± 3% 65.1ns ± 2% +418.47% (p=0.008 n=5+5) Decode/udp4-8 11.8ns ± 3% 30.5ns ± 2% +157.94% (p=0.008 n=5+5) Decode/udp6-8 27.1ns ± 1% 65.7ns ± 2% +142.36% (p=0.016 n=4+5) Decode/icmp4-8 24.6ns ± 2% 30.5ns ± 2% +23.65% (p=0.016 n=4+5) Decode/icmp6-8 22.9ns ±51% 65.5ns ± 2% +186.19% (p=0.008 n=5+5) Decode/igmp-8 18.1ns ±44% 30.2ns ± 1% +66.89% (p=0.008 n=5+5) Decode/unknown-8 20.8ns ± 1% 10.6ns ± 9% -49.11% (p=0.016 n=4+5) pkg:tailscale.com/wgengine/filter goos:linux goarch:amd64 Filter/icmp4-8 30.5ns ± 1% 77.9ns ± 3% +155.01% (p=0.008 n=5+5) Filter/tcp4_syn_in-8 43.7ns ± 3% 123.0ns ± 3% +181.72% (p=0.008 n=5+5) Filter/tcp4_syn_out-8 24.5ns ± 2% 45.7ns ± 6% +86.22% (p=0.008 n=5+5) Filter/udp4_in-8 64.8ns ± 1% 210.0ns ± 2% +223.87% (p=0.008 n=5+5) Filter/udp4_out-8 119ns ± 0% 278ns ± 0% +133.78% (p=0.016 n=4+5) Filter/icmp6-8 40.3ns ± 2% 204.4ns ± 4% +407.70% (p=0.008 n=5+5) Filter/tcp6_syn_in-8 35.3ns ± 3% 199.2ns ± 2% +464.95% (p=0.008 n=5+5) Filter/tcp6_syn_out-8 32.8ns ± 2% 81.0ns ± 2% +147.10% (p=0.008 n=5+5) Filter/udp6_in-8 106ns ± 2% 290ns ± 2% +174.48% (p=0.008 n=5+5) Filter/udp6_out-8 184ns ± 2% 314ns ± 3% +70.43% (p=0.016 n=4+5) pkg:tailscale.com/wgengine/tstun goos:linux goarch:amd64 Write-8 9.02ns ± 3% 8.92ns ± 1% ~ (p=0.421 n=5+5) name old alloc/op new alloc/op delta pkg:tailscale.com/net/packet goos:linux goarch:amd64 Decode/tcp4-8 0.00B 0.00B ~ (all equal) Decode/tcp6-8 0.00B 0.00B ~ (all equal) Decode/udp4-8 0.00B 0.00B ~ (all equal) Decode/udp6-8 0.00B 0.00B ~ (all equal) Decode/icmp4-8 0.00B 0.00B ~ (all equal) Decode/icmp6-8 0.00B 0.00B ~ (all equal) Decode/igmp-8 0.00B 0.00B ~ (all equal) Decode/unknown-8 0.00B 0.00B ~ (all equal) pkg:tailscale.com/wgengine/filter goos:linux goarch:amd64 Filter/icmp4-8 0.00B 0.00B ~ (all equal) Filter/tcp4_syn_in-8 0.00B 0.00B ~ (all equal) Filter/tcp4_syn_out-8 0.00B 0.00B ~ (all equal) Filter/udp4_in-8 0.00B 0.00B ~ (all equal) Filter/udp4_out-8 16.0B ± 0% 64.0B ± 0% +300.00% (p=0.008 n=5+5) Filter/icmp6-8 0.00B 0.00B ~ (all equal) Filter/tcp6_syn_in-8 0.00B 0.00B ~ (all equal) Filter/tcp6_syn_out-8 0.00B 0.00B ~ (all equal) Filter/udp6_in-8 0.00B 0.00B ~ (all equal) Filter/udp6_out-8 48.0B ± 0% 64.0B ± 0% +33.33% (p=0.008 n=5+5) name old allocs/op new allocs/op delta pkg:tailscale.com/net/packet goos:linux goarch:amd64 Decode/tcp4-8 0.00 0.00 ~ (all equal) Decode/tcp6-8 0.00 0.00 ~ (all equal) Decode/udp4-8 0.00 0.00 ~ (all equal) Decode/udp6-8 0.00 0.00 ~ (all equal) Decode/icmp4-8 0.00 0.00 ~ (all equal) Decode/icmp6-8 0.00 0.00 ~ (all equal) Decode/igmp-8 0.00 0.00 ~ (all equal) Decode/unknown-8 0.00 0.00 ~ (all equal) pkg:tailscale.com/wgengine/filter goos:linux goarch:amd64 Filter/icmp4-8 0.00 0.00 ~ (all equal) Filter/tcp4_syn_in-8 0.00 0.00 ~ (all equal) Filter/tcp4_syn_out-8 0.00 0.00 ~ (all equal) Filter/udp4_in-8 0.00 0.00 ~ (all equal) Filter/udp4_out-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Filter/icmp6-8 0.00 0.00 ~ (all equal) Filter/tcp6_syn_in-8 0.00 0.00 ~ (all equal) Filter/tcp6_syn_out-8 0.00 0.00 ~ (all equal) Filter/udp6_in-8 0.00 0.00 ~ (all equal) Filter/udp6_out-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-19 16:43:25 -08:00
t.destIPActivity.Store(m)
}
// SetDiscoKey sets the current discovery key.
//
// It is only used for filtering out bogus traffic when network
// stack(s) get confused; see Issue 1526.
func (t *Wrapper) SetDiscoKey(k key.DiscoPublic) {
t.discoKey.Store(k)
}
// isSelfDisco reports whether packet p
// looks like a Disco packet from ourselves.
// See Issue 1526.
func (t *Wrapper) isSelfDisco(p *packet.Parsed) bool {
if p.IPProto != ipproto.UDP {
return false
}
pkt := p.Payload()
discobs, ok := disco.Source(pkt)
if !ok {
return false
}
discoSrc := key.DiscoPublicFromRaw32(mem.B(discobs))
selfDiscoPub := t.discoKey.Load()
return selfDiscoPub == discoSrc
}
func (t *Wrapper) Close() error {
var err error
t.closeOnce.Do(func() {
if t.started.CompareAndSwap(false, true) {
close(t.startCh)
}
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
close(t.closed)
t.bufferConsumedMu.Lock()
t.bufferConsumedClosed = true
close(t.bufferConsumed)
t.bufferConsumedMu.Unlock()
t.outboundMu.Lock()
t.outboundClosed = true
close(t.vectorOutbound)
t.outboundMu.Unlock()
err = t.tdev.Close()
})
return err
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
// isClosed reports whether t is closed.
func (t *Wrapper) isClosed() bool {
select {
case <-t.closed:
return true
default:
return false
}
}
// pumpEvents copies events from t.tdev to t.eventsUpDown and t.eventsOther.
// pumpEvents exits when t.tdev.events or t.closed is closed.
// pumpEvents closes t.eventsUpDown and t.eventsOther when it exits.
func (t *Wrapper) pumpEvents() {
defer close(t.eventsUpDown)
defer close(t.eventsOther)
src := t.tdev.Events()
for {
// Retrieve an event from the TUN device.
var event tun.Event
var ok bool
select {
case <-t.closed:
return
case event, ok = <-src:
if !ok {
return
}
}
// Pass along event to the correct recipient.
// Though event is a bitmask, in practice there is only ever one bit set at a time.
dst := t.eventsOther
if event&(tun.EventUp|tun.EventDown) != 0 {
dst = t.eventsUpDown
}
select {
case <-t.closed:
return
case dst <- event:
}
}
}
// EventsUpDown returns a TUN event channel that contains all Up and Down events.
func (t *Wrapper) EventsUpDown() chan tun.Event {
return t.eventsUpDown
}
// Events returns a TUN event channel that contains all non-Up, non-Down events.
// It is named Events because it is the set of events that we want to expose to wireguard-go,
// and Events is the name specified by the wireguard-go tun.Device interface.
func (t *Wrapper) Events() <-chan tun.Event {
return t.eventsOther
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
func (t *Wrapper) File() *os.File {
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
return t.tdev.File()
}
func (t *Wrapper) MTU() (int, error) {
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
return t.tdev.MTU()
}
func (t *Wrapper) Name() (string, error) {
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
return t.tdev.Name()
}
const ethernetFrameSize = 14 // 2 six byte MACs, 2 bytes ethertype
// pollVector polls t.tdev.Read(), placing the oldest unconsumed packet vector
// into t.vectorBuffer. This is needed because t.tdev.Read() in general may
// block (it does on Windows), so packets may be stuck in t.vectorOutbound if
// t.Read() called t.tdev.Read() directly.
func (t *Wrapper) pollVector() {
sizes := make([]int, len(t.vectorBuffer))
readOffset := PacketStartOffset
if t.isTAP {
readOffset = PacketStartOffset - ethernetFrameSize
}
for range t.bufferConsumed {
DoRead:
for i := range t.vectorBuffer {
t.vectorBuffer[i] = t.vectorBuffer[i][:cap(t.vectorBuffer[i])]
}
var n int
var err error
for n == 0 && err == nil {
if t.isClosed() {
return
}
n, err = t.tdev.Read(t.vectorBuffer[:], sizes, readOffset)
if t.isTAP && tapDebug {
s := fmt.Sprintf("% x", t.vectorBuffer[0][:])
for strings.HasSuffix(s, " 00") {
s = strings.TrimSuffix(s, " 00")
}
t.logf("TAP read %v, %v: %s", n, err, s)
}
}
for i := range sizes[:n] {
t.vectorBuffer[i] = t.vectorBuffer[i][:readOffset+sizes[i]]
}
if t.isTAP {
if err == nil {
ethernetFrame := t.vectorBuffer[0][readOffset:]
if t.handleTAPFrame(ethernetFrame) {
goto DoRead
}
}
// Fall through. We got an IP packet.
if sizes[0] >= ethernetFrameSize {
t.vectorBuffer[0] = t.vectorBuffer[0][:readOffset+sizes[0]-ethernetFrameSize]
}
if tapDebug {
t.logf("tap regular frame: %x", t.vectorBuffer[0][PacketStartOffset:PacketStartOffset+sizes[0]])
}
}
t.sendVectorOutbound(tunVectorReadResult{
data: t.vectorBuffer[:n],
dataOffset: PacketStartOffset,
err: err,
})
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
}
// sendBufferConsumed does t.bufferConsumed <- struct{}{}.
func (t *Wrapper) sendBufferConsumed() {
t.bufferConsumedMu.Lock()
defer t.bufferConsumedMu.Unlock()
if t.bufferConsumedClosed {
return
}
t.bufferConsumed <- struct{}{}
}
// injectOutbound does t.vectorOutbound <- r
func (t *Wrapper) injectOutbound(r tunInjectedRead) {
t.outboundMu.Lock()
defer t.outboundMu.Unlock()
if t.outboundClosed {
return
}
t.vectorOutbound <- tunVectorReadResult{
injected: r,
}
}
// sendVectorOutbound does t.vectorOutbound <- r.
func (t *Wrapper) sendVectorOutbound(r tunVectorReadResult) {
t.outboundMu.Lock()
defer t.outboundMu.Unlock()
if t.outboundClosed {
return
}
t.vectorOutbound <- r
}
// snat does SNAT on p if the destination address requires a different source address.
func (pc *peerConfigTable) snat(p *packet.Parsed) {
oldSrc := p.Src.Addr()
newSrc := pc.selectSrcIP(oldSrc, p.Dst.Addr())
if oldSrc != newSrc {
checksum.UpdateSrcAddr(p, newSrc)
}
}
// dnat does destination NAT on p.
func (pc *peerConfigTable) dnat(p *packet.Parsed) {
oldDst := p.Dst.Addr()
newDst := pc.mapDstIP(p.Src.Addr(), oldDst)
if newDst != oldDst {
checksum.UpdateDstAddr(p, newDst)
}
}
// findV4 returns the first Tailscale IPv4 address in addrs.
func findV4(addrs []netip.Prefix) netip.Addr {
for _, ap := range addrs {
a := ap.Addr()
if a.Is4() && tsaddr.IsTailscaleIP(a) {
return a
}
}
return netip.Addr{}
}
// findV6 returns the first Tailscale IPv6 address in addrs.
func findV6(addrs []netip.Prefix) netip.Addr {
for _, ap := range addrs {
a := ap.Addr()
if a.Is6() && tsaddr.IsTailscaleIP(a) {
return a
}
}
return netip.Addr{}
}
// peerConfigTable contains configuration for individual peers and related
// information necessary to perform peer-specific operations. It should be
// treated as immutable.
//
// The nil value is a valid configuration.
type peerConfigTable struct {
// nativeAddr4 and nativeAddr6 are the IPv4/IPv6 Tailscale Addresses of
// the current node.
//
// These are implicitly used as the address to rewrite to in the DNAT
// path (as configured by listenAddrs, below). The IPv4 address will be
// used if the inbound packet is IPv4, and the IPv6 address if the
// inbound packet is IPv6.
nativeAddr4, nativeAddr6 netip.Addr
// byIP contains configuration for each peer, indexed by a peer's IP
// address(es).
byIP bart.Table[*peerConfig]
// masqAddrCounts is a count of peers by MasqueradeAsIP.
// TODO? for logging
masqAddrCounts map[netip.Addr]int
}
// peerConfig is the configuration for a single peer.
type peerConfig struct {
// dstMasqAddr{4,6} are the addresses that should be used as the
// source address when masquerading packets to this peer (i.e.
// SNAT). If an address is not valid, the packet should not be
// masqueraded for that address family.
dstMasqAddr4 netip.Addr
dstMasqAddr6 netip.Addr
// jailed is whether this peer is "jailed" (i.e. is restricted from being
// able to initiate connections to this node). This is the case for shared
// nodes.
jailed bool
}
func (c *peerConfigTable) String() string {
if c == nil {
return "peerConfigTable(nil)"
}
var b strings.Builder
b.WriteString("peerConfigTable{")
fmt.Fprintf(&b, "nativeAddr4: %v, ", c.nativeAddr4)
fmt.Fprintf(&b, "nativeAddr6: %v, ", c.nativeAddr6)
// TODO: figure out how to iterate/debug/print c.byIP
b.WriteString("}")
return b.String()
}
func (c *peerConfig) String() string {
if c == nil {
return "peerConfig(nil)"
}
var b strings.Builder
b.WriteString("peerConfig{")
fmt.Fprintf(&b, "dstMasqAddr4: %v, ", c.dstMasqAddr4)
fmt.Fprintf(&b, "dstMasqAddr6: %v, ", c.dstMasqAddr6)
fmt.Fprintf(&b, "jailed: %v}", c.jailed)
return b.String()
}
// mapDstIP returns the destination IP to use for a packet to dst.
// If dst is not one of the listen addresses, it is returned as-is,
// otherwise the native address is returned.
func (pc *peerConfigTable) mapDstIP(src, oldDst netip.Addr) netip.Addr {
if pc == nil {
return oldDst
}
// The packet we're processing is inbound from WireGuard, received from
// a peer. The 'src' of the packet is the remote peer's IP address,
// possibly the masqueraded address (if the peer is shared/etc.).
//
// The 'dst' of the packet is the address for this local node. It could
// be a masquerade address that we told other nodes to use, or one of
// our local node's Addresses.
c, ok := pc.byIP.Lookup(src)
if !ok {
return oldDst
}
if oldDst.Is4() && pc.nativeAddr4.IsValid() && c.dstMasqAddr4 == oldDst {
return pc.nativeAddr4
}
if oldDst.Is6() && pc.nativeAddr6.IsValid() && c.dstMasqAddr6 == oldDst {
return pc.nativeAddr6
}
return oldDst
}
// selectSrcIP returns the source IP to use for a packet to dst.
// If the packet is not from the native address, it is returned as-is.
func (pc *peerConfigTable) selectSrcIP(oldSrc, dst netip.Addr) netip.Addr {
if pc == nil {
return oldSrc
}
// If this packet doesn't originate from this Tailscale node, don't
// SNAT it (e.g. if we're a subnet router).
if oldSrc.Is4() && oldSrc != pc.nativeAddr4 {
return oldSrc
}
if oldSrc.Is6() && oldSrc != pc.nativeAddr6 {
return oldSrc
}
// Look up the configuration for the destination
c, ok := pc.byIP.Lookup(dst)
if !ok {
return oldSrc
}
// Perform SNAT based on the address family and whether we have a valid
// addr.
if oldSrc.Is4() && c.dstMasqAddr4.IsValid() {
return c.dstMasqAddr4
}
if oldSrc.Is6() && c.dstMasqAddr6.IsValid() {
return c.dstMasqAddr6
}
// No SNAT; use old src
return oldSrc
}
// peerConfigTableFromWGConfig generates a peerConfigTable from nm. If NAT is
// not required, and no additional configuration is present, it returns nil.
func peerConfigTableFromWGConfig(wcfg *wgcfg.Config) *peerConfigTable {
if wcfg == nil {
return nil
}
nativeAddr4 := findV4(wcfg.Addresses)
nativeAddr6 := findV6(wcfg.Addresses)
if !nativeAddr4.IsValid() && !nativeAddr6.IsValid() {
return nil
}
ret := &peerConfigTable{
nativeAddr4: nativeAddr4,
nativeAddr6: nativeAddr6,
masqAddrCounts: make(map[netip.Addr]int),
}
// When using an exit node that requires masquerading, we need to
// fill out the routing table with all peers not just the ones that
// require masquerading.
exitNodeRequiresMasq := false // true if using an exit node and it requires masquerading
for _, p := range wcfg.Peers {
isExitNode := slices.Contains(p.AllowedIPs, tsaddr.AllIPv4()) || slices.Contains(p.AllowedIPs, tsaddr.AllIPv6())
if isExitNode {
hasMasqAddr := false ||
(p.V4MasqAddr != nil && p.V4MasqAddr.IsValid()) ||
(p.V6MasqAddr != nil && p.V6MasqAddr.IsValid())
if hasMasqAddr {
exitNodeRequiresMasq = true
}
break
}
}
byIPSize := 0
for i := range wcfg.Peers {
p := &wcfg.Peers[i]
// Build a routing table that configures DNAT (i.e. changing
// the V4MasqAddr/V6MasqAddr for a given peer to the current
// peer's v4/v6 IP).
var addrToUse4, addrToUse6 netip.Addr
if p.V4MasqAddr != nil && p.V4MasqAddr.IsValid() {
addrToUse4 = *p.V4MasqAddr
ret.masqAddrCounts[addrToUse4]++
}
if p.V6MasqAddr != nil && p.V6MasqAddr.IsValid() {
addrToUse6 = *p.V6MasqAddr
ret.masqAddrCounts[addrToUse6]++
}
// If the exit node requires masquerading, set the masquerade
// addresses to our native addresses.
if exitNodeRequiresMasq {
if !addrToUse4.IsValid() && nativeAddr4.IsValid() {
addrToUse4 = nativeAddr4
}
if !addrToUse6.IsValid() && nativeAddr6.IsValid() {
addrToUse6 = nativeAddr6
}
}
if !addrToUse4.IsValid() && !addrToUse6.IsValid() && !p.IsJailed {
// NAT not required for this peer.
continue
}
// Use the same peer configuration for each address of the peer.
pc := &peerConfig{
dstMasqAddr4: addrToUse4,
dstMasqAddr6: addrToUse6,
jailed: p.IsJailed,
}
// Insert an entry into our routing table for each allowed IP.
for _, ip := range p.AllowedIPs {
ret.byIP.Insert(ip, pc)
byIPSize++
}
}
if byIPSize == 0 && len(ret.masqAddrCounts) == 0 {
return nil
}
return ret
}
func (pc *peerConfigTable) inboundPacketIsJailed(p *packet.Parsed) bool {
if pc == nil {
return false
}
c, ok := pc.byIP.Lookup(p.Src.Addr())
if !ok {
return false
}
return c.jailed
}
func (pc *peerConfigTable) outboundPacketIsJailed(p *packet.Parsed) bool {
if pc == nil {
return false
}
c, ok := pc.byIP.Lookup(p.Dst.Addr())
if !ok {
return false
}
return c.jailed
}
// SetWGConfig is called when a new NetworkMap is received.
func (t *Wrapper) SetWGConfig(wcfg *wgcfg.Config) {
cfg := peerConfigTableFromWGConfig(wcfg)
old := t.peerConfig.Swap(cfg)
if !reflect.DeepEqual(old, cfg) {
t.logf("peer config: %v", cfg)
}
}
var (
magicDNSIPPort = netip.AddrPortFrom(tsaddr.TailscaleServiceIP(), 0) // 100.100.100.100:0
magicDNSIPPortv6 = netip.AddrPortFrom(tsaddr.TailscaleServiceIPv6(), 0)
)
func (t *Wrapper) filterPacketOutboundToWireGuard(p *packet.Parsed, pc *peerConfigTable) filter.Response {
// Fake ICMP echo responses to MagicDNS (100.100.100.100).
if p.IsEchoRequest() {
switch p.Dst {
case magicDNSIPPort:
header := p.ICMP4Header()
header.ToResponse()
outp := packet.Generate(&header, p.Payload())
t.InjectInboundCopy(outp)
return filter.DropSilently // don't pass on to OS; already handled
case magicDNSIPPortv6:
header := p.ICMP6Header()
header.ToResponse()
outp := packet.Generate(&header, p.Payload())
t.InjectInboundCopy(outp)
return filter.DropSilently // don't pass on to OS; already handled
}
}
// Issue 1526 workaround: if we sent disco packets over
// Tailscale from ourselves, then drop them, as that shouldn't
// happen unless a networking stack is confused, as it seems
// macOS in Network Extension mode might be.
if p.IPProto == ipproto.UDP && // disco is over UDP; avoid isSelfDisco call for TCP/etc
t.isSelfDisco(p) {
t.limitedLogf("[unexpected] received self disco out packet over tstun; dropping")
metricPacketOutDropSelfDisco.Add(1)
return filter.DropSilently
}
if t.PreFilterPacketOutboundToWireGuardNetstackIntercept != nil {
if res := t.PreFilterPacketOutboundToWireGuardNetstackIntercept(p, t); res.IsDrop() {
// Handled by netstack.Impl.handleLocalPackets (quad-100 DNS primarily)
return res
}
}
if t.PreFilterPacketOutboundToWireGuardEngineIntercept != nil {
if res := t.PreFilterPacketOutboundToWireGuardEngineIntercept(p, t); res.IsDrop() {
// Handled by userspaceEngine.handleLocalPackets (primarily handles
// quad-100 if netstack is not installed).
return res
}
}
// If the outbound packet is to a jailed peer, use our jailed peer
// packet filter.
var filt *filter.Filter
if pc.outboundPacketIsJailed(p) {
filt = t.jailedFilter.Load()
} else {
filt = t.filter.Load()
}
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
if filt == nil {
return filter.Drop
}
if filt.RunOut(p, t.filterFlags) != filter.Accept {
metricPacketOutDropFilter.Add(1)
return filter.Drop
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
if t.PostFilterPacketOutboundToWireGuard != nil {
if res := t.PostFilterPacketOutboundToWireGuard(p, t); res.IsDrop() {
return res
}
}
return filter.Accept
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
// noteActivity records that there was a read or write at the current time.
func (t *Wrapper) noteActivity() {
t.lastActivityAtomic.StoreAtomic(mono.Now())
}
// IdleDuration reports how long it's been since the last read or write to this device.
//
// Its value should only be presumed accurate to roughly 10ms granularity.
// If there's never been activity, the duration is since the wrapper was created.
func (t *Wrapper) IdleDuration() time.Duration {
return mono.Since(t.lastActivityAtomic.LoadAtomic())
}
func (t *Wrapper) Read(buffs [][]byte, sizes []int, offset int) (int, error) {
if !t.started.Load() {
<-t.startCh
}
// packet from OS read and sent to WG
res, ok := <-t.vectorOutbound
if !ok {
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
return 0, io.EOF
}
if res.err != nil && len(res.data) == 0 {
return 0, res.err
}
if res.data == nil {
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
return t.injectedRead(res.injected, buffs, sizes, offset)
}
metricPacketOut.Add(int64(len(res.data)))
var buffsPos int
p := parsedPacketPool.Get().(*packet.Parsed)
defer parsedPacketPool.Put(p)
captHook := t.captureHook.Load()
pc := t.peerConfig.Load()
for _, data := range res.data {
p.Decode(data[res.dataOffset:])
if m := t.destIPActivity.Load(); m != nil {
if fn := m[p.Dst.Addr()]; fn != nil {
fn()
}
}
if captHook != nil {
captHook(capture.FromLocal, t.now(), p.Buffer(), p.CaptureMeta)
}
if !t.disableFilter {
response := t.filterPacketOutboundToWireGuard(p, pc)
if response != filter.Accept {
metricPacketOutDrop.Add(1)
continue
}
}
// Make sure to do SNAT after filtering, so that any flow tracking in
// the filter sees the original source address. See #12133.
pc.snat(p)
n := copy(buffs[buffsPos][offset:], p.Buffer())
if n != len(data)-res.dataOffset {
panic(fmt.Sprintf("short copy: %d != %d", n, len(data)-res.dataOffset))
}
sizes[buffsPos] = n
if stats := t.stats.Load(); stats != nil {
stats.UpdateTxVirtual(p.Buffer())
}
buffsPos++
}
// t.vectorBuffer has a fixed location in memory.
// TODO(raggi): add an explicit field and possibly method to the tunVectorReadResult
// to signal when sendBufferConsumed should be called.
if &res.data[0] == &t.vectorBuffer[0] {
// We are done with t.buffer. Let poll() re-use it.
t.sendBufferConsumed()
}
t.noteActivity()
return buffsPos, res.err
}
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
const (
minTCPHeaderSize = 20
)
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
func stackGSOToTunGSO(pkt []byte, gso stack.GSO) (tun.GSOOptions, error) {
options := tun.GSOOptions{
CsumStart: gso.L3HdrLen,
CsumOffset: gso.CsumOffset,
GSOSize: gso.MSS,
NeedsCsum: gso.NeedsCsum,
}
switch gso.Type {
case stack.GSONone:
options.GSOType = tun.GSONone
return options, nil
case stack.GSOTCPv4:
options.GSOType = tun.GSOTCPv4
case stack.GSOTCPv6:
options.GSOType = tun.GSOTCPv6
default:
return tun.GSOOptions{}, fmt.Errorf("unsupported gVisor GSOType: %v", gso.Type)
}
// options.HdrLen is both layer 3 and 4 together, whereas gVisor only
// gives us layer 3 length. We have to gather TCP header length
// ourselves.
if len(pkt) < int(gso.L3HdrLen)+minTCPHeaderSize {
return tun.GSOOptions{}, errors.New("gVisor GSOTCP packet length too short")
}
tcphLen := uint16(pkt[int(gso.L3HdrLen)+12] >> 4 * 4)
options.HdrLen = gso.L3HdrLen + tcphLen
return options, nil
}
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
func invertGSOChecksum(pkt []byte, gso stack.GSO) {
if gso.NeedsCsum != true {
return
}
at := int(gso.L3HdrLen + gso.CsumOffset)
if at+1 > len(pkt)-1 {
return
}
pkt[at] = ^pkt[at]
pkt[at+1] = ^pkt[at+1]
}
// injectedRead handles injected reads, which bypass filters.
func (t *Wrapper) injectedRead(res tunInjectedRead, outBuffs [][]byte, sizes []int, offset int) (n int, err error) {
var gso stack.GSO
pkt := outBuffs[0][offset:]
if res.packet != nil {
bufN := copy(pkt, res.packet.NetworkHeader().Slice())
bufN += copy(pkt[bufN:], res.packet.TransportHeader().Slice())
bufN += copy(pkt[bufN:], res.packet.Data().AsRange().ToSlice())
gso = res.packet.GSOOptions
pkt = pkt[:bufN]
defer res.packet.DecRef() // defer DecRef so we may continue to reference it
} else {
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
sizes[0] = copy(pkt, res.data)
pkt = pkt[:sizes[0]]
n = 1
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
pc := t.peerConfig.Load()
p := parsedPacketPool.Get().(*packet.Parsed)
defer parsedPacketPool.Put(p)
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
p.Decode(pkt)
// We invert the transport layer checksum before and after snat() if gVisor
// handed us a segment with a partial checksum. A partial checksum is not a
// ones' complement of the sum, and incremental checksum updating that could
// occur as a result of snat() is not aware of this. Alternatively we could
// plumb partial transport layer checksum awareness down through snat(),
// but the surface area of such a change is much larger, and not yet
// justified by this singular case.
invertGSOChecksum(pkt, gso)
pc.snat(p)
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
invertGSOChecksum(pkt, gso)
if m := t.destIPActivity.Load(); m != nil {
if fn := m[p.Dst.Addr()]; fn != nil {
net/packet: remove the custom IP4/IP6 types in favor of netaddr.IP. Upstream netaddr has a change that makes it alloc-free, so it's safe to use in hot codepaths. This gets rid of one of the many IP types in our codebase. Performance is currently worse across the board. This is likely due in part to netaddr.IP being a larger value type (4b -> 24b for IPv4, 16b -> 24b for IPv6), and in other part due to missing low-hanging fruit optimizations in netaddr. However, the regression is less bad than it looks at first glance, because we'd micro-optimized packet.IP* in the past few weeks. This change drops us back to roughly where we were at the 1.2 release, but with the benefit of a significant code and architectural simplification. name old time/op new time/op delta pkg:tailscale.com/net/packet goos:linux goarch:amd64 Decode/tcp4-8 12.2ns ± 5% 29.7ns ± 2% +142.32% (p=0.008 n=5+5) Decode/tcp6-8 12.6ns ± 3% 65.1ns ± 2% +418.47% (p=0.008 n=5+5) Decode/udp4-8 11.8ns ± 3% 30.5ns ± 2% +157.94% (p=0.008 n=5+5) Decode/udp6-8 27.1ns ± 1% 65.7ns ± 2% +142.36% (p=0.016 n=4+5) Decode/icmp4-8 24.6ns ± 2% 30.5ns ± 2% +23.65% (p=0.016 n=4+5) Decode/icmp6-8 22.9ns ±51% 65.5ns ± 2% +186.19% (p=0.008 n=5+5) Decode/igmp-8 18.1ns ±44% 30.2ns ± 1% +66.89% (p=0.008 n=5+5) Decode/unknown-8 20.8ns ± 1% 10.6ns ± 9% -49.11% (p=0.016 n=4+5) pkg:tailscale.com/wgengine/filter goos:linux goarch:amd64 Filter/icmp4-8 30.5ns ± 1% 77.9ns ± 3% +155.01% (p=0.008 n=5+5) Filter/tcp4_syn_in-8 43.7ns ± 3% 123.0ns ± 3% +181.72% (p=0.008 n=5+5) Filter/tcp4_syn_out-8 24.5ns ± 2% 45.7ns ± 6% +86.22% (p=0.008 n=5+5) Filter/udp4_in-8 64.8ns ± 1% 210.0ns ± 2% +223.87% (p=0.008 n=5+5) Filter/udp4_out-8 119ns ± 0% 278ns ± 0% +133.78% (p=0.016 n=4+5) Filter/icmp6-8 40.3ns ± 2% 204.4ns ± 4% +407.70% (p=0.008 n=5+5) Filter/tcp6_syn_in-8 35.3ns ± 3% 199.2ns ± 2% +464.95% (p=0.008 n=5+5) Filter/tcp6_syn_out-8 32.8ns ± 2% 81.0ns ± 2% +147.10% (p=0.008 n=5+5) Filter/udp6_in-8 106ns ± 2% 290ns ± 2% +174.48% (p=0.008 n=5+5) Filter/udp6_out-8 184ns ± 2% 314ns ± 3% +70.43% (p=0.016 n=4+5) pkg:tailscale.com/wgengine/tstun goos:linux goarch:amd64 Write-8 9.02ns ± 3% 8.92ns ± 1% ~ (p=0.421 n=5+5) name old alloc/op new alloc/op delta pkg:tailscale.com/net/packet goos:linux goarch:amd64 Decode/tcp4-8 0.00B 0.00B ~ (all equal) Decode/tcp6-8 0.00B 0.00B ~ (all equal) Decode/udp4-8 0.00B 0.00B ~ (all equal) Decode/udp6-8 0.00B 0.00B ~ (all equal) Decode/icmp4-8 0.00B 0.00B ~ (all equal) Decode/icmp6-8 0.00B 0.00B ~ (all equal) Decode/igmp-8 0.00B 0.00B ~ (all equal) Decode/unknown-8 0.00B 0.00B ~ (all equal) pkg:tailscale.com/wgengine/filter goos:linux goarch:amd64 Filter/icmp4-8 0.00B 0.00B ~ (all equal) Filter/tcp4_syn_in-8 0.00B 0.00B ~ (all equal) Filter/tcp4_syn_out-8 0.00B 0.00B ~ (all equal) Filter/udp4_in-8 0.00B 0.00B ~ (all equal) Filter/udp4_out-8 16.0B ± 0% 64.0B ± 0% +300.00% (p=0.008 n=5+5) Filter/icmp6-8 0.00B 0.00B ~ (all equal) Filter/tcp6_syn_in-8 0.00B 0.00B ~ (all equal) Filter/tcp6_syn_out-8 0.00B 0.00B ~ (all equal) Filter/udp6_in-8 0.00B 0.00B ~ (all equal) Filter/udp6_out-8 48.0B ± 0% 64.0B ± 0% +33.33% (p=0.008 n=5+5) name old allocs/op new allocs/op delta pkg:tailscale.com/net/packet goos:linux goarch:amd64 Decode/tcp4-8 0.00 0.00 ~ (all equal) Decode/tcp6-8 0.00 0.00 ~ (all equal) Decode/udp4-8 0.00 0.00 ~ (all equal) Decode/udp6-8 0.00 0.00 ~ (all equal) Decode/icmp4-8 0.00 0.00 ~ (all equal) Decode/icmp6-8 0.00 0.00 ~ (all equal) Decode/igmp-8 0.00 0.00 ~ (all equal) Decode/unknown-8 0.00 0.00 ~ (all equal) pkg:tailscale.com/wgengine/filter goos:linux goarch:amd64 Filter/icmp4-8 0.00 0.00 ~ (all equal) Filter/tcp4_syn_in-8 0.00 0.00 ~ (all equal) Filter/tcp4_syn_out-8 0.00 0.00 ~ (all equal) Filter/udp4_in-8 0.00 0.00 ~ (all equal) Filter/udp4_out-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Filter/icmp6-8 0.00 0.00 ~ (all equal) Filter/tcp6_syn_in-8 0.00 0.00 ~ (all equal) Filter/tcp6_syn_out-8 0.00 0.00 ~ (all equal) Filter/udp6_in-8 0.00 0.00 ~ (all equal) Filter/udp6_out-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) Signed-off-by: David Anderson <danderson@tailscale.com>
2020-12-19 16:43:25 -08:00
fn()
}
}
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
if res.packet != nil {
var gsoOptions tun.GSOOptions
gsoOptions, err = stackGSOToTunGSO(pkt, gso)
if err != nil {
return 0, err
}
n, err = tun.GSOSplit(pkt, gsoOptions, outBuffs, sizes, offset)
}
if stats := t.stats.Load(); stats != nil {
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
for i := 0; i < n; i++ {
stats.UpdateTxVirtual(outBuffs[i][offset : offset+sizes[i]])
}
}
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
t.noteActivity()
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
metricPacketOut.Add(int64(n))
return n, err
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
func (t *Wrapper) filterPacketInboundFromWireGuard(p *packet.Parsed, captHook capture.Callback, pc *peerConfigTable) filter.Response {
if captHook != nil {
captHook(capture.FromPeer, t.now(), p.Buffer(), p.CaptureMeta)
}
if p.IPProto == ipproto.TSMP {
if pingReq, ok := p.AsTSMPPing(); ok {
t.noteActivity()
t.injectOutboundPong(p, pingReq)
return filter.DropSilently
} else if data, ok := p.AsTSMPPong(); ok {
if f := t.OnTSMPPongReceived; f != nil {
f(data)
}
}
}
if p.IsEchoResponse() {
if f := t.OnICMPEchoResponseReceived; f != nil && f(p) {
// Note: this looks dropped in metrics, even though it was
// handled internally.
return filter.DropSilently
}
}
// Issue 1526 workaround: if we see disco packets over
// Tailscale from ourselves, then drop them, as that shouldn't
// happen unless a networking stack is confused, as it seems
// macOS in Network Extension mode might be.
if p.IPProto == ipproto.UDP && // disco is over UDP; avoid isSelfDisco call for TCP/etc
t.isSelfDisco(p) {
t.limitedLogf("[unexpected] received self disco in packet over tstun; dropping")
metricPacketInDropSelfDisco.Add(1)
return filter.DropSilently
}
if t.PreFilterPacketInboundFromWireGuard != nil {
if res := t.PreFilterPacketInboundFromWireGuard(p, t); res.IsDrop() {
return res
}
}
var filt *filter.Filter
if pc.inboundPacketIsJailed(p) {
filt = t.jailedFilter.Load()
} else {
filt = t.filter.Load()
}
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
if filt == nil {
return filter.Drop
}
outcome := filt.RunIn(p, t.filterFlags)
// Let peerapi through the filter; its ACLs are handled at L7,
// not at the packet level.
if outcome != filter.Accept &&
p.IPProto == ipproto.TCP &&
p.TCPFlags&packet.TCPSyn != 0 &&
t.PeerAPIPort != nil {
if port, ok := t.PeerAPIPort(p.Dst.Addr()); ok && port == p.Dst.Port() {
outcome = filter.Accept
}
}
if outcome != filter.Accept {
metricPacketInDropFilter.Add(1)
// Tell them, via TSMP, we're dropping them due to the ACL.
// Their host networking stack can translate this into ICMP
// or whatnot as required. But notably, their GUI or tailscale CLI
// can show them a rejection history with reasons.
if p.IPVersion == 4 && p.IPProto == ipproto.TCP && p.TCPFlags&packet.TCPSyn != 0 && !t.disableTSMPRejected {
rj := packet.TailscaleRejectedHeader{
IPSrc: p.Dst.Addr(),
IPDst: p.Src.Addr(),
Src: p.Src,
Dst: p.Dst,
Proto: p.IPProto,
Reason: packet.RejectedDueToACLs,
}
if filt.ShieldsUp() {
rj.Reason = packet.RejectedDueToShieldsUp
}
pkt := packet.Generate(rj, nil)
t.InjectOutbound(pkt)
// TODO(bradfitz): also send a TCP RST, after the TSMP message.
}
return filter.Drop
}
if t.PostFilterPacketInboundFromWireGuard != nil {
if res := t.PostFilterPacketInboundFromWireGuard(p, t); res.IsDrop() {
return res
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
}
return filter.Accept
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
// Write accepts incoming packets. The packets begins at buffs[:][offset:],
// like wireguard-go/tun.Device.Write.
func (t *Wrapper) Write(buffs [][]byte, offset int) (int, error) {
metricPacketIn.Add(int64(len(buffs)))
i := 0
p := parsedPacketPool.Get().(*packet.Parsed)
defer parsedPacketPool.Put(p)
captHook := t.captureHook.Load()
pc := t.peerConfig.Load()
for _, buff := range buffs {
p.Decode(buff[offset:])
pc.dnat(p)
if !t.disableFilter {
if t.filterPacketInboundFromWireGuard(p, captHook, pc) != filter.Accept {
metricPacketInDrop.Add(1)
} else {
buffs[i] = buff
i++
}
}
}
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GRO for Linux (#12921) This commit implements TCP GRO for packets being written to gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported IP checksum functions. gVisor is updated in order to make use of newly exported stack.PacketBuffer GRO logic. TCP throughput towards gVisor, i.e. TUN write direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement, sometimes as high as 2x. High bandwidth-delay product paths remain receive window limited, bottlenecked by gVisor's default TCP receive socket buffer size. This will be addressed in a follow-on commit. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec 20 sender [ 5] 0.00-10.00 sec 4.77 GBytes 4.10 Gbits/sec receiver The second result is from this commit with TCP GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec 20 sender [ 5] 0.00-10.00 sec 10.6 GBytes 9.14 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-08-02 10:41:10 -07:00
if t.EndPacketVectorInboundFromWireGuardFlush != nil {
t.EndPacketVectorInboundFromWireGuardFlush()
}
if t.disableFilter {
i = len(buffs)
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
buffs = buffs[:i]
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
if len(buffs) > 0 {
t.noteActivity()
_, err := t.tdevWrite(buffs, offset)
return len(buffs), err
}
return 0, nil
}
func (t *Wrapper) tdevWrite(buffs [][]byte, offset int) (int, error) {
if stats := t.stats.Load(); stats != nil {
for i := range buffs {
stats.UpdateRxVirtual((buffs)[i][offset:])
}
}
return t.tdev.Write(buffs, offset)
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
func (t *Wrapper) GetFilter() *filter.Filter {
return t.filter.Load()
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
func (t *Wrapper) SetFilter(filt *filter.Filter) {
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
t.filter.Store(filt)
}
func (t *Wrapper) GetJailedFilter() *filter.Filter {
return t.jailedFilter.Load()
}
func (t *Wrapper) SetJailedFilter(filt *filter.Filter) {
t.jailedFilter.Store(filt)
}
// InjectInboundPacketBuffer makes the Wrapper device behave as if a packet
// with the given contents was received from the network.
// It takes ownership of one reference count on the packet. The injected
// packet will not pass through inbound filters.
//
// This path is typically used to deliver synthesized packets to the
// host networking stack.
func (t *Wrapper) InjectInboundPacketBuffer(pkt *stack.PacketBuffer) error {
buf := make([]byte, PacketStartOffset+pkt.Size())
n := copy(buf[PacketStartOffset:], pkt.NetworkHeader().Slice())
n += copy(buf[PacketStartOffset+n:], pkt.TransportHeader().Slice())
n += copy(buf[PacketStartOffset+n:], pkt.Data().AsRange().ToSlice())
if n != pkt.Size() {
panic("unexpected packet size after copy")
}
pkt.DecRef()
pc := t.peerConfig.Load()
p := parsedPacketPool.Get().(*packet.Parsed)
defer parsedPacketPool.Put(p)
p.Decode(buf[PacketStartOffset:])
captHook := t.captureHook.Load()
if captHook != nil {
captHook(capture.SynthesizedToLocal, t.now(), p.Buffer(), p.CaptureMeta)
}
pc.dnat(p)
return t.InjectInboundDirect(buf, PacketStartOffset)
}
// InjectInboundDirect makes the Wrapper device behave as if a packet
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// with the given contents was received from the network.
// It blocks and does not take ownership of the packet.
// The injected packet will not pass through inbound filters.
//
// The packet contents are to start at &buf[offset].
// offset must be greater or equal to PacketStartOffset.
// The space before &buf[offset] will be used by WireGuard.
func (t *Wrapper) InjectInboundDirect(buf []byte, offset int) error {
if len(buf) > MaxPacketSize {
return errPacketTooBig
}
if len(buf) < offset {
return errOffsetTooBig
}
if offset < PacketStartOffset {
return errOffsetTooSmall
}
// Write to the underlying device to skip filters.
_, err := t.tdevWrite([][]byte{buf}, offset) // TODO(jwhited): alloc?
return err
}
// InjectInboundCopy takes a packet without leading space,
2020-09-25 12:24:44 -07:00
// reallocates it to conform to the InjectInboundDirect interface
// and calls InjectInboundDirect on it. Injecting a nil packet is a no-op.
func (t *Wrapper) InjectInboundCopy(packet []byte) error {
// We duplicate this check from InjectInboundDirect here
// to avoid wasting an allocation on an oversized packet.
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
if len(packet) > MaxPacketSize {
return errPacketTooBig
}
if len(packet) == 0 {
return nil
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
buf := make([]byte, PacketStartOffset+len(packet))
copy(buf[PacketStartOffset:], packet)
return t.InjectInboundDirect(buf, PacketStartOffset)
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
func (t *Wrapper) injectOutboundPong(pp *packet.Parsed, req packet.TSMPPingRequest) {
pong := packet.TSMPPongReply{
Data: req.Data,
}
if t.PeerAPIPort != nil {
pong.PeerAPIPort, _ = t.PeerAPIPort(pp.Dst.Addr())
}
switch pp.IPVersion {
case 4:
h4 := pp.IP4Header()
h4.ToResponse()
pong.IPHeader = h4
case 6:
h6 := pp.IP6Header()
h6.ToResponse()
pong.IPHeader = h6
default:
return
}
t.InjectOutbound(packet.Generate(pong, nil))
}
// InjectOutbound makes the Wrapper device behave as if a packet
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
// with the given contents was sent to the network.
// It does not block, but takes ownership of the packet.
// The injected packet will not pass through outbound filters.
// Injecting an empty packet is a no-op.
func (t *Wrapper) InjectOutbound(pkt []byte) error {
if len(pkt) > MaxPacketSize {
return errPacketTooBig
}
if len(pkt) == 0 {
return nil
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
t.injectOutbound(tunInjectedRead{data: pkt})
return nil
}
// InjectOutboundPacketBuffer logically behaves as InjectOutbound. It takes ownership of one
// reference count on the packet, and the packet may be mutated. The packet refcount will be
// decremented after the injected buffer has been read.
func (t *Wrapper) InjectOutboundPacketBuffer(pkt *stack.PacketBuffer) error {
size := pkt.Size()
if size > MaxPacketSize {
pkt.DecRef()
return errPacketTooBig
}
if size == 0 {
pkt.DecRef()
return nil
}
if capt := t.captureHook.Load(); capt != nil {
b := pkt.ToBuffer()
capt(capture.SynthesizedToPeer, t.now(), b.Flatten(), packet.CaptureMeta{})
}
t.injectOutbound(tunInjectedRead{packet: pkt})
return nil
wgengine: wrap tun.Device to support filtering and packet injection (#358) Right now, filtering and packet injection in wgengine depend on a patch to wireguard-go that probably isn't suitable for upstreaming. This need not be the case: wireguard-go/tun.Device is an interface. For example, faketun.go implements it to mock a TUN device for testing. This patch implements the same interface to provide filtering and packet injection at the tunnel device level, at which point the wireguard-go patch should no longer be necessary. This patch has the following performance impact on i7-7500U @ 2.70GHz, tested in the following namespace configuration: ┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐ │ $ns1 │ │ $ns0 │ │ $ns2 │ │ client0 │ │ tailcontrol, logcatcher │ │ client1 │ │ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │ │ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │ │ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │ │ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │ │ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │ └────────────────┘ └─────────────────────────────────┘ └────────────────┘ Before: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec | --------------------------------------------------- After: --------------------------------------------------- | TCP send | UDP send | |------------------------|------------------------| | 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec | --------------------------------------------------- The impact on receive performance is similar. Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
2020-05-13 09:16:17 -04:00
}
func (t *Wrapper) BatchSize() int {
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869) This commit implements TCP GSO for packets being read from gVisor on Linux. Windows support will follow later. The wireguard-go dependency is updated in order to make use of newly exported GSO logic from its tun package. A new gVisor stack.LinkEndpoint implementation has been established (linkEndpoint) that is loosely modeled after its predecessor (channel.Endpoint). This new implementation supports GSO of monster TCP segments up to 64K in size, whereas channel.Endpoint only supports up to 32K. linkEndpoint will also be required for GRO, which will be implemented in a follow-on commit. TCP throughput from gVisor, i.e. TUN read direction, is dramatically improved as a result of this commit. Benchmarks show substantial improvement through a wide range of RTT and loss conditions, sometimes as high as 5x. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 57856fc without TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.51 GBytes 2.15 Gbits/sec 154 sender [ 5] 0.00-10.00 sec 2.49 GBytes 2.14 Gbits/sec receiver The second result is from this commit with TCP GSO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec 6 sender [ 5] 0.00-10.00 sec 12.6 GBytes 10.8 Gbits/sec receiver Updates #6816 Signed-off-by: Jordan Whited <jordan@tailscale.com>
2024-07-31 09:42:11 -07:00
if runtime.GOOS == "linux" {
// Always setup Linux to handle vectors, even in the very rare case that
// the underlying t.tdev returns 1. gVisor GSO is always enabled for
// Linux, and we cannot make a determination on gVisor usage at
// wireguard-go.Device startup, which is when this value matters for
// packet memory init.
return conn.IdealBatchSize
}
return t.tdev.BatchSize()
}
// Unwrap returns the underlying tun.Device.
func (t *Wrapper) Unwrap() tun.Device {
return t.tdev
}
// SetStatistics specifies a per-connection statistics aggregator.
// Nil may be specified to disable statistics gathering.
func (t *Wrapper) SetStatistics(stats *connstats.Statistics) {
t.stats.Store(stats)
}
var (
metricPacketIn = clientmetric.NewCounter("tstun_in_from_wg")
metricPacketInDrop = clientmetric.NewCounter("tstun_in_from_wg_drop")
metricPacketInDropFilter = clientmetric.NewCounter("tstun_in_from_wg_drop_filter")
metricPacketInDropSelfDisco = clientmetric.NewCounter("tstun_in_from_wg_drop_self_disco")
metricPacketOut = clientmetric.NewCounter("tstun_out_to_wg")
metricPacketOutDrop = clientmetric.NewCounter("tstun_out_to_wg_drop")
metricPacketOutDropFilter = clientmetric.NewCounter("tstun_out_to_wg_drop_filter")
metricPacketOutDropSelfDisco = clientmetric.NewCounter("tstun_out_to_wg_drop_self_disco")
)
func (t *Wrapper) InstallCaptureHook(cb capture.Callback) {
t.captureHook.Store(cb)
}