iproute2 3.16.0-2 from Debian Jessie (oldoldstable) doesn't return
exit code 2 when deleting a non-existent IP rule.
Fixes#434
Signed-off-by: David Anderson <danderson@tailscale.com>
We have a filter in tailscaled itself now, which is more robust
against weird network topologies (such as the one Docker creates).
Signed-off-by: David Anderson <danderson@tailscale.com>
We'll use SO_BINDTODEVICE instead of fancy policy routing. This has
some limitations: for example, we will route all traffic through the
interface that has the main "default" (0.0.0.0/0) route, so machines
that have multiple physical interfaces might have to go through DERP to
get to some peers. But machines with multiple physical interfaces are
very likely to have policy routing (ip rule) support anyway.
So far, the only OS I know of that needs this feature is ChromeOS
(crostini). Fixes#245.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Otherwise iOS/macOS will reconfigure their routing every time anything
minor changes in the netmap (in particular, endpoints and DERP homes),
which is way too often.
Some users reported "network reconfigured" errors from Chrome when this
happens.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
This allows tailscaled's own traffic to bypass Tailscale-managed routes,
so that things like tailscale-provided default routes don't break
tailscaled itself.
Progress on #144.
Signed-off-by: David Anderson <danderson@tailscale.com>
We canceled the pingers in Close, but didn't wait around for their
goroutines to be cleaned up. This caused the ipn/e2e_test to catch
pingers in its resource leak check.
This commit introduces an object, but also simplifies the semantics
around the pinger's cancel functions. They no longer need to be called
while holding the mutex.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Specifically, this sequence:
iptables -N ts-forward
iptables -A ts-forward -m mark --mark 0x10000 -j ACCEPT
iptables -A FORWARD -j ts-forward
doesn't work on Debian-9-using-nftables, but this sequence:
iptables -N ts-forward
iptables -A FORWARD -j ts-forward
iptables -A ts-forward -m mark --mark 0x10000 -j ACCEPT
does work.
I'm sure the reason why is totally fascinating, but it's an old version
of iptables and the bug doesn't seem to exist on modern nftables, so
let's refactor our code to add rules in the always-safe order and
pretend this never happened.
Fixes#401.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
On startup, and when switching into =off and =nodivert, we were
deleting netfilter rules even if we weren't the ones that added them.
In order to avoid interfering with rules added by the sysadmin, we have
to be sure to delete rules only in the case that we added them in the
first place.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Instead of retrieving the list of chains, or the list of rules in a
chain, just try deleting the ones we don't want and then adding the
ones we do want. An error in flushing/deleting still means the rule
doesn't exist anymore, so there was no need to check for it first.
This avoids the need to parse iptables output, which avoids the need to
ever call iptables -S, which fixes#403, among other things. It's also
much more future proof in case the iptables command line changes.
Unfortunately the iptables go module doesn't properly pass the iptables
command exit code back up when doing .Delete(), so we can't correctly
check the exit code there. (exit code 1 really means the rule didn't
exist, rather than some other weird problem).
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
This removes the use of suppress_ifgroup and fwmark "x/y" notation,
which are, among other things, not available in busybox and centos6.
We also use the return codes from the 'ip' program instead of trying to
parse its output.
I also had to remove the previous hack that routed all of 100.64.0.0/10
by default, because that would add the /10 route into the 'main' route
table instead of the new table 88, which is no good. It was a terrible
hack anyway; if we wanted to capture that route, we should have
captured it explicitly as a subnet route, not as part of the addr. Note
however that this change affects all platforms, so hopefully there
won't be any surprises elsewhere.
Fixes#405
Updates #320, #144
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.
And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)
This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.
Also fix some test flakes & races.
Fixes#387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes#399 (wgengine: flaky tests)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The comment module is compiled out on several embedded systems (and
also gentoo, because netfilter can't go brrrr with comments holding it
back). Attempting to use comments results in a confusing error, and a
non-functional firewall.
Additionally, make the legacy rule cleanup non-fatal, because we *do*
have to probe for the existence of these -m comment rules, and doing
so will error out on these systems.
Signed-off-by: David Anderson <danderson@tailscale.com>
Our new build scripts try to build ipn-go-bridge on more than just
linux and darwin, so let's enable this file so it can be successful on
every platform.
This didn't catch anything yet, but it's good practice for detecting
goroutine leaks that we might not find otherwise.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.
This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.
To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Right now, filtering and packet injection in wgengine depend
on a patch to wireguard-go that probably isn't suitable for upstreaming.
This need not be the case: wireguard-go/tun.Device is an interface.
For example, faketun.go implements it to mock a TUN device for testing.
This patch implements the same interface to provide filtering
and packet injection at the tunnel device level,
at which point the wireguard-go patch should no longer be necessary.
This patch has the following performance impact on i7-7500U @ 2.70GHz,
tested in the following namespace configuration:
┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐
│ $ns1 │ │ $ns0 │ │ $ns2 │
│ client0 │ │ tailcontrol, logcatcher │ │ client1 │
│ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │
│ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │
│ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │
│ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │
│ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │
└────────────────┘ └─────────────────────────────────┘ └────────────────┘
Before:
---------------------------------------------------
| TCP send | UDP send |
|------------------------|------------------------|
| 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec |
---------------------------------------------------
After:
---------------------------------------------------
| TCP send | UDP send |
|------------------------|------------------------|
| 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec |
---------------------------------------------------
The impact on receive performance is similar.
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
This saves a layer of translation, and saves us having to
pass in extra bits and pieces of the netmap and prefs to
wgengine. Now it gets one Wireguard config, and one OS
network stack config.
Signed-off-by: David Anderson <danderson@tailscale.com>
Defensive programming against #368 in environments other than Docker,
e.g. if you try using Tailscale in Alpine Linux directly, sans
container.
Signed-off-by: David Anderson <danderson@tailscale.com>