Commit Graph

1067 Commits

Author SHA1 Message Date
Brad Fitzpatrick
946dfec98a wgengine/router: fix checkIPRuleSupportsV6 to actually use IPv6
Updates #3358 (should fix it)
Updates #391

Change-Id: Ia62437dfa81247b0b5994d554cf279c3d540e4e7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-19 11:37:05 -08:00
Brad Fitzpatrick
9259377a7f wgengine/router: don't assume Linux was built with IP_MULTIPLE_TABLES
Updates #3351
Updates #391

Change-Id: I7e66b686e05f3c970846513679cc62556ebe322a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-19 11:19:03 -08:00
Brad Fitzpatrick
0350cf0438 wgengine{,/router}: annotate some more errors
Updates #3351

Change-Id: I8b4f957d2051b3e29401bb449dbadbdada3a7c46
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-19 10:46:01 -08:00
Josh Bleecher Snyder
758c37b83d net/netns: thread logf into control functions
So that darwin can log there without panicking during tests.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-18 15:09:51 -08:00
Josh Bleecher Snyder
85184a58ed wgengine/wgcfg: recover from mismatched PublicKey/Endpoints
In rare circumstances (tailscale/corp#3016), the PublicKey
and Endpoints can diverge.

This by itself doesn't cause any harm, but our early exit
in response did, because it prevented us from recovering from it.

Remove the early exit.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-18 14:28:41 -08:00
Brad Fitzpatrick
8ec44d0d5f wgengine/magicsock: remove some log spam
Fixes tailscale/corp#3070

Change-Id: Ie50031800ec8669e0596ad6d59d1e329a5c88516
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-18 11:01:51 -08:00
Brad Fitzpatrick
61d0435ed9 wgengine/monitor: reduce Windows log spam
Fixes #3345

Change-Id: Icde9c92f88f98bb3b030d39b0424a7d389bceb88
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-18 10:57:27 -08:00
Brad Fitzpatrick
d24ed3f68e wgengine/router: add debug knob to resort to Linux "ip" command usage
Tailscale 1.18 uses netlink instead of the "ip" command to program the
Linux kernel.

The old way was kept primarily for tests, but this also adds a
TS_DEBUG_USE_IP_COMMAND environment knob to force the old way
temporarily for debugging anybody who might have problems with the
new way in 1.18.

Updates #391

Change-Id: I0236fbfda6c9c05dcb3554fcc27ec0c86456efd9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-18 08:01:22 -08:00
Josh Bleecher Snyder
b3d6704aa3 wgengine/magicsock: fix data race on endpoint.discoKey
endpoint.discoKey is protected by endpoint.mu.
endpoint.sendDiscoMessage was reading it without holding the lock.
This showed up in a CI failure and is readily reproducible locally.

The fix is in two parts.

First, for Conn.enqueueCallMeMaybe, eliminate the one-line helper method endpoint.sendDiscoMessage; call Conn.sendDiscoMessage directly.
This makes it more natural to read endpoint.discoKey in a context
in which endpoint.mu is already held.

Second, for endpoint.sendDiscoPing, explicitly pass the disco key
as an argument. Again, this makes it easier to read endpoint.discoKey
in a context in which endpoint.mu is already held.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-17 17:49:33 -08:00
Brad Fitzpatrick
cf06f9df37 net/tstun, wgengine: add packet-level and drop metrics
Primarily tstun work, but some MagicDNS stuff spread into wgengine.

No wireguard reconfig metrics (yet).

Updates #3307

Change-Id: Ide768848d7b7d0591e558f118b553013d1ec94ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-17 16:18:52 -08:00
Brad Fitzpatrick
7901289578 wgengine/magicsock: add a stress test
And add a peerMap validate method that checks its internal invariants.

Updates tailscale/corp#3016

Change-Id: I23708e68ed44d81986d9e2be82029d4555547592
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-17 14:37:28 -08:00
Josh Bleecher Snyder
5a60781919 wgengine/magicsock: increase TestDiscokeyChange connection timeout
I believe that this should eliminate the flakiness.
If GitHub CI manages to be even slower that can be believed
(and I can believe a lot at this point),
then we should roll this back and make some more invasive changes.

Updates #654
Fixes #3247 (I hope)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-17 14:13:58 -08:00
Josh Bleecher Snyder
773af7292b wgengine/magicsock: simplify peerMap.upsertEndpoint
We can do the "maybe delete" check unilaterally:
In the case of an insert, both oldDiscoKey
and ep.discoKey will be the zero value.

And since we don't use pi again, we can skip
giving it a name, which makes scoping clearer.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-16 15:15:49 -08:00
Josh Bleecher Snyder
9da22dac3d wgengine/magicsock: fix bug in peerMap.upsertEndpoint
Found by inspection by David Crawshaw while
investigating tailscale/corp#3016.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-16 15:15:49 -08:00
Josh Bleecher Snyder
16870cb754 wgengine/magicsock: fix typo in comment
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-16 15:15:49 -08:00
David Anderson
41da7620af go.mod: update wireguard-go to pick up roaming toggle
wgengine/wgcfg: introduce wgcfg.NewDevice helper to disable roaming
at all call sites (one real plus several tests).

Fixes tailscale/corp#3016.

Signed-off-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-16 13:15:04 -08:00
Brad Fitzpatrick
24ea365d48 netcheck, controlclient, magicsock: add more metrics
Updates #3307

Change-Id: Ibb33425764a75bde49230632f1b472f923551126
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-16 10:48:19 -08:00
Brad Fitzpatrick
57b039c51d util/clientmetrics: add new package to add metrics to the client
And annotate magicsock as a start.

And add localapi and debug handlers with the Prometheus-format
exporter.

Updates #3307

Change-Id: I47c5d535fe54424741df143d052760387248f8d3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-15 13:46:05 -08:00
David Anderson
0532eb30db all: replace tailcfg.DiscoKey with key.DiscoPublic.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-03 14:00:16 -07:00
Josh Bleecher Snyder
c467ed0b62 wgengine/wgcfg: always close io.Pipe
In DeviceConfig, we did not close r after calling FromUAPI.
If FromUAPI returned early due to an error, then it might
not have read all the data that IpcGetOperation wanted to write.
As a result, IpcGetOperation could hang, as in #3220.

We were also closing the wrong end of the pipe after IpcSetOperation
in ReconfigDevice.

To ensure that we get all available information to diagnose
such a situation, include all errors anytime something goes wrong.

This should fix the immediate crashing problem in #3220.
We'll then need to figure out why IpcGetOperation was failing.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-02 17:50:15 -07:00
Josh Bleecher Snyder
3fd5f4380f util/multierr: new package
github.com/go-multierror/multierror served us well.
But we need a few feature from it (implement Is),
and it's not worth maintaining a fork of such a small module.

Instead, I did a clean room implementation inspired by its API.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-11-02 17:50:15 -07:00
David Anderson
7e6a1ef4f1 tailcfg: use key.NodePublic in wire protocol types.
Updates #3206.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-02 09:11:43 -07:00
David Anderson
c17250cee2 ipn/ipnstate: use key.NodePublic instead of tailcfg.NodeKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-01 20:32:10 -07:00
David Anderson
c3d7115e63 wgengine: use key.NodePublic instead of tailcfg.NodeKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-01 18:28:45 -07:00
David Anderson
72ace0acba wgengine/magicsock: use key.NodePublic instead of tailcfg.NodeKey.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-01 18:03:48 -07:00
David Anderson
d6e7cec6a7 types/netmap: use key.NodePublic instead of tailcfg.NodeKey.
Update #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-11-01 17:07:40 -07:00
Brad Fitzpatrick
408b0923a6 wgengine/router: remove last non-test "ip" command usage on Linux
Updates #391

Change-Id: Ic2c3f8460b1e4b8d34b936a1725705fcc1effbae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-01 15:52:24 -07:00
Brad Fitzpatrick
ff1954cfd9 wgengine/router: use netlink for ip rules on Linux
Using temporary netlink fork in github.com/tailscale/netlink until we
get the necessary changes upstream in either vishvananda/netlink
or jsimonetti/rtnetlink.

Updates #391

Change-Id: I6e1de96cf0750ccba53dabff670aca0c56dffb7c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-01 15:40:36 -07:00
Brad Fitzpatrick
5dc5bd8d20 cmd/tailscaled, wgengine/netstack: always wire up netstack
Even if not in use. We plan to use it for more stuff later.

(not for iOS or macOS-GUIs yet; only tailscaled)

Change-Id: Idaef719d2a009be6a39f158fd8f57f8cca68e0ee
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-01 14:11:30 -07:00
David Anderson
84c3a09a8d types/key: export constants for key size, not a method.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 17:39:04 -07:00
David Anderson
6422789ea0 disco: use key.NodePublic instead of tailcfg.NodeKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 17:39:04 -07:00
David Anderson
418adae379 various: use NodePublic.AsNodeKey() instead of tailcfg.NodeKeyFromNodePublic()
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 16:19:27 -07:00
David Anderson
eeb97fd89f various: remove remaining uses of key.NewPrivate.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 15:01:12 -07:00
David Anderson
ccd36cb5b1 wgengine: remove use of legacy key parsing helper.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 14:57:32 -07:00
David Anderson
ef241f782e wgengine/magicsock: remove uses of tailcfg.DiscoKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 14:31:44 -07:00
David Anderson
55b6753c11 wgengine/magicsock: remove use of key.{Public,Private}.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 13:20:13 -07:00
David Anderson
c1d009b9e9 ipn/ipnstate: use key.NodePublic instead of the generic key.Public.
Updates #3206.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-29 10:00:59 -07:00
David Anderson
37c150aee1 derp: use new node key type.
Update #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-28 16:02:11 -07:00
Brad Fitzpatrick
19189d7018 wgengine/router: add a addrFamily type [linux]
In prep for more netlink-ification.

Change-Id: I7c34a04001988107dc2583597aa4f26ddb887e91
2021-10-28 14:52:29 -07:00
David Anderson
e03fda7ae6 wgengine/magicsock: remove test uses of wgkey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-28 14:17:25 -07:00
Brad Fitzpatrick
7c40a5d440 wgengine/router: refactor in prep for Linux netlink-ification
Pull out the list of policy routing rules to a data structure
now shared between the add & delete paths, but to also be shared
by the netlink paths in a future change.

Updates #391

Change-Id: I119ab1c246f141d639006c808b61c585c3d67924
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-28 13:56:46 -07:00
Josh Bleecher Snyder
94fb42d4b2 all: use testingutil.MinAllocsPerRun
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which dynamically calculate the allocations
target based on a different AllocsPerRun run.

This also allows us to tighten the "no allocs"
test in wgengine/filter.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-10-28 12:48:37 -07:00
Josh Bleecher Snyder
1df865a580 wgengine/magicsock: allow even fewer allocs per UDP receive
We improved things again for Go 1.18. Lock that in.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-10-28 12:48:37 -07:00
Josh Bleecher Snyder
c1d377078d wgengine/magicsock: use testingutil.MinAllocsPerRun
This speeds up and deflakes the test.

Fixes #2826 (again)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-10-28 12:48:37 -07:00
Brad Fitzpatrick
aad46bd9ff wgengine/router: stop cleaning up old dev rules on Linux
Anybody using that one old, unreleased version of Tailscale from over
a year ago should've rebooted their machine by now to get various
non-Tailscale security updates. :)

Change-Id: If9e043cb008b20fcd6ddfd03756b3b23a9d7aeb5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-28 12:29:54 -07:00
David Anderson
c9bf773312 wgengine/magicsock: replace use of wgkey with new node key type.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-28 11:21:52 -07:00
Brad Fitzpatrick
d36c0d3566 wgengine/router: add debug test to enumerate rules
No non-test changes.

Updates #391

Change-Id: Ia88610c08e07a119d002e58250463cb4659b9f54
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-28 11:12:16 -07:00
David Anderson
6e5175373e types/netmap: use new node key type.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-28 10:44:34 -07:00
David Anderson
3164c7410e wgengine/wgcfg: remove unused helper function.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-28 10:38:13 -07:00
Brad Fitzpatrick
dc2fbf5877 wgengine/router: start using netlink instead of 'ip' on Linux
Converts up, down, add/del addresses, add/del routes.

Not yet done: rules.

Updates #391

Change-Id: I02554ca07046d18f838e04a626ba99bbd35266fb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-28 10:16:26 -07:00
David Anderson
a9c78910bd wgengine/wgcfg: convert to use new node key type.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-28 09:39:23 -07:00
Brad Fitzpatrick
b0b0a80318 net/netcheck: implement netcheck for js/wasm clients
And the derper change to add a CORS endpoint for latency measurement.

And a little magicsock change to cut down some log spam on js/wasm.

Updates #3157

Change-Id: I5fd9e6f5098c815116ddc8ac90cbcd0602098a48
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-27 09:59:31 -07:00
Maisem Ali
85fa1b0d61 wgengine: fail NewUserspaceEngine if wireguard device doesn't come up
Just something I ran across while debugging an unrelated failure. This
is not in response to any bug/issue.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-10-25 12:34:14 -07:00
David Crawshaw
0b62f26349 magicsock: remove test data race
Speculative, I haven't been able to replicate it locally.

Fixes #3156

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-10-22 11:19:07 -07:00
Brad Fitzpatrick
ed3fb197ad wgengine/magicsock: fix/disable a few misc things to get js/wasm working
Updates #3157

Change-Id: Ie9e3a772bb9878584080bb257b32150492e26eaf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-22 09:09:37 -07:00
Brad Fitzpatrick
e25afc6656 wgengine/magicsock: don't try to determine endpoints on js/wasm
Avoid netcheck, LocalAddr, etc.

Updates #3157

Change-Id: Ibc875c787c0e101b8076e64833f4fcc809372815
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-20 12:57:45 -07:00
Brad Fitzpatrick
6cb2705833 wgengine/magicsock: don't run UDP listeners on js/wasm
Be DERP-only for now. (WebRTC can come later :))

Updates #3157

Change-Id: I56ebb3d914e37e8f4ab651306fd705b817ca381c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-20 12:23:22 -07:00
Brad Fitzpatrick
9310713bfb all: fix some js/wasm compilation issues
Change-Id: I05a3a4835e225a1e413ec3540a7c7e4a2d477084
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-20 10:06:16 -07:00
Brad Fitzpatrick
c30fa5903d wgengine/magicsock: remove peerMap.byDiscoKey map
No longer used.

Updates #3088

Change-Id: I0ced3f87baa4053d3838d3c4a828ed0293923825
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-19 12:22:11 -07:00
David Crawshaw
3552d86525 wgengine/magicsock: turn down timeouts in tests
Before:

	--- PASS: TestActiveDiscovery (11.78s)
	    --- PASS: TestActiveDiscovery/facing_easy_firewalls (5.89s)
	    --- PASS: TestActiveDiscovery/facing_nats (5.89s)
	    --- PASS: TestActiveDiscovery/simple_internet (0.89s)

After:

	--- PASS: TestActiveDiscovery (1.98s)
	    --- PASS: TestActiveDiscovery/facing_easy_firewalls (0.99s)
	    --- PASS: TestActiveDiscovery/facing_nats (0.99s)
	    --- PASS: TestActiveDiscovery/simple_internet (0.89s)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-10-19 09:22:50 -07:00
David Anderson
b956139b0c wgengine/magicsock: track IP<>node mappings without relying on discokeys.
Updates #3088.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-18 14:58:21 -07:00
Brad Fitzpatrick
7a243ae5b1 wgengine/magicsock: finish TODO to speed up peerMap.forEachEndpointWithDiscoKey
Now that peerMap tracks the set of nodes for a DiscoKey.

Updates #3088

Change-Id: I927bf2bdfd2b8126475f6b6acc44bc799fcb489f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-18 14:50:28 -07:00
Brad Fitzpatrick
11fdb14c53 wgengine/magicsock: don't check always-non-nil endpoint for nil-ness
Continuation of 2aa5df7ac1, remove nil
check because it can never be nil. (It previously was able to be nil.)

Change-Id: I59cd9ad611dbdcbfba680ed9b22e841b00c9d5e6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-18 14:37:59 -07:00
David Anderson
e7eb46bced wgengine/magicsock: add an explicit else branch to peerMap update.
Clarifies that the replace+delete of peerinfo data is only when peerInfo
already exists.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-18 13:05:52 -07:00
Maisem Ali
53199738fb wgengine: don't try to delete legacy netfilter rules on synology.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-10-18 14:51:25 -04:00
David Anderson
2aa5df7ac1 wgengine/magicsock: document and enforce that peerInfo.ep is non-nil.
Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-18 10:49:24 -07:00
David Anderson
521b44e653 wgengine/magicsock: move discoKey fields to the mutex-protected section.
Fixes #3106

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-18 10:49:24 -07:00
Maisem Ali
27799a1a96 wgengine: only use AmbientCaps on DSM7+
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-10-18 13:39:51 -04:00
Brad Fitzpatrick
a6d02dc122 wgengine/magicsock: track which NodeKey each DiscoKey was last for
This adds new fields (currently unused) to discoInfo to track what the
last verified (unambiguous) NodeKey a DiscoKey last mapped to, and
when.

Then on CallMeMaybe, Pong and on most Pings, we update the mapping
from DiscoKey to the current NodeKey for that DiscoKey.

Updates #3088

Change-Id: Idc4261972084dec71cf8ec7f9861fb9178eb0a4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-18 09:55:02 -07:00
Brad Fitzpatrick
c759fcc7d3 wgengine/magicsock: fix data race with sync.Pool in error+logging path
Fixes #3122

Change-Id: Ib52e84f9bd5813d6cf2e80ce5b2296912a48e064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-17 17:27:57 -07:00
Brad Fitzpatrick
75a7779b42 disco, wgengine/magicsock: send self node key in disco pings
This lets clients quickly (sub-millisecond within a local LAN) map
from an ambiguous disco key to a node key without waiting for a
CallMeMaybe (over relatively high latency DERP).

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-17 10:24:07 -07:00
Joe Tsai
9af27ba829 cmd/cloner: mangle "go:generate" in cloner.go
The "go generate" command blindly looks for "//go:generate" anywhere
in the file regardless of whether it is truly a comment.
Prevent this false positive in cloner.go by mangling the string
to look less like "//go:generate".

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2021-10-16 17:53:43 -07:00
Denton Gentry
def650b3e8 wgengine/magicsock: don't Rebind after STUN error if closed.
https://github.com/tailscale/tailscale/pull/3014 added a
rebind on STUN failure, which means there can now be a
tailscale.com/wgengine/magicsock.(*RebindingUDPConn).ReadFromNetaddr
in progress at the end of the test waiting for a STUN
response which will never arrive.

This causes a test flake due to the resource leak in those
cases where the Conn decided to rebind. For whatever reason,
it mostly flakes with Windows.

If the Conn is closed, don't Rebind after a send error.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-10-16 17:22:13 -07:00
Brad Fitzpatrick
f55c2bccf5 wgengine/magicsock: don't call setAddrToDiscoLocked on DERP ping
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-16 07:43:48 -07:00
Brad Fitzpatrick
569f70abfd wgengine/magicsock: finish some renamings of discoEndpoint to endpoint
Renames only; continuation of earlier 8049063d35

These kept confusing me while working on #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 22:26:07 -07:00
Brad Fitzpatrick
695df497ba wgengine/magicsock: delete peerMap.endpointForDiscoKey, remove remaining caller
The one remaining caller of peerMap.endpointForDiscoKey was making the
improper assumption that there's exactly 1 node with a given DiscoKey
in the network. That was the cause of #3088.

Now that all the other callers have been updated to not use
endpointForDiscoKey, there's no need to try to keep maintaining that
prone-to-misuse index.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 22:19:27 -07:00
Brad Fitzpatrick
04fd94acd6 wgengine/magicsock: remove endpointForDiscoKey call from handleDiscoMessage
A DiscoKey maps 1:n to endpoints. When we get a disco pong, we don't
necessarily know which endpoint sent it to us. Ask them all. There
will only usually be 1 (and in rare circumstances 2). So it's easier
to ask all two rather than building new maps from the random ping TxID
to its endpoint.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 21:59:15 -07:00
Brad Fitzpatrick
151b4415ca wgengine/magicsock: remove endpoint parameter from handlePingLocked
We can reply to a ping without knowing which exact node it's from.  As
long as it's in our netmap, it's safe to reply. If there's more than
one node with that discokey, it doesn't matter who we're relpying to.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 21:44:52 -07:00
Brad Fitzpatrick
d86081f353 wgengine/magicsock: add new discoInfo type for DiscoKey state, move some fields
As more prep for removing the false assumption that you're able to
map from DiscoKey to a single peer, move the lastPingFrom and lastPingTime
fields from the endpoint type to a new discoInfo type, effectively upgrading
the old sharedDiscoKey map (which only held a *[32]byte nacl precomputed key
as its value) to discoInfo which then includes that naclbox key.

Then start plumbing it into handlePing in prep for removing the need
for handlePing to take an endpoint parameter.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 20:48:44 -07:00
Brad Fitzpatrick
e5779f019e wgengine/magicsock: move temporary endpoint lookup later, add TODO to remove
Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 19:22:30 -07:00
Brad Fitzpatrick
36a07089ee wgengine/magicsock: remove redundant/wrong sharedDiscoKey delete
The pass just after in this method handles cleaning up sharedDiscoKey.
No need to do it wrong (assuming DiscoKey => 1 node) earlier.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 16:57:59 -07:00
Brad Fitzpatrick
3e80806804 wgengine/magicsock: pass src NodeKey to handleDiscoMessage for DERP disco msgs
And then use it to avoid another lookup-by-DiscoKey.

Updates #3088
2021-10-15 16:52:42 -07:00
Brad Fitzpatrick
82fa15fa3b wgengine/magicsock: start removing endpointForDiscoKey
It's not valid to assume that a discokey is globally unique.

This removes the first two of the four callers.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-15 16:44:02 -07:00
Brad Fitzpatrick
14f9c75293 wgengine/router: ignore Linux ip route error adding dup route
Updates #3060
Updates #391

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-10-14 14:00:45 -07:00
nicksherron
f01ff18b6f all: fix spelling mistakes
Signed-off-by: nicksherron <nsherron90@gmail.com>
2021-10-12 21:23:14 -07:00
Avery Pennarun
0d4a0bf60e magicsock: if STUN failed to send before, rebind before STUNning again.
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.

If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.

This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.

Fixes #2994

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
2021-10-08 02:17:09 +09:00
David Anderson
830f641c6b wgengine/magicsock: update discokeys on netmap change.
Fixes #3008.

Signed-off-by: David Anderson <danderson@tailscale.com>
2021-10-06 14:52:47 -07:00
Brad Fitzpatrick
29a8fb45d3 wgengine/netstack: include DNS.ExtraRecords in DNSMap
So SOCKS5 dialer can dial HTTPS cert names, for instance.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-28 10:01:36 -07:00
Brad Fitzpatrick
52737c14ac wgengine/monitor: ignore ipsec link monitor events on iOS/macOS
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-27 20:45:51 -07:00
Denton Gentry
93c2882a2f wgengine: flush DNS cache after major link change.
Windows has a public dns.Flush used in router_windows.go.
However that won't work for platforms like Linux, where
we need a different flush mechanism for resolved versus
other implementations.

We're instead adding a FlushCaches method to the dns Manager,
which can be made to work on all platforms as needed.

Fixes https://github.com/tailscale/tailscale/issues/2132

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2021-09-19 22:58:53 -07:00
Josh Bleecher Snyder
d5ab18b2e6 cmd/cloner: add Clone context to regen struct assignments
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-17 16:46:08 -07:00
Josh Bleecher Snyder
a722e48cef wgengine/magicsock: skip alloc test with -race
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-17 09:56:32 -07:00
Josh Bleecher Snyder
7693d36aed all: close fake userspace engines when tests complete
We were leaking FDs.
In a few places, switch from defer to t.Cleanup.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-15 15:31:51 -07:00
Josh Bleecher Snyder
4bbf5a8636 cmd/cloner: reduce diff noise when changing command
Spelling out the command to run for every type
means that changing the command makes for a large, repetitive diff.
Stop doing that.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-09-15 10:58:12 -07:00
Brad Fitzpatrick
dabeda21e0 net/tstun: block looped disco traffic
Updates #1526 (maybe fixes? time will tell)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-13 16:00:28 -07:00
Brad Fitzpatrick
31c1331415 wgengine/magicsock: deflake TestReceiveFromAllocs
100 iterations isn't enough with background allocs happening
apparently. 1000 seems to be reliable.

Fixes #2826

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-09 11:49:44 -07:00
Brad Fitzpatrick
2238814b99 wgengine/magicsock: fix crash introduced in recent cleanups
Fixes #2801

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-08 08:27:51 -07:00
Brad Fitzpatrick
640134421e all: update tests to use tstest.MemLogger
And give MemLogger a mutex, as one caller had, which does match the logf
contract better.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-07 20:06:15 -07:00
Brad Fitzpatrick
4c68b7df7c tstest: add MemLogger bytes.Buffer wrapper with Logf method
We use it tons of places. Updated three at least in this PR.

Another use in next commit.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-09-07 15:33:45 -07:00
David Crawshaw
9502b515f1 net/dns: replace resolver IPs with type for DoH
We currently plumb full URLs for DNS resolvers from the control server
down to the client. But when we pass the values into the net/dns
package, we throw away any URL that isn't a bare IP. This commit
continues the plumbing, and gets the URL all the way to the built in
forwarder. (It stops before plumbing URLs into the OS configurations
that can handle them.)

For #2596

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2021-09-07 14:44:26 -07:00