At startup the client doesn't yet have the DERP map so can't do STUN
queries against DERP servers, so it only knows it local interface
addresses, not its STUN-mapped addresses.
We were reporting the interface-local addresses to control, getting
the DERP map, and then immediately reporting the full set of
updates. That was an extra HTTP request to control, but worse: it was
an extra broadcast from control out to all the peers in the network.
Now, skip the initial update if there are no stun results and we don't
have a DERP map.
More work remains optimizing start-up requests/map updates, but this
is a start.
Updates tailscale/corp#557
If no interfaces are up, calm down and stop spamming so much. It was
noticed as especially bad on Windows, but probably was bad
everywhere. I just have the best network conditions testing on a
Windows VM.
Updates #604
Updates #654. See that issue for a discussion of why
this timeout reduces flakiness, and what next steps are.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Rather than consider bigs jumps in last-received-from activity as a
signal to possibly reconfigure the set of wireguard peers to have
configured, instead just track the set of peers that are currently
excluded from the configuration. Easier to reason about.
Also adds a bit more logging.
This might fix an error we saw on a machine running a recent unstable
build:
2020-08-26 17:54:11.528033751 +0000 UTC: 8.6M/92.6M magicsock: [unexpected] lazy endpoint not created for [UcppE], d:42a770f678357249
2020-08-26 17:54:13.691305296 +0000 UTC: 8.7M/92.6M magicsock: DERP packet received from idle peer [UcppE]; created=false
2020-08-26 17:54:13.691383687 +0000 UTC: 8.7M/92.6M magicsock: DERP packet from unknown key: [UcppE]
If it does happen again, though, we'll have more logs.
Seems to break linux CI builder. Cannot reproduce locally,
so attempting a rollback.
This reverts commit cd7bc02ab1.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Without this, a freshly started ipn client will be stuck in the
"Starting" state until something triggers a call to RequestStatus.
Usually a UI does this, but until then we can sit in this state
until poked by an external event, as is evidenced by our e2e tests
locking up when DERP is attached.
(This only recently became a problem when we enabled lazy handshaking
everywhere, otherwise the wireugard tunnel creation would also
trigger a RequestStatus.)
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Consider:
Hard NAT (A) <---> Hard NAT w/ mapped port (B)
If A sends a packet to B's mapped port, A can disco ping B directly,
with low latency, without DERP.
But B couldn't establish a path back to A and needed to use DERP,
despite already logging about A's endpoint and adding a mapping to it
for other purposes (the wireguard conn.Endpoint lookup also needed
it).
This adds the tracking to discoEndpoint too so it'll be used for
finding a path back.
Fixestailscale/corp#556
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For example:
$ tailscale ping -h
USAGE
ping <hostname-or-IP>
FLAGS
-c 10 max number of pings to send
-stop-once-direct true stop once a direct path is established
-verbose false verbose output
$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 65ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 252ms
pong from monitoring (100.88.178.64) via [2604:a880:2:d1::36:d001]:41641 in 33ms
Fixes#661
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1) we weren't waking up a discoEndpoint that once existed and
went idle for 5 minutes and then got a disco message again.
2) userspaceEngine.noteReceiveActivity had a buggy check; fixed
and added a test
This removes the atomic bool that tried to track whether we needed to acquire
the lock on a future recursive call back into magicsock. Unfortunately that
hack doesn't work because we also had a lock ordering issue between magicsock
and userspaceEngine (see issue). This documents that too.
Fixes#644
If a node is behind a hard NAT and is using an explicit local port
number, assume they might've mapped a port and add their public IPv4
address with the local tailscaled's port number as a candidate endpoint.
Starting with fe68841dc7, some e2e tests
got flaky. Rather than debug them (they're gnarly), just revert to the old
behavior as far as those tests are concerned. The tests were somehow
using magicsock without a private key and expecting it to do ... something.
My goal with fe68841dc7 was to stop log spam
and unnecessary work I saw on the iOS app when when stopping the app.
Instead, only stop doing that work on any transition from
once-had-a-private-key to no-longer-have-a-private-key. That fixes
what I wanted to fix while still making the mysterious e2e tests
happy.
There is a race in natlab where we might start shutdown while natlab is still running
a goroutine or two to deliver packets. This adds a small grace period to try and receive
it before continuing shutdown.
Signed-off-by: David Anderson <danderson@tailscale.com>
The first packet to transit may take several seconds to do so, because
setup rates in wgengine may result in the initial WireGuard handshake
init to get dropped. So, we have to wait at least long enough for a
retransmit to correct the fault.
Signed-off-by: David Anderson <danderson@tailscale.com>
Active discovery lets us introspect the state of the network stack precisely
enough that it's unnecessary, and dropping the initial DERP packets greatly
slows down tests. Additionally, it's unrealistic since our production network
will never deliver _only_ discovery packets, it'll be all or nothing.
Signed-off-by: David Anderson <danderson@tailscale.com>
Uses natlab only, because the point of this active discovery test is going to be
that it should get through a lot of obstacles.
Signed-off-by: David Anderson <danderson@tailscale.com>
The deadlock was:
* Conn.Close was called, which acquired c.mu
* Then this goroutine scheduled:
if firstDerp {
startGate = c.derpStarted
go func() {
dc.Connect(ctx)
close(c.derpStarted)
}()
}
* The getRegion hook for that derphttp.Client then ran, which also
tries to acquire c.mu.
This change makes that hook first see if we're already in a closing
state and then it can pretend that region doesn't exist.
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).
Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.
This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).
We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Before this patch, the 250ms sleep would not be interrupted by context cancellation,
which would result in the goroutine sometimes lingering in tests (100ms grace period).
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
Very rarely, cancellation occurs between a successful send on derpRecvCh
and a call to copyBuf on the receiving side.
Without this patch, this situation results in <-copyBuf blocking indefinitely.
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
Peers advertising a discovery key know how to speak the discovery
protocol and do their own heartbeats to get through NATs and keep NATs
open. No need for the pinger except for with legacy peers.
The new interface lets implementors more precisely distinguish
local traffic from forwarded traffic, and applies different
forwarding logic within Machines for each type. This allows
Machines to be packet forwarders, which didn't quite work
with the implementation of Inject.
Signed-off-by: David Anderson <danderson@tailscale.com>
The test demonstrates that magicsock can traverse two stateful
firewalls facing each other, that each require localhost to
initiate connections.
Signed-off-by: David Anderson <danderson@tailscale.com>
There's a lot of confusion around what tailscale status shows, so make it better:
show region names, last write time, and put stars around DERP too if active.
Now stars are always present if activity, and always somewhere.
* fix tailscale status for peers using discovery
* as part of that, pull out disco address selection into reusable
and testable discoEndpoint.addrForSendLocked
* truncate ping/pong logged hex txids in half to eliminate noise
* move a bunch of random time constants into named constants
with docs
* track a history of per-endpoint pong replies for future use &
status display
* add "send" and " got" prefix to discovery message logging
immediately before the frame type so it's easier to read than
searching for the "<-" or "->" arrows earlier in the line; but keep
those as the more reasily machine readable part for later.
Updates #483
Update the mapping from ip:port to discokey, so when we retrieve a
packet from the network, we can find the same conn.Endpoint that we
gave to wireguard-go previously, without making it think we've
roamed. (We did, but we're not using its roaming.)
Updates #483
Ping messages now go out somewhat regularly, pong replies are sent,
and pong replies are now partially handled enough to upgrade off DERP
to LAN.
CallMeMaybe packets are sent & received over DERP, but aren't yet
handled. That's next (and regular maintenance timers), and then WAN
should work.
Updates #483
Starting at yesterday's e96f22e560 (convering some UDPAddrs to
IPPorts), Conn.ReceiveIPv4 could return a nil addr, which would make
its way through wireguard-go and blow up later. The DERP read path
wasn't initializing the addr result parameter any more, and wgRecvAddr
wasn't checking it either.
Fixes#515
And while plumbing, a bit of discovery work I'll need: the
endpointOfAddr map to map from validated paths to the discoEndpoint.
Not being populated yet.
Updates #483
This adds a new magicsock endpoint type only used when both sides
support discovery (that is, are advertising a discovery
key). Otherwise the old code is used.
So far the new code only communicates over DERP as proof that the new
code paths are wired up. None of the actually discovery messaging is
implemented yet.
Support for discovery (generating and advertising a key) are still
behind an environment variable for now.
Updates #483
And track known peers.
Doesn't yet do anything with the messages. (nor does it send any yet)
Start of docs on the message format. More will come in subsequent changes.
Updates #483
If there's been 5 minutes of inactivity, stop doing STUN lookups. That
means NAT mappings will expire, but they can resume later when there's
activity again.
We'll do this for all platforms later.
Updates tailscale/corp#320
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The magicsock derpReader was holding onto 65KB for each DERP
connection forever, just in case.
Make the derp{,http}.Client be in charge of memory instead. It can
reuse its bufio.Reader buffer space.
This allows tailscaled's own traffic to bypass Tailscale-managed routes,
so that things like tailscale-provided default routes don't break
tailscaled itself.
Progress on #144.
Signed-off-by: David Anderson <danderson@tailscale.com>
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.
And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)
This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.
Also fix some test flakes & races.
Fixes#387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes#399 (wgengine: flaky tests)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This didn't catch anything yet, but it's good practice for detecting
goroutine leaks that we might not find otherwise.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.
This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.
To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Right now, filtering and packet injection in wgengine depend
on a patch to wireguard-go that probably isn't suitable for upstreaming.
This need not be the case: wireguard-go/tun.Device is an interface.
For example, faketun.go implements it to mock a TUN device for testing.
This patch implements the same interface to provide filtering
and packet injection at the tunnel device level,
at which point the wireguard-go patch should no longer be necessary.
This patch has the following performance impact on i7-7500U @ 2.70GHz,
tested in the following namespace configuration:
┌────────────────┐ ┌─────────────────────────────────┐ ┌────────────────┐
│ $ns1 │ │ $ns0 │ │ $ns2 │
│ client0 │ │ tailcontrol, logcatcher │ │ client1 │
│ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ │
│ │vethc│───────┼────┼──│vethrc│ │vethrs│──────┼─────┼──│veths│ │
│ ├─────┴─────┐ │ │ ├──────┴────┐ ├──────┴────┐ │ │ ├─────┴─────┐ │
│ │10.0.0.2/24│ │ │ │10.0.0.1/24│ │10.0.1.1/24│ │ │ │10.0.1.2/24│ │
│ └───────────┘ │ │ └───────────┘ └───────────┘ │ │ └───────────┘ │
└────────────────┘ └─────────────────────────────────┘ └────────────────┘
Before:
---------------------------------------------------
| TCP send | UDP send |
|------------------------|------------------------|
| 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec |
---------------------------------------------------
After:
---------------------------------------------------
| TCP send | UDP send |
|------------------------|------------------------|
| 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec |
---------------------------------------------------
The impact on receive performance is similar.
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
The docs on magicsock.Conn stated that they implemented the
wireguard/device.Bind interface, yet this type does not exist. In
reality, the Conn type implements the wireguard/conn.Bind interface.
I also fixed a small typo in the same file.
Signed-off-by: Blake Gentry <blakesgentry@gmail.com>
* remove endpoint discovery noise when results unchanged
* consistently spell derp nodes as "derp-N"
* replace "127.3.3.40:" with "derp-" in CreateEndpoint log output
* stop early DERP setup before SetPrivateKey is called;
it just generates log nosie
* fix stringification of peer ShortStrings (it had an old %x on it,
rendering it garbage)
* describe why derp routes are changing, with one of:
shared home, their home, our home, alt
Add opt-in method to request IPv6 endpoints from the control plane.
For now they should just be skipped. A previous version of this CL was
unconditional and reportedly had problems that I can't reproduce. So
make it a knob until the mystery is solved.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Breaks something deep in wireguard or magicsock's brainstem, no packets at all
can flow. All received packets fail decryption with "invalid mac1".
This reverts commit 94024355ed.
Signed-off-by: David Anderson <dave@natulte.net>
More steps towards IPv6 transport.
We now send it to tailcontrol, which ignores it.
But it doesn't actually actually support IPv6 yet (outside of STUN).
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Use this when making the ipn state transition from Starting to
Running. This way a network of quiet nodes with no active
handshaking will still transition to Active.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Typically the home DERP server is found and set on startup before
magicsock's SetPrivateKey can be called, so no DERP connection is
established. Make sure one is by kicking the home DERP tires in
SetPrivateKey.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
The code as written intended to do this, but it repeated the
comparison of derpNum and c.myDerp after c.myDerp had been
updated, so it never executed.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Before, endpoint updates were constantly being interrupted and resumed
on Linux due to tons of LinkChange messages from over-zealous Linux
netlink messages (from router_linux.go)
Now that endpoint updates are fast and bounded in time anyway, just
let them run to completion, but note that another needs to be
scheduled after.
Now logs went from pages of noise to just:
root@taildoc:~# grep -i -E 'stun|endpoint update' log
2020/03/13 08:51:29 magicsock.Conn: starting endpoint update (initial)
2020/03/13 08:51:30 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:31 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:31 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:33 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:33 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:35 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:35 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
Or, seen in another run:
2020/03/13 08:45:41 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:09 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:21 magicsock.Conn: starting endpoint update (link-change-major)
2020/03/13 08:46:37 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:47:05 magicsock.Conn: starting endpoint update (periodic)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The TODO above derphttp.NewClient suggests it does network I/O,
but the derphttp client connects lazily and so creating one is
very cheap.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.
Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.
This drops netcheck from ~5 seconds to ~250-500ms.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Failure to do this leads to fd exhaustion at -count=10000,
and increasingly poor execution north of -count=100.
Signed-off-by: David Anderson <danderson@tailscale.com>
Failure to do so triggers either a data race or a panic
in the testing package, due to racey use of t.Logf.
Signed-off-by: David Anderson <danderson@tailscale.com>
Basically, don't trust the OS-level link monitor to only tell you
interesting things. Sanity check it.
Also, move the interfaces package into the net directory now that we
have it.
The UDP reader goroutine was clobbering `n` and `err` from the
main goroutine, whose accesses are not synchronized the way `b` is.
Signed-off-by: David Anderson <danderson@tailscale.com>
wireguard-go closes magicsock, and expects this to unblock reads
so that its internal goroutines can wind down. We were incorrectly
blocking the read indefinitey and breaking this contract.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's extremely flaky in several dimensions, as well as very slow.
It's making the CI completely red all the time without telling us
useful information.
Set RUN_CURSED_TESTS=1 to run locally.
This change just alters the semantics of the one flaky test, without
trying to speed up timeouts on the others. Empirically, speeding up
the timeouts causes _more_ flakes right now :(
The remaining flake occurs due to a mysterious packet loss. This
doesn't affect normal tailscaled operations, so until I track down
where the loss occurs and fix it, the flaky test is going to be
lenient about packet loss (but not about whether the spray logic
worked).
Signed-off-by: David Anderson <danderson@tailscale.com>
It previously passed incorrectly due to bugs. With those fixed,
it becomes flaky for 2 reasons. One of them is the wireguard handshake
race, which can eat the 1st sprayed packet and prevent roamAddr
discovery. This change fixes that failure, by spreading the test
traffic out enough that additional spraying occurs.
Signed-Off-By: David Anderson <danderson@tailscale.com>
The previous code would skip the DERP short-circuit if roamAddr
was set, which is not what we wanted. More generally, hitting
any of the fast path conditions is a direct return, so we can
just have 3 standalone branches rather than 'else if' stuff.
Signed-Off-By: David Anderson <danderson@tailscale.com>
The effect is subtle: when we're not spraying packets, and have not yet
figured out a curAddr, and we're not spraying, we end up sending to
whatever the first IP is in the iteration order. In English, that
means "when we have no idea where to send packets, and we've given
up on sending to everyone, just send to the first addr we see in
the list."
This is, in general, what we want, because the addrs are in sorted
preference order, low to high, and DERP is the least preferred
destination. So, when we have no idea where to send, send to DERP,
right?
... Except for very historical reasons, appendDests iterated through
addresses in _reverse_ order, most preferred to least preferred.
crawshaw@ believes this was part of the earliest handshaking
algorithm magicsock had, where it slowly iterated through possible
destinations and poked handshakes to them one at a time.
Anyway, because of this historical reverse iteration, in the case
described above of "we have no idea where to send", the code would
end up sending to the _most_ preferred candidate address, rather
than the _least_ preferred. So when in doubt, we'd end up firing
packets into the blackhole of some LAN address that doesn't work,
and connectivity would not work.
This case only comes up if all your non-DERP connectivity options
have failed, so we more or less failed to detect it because we
didn't have a pathological test box deployed. Worse, codependent
bug 2839854994 made DERP accidentally
work sometimes anyway by incorrectly exploiting roamAddr behavior,
albeit at the cost of making DERP traffic symmetric. In fixing
DERP to once again be asymmetric, we effectively removed the
bandaid that was concealing this bug.
Signed-Off-By: David Anderson <danderson@tailscale.com>
DERP traffic is asymmetric by design, with nodes always sending
to their peer's home DERP server. However, if roamAddr is set,
magicsock will always push data there, rather than let DERP
server selection do its thing, so we end up accidentally
creating a symmetric flow.
Signed-Off-By: David Anderson <danderson@tailscale.com>
I started to write a full DNS caching resolver and I realized it was
overkill and wouldn't work on Windows even in Go 1.14 yet, so I'm
doing this tiny one instead for now, just for all our netcheck STUN
derp lookups, and connections to DERP servers. (This will be caching a
exactly 8 DNS entries, all ours.)
Fixes#145 (can be better later, of course)
The value predates the introduction of AddrSet which replaces
the index by tracking curAddr directly.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
In particular, this is designed to catch the case where a
HandshakeInitiation packet is sent out but the intermediate NATs
have not been primed, so the packet passes over DERP.
In that case, the HandshakeResponse also comes back over DERP,
and the connection proceeds via DERP without ever trying to punch
through the NAT.
With this change, the HandshakeResponse (which was sprayed out
and so primed one NAT) triggers an UpdateDst, which triggers
the extra spray logic.
(For this to work, there has to be an initial supply of packets
to send on to a peer for the three seconds following a handshake.
The source of these packets is left as a future exercise.)
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>