Moves magicDNS-specific handling out of Resolver & into dns.Manager. This
greatly simplifies the Resolver to solely issuing queries and returning
responses, without channels.
Enforcement of max number of in-flight magicDNS queries, assembly of
synthetic UDP datagrams, and integration with wgengine for
recieving/responding to magicDNS traffic is now entirely in Manager.
This path is being kept around, but ultimately aims to be deleted and
replaced with a netstack-based path.
This commit is part of a series to implement magicDNS using netstack.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Well, goimports actually (which adds the normal import grouping order we do)
Change-Id: I0ce1b1c03185f3741aad67c14a7ec91a838de389
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This updates the fix from #4562 to pick the proxy based on the request
scheme.
Updates #4395, #2605, #4562
Signed-off-by: James Tucker <james@tailscale.com>
Currently we try to use `https://` when we see `https_host`, however
that doesn't work and results in errors like `Received error: fetch
control key: Get "https://controlplane.tailscale.com/key?v=32":
proxyconnect tcp: tls: first record does not look like a TLS handshake`
This indiciates that we are trying to do a HTTPS request to a HTTP
server. Googling suggests that the standard is to use `http` regardless
of `https` or `http` proxy
Updates #4395, #2605
Signed-off-by: Maisem Ali <maisem@tailscale.com>
The connections returned from SystemDial are automatically closed when
there is a major link change.
Also plumb through the dialer to the noise client so that connections
are auto-reset when moving from cellular to WiFi etc.
Updates #3363
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Updates #2067
This should help us determine if more robust control of edns parameters
+ implementing answer truncation is warranted, given its likely complexity.
Signed-off-by: Tom DNetto <tom@tailscale.com>
This populates DNS suffixes ("ts.net", etc) in /etc/resolver/* files
to point to 100.100.100.100 so MagicDNS works.
It also sets search domains.
Updates #4276
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
$ tailscale debug via 0xb 10.2.0.0/16
fd7a:115c:a1e0:b1a:0🅱️a02:0/112
$ tailscale debug via fd7a:115c:a1e0:b1a:0🅱️a02:0/112
site 11 (0xb), 10.2.0.0/16
Previously: 3ae701f0eb
This adds a little debug tool to do CIDR math to make converting between
those ranges easier for now.
Updates #3616
Change-Id: I98302e95d17765bfaced3ecbb71cbd43e84bff46
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In cases where tailscale is operating behind a MITM proxy, we need to consider
that a lot more of the internals of our HTTP requests are visible and may be
used as part of authorization checks. As such, we need to 'behave' as closely
as possible to ideal.
- Some proxies do authorization or consistency checks based the on Host header
or HTTP URI, instead of just the IP/hostname/SNI. As such, we need to
construct a `*http.Request` with a valid URI everytime HTTP is going to be
used on the wire, even if its over TLS.
Aside from the singular instance in net/netcheck, I couldn't find anywhere
else a http.Request was constructed incorrectly.
- Some proxies may deny requests, typically by returning a 403 status code. We
should not consider these requests as a valid latency check, so netcheck
semantics have been updated to consider >299 status codes as a failed probe.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Two changes in one:
* make DoH upgrades an explicitly scheduled send earlier, when we come
up with the resolvers-and-delay send plan. Previously we were
getting e.g. four Google DNS IPs and then spreading them out in
time (for back when we only did UDP) but then later we added DoH
upgrading at the UDP packet layer, which resulted in sometimes
multiple DoH queries to the same provider running (each doing happy
eyeballs dialing to 4x IPs themselves) for each of the 4 source IPs.
Instead, take those 4 Google/Cloudflare IPs and schedule 5 things:
first the DoH query (which can use all 4 IPs), and then each of the
4 IPs as UDP later.
* clean up the dnstype.Resolver.Addr confusion; half the code was
using it as an IP string (as documented) as half was using it as
an IP:port (from some prior type we used), primarily for tests.
Instead, document it was being primarily an IP string but also
accepting an IP:port for tests, then add an accessor method on it
to get the IPPort and use that consistently everywhere.
Change-Id: Ifdd72b9e45433a5b9c029194d50db2b9f9217b53
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If all N queries failed, we waited until context timeout (in 5
seconds) to return.
This makes (*forwarder).forward fail fast when the network's
unavailable.
Change-Id: Ibbb3efea7ed34acd3f3b29b5fee00ba8c7492569
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Simplify the ability to reason about the DoH dialing code by reusing the
dnscache's dialer we already have.
Also, reduce the scope of the "ip" variable we don't want to close over.
This necessarily adds a new field to dnscache.Resolver:
SingleHostStaticResult, for when the caller already knows the IPs to be
returned.
Change-Id: I9f2aef7926f649137a5a3e63eebad6a3fffa48c0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This extracts DOH mapping of known public DNS providers in
forwarder.go into its own package, to be consumed by other repos
Signed-off-by: Jenny Zhang <jz@tailscale.com>
This defines a new magic IPv6 prefix, fd7a:115c:a1e0:b1a::/64, a
subset of our existing /48, where the final 32 bits are an IPv4
address, and the middle 32 bits are a user-chosen "site ID". (which
must currently be 0000:00xx; the top 3 bytes must be zero for now)
e.g., I can say my home LAN's "site ID" is "0000:00bb" and then
advertise its 10.2.0.0/16 IPv4 range via IPv6, like:
tailscale up --advertise-routes=fd7a:115c:a1e0:b1a::bb:10.2.0.0/112
(112 being /128 minuse the /96 v6 prefix length)
Then people in my tailnet can:
$ curl '[fd7a:115c:a1e0:b1a::bb:10.2.0.230]'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ....
Updates #3616, etc
RELNOTE=initial support for TS IPv6 addresses to route v4 "via" specific nodes
Change-Id: I9b49b6ad10410a24b5866b9fbc69d3cae1f600ef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* net/dns, net/dns/resolver, wgengine: refactor DNS request path
Previously, method calls into the DNS manager/resolver types handled DNS
requests rather than DNS packets. This is fine for UDP as one packet
corresponds to one request or response, however will not suit an
implementation that supports DNS over TCP.
To support PRs implementing this in the future, wgengine delegates
all handling/construction of packets to the magic DNS endpoint, to
the DNS types themselves. Handling IP packets at this level enables
future support for both UDP and TCP.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Combine the code between `LocalBackend.CheckIPForwarding` and
`controlclient.ipForwardingBroken`.
Fixes#4300
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Currently if the passed in host is an IP, Lookup still attempts to
resolve it with a dns server. This makes it just return the IP directly.
Updates tailscale/corp#4475
Signed-off-by: Maisem Ali <maisem@tailscale.com>
When the context is canceled, dc.dialOne returns an error from line 345.
This causes the defer on line 312 to try to resolve the host again, which
triggers a dns lookup of "127.0.0.1" from derp.
Updates tailscale/corp#4475
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Plumb the outbound injection path to allow passing netstack
PacketBuffers down to the tun Read, where they are decref'd to enable
buffer re-use. This removes one packet alloc & copy, and reduces GC
pressure by pooling outbound injected packets.
Fixes#2741
Signed-off-by: James Tucker <james@tailscale.com>
The best flag to use on Win7 and Win8.0 is deprecated in Win8.1, so we resolve
the flag depending on OS version info.
Fixes https://github.com/tailscale/tailscale/issues/4201
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Incidentally, simplify the go generate CI workflow, by
marking the dnsfallback update non-hermetic (so CI will
skip it) rather than manually filter it out of `go list`.
Updates #4194
Signed-off-by: David Anderson <danderson@tailscale.com>
As of Go 1.18, the register ABI list includes arm64, amd64,
ppc64, and ppc64le. This is a large enough percentage of the
architectures that it's not worth explaining.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Customer reported an issue where the connections were not closing, and
would instead just stay open. This commit makes it so that we close out
the connection regardless of what error we see. I've verified locally
that it fixes the issue, we should add a test for this.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
If it's in a non-standard table, as it is on Unifi UDM Pro, apparently.
Updates #4038 (probably fixes, but don't have hardware to verify)
Change-Id: I2cb9a098d8bb07d1a97a6045b686aca31763a937
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I introduced a bug in 8fe503057d when unifying oneConnListener
implementations.
The NewOneConnListenerFrom API was easy to misuse (its Close method
closes the underlying Listener), and we did (via http.Serve, which
closes the listener after use, which meant we were close the peerapi's
listener, even though we only wanted its Addr)
Instead, combine those two constructors into one and pass in the Addr
explicitly, without delegating through to any Listener.
Change-Id: I061d7e5f842e0cada416e7b2dd62100d4f987125
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we've already connected to a certain name's IP in the past, don't
assume the problem was DNS related. That just puts unnecessarily load
on our bootstrap DNS servers during regular restarts of Tailscale
infrastructure components.
Also, if we do do a bootstrap DNS lookup and it gives the same IP(s)
that we already tried, don't try them again.
Change-Id: I743e8991a7f957381b8e4c1508b8e9d0df1782fe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No behavior changes (intended, at least).
This is in prep for future changes to this package, which would get
too complicated in the current style.
Change-Id: Ic260f8e34ae2f64f34819d4a56e38bee8d8ac5ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This TODO was both added and fixed in 506c727e3.
As I recall, I wasn't originally going to do it because it seemed
annoying, so I wrote the TODO, but then I felt bad about it and just
did it, but forgot to remove the TODO.
Change-Id: I8f3514809ad69b447c62bfeb0a703678c1aec9a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I was about to add a third copy, so unify them now instead.
Change-Id: I3b93896aa1249b1250a6b1df4829d57717f2311a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Avoid some work when D-Bus isn't running.
Change-Id: I6f89bb75fdb24c13f61be9b400610772756db1ef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If systemd-resolved is enabled but not running (or not yet running,
such as early boot) and resolv.conf is old/dangling, we weren't
detecting systemd-resolved.
This moves its ping earlier, which will trigger it to start up and
write its file.
Updates #3362 (likely fixes)
Updates #3531 (likely fixes)
Change-Id: I6392944ac59f600571c43b8f7a677df224f2beed
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tailscaled was using 100% CPU on a machine with ~1M lines, 100MB+
of /proc/net/route data.
Two problems: in likelyHomeRouterIPLinux, we didn't stop reading the
file once we found the default route (which is on the first non-header
line when present). Which meant it was finding the answer and then
parsing 100MB over 1M lines unnecessarily. Second was that if the
default route isn't present, it'd read to the end of the file looking
for it. If it's not in the first 1,000 lines, it ain't coming, or at
least isn't worth having. (it's only used for discovering a potential
UPnP/PMP/PCP server, which is very unlikely to be present in the
environment of a machine with a ton of routes)
Change-Id: I2c4a291ab7f26aedc13885d79237b8f05c2fd8e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now that Go 1.17 has module graph pruning
(https://go.dev/doc/go1.17#go-command), we should be able to use
upstream netstack without breaking our private repo's build
that then depends on the tailscale.com Go module.
This is that experiment.
Updates #1518 (the original bug to break out netstack to own module)
Updates #2642 (this updates netstack, but doesn't remove workaround)
Change-Id: I27a252c74a517053462e5250db09f379de8ac8ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.
Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently only search domains are stored. This was an oversight
(under?) on my part.
As things are now, when MagicDNS is on and "Override local DNS" is
off, the dns forwarder has to timeout before names resolve. This
introduces a pretty annoying lang that makes everything feel
extremely slow. You will also see an error: "upstream nameservers
not set".
I tested with "Override local DNS" on and off. In both situations
things seem to function as expected (and quickly).
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
Go 1.17 added a HandshakeContext func to take care of timeouts during
TLS handshaking, so switch from our homegrown goroutine implementation
to the standard way.
Signed-off-by: David Anderson <danderson@tailscale.com>
Cancelling the context makes the timeout goroutine race with the write that
reports a successful TLS handshake, so you can end up with a successful TLS
handshake that mysteriously reports that it timed out after ~0s in flight.
The context is always canceled and cleaned up as the function exits, which
happens mere microseconds later, so just let function exit clean up and
thereby avoid races.
Signed-off-by: David Anderson <danderson@tailscale.com>
On Synology, the /etc/resolv.conf has tabs in it, which this
resolv.conf parser (we have two, sigh) didn't handle.
Updates #3710
Change-Id: I86f8e09ad1867ee32fa211e85c382a27191418ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Tailscale seems to be breaking WSL configurations lately. Until we
understand what changed, turn off Tailscale's involvement by default
and make it opt-in.
Updates #2815
Change-Id: I9977801f8debec7d489d97761f74000a4a33f71b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
OpenBSD 6.9 and up has a daemon which handles nameserver configuration. This PR
teaches the OpenBSD dns manager to check if resolvd is being used. If it is, it
will use the route(8) command to tell resolvd to add the Tailscale dns entries
to resolv.conf
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
Fixes#3660
RELNOTE=MagicDNS now works over IPv6 when CGNAT IPv4 is disabled.
Change-Id: I001e983df5feeb65289abe5012dedd177b841b45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And delete the unused code in net/dns/resolver/neterr_*.go.
Change-Id: Ibe62c486bacce2733eb9968c96a98cbbdb2758bd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Treat UDP send EPERM errors as a lost UDP packet, not something super
fatal. That's just the Linux firewall preventing it from going out.
And add a leaf package net/neterror for that (and future) policy that
all three packages can share, with tests.
Updates #3619
Change-Id: Ibdb838c43ee9efe70f4f25f7fc7fdf4607ba9c1d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Only if the source address isn't on the currently active interface or
a ping of the DERP server fails.
Updates #3619
Change-Id: I6bf06503cff4d781f518b437c8744ac29577acc8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was pretty ill-defined before and mostly for logging. But I wanted
to start depending on it, so define what it is and make Windows match
the other operating systems, without losing the log output we had
before. (and add tests for that)
Change-Id: I0fbbba1cfc67a265d09dd6cb738b73f0f6005247
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Don't just ignore them. See if this makes them calm down.
Updates #3363
Change-Id: Id1d66308e26660d26719b2538b577522a1e36b63
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To convince me it's not as alloc-y as it looks.
Change-Id: I503a0cc267268a23d2973dfde9833c420be4e868
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And it updates the build tag style on a couple files.
Change-Id: I84478d822c8de3f84b56fa1176c99d2ea5083237
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is enough to handle the DNS queries as generated by Go's
net package (which our HTTP/SOCKS client uses), and the responses
generated by the ExitDNS DoH server.
This isn't yet suitable for putting on 100.100.100.100 where a number
of different DNS clients would hit it, as this doesn't yet do
EDNS0. It might work, but it's untested and likely incomplete.
Likewise, this doesn't handle anything about truncation, as the
exchanges are entirely in memory between Go or DoH. That would also
need to be handled later, if/when it's hooked up to 100.100.100.100.
Updates #3507
Change-Id: I1736b0ad31eea85ea853b310c52c5e6bf65c6e2a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It will be used for ICMPv6 next, so pass in the proto.
Also, use the ipproto constants rather than hardcoding the mysterious
number.
Change-Id: I57b68bdd2d39fff75f82affe955aff9245de246b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And simplify, unexport some tsdial/netstack stuff in the the process.
Fixes#3475
Change-Id: I186a5a5cbd8958e25c075b4676f7f6e70f3ff76e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This starts to refactor tsdial.Dialer's name resolution to have
different stages: in-memory MagicDNS vs system resolution. A future
change will plug in ExitDNS resolution.
This also plumbs a Dialer into netstack and unexports the dnsMap
internals.
And it removes some of the async AddNetworkMapCallback usage and
replaces it with synchronous updates of the Dialer's netmap
from LocalBackend, since the LocalBackend has the Dialer too.
Updates #3475
Change-Id: Idcb7b1169878c74f0522f5151031ccbc49fe4cb4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
With this, I'm able to send a Taildrop file (using "tailscale file cp")
from a Linux machine running --tun=userspace-networking.
Updates #2179
Change-Id: I4e7a4fb0fbda393e4fb483adb06b74054a02cfd0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for moving stuff out of LocalBackend.
Change-Id: I9725aa9c3ebc7275f8c40e040b326483c0340127
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not done yet, but this move more of the outbound dial special casing
from random packages into tsdial, which aspires to be the one unified
place for all outbound dialing shenanigans.
Then this plumbs it all around, so everybody is ultimately
holding on to the same dialer.
As of this commit, macOS/iOS using an exit node should be able to
reach to the exit node's DoH DNS proxy over peerapi, doing the sockopt
to stay within the Network Extension.
A number of steps remain, including but limited to:
* move a bunch more random dialing stuff
* make netstack-mode tailscaled be able to use exit node's DNS proxy,
teaching tsdial's resolver to use it when an exit node is in use.
Updates #1713
Change-Id: I1e8ee378f125421c2b816f47bc2c6d913ddcd2f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For now this just deletes the net/socks5/tssocks implementation (and
the DNSMap stuff from wgengine/netstack) and moves it into net/tsdial.
Then initialize a Dialer early in tailscaled, currently only use for the
outbound and SOCKS5 proxies. It will be plumbed more later. Notably, it
needs to get down into the DNS forwarder for exit node DNS forwading
in netstack mode. But it will also absorb all the peerapi setsockopt
and netns Dial and tlsdial complexity too.
Updates #1713
Change-Id: Ibc6d56ae21a22655b2fa1002d8fc3f2b2ae8b6df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The block-write and block-read tests are both flaky,
because each assumes it can get a normal read/write
completed within 10ms. This isn’t always true.
We can’t increase the timeouts, because that slows down the test.
However, we don’t need to issue a regular read/write for this test.
The immediately preceding tests already test this code,
using a far more generous timeout.
Remove the extraneous read/write.
This drops the failure rate from 1 per 20,000 to undetectable
on my machine.
While we’re here, fix a typo in a debug print statement.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Without the continue, we might overwrite our current meta
with a zero meta.
Log the error, so that we can check for anything unexpected.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>