Two changes in one:
* make DoH upgrades an explicitly scheduled send earlier, when we come
up with the resolvers-and-delay send plan. Previously we were
getting e.g. four Google DNS IPs and then spreading them out in
time (for back when we only did UDP) but then later we added DoH
upgrading at the UDP packet layer, which resulted in sometimes
multiple DoH queries to the same provider running (each doing happy
eyeballs dialing to 4x IPs themselves) for each of the 4 source IPs.
Instead, take those 4 Google/Cloudflare IPs and schedule 5 things:
first the DoH query (which can use all 4 IPs), and then each of the
4 IPs as UDP later.
* clean up the dnstype.Resolver.Addr confusion; half the code was
using it as an IP string (as documented) as half was using it as
an IP:port (from some prior type we used), primarily for tests.
Instead, document it was being primarily an IP string but also
accepting an IP:port for tests, then add an accessor method on it
to get the IPPort and use that consistently everywhere.
Change-Id: Ifdd72b9e45433a5b9c029194d50db2b9f9217b53
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If all N queries failed, we waited until context timeout (in 5
seconds) to return.
This makes (*forwarder).forward fail fast when the network's
unavailable.
Change-Id: Ibbb3efea7ed34acd3f3b29b5fee00ba8c7492569
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Simplify the ability to reason about the DoH dialing code by reusing the
dnscache's dialer we already have.
Also, reduce the scope of the "ip" variable we don't want to close over.
This necessarily adds a new field to dnscache.Resolver:
SingleHostStaticResult, for when the caller already knows the IPs to be
returned.
Change-Id: I9f2aef7926f649137a5a3e63eebad6a3fffa48c0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This extracts DOH mapping of known public DNS providers in
forwarder.go into its own package, to be consumed by other repos
Signed-off-by: Jenny Zhang <jz@tailscale.com>
* net/dns, net/dns/resolver, wgengine: refactor DNS request path
Previously, method calls into the DNS manager/resolver types handled DNS
requests rather than DNS packets. This is fine for UDP as one packet
corresponds to one request or response, however will not suit an
implementation that supports DNS over TCP.
To support PRs implementing this in the future, wgengine delegates
all handling/construction of packets to the magic DNS endpoint, to
the DNS types themselves. Handling IP packets at this level enables
future support for both UDP and TCP.
Signed-off-by: Tom DNetto <tom@tailscale.com>
As of Go 1.18, the register ABI list includes arm64, amd64,
ppc64, and ppc64le. This is a large enough percentage of the
architectures that it's not worth explaining.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Avoid some work when D-Bus isn't running.
Change-Id: I6f89bb75fdb24c13f61be9b400610772756db1ef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If systemd-resolved is enabled but not running (or not yet running,
such as early boot) and resolv.conf is old/dangling, we weren't
detecting systemd-resolved.
This moves its ping earlier, which will trigger it to start up and
write its file.
Updates #3362 (likely fixes)
Updates #3531 (likely fixes)
Change-Id: I6392944ac59f600571c43b8f7a677df224f2beed
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.
Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently only search domains are stored. This was an oversight
(under?) on my part.
As things are now, when MagicDNS is on and "Override local DNS" is
off, the dns forwarder has to timeout before names resolve. This
introduces a pretty annoying lang that makes everything feel
extremely slow. You will also see an error: "upstream nameservers
not set".
I tested with "Override local DNS" on and off. In both situations
things seem to function as expected (and quickly).
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
On Synology, the /etc/resolv.conf has tabs in it, which this
resolv.conf parser (we have two, sigh) didn't handle.
Updates #3710
Change-Id: I86f8e09ad1867ee32fa211e85c382a27191418ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Tailscale seems to be breaking WSL configurations lately. Until we
understand what changed, turn off Tailscale's involvement by default
and make it opt-in.
Updates #2815
Change-Id: I9977801f8debec7d489d97761f74000a4a33f71b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
OpenBSD 6.9 and up has a daemon which handles nameserver configuration. This PR
teaches the OpenBSD dns manager to check if resolvd is being used. If it is, it
will use the route(8) command to tell resolvd to add the Tailscale dns entries
to resolv.conf
Signed-off-by: Aaron Bieber <aaron@bolddaemon.com>
Fixes#3660
RELNOTE=MagicDNS now works over IPv6 when CGNAT IPv4 is disabled.
Change-Id: I001e983df5feeb65289abe5012dedd177b841b45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And delete the unused code in net/dns/resolver/neterr_*.go.
Change-Id: Ibe62c486bacce2733eb9968c96a98cbbdb2758bd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Don't just ignore them. See if this makes them calm down.
Updates #3363
Change-Id: Id1d66308e26660d26719b2538b577522a1e36b63
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To convince me it's not as alloc-y as it looks.
Change-Id: I503a0cc267268a23d2973dfde9833c420be4e868
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And it updates the build tag style on a couple files.
Change-Id: I84478d822c8de3f84b56fa1176c99d2ea5083237
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not done yet, but this move more of the outbound dial special casing
from random packages into tsdial, which aspires to be the one unified
place for all outbound dialing shenanigans.
Then this plumbs it all around, so everybody is ultimately
holding on to the same dialer.
As of this commit, macOS/iOS using an exit node should be able to
reach to the exit node's DoH DNS proxy over peerapi, doing the sockopt
to stay within the Network Extension.
A number of steps remain, including but limited to:
* move a bunch more random dialing stuff
* make netstack-mode tailscaled be able to use exit node's DNS proxy,
teaching tsdial's resolver to use it when an exit node is in use.
Updates #1713
Change-Id: I1e8ee378f125421c2b816f47bc2c6d913ddcd2f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently, comments in resolv.conf cause our parser to fail,
with error messages like:
ParseIP("192.168.0.100 # comment"): unexpected character (at " # comment")
Fix that.
Noticed while looking through logs.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Lets the systemd-resolved OSConfigurator report health changes
for out of band config resyncs.
Updates #3327
Signed-off-by: David Anderson <danderson@tailscale.com>
Don't set all the *.arpa. reverse DNS lookup domains if systemd-resolved
is old and can't handle them.
Fixes#3188
Change-Id: I283f8ce174daa8f0a972ac7bfafb6ff393dde41d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which dynamically calculate the allocations
target based on a different AllocsPerRun run.
This also allows us to tighten the "no allocs"
test in wgengine/filter.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
There are /etc/resolv.conf files out there where resolvconf wrote
the file but pointed to systemd-resolved as the nameserver.
We're better off handling those as systemd-resolved.
> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
> # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
> # 127.0.0.53 is the systemd-resolved stub resolver.
> # run "systemd-resolve --status" to see details about the actual nameservers.
Fixes https://github.com/tailscale/tailscale/issues/3026
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
In some containers, /etc/resolv.conf is a bind-mount from outside the container.
This prevents renaming to or from /etc/resolv.conf, because it's on a different
filesystem from linux's perspective. It also prevents removing /etc/resolv.conf,
because doing so would break the bind-mount.
If we find ourselves within this environment, fall back to using copy+delete when
renaming to /etc/resolv.conf, and copy+truncate when renaming from /etc/resolv.conf.
Fixes#3000
Co-authored-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: David Anderson <danderson@tailscale.com>
When a DNS server claims to be unable or unwilling to handle a request,
instead of passing that refusal along to the client, just treat it as
any other error trying to connect to the DNS server. This prevents DNS
requests from failing based on if a server can respond with a transient
error before another server is able to give an actual response. DNS
requests only failing *sometimes* is really hard to find the cause of
(#1033).
Signed-off-by: Smitty <me@smitop.com>
We added the initial handling only for macOS and iOS.
With 1.16.0 now released, suppress forwarding DNS-SD
on all platforms to test it through the 1.17.x cycle.
Updates #2442
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
DNSSEC is an availability issue, as recently demonstrated by the
Slack issue, with limited security advantage. DoH on the other hand
is a critical security upgrade. This change adds DoH support for the
non-DNSSEC endpoints of Quad9.
https://www.quad9.net/service/service-addresses-and-features#unsec
Signed-off-by: Filippo Valsorda <hi@filippo.io>
Windows has a public dns.Flush used in router_windows.go.
However that won't work for platforms like Linux, where
we need a different flush mechanism for resolved versus
other implementations.
We're instead adding a FlushCaches method to the dns Manager,
which can be made to work on all platforms as needed.
Fixes https://github.com/tailscale/tailscale/issues/2132
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
We currently plumb full URLs for DNS resolvers from the control server
down to the client. But when we pass the values into the net/dns
package, we throw away any URL that isn't a bare IP. This commit
continues the plumbing, and gets the URL all the way to the built in
forwarder. (It stops before plumbing URLs into the OS configurations
that can handle them.)
For #2596
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Reported on IRC: in an edge case, you can end up with a directManager DNS
manager and --accept-dns=false, in which case we should do nothing, but
actually end up restarting resolved whenever the netmap changes, even though
the user told us to not manage DNS.
Signed-off-by: David Anderson <danderson@tailscale.com>
Reported on IRC: a resolv.conf that contained two entries for
"nameserver 127.0.0.53", which defeated our "is resolved actually
in charge" check. Relax that check to allow any number of nameservers,
as long as they're all 127.0.0.53.
Signed-off-by: David Anderson <danderson@tailscale.com>
Now that we have the easier-to-parse go:build build tags,
it is straightforward to simplify them. Yay.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Previously, we hashed the question and combined it with the original
txid which was useful when concurrent queries were multiplexed on a
single local source port. We encountered some situations where the DNS
server canonicalizes the question in the response (uppercase converted
to lowercase in this case), which resulted in responses that we couldn't
match to the original request due to hash mismatches. This includes a
new test to cover that situation.
Fixes#2597
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Go 1.17 switches to a register ABI on amd64 platforms.
Part of that switch is that go and defer calls use an argument-less
closure, which allocates. This means that we have an extra
alloc in some DNS work. That's unfortunate but not a showstopper,
and I don't see a clear path to fixing it.
The other performance benefits from the register ABI will all
but certainly outweigh this extra alloc.
Fixes#2545
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
I don't know how to get access to a real packet. Basing this commit
entirely off:
+------------+--------------+------------------------------+
| Field Name | Field Type | Description |
+------------+--------------+------------------------------+
| NAME | domain name | MUST be 0 (root domain) |
| TYPE | u_int16_t | OPT (41) |
| CLASS | u_int16_t | requestor's UDP payload size |
| TTL | u_int32_t | extended RCODE and flags |
| RDLEN | u_int16_t | length of all RDATA |
| RDATA | octet stream | {attribute,value} pairs |
+------------+--------------+------------------------------+
From https://datatracker.ietf.org/doc/html/rfc6891#section-6.1.2
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Instead of blasting away at all upstream resolvers at the same time,
make a timing plan upon reconfiguration and have each upstream have an
associated start delay, depending on the overall forwarding config.
So now if you have two or four upstream Google or Cloudflare DNS
servers (e.g. two IPv4 and two IPv6), we now usually only send a
query, not four.
This is especially nice on iOS where we start fewer DoH queries and
thus fewer HTTP/1 requests (because we still disable HTTP/2 on iOS),
fewer sockets, fewer goroutines, and fewer associated HTTP buffers,
etc, saving overall memory burstiness.
Fixes#2436
Updates tailscale/corp#2250
Updates tailscale/corp#2238
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a place to hang state in a future change for #2436.
For now this just simplifies the send signature without
any functional change.
Updates #2436
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was a huge chunk of the overall log output and made debugging
difficult. Omit and summarize the spammy *.arpa parts instead.
Fixestailscale/corp#2066 (to which nobody had opinions, so)
Recognize Cloudflare, Google, Quad9 which are by far the
majority of upstream DNS servers that people use.
RELNOTE=MagicDNS now uses DNS-over-HTTPS when querying popular upstream resolvers,
so DNS queries aren't sent in the clear over the Internet.
Updates #915 (might fix it?)
Updates #988 (gets us closer, if it fixes Android)
Updates #74 (not yet configurable, but progress)
Updates #2056 (not yet configurable, dup of #74?)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We also have to make a one-off change to /etc/wsl.conf to stop every
invocation of wsl.exe clobbering the /etc/resolv.conf. This appears to
be a safe change to make permanently, as even though the resolv.conf is
constantly clobbered, it is always the same stable internal IP that is
set as a nameserver. (I believe the resolv.conf clobbering predates the
MS stub resolver.)
Tested on WSL2, should work for WSL1 too.
Fixes#775
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This is preliminary work for using the directManager as
part of a wslManager on windows, where in addition to configuring
windows we'll use wsl.exe to edit the linux file system and modify the
system resolv.conf.
The pinholeFS is a little funky, but it's designed to work through
simple unix tools via wsl.exe without invoking bash. I would not have
thought it would stand on its own like this, but it turns out it's
useful for writing a test for the directManager.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This has been bothering me for a while, but everytime I run format from the root directory
it also formats this file. I didn't want to add it to my other PRs but it's annoying to have to
revert it every time.
Signed-off-by: julianknodt <julianknodt@gmail.com>
This change (subject to some limitations) looks for the EDNS OPT record
in queries and responses, clamping the size field to fit within our DNS
receive buffer. If the size field is smaller than the DNS receive buffer
then it is left unchanged.
I think we will eventually need to transition to fully processing the
DNS queries to handle all situations, but this should cover the most
common case.
Mostly fixes#2066
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Windows 8.1 incorrectly handles search paths on an interface with no
associated resolver, so we have to provide a full primary DNS config
rather than use Windows 8.1's nascent-but-present NRPT functionality.
Fixes#2237.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's possible to install a configuration that passes our current checks
for systemd-resolved, without actually pointing to systemd-resolved. In
that case, we end up programming DNS in resolved, but that config never
applies to any name resolution requests on the system.
This is quite a far-out edge case, but there's a simple additional check
we can do: if the header comment names systemd-resolved, there should be
a single nameserver in resolv.conf pointing to 127.0.0.53. If not, the
configuration should be treated as an unmanaged resolv.conf.
Fixes#2136.
Signed-off-by: David Anderson <danderson@tailscale.com>
This raises the maximum DNS response message size from 512 to 4095. This
should be large enough for almost all situations that do not need TCP.
We still do not recognize EDNS, so we will still forward requests that
claim support for a larger response size than 4095 (that will be solved
later). For now, when a response comes back that is too large to fit in
our receive buffer, we now set the truncation flag in the DNS header,
which is an improvement from before but will prompt attempts to use TCP
which isn't supported yet.
On Windows, WSARecvFrom into a buffer that's too small returns an error
in addition to the data. On other OSes, the extra data is silently
discarded. In this case, we prefer the latter so need to catch the error
on Windows.
Partially addresses #1123
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
This leads to a cleaner separation of intent vs. implementation
(Routes is now the only place specifying who handles DNS requests),
and allows for cleaner expression of a configuration that creates
MagicDNS records without serving them to the OS.
Signed-off-by: David Anderson <danderson@tailscale.com>
interfaces.Tailscale only returns an interface if it has at least one Tailscale
IP assigned to it. In the resolved DNS manager, when we're called upon to tear
down DNS config, the interface no longer has IPs.
Instead, look up the interface index on construction and reuse it throughout
the daemon lifecycle.
Fixes#1892.
Signed-off-by: David Anderson <dave@natulte.net>
This reverts commit 7d16c8228b.
I have no idea how I ended up here. The bug I was fixing with this change
fails to reproduce on Ubuntu 18.04 now, and this change definitely does
break 20.04, 20.10, and Debian Buster. So, until we can reliably reproduce
the problem this was meant to fix, reverting.
Part of #1875
Signed-off-by: David Anderson <dave@natulte.net>
NetworkManager fixed the bug that forced us to use NetworkManager
if it's programming systemd-resolved, and in the same release also
made NetworkManager ignore DNS settings provided for unmanaged
interfaces... Which breaks what we used to do. So, with versions
1.26.6 and above, we MUST NOT use NetworkManager to indirectly
program systemd-resolved, but thankfully we can talk to resolved
directly and get the right outcome.
Fixes#1788
Signed-off-by: David Anderson <danderson@tailscale.com>