ipn/ipnlocal, net/dns*, util/cloudenv: specialize DNS config on Google Cloud

This does three things:

* If you're on GCP, it adds a *.internal DNS split route to the
  metadata server, so we never break GCP DNS names. This lets people
  have some Tailscale nodes on GCP and some not (e.g. laptops at home)
  without having to add a Tailnet-wide *.internal DNS route.
  If you already have such a route, though, it won't overwrite it.

* If the 100.100.100.100 DNS forwarder has nowhere to forward to,
  it forwards queries to the GCP metadata IP, which forwards to 8.8.8.8.
  This means there are never errNoUpstreams ("upstream nameservers not set")
  errors on GCP due to e.g. a mangled /etc/resolv.conf (GCP default VMs
  don't have systemd-resolved, so it's likely a DNS supremacy fight).

* It makes the DNS fallback mechanism use the GCP metadata IP as a
  fallback before our hosted HTTP-based fallbacks.
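
As a rough illustration of the second point, here is a minimal, self-contained
sketch of the resolver selection with the GCP fallback. The route type,
contains helper, pickResolvers function, and onGCP parameter are hypothetical
stand-ins for the forwarder's real types, and 169.254.169.254 is assumed to be
the metadata IP that cloudenv.GoogleMetadataAndDNSIP refers to.

```go
package main

import (
	"fmt"
	"strings"
)

// gcpMetadataIP is the well-known GCP metadata and DNS server address
// (assumed here; the real code uses cloudenv.GoogleMetadataAndDNSIP).
const gcpMetadataIP = "169.254.169.254"

// route is a hypothetical, simplified DNS route: a suffix and its resolvers.
// A suffix of "." is the default (top-level) route.
type route struct {
	suffix    string
	resolvers []string
}

// contains reports whether domain is suffix itself or a subdomain of it.
// Both are expected as fully qualified names with a trailing dot.
func contains(suffix, domain string) bool {
	return domain == suffix || strings.HasSuffix(domain, "."+suffix)
}

// pickResolvers returns the resolvers of the first matching route. On GCP it
// falls back to the metadata IP when nothing matched or when the default
// route has no resolvers. An explicit, empty, non-default route is respected.
func pickResolvers(routes []route, domain string, onGCP bool) []string {
	var ret []string
	matched := ""
	for _, r := range routes {
		if r.suffix == "." || contains(r.suffix, domain) {
			ret = r.resolvers
			matched = r.suffix
			break
		}
	}
	if len(ret) == 0 && onGCP && (matched == "" || matched == ".") {
		ret = []string{gcpMetadataIP}
	}
	return ret
}

func main() {
	explicitEmpty := []route{{suffix: "corp.example.", resolvers: nil}}
	fmt.Println(pickResolvers(nil, "example.com.", true))
	fmt.Println(pickResolvers(explicitEmpty, "db.corp.example.", true))
}
```

The explicit-empty-route case is kept separate on purpose: an admin who
configured a suffix with no resolvers presumably meant it.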

To test, I created a default GCP VM from their web wizard. It has no
systemd-resolved.

I then emptied its /etc/resolv.conf and deleted its GCP hostnames
from /etc/hosts.

I then logged in to a tailnet with no global DNS settings.

With this change, tailscaled writes /etc/resolv.conf (in direct mode, as
there's no systemd-resolved) and points it at 100.100.100.100, which then
resolves both regular DNS and *.internal DNS via the metadata IP.
If the tailnet configures explicit DNS servers, those are used
instead, except for *.internal.
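
The *.internal behavior described above can be sketched roughly as follows.
This is only an illustration: addInternalRoute, the map-based route table, and
the onGCP flag are hypothetical names, and 169.254.169.254 is assumed to be
the GCP metadata IP.

```go
package main

import "fmt"

// gcpMetadataIP is the well-known GCP metadata and DNS server address
// (an assumption of this sketch).
const gcpMetadataIP = "169.254.169.254"

// addInternalRoute returns a copy of routes (suffix -> resolvers) with an
// "internal." split route pointing at the metadata IP. It only adds the
// route on GCP, and never overwrites an "internal." route that already exists.
func addInternalRoute(routes map[string][]string, onGCP bool) map[string][]string {
	out := make(map[string][]string, len(routes)+1)
	for suffix, resolvers := range routes {
		out[suffix] = resolvers
	}
	if !onGCP {
		return out
	}
	if _, ok := out["internal."]; !ok {
		out["internal."] = []string{gcpMetadataIP}
	}
	return out
}

func main() {
	// A tailnet with explicit global DNS servers: those stay in place,
	// and only *.internal is routed to the metadata IP.
	tailnet := map[string][]string{".": {"1.1.1.1"}}
	fmt.Println(addInternalRoute(tailnet, true))
}
```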

This also adds a new util/cloudenv package based on version/distro
where the cloud type is only detected once. We'll likely expand it in
the future for other clouds, doing variants of this change for other
popular cloud environments.
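
For readers unfamiliar with the detect-once pattern, here is a minimal sketch
using sync.Once. It is not the actual util/cloudenv code: the detector type
and its probe field are hypothetical, and the real package may detect the
cloud quite differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Cloud names a cloud environment. The empty value means "none detected".
type Cloud string

const (
	None Cloud = ""
	GCP  Cloud = "gcp"
)

// detector caches the result of a (possibly expensive) cloud probe.
// The probe field is a hypothetical hook so the sketch stays testable.
type detector struct {
	once  sync.Once
	probe func() Cloud
	cloud Cloud
}

// Get runs the probe at most once and returns the cached result afterwards.
func (d *detector) Get() Cloud {
	d.once.Do(func() { d.cloud = d.probe() })
	return d.cloud
}

func main() {
	calls := 0
	d := &detector{probe: func() Cloud { calls++; return GCP }}
	first, second := d.Get(), d.Get()
	fmt.Println(first, second, "probe calls:", calls)
}
```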

Fixes #4911

RELNOTES=Google Cloud DNS improvements

Change-Id: I19f3c2075983669b2b2c0f29a548da8de373c7cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Brad Fitzpatrick
2022-06-29 13:19:34 -07:00
committed by Brad Fitzpatrick
parent 6f58497647
commit 88c2afd1e3
9 changed files with 265 additions and 24 deletions

@@ -34,6 +34,7 @@ import (
 	"tailscale.com/net/tsdial"
 	"tailscale.com/types/dnstype"
 	"tailscale.com/types/logger"
+	"tailscale.com/util/cloudenv"
 	"tailscale.com/util/dnsname"
 	"tailscale.com/wgengine/monitor"
 )
@@ -560,17 +561,38 @@ func (f *forwarder) sendUDP(ctx context.Context, fq *forwardQuery, rr resolverAn
 	return out, nil
 }
+// gcpResolverFallback is the fallback resolver for Google Cloud.
+var gcpResolverFallback = []resolverAndDelay{{name: &dnstype.Resolver{Addr: cloudenv.GoogleMetadataAndDNSIP}}}
 // resolvers returns the resolvers to use for domain.
 func (f *forwarder) resolvers(domain dnsname.FQDN) []resolverAndDelay {
 	f.mu.Lock()
 	routes := f.routes
 	f.mu.Unlock()
+	var ret []resolverAndDelay
+	var matchedSuffix dnsname.FQDN
 	for _, route := range routes {
 		if route.Suffix == "." || route.Suffix.Contains(domain) {
-			return route.Resolvers
+			ret = route.Resolvers
+			matchedSuffix = route.Suffix
+			break
 		}
 	}
-	return nil
+	if len(ret) == 0 && cloudenv.Get() == cloudenv.GCP && (matchedSuffix == "" || matchedSuffix == ".") {
+		// If we're running on GCP where there's always a well-known IP of a
+		// recursive resolver, return that rather than having callers return
+		// errNoUpstreams. This fixes both normal 100.100.100.100 resolution
+		// when /etc/resolv.conf is missing/corrupt, and the peerapi ExitDNS
+		// stub resolver lookup.
+		//
+		// But we only do this if no route matched (matchedSuffix == "") or
+		// if we had no resolvers for the top-level route (matchedSuffix == ".").
+		// If they had an explicit empty route that we matched, don't do the auto
+		// fallback in that case.
+		ret = gcpResolverFallback
+	}
+	return ret
 }
 // forwardQuery is information and state about a forwarded DNS query that's