Updates tailscale/tailscale#6148
This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections, and idle connections remain open for 30 seconds (the interval defined by the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem stays up longer than expected to send ACKs whenever a keep-alive arrives while a connection sits idle during those 30 seconds.
We can set IdleConnTimeout to 10 seconds so that an idle connection is terminated if no further DNS queries come in within 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at any time there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds of the last query.
We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.
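A minimal sketch of the resulting transport settings, assuming Go's net/http and crypto/tls (illustrative, not the actual tailscaled code):
```go
import (
	"crypto/tls"
	"net/http"
	"time"
)

// dohTransport returns an illustrative transport with the limits described
// above: at most one connection, a 10-second idle timeout, and TLS 1.3.
func dohTransport() *http.Transport {
	return &http.Transport{
		MaxConnsPerHost:     1, // only one DoH connection at a time, active or idle
		MaxIdleConnsPerHost: 1,
		IdleConnTimeout:     10 * time.Second, // tear down idle connections quickly
		TLSClientConfig: &tls.Config{
			MinVersion: tls.VersionTLS13, // all supported DoH providers speak TLS 1.3
		},
	}
}
```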
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Updates tailscale/tailscale#13326
Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
mdnsResponder, at least as of macOS Sequoia, does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found while debugging #13511;
once we turned on additional diagnostic reporting from mdnsResponder we
witnessed:
```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```
Responses that include a question section are acceptable, e.g.:
```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```
This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:
```
Penalizing server 100.100.100.100 for 60 seconds
```
It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases: making upstream queries only occasionally costs very little
battery, so we would be better off maintaining fewer special cases and
avoiding bugs of this class.
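For illustration, here is a minimal sketch (using golang.org/x/net/dns/dnsmessage, not the actual resolver code) of building an NXDOMAIN response that echoes the question section, so the counts become 1/0/0/0 rather than 0/0/0/0:
```go
import "golang.org/x/net/dns/dnsmessage"

// nxDomainWithQuestion builds an NXDOMAIN reply that echoes the original
// question, which mdnsResponder requires in order to accept the response.
func nxDomainWithQuestion(reqID uint16, q dnsmessage.Question) ([]byte, error) {
	b := dnsmessage.NewBuilder(nil, dnsmessage.Header{
		ID:                 reqID,
		Response:           true,
		RecursionDesired:   true,
		RecursionAvailable: true,
		RCode:              dnsmessage.RCodeNameError, // NXDOMAIN
	})
	if err := b.StartQuestions(); err != nil {
		return nil, err
	}
	if err := b.Question(q); err != nil { // include the question section
		return nil, err
	}
	return b.Finish()
}
```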
Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511
Signed-off-by: James Tucker <james@tailscale.com>
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon, because the `ImpactsConnectivity=true` flag is set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive many of these notifications, so we need to reduce notification fatigue.
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.
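A sketch of the kind of privacy-preserving detail this adds to the logs; the helper below is illustrative, not the actual forwarder code:
```go
import (
	"crypto/sha256"
	"fmt"
)

// domainSummary describes a query without revealing the domain itself: only
// the name's length and the first 3 bytes of its SHA-256 hash are included.
func domainSummary(qtype, name string) string {
	sum := sha256.Sum256([]byte(name))
	return fmt.Sprintf("%s len=%d hash=%x", qtype, len(name), sum[:3])
}
```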
Fixes #13070
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates tailscale/corp#21823
Misconfigured, broken, or blocked DNS will often present as
"internet is broken" to the end user. This plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.
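A minimal sketch of the raise-after-delay, clear-on-success behavior; the type and method names here are illustrative, not the actual health tracker API:
```go
import (
	"sync"
	"time"
)

// delayedWarning raises a warning only once failures have persisted for
// delay, and clears it immediately on the first success.
type delayedWarning struct {
	mu        sync.Mutex
	delay     time.Duration
	firstFail time.Time
	raised    bool
	raise     func()
	clear     func()
}

func (w *delayedWarning) fail() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.firstFail.IsZero() {
		w.firstFail = time.Now()
		return
	}
	if !w.raised && time.Since(w.firstFail) >= w.delay {
		w.raised = true
		w.raise()
	}
}

func (w *delayedWarning) success() {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.firstFail = time.Time{}
	if w.raised {
		w.raised = false
		w.clear()
	}
}
```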
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Updates tailscale/corp#20677
The recover function wasn't getting set in the benchmark
tests. Default changed to an empty func.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
And some misc doc tweaks for idiomatic Go style.
Updates #cleanup
Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixes tailscale/corp#20677
Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".
On Apple platforms, fetching the interface's nameservers can
and does return an empty list in certain situations. Apple's API
in particular is very limiting here. The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.
To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.
We'll rate limit this to space out the attempts. It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.
Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would also imply that something
out of our control is badly misconfigured.
Tested by randomly returning [] for the nameservers. When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.
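Roughly, the rate limiting works like this (illustrative sketch; these names are not the actual tailscaled code):
```go
import (
	"log"
	"sync"
	"time"
)

var (
	requeryMu   sync.Mutex
	lastRequery time.Time
)

const requeryInterval = 5 * time.Second

// maybeRequeryNameservers is called when a forwarded query finds no upstream
// resolvers; it re-reads the OS DNS configuration at most once per interval.
func maybeRequeryNameservers(refresh func() error) {
	requeryMu.Lock()
	defer requeryMu.Unlock()
	if time.Since(lastRequery) < requeryInterval {
		return
	}
	lastRequery = time.Now()
	if err := refresh(); err != nil {
		log.Printf("dns: re-reading OS nameservers failed: %v", err)
	}
}
```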
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Fixes tailscale/corp#20677
On macOS sleep/wake, we're encountering a condition where we reconfigure the network
a little too quickly, before Apple has set the nameservers for our interface.
This results in a persistent condition where we have no upstream resolver and
fail all forwarded DNS queries.
No upstream nameservers is a legitimate configuration, and we have no (good) way
of determining when Apple is ready - but if we need to forward a query, and we
have no nameservers, then something has gone badly wrong and the network is
very broken.
A simple fix here is to inject a netMon event, which will trigger the
configuration dance again when we hit the SERVFAIL condition.
Tested by artificially/randomly returning [] for the list of nameservers in the bespoke
ipn-bridge code responsible for getting the nameservers.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Now that tsdial.Dialer.UserDial has been updated to honor the configured routes
and dial external network addresses without going through Tailscale, while also being
able to dial a node/subnet router on the tailnet, we can start using UserDial to forward
DNS requests. This is primarily needed for DNS over TCP when forwarding requests
to internal DNS servers, but we also update getKnownDoHClientForProvider to use it.
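For reference, forwarding a query over TCP through a user-provided dial function looks roughly like this (a sketch; the dial parameter stands in for something like UserDial, and the 2-byte length prefix is standard DNS-over-TCP framing per RFC 7766):
```go
import (
	"context"
	"encoding/binary"
	"io"
	"net"
)

func forwardDNSOverTCP(ctx context.Context, dial func(ctx context.Context, network, addr string) (net.Conn, error), server string, query []byte) ([]byte, error) {
	conn, err := dial(ctx, "tcp", server)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	// DNS over TCP prefixes each message with its big-endian length.
	msg := make([]byte, 2+len(query))
	binary.BigEndian.PutUint16(msg, uint16(len(query)))
	copy(msg[2:], query)
	if _, err := conn.Write(msg); err != nil {
		return nil, err
	}

	var lenBuf [2]byte
	if _, err := io.ReadFull(conn, lenBuf[:]); err != nil {
		return nil, err
	}
	resp := make([]byte, binary.BigEndian.Uint16(lenBuf[:]))
	if _, err := io.ReadFull(conn, resp); err != nil {
		return nil, err
	}
	return resp, nil
}
```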
Updates tailscale/corp#18725
Signed-off-by: Nick Khyl <nickk@tailscale.com>
To aid in debugging exactly what's going wrong, instead of the
not-particularly-useful "dns udp query: context deadline exceeded" error
that we currently get.
Updates #3786
Updates #10768
Updates #11620
(etc.)
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76334bf0681a8a2c72c90700f636c4174931432c
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.
Some notable bits:
* tsdial.NewDialer is added, taking a now-required netmon
* because a tsdial.Dialer always has a netmon, anything taking both
a Dialer and a NetMon is now redundant; take only the Dialer and
get the NetMon from that if/when needed.
* netmon.NewStatic is added, primarily for tests
Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299
Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.
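The fix amounts to dropping the pooled connections when the link changes; a minimal sketch (the linkChanged channel is illustrative, in tailscaled this is driven by netmon):
```go
import (
	"context"
	"net/http"
)

// dropConnsOnLinkChange closes the client's idle pooled connections whenever
// a major link change is signaled, so new requests re-dial on the new interface.
func dropConnsOnLinkChange(ctx context.Context, c *http.Client, linkChanged <-chan struct{}) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-linkChanged:
			c.CloseIdleConnections()
		}
	}
}
```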
Updates #10821
Updates tailscale/corp#19124
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
If a client socket is remotely lost but the client is not sent an RST in
response to the next request, the socket might sit in RTO for extended
lengths of time, resulting in "no internet" for users. Instead, timeout
after 10s, which will close the underlying socket, recovering from the
situation more promptly.
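In other words, each forwarded exchange is bounded by a timeout rather than left to TCP retransmission; a sketch, with the forward function standing in for the real send path:
```go
import (
	"context"
	"time"
)

// forwardWithTimeout bounds a forwarded query at 10 seconds so a socket stuck
// in retransmission timeout is closed instead of stalling indefinitely.
func forwardWithTimeout(ctx context.Context, forward func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
	defer cancel()
	return forward(ctx)
}
```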
Updates #10967
Signed-off-by: James Tucker <james@tailscale.com>
To make it easier to correlate the starting/ending log messages.
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2802d53ad98e19bc8914bc58f8c04d4443227b26
Instead of just falling back to making a TCP query to an upstream DNS
server when the UDP query returns a truncated query, also start a TCP
query in parallel with the UDP query after a given race timeout. This
ensures that if the upstream DNS server does not reply over UDP (or if
the response packet is blocked, or there's an error), we can still make
queries if the server replies to TCP queries.
This also adds a new package, util/race, to contain the logic required for
racing two different functions and returning the first non-error answer.
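The core racing pattern looks roughly like this (a sketch of the idea, not the actual util/race API): start the first function immediately, start the second after the race timeout or as soon as the first one fails, and return the first non-error result.
```go
import (
	"context"
	"time"
)

func race[T any](ctx context.Context, timeout time.Duration, first, second func(context.Context) (T, error)) (T, error) {
	type result struct {
		val T
		err error
	}
	results := make(chan result, 2)
	run := func(fn func(context.Context) (T, error)) {
		go func() {
			v, err := fn(ctx)
			results <- result{v, err}
		}()
	}

	run(first)
	secondStarted := false
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	var lastErr error
	for received := 0; received < 2; {
		select {
		case <-timer.C:
			if !secondStarted {
				secondStarted = true
				run(second)
			}
		case r := <-results:
			if r.err == nil {
				return r.val, nil
			}
			lastErr = r.err
			received++
			if !secondStarted {
				// The first attempt failed early; don't wait for the timer.
				secondStarted = true
				run(second)
			}
		case <-ctx.Done():
			var zero T
			return zero, ctx.Err()
		}
	}
	var zero T
	return zero, lastErr
}
```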
Updates tailscale/corp#14809
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4311702016c1093b1beaa31b135da1def6d86316
We weren't correctly retrying truncated requests to an upstream DNS
server with TCP. Instead, we'd return the truncated response to the user,
even if the user was querying us over TCP and thus able to handle a
large response.
Also, add an envknob and controlknob to allow users/us to disable this
behaviour if it turns out to be buggy (✨ DNS ✨).
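A sketch of the check involved, using golang.org/x/net/dns/dnsmessage (the queryUDP/queryTCP functions here are stand-ins, not the actual resolver code):
```go
import (
	"context"

	"golang.org/x/net/dns/dnsmessage"
)

// queryWithTCPRetry retries over TCP when the UDP response has the TC
// (truncated) bit set, so callers who can handle large responses get them.
func queryWithTCPRetry(ctx context.Context, q []byte,
	queryUDP, queryTCP func(context.Context, []byte) ([]byte, error)) ([]byte, error) {
	resp, err := queryUDP(ctx, q)
	if err != nil {
		return nil, err
	}
	var p dnsmessage.Parser
	hdr, err := p.Start(resp)
	if err != nil {
		return nil, err
	}
	if hdr.Truncated {
		return queryTCP(ctx, q)
	}
	return resp, nil
}
```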
Updates #9264
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifb04b563839a9614c0ba03e9c564e8924c1a2bfd
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus generate
more data that we will then try, and fail, to upload).
Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.
Fixes #7850
Updates #7621
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more than just link/interface monitoring).
Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.
Updates #7621
Updates #7850
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
So we're staying within the netip.Addr/AddrPort consistently and
avoiding allocs/conversions to the legacy net addr types.
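For example (illustrative), addresses stay as netip values end to end, converting to a legacy *net.UDPAddr only at a boundary that still requires it:
```go
import (
	"net"
	"net/netip"
)

var quad100 = netip.AddrPortFrom(netip.MustParseAddr("100.100.100.100"), 53)

// legacyAddr converts only at the edge; the allocation happens here, not on
// every hot-path comparison or map lookup.
func legacyAddr() *net.UDPAddr {
	return net.UDPAddrFromAddrPort(quad100)
}
```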
Updates #5162
Change-Id: I59feba60d3de39f773e68292d759766bac98c917
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Output from log.Printf may end up being printed to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns out to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).
However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.
Updates tailscale/corp#9230
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.
Updates tailscale/corp#9230
Updates #3363
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.
Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).
For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.
We also keep track of the current interface so that we can break out
the stats by interface.
Updates tailscale/corp#9230
Updates #3363
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
It was originally added to control memory use on iOS (#2490), but then
was relaxed conditionally when running on iOS 15 (#3098). Now that we
require iOS 15, there's no need for the limit at all, so simplify back
to the original state.
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
This updates all source files to use a new standard header for copyright
and license declaration. Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.
This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.
Updates #6865
Signed-off-by: Will Norris <will@tailscale.com>
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
Reference: https://golang.org/doc/go1.16#ioutil
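The typical replacements, per the Go 1.16 release notes:
```go
// io/ioutil               ->  replacement
// ioutil.ReadAll(r)       ->  io.ReadAll(r)
// ioutil.ReadFile(name)   ->  os.ReadFile(name)
// ioutil.WriteFile(n,d,m) ->  os.WriteFile(n, d, m)
// ioutil.TempDir(d, p)    ->  os.MkdirTemp(d, p)
// ioutil.TempFile(d, p)   ->  os.CreateTemp(d, p)
// ioutil.Discard          ->  io.Discard
```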
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
NextDNS is unique in that users create accounts and then get
user-specific DNS IPs & DoH URLs.
For DoH, the customer ID is in the URL path.
For IPv6, the IP address includes the customer ID in the lower bits.
For IPv4, there's a fragile "IP linking" mechanism to associate your
public IPv4 with an assigned NextDNS IPv4 and that tuple maps to your
customer ID.
We don't use the IP linking mechanism.
Instead, NextDNS is DoH-only. Which means using NextDNS necessarily
shunts all DNS traffic through 100.100.100.100 (programming the OS to
use 100.100.100.100 as the global resolver) because operating systems
can't usually do DoH themselves.
Once it's in Tailscale's DoH client, we then connect out to the known
NextDNS IPv4/IPv6 anycast addresses.
If the control plane sends the client a NextDNS IPv6 address, we then
map it to the corresponding NextDNS DoH with the same client ID, and
we dial that DoH server using the combination of v4/v6 anycast IPs.
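For DoH, the per-customer endpoint is just the profile ID appended to the public NextDNS DoH base URL; the profile ID value below is a made-up example:
```go
// nextDNSDoHURL builds the per-customer DoH endpoint. In practice the
// profile ID comes from the control plane / the NextDNS IPv6 address;
// "abc123" is illustrative.
func nextDNSDoHURL(profileID string) string {
	return "https://dns.nextdns.io/" + profileID
}
```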
Updates #2452
Change-Id: I3439d798d21d5fc9df5a2701839910f5bef85463
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
See https://mullvad.net/en/help/dns-over-https-and-dns-over-tls/
The Mullvad DoH servers appear to only speak HTTP/2 and
the use of a non-nil DialContext in the http.Transport
means that ForceAttemptHTTP2 must be set to true to be
able to use them.
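Concretely, the transport needs something like this (sketch; the custom dialer parameter is illustrative):
```go
import (
	"context"
	"net"
	"net/http"
)

func mullvadDoHTransport(dial func(ctx context.Context, network, addr string) (net.Conn, error)) *http.Transport {
	return &http.Transport{
		DialContext:       dial, // a non-nil custom dialer disables automatic HTTP/2...
		ForceAttemptHTTP2: true, // ...so force it back on for HTTP/2-only servers
	}
}
```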
Signed-off-by: Nahum Shalman <nahamu@gmail.com>
Otherwise we just keep looping over the same thing again and again.
```
dns udp query: upstream nameservers not set
dns udp query: upstream nameservers not set
dns udp query: upstream nameservers not set
```
Signed-off-by: Maisem Ali <maisem@tailscale.com>
And remove the GCP special-casing from ipn/ipnlocal; do it only in the
forwarder for *.internal.
Fixes #4980
Fixes #4981
Change-Id: I5c481e96d91f3d51d274a80fbd37c38f16dfa5cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This does three things:
* If you're on GCP, it adds a *.internal DNS split route to the
metadata server, so we never break GCP DNS names. This lets people
have some Tailscale nodes on GCP and some not (e.g. laptops at home)
without having to add a Tailnet-wide *.internal DNS route.
If you already have such a route, though, it won't overwrite it.
* If the 100.100.100.100 DNS forwarder has nowhere to forward to,
it forwards queries to the GCP metadata IP, which in turn forwards to 8.8.8.8.
This means there are never errNoUpstreams ("upstream nameservers not set")
errors on GCP due to e.g. mangled /etc/resolv.conf (GCP default VMs
don't have systemd-resolved, so it's likely a DNS supremacy fight)
* makes the DNS fallback mechanism use the GCP metadata IP as a
fallback before our hosted HTTP-based fallbacks
I created a default GCP VM from their web wizard. It has no
systemd-resolved.
I then made its /etc/resolv.conf be empty and deleted its GCP
hostnames in /etc/hosts.
I then logged in to a tailnet with no global DNS settings.
With this, tailscaled writes /etc/resolv.conf (direct mode, as no
systemd-resolved) and sets it to 100.100.100.100, which then has
regular DNS via the metadata IP and *.internal DNS via the metadata IP
as well. If the tailnet configures explicit DNS servers, those are used
instead, except for *.internal.
This also adds a new util/cloudenv package based on version/distro
where the cloud type is only detected once. We'll likely expand it in
the future for other clouds, doing variants of this change for other
popular cloud environments.
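A simplified sketch of the forwarding fallback described above (types and names are illustrative, not the actual forwarder code); 169.254.169.254 is the GCP metadata server:
```go
const gcpMetadataDNS = "169.254.169.254:53"

// upstreams returns the configured resolvers, falling back to the GCP
// metadata resolver when none are configured and we're running on GCP,
// so the forwarder never hits errNoUpstreams there.
func upstreams(configured []string, onGCP bool) []string {
	if len(configured) == 0 && onGCP {
		return []string{gcpMetadataDNS}
	}
	return configured
}
```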
Fixes#4911
RELNOTES=Google Cloud DNS improvements
Change-Id: I19f3c2075983669b2b2c0f29a548da8de373c7cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Moves magicDNS-specific handling out of Resolver & into dns.Manager. This
greatly simplifies the Resolver to solely issuing queries and returning
responses, without channels.
Enforcement of max number of in-flight magicDNS queries, assembly of
synthetic UDP datagrams, and integration with wgengine for
receiving/responding to magicDNS traffic are now entirely in Manager.
This path is being kept around, but ultimately aims to be deleted and
replaced with a netstack-based path.
This commit is part of a series to implement magicDNS using netstack.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Updates #2067
This should help us determine if more robust control of edns parameters
+ implementing answer truncation is warranted, given its likely complexity.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Two changes in one:
* make DoH upgrades an explicitly scheduled send earlier, when we come
up with the resolvers-and-delay send plan. Previously we were
getting e.g. four Google DNS IPs and then spreading them out in
time (for back when we only did UDP) but then later we added DoH
upgrading at the UDP packet layer, which resulted in sometimes
multiple DoH queries to the same provider running (each doing happy
eyeballs dialing to 4x IPs themselves) for each of the 4 source IPs.
Instead, take those 4 Google/Cloudflare IPs and schedule 5 things:
first the DoH query (which can use all 4 IPs), and then each of the
4 IPs as UDP later.
* clean up the dnstype.Resolver.Addr confusion; half the code was
using it as an IP string (as documented) and half was using it as
an IP:port (from some prior type we used), primarily for tests.
Instead, document it as being primarily an IP string but also
accepting an IP:port for tests, then add an accessor method on it
to get the IPPort and use that consistently everywhere (see the
sketch below).
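A sketch of such an accessor (illustrative; not the actual dnstype method), accepting either form and defaulting to port 53:
```go
import "net/netip"

// resolverIPPort interprets addr as either "IP" or "IP:port", defaulting the
// port to 53 when it's absent.
func resolverIPPort(addr string) (netip.AddrPort, error) {
	if ap, err := netip.ParseAddrPort(addr); err == nil {
		return ap, nil
	}
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return netip.AddrPort{}, err
	}
	return netip.AddrPortFrom(ip, 53), nil
}
```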
Change-Id: Ifdd72b9e45433a5b9c029194d50db2b9f9217b53
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If all N queries failed, we waited until context timeout (in 5
seconds) to return.
This makes (*forwarder).forward fail fast when the network's
unavailable.
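The shape of the fix, as a sketch (illustrative names): stop waiting once every in-flight upstream query has reported an error, instead of waiting for the context deadline.
```go
import "context"

// awaitFirstSuccess returns nil as soon as any of the n in-flight queries
// succeeds, and fails fast with the last error once all n have failed.
func awaitFirstSuccess(ctx context.Context, n int, results <-chan error) error {
	var lastErr error
	for failed := 0; failed < n; {
		select {
		case err := <-results:
			if err == nil {
				return nil
			}
			failed++
			lastErr = err
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return lastErr
}
```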
Change-Id: Ibbb3efea7ed34acd3f3b29b5fee00ba8c7492569
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>