mirror of
https://github.com/tailscale/tailscale.git
synced 2025-01-08 09:07:44 +00:00
net/dns: re-query system resolvers on no-upstream resolver failure on apple platforms (#12398)
Fixes tailscale/corp#20677

On macOS sleep/wake, we're encountering a condition where we reconfigure the network a little bit too quickly, before Apple has set the nameservers for our interface. This results in a persistent condition where we have no upstream resolver and fail all forwarded DNS queries.

No upstream nameservers is a legitimate configuration, and we have no (good) way of determining when Apple is ready. But if we need to forward a query and we have no nameservers, then something has gone badly wrong and the network is very broken.

A simple fix here is to inject a netMon event, which will go through the configuration dance again when we hit the SERVFAIL condition.

Tested by artificially/randomly returning [] for the list of nameservers in the bespoke ipn-bridge code responsible for getting the nameservers.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
parent d0f1a838a6
commit 02e3c046aa
@@ -14,6 +14,7 @@
 	"net/http"
 	"net/netip"
 	"net/url"
+	"runtime"
 	"sort"
 	"strings"
 	"sync"
@@ -881,6 +882,24 @@ func (f *forwarder) forwardWithDestChan(ctx context.Context, query packet, respo
 	if len(resolvers) == 0 {
 		metricDNSFwdErrorNoUpstream.Add(1)
 		f.logf("no upstream resolvers set, returning SERVFAIL")
+
+		if runtime.GOOS == "darwin" || runtime.GOOS == "ios" {
+			// On apple, having no upstream resolvers here is the result of a race condition where
+			// we've tried a reconfig after a major link change but the system has not yet set
+			// the resolvers for the new link. We use SystemConfiguration to query nameservers, and
+			// the timing of when that will give us the "right" answer is non-deterministic.
+			//
+			// This will typically happen on sleep-wake cycles with a Wifi interface where
+			// it takes some random amount of time (after telling us that the interface exists)
+			// for the system to configure the dns servers.
+			//
+			// Repolling the network monitor here is a bit odd, but if we're
+			// seeing DNS queries, it's likely that the network is now fully configured, and it's
+			// an ideal time to requery for the nameservers.
+			f.logf("injecting network monitor event to attempt to refresh the resolvers")
+			f.netMon.InjectEvent()
+		}
+
 		res, err := servfailResponse(query)
 		if err != nil {
 			f.logf("building servfail response: %v", err)
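The recovery pattern in this hunk can be tried in isolation with a small, self-contained sketch. It is not the Tailscale implementation: the `monitor` and `forwarder` types, `setResolvers`, `forward`, `newDemo`, `errNoUpstream`, and the `192.0.2.53` address are all hypothetical stand-ins, and only the name `InjectEvent` is taken from the diff. The sketch also assumes the injected event triggers reconfiguration synchronously, which the real network monitor may not do.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNoUpstream plays the role of the SERVFAIL response in this sketch.
var errNoUpstream = errors.New("no upstream resolvers")

// monitor is a hypothetical stand-in for the network monitor. Only the
// InjectEvent name mirrors the diff; here it runs the callback synchronously.
type monitor struct {
	onChange func()
}

func (m *monitor) InjectEvent() {
	if m.onChange != nil {
		m.onChange()
	}
}

// forwarder is a simplified, hypothetical forwarder holding upstream addresses.
type forwarder struct {
	mu        sync.Mutex
	resolvers []string
	mon       *monitor
}

func (f *forwarder) setResolvers(rs []string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.resolvers = rs
}

// forward returns the upstream that would receive the query. With no
// upstreams it fails the query and asks the monitor to redo configuration.
func (f *forwarder) forward(name string) (string, error) {
	f.mu.Lock()
	rs := f.resolvers
	f.mu.Unlock()
	if len(rs) == 0 {
		// The lock is released here, so the callback may call setResolvers.
		f.mon.InjectEvent()
		return "", errNoUpstream
	}
	return rs[0], nil
}

// newDemo builds a forwarder whose first system poll comes back empty,
// imitating a reconfiguration that runs before the OS has set nameservers.
func newDemo() *forwarder {
	calls := 0
	systemResolvers := func() []string {
		calls++
		if calls == 1 {
			return nil // polled too early: nothing configured yet
		}
		return []string{"192.0.2.53"} // documentation-range address
	}
	f := &forwarder{mon: &monitor{}}
	f.mon.onChange = func() { f.setResolvers(systemResolvers()) }
	f.mon.InjectEvent() // initial configuration, which loses the race
	return f
}

func main() {
	f := newDemo()
	_, err := f.forward("example.com")
	fmt.Println("first query:", err)
	up, err := f.forward("example.com")
	fmt.Println("second query:", up, err)
}
```

In this sketch the first query still fails, as it does in the commit, where SERVFAIL is returned after the event is injected. The failure triggers a re-poll, so the second query finds an upstream instead of failing indefinitely.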