The prefix has space for 32-bit site IDs, but the validateViaPrefix
function would previously have disallowed site IDs greater than 255.
Fixestailscale/corp#16470
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4cdb0711dafb577fae72d86c4014cf623fa538ef
Add a standalone server for STUN that can be hosted independently of the
derper, and factor that back into the derper.
Fixes#8434Closes#8435Closes#10745
Signed-off-by: James Tucker <james@tailscale.com>
To make it easier to correlate the starting/ending log messages.
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2802d53ad98e19bc8914bc58f8c04d4443227b26
Updates #8022
Updates #6075
On iOS, we currently rely on delegated interface information to figure out the default route interface. The NetworkExtension framework in iOS seems to set the delegate interface only once, upon the *creation* of the VPN tunnel. If a network transition (e.g. from Wi-Fi to Cellular) happens while the tunnel is connected, it will be ignored and we will still try to set Wi-Fi as the default route because the delegated interface is not getting updated as connectivity transitions.
Here we work around this on the Swift side with a NWPathMonitor instance that observes the interface name of the first currently satisfied network path. Our Swift code will call into `UpdateLastKnownDefaultRouteInterface`, so we can rely on that when it is set.
If for any reason the Swift machinery didn't work and we don't get any updates, here we also have some fallback logic: we try finding a hardcoded Wi-Fi interface called en0. If en0 is down, we fall back to cellular (pdp_ip0) as a last resort. This doesn't handle all edge cases like USB-Ethernet adapters or multiple Ethernet interfaces, but it is good enough to ensure connectivity isn't broken.
I tested this on iPhones and iPads running iOS 17.1 and it appears to work. Switching between different cellular plans on a dual SIM configuration also works (the interface name remains pdp_ip0).
Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
If the epoch that we see during a Probe is less than the existing epoch,
it means that the gateway has either restarted or reset its
configuration, and an existing mapping is no longer valid. Reset any
saved mapping(s) if we detect this case so that a future
createOrGetMapping will not attempt to re-use it.
Updates #10597
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ie3cddaf625cb94a29885f7a1eeea25dbf6b97b47
When the portable Monitor creates a winMon via newOSMon, we register
address and route change callbacks with Windows. Once a callback is hit,
it starts a goroutine that attempts to send the event into messagec and returns.
The newly started goroutine then blocks until it can send to the channel.
However, if the monitor is never started and winMon.Receive is never called,
the goroutines remain indefinitely blocked, leading to goroutine leaks and
significant memory consumption in the tailscaled service process on Windows.
Unlike the tailscaled subprocess, the service process creates but never starts
a Monitor.
This PR adds a check within the callbacks to confirm the monitor's active status,
and exits immediately if the monitor hasn't started.
Updates #9864
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Run `staticcheck` with `U1000` to find unused code. This cleans up about
a half of it. I'll do the other half separately to keep PRs manageable.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This logs additional information about what mapping(s) are obtained
during the creation process, including whether we return an existing
cached mapping.
Updates #10597
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9ff25071f064c91691db9ab0b9365ccc5f948d6e
Currently, we get the "likely home router" gateway IP and then iterate
through all IPs for all interfaces trying to match IPs to determine the
source IP. However, on many platforms we know what interface the gateway
is through, and thus we don't need to iterate through all interfaces
checking IPs. Instead, use the IP address of the associated interface.
This better handles the case where we have multiple interfaces on a
system all connected to the same gateway, and where the first interface
that we visit (as iterated by ForeachInterfaceAddress) isn't also the
default internet route.
Updates #8992
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8632f577f1136930f4ec60c76376527a19a47d1f
Instead of taking the first UPnP response we receive and using that to
create port mappings, store all received UPnP responses, sort and
deduplicate them, and then try all of them to obtain an external
address.
Updates #10602
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I783ccb1834834ee2a9ecbae2b16d801f2354302f
This uses the fact that we've received a frame from a given DERP region
within a certain time as a signal that the region is stil present (and
thus can still be a node's PreferredDERP / home region) even if we don't
get a STUN response from that region during a netcheck.
This should help avoid DERP flaps that occur due to losing STUN probes
while still having a valid and active TCP connection to the DERP server.
RELNOTE=Reduce home DERP flapping when there's still an active connection
Updates #8603
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If7da6312581e1d434d5c0811697319c621e187a0
Previously, we would select the first WANIPConnection2 (and related)
client from the root device, without any additional checks. However,
some routers expose multiple UPnP devices in various states, and simply
picking the first available one can result in attempting to perform a
portmap with a device that isn't functional.
Instead, mimic what the miniupnpc code does, and prefer devices that are
(a) reporting as Connected, and (b) have a valid external IP address.
For our use-case, we additionally prefer devices that have an external
IP address that's a public address, to increase the likelihood that we
can obtain a direct connection from peers.
Finally, we split out fetching the root device (getUPnPRootDevice) from
selecting the best service within that root device (selectBestService),
and add some extensive tests for various UPnP server behaviours.
RELNOTE=Improve UPnP portmapping when multiple UPnP services exist
Updates #8364
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I71795cd80be6214dfcef0fe83115a5e3fe4b8753
Unfortunately in the test we can't reproduce the failure seen
in the real system ("SOAP fault: UPnPError")
Updates https://github.com/tailscale/tailscale/issues/8364
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
Before this fix, LikelyHomeRouterIP could return a 'self' IP that
doesn't correspond to the gateway address, since it picks the first
private address when iterating over the set interfaces as the 'self' IP,
without checking that the address corresponds with the
previously-detected gateway.
This behaviour was introduced by accident in aaf2df7, where we deleted
the following code:
for _, prefix := range privatev4s {
if prefix.Contains(gateway) && prefix.Contains(ip) {
myIP = ip
ok = true
return
}
}
Other than checking that 'gateway' and 'ip' were private IP addresses
(which were correctly replaced with a call to the netip.Addr.IsPrivate
method), it also implicitly checked that both 'gateway' and 'ip' were a
part of the *same* prefix, and thus likely to be the same interface.
Restore that behaviour by explicitly checking pfx.Contains(gateway),
which, given that the 'ip' variable is derived from our prefix 'pfx',
ensures that the 'self' IP will correspond to the returned 'gateway'.
Fixes#10466
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iddd2ee70cefb9fb40071986fefeace9ca2441ee6
Config.singleResolverSet returns true if all routes have the same resolvers,
even if the routes have no resolvers. If none of the routes have a specific
resolver, the default should be used instead. Therefore, check for more than
0 instead of nil.
Signed-off-by: Ryan Petris <ryan@petris.net>
This prevents running more than one recursive resolution for the same
hostname in parallel, which can use excessive amounts of CPU when called
in a tight loop. Additionally, add tests that hit the network (when
run with a flag) to test the lookup behaviour.
Updates tailscale/corp#15261
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I39351e1d2a8782dd4c52cb04b3bd982eb651c81e
An EmbeddedAppConnector is added that when configured observes DNS
responses from the PeerAPI. If a response is found matching a configured
domain, routes are advertised when necessary.
The wiring from a configuration in the netmap capmap is not yet done, so
while the connector can be enabled, no domains can yet be added.
Updates tailscale/corp#15437
Signed-off-by: James Tucker <james@tailscale.com>
The other IP types don't appear to be imported anymore, and after a scan
through I couldn't see any substantial usage of other representations,
so I think this TODO is complete.
Updates #cleanup
Signed-off-by: James Tucker <james@tailscale.com>
As of 2023-11-27, the official IP addresses for b.root-servers.net will
change to a new set, with the older IP addresses supported for at least
a year after that date. These IPs are already active and returning
results, so update these in our recursive DNS resolver package so as to
be ready for the switchover.
See: https://b.root-servers.org/news/2023/05/16/new-addresses.htmlFixes#9994
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I29e2fe9f019163c9ec0e62bdb286e124aa90a487
It seems to be implicated in a CPU consumption bug that's not yet
understood. Disable it until we understand.
Updates tailscale/corp#15261
Change-Id: Ia6d0c310da6464dda79a70fc3c18be0782812d3f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Endeavour OS, at least, uses NetworkManager 1.44.2 and does
not use systemd-resolved behind the scenes at all. If we
find ourselves in that situation, return "direct" not
"systemd-resolved"
Fixes https://github.com/tailscale/tailscale/issues/9687
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
The current structure meant that we were embedding netstack in
the tailscale CLI and in the GUIs. This removes that by isolating
the checksum munging to a different pkg which is only called from
`net/tstun`.
Fixes#9756
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Tailscale attempts to determine if resolvconf or openresolv
is in use by running `resolvconf --version`, under the assumption
this command will error when run with Debian's resolvconf. This
assumption is no longer true and leads to the wrong commands being
run on newer versions of Debian with resolvconf >= 1.90. We can
now check if the returned version string starts with "Debian resolvconf"
if the command is successful.
Fixes#9218
Signed-off-by: Galen Guyer <galen@galenguyer.com>
Automatically probe the path MTU to a peer when peer MTU is enabled, but do not
use the MTU information for anything yet.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
Advertise it on Android (it looks like it already works once advertised).
And both advertise & likely fix it on iOS. Yet untested.
Updates #9672
Change-Id: If3b7e97f011dea61e7e75aff23dcc178b6cf9123
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of just falling back to making a TCP query to an upstream DNS
server when the UDP query returns a truncated query, also start a TCP
query in parallel with the UDP query after a given race timeout. This
ensures that if the upstream DNS server does not reply over UDP (or if
the response packet is blocked, or there's an error), we can still make
queries if the server replies to TCP queries.
This also adds a new package, util/race, to contain the logic required for
racing two different functions and returning the first non-error answer.
Updates tailscale/corp#14809
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4311702016c1093b1beaa31b135da1def6d86316
Implements the ability for the address-rewriting code to support rewriting IPv6 addresses.
Specifically, UpdateSrcAddr & UpdateDstAddr.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Updates https://github.com/tailscale/corp/issues/11202
Now that corp is updated, remove the shim code to bridge the rename from
DefaultMTU() to DefaultTUNMTU.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
We should be able to freely run `./tool/go generate ./...`, but we're
continually dodging this particular generator. Instead of constantly
dodging it, let's just remove it.
Updates #cleanup
Signed-off-by: James Tucker <james@tailscale.com>
Prepare for path MTU discovery by splitting up the concept of
DefaultMTU() into the concepts of the Tailscale TUN MTU, MTUs of
underlying network interfaces, minimum "safe" TUN MTU, user configured
TUN MTU, probed path MTU to a peer, and maximum probed MTU. Add a set
of likely MTUs to probe.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
We weren't correctly retrying truncated requests to an upstream DNS
server with TCP. Instead, we'd return a truncated request to the user,
even if the user was querying us over TCP and thus able to handle a
large response.
Also, add an envknob and controlknob to allow users/us to disable this
behaviour if it turns out to be buggy (✨ DNS ✨).
Updates #9264
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifb04b563839a9614c0ba03e9c564e8924c1a2bfd
Prepare for path MTU discovery by splitting up the concept of
DefaultMTU() into the concepts of the Tailscale TUN MTU, MTUs of
underlying network interfaces, minimum "safe" TUN MTU, user configured
TUN MTU, probed path MTU to a peer, and maximum probed MTU. Add a set
of likely MTUs to probe.
Updates #311
Signed-off-by: Val <valerie@tailscale.com>
One Quad9 IPv6 address was incorrect, and an additional group needed
adding. Additionally I checked Cloudflare and included source reference
URLs for both.
Updates #cleanup
Signed-off-by: James Tucker <james@tailscale.com>
It might as well have been spewing out gibberish. This adds
a nicer output format for us to be able to read and identify
whats going on.
Sample output
```
natV4Config{nativeAddr: 100.83.114.95, listenAddrs: [10.32.80.33], dstMasqAddrs: [10.32.80.33: 407 peers]}
```
Fixestailscale/corp#14650
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This PR plumbs through awareness of an IPv6 SNAT/masquerade address from the wire protocol
through to the low-level (tstun / wgengine). This PR is the first in two PRs for implementing
IPv6 NAT support to/from peers.
A subsequent PR will implement the data-plane changes to implement IPv6 NAT - this is just plumbing.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Updates ENG-991
This should allow us to gather a bit more information about errors that
we encounter when creating UPnP mappings. Since we don't have a
"LabelMap" construction for clientmetrics, do what sockstats does and
lazily register a new metric when we see a new code.
Updates #9343
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibb5aadd6138beb58721f98123debcc7273b611ba
And convert all callers over to the methods that check SelfNode.
Now we don't have multiple ways to express things in tests (setting
fields on SelfNode vs NetworkMap, sometimes inconsistently) and don't
have multiple ways to check those two fields (often only checking one
or the other).
Updates #9443
Change-Id: I2d7ba1cf6556142d219fae2be6f484f528756e3c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
NetworkMap.Addresses is redundant with the SelfNode.Addresses. This
works towards a TODO to delete NetworkMap.Addresses and replace it
with a method.
This is similar to #9389.
Updates #cleanup
Change-Id: Id000509ca5d16bb636401763d41bdb5f38513ba0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The following IPs are not used anymore: 193.19.108.2 and 193.19.108.3.
All of the servers are now named consistently under dns.mullvad.net.
Several new servers were added.
https://mullvad.net/en/help/dns-over-https-and-dns-over-tls/
Updates #5416
Updates #9345
Signed-off-by: James Tucker <james@tailscale.com>
This logs that the gateway/self IP address has changed if one of the new
values differs.
Updates #8992
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0919424b68ad97fbe1204dd36317ed6f5915411f
Some routers don't support lease times for UPnP portmapping; let's fall
back to adding a permanent lease in these cases. Additionally, add a
proper end-to-end test case for the UPnP portmapping behaviour.
Updates #9343
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I17dec600b0595a5bfc9b4d530aff6ee3109a8b12
Previously two tsnet nodes in the same process couldn't have disjoint
sets of controlknob settings from control as both would overwrite each
other's global variables.
This plumbs a new controlknobs.Knobs type around everywhere and hangs
the knobs sent by control on that instead.
Updates #9351
Change-Id: I75338646d36813ed971b4ffad6f9a8b41ec91560
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I didn't clean up the more idiomatic map[T]bool with true values, at
least yet. I just converted the relatively awkward struct{}-valued
maps.
Updates #cleanup
Change-Id: I758abebd2bb1f64bc7a9d0f25c32298f4679c14f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
RELNOTE=Adds support for Wikimedia DNS
Updates #9255
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4213c29e0f91ea5aa0304a5a026c32b6690fead9
It was too aggressive before, as it only had the ill-defined "Major"
bool to work with. Now it can check more precisely.
Updates #9040
Change-Id: I20967283b64af6a9cad3f8e90cff406de91653b8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This makes wsconn.Conns somewhat present reasonably when they are
the client of an http.Request, rather than just put a placeholder
in that field.
Updates tailscale/corp#13777
Signed-off-by: David Anderson <danderson@tailscale.com>
This removes a lot of API from net/interfaces (including all the
filter types, EqualFiltered, active Tailscale interface func, etc) and
moves the "major" change detection to net/netmon which knows more
about the world and the previous/new states.
Updates #9040
Change-Id: I7fe66a23039c6347ae5458745b709e7ebdcce245
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This makes it more maintainable for other code to statically depend
on the exact value of this string. It also makes it easier to
identify what code might depend on this string by looking up
references to this constant.
Updates tailscale/corp#13777
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This option allows logging the raw HTTP requests and responses that the
portmapper Client makes when using UPnP. This can be extremely helpful
when debugging strange UPnP issues with users' devices, and might allow
us to avoid having to instruct users to perform a packet capture.
Updates #8992
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2c3cf6930b09717028deaff31738484cc9b008e4
I'm not saying it works, but it compiles.
Updates #5794
Change-Id: I2f3c99732e67fe57a05edb25b758d083417f083e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have a fancy package for doing TLS cert validation even if the machine
doesn't have TLS certs (for LetsEncrypt only) but the CLI's netcheck command
wasn't using it.
Also, update the tlsdial's outdated package docs while here.
Updates #cleanup
Change-Id: I74b3cb645d07af4d8ae230fb39a60c809ec129ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This simplifies some netmon code in prep for other changes.
It breaks up Monitor.debounce into a helper method so locking is
easier to read and things unindent, and then it simplifies the polling
netmon implementation to remove the redundant stuff that the caller
(the Monitor.debounce loop) was already basically doing.
Updates #9040
Change-Id: Idcfb45201d00ae64017042a7bdee6ef86ad37a9f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Port 0 is interpreted, per the spec (but inconsistently among router
software) as requesting to map every single available port on the UPnP
gateway to the internal IP address. We'd previously avoided picking
ports below 1024 for one of the two UPnP methods (in #7457), and this
change moves that logic so that we avoid it in all cases.
Updates #8992
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I20d652c0cd47a24aef27f75c81f78ae53cc3c71e
Make it just a views.Slice[netip.Prefix] instead of its own named type.
Having the special case led to circular dependencies in another WIP PR
of mine.
Updates #8948
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Values are still turned into pointers internally to maintain the
invariants of strideTable, but from the user's perspective it's
now possible to tbl.Insert(pfx, true) rather than
tbl.Insert(pfx, ptr.To(true)).
Updates #7781
Signed-off-by: David Anderson <danderson@tailscale.com>
In preparation for a different refactor, but incidentally also saves
10-25% memory on overall table size in benchmarks.
Updates #7781
Signed-off-by: David Anderson <danderson@tailscale.com>
Netcheck no longer performs I/O itself, instead it makes requests via
SendPacket and expects users to route reply traffic to
ReceiveSTUNPacket.
Netcheck gains a Standalone function that stands up sockets and
goroutines to implement I/O when used in a standalone fashion.
Magicsock now unconditionally routes STUN traffic to the netcheck.Client
that it hosts, and plumbs the send packet sink.
The CLI is updated to make use of the Standalone mode.
Fixes#8723
Signed-off-by: James Tucker <james@tailscale.com>
Refactor two shared functions used by the tailscale cli,
calcAdvertiseRoutes and licensesURL. These are used by the web client as
well as other tailscale subcommands. The web client is being moved out
of the cli package, so move these two functions to new locations.
Updates tailscale/corp#13775
Signed-off-by: Will Norris <will@tailscale.com>
One is a straight "I forgot how to Go" bug, the others are semantic
mismatches with the main implementation around masking the prefixes
passed to insert/delete.
Updates #7781
Signed-off-by: David Anderson <danderson@tailscale.com>