With #6566 we added an external mechanism for getting the default
interface, and used it on macOS and iOS (see tailscale/corp#8201).
The goal was to be able to get the default physical interface even when
using an exit node (in which case the routing table would say that the
Tailscale utun* interface is the default).
However, the external mechanism turns out to be unreliable in some
cases, e.g. when multiple cellular interfaces are present/toggled (I
have occasionally gotten my phone into a state where it reports the pdp_ip1
interface as the default, even though it can't actually route traffic).
It was observed that `ifconfig -v` on macOS reports an "effective interface"
for the Tailscale utn* interface, which seems promising. By examining
the ifconfig source code, it turns out that this is done via a
SIOCGIFDELEGATE ioctl syscall. Though this is a private API, it appears
to have been around for a long time (e.g. it's in the 10.13 xnu release
at https://opensource.apple.com/source/xnu/xnu-4570.41.2/bsd/net/if_types.h.auto.html)
and thus is unlikely to go away.
We can thus use this ioctl if the routing table says that a utun*
interface is the default, and go back to the simpler mechanism that
we had before #6566.
Updates #7184
Updates #7188
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
As part of the work on #7248 I wanted to know all of the flags on the
RouteMessage struct that we get back from macOS. Though it doesn't turn
out to be useful (when using an exit node/Tailscale is the default route,
the flags for the physical interface routes are the same), it still seems
useful from a debugging/comprehensiveness perspective.
Adds additional Darwin flags that were output once I enabled this mode.
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
With #6566 we started to more aggressively bind to the default interface
on Darwin. We are seeing some reports of the wrong cellular interface
being chosen on iOS. To help with the investigation, this adds to knobs
to control the behavior changes:
- CapabilityDebugDisableAlternateDefaultRouteInterface disables the
alternate function that we use to get the default interface on macOS
and iOS (implemented in tailscale/corp#8201). We still log what it
would have returned so we can see if it gets things wrong.
- CapabilityDebugDisableBindConnToInterface is a bigger hammer that
disables binding of connections to the default interface altogether.
Updates #7184
Updates #7188
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
It was originally added to control memory use on iOS (#2490), but then
was relaxed conditionally when running on iOS 15 (#3098). Now that we
require iOS 15, there's no need for the limit at all, so simplify back
to the original state.
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
GetProxyConnectHeader (golang/go#41048) was upstreamed in Go 1.16 and
OnProxyConnectResponse (golang/go#54299) in Go 1.20, thus we no longer
need to guard their use by the tailscale_go build tag.
Updates #7123
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
Add the envknob TS_DEBUG_EXIT_NODE_DNS_NET_PKG, which enables more
verbose debug logging when calling the handleExitNodeDNSQueryWithNetPkg
function. This function is currently only called on Windows and Android.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ieb3ca7b98837d7dc69cd9ca47609c1c52e3afd7b
When we make a connection to a server, we previously would verify with
the system roots, and then fall back to verifying with our baked-in
Let's Encrypt root if the system root cert verification failed.
We now explicitly check for, and log a health error on, self-signed
certificates. Additionally, we now always verify against our baked-in
Let's Encrypt root certificate and log an error if that isn't
successful. We don't consider this a health failure, since if we ever
change our server certificate issuer in the future older non-updated
versions of Tailscale will no longer be healthy despite being able to
connect.
Updates #3198
Change-Id: I00be5ceb8afee544ee795e3c7a2815476abc4abf
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Updates #7123
Updates #6257 (more to do in other repos)
Change-Id: I073e2a6d81a5d7fbecc29caddb7e057ff65239d0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Update all code generation tools, and those that check for license
headers to use the new standard header.
Also update copyright statement in LICENSE file.
Fixes#6865
Signed-off-by: Will Norris <will@tailscale.com>
This updates all source files to use a new standard header for copyright
and license declaration. Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.
This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.
Updates #6865
Signed-off-by: Will Norris <will@tailscale.com>
Follow-up to #7065 with some comments from Brad's review.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia1219f4fa25479b2dada38ffe421065b408c5954
The API documentation does claim to output empty strings under certain
conditions, but we're sometimes seeing nil pointers in the wild, not empty
strings.
Fixes https://github.com/tailscale/corp/issues/8878
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
When turned on via environment variable (off by default), this will use
the BSD routing APIs to query what interface index a socket should be
bound to, rather than binding to the default interface in all cases.
Updates #5719
Updates #5940
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib4c919471f377b7a08cd3413f8e8caacb29fee0b
This allows users to temporarily enable/disable dnscache logging via a
new node capability, to aid in debugging strange connectivity issues.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I46cf2596a8ae4c1913880a78d0033f8b668edc08
I typoed/brainoed in the earlier 3582628691
Change-Id: Ic198a6f9911f195d9da9fc5259b5784a4b15e5e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
01b90df2fa added SCTP support before
(with explicit parsing for ports) and
69de3bf7bf tried to add support for
arbitrary IP protocols (as long as the ACL permited a port of "*",
since we might not know how to find ports from an arbitrary IP
protocol, if it even has such a concept). But apparently that latter
commit wasn't tested end-to-end enough. It had a lot of tests, but the
tests made assumptions about layering that either weren't true, or
regressed since 1.20. Notably, it didn't remove the (*Filter).pre
bidirectional filter that dropped all "unknown" protocol packets both
leaving and entering, even if there were explicit protocol matches
allowing them in.
Also, don't map all unknown protocols to 0. Keep their IP protocol
number parsed so it's matchable by later layers. Only reject illegal
things.
Fixes#6423
Updates #2162
Updates #2163
Change-Id: I9659b3ece86f4db51d644f9b34df78821758842c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Gateway devices operating as an HA pair w/VRRP or CARP may send UPnP
replies from static addresses rather than the floating gateway address.
This commit relaxes our source address verification such that we parse
responses from non-gateway IPs, and re-point the UPnP root desc
URL to the gateway IP. This ensures we are still interfacing with the
gateway device (assuming L2 security intact), even though we got a
root desc from a non-gateway address.
This relaxed handling is required for ANY port mapping to work on certain
OPNsense/pfsense distributions using CARP at the time of writing, as
miniupnpd may only listen on the static, non-gateway interface address
for PCP and PMP.
Fixes#5502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
With a42a594bb3, iOS uses netstack and
hence there are no longer any platforms which use the legacy MagicDNS path. As such, we remove it.
We also normalize the limit for max in-flight DNS queries on iOS (it was 64, now its 256 as per other platforms).
It was 64 for the sake of being cautious about memory, but now we have 50Mb (iOS-15 and greater) instead of 15Mb
so we have the spare headroom.
Signed-off-by: Tom DNetto <tom@tailscale.com>
Go now includes the GOROOT bin directory in $PATH while running tests
and generate, so it is no longer necessary to construct a path using
runtime.GOROOT().
Fixes#6689
Signed-off-by: James Tucker <james@tailscale.com>
We saw a few cases where we hit this limit; bumping to 4k seems
relatively uncontroversial.
Change-Id: I218fee3bc0d2fa5fde16eddc36497a73ebd7cbda
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
We change our invocations of GetExtendedTcpTable to request additional
information about the "module" responsible for the port. In addition to pid,
this output also includes sufficient metadata to enable Windows to resolve
process names and disambiguate svchost processes.
We store the OS-specific output in an OSMetadata field in netstat.Entry, which
portlist may then use as necessary to actually resolve the process/module name.
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The Tailscale logging service has a hard limit on the maximum
log message size that can be accepted.
We want to ensure that netlog messages never exceed
this limit otherwise a client cannot transmit logs.
Move the goroutine for periodically dumping netlog messages
from wgengine/netlog to net/connstats.
This allows net/connstats to manage when it dumps messages,
either based on time or by size.
Updates tailscale/corp#8427
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Previously, if a DNS-over-TCP message was received while there were
existing queries in-flight, and it was over the size limit, we'd close
the 'responses' channel. This would cause those in-flight queries to
send on the closed channel and panic.
Instead, don't close the channel at all and rely on s.ctx being
canceled, which will ensure that in-flight queries don't hang.
Fixes#6725
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8267728ac37ed7ae38ddd09ce2633a5824320097
This is temporary while we work to upstream performance work in
https://github.com/WireGuard/wireguard-go/pull/64. A replace directive
is less ideal as it breaks dependent code without duplication of the
directive.
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We would call parsedPacketPool.Get() for all packets received in Read/Write.
This was wasteful and not necessary, fetch a single *packet.Parsed for
all packets.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This commit updates the wireguard-go dependency and implements the
necessary changes to the tun.Device and conn.Bind implementations to
support passing vectors of packets in tailscaled. This significantly
improves throughput performance on Linux.
Updates #414
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
x/exp/slices now has ContainsFunc (golang/go#53983) so we can delete
our versions.
Change-Id: I5157a403bfc1b30e243bf31c8b611da25e995078
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were previously only doing this for tailscaled-on-Darwin, but it also
appears to help on iOS. Otherwise, when we rebind magicsock UDP
connections after a cellular -> WiFi interface change they still keep
using cellular one.
To do this correctly when using exit nodes, we need to exclude the
Tailscale interface when getting the default route, otherwise packets
cannot leave the tunnel. There are native macOS/iOS APIs that we can
use to do this, so we allow those clients to override the implementation
of DefaultRouteInterfaceIndex.
Updates #6565, may also help with #5156
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
I couldn't find any logs that indicated which mode it was running in so adding that.
Also added a gauge metric for dnsMode.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Previously, tstun.Wrapper and magicsock.Conn managed their
own statistics data structure and relied on an external call to
Extract to extract (and reset) the statistics.
This makes it difficult to ensure a maximum size on the statistics
as the caller has no introspection into whether the number
of unique connections is getting too large.
Invert the control flow such that a *connstats.Statistics
is registered with tstun.Wrapper and magicsock.Conn.
Methods on non-nil *connstats.Statistics are called for every packet.
This allows the implementation of connstats.Statistics (in the future)
to better control when it needs to flush to ensure
bounds on maximum sizes.
The value registered into tstun.Wrapper and magicsock.Conn could
be an interface, but that has two performance detriments:
1. Method calls on interface values are more expensive since
they must go through a virtual method dispatch.
2. The implementation would need a sync.Mutex to protect the
statistics value instead of using an atomic.Pointer.
Given that methods on constats.Statistics are called for every packet,
we want reduce the CPU cost on this hot path.
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The fix in 4fc8538e2 was sufficient for IPv6. Browsers (can?) send the
IPv6 literal, even without a port number, in brackets.
Updates tailscale/corp#7948
Change-Id: I0e429d3de4df8429152c12f251ab140b0c8f6b77
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There was a mechanism in tshttpproxy to note that a Windows proxy
lookup failed and to stop hitting it so often. But that turns out to
fire a lot (no PAC file configured at all results in a proxy lookup),
so after the first proxy lookup, we were enabling the "omg something's
wrong, stop looking up proxies" bit for awhile, which was then also
preventing the normal Go environment-based proxy lookups from working.
This at least fixes environment-based proxies.
Plenty of other Windows-specific proxy work remains (using
WinHttpGetIEProxyConfigForCurrentUser instead of just PAC files,
ignoring certain types of errors, etc), but this should fix
the regression reported in #4811.
Updates #4811
Change-Id: I665e1891897d58e290163bda5ca51a22a017c5f9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Run an inotify goroutine and watch if another program takes over
/etc/inotify.conf. Log if so.
For now this only logs. In the future I want to wire it up into the
health system to warn (visible in "tailscale status", etc) about the
situation, with a short URL to more info about how you should really
be using systemd-resolved if you want programs to not fight over your
DNS files on Linux.
Updates #4254 etc etc
Change-Id: I86ad9125717d266d0e3822d4d847d88da6a0daaa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Noticed this while debugging something else, we would reset all routes if
either `--advertise-exit-node` or `--advertise-routes` were set. This handles
correctly updating them.
Also added tests.
Signed-off-by: Maisem Ali <maisem@tailscale.com>
The derpers don't allow whitespace in the challenge.
Change-Id: I93a8b073b846b87854fba127b5c1d80db205f658
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
The //go:build syntax was introduced in Go 1.17:
https://go.dev/doc/go1.17#build-lines
gofmt has kept the +build and go:build lines in sync since
then, but enough time has passed. Time to remove them.
Done with:
perl -i -npe 's,^// \+build.*\n,,' $(git grep -l -F '+build')
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To collect some data on how widespread this is and whether there's
any correlation between different versions of Windows, etc.
Updates #4811
Change-Id: I003041d0d7e61d2482acd8155c1a4ed413a2c5c4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of returning a custom error, use ErrGetBaseConfigNotSupported
that seems to be intended for this use case. This fixes DNS resolution
on macOS clients compiled from source.
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The netlog.Message type is useful to depend on from other packages,
but doing so would transitively cause gvisor and other large packages
to be linked in.
Avoid this problem by moving all network logging types to a single package.
We also update staticcheck to take in:
003d277bcf
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
It looks like this was left by mistake in 4a3e2842.
Change-Id: Ie4e3d5842548cd2e8533b3552298fb1ce9ba761a
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
On Android, the system resolver can return IPv4 addresses as IPv6-mapped
addresses (i.e. `::ffff:a.b.c.d`). After the switch to `net/netip`
(19008a3), this case is no longer handled and a response like this will
be seen as failure to resolve any IPv4 addresses.
Handle this case by simply calling `Unmap()` on the returned IPs. Fixes#5698.
Signed-off-by: Peter Cai <peter@typeblog.net>
We had previously added this to the netcheck report in #5087 but never
copied it into the NetInfo struct. Additionally, add it to log lines so
it's visible to support.
Change-Id: Ib6266f7c6aeb2eb2a28922aeafd950fe1bf5627e
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
The ResolvConfMode property is documented to return how systemd-resolved
is currently managing /etc/resolv.conf. Include that information in the
debug line, when available, to assist in debugging DNS issues.
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1ae3a257df1d318d0193a8c7f135c458ec45093e
The Lufthansa in-flight wifi generates a synthetic 204 response to the
DERP server's /generate_204 endpoint. This PR adds a basic
challenge/response to the endpoint; something sufficiently complicated
that it's unlikely to be implemented by a captive portal. We can then
check for the expected response to verify whether we're being MITM'd.
Follow-up to #5601
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I94a68c9a16a7be7290200eea6a549b64f02ff48f
Instead of treating any interface with a non-ifscope route as a
potential default gateway, now verify that a given route is
actually a default route (0.0.0.0/0 or ::/0).
Fixes#5879
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We removed it in #4806 in favor of the built-in functionality from the
nhooyr.io/websocket package. However, it has an issue with deadlines
that has not been fixed yet (see nhooyr/websocket#350). Temporarily
go back to using a custom wrapper (using the fix from our fork) so that
derpers will stop closing connections too aggressively.
Updates #5921
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
If netcheck happens before there's a derpmap.
This seems to only affect Headscale because it doesn't send a derpmap
as early?
Change-Id: I51e0dfca8e40623e04702bc9cc471770ca20d2c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Logger type managers a logtail.Logger for extracting
statistics from a tstun.Wrapper.
So long as Shutdown is called, it ensures that logtail
and statistic gathering resources are properly cleared up.
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Rename StatisticsEnable as SetStatisticsEnabled to be consistent
with other similarly named methods.
Rename StatisticsExtract as ExtractStatistics to follow
the convention where methods start with a verb.
It was originally named with Statistics as a prefix so that
statistics related methods would sort well in godoc,
but that property no longer holds.
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
If Wrapper.StatisticsEnable is enabled,
then per-connection counters are maintained.
If enabled, Wrapper.StatisticsExtract must be periodically called
otherwise there is unbounded memory growth.
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
High-level API:
type Statistics struct { ... }
type Counts struct { TxPackets, TxBytes, RxPackets, RxBytes uint64 }
func (*Statistics) UpdateTx([]byte)
func (*Statistics) UpdateRx([]byte)
func (*Statistics) Extract() map[flowtrack.Tuple]Counts
The API accepts a []byte instead of a packet.Parsed so that a future
implementation can directly hash the address and port bytes,
which are contiguous in most IP packets.
This will be useful for a custom concurrent-safe hashmap implementation.
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Most visible when using tsnet.Server, but could have resulted in dropped
messages in a few other places too.
Fixes#5743
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
I brain-o'ed the math earlier. The NextDNS prefix is /32 (actually
/33, but will guarantee last bit is 0), so we have 128-32 = 96 bits
(12 bytes) of config/profile ID that we can extract. NextDNS doesn't
currently use all those, but might.
Updates #2452
Change-Id: I249bd28500c781e45425fd00fd3f46893ae226a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
- removed some in-flow time calls
- increase buffer size to 2MB to overcome syscall cost
- move relative time computation from record to report time
Signed-off-by: James Tucker <james@tailscale.com>
The fragment offset is an 8 byte offset rather than a byte offset, so
the short packet limit is now in fragment block size in order to compare
with the offset value.
The packet flags are in the first 3 bits of the flags/frags byte, and
so after conversion to a uint16 little endian value they are at the
start, not the end of the value - the mask for extracting "more
fragments" is adjusted to match this byte.
Extremely short fragments less than 80 bytes are dropped, but fragments
over 80 bytes are now accepted.
Fixes#5727
Signed-off-by: James Tucker <james@tailscale.com>
This doesn't change any behaviour for now, other than maybe running a
full netcheck more often. The intent is to start gathering data on
captive portals, and additionally, seeing this in the 'tailscale
netcheck' command should provide a bit of additional information to
users.
Updates #1634
Change-Id: I6ba08f9c584dc0200619fa97f9fde1a319f25c76
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.
Reference: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>