Changes made:
* Avoid "encoding/json" for JSON processing, and instead use
"github.com/go-json-experiment/json/jsontext".
Use jsontext.Value.IsValid for validation, which is much faster.
Use jsontext.AppendQuote instead of our own JSON escaping.
* In drainPending, use a different maxLen depending on lowMem.
In lowMem mode, it is better to perform multiple uploads
than it is to construct a large body that OOMs the process.
* In drainPending, if an error is encountered draining,
construct an error message in the logtail JSON format
rather than something that is invalid JSON.
* In appendTextOrJSONLocked, use jsontext.Decoder to check
whether the input is a valid JSON object. This is faster than
the previous approach of unmarshaling into map[string]any and
then re-marshaling that data structure.
This is especially beneficial for network flow logging,
which produces relatively large JSON objects.
* In appendTextOrJSONLocked, enforce maxSize on the input.
If too large, then we may end up in a situation where the logs
can never be uploaded because it exceeds the maximum body size
that the Tailscale logs service accepts.
* Use "tailscale.com/util/truncate" to properly truncate a string
on valid UTF-8 boundaries.
* In general, remove unnecessary spaces in JSON output.
Performance:
name old time/op new time/op delta
WriteText 776ns ± 2% 596ns ± 1% -23.24% (p=0.000 n=10+10)
WriteJSON 110µs ± 0% 9µs ± 0% -91.77% (p=0.000 n=8+8)
name old alloc/op new alloc/op delta
WriteText 448B ± 0% 0B -100.00% (p=0.000 n=10+10)
WriteJSON 37.9kB ± 0% 0.0kB ± 0% -99.87% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
WriteText 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
WriteJSON 1.08k ± 0% 0.00k ± 0% -99.91% (p=0.000 n=10+10)
For text payloads, this is 1.30x faster.
For JSON payloads, this is 12.2x faster.
Updates #cleanup
Updates tailscale/corp#18514
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
We were being too aggressive when deciding whether to write our NRPT rules
to the local registry key or the group policy registry key.
After once again reviewing the document which calls itself a spec
(see issue), it is clear that the presence of the DnsPolicyConfig subkey
is the important part, not the presence of values set in the DNSClient
subkey. Furthermore, a footnote indicates that the presence of
DnsPolicyConfig in the GPO key will always override its counterpart in
the local key. The implication of this is important: we may unconditionally
write our NRPT rules to the local key. We copy our rules to the policy
key only when it contains NRPT rules belonging to somebody other than us.
Fixes https://github.com/tailscale/corp/issues/19071
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This package is included in the tempfork directory, rather than as a go
module dependency, so is not included in the normal package list.
Updates tailscale/corp#5780
Signed-off-by: Will Norris <will@tailscale.com>
Just because we don't have known endpoints for a peer does not mean that
the peer should become unreachable. If we know the peers key, it should
be able to call us, then we can talk back via whatever path it called us
on. First step - don't drop the packet in this context.
Updates tailscale/corp#19106
Signed-off-by: James Tucker <james@tailscale.com>
Extend the `zypper` install to import importing the GPG key used to sign
the repository packages.
Updates #11635
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Request ID generation appears prominently in some services cumulative
allocation rate, and while this does not eradicate this issue (the API
still makes UUID objects), it does improve the overhead of this API and
reduce the amount of garbage that it produces.
Updates tailscale/corp#18266
Updates tailscale/corp#19054
Signed-off-by: James Tucker <james@tailscale.com>
This still generates github.com/google/uuid UUID objects, but does so
using a ChaCha8 CSPRNG from the stdlib rand/v2 package. The public API
is backed by a sync.Pool to provide good performance in highly
concurrent operation.
Under high load the read API produces a lot of extra garbage and
overhead by way of temporaries and syscalls. This implementation reduces
both to minimal levels, and avoids any long held global lock by
utilizing sync.Pool.
Updates tailscale/corp#18266
Updates tailscale/corp#19054
Signed-off-by: James Tucker <james@tailscale.com>
This removes a potentially increased boot delay for certain boot
topologies where they block on ExecStartPre that may have socket
activation dependencies on other system services (such as
systemd-resolved and NetworkManager).
Also rename cleanup to clean up in affected/immediately nearby places
per code review commentary.
Fixes#11599
Signed-off-by: James Tucker <james@tailscale.com>
Package winenv provides information about the current Windows environment.
This includes details such as whether the device is a server or workstation,
and if it is AD domain-joined, MDM-registered, or neither.
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Also capitalises the start of all ShortHelp, allows subcommands to be hidden
with a "HIDDEN: " prefix in their ShortHelp, and adds a TS_DUMP_HELP envknob
to look at all --help messages together.
Fixes#11664
Signed-off-by: Paul Scott <paul@tailscale.com>
Buffer.Write has the exact same signature of io.Writer.Write.
The latter requires that implementations to never retain
the provided input buffer, which is an expectation that most
users will have when they see a Write signature.
The current behavior of Buffer.Write where it does retain
the input buffer is a risky precedent to set.
Switch the behavior to match io.Writer.Write.
There are only two implementations of Buffer in existence:
* logtail.memBuffer
* filch.Filch
The former can be fixed by cloning the input to Write.
This will cause an extra allocation in every Write,
but we can fix that will pooling on the caller side
in a follow-up PR.
The latter only passes the input to os.File.Write,
which does respect the io.Writer.Write requirements.
Updates #cleanup
Updates tailscale/corp#18514
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Updates ENG-2776
Updates the .admx and .adml files to include the new ManagedByOrganizationName, ManagedByCaption and ManagedByURL system policies, added in Tailscale v1.62 for Windows.
Co-authored-by: Andrea Gottardo <andrea@gottardo.me>
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The derphttp.Client mutex is held during connects (for up to 10
seconds) so this LocalAddr method (blocking on said mutex) could also
block for up to 10 seconds, causing a pileup upstream in
magicsock/wgengine and ultimately a watchdog timeout resulting in a
crash.
Updates #11519
Change-Id: Idd1d94ee00966be1b901f6899d8b9492f18add0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/tailscale/cli: respect $KUBECONFIG
* `$KUBECONFIG` is a `$PATH`-like: it defines a *list*.
`tailscale config kubeconfig` works like the rest of the
ecosystem so that if $KUBECONFIG is set it will write to the first existant file in the list, if none exist then
the final entry in the list.
* if `$KUBECONFIG` is an empty string, the old logic takes over.
Notes:
* The logic for file detection is inlined based on what `kind` does.
Technically it's a race condition, since the file could be removed/added
in between the processing steps, but the fallout shouldn't be too bad.
https://github.com/kubernetes-sigs/kind/blob/v0.23.0-alpha/pkg/cluster/internal/kubeconfig/internal/kubeconfig/paths.go
* The sandboxed (App Store) variant relies on a specific temporary
entitlement to access the ~/.kube/config file.
The entitlement is only granted to specific files, and so is not
applicable to paths supplied by the user at runtime.
While there may be other ways to achieve this access to arbitrary
kubeconfig files, it's out of scope for now.
Updates #11645
Signed-off-by: Chloé Vulquin <code@toast.bunkerlabs.net>
After:
bradfitz@book1pro tailscale.com % ./tool/go test -c ./cmd/tailscale/cli
bradfitz@book1pro tailscale.com % ./cli.test
bradfitz@book1pro tailscale.com %
Before:
bradfitz@book1pro tailscale.com % ./tool/go test -c ./cmd/tailscale/cli
bradfitz@book1pro tailscale.com % ./cli.test
Warning: funnel=on for foo.test.ts.net:443, but no serve config
run: `tailscale serve --help` to see how to configure handlers
Warning: funnel=on for foo.test.ts.net:443, but no serve config
run: `tailscale serve --help` to see how to configure handlers
USAGE
funnel <serve-port> {on|off}
funnel status [--json]
Funnel allows you to publish a 'tailscale serve'
server publicly, open to the entire internet.
Turning off Funnel only turns off serving to the internet.
It does not affect serving to your tailnet.
SUBCOMMANDS
status show current serve/funnel status
error: path must be absolute
error: invalid TCP source "localhost:5432": missing port in address
error: invalid TCP source "tcp://somehost:5432"
must be one of: localhost or 127.0.0.1
tcp://somehost:5432error: invalid TCP source "tcp://somehost:0"
must be one of: localhost or 127.0.0.1
tcp://somehost:0error: invalid TCP source "tcp://somehost:65536"
must be one of: localhost or 127.0.0.1
tcp://somehost:65536error: path must be absolute
error: cannot serve web; already serving TCP
You don't have permission to enable this feature.
This also moves the color handling up to a generic spot so it's
not just one subcommand doing it itself. See
https://github.com/tailscale/tailscale/issues/11626#issuecomment-2041795129Fixes#11643
Updates #11626
Change-Id: I3a49e659dcbce491f4a2cb784be20bab53f72303
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
- Wrap each prober function into a probe class that allows associating
metric labels and custom metrics with a given probe;
- Make sure all existing probe classes set a `class` metric label;
- Move bandwidth probe size from being a metric label to a separate
gauge metric; this will make it possible to use it to calculate
average used bandwidth using a PromQL query;
- Also export transfer time for the bandwidth prober (more accurate than
the total probe time, since it excludes connection establishment
time).
Updates tailscale/corp#17912
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
At least in the case of dialing a Tailscale IP.
Updates #4529
Change-Id: I9fd667d088a14aec4a56e23aabc2b1ffddafa3fe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is primarily for GUIs, so they don't need to remember the most
recently used exit node themselves.
This adds some CLI commands, but they're disabled and behind the WIP
envknob, as we need to consider naming (on/off is ambiguous with
running an exit node, etc) as well as automatic exit node selection in
the future. For now the CLI commands are effectively developer debug
things to test the LocalAPI.
Updates tailscale/corp#18724
Change-Id: I9a32b00e3ffbf5b29bfdcad996a4296b5e37be7e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was used when we only supported subnet routers on linux
and would nil out the SubnetRoutes slice as no other router
worked with it, but now we support subnet routers on ~all platforms.
The field it was setting to nil is now only used for network logging
and nowhere else, so keep the field but drop the SubnetRouterWrapper
as it's not useful.
Updates #cleanup
Change-Id: Id03f9b6ec33e47ad643e7b66e07911945f25db79
Signed-off-by: Maisem Ali <maisem@tailscale.com>
This names the func() that Once-unlocked LocalBackend.mu. It does so
both for docs and because it can then have a method: Unlock, for the
few points that need to explicitly unlock early (the cause of all this
mess). This makes those ugly points easy to find, and also can then
make them stricter, panicking if the mutex is already unlocked. So a
normal call to the func just once-releases the mutex, returning false
if it's already done, but the Unlock method is the strict one.
Then this uses it more, so most the b.mu.Unlock calls remaining are
simple cases and usually defers.
Updates #11649
Change-Id: Ia070db66c54a55e59d2f76fdc26316abf0dd4627
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
A number of methods in LocalBackend (with suffixed "LockedOnEntry")
require b.mu be held but unlock it on the way out. That's asymmetric
and atypical and error prone.
This adds a helper method to LocalBackend that locks the mutex and
returns a sync.OnceFunc that unlocks the mutex. Then we pass around
that unlocker func down the chain to make it explicit (and somewhat
type check the passing of ownership) but also let the caller defer
unlock it, in the case of errors/panics that happen before the callee
gets around to calling the unlock.
This revealed a latent bug in LocalBackend.DeleteProfile which double
unlocked the mutex.
Updates #11649
Change-Id: I002f77567973bd77b8906bfa4ec9a2049b89836a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The output was changing randomly per run, due to range over a map.
Then some misc style tweaks I noticed while debugging.
Fixes#11629
Change-Id: I67aef0e68566994e5744d4828002f6eb70810ee1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The netcheck package and the magicksock package coordinate via the
health package, but both sides have time based heuristics through
indirect dependencies. These were misaligned, so the implemented
heuristic aimed at reducing DERP moves while there is active traffic
were non-operational about 3/5ths of the time.
It is problematic to setup a good test for this integration presently,
so instead I added comment breadcrumbs along with the initial fix.
Updates #8603
Signed-off-by: James Tucker <james@tailscale.com>
This change makes the normalizeShareName function public, so it can be
used for validation in control.
Updates tailscale/corp#16827
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
And make NewMultiLabelMap panic earlier (at construction time)
if the comparable struct type T violates the documented rules,
rather than panicking at Add time.
Updates #cleanup
Change-Id: Ib1a03babdd501b8d699c4f18b1097a56c916c6d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Only on Gokrazy, set sysctls to enable IP forwarding so subnet routing
and advertised exit node works.
Fixes#11405
Signed-off-by: Joonas Kuorilehto <joneskoo@derbian.fi>
There are no mutations to the input,
so we can support both ~string and ~[]byte just fine.
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This change switches the api to /drive, rather than the previous /tailfs
as well as updates the log lines to reflect the new value. It also
cleans up some existing tailfs references.
Updates tailscale/corp#16827
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
This allows clients to avoid establishing their VPN multiple times when
both routes and DNS are changing in rapid succession.
Updates tailscale/corp#18928
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Specifying a smaller window size during compression
provides a knob to tweak the tradeoff between memory usage
and the compression ratio.
Updates tailscale/corp#18514
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
So that we can e.g. check TLS on multiple ports for a given IP.
Updates tailscale/corp#16367
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I81d840a4c88138de1cbb2032b917741c009470e6
This allows us to check all IP addresses (and address families) for a
given DNS hostname while dynamically discovering new IPs and removing
old ones as they're no longer valid.
Also add a testable example that demonstrates how to use it.
Alternative to #11610
Updates tailscale/corp#16367
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6d6f39bafc30e6dfcf6708185d09faee2a374599