tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2025-12-01 09:32:08 +00:00

Author	SHA1	Message	Date
Nick Khyl	66826a496b	VERSION.txt: this is v1.90.9 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.9	2025-11-25 10:12:16 -06:00
Claus Lensbøl	d7cf0cfdb4	wgengine/userspace: run link change subscribers in eventqueue (#18024) Updates #17996 Signed-off-by: Claus Lensbøl <claus@tailscale.com> (cherry picked from commit `e7f5ca1d5e`)	2025-11-25 10:07:49 -06:00
Nick Khyl	f58cbffda1	util/eventbus: use unbounded event queues for DeliveredEvents in subscribers Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load. Two common scenarios trigger deadlocks when the number of events published in a short period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size): - a subscriber tries to acquire the same mutex as held by a publisher, or - a subscriber for A events publishes B events Avoiding these scenarios is not practical and would limit eventbus usefulness and reduce its adoption, pushing us back to callbacks and other legacy mechanisms. These deadlocks already occurred in customer devices, dev machines, and tests. They also make it harder to identify and fix slow subscribers and similar issues we have been seeing recently. Choosing an arbitrary large fixed queue capacity would only mask the problem. A client running on a sufficiently large and complex customer environment can exceed any meaningful constant limit, since event volume depends on the number of peers and other factors. Behavior also changes based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue is essentially a race between publishers and subscribers. Additionally, on lower-end devices, an unreasonably high constant capacity is practically the same as using unbounded queues. Therefore, this PR changes the event queue implementation to be unbounded by default. The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers' DeliveredEvent queues become unbounded. This change fixes known deadlocks and makes the system stable under load, at the cost of higher potential memory usage, including cases where a queue grows during an event burst and does not shrink when load decreases. Further improvements can be implemented in the future as needed. Fixes #17973 Fixes #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com> (cherry picked from commit `1ccece0f78`)	2025-11-24 15:30:18 -06:00
Nick Khyl	f2100e2e1d	util/eventbus: add tests for a subscriber publishing events As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to publish events itself. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com> (cherry picked from commit `3780f25d51`)	2025-11-24 15:30:18 -06:00
Nick Khyl	6bc07d3c24	util/eventbus: add tests for a subscriber trying to acquire the same mutex as a publisher As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to acquire a mutex that can also be held by a publisher. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #17973 Signed-off-by: Nick Khyl <nickk@tailscale.com> (cherry picked from commit `016ccae2da`)	2025-11-24 15:30:18 -06:00
Nick Khyl	ccf4f3c7ce	VERSION.txt: this is v1.90.8 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.8	2025-11-18 12:31:30 -06:00
Alex Chan	6b0fbffd4f	tka: move RemoveAll() to CompactableChonk I added a RemoveAll() method on tka.Chonk in #17946, but it's only used in the node to purge local AUMs. We don't need it in the SQLite storage, which currently implements tka.Chonk, so move it to CompactableChonk instead. Also add some automated tests, as a safety net. Updates tailscale/corp#33599 Change-Id: I54de9ccf1d6a3d29b36a94eccb0ebd235acd4ebc Signed-off-by: Alex Chan <alexc@tailscale.com> (cherry picked from commit `c17ba64129`)	2025-11-18 12:23:48 -06:00
Nick Khyl	90d3cb3c95	VERSION.txt: this is v1.90.7 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.7	2025-11-18 11:32:04 -06:00
Alex Chan	37b63eff1c	ipn/ipnlocal: use an in-memory TKA store if FS is unavailable This requires making the internals of LocalBackend a bit more generic, and implementing the `tka.CompactableChonk` interface for `tka.Mem`. Signed-off-by: Alex Chan <alexc@tailscale.com> Updates https://github.com/tailscale/corp/issues/33599 (cherry picked from commit `1723cb83ed`)	2025-11-18 11:23:33 -06:00
Alex Chan	43ab8b4b18	tka: rename a mutex to `mu` instead of single-letter `l` See http://go/no-ell Updates tailscale/corp#33846 Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I88ecd9db847e04237c1feab9dfcede5ca1050cc5 (cherry picked from commit `fca66fb51a`)	2025-11-18 11:23:33 -06:00
Alex Chan	6b64718bb9	tka: don't try to read AUMs which are partway through being written Fixes https://github.com/tailscale/tailscale/issues/17600 Signed-off-by: Alex Chan <alexc@tailscale.com> (cherry picked from commit `23359dc727`)	2025-11-18 11:23:33 -06:00
Jonathan Nobels	fa514c7280	net/netmon: do not abandon a subscriber when exiting early (#17899) (#17905) LinkChangeLogLimiter keeps a subscription to track rate limits for log messages. But when its context ended, it would exit the subscription loop, leaving the subscriber still alive. Ensure the subscriber gets cleaned up when the context ends, so we don't stall event processing. Updates tailscale/corp#34311 Change-Id: I82749e482e9a00dfc47f04afbc69dd0237537cb2 (cherry picked from commit `ab4b990d51`) Signed-off-by: M. J. Fromberger <fromberger@tailscale.com> Co-authored-by: M. J. Fromberger <fromberger@tailscale.com>	2025-11-17 15:40:46 -05:00
Jordan Whited	ea8eeeb2f7	feature/relayserver: fix Shutdown() deadlock (#17898) Updates #17894 Signed-off-by: Jordan Whited <jordan@tailscale.com> (cherry picked from commit `0285e1d5fb`)	2025-11-15 13:54:07 -08:00
Jordan Whited	0f421d3def	feature/relayserver,ipn/ipnlocal,net/udprelay: plumb DERPMap (#17881) This commit replaces usage of local.Client in net/udprelay with DERPMap plumbing over the eventbus. This has been a longstanding TODO. This work was also accelerated by a memory leak in net/http when using local.Client over long periods of time. So, this commit also addresses said leak. Updates #17801 Signed-off-by: Jordan Whited <jordan@tailscale.com> (cherry picked from commit `9e4d1fd87f`)	2025-11-15 13:54:07 -08:00
Jordan Whited	eb03b354f6	net/udprelay: replace VNI pool with selection algorithm (#17868) This reduces memory usage when tailscaled is acting as a peer relay. Updates #17801 Signed-off-by: Jordan Whited <jordan@tailscale.com> (cherry picked from commit `f4f9dd7f8c`)	2025-11-15 13:54:07 -08:00
Jordan Whited	771a9d29ff	wgengine/magicsock: fix UDPRelayAllocReq/Resp deadlock (#17831) Updates #17830 Signed-off-by: Jordan Whited <jordan@tailscale.com> (cherry picked from commit `2ad2d4d409`)	2025-11-15 13:54:07 -08:00
Jordan Whited	e602907cf5	wgengine/magicsock: validate endpoint.derpAddr in Conn.onUDPRelayAllocResp (#17828) Otherwise a zero value will panic in Conn.sendUDPStd. Updates #17827 Signed-off-by: Jordan Whited <jordan@tailscale.com> (cherry picked from commit `18806de400`)	2025-11-15 13:54:07 -08:00
Nick Khyl	28f6c2dbfc	VERSION.txt: this is v1.90.6 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.6	2025-10-31 16:18:03 -05:00
M. J. Fromberger	b6eabd4038	util/eventbus: allow logging of slow subscribers (#17705) Add options to the eventbus.Bus to plumb in a logger. Route that logger in to the subscriber machinery, and trigger a log message to it when a subscriber fails to respond to its delivered events for 5s or more. The log message includes the package, filename, and line number of the call site that created the subscription. Add tests that verify this works. Updates #17680 Change-Id: I0546516476b1e13e6a9cf79f19db2fe55e56c698 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com> (cherry picked from commit `061e6266cf`)	2025-10-31 21:10:10 +00:00
M. J. Fromberger	6e2f2bb31a	ipn/ipnlocal: do not stall event processing for appc route updates (#17663) A follow-up to #17411. Put AppConnector events into a task queue, as they may take some time to process. Ensure that the queue is stopped at shutdown so that cleanup will remain orderly. Because events are delivered on a separate goroutine, slow processing of an event does not cause an immediate problem; however, a subscriber that blocks for a long time will push back on the bus as a whole. See https://godoc.org/tailscale.com/util/eventbus#hdr-Expected_subscriber_behavior for more discussion. Updates #17192 Updates #15160 Change-Id: Ib313cc68aec273daf2b1ad79538266c81ef063e3 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com> (cherry picked from commit `06b092388e`)	2025-10-31 21:09:32 +00:00
Alex Chan	faca4c08b7	.github/workflows: pin the google/oss-fuzz GitHub Actions Updates https://github.com/tailscale/corp/issues/31017 Signed-off-by: Alex Chan <alexc@tailscale.com> (cherry picked from commit `3944809a11`)	2025-10-30 11:42:02 -07:00
Nick Khyl	63242007ae	VERSION.txt: this is v1.90.5 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.5	2025-10-30 12:38:25 -05:00
Brad Fitzpatrick	300e6062bf	cmd/k8s-operator/generate: skip tests if no network or Helm is down Updates helm/helm#31434 Change-Id: I5eb20e97ff543f883d5646c9324f50f54180851d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> (cherry picked from commit `d5a40c01ab`)	2025-10-29 18:20:21 -07:00
Brad Fitzpatrick	1a6c31538e	sessionrecording: fix regression in recent http2 package change In `3f5c560fd4` I changed to use std net/http's HTTP/2 support, instead of pulling in x/net/http2. But I forgot to update DialTLSContext to DialContext, which meant it was falling back to using the std net.Dialer for its dials, instead of the passed-in one. The tests only passed because they were using localhost addresses, so the std net.Dialer worked. But in prod, where a tsnet Dialer would be needed, it didn't work, and would time out for 10 seconds before resorting to the old protocol. So this fixes the tests to use an isolated in-memory network to prevent that class of problem in the future. With the test change, the old code fails and the new code passes. Thanks to @jasonodonnell for debugging! Updates #17304 Updates `3f5c560fd4` Change-Id: I3602bafd07dc6548e2c62985af9ac0afb3a0e967 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> (cherry picked from commit `8996254647`)	2025-10-29 18:20:21 -07:00
Nick Khyl	68cba300e4	VERSION.txt: this is v1.90.4 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.4	2025-10-28 13:29:24 -05:00
M. J. Fromberger	2dd72f6ec2	Revert "logtail: avoid racing eventbus subscriptions with Shutdown (#17639)" (#17684) This reverts commit `4346615d77`. We averted the shutdown race, but will need to service the subscriber even when we are not waiting for a change so that we do not delay the bus as a whole. Updates #17638 Change-Id: I5488466ed83f5ad1141c95267f5ae54878a24657 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com> (cherry picked from commit `db5815fb97`)	2025-10-28 12:50:41 -05:00
Brad Fitzpatrick	53004dded1	wgengine/magicsock: fix js/wasm crash regression loading non-existent portmapper Thanks for the report, @Need-an-AwP! Fixes #17681 Updates #9394 Change-Id: I2e0b722ef9b460bd7e79499192d1a315504ca84c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> (cherry picked from commit `edb11e0e60`)	2025-10-28 09:14:20 -07:00
srwareham	033adc398c	cmd/tailscale/cli: move JetKVM scripts to /userdata/init.d for persistence (#17610) Updates #16524 Updates jetkvm/rv1106-system#34 Signed-off-by: srwareham <ebriouscoding@gmail.com> (cherry picked from commit `f4e2720821`)	2025-10-27 15:31:05 -07:00
Max Coulombe	bad03eefa1	feature/identityfederation: strip query params on clientID (#17666) Updates #9192 Change-Id: I35c88df8a0242ecc19a23265d392ef78ac176b9d Signed-off-by: mcoulombe <max@tailscale.com> (cherry picked from commit `34e992f59d`)	2025-10-27 15:30:31 -07:00
Patrick O'Doherty	dc3c15b4c6	control/controlclient: back out HW key attestation (#17664) Temporarily back out the TPM-based hw attestation code while we debug Windows exceptions. Updates tailscale/corp#31269 Signed-off-by: Patrick O'Doherty <patrick@tailscale.com> (cherry picked from commit `a760cbe33f`)	2025-10-27 13:19:55 -07:00
Nick Khyl	c50fe71822	VERSION.txt: this is v1.90.3 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.3	2025-10-27 11:15:14 -05:00
M. J. Fromberger	597acd8663	logtail: avoid racing eventbus subscriptions with Shutdown (#17639) When the eventbus is enabled, set up the subscription for change deltas at the beginning when the client is created, rather than waiting for the first awaitInternetUp check. Otherwise, it is possible for a check to race with the client close in Shutdown, which triggers a panic. Updates #17638 Change-Id: I461c07939eca46699072b14b1814ecf28eec750c Signed-off-by: M. J. Fromberger <fromberger@tailscale.com> (cherry picked from commit `4346615d77`)	2025-10-27 10:50:13 -05:00
Claus Lensbøl	e6a3669277	net/tsdial: do not panic if setting the same eventbus twice (#17640) Updates #17638 Signed-off-by: Claus Lensbøl <claus@tailscale.com> (cherry picked from commit `fd0e541e5d`)	2025-10-27 10:49:49 -05:00
Nick Khyl	8bcd44ecf0	VERSION.txt: this is v1.90.2 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.2	2025-10-24 11:49:00 -05:00
Claus Lensbøl	b0f0bce928	health: compare warnable codes to avoid errors on release branch (#17637) This compares the warnings we actually care about and skips the unstable warnings and the changes with no warnings. Fixes #17635 Signed-off-by: Claus Lensbøl <claus@tailscale.com> (cherry picked from commit `7418583e47`)	2025-10-24 11:30:07 -05:00
Brad Fitzpatrick	c81ef9055b	util/linuxfw: fix 32-bit arm regression with iptables This fixes a regression from `dd615c8fdd` that moved the newIPTablesRunner constructor from a any-Linux-GOARCH file to one that was only amd64 and arm64, thus breaking iptables on other platforms (notably 32-bit "arm", as seen on older Pis running Buster with iptables) Tested by hand on a Raspberry Pi 2 w/ Buster + iptables for now, for lack of automated 32-bit arm tests at the moment. But filed #17629. Fixes #17623 Updates #17629 Change-Id: Iac1a3d78f35d8428821b46f0fed3f3717891c1bd Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> (cherry picked from commit `8576a802ca`)	2025-10-23 21:08:26 -07:00
Patrick O'Doherty	9fe44b3718	feature/tpm: use withSRK to probe TPM availability (#17627) On some platforms e.g. ChromeOS the owner hierarchy might not always be available to us. To avoid stale sealing exceptions later we probe to confirm it's working rather than rely solely on family indicator status. Updates #17622 Signed-off-by: Patrick O'Doherty <patrick@tailscale.com> (cherry picked from commit `672b1f0e76`)	2025-10-23 16:50:58 -07:00
Patrick O'Doherty	a8ae316858	feature/tpm: check TPM family data for compatibility (#17624) Check that the TPM we have opened is advertised as a 2.0 family device before using it for state sealing / hardware attestation. Updates #17622 Signed-off-by: Patrick O'Doherty <patrick@tailscale.com> (cherry picked from commit `36ad24b20f`)	2025-10-23 14:59:40 -07:00
Nick Khyl	75b0c6f164	VERSION.txt: this is v1.90.1 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.1	2025-10-23 11:06:03 -05:00
Nick Khyl	3c78146ece	VERSION.txt: this is v1.90.0 Signed-off-by: Nick Khyl <nickk@tailscale.com> v1.90.0	2025-10-20 11:01:07 -05:00
License Updater	4e1c270f90	licenses: update license notices Signed-off-by: License Updater <noreply+license-updater@tailscale.com>	2025-10-20 08:15:00 -07:00
Alex Chan	4673992b96	tka: created a shared testing library for Chonk This patch creates a set of tests that should be true for all implementations of Chonk and CompactableChonk, which we can share with the SQLite implementation in corp. It includes all the existing tests, plus a test for LastActiveAncestor which was in corp but not in oss. Updates https://github.com/tailscale/corp/issues/33465 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-10-20 13:13:14 +01:00
Alex Chan	c961d58091	cmd/tailscale: improve the error message for `lock log` with no lock Previously, running `tailscale lock log` in a tailnet without Tailnet Lock enabled would return a potentially confusing error: $ tailscale lock log 2025/10/20 11:07:09 failed to connect to local Tailscale service; is Tailscale running? It would return this error even if Tailscale was running. This patch fixes the error to be: $ tailscale lock log Tailnet Lock is not enabled Fixes #17586 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-10-20 12:15:57 +01:00
Max Coulombe	6a73c0bdf5	cmd/tailscale/cli,feature: add support for identity federation (#17529) Add new arguments to `tailscale up` so authkeys can be generated dynamically via identity federation. Updates #9192 Signed-off-by: mcoulombe <max@tailscale.com>	2025-10-17 18:05:32 -04:00
Brad Fitzpatrick	54cee33bae	go.toolchain.rev: update to Go 1.25.3 Updates tailscale/go#140 Updates tailscale/go#142 Updates tailscale/go#138 Change-Id: Id25b6fa4e31eee243fec17667f14cdc48243c59e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-10-17 11:27:16 -07:00
David Bond	9083ef1ac4	cmd/k8s-operator: allow pod tolerations on nameservers (#17260) This commit modifies the `DNSConfig` custom resource to allow specifying [tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) on the nameserver pods. This will allow users to dictate where their nameserver pods are located within their clusters. Fixes: https://github.com/tailscale/tailscale/issues/17092 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-10-17 18:32:30 +01:00
Andrew Lytvynov	6493206ac7	.github/workflows: pin nix-related github actions (#17574) Updates #cleanup Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-10-17 10:00:42 -07:00
Alex Chan	8d119f62ee	wgengine/magicsock: minor tidies in Test_endpoint_maybeProbeUDPLifetimeLocked * Remove a couple of single-letter `l` variables * Use named struct parameters in the test cases for readability * Delete `wantAfterInactivityForFn` parameter when it returns the default zero Updates #cleanup Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-10-17 17:10:35 +01:00
Alex Chan	55a43c3736	tka: don't look up parent/child information from purged AUMs We soft-delete AUMs when they're purged, but when we call `ChildAUMs()`, we look up soft-deleted AUMs to find the `Children` field. This patch changes the behaviour of `ChildAUMs()` so it only looks at not-deleted AUMs. This means we don't need to record child information on AUMs any more, which is a minor space saving for any newly-recorded AUMs. Updates https://github.com/tailscale/tailscale/issues/17566 Updates https://github.com/tailscale/corp/issues/27166 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-10-17 15:06:18 +01:00
Alex Chan	c3acf25d62	tka: remove an unused Mem.Orphans() method This method was added in `cca25f6` in the initial in-memory implementation of Chonk, but it's not part of the Chonk interface and isn't implemented or used anywhere else. Let's get rid of it. Updates https://github.com/tailscale/corp/issues/33465 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-10-17 14:07:21 +01:00
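
The `f58cbffda1` entry above (util/eventbus) contrasts bounded and unbounded event queues. As a rough illustration only, and not the eventbus implementation, the Go sketch below shows why a fixed capacity stalls a publisher during a burst while a growable queue keeps accepting events. The names `unboundedQueue` and `tryPublishBounded` are hypothetical; the capacity of 16 mirrors the PublishedEvent queue size mentioned in that commit message.

```go
package main

import "fmt"

// unboundedQueue is a hypothetical FIFO backed by a slice.
// Add never blocks; the cost is that memory grows with the backlog.
type unboundedQueue[T any] struct {
	items []T
}

// Add appends v to the tail of the queue.
func (q *unboundedQueue[T]) Add(v T) { q.items = append(q.items, v) }

// Pop removes and returns the head of the queue, or false if it is empty.
func (q *unboundedQueue[T]) Pop() (T, bool) {
	var zero T
	if len(q.items) == 0 {
		return zero, false
	}
	v := q.items[0]
	q.items[0] = zero // drop the reference so it can be collected
	q.items = q.items[1:]
	return v, true
}

// Len reports how many items are waiting.
func (q *unboundedQueue[T]) Len() int { return len(q.items) }

// tryPublishBounded attempts a non-blocking send on a bounded channel.
// A false result marks the point where a blocking publisher would stall.
func tryPublishBounded(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	bounded := make(chan int, 16) // assumed capacity, taken from the commit message
	unbounded := &unboundedQueue[int]{}

	stalled := 0
	for i := 0; i < 40; i++ { // a burst with no subscriber draining either queue
		if !tryPublishBounded(bounded, i) {
			stalled++
		}
		unbounded.Add(i)
	}
	fmt.Println("bounded stalled:", stalled, "unbounded len:", unbounded.Len())
}
```

In this sketch the bounded channel rejects every event after the sixteenth, which is where a blocking publisher would wait on a subscriber. If that subscriber is itself waiting on the publisher (for a mutex, or to publish its own event), neither side makes progress. The slice-backed queue accepts the whole burst, trading that stall for memory that grows with the backlog.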

9815 Commits