tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2025-12-01 09:32:08 +00:00

Author	SHA1	Message	Date
Jonathan Nobels	822adaa259	VERSION.txt: this is v1.92.0 Signed-off-by: Jonathan Nobels <jonathan@tailscale.com> v1.92.0	2025-11-26 15:35:58 -05:00
Andrew Lytvynov	9eff8a4503	feature/tpm: return opening errors from both /dev/tpmrm0 and /dev/tpm0 (#18071) This might help users diagnose why TPM access is failing for tpmrm0. Fixes #18026 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-26 12:35:24 -06:00
Brad Fitzpatrick	8af7778ce0	util/execqueue: don't hold mutex in RunSync We don't hold q.mu while running normal ExecQueue.Add funcs, so we shouldn't in RunSync either. Otherwise code it calls can't shut down the queue, as seen in #18502. Updates #18052 Co-authored-by: Nick Khyl <nickk@tailscale.com> Change-Id: Ic5e53440411eca5e9fabac7f4a68a9f6ef026de1 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-26 10:09:23 -08:00
Alex Chan	b7658a4ad2	tstest/integration: add integration test for Tailnet Lock This patch adds an integration test for Tailnet Lock, checking that a node can't talk to peers in the tailnet until it becomes signed. This patch also introduces a new package `tstest/tkatest`, which has some helpers for constructing a mock control server that responds to TKA requests. This allows us to reduce boilerplate in the IPN tests. Updates tailscale/corp#33599 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-26 11:54:48 +00:00
Jordan Whited	824027305a	cmd/tailscale/cli,ipn,all: make peer relay server port a uint16 In preparation for exposing its configuration via ipn.ConfigVAlpha, change {Masked}Prefs.RelayServerPort from int to *uint16. This takes a defensive stance against invalid inputs at JSON decode time. 'tailscale set --relay-server-port' is currently the only input to this pref, and has always sanitized input to fit within a uint16. Updates tailscale/corp#34591 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2025-11-25 19:40:17 -08:00
Sachin Iyer	53476ce872	ipn/serve: validate service paths in HasPathHandler Fixes #17839 Signed-off-by: Sachin Iyer <siyer@detail.dev>	2025-11-25 16:27:37 -05:00
Claus Lensbøl	c54d243690	net/tstun: add TSMPDiscoAdvertisement to TSMPPing (#17995) Adds a new type of TSMP message for advertising disco keys to/from a peer, and implements the advertising triggered by a TSMP ping. Needed as part of the effort to cache the netmap and still let clients connect without control being reachable. Updates #12639 Signed-off-by: Claus Lensbøl <claus@tailscale.com> Co-authored-by: James Tucker <james@tailscale.com>	2025-11-25 15:35:38 -05:00
Alex Chan	b38dd1ae06	ipn/ipnlocal: don't panic if there are no suitable exit nodes In suggestExitNodeLocked, if no exit node candidates have a home DERP or valid location info, `bestCandidates` is an empty slice. This slice is passed to `selectNode` (`randomNode` in prod): ```go func randomNode(nodes views.Slice[tailcfg.NodeView], …) tailcfg.NodeView { … return nodes.At(rand.IntN(nodes.Len())) } ``` An empty slice becomes a call to `rand.IntN(0)`, which panics. This patch changes the behaviour, so if we've filtered out all the candidates before calling `selectNode`, reset the list and then pick from any of the available candidates. This patch also updates our tests to give us more coverage of `randomNode`, so we can spot other potential issues. Updates #17661 Change-Id: I63eb5e4494d45a1df5b1f4b1b5c6d5576322aa72 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-25 19:05:13 +00:00
Fran Bull	f4a4bab105	tsconsensus: skip integration tests in CI There is an issue to add non-integration tests: #18022 Fixes #15627 #16340 Signed-off-by: Fran Bull <fran@tailscale.com>	2025-11-25 10:53:40 -08:00
Brad Fitzpatrick	ac0b15356d	tailcfg, control/controlclient: start moving MapResponse.DefaultAutoUpdate to a nodeattr And fix up the TestAutoUpdateDefaults integration tests as they weren't testing reality: the DefaultAutoUpdate is supposed to only be relevant on the first MapResponse in the stream, but the tests weren't testing that. They were instead injecting a 2nd+ MapResponse. This changes the test control server to add a hook to modify the first map response, and then makes the test control when the node goes up and down to make new map responses. Also, the test now runs on macOS where the auto-update feature being disabled would've previously t.Skipped the whole test. Updates #11502 Change-Id: If2319bd1f71e108b57d79fe500b2acedbc76e1a6 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-25 10:45:34 -08:00
Simon Law	848978e664	ipn/ipnlocal: test traffic-steering when feature is not enabled (#17997) In PR tailscale/corp#34401, the `traffic-steering` feature flag does not automatically enable traffic steering for all nodes. Instead, an admin must add the `traffic-steering` node attribute to each client node that they want opted-in. For backwards compatibility with older clients, tailscale/corp#34401 strips out the `traffic-steering` node attribute if the feature flag is not enabled, even if it is set in the policy file. This lets us safely disable the feature flag. This PR adds a missing test case for suggested exit nodes that have no priority. Updates tailscale/corp#34399 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2025-11-25 09:21:55 -08:00
Nick Khyl	7073f246d3	ipn/ipnlocal: do not call controlclient.Client.Shutdown with b.mu held This fixes a regression in #17804 that caused a deadlock. Updates #18052 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-25 09:22:50 -06:00
David Bond	d4821cdc2f	cmd/k8s-operator: allow HA ingresses to be deleted when VIP service does not exist (#18050) This commit fixes a bug in our HA ingress reconciler where ingress resources would be stuck in a deleting state should their associated VIP service be deleted within control. The reconciliation loop would check for the existence of the VIP service and if not found perform no additional cleanup steps. The code has been modified to continue onwards even if the VIP service is not found. Fixes: https://github.com/tailscale/tailscale/issues/18049 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-25 12:41:39 +00:00
Simon Law	9c3a2aa797	ipn/ipnlocal: replace log.Printf with logf (#18045) Updates #cleanup Signed-off-by: Simon Law <sfllaw@tailscale.com>	2025-11-24 17:42:58 -08:00
Jordan Whited	7426eca163	cmd/tailscale,feature/relayserver,ipn: add relay-server-static-endpoints set flag Updates tailscale/corp#31489 Updates #17791 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2025-11-24 16:37:15 -08:00
Jordan Whited	755309c04e	net/udprelay: use blake2s-256 MAC for handshake challenge This commit replaces crypto/rand challenge generation with a blake2s-256 MAC. This enables the peer relay server to respond to multiple forward disco.BindUDPRelayEndpoint messages per handshake generation without sacrificing the proof of IP ownership properties of the handshake. Responding to multiple forward disco.BindUDPRelayEndpoint messages per handshake generation improves client address/path selection where lowest client->server path/addr one-way delay does not necessarily equate to lowest client<->server round trip delay. It also improves situations where outbound traffic is filtered independent of input, and the first reply disco.BindUDPRelayEndpointChallenge message is dropped on the reply path, but a later reply using a different source would make it through. Reduction in serverEndpoint state saves 112 bytes per instance, trading for slightly more expensive crypto ops: 277ns/op vs 321ns/op on an M1 Macbook Pro. Updates tailscale/corp#34414 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2025-11-24 14:52:34 -08:00
Tom Proctor	6637003cc8	cmd/cigocacher,go.mod: add cigocacher cmd Adds cmd/cigocacher as the client to cigocached for Go caching over HTTP. The HTTP cache is best-effort only, and builds will fall back to disk-only cache if it's not available, much like regular builds. Not yet used in CI; that will follow in another PR once we have runners available in this repo with the right network setup for reaching cigocached. Updates tailscale/corp#10808 Change-Id: I13ae1a12450eb2a05bd9843f358474243989e967 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2025-11-24 21:15:46 +00:00
Andrew Dunham	698eecda04	ipn/ipnlocal: fix panic in driveTransport on network error When the underlying transport returns a network error, the RoundTrip method returns (nil, error). The defer was attempting to access resp without checking if it was nil first, causing a panic. Fix this by checking for nil in the defer. Also changes driveTransport.tr from *http.Transport to http.RoundTripper and adds a test. Fixes #17306 Signed-off-by: Andrew Dunham <andrew@tailscale.com> Change-Id: Icf38a020b45aaa9cfbc1415d55fd8b70b978f54c	2025-11-24 10:35:23 -05:00
Andrew Dunham	a20cdb5c93	tstest/integration/testcontrol: de-flake TestUserMetricsRouteGauges SetSubnetRoutes was not sending update notifications to nodes when their approved routes changed, causing nodes to not fetch updated netmaps with PrimaryRoutes populated. This resulted in TestUserMetricsRouteGauges flaking because it waited for PrimaryRoutes to be set, which only happened if the node happened to poll for other reasons. Now send updateSelfChanged notification to affected nodes so they fetch an updated netmap immediately. Fixes #17962 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2025-11-23 21:13:23 -05:00
Andrew Dunham	16587746ed	portlist,tstest: skip tests on kernels with /proc/net/tcp regression Linux kernel versions 6.6.102-104 and 6.12.42-45 have a regression in /proc/net/tcp that causes seek operations to fail with "illegal seek". This breaks portlist tests on these kernels. Add kernel version detection for Linux systems and a SkipOnKernelVersions helper to tstest. Use it to skip affected portlist tests on the broken kernel versions. Thanks to philiptaron for the list of kernels with the issue and fix. Updates #16966 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2025-11-21 22:33:57 -05:00
Nick Khyl	1ccece0f78	util/eventbus: use unbounded event queues for DeliveredEvents in subscribers Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load. Two common scenarios trigger deadlocks when the number of events published in a short period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size): - a subscriber tries to acquire the same mutex as held by a publisher, or - a subscriber for A events publishes B events Avoiding these scenarios is not practical and would limit eventbus usefulness and reduce its adoption, pushing us back to callbacks and other legacy mechanisms. These deadlocks already occurred in customer devices, dev machines, and tests. They also make it harder to identify and fix slow subscribers and similar issues we have been seeing recently. Choosing an arbitrary large fixed queue capacity would only mask the problem. A client running on a sufficiently large and complex customer environment can exceed any meaningful constant limit, since event volume depends on the number of peers and other factors. Behavior also changes based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue is essentially a race between publishers and subscribers. Additionally, on lower-end devices, an unreasonably high constant capacity is practically the same as using unbounded queues. Therefore, this PR changes the event queue implementation to be unbounded by default. The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers' DeliveredEvent queues become unbounded. This change fixes known deadlocks and makes the system stable under load, at the cost of higher potential memory usage, including cases where a queue grows during an event burst and does not shrink when load decreases. Further improvements can be implemented in the future as needed. Fixes #17973 Fixes #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-21 16:00:12 -06:00
Jordan Whited	9245c7131b	feature/relayserver: don't publish from within a subscribe fn goroutine Updates #17830 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2025-11-21 12:28:38 -08:00
Claus Lensbøl	e7f5ca1d5e	wgengine/userspace: run link change subscribers in eventqueue (#18024) Updates #17996 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2025-11-21 14:49:37 -05:00
Nick Khyl	3780f25d51	util/eventbus: add tests for a subscriber publishing events As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to publish events itself. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-21 13:35:48 -06:00
Nick Khyl	016ccae2da	util/eventbus: add tests for a subscriber trying to acquire the same mutex as a publisher As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to acquire a mutex that can also be held by a publisher. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #17973 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2025-11-21 13:35:48 -06:00
Alex Chan	ce95bc77fb	tka: don't panic if no clock set in tka.Mem This is causing confusing panics in tailscale/corp#34485. We'll keep using the tka.ChonkMem constructor as much as we can, but don't panic if you create a tka.Mem directly -- we know what the sensible thing is. Updates #cleanup Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I49309f5f403fc26ce4f9a6cf0edc8eddf6a6f3a4	2025-11-21 17:20:28 +00:00
Andrew Lytvynov	c679aaba32	cmd/tailscaled,ipn: show a health warning when state store fails to open (#17883) With the introduction of node sealing, store.New fails in some cases due to the TPM device being reset or unavailable. Currently it results in tailscaled crashing at startup, which is not obvious to the user until they check the logs. Instead of crashing tailscaled at startup, start with an in-memory store with a health warning about state initialization and a link to (future) docs on what to do. When this health message is set, also block any login attempts to avoid masking the problem with an ephemeral node registration. Updates #15830 Updates #17654 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-20 15:52:58 -06:00
Andrew Lytvynov	de8ed203e0	go.mod: bump golang.org/x/crypto (#18011) Pick up fixes for https://pkg.go.dev/vuln/GO-2025-4134 Updates #cleanup Signed-off-by: Andrew Lytvynov <awly@tailscale.com>	2025-11-20 14:10:38 -06:00
Harry Harpham	ac74d28190	ipn/ipnlocal: add validations when setting serve config (#17950) These validations were previously performed in the CLI frontend. There are two motivations for moving these to the local backend: 1. The backend controls synchronization around the relevant state, so only the backend can guarantee many of these validations. 2. Doing these validations in the back-end avoids the need to repeat them across every frontend (e.g. the CLI and tsnet). Updates tailscale/corp#27200 Signed-off-by: Harry Harpham <harry@tailscale.com>	2025-11-20 13:40:05 -06:00
David Bond	42a5262016	cmd/k8s-operator: add multi replica support for recorders (#17864) This commit adds the `spec.replicas` field to the `Recorder` custom resource that allows for a highly available deployment of `tsrecorder` within a kubernetes cluster. Many changes were required here as the code hard-coded the assumption of a single replica. This has required a few loops, similar to what we do for the `Connector` resource to create auth and state secrets. It was also required to add a check to remove dangling state and auth secrets should the recorder be scaled down. Updates: https://github.com/tailscale/tailscale/issues/17965 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-20 11:46:34 +00:00
Jonathan Nobels	682172ca2d	net/netns: remove spammy logs for interface binding caps fixes tailscale/tailscale#17990 The logging for the netns caps is spammy. Log only on changes to the values and don't log Darwin specific stuff on non Darwin clients. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>	2025-11-19 18:19:07 -08:00
Brad Fitzpatrick	7d19813618	net/batching: fix import formatting From #17842 Updates #cleanup Change-Id: Ie041b50659361b50558d5ec1f557688d09935f7c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-19 18:13:46 -08:00
David Bond	86a849860e	cmd/k8s-operator: use stable image for k8s-nameserver (#17985) This commit modifies the kubernetes operator to use the "stable" version of `k8s-nameserver` by default. Updates: https://github.com/tailscale/corp/issues/19028 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-20 00:00:27 +00:00
KevinLiang10	a0d059d74c	cmd/tailscale/cli: allow remote target as service destination (#17607) This commit enables users to set a service backend to a remote destination, which can be a partial URL or a full URL. The commit also prevents users from setting remote destinations on Linux systems when the socket mark is not working. Users on any version of the Mac extension can't serve a service either. Socket mark usability is determined by a new local API. Fixes tailscale/corp#24783 Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>	2025-11-19 12:29:08 -05:00
License Updater	12c598de28	licenses: update license notices Signed-off-by: License Updater <noreply+license-updater@tailscale.com>	2025-11-19 07:06:18 -08:00
Alex Chan	976bf24f5e	ipn/ipnlocal: remove the always-true CanSupportNetworkLock() Now that we support using an in-memory backend for TKA state (#17946), this function always returns `nil` – we can always support Network Lock. We don't need it any more. Plus, clean up a couple of errant TODOs from that PR. Updates tailscale/corp#33599 Change-Id: Ief93bb9adebb82b9ad1b3e406d1ae9d2fa234877 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 14:51:13 +00:00
Brad Fitzpatrick	6ac4356bce	util/eventbus: simplify some reflect in Bus.pump Updates #cleanup Change-Id: Ib7b497e22c6cdd80578c69cf728d45754e6f909e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-19 06:23:34 -08:00
Alex Chan	336df56f85	cmd/tailscale/cli: remove Latin abbreviations from CLI help text Our style guide recommends avoiding Latin abbreviations in technical documentation, which includes the CLI help text. This is causing linter issues for the docs site, because this help text is copied into the docs. See http://go/style-guide/kb/language-and-grammar/abbreviations#latin-abbreviations Updates #cleanup Change-Id: I980c28d996466f0503aaaa65127685f4af608039 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 13:22:13 +00:00
Alex Chan	aeda3e8183	ipn/ipnlocal: reduce profileManager boilerplate in network-lock tests Updates tailscale/corp#33537 Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 13:21:52 +00:00
Raj Singh	62d64c05e1	cmd/k8s-operator: fix type comparison in apiserver proxy template (#17981) ArgoCD sends boolean values but the template expects strings, causing "incompatible types for comparison" errors. Wrap values with toString so both work. Fixes #17158 Signed-off-by: Raj Singh <raj@tailscale.com>	2025-11-19 13:08:40 +00:00
Alex Chan	e1dd9222d4	ipn/ipnlocal, tka: compact TKA state after every sync Previously a TKA compaction would only run when a node starts, which means a long-running node could use unbounded storage as it accumulates ever-increasing amounts of TKA state. This patch changes TKA so it runs a compaction after every sync. Updates https://github.com/tailscale/corp/issues/33537 Change-Id: I91df887ea0c5a5b00cb6caced85aeffa2a4b24ee Signed-off-by: Alex Chan <alexc@tailscale.com>	2025-11-19 12:27:04 +00:00
David Bond	38ccdbe35c	cmd/k8s-operator: default to stable image (#17848) This commit modifies the helm/static manifest configuration for the k8s-operator to prefer the stable image tag. This prevents those using static manifests from seeing unstable behaviour by default if they do not manually make the change. This is managed for us when using helm but not when generating the static manifests. Updates https://github.com/tailscale/tailscale/issues/10655 Signed-off-by: David Bond <davidsbond93@gmail.com>	2025-11-19 11:57:27 +00:00
Brad Fitzpatrick	408336a089	feature/featuretags: add CacheNetMap feature tag for upcoming work (trying to get in smaller obvious chunks ahead of later PRs to make them smaller) Updates #17925 Change-Id: I184002001055790484e4792af8ffe2a9a2465b2e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 18:11:20 -08:00
Brad Fitzpatrick	5b0c57f497	tailcfg: add some omitzero, adjust some omitempty to omitzero Updates tailscale/corp#25406 Change-Id: I7832dbe3dce3774bcc831e3111feb75bcc9e021d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 17:36:27 -08:00
Joe Tsai	3b865d7c33	cmd/netlogfmt: support resolving IP addresses to synonymous labels (#17955) We now embed node information into network flow logs. By default, netlogfmt still prints out using Tailscale IP addresses. Support a "--resolve-addrs=TYPE" flag that can be used to specify resolving IP addresses as node IDs, hostnames, users, or tags. Updates tailscale/corp#33352 Signed-off-by: Joe Tsai <joetsai@digital-static.net>	2025-11-18 14:16:27 -08:00
James Tucker	c09c95ef67	types/key,wgengine/magicsock,control/controlclient,ipn: add debug disco key rotation Adds the ability to rotate discovery keys on running clients, needed for testing upcoming disco key distribution changes. Introduces key.DiscoKey, an atomic container for a disco private key, public key, and the public key's ShortString, replacing the prior separate atomic fields. magicsock.Conn has a new RotateDiscoKey method, and access to this is provided via localapi and a CLI debug command. Note that this implementation is primarily for testing as it stands, and regular use should likely introduce an additional mechanism that allows the old key to be used for some time, to provide a seamless key rotation rather than one that invalidates all sessions. Updates tailscale/corp#34037 Signed-off-by: James Tucker <james@tailscale.com>	2025-11-18 12:16:15 -08:00
Fran Bull	da508c504d	appc: add ippool type As part of the conn25 work we will want to be able to keep track of a pool of IP Addresses and know which have been used and which have not. Fixes tailscale/corp#34247 Signed-off-by: Fran Bull <fran@tailscale.com>	2025-11-18 10:46:28 -08:00
Alex Chan	d0daa5a398	tka: marshal AUMHash totext even if Tailnet Lock is omitted We use `tka.AUMHash` in `netmap.NetworkMap`, and we serialise it as JSON in the `/debug/netmap` C2N endpoint. If the binary omits Tailnet Lock support, the debug endpoint returns an error because it's unable to marshal the AUMHash. This patch adds a sentinel value so this marshalling works, and we can use the debug endpoint. Updates https://github.com/tailscale/tailscale/issues/17115 Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I51ec1491a74e9b9f49d1766abd89681049e09ce4	2025-11-18 18:34:09 +00:00
Anton Tolchanov	04a9d25a54	tka: mark young AUMs as active even if the chain is long Existing compaction logic seems to have had an assumption that markActiveChain would cover a longer part of the chain than markYoungAUMs. This prevented long, but fresh, chains, from being compacted correctly. Updates tailscale/corp#33537 Signed-off-by: Anton Tolchanov <anton@tailscale.com>	2025-11-18 18:04:12 +00:00
Brad Fitzpatrick	bd29b189fe	types/netmap,*: remove some redundant fields from NetMap Updates #12639 Change-Id: Ia50b15529bd1c002cdd2c937cdfbe69c06fa2dc8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2025-11-18 07:56:10 -08:00
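
Note on commit b38dd1ae06 (ipn/ipnlocal: don't panic if there are no suitable exit nodes): the failure mode it describes can be reproduced outside the codebase. The sketch below is illustrative only and is not the code from that commit. It uses plain string slices instead of views.Slice[tailcfg.NodeView], and the names pickRandom and suggest are hypothetical. The final guard for a completely empty candidate list is an assumption; the commit message does not say how that case is handled.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// pickRandom has the same shape as the randomNode snippet quoted in the
// commit message: it indexes with rand.IntN(len(nodes)), so it panics
// when nodes is empty, because rand.IntN panics for n <= 0.
func pickRandom(nodes []string) string {
	return nodes[rand.IntN(len(nodes))]
}

// suggest is a hypothetical stand-in for the described fix: if filtering
// left no preferred candidates, reset to the full list before picking.
func suggest(all, preferred []string) (string, bool) {
	if len(preferred) == 0 {
		preferred = all // fall back to any available candidate
	}
	if len(preferred) == 0 {
		// Assumption: with no candidates at all, report "no suggestion"
		// rather than calling pickRandom on an empty slice.
		return "", false
	}
	return pickRandom(preferred), true
}

func main() {
	all := []string{"exit-a", "exit-b"}
	// The filter removed every candidate, so preferred is empty.
	node, ok := suggest(all, nil)
	fmt.Println(node, ok)
}
```

Calling pickRandom directly with an empty slice reproduces the panic; going through suggest avoids it by widening the pool first.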

9946 Commits