tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2024-11-25 19:15:34 +00:00

Author	SHA1	Message	Date
Brad Fitzpatrick	841eaacb07	net/sockstats: quiet some log spam in release builds Updates #13731 Change-Id: Ibee85426827ebb9e43a1c42a9c07c847daa50117 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-08 11:02:46 -07:00
Irbe Krumina	861dc3631c	cmd/{k8s-operator,containerboot},kube/egressservices: fix Pod IP check for dual stack clusters (#13721 ) Currently egress Services for ProxyGroup only work for Pods and Services with IPv4 addresses. Ensure that it works on dual stack clusters by reading proxy Pod's IP from the .status.podIPs list that always contains both IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that could contain IPv6 only for a dual stack cluster. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>	2024-10-08 18:35:23 +01:00
Andrew Dunham	8ee7f82bf4	net/netcheck: don't panic if a region has no Nodes Updates #13728 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I1e8319d6b2da013ae48f15113b30c9333e69cc0b	2024-10-08 12:52:27 -04:00
Tom Proctor	36cb2e4e5f	cmd/k8s-operator,k8s-operator: use default ProxyClass if set for ProxyGroup (#13720 ) The default ProxyClass can be set via helm chart or env var, and applies to all proxies that do not otherwise have an explicit ProxyClass set. This ensures proxies created by the new ProxyGroup CRD are consistent with the behaviour of existing proxies Nearby but unrelated changes: * Fix up double error logs (controller runtime logs returned errors) * Fix a couple of variable names Updates #13406 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2024-10-08 17:34:34 +01:00
Tom Proctor	cba2e76568	cmd/containerboot: simplify k8s setup logic (#13627 ) Rearrange conditionals to reduce indentation and make it a bit easier to read the logic. Also makes some error message updates for better consistency with the recent decision around capitalising resource names and the upcoming addition of config secrets. Updates #cleanup Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2024-10-08 17:13:00 +01:00
dependabot[bot]	866714a894	.github: Bump github/codeql-action from 3.26.9 to 3.26.11 (#13710 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.9 to 3.26.11. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`461ef6c76d...6db8d6351f`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-07 22:15:40 -06:00
dependabot[bot]	266c14d6ca	.github: Bump actions/cache from 4.0.2 to 4.1.0 (#13711 ) Bumps [actions/cache](https://github.com/actions/cache) from 4.0.2 to 4.1.0. - [Release notes](https://github.com/actions/cache/releases) - [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md) - [Commits](`0c45773b62...2cdf405574`) --- updated-dependencies: - dependency-name: actions/cache dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-07 20:48:06 -06:00
Nick Hill	9a73462ea4	types/lazy: add DeferredInit type It is sometimes necessary to defer initialization steps until the first actual usage or until certain prerequisites have been met. For example, policy setting and policy source registration should not occur during package initialization. Instead, they should be deferred until the syspolicy package is actually used. Additionally, any errors should be properly handled and reported, rather than causing a panic within the package's init function. In this PR, we add DeferredInit, to facilitate the registration and invocation of deferred initialization functions. Updates #12687 Signed-off-by: Nick Hill <mykola.khyl@gmail.com>	2024-10-07 15:43:22 -05:00
Brad Fitzpatrick	f3de4e96a8	derp: fix omitted word in comment Fix comment just added in `38f236c725`. Updates tailscale/corp#23668 Updates #cleanup Change-Id: Icbe112e24fcccf8c61c759c631ad09f3e5480547 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-07 12:21:10 -07:00
Irbe Krumina	7f016baa87	cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715 ) cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>	2024-10-07 20:12:56 +01:00
Brad Fitzpatrick	38f236c725	derp: add server metric for batch write sizes Updates tailscale/corp#23668 Change-Id: Ie6268c4035a3b29fd53c072c5793e4cbba93d031 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-07 11:22:51 -07:00
Erisa A	c588c36233	types/key: use tlpub: in error message (#13707 ) Fixes tailscale/corp#19442 Signed-off-by: Erisa A <erisa@tailscale.com>	2024-10-07 17:28:45 +01:00
Brad Fitzpatrick	cb10eddc26	tool/gocross: fix argument order to find To avoid warning: find: warning: you have specified the global option -maxdepth after the argument -type, but global options are not positional, i.e., -maxdepth affects tests specified before it as well as those specified after it. Please specify global options before other arguments. Fixes tailscale/corp#23689 Change-Id: I91ee260b295c552c0a029883d5e406733e081478 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-07 08:07:03 -07:00
Tom Proctor	e48cddfbb3	cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684 ) Implements the controller for the new ProxyGroup CRD, designed for running proxies in a high availability configuration. Each proxy gets its own config and state Secret, and its own tailscale node ID. We are currently mounting all of the config secrets into the container, but will stop mounting them and instead read them directly from the kube API once #13578 is implemented. Updates #13406 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2024-10-07 14:58:45 +01:00
Brad Fitzpatrick	1005cbc1e4	tailscaleroot: panic if tailscale_go build tag but Go toolchain mismatch Fixes #13527 Change-Id: I05921969a84a303b60d1b3b9227aff9865662831 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-06 15:22:04 -07:00
Brad Fitzpatrick	c48cc08de2	wgengine: stop conntrack log spam about Canonical net probes Like we do for the ones on iOS. As a bonus, this removes a caller of tsaddr.IsTailscaleIP which we want to revamp/remove soonish. Updates #13687 Change-Id: Iab576a0c48e9005c7844ab52a0aba5ba343b750e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-05 12:51:55 -07:00
Andrew Dunham	12f1bc7c77	envknob: support disk-based envknobs on the macsys build Per my investigation just now, the $HOME environment variable is unset on the macsys (standalone macOS GUI) variant, but the current working directory is valid. Look for the environment variable file in that location in addition to inside the home directory. Updates #3707 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: I481ae2e0d19b316244373e06865e3b5c3a9f3b88	2024-10-04 17:12:27 -04:00
Patrick O'Doherty	4ad3f01225	safeweb: allow passing http.Server in safeweb.Config (#13688) Extend safeweb.Config with the ability to pass a http.Server that safeweb will use to serve traffic. Updates corp#8207 Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>	2024-10-04 11:57:00 -07:00
kari-ts	8fdffb8da0	hostinfo: update SetPackage doc with new Android values (#13537 ) Fixes tailscale/corp#23283 Signed-off-by: kari-ts <kari@tailscale.com>	2024-10-04 16:35:19 +00:00
Erisa A	f30d85310c	cmd/tailscale/cli: don't print disablement secrets if init fails (#13673 ) * cmd/tailscale/cli: don't print disablement secrets if init fails Fixes tailscale/corp#11355 Signed-off-by: Erisa A <erisa@tailscale.com> * cmd/tailscale/cli: changes from code review Signed-off-by: Erisa A <erisa@tailscale.com> * cmd/tailscale/cli: small grammar change Signed-off-by: Erisa A <erisa@tailscale.com> --------- Signed-off-by: Erisa A <erisa@tailscale.com>	2024-10-04 16:01:48 +01:00
Irbe Krumina	e8bb5d1be5	cmd/{k8s-operator,containerboot},k8s-operator,kube: reconcile ExternalName Services for ProxyGroup (#13635) Adds a new reconciler that reconciles ExternalName Services that define a tailnet target that should be exposed to cluster workloads on a ProxyGroup's proxies. The reconciler ensures that for each such service, the config mounted to the proxies is updated with the tailnet target definition and that an EndpointSlice and ClusterIP Service are created for the service. Adds a new reconciler that ensures that as proxy Pods become ready to route traffic to a tailnet target, the EndpointSlice for the target is updated with the Pods' endpoints. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>	2024-10-04 13:11:35 +01:00
Irbe Krumina	9bd158cc09	cmd/containerboot,util/linuxfw: create a SNAT rule for dst/src only once, clean up if needed (#13658 ) The AddSNATRuleForDst rule was adding a new rule each time it was called including: - if a rule already existed - if a rule matching the destination, but with different desired source already existed This was causing issues especially for the in-progress egress HA proxies work, where the rules are now refreshed more frequently, so more redundant rules were being created. This change: - only creates the rule if it doesn't already exist - if a rule for the same dst, but different source is found, delete it - also ensures that egress proxies refresh firewall rules if the node's tailnet IP changes Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>	2024-10-03 20:15:00 +01:00
Patrick O'Doherty	a3c6a3a34f	safeweb: add StrictTransportSecurityOptions config (#13679 ) Add the ability to specify Strict-Transport-Security options in response to BrowserMux HTTP requests in safeweb. Updates https://github.com/tailscale/corp/issues/23375 Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>	2024-10-03 18:38:29 +00:00
Brad Fitzpatrick	dc60c8d786	ssh/tailssh: pass window size pixels in IoctlSetWinsize events Fixes #13669 Change-Id: Id44cfbb83183f1bbcbdc38c29238287b9d288707 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-03 09:24:28 -07:00
Andrea Gottardo	58c6bc2991	logpolicy: force TLS 1.3 handshake Updates tailscale/tailscale#3363 We know `log.tailscale.io` supports TLS 1.3, so we can enforce its usage in the client to shake some bytes off the TLS handshake each time a connection is opened to upload logs. Signed-off-by: Andrea Gottardo <andrea@gottardo.me>	2024-10-03 09:16:23 -07:00
Brad Fitzpatrick	5f88b65764	wgengine/netstack: check userspace ping success on Windows Hacky temporary workaround until we do #13654 correctly. Updates #13654 Change-Id: I764eaedbb112fb3a34dddb89572fec1b2543fd4a Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-03 09:07:39 -07:00
Brad Fitzpatrick	1f8eea53a8	control/controlclient: include HTTP status string in error message too Not just its code. Updates tailscale/corp#23584 Change-Id: I8001a675372fe15da797adde22f04488d8683448 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-03 08:37:16 -07:00
Brad Fitzpatrick	6f694da912	wgengine/magicsock: avoid log spam from ReceiveFunc on shutdown The new logging in `2dd71e64ac` is spammy at shutdown: Receive func ReceiveIPv6 exiting with error: net.OpError, read udp [::]:38869: raw-read udp6 [::]:38869: use of closed network connection Receive func ReceiveIPv4 exiting with error: net.OpError, read udp 0.0.0.0:36123: raw-read udp4 0.0.0.0:36123: use of closed network connection Skip it if we're in the process of shutting down. Updates #10976 Change-Id: I4f6d1c68465557eb9ffe335d43d740e499ba9786 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 20:22:12 -07:00
Naman Sood	09ec2f39b5	tailcfg: add func to check for known valid ServiceProtos (#13668 ) Updates tailscale/corp#23574. Signed-off-by: Naman Sood <mail@nsood.in>	2024-10-02 22:54:02 -04:00
Brad Fitzpatrick	383120c534	ipn/ipnlocal: don't run portlist code unless service collection is on We were selectively uploading it, but we were still gathering it, which can be a waste of CPU. Also remove a bunch of complexity that I don't think matters anymore. And add an envknob to force service collection off on a single node, even if the tailnet policy permits it. Fixes #13463 Change-Id: Ib6abe9e29d92df4ffa955225289f045eeeb279cf Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 18:08:31 -07:00
Nick Khyl	d837e0252f	wf/firewall: allow link-local multicast for permitted local routes when the killswitch is on on Windows When an Exit Node is used, we create a WFP rule to block all inbound and outbound traffic, along with several rules to permit specific types of traffic. Notably, we allow all inbound and outbound traffic to and from LocalRoutes specified in wgengine/router.Config. The list of allowed routes always includes routes for internal interfaces, such as loopback and virtual Hyper-V/WSL2 interfaces, and may also include LAN routes if the "Allow local network access" option is enabled. However, these permitting rules do not allow link-local multicast on the corresponding interfaces. This results in broken mDNS/LLMNR, and potentially other similar issues, whenever an exit node is used. In this PR, we update (wf.Firewall).UpdatePermittedRoutes() to create rules allowing outbound and inbound link-local multicast traffic to and from the permitted IP ranges, partially resolving the mDNS/LLMNR and .local name resolution issue. Since Windows does not attempt to send mDNS/LLMNR queries if a catch-all NRPT rule is present, it is still necessary to disable the creation of that rule using the disable-local-dns-override-via-nrpt nodeAttr. Updates #13571 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2024-10-02 18:36:01 -05:00
Brad Fitzpatrick	b8af93310a	tstest: add the start of a testing wishlist Of tests we wish we could easily add. One day. Updates #13038 Change-Id: If44646f8d477674bbf2c9a6e58c3cd8f94a4e8df Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 16:08:41 -07:00
Andrea Gottardo	6de6ab015f	net/dns: tweak DoH timeout, limit MaxConnsPerHost, require TLS 1.3 (#13564 ) Updates tailscale/tailscale#6148 This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query. We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established. Signed-off-by: Andrea Gottardo <andrea@gottardo.me>	2024-10-02 09:26:11 -07:00
Brad Fitzpatrick	a01b545441	control/control{client,http}: don't noise dial localhost:443 in http-only tests `1eaad7d3de` regressed some tests in another repo that were starting up a control server on `http://127.0.0.1:nnn`. Because there was no https running, and because of a bug in `1eaad7d3de` (which ended up checking the recently-dialed-control check twice in a single dial call), we ended up forcing only the use of TLS dials in a test that only had plaintext HTTP running. Instead, plumb down support for explicitly disabling TLS fallbacks and use it only when running in a test and using `http` scheme control plane URLs to 127.0.0.1 or localhost. This fixes the tests elsewhere. Updates #13597 Change-Id: I97212ded21daf0bd510891a278078daec3eebaa6 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 10:41:08 -05:00
Brad Fitzpatrick	6b03e18975	control/controlhttp: rename a param from addr to optAddr for clarity And update docs. Updates #cleanup Updates #13597 (tangentially; noted this cleanup while debugging) Change-Id: I62440294c78b0bb3f5673be10318dd89af1e1bfe Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 10:41:08 -05:00
Brad Fitzpatrick	f49d218cfe	net/dnscache: don't fall back to an IPv6 dial if we don't have IPv6 I noticed while debugging a test failure elsewhere that our failure logs (when verbosity is cranked up) were uselessly attributing dial failures to failure to dial an invalid IP address (this IPv6 address we didn't have), rather than showing me the actual IPv4 connection failure. Updates #13597 (tangentially) Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 10:41:08 -05:00
Brad Fitzpatrick	30f0fa95d9	control/controlclient: bound ReportHealthChange context lifetime to Direct client's Fixes #13651 Change-Id: I8154d3cc0ca40fe7a0223b26ae2e77e8d6ba874b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-02 10:40:39 -05:00
Andrea Gottardo	ed1ac799c8	net/captivedetection: set Timeout on net.Dialer (#13613 ) Updates tailscale/tailscale#1634 Updates tailscale/tailscale#13265 Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead. The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached. In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client). Signed-off-by: Andrea Gottardo <andrea@gottardo.me>	2024-10-02 15:29:46 +00:00
Nick Khyl	e66fe1f2e8	docs/windows/policy: add ADMX policy setting to configure the AuthKey Updates tailscale/corp#22120 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2024-10-02 09:19:19 -05:00
dependabot[bot]	992ee6dd0b	.github: Bump github/codeql-action from 3.26.8 to 3.26.9 (#13625 ) Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.8 to 3.26.9. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`294a9d9291...461ef6c76d`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-10-01 23:27:30 -06:00
Brad Fitzpatrick	262c526c4e	net/portmapper: don't treat 0.0.0.0 as a valid IP Updates tailscale/corp#23538 Change-Id: I58b8c30abe43f1d1829f01eb9fb2c1e6e8db9476 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-01 16:11:47 -05:00
Andrew Dunham	16ef88754d	net/portmapper: don't return unspecified/local external IPs We were previously not checking that the external IP that we got back from a UPnP portmap was a valid endpoint; add minimal validation that this endpoint is something that is routeable by another host. Updates tailscale/corp#23538 Signed-off-by: Andrew Dunham <andrew@du.nham.ca> Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532	2024-10-01 14:13:40 -04:00
Brad Fitzpatrick	1eaad7d3de	control/controlhttp: fix connectivity on Alaska Air wifi Updates #13597 Change-Id: Ifbf52b93fd35d64fcf80f8fddbfd610008fd8742 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-01 11:58:20 -05:00
Brad Fitzpatrick	fd32f0ddf4	control/controlhttp: factor out some code in prep for future change This pulls out the clock and forceNoise443 code into methods on the Dialer as cleanup in its own commit to make a future change less distracting. Updates #13597 Change-Id: I7001e57fe7b508605930c5b141a061b6fb908733 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-01 11:28:59 -05:00
Brad Fitzpatrick	d3f302d8e2	cmd/tailscale/cli: make 'tailscale debug ts2021' try twice In prep for a future port 80 MITM fix, make the 'debug ts2021' command retry once after a failure to give it a chance to pick a new strategy. Updates #13597 Change-Id: Icb7bad60cbf0dbec78097df4a00e9795757bc8e4 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2024-10-01 11:28:59 -05:00
Mario Minardi	8f44ba1cd6	ssh: Add logic to set accepted environment variables in SSH session (#13559 ) Add logic to set environment variables that match the SSH rule's `acceptEnv` settings in the SSH session's environment. Updates https://github.com/tailscale/corp/issues/22775 Signed-off-by: Mario Minardi <mario@tailscale.com>	2024-09-30 21:47:45 -06:00
dependabot[bot]	dd6b808acf	.github: Bump peter-evans/create-pull-request from 7.0.1 to 7.0.5 (#13626 ) Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 7.0.1 to 7.0.5. - [Release notes](https://github.com/peter-evans/create-pull-request/releases) - [Commits](`8867c4aba1...5e914681df`) --- updated-dependencies: - dependency-name: peter-evans/create-pull-request dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-30 21:12:44 -06:00
Anton Tolchanov	a70287d324	logpolicy: don't create a filch buffer if logging is disabled Updates #9549 Signed-off-by: Anton Tolchanov <commits@knyar.net>	2024-09-30 11:36:08 +02:00
Maisem Ali	fb0f8fc0ae	cmd/tsidp: add --dir flag To better control where the tsnet state is being stored. Updates #10263 Signed-off-by: Maisem Ali <maisem@tailscale.com>	2024-09-29 16:15:22 -07:00
Irbe Krumina	096b090caf	cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets (#13531) * cmd/containerboot,kube,util/linuxfw: configure kube egress proxies to route to 1+ tailnet targets This commit is the first part of the work to allow running multiple replicas of the Kubernetes operator egress proxies per tailnet service + to allow exposing multiple tailnet services via each proxy replica. This expands the existing iptables/nftables-based proxy configuration mechanism. A proxy can now be configured to route to one or more tailnet targets via a (mounted) config file that, for each tailnet target, specifies: - the target's tailnet IP or FQDN - mappings of container ports to which cluster workloads will send traffic to tailnet target ports where the traffic should be forwarded. Example configfile contents: { "some-svc": {"tailnetTarget":{"fqdn":"foo.tailnetxyz.ts.net","ports":{"tcp:4006:80":{"protocol":"tcp","matchPort":4006,"targetPort":80},"tcp:4007:443":{"protocol":"tcp","matchPort":4007,"targetPort":443}}}} } A proxy that is configured with this config file will configure firewall rules to route cluster traffic to the tailnet targets. It will then watch the config file for updates as well as monitor relevant netmap updates and reconfigure firewall as needed. This adds a bunch of new iptables/nftables functionality to make it easier to dynamically update the firewall rules without needing to restart the proxy Pod as well as to make it easier to debug/understand the rules: - for iptables, each portmapping is a DNAT rule with a comment pointing at the 'service', i.e: -A PREROUTING ! -i tailscale0 -p tcp -m tcp --dport 4006 -m comment --comment "some-svc:tcp:4006 -> tcp:80" -j DNAT --to-destination 100.64.1.18:80 Additionally there is a SNAT rule for each tailnet target, to mask the source address.
- for nftables, a separate prerouting chain is created for each tailnet target and all the portmapping rules are placed in that chain. This makes it easier to look up rules and delete services when no longer needed. (nftables allows hooking a custom chain to a prerouting hook, so no extra work is needed to ensure that the rules in the service chains are evaluated). The next steps will be to get the Kubernetes Operator to generate the configfile and ensure it is mounted to the relevant proxy nodes. Updates tailscale/tailscale#13406 Signed-off-by: Irbe Krumina <irbe@tailscale.com>	2024-09-29 16:30:53 +01:00
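
The example config file quoted in commit 096b090caf above can be read with a few lines of Go. This is only a rough sketch: the type names (`Service`, `TailnetTarget`, `PortMap`) and the `parseConfig` helper are hypothetical, inferred from the JSON keys in the commit message, and are not the types the tailscale repository actually defines. The sample JSON assumes the `"ports"` key is followed by a colon.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PortMap is a hypothetical type mirroring one port mapping entry
// from the example config file in the commit message.
type PortMap struct {
	Protocol   string `json:"protocol"`
	MatchPort  uint16 `json:"matchPort"`
	TargetPort uint16 `json:"targetPort"`
}

// TailnetTarget is a hypothetical type for the "tailnetTarget" object.
// Only the FQDN form shown in the commit message is modelled here.
type TailnetTarget struct {
	FQDN  string             `json:"fqdn"`
	Ports map[string]PortMap `json:"ports"`
}

// Service is a hypothetical type for one named egress service.
type Service struct {
	TailnetTarget TailnetTarget `json:"tailnetTarget"`
}

// exampleConfig is the config from the commit message, reformatted.
const exampleConfig = `{
  "some-svc": {
    "tailnetTarget": {
      "fqdn": "foo.tailnetxyz.ts.net",
      "ports": {
        "tcp:4006:80":  {"protocol": "tcp", "matchPort": 4006, "targetPort": 80},
        "tcp:4007:443": {"protocol": "tcp", "matchPort": 4007, "targetPort": 443}
      }
    }
  }
}`

// parseConfig decodes a config file into a map keyed by service name.
func parseConfig(data []byte) (map[string]Service, error) {
	var cfg map[string]Service
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parsing egress config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(exampleConfig))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Map iteration order is unspecified, so the lines may print in any order.
	for name, svc := range cfg {
		for key, pm := range svc.TailnetTarget.Ports {
			fmt.Printf("%s %s: %s port %d -> %s:%d\n",
				name, key, pm.Protocol, pm.MatchPort, svc.TailnetTarget.FQDN, pm.TargetPort)
		}
	}
}
```

Each decoded port mapping corresponds to one DNAT rule of the kind shown in the commit message, with `matchPort` as the `--dport` value and `targetPort` as the destination port.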


8276 Commits