cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies
This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return Pod IP in headers,
if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements a preshutdown endpoint in proxies. The preshutdown endpoint
logic waits until, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer served by the shutting-down Pod
(determined by looking at the new Pod IP header).
- ensures that kubelet is configured to call the preshutdown endpoint
This reduces the possibility that, as replicas are terminated during an update,
a replica gets terminated to which cluster traffic is still being routed via
the ClusterIP Service because kube proxy has not yet updated routing rules.
This is not a perfect check: in practice, it only checks that the kube
proxy on the node on which the proxy runs has updated rules. However, overall
this might be good enough.
The preshutdown logic is disabled if users have configured a custom health check
port via TS_LOCAL_ADDR_PORT env var. This change logs a warning if so, and in
future, setting that env var for operator proxies might be disallowed (as users
shouldn't need to configure this for a Pod directly).
This is backwards compatible with earlier proxy versions.
Updates tailscale/tailscale#14326
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This was flagged by @tkhattra on the merge commit; thanks!
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia8045640f02bd4dcc0fe7433249fd72ac6b9cf52
The upstream crypto package now supports sending banners at any time during
authentication, so the Tailscale fork of crypto/ssh is no longer necessary.
github.com/tailscale/golang-x-crypto is still needed for some custom ACME
autocert functionality.
tempfork/gliderlabs is still necessary because of a few other customizations,
mostly related to TTY handling.
Updates #8593
Signed-off-by: Percy Wegmann <percy@tailscale.com>
It was a temporary migration over four years ago. It's no longer
relevant.
Updates #610
Change-Id: I1f00c9485fab13ede6f77603f7d4235222c2a481
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We've been maintaining temporary dev forks of golang.org/x/crypto/{acme,ssh}
in https://github.com/tailscale/golang-x-crypto instead of using
this repo's tempfork directory as we do with other packages. The reason we were
doing that was because x/crypto/ssh depended on x/crypto/ssh/internal/poly1305
and I hadn't noticed there are forwarding wrappers already available
in x/crypto/poly1305. It also depended on internal/bcrypt_pbkdf but we don't use that
so it's easy to just delete that calling code in our tempfork/ssh.
Now that our SSH changes have been upstreamed, we can soon unfork from SSH.
That leaves ACME remaining.
This change copies our tailscale/golang-x-crypto/acme code to
tempfork/acme but adds a test that our vendored copy still matches
our tailscale/golang-x-crypto repo, where we can continue to do
development work and rebases with upstream. A comment on the new test
describes the expected workflow.
While we could continue to just import & use
tailscale/golang-x-crypto/acme, it seems a bit nicer to not have that
entire-fork-of-x-crypto visible at all in our transitive deps and the
questions that invites. Showing just a fork of an ACME client is much
less scary. It does add a step to the process of hacking on the ACME
client code, but we do that approximately never anyway, and the extra
step is very incremental compared to the existing tedious steps.
Updates #8593
Updates #10238
Change-Id: I8af4378c04c1f82e63d31bf4d16dba9f510f9199
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we were depending on the GUI(s) to do it.
By doing it in tailscaled, GUIs can be simplified and be
guaranteed to render consistent results.
If warnable A depends on warnable B and both A & B are unhealthy, only
B will be shown to the GUI as unhealthy. Once B clears up, only then
will A be presented as unhealthy.
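
A minimal sketch of that filtering rule. The warnable type and the
visibleUnhealthy function are hypothetical stand-ins, not the health
package's actual types.

```go
package main

import (
	"fmt"
	"sort"
)

// warnable is a simplified, hypothetical stand-in for a health warnable.
type warnable struct {
	name      string
	dependsOn []string
}

// visibleUnhealthy returns the sorted names of unhealthy warnables that
// should be shown: those with no unhealthy direct dependency.
func visibleUnhealthy(all []warnable, unhealthy map[string]bool) []string {
	var out []string
	for _, w := range all {
		if !unhealthy[w.name] {
			continue
		}
		hidden := false
		for _, dep := range w.dependsOn {
			if unhealthy[dep] {
				hidden = true
				break
			}
		}
		if !hidden {
			out = append(out, w.name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	all := []warnable{{name: "A", dependsOn: []string{"B"}}, {name: "B"}}
	fmt.Println(visibleUnhealthy(all, map[string]bool{"A": true, "B": true})) // [B]
	fmt.Println(visibleUnhealthy(all, map[string]bool{"A": true}))            // [A]
}
```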
Updates #14687
Change-Id: Id8566f2672d8d2d699740fa053d4e2a2c8009e83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The c2n handling code was using the Go httptest package's
ResponseRecorder code but that's in a test package which brings in
Go's test certs, etc.
This forks the httptest recorder type into its own package that only
has the recorder and adds a test that we don't re-introduce a
dependency on httptest.
Updates #12614
Change-Id: I3546f49972981e21813ece9064cc2be0b74f4b16
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The hiding of internal packages has hidden things I wanted to see a
few times now. Stop hiding them. This makes depaware.txt output a bit
longer, but not too much. Plus we only really look at it with diffs &
greps anyway; it's not like anybody reads the whole thing.
Updates #12614
Change-Id: I868c89eeeddcaaab63e82371651003629bc9bda8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This protects against rearranging packages and not catching that a BadDeps
package got moved. That would then effectively remove a test.
Updates #12614
Change-Id: I257f1eeda9e3569c867b7628d5bfb252d3354ba6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The AsDebugJSON method (used only for a LocalAPI debug call) always
needed to be updated whenever a new controlknob was added. We had a
test for it, which was nice, but it was a tedious step we don't need
to do. Use reflect instead.
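
A minimal sketch of the reflect-based approach, assuming a struct whose knobs
are atomic.Bool fields. The knobs type and its field names are hypothetical,
not the real controlknobs definition.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"sync/atomic"
)

// knobs is a hypothetical stand-in for a struct of control knobs.
type knobs struct {
	DisableFoo atomic.Bool
	EnableBar  atomic.Bool
}

// asDebugJSON walks the struct with reflect, so a newly added atomic.Bool
// field shows up in the output without any further code changes.
func (k *knobs) asDebugJSON() map[string]any {
	ret := map[string]any{}
	rv := reflect.ValueOf(k).Elem()
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		// Take the field's address so the atomic value is never copied.
		if b, ok := rv.Field(i).Addr().Interface().(*atomic.Bool); ok {
			ret[rt.Field(i).Name] = b.Load()
		}
	}
	return ret
}

func main() {
	k := &knobs{}
	k.DisableFoo.Store(true)
	out, _ := json.Marshal(k.asDebugJSON())
	fmt.Println(string(out)) // {"DisableFoo":true,"EnableBar":false}
}
```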
Updates #14788
Change-Id: If59cd776920f3ce7c748f86ed2eddd9323039a0b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We had the debug packet capture code + Lua dissector in the CLI + the
iOS app. Now we don't, with tests to lock it in.
As a bonus, tailscale.com/net/packet and tailscale.com/net/flowtrack
no longer appear in the CLI's binary either.
A new build tag ts_omit_capture disables the packet capture code and
was added to build_dist.sh's --extra-small mode.
Updates #12614
Change-Id: I79b0628c0d59911bd4d510c732284d97b0160f10
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds an envknob setting for changing the client's ACME directory URL.
This allows testing cert issuing against LE's staging environment, as
well as enabling local-only test environments, which is useful for
avoiding the production rate limits in test and development scenarios.
Fixes #14761
Change-Id: I191c840c0ca143a20e4fa54ea3b2f9b7cbfc889f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Some natc instances have been observed with excessive memory growth,
dominant in gvisor buffers. It is likely that the connection buffers are
sticking around for too long due to the default long segment time, and
uptuned buffer size applied by default in wgengine/netstack. Apply
configurations in natc specifically which are a better match for the
natc use case, most notably a 5s maximum segment lifetime.
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
Manually update the `web-client-prebuilt` package as the GitHub action
is failing for some reason.
Updates https://github.com/tailscale/tailscale/issues/14568
Signed-off-by: Mario Minardi <mario@tailscale.com>
Removing the advanced options collapsible from the web client login for
now ahead of our next client release.
Updates https://github.com/tailscale/tailscale/issues/14568
Signed-off-by: Mario Minardi <mario@tailscale.com>
The CN field is technically deprecated; additionally set the requested name in
a DNS SAN extension to maximise compatibility with RFC 8555.
Fixes #14762
Change-Id: If5d27f1e7abc519ec86489bf034ac98b2e613043
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The timeout still defaults to 2 seconds, but can now be changed via command-line flag.
Updates tailscale/corp#26045
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This interface is used both by the DERP client as well as the server.
Defining the interface in derp.go makes it clear that it is shared.
Updates tailscale/corp#26045
Signed-off-by: Percy Wegmann <percy@tailscale.com>
3dabea0fc2c added some inconsistent usage docs.
This fixes them, and adds a test.
It also adds some other tests and fixes other verb tense
inconsistencies.
Updates tailscale/corp#25278
Change-Id: I94c2a8940791bddd7c35c1c3d5fb791a317370c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In v1.78, we started acquiring the GP lock when reading policy settings. This led to a deadlock during
Tailscale installation via Group Policy Software Installation because the GP engine holds the write lock
for the duration of policy processing, which in turn waits for the installation to complete, which in turn
waits for the service to enter the running state.
In this PR, we prevent the acquisition of GP locks (aka EnterCriticalPolicySection) during service startup
and update the Windows Registry-based util/syspolicy/source.PlatformPolicyStore to handle this failure
gracefully. The GP lock is somewhat optional; it’s safe to read policy settings without it, but acquiring
the lock is recommended when reading multiple values to prevent the Group Policy engine from modifying
settings mid-read and to avoid inconsistent results.
Fixes #14416
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This was a slow memory leak on busy tailnets with lots of tagged
ephemeral nodes.
Updates tailscale/corp#26058
Change-Id: I298e7d438e3ffbb3cde795640e344671d244c632
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Still behind the same ts_omit_tap build tag.
See #14738 for background on the pattern.
Updates #12614
Change-Id: I03fb3d2bf137111e727415bd8e713d8568156ecc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we fail to parse the upstream DNS response in an app connector, we
might miss new IPs for the target domain. Log parsing errors to be able
to diagnose that.
Updates #14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Remove "unexpected" labelling of PeerGoneReasonNotHere.
A peer being no longer connected to a DERP server
is not an unexpected case and causes confusion in looking at logs.
Fixes tailscale/corp#25609
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
The new ProxyGroup-based Ingress reconciler is causing a fatal log at
startup because it has the same name as the existing Ingress reconciler.
Explicitly name both to ensure they have unique names that are consistent
with other explicitly named reconcilers.
Updates #14583
Change-Id: Ie76e3eaf3a96b1cec3d3615ea254a847447372ea
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This pulls the Wake-on-LAN (WoL) code out into its own package
(feature/wakeonlan) that registers itself with various new hooks
around tailscaled.
Then a new build tag (ts_omit_wakeonlan) causes the package to not
even be linked in the binary.
Other new packages include:
* feature: to just record which features are loaded. Future:
dependencies between features.
* feature/condregister: the package with all the build tags
that tailscaled, tsnet, and the Tailscale Xcode project
extension can empty (underscore) import to load features
as a function of the defined build tags.
Future commits will move more of our "ts_omit_foo" build tags into this
style.
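
A minimal single-file sketch of the registration pattern. The Register
function and the registered map are hypothetical; in a real layout the init
function would live in the feature's own package, and the blank import of
that package would sit in a file guarded by a build constraint such as
!ts_omit_wakeonlan.

```go
package main

import (
	"fmt"
	"sort"
)

// registered records which features were linked into the binary.
var registered = map[string]bool{}

// Register records that the named feature is present.
// It panics on duplicate registration.
func Register(name string) {
	if registered[name] {
		panic("feature already registered: " + name)
	}
	registered[name] = true
}

// In a real layout this init would be in the feature package itself, which
// is only linked in when the omit build tag is not set.
func init() {
	Register("wakeonlan")
}

// features returns the sorted names of all registered features.
func features() []string {
	var names []string
	for n := range registered {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(features()) // [wakeonlan]
}
```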
Updates #12614
Change-Id: I9c5378dafb1113b62b816aabef02714db3fc9c4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* Reapply "ipn/ipnlocal: re-advertise appc routes on startup (#14609)"
This reverts commit 51adaec35a3e4d25df88d81e6264584e151bd33d.
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* ipn/ipnlocal: fix a deadlock in readvertiseAppConnectorRoutes
Don't hold LocalBackend.mu while calling the methods of
appc.AppConnector. Those methods could call back into LocalBackend and
try to acquire its mutex.
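
A minimal sketch of the locking pattern behind this fix: copy what is needed
while holding the mutex, release it, and only then call into code that may
call back. The backend and connector types are simplified, hypothetical
stand-ins for LocalBackend and appc.AppConnector.

```go
package main

import (
	"fmt"
	"sync"
)

// connector is a stand-in whose method calls back into its owner.
type connector struct {
	onUpdate func()
}

func (c *connector) readvertise() { c.onUpdate() }

// backend is a stand-in for a type guarded by a single mutex.
type backend struct {
	mu      sync.Mutex
	conn    *connector
	updates int
}

// noteUpdate is the callback; it needs the mutex.
func (b *backend) noteUpdate() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.updates++
}

// readvertiseRoutes reads conn under the lock, then releases the lock before
// calling into the connector. Holding mu across conn.readvertise() would
// deadlock, because sync.Mutex is not reentrant.
func (b *backend) readvertiseRoutes() {
	b.mu.Lock()
	conn := b.conn
	b.mu.Unlock()
	if conn == nil {
		return
	}
	conn.readvertise()
}

func newBackend() *backend {
	b := &backend{}
	b.conn = &connector{onUpdate: b.noteUpdate}
	return b
}

func main() {
	b := newBackend()
	b.readvertiseRoutes()
	fmt.Println("updates:", b.updates) // updates: 1
}
```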
Fixes https://github.com/tailscale/corp/issues/25965
Fixes #14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
---------
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Updates tailscale/corp#25278
Adds definitions for new CLI commands getting added in v1.80. Refactors some pre-existing CLI commands within the `configure` tree to clean up code.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Rather than using a string everywhere and needing to clarify that the
string should have the svc: prefix, create a separate type for Service
names.
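
A minimal sketch of such a type. The Validate and WithoutPrefix methods are
illustrative assumptions and may not match the methods on the type this
change introduces.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ServiceName is a Service name that is expected to carry the "svc:" prefix.
type ServiceName string

// Validate reports whether s has the "svc:" prefix and a non-empty name.
func (s ServiceName) Validate() error {
	rest, ok := strings.CutPrefix(string(s), "svc:")
	if !ok {
		return fmt.Errorf("%q does not have the svc: prefix", string(s))
	}
	if rest == "" {
		return errors.New("service name is empty")
	}
	return nil
}

// WithoutPrefix returns the name with any leading "svc:" removed.
func (s ServiceName) WithoutPrefix() string {
	return strings.TrimPrefix(string(s), "svc:")
}

func main() {
	for _, s := range []ServiceName{"svc:web", "web", "svc:"} {
		fmt.Printf("%q valid=%v\n", string(s), s.Validate() == nil)
	}
}
```

With a distinct type, a plain string cannot be passed where a Service name is
expected without an explicit conversion, so the prefix requirement is
documented by the type rather than by comments.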
Updates tailscale/corp#24607
Change-Id: I720e022f61a7221644bb60955b72cacf42f59960
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>