control/controlclient,health,ipn/ipnlocal: fix deadlock by deleting health reporting

A recent change (009d702adf) introduced a deadlock: the
/machine/update-health network request, which reports the client's
health status changes to the control plane, became synchronous
within the eventbus's pump machinery.
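
To illustrate why a blocking network call inside an event pump is
risky, here is a minimal sketch of the usual alternative: a
non-blocking hand-off from the event callback to a background sender.
It is not the eventbus or health code from this repository. The names
asyncReporter, Report, Close and demo are hypothetical, and the "slow
network" is simulated with a channel.

```go
package main

import (
	"fmt"
	"sync"
)

// asyncReporter (hypothetical) queues reports and sends them from its own
// goroutine, so the caller never waits on the network.
type asyncReporter struct {
	queue chan string
	wg    sync.WaitGroup
}

// newAsyncReporter starts one background goroutine that calls send for
// every queued message, in order.
func newAsyncReporter(send func(string), buf int) *asyncReporter {
	r := &asyncReporter{queue: make(chan string, buf)}
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		for msg := range r.queue {
			send(msg) // may block for a long time; only this goroutine waits
		}
	}()
	return r
}

// Report enqueues msg without blocking. It returns false if the queue is
// full and the message was dropped.
func (r *asyncReporter) Report(msg string) bool {
	select {
	case r.queue <- msg:
		return true
	default:
		return false
	}
}

// Close stops accepting reports and waits for queued ones to be sent.
func (r *asyncReporter) Close() {
	close(r.queue)
	r.wg.Wait()
}

// demo simulates a stalled network: send blocks until gate is closed.
// Report still returns immediately, which is what keeps a pump moving.
func demo() (accepted, sent int) {
	gate := make(chan struct{})
	r := newAsyncReporter(func(string) {
		<-gate // simulated slow request
		sent++
	}, 2)
	for _, msg := range []string{"dns: failing", "dns: ok"} {
		if r.Report(msg) {
			accepted++
		}
	}
	close(gate) // let the simulated network proceed
	r.Close()   // waits for the sender, so reading sent is safe afterwards
	return accepted, sent
}

func main() {
	accepted, sent := demo()
	fmt.Println("accepted:", accepted, "sent:", sent)
}
```

The trade-off in this sketch is that a full queue drops reports rather
than stalling the caller.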

I started to make the health reporting async instead, but then we
realized that in the three years since we added it, it has barely been
used and doesn't pay for itself given how many HTTP requests it makes.

Instead, delete it all and replace it with a c2n handler, which
provides much more helpful information.
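
As a rough illustration of the pull model, the sketch below serves a
health snapshot only when asked, instead of pushing every change. It is
an assumption-laden example, not the handler added by this commit: the
healthSnapshot type, its JSON shape, and the debugHealthHandler and
render helpers are all hypothetical. Only the /debug/health path comes
from the capability version note in the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthSnapshot is a hypothetical response shape, not the real type.
type healthSnapshot struct {
	Warnings map[string]string `json:"warnings"` // subsystem -> error text
}

// debugHealthHandler returns a handler that encodes the current snapshot
// as JSON each time it is queried.
func debugHealthHandler(current func() healthSnapshot) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(current()); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	})
}

// render issues an in-memory GET against h and returns the status code
// and body, so the example needs no real network.
func render(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := debugHealthHandler(func() healthSnapshot {
		return healthSnapshot{Warnings: map[string]string{"dns": "resolver unreachable"}}
	})
	code, body := render(h, "/debug/health")
	fmt.Println(code)
	fmt.Print(body)
}
```

With this shape the node makes no request at all unless someone asks
for its health.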

Fixes tailscale/corp#32952

Change-Id: I9e8a5458269ebfdda1c752d7bbb8af2780d71b04
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Brad Fitzpatrick
2025-10-02 12:01:59 -07:00
committed by Brad Fitzpatrick
parent a208cb9fd5
commit 24e38eb729
5 changed files with 18 additions and 73 deletions

@@ -172,7 +172,8 @@ type CapabilityVersion int
 // - 125: 2025-08-11: dnstype.Resolver adds UseWithExitNode field.
 // - 126: 2025-09-17: Client uses seamless key renewal unless disabled by control (tailscale/corp#31479)
 // - 127: 2025-09-19: can handle C2N /debug/netmap.
-const CurrentCapabilityVersion CapabilityVersion = 127
+// - 128: 2025-10-02: can handle C2N /debug/health.
+const CurrentCapabilityVersion CapabilityVersion = 128
 
 // ID is an integer ID for a user, node, or login allocated by the
 // control plane.
@@ -2734,6 +2735,9 @@ type SetDNSResponse struct{}
 // node health changes to:
 //
 // POST https://<control-plane>/machine/update-health.
+//
+// As of 2025-10-02, we stopped sending this to the control plane proactively.
+// It was never useful enough with its current design and needs more thought.
 type HealthChangeRequest struct {
 	Subsys string // a health.Subsystem value in string form
 	Error  string // or empty if cleared