derp/derphttp: fix race in mesh watcher

The derphttp client automatically reconnects upon failure.

RunWatchConnectionLoop called derphttp.Client.WatchConnectionChanges
once, but that wrapper method called the underlying
derp.Client.WatchConnectionChanges exactly once on derphttp.Client's
currently active connection. If there's a failure, we need to re-subscribe
upon all reconnections.

This removes the derphttp.Client.WatchConnectionChanges method, which
was basically impossible to use correctly, and changes it to be a
boolean field on derphttp.Client alongside MeshKey and IsProber. Then
it moves the call to the underlying derp.Client.WatchConnectionChanges
to derphttp's client connection code, so it's resubscribed on any
reconnect.

Some paranoia is then added to make sure people hold the API right,
not calling derphttp.Client.RunWatchConnectionLoop on an
already-started Client without having set the bool to true. (But still
auto-setting it to true if that's the first method that's been called
on that derphttp.Client, as is commonly the case, and prevents
existing code from breaking)

Fixes tailscale/corp#9916
Supercedes tailscale/tailscale#9719

Co-authored-by: Val <valerie@tailscale.com>
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Anton Tolchanov <anton@tailscale.com>
Signed-off-by: Brad Fitzpatrick <brad@danga.com>
This commit is contained in:
Brad Fitzpatrick
2023-10-25 11:59:06 -07:00
committed by Anton Tolchanov
parent df4b730438
commit 3d7fb6c21d
3 changed files with 43 additions and 32 deletions

View File

@@ -14,20 +14,30 @@ import (
"tailscale.com/types/logger"
)
// RunWatchConnectionLoop loops until ctx is done, sending WatchConnectionChanges and subscribing to
// connection changes.
// RunWatchConnectionLoop loops until ctx is done, sending
// WatchConnectionChanges and subscribing to connection changes.
//
// If the server's public key is ignoreServerKey, RunWatchConnectionLoop returns.
// If the server's public key is ignoreServerKey, RunWatchConnectionLoop
// returns.
//
// Otherwise, the add and remove funcs are called as clients come & go.
//
// infoLogf, if non-nil, is the logger to write periodic status
// updates about how many peers are on the server. Error log output is
// set to the c's logger, regardless of infoLogf's value.
// infoLogf, if non-nil, is the logger to write periodic status updates about
// how many peers are on the server. Error log output is set to the c's logger,
// regardless of infoLogf's value.
//
// To force RunWatchConnectionLoop to return quickly, its ctx needs to
// be closed, and c itself needs to be closed.
// To force RunWatchConnectionLoop to return quickly, its ctx needs to be
// closed, and c itself needs to be closed.
//
// It is a fatal error to call this on an already-started Client withoutq having
// initialized Client.WatchConnectionChanges to true.
func (c *Client) RunWatchConnectionLoop(ctx context.Context, ignoreServerKey key.NodePublic, infoLogf logger.Logf, add func(key.NodePublic, netip.AddrPort), remove func(key.NodePublic)) {
if !c.WatchConnectionChanges {
if c.isStarted() {
panic("invalid use of RunWatchConnectionLoop on already-started Client without setting Client.RunWatchConnectionLoop")
}
c.WatchConnectionChanges = true
}
if infoLogf == nil {
infoLogf = logger.Discard
}
@@ -101,14 +111,6 @@ func (c *Client) RunWatchConnectionLoop(ctx context.Context, ignoreServerKey key
}
for ctx.Err() == nil {
err := c.WatchConnectionChanges()
if err != nil {
clear()
logf("WatchConnectionChanges: %v", err)
sleep(retryInterval)
continue
}
if c.ServerPublicKey() == ignoreServerKey {
logf("detected self-connect; ignoring host")
return