wgengine/magicsock: improve endpoint selection for WireGuard peers with rx time

If we don't have the ICMP hint available, such as on Android, we can use
the signal of rx traffic to bias toward a particular endpoint.

We don't want to stick to a particular endpoint for very long without
any signal, so the sticky time is reduced to 1 second. That is long
enough to avoid excessive packet reordering in the common case, yet
short enough that either rx provides a strong signal or we rotate to
another endpoint on a user-interactive timescale, which improves the
feel of failover.

Updates #8999

Co-authored-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>

Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
James Tucker
2023-08-21 17:09:35 -07:00
committed by James Tucker
parent 5edb39d032
commit e1c7e9b736
4 changed files with 177 additions and 86 deletions


@@ -1188,7 +1188,7 @@ func (c *Conn) receiveIP(b []byte, ipp netip.AddrPort, cache *ippEndpointCache)
cache.gen = de.numStopAndReset()
ep = de
}
-ep.noteRecvActivity()
+ep.noteRecvActivity(ipp)
if stats := c.stats.Load(); stats != nil {
stats.UpdateRxPhysical(ep.nodeAddr, ipp, len(b))
}
@@ -2605,6 +2605,11 @@ var (
// resetting the counter, as the first pings likely didn't get through
// the firewall)
discoPingInterval = 5 * time.Second
+// wireguardPingInterval is the minimum time between pings to an endpoint.
+// Pings are only sent if we have not observed bidirectional traffic with an
+// endpoint in at least this duration.
+wireguardPingInterval = 5 * time.Second
)
// indexSentinelDeleted is the temporary value that endpointState.index takes while