From 28bf53f50235cc88f4ddac71f69bb24b928732e1 Mon Sep 17 00:00:00 2001
From: Brad Fitzpatrick
Date: Wed, 5 Jan 2022 11:28:08 -0800
Subject: [PATCH] wgengine/magicsock: reduce disco ping heartbeat aggressiveness a bit

Bigger changes coming later, but this should improve things a bit in
the meantime.

Rationale:

* 2 minutes -> 45 seconds: 2 minutes was overkill and never considered
  phones/battery at the time. It was totally arbitrary. 45 seconds is
  also arbitrary but is less than 2 minutes.

* heartbeat from 2 seconds to 3 seconds: in practice this meant two
  packets per second (2 pings and 2 pongs every 2 seconds) because the
  other side was also pinging us every 2 seconds on their own.
  That's just overkill. (see #540 too)

So in the worst case before: when we sent a single packet (say: a DNS
packet), we ended up sending 61 packets over 2 minutes: the 1 DNS
query and then 60 disco pings (2 minutes / 2 seconds) & received
the same (1 DNS response + 60 pongs). Now it's 15 (45 seconds / 3 seconds).

In 1.22 we plan to remove this whole timer-based heartbeat mechanism
entirely.

The 5 seconds to 6.5 seconds change is just stretching out that
interval so you can still miss two heartbeats (otherwise 3 + 3 seconds
would be greater than 5 seconds). This means that if your peer moves
without telling you, you can have a path out for 6.5 seconds
now instead of 5 seconds before disco finds a new one. That will also
improve in 1.22 when we start doing UDP+DERP at the same time
when confidence starts to go down on a UDP path.

Updates #3363

Change-Id: Ic2314bbdaf42edcdd7103014b775db9cf4facb47
Signed-off-by: Brad Fitzpatrick
---
 wgengine/magicsock/magicsock.go | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/wgengine/magicsock/magicsock.go b/wgengine/magicsock/magicsock.go
index 02269ddc9..ceece58c4 100644
--- a/wgengine/magicsock/magicsock.go
+++ b/wgengine/magicsock/magicsock.go
@@ -3274,7 +3274,7 @@ type pendingCLIPing struct {
 	// try to keep an established endpoint peering alive.
 	// It's also the idle time at which we stop doing STUN queries to
 	// keep NAT mappings alive.
-	sessionActiveTimeout = 2 * time.Minute
+	sessionActiveTimeout = 45 * time.Second
 
 	// upgradeInterval is how often we try to upgrade to a better path
 	// even if we have some non-DERP route that works.
@@ -3282,11 +3282,11 @@ type pendingCLIPing struct {
 
 	// heartbeatInterval is how often pings to the best UDP address
 	// are sent.
-	heartbeatInterval = 2 * time.Second
+	heartbeatInterval = 3 * time.Second
 
 	// trustUDPAddrDuration is how long we trust a UDP address as the exclusive
 	// path (without using DERP) without having heard a Pong reply.
-	trustUDPAddrDuration = 5 * time.Second
+	trustUDPAddrDuration = 6500 * time.Millisecond
 
 	// goodEnoughLatency is the latency at or under which we don't
 	// try to upgrade to a better path.