health, wgengine: fix receive func health checks for the fourth time

The old implementation knew too much about how wireguard-go worked.
As a result, it missed genuine problems that occurred due to unrelated bugs.

This fourth attempt to fix the health checks takes a black box approach.
A receive func is healthy if one (or both) of these conditions holds:

* It is currently running and blocked.
* It has been executed recently.

The second condition is required because receive functions
are not continuously executing. wireguard-go calls them and then
processes their results before calling them again.

There is a theoretical false positive if wireguard-go takes
longer than one minute to process the results of a receive func execution.
If that happens, we have other problems.

Updates #1790

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
Josh Bleecher Snyder
2021-04-26 17:08:05 -07:00
parent 0d4c8cb2e1
commit 744de615f1
2 changed files with 66 additions and 0 deletions


@@ -1594,6 +1594,8 @@ func (c *Conn) receiveIPv6(b []byte) (int, conn.Endpoint, error) {
 // receiveIPv4 receives a UDP IPv4 packet. It is called by wireguard-go.
 func (c *Conn) receiveIPv4(b []byte) (n int, ep conn.Endpoint, err error) {
+	health.ReceiveIPv4.Enter()
+	defer health.ReceiveIPv4.Exit()
 	for {
 		n, ipp, err := c.pconn4.ReadFromNetaddr(b)
 		if err != nil {
@@ -1646,6 +1648,8 @@ func (c *Conn) receiveIP(b []byte, ipp netaddr.IPPort, cache *ippEndpointCache)
 // If the packet was a disco message or the peer endpoint wasn't
 // found, the returned error is errLoopAgain.
 func (c *connBind) receiveDERP(b []byte) (n int, ep conn.Endpoint, err error) {
+	health.ReceiveDERP.Enter()
+	defer health.ReceiveDERP.Exit()
 	for dm := range c.derpRecvCh {
 		if c.Closed() {
 			break