`prober.DERP` was created in #5988 based on derpprobe. Having used it
instead of derpprobe for a few months, I think we have enough confidence
that it works and can now migrate derpprobe to use the prober framework
and get rid of code duplication.
A few notable changes in behaviour:
- results of STUN probes over IPv4 and IPv6 are now reported separately;
- TLS probing now includes OCSP verification;
- probe names in the output have changed;
- the ability to send Slack notifications from the prober has been removed.
Instead, the prober now exports metrics in Expvar (/debug/vars) and
Prometheus (/debug/varz) formats.
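
As a rough illustration of the Expvar side of this, the sketch below counts
probe outcomes in an `expvar.Map`. Importing the standard `expvar` package
registers a handler at /debug/vars on `http.DefaultServeMux`. The map name
`probe_results_sketch`, the helpers `recordResult` and `counter`, and the
key layout are assumptions made for this example; they are not the names
the prober actually exports.

```go
package main

import (
	"expvar"
	"fmt"
)

// probeResults is a hypothetical metric map used only in this sketch.
// Any program that serves http.DefaultServeMux would expose it under
// /debug/vars, because importing expvar registers that handler.
var probeResults = expvar.NewMap("probe_results_sketch")

// recordResult increments the "<probe>_ok" or "<probe>_fail" counter.
func recordResult(probe string, ok bool) {
	status := "fail"
	if ok {
		status = "ok"
	}
	probeResults.Add(probe+"_"+status, 1)
}

// counter reads a counter back; it returns 0 if the key was never set.
func counter(probe, status string) int64 {
	v, ok := probeResults.Get(probe + "_" + status).(*expvar.Int)
	if !ok {
		return 0
	}
	return v.Value()
}

func main() {
	// STUN results are tracked separately for IPv4 and IPv6.
	recordResult("stun_v4", true)
	recordResult("stun_v6", false)
	fmt.Println("stun_v4 ok:", counter("stun_v4", "ok"))
	fmt.Println("stun_v6 fail:", counter("stun_v6", "fail"))
}
```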
Fixes https://github.com/tailscale/corp/issues/8497
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This updates all source files to use a new standard header for copyright
and license declaration. Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.
This commit was produced almost entirely mechanically with Perl,
followed by some minimal manual fixes.
Updates #6865
Signed-off-by: Will Norris <will@tailscale.com>
This was tested by running 10000 test iterations and observing no flakes
after this change was made.
Change-Id: Ib036fd03a3a17800132c53c838cc32bfe2961306
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
By default all probes with the same probe interval that have been added
together will run on a synchronized schedule, which results in spiky
resource usage and potential throttling by third-party systems (for
example, OCSP servers used by the TLS probes).
To address this, prober can now run in a "spread" mode that introduces
a random delay before the first run of each probe.
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
This ensures that each DERP server is probed individually (TLS and STUN)
and also manages per-region mesh probing. Actual probing code has been
copied from cmd/derpprobe.
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
sendAlert will trigger the Incident Response system.
sendWarning will post to Slack.
Co-authored-by: M. J. Fromberger <fromberger@tailscale.com>
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
The TLS prober now checks the validity period of all server certificates
and verifies the OCSP revocation status of the leaf certificate.
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
prober: add labels to Probe instances.
This gives probes, especially dynamically registered ones, many more
dimensions along which they can be sliced in Prometheus.
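
To show what such labels might look like on an exported metric, here is a
small sketch that renders a label map in the Prometheus text style. The
function `formatLabels`, the metric name `probe_ok`, and the label keys
are all made up for this example. Using `%q` for values is a
simplification that matches Prometheus escaping only for plain ASCII
values.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLabels renders labels as {k1="v1",k2="v2"} with keys sorted so
// the output is stable across runs.
func formatLabels(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return "{" + strings.Join(parts, ",") + "}"
}

func main() {
	// Hypothetical labels for a dynamically registered probe.
	labels := map[string]string{"class": "derp", "region": "nyc"}
	fmt.Printf("probe_ok%s 1\n", formatLabels(labels))
}
```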
Signed-off-by: David Anderson <danderson@tailscale.com>
Turns out, it's annoying to have to wait the entire interval
before getting any monitorable data, especially for very long
interval probes like hourly/daily checks.
Signed-off-by: David Anderson <danderson@tailscale.com>