Redo the testwrapper to track and only retry flaky tests instead
of retrying the entire pkg. It also fails early if a non-flaky test fails.
This also makes it so that the go test caches are used.
Fixes#7975
Signed-off-by: Maisem Ali <maisem@tailscale.com>
The action cache restore process either matches the restore key pattern
exactly, or uses a matching prefix with the most recent date.
If the restore key is an exact match, then no updates are uploaded, but
if we've just computed tests executions for more recent code then we
will likely want to use those results in future runs.
Appending run_id to the cache key will give us an always new key, and
then we will be restore a recently uploaded cache that is more likely
has a higher overlap with the code being tested.
Updates #7975
Signed-off-by: James Tucker <james@tailscale.com>
Benchmark flags prevent test caching, so benchmarks are now executed
independently of tests.
Fixes#7975
Signed-off-by: James Tucker <james@tailscale.com>
Go artifact caching will help provided that the cache remains small
enough - we can reuse the strategy from the Windows build where we only
cache and pull the zips, but let go(1) do the many-file unpacking as it
does so faster.
The race matrix was building once without race, then running all the
tests with race, so change the matrix to incldue a `buildflags`
parameter and use that both in the build and test steps.
Updates #cleanup
Signed-off-by: James Tucker <james@tailscale.com>
We accidentally switched to ./tool/go in
4022796484 which resulted in no longer
running Windows builds, as this is attempting to run a bash script.
I was unable to quickly fix the various tests that have regressed, so
instead I've added skips referencing #7876, which we need to back and
fix.
Updates #7262
Updates #7876
Signed-off-by: James Tucker <james@tailscale.com>
We use it to gate code that depends on custom Go toolchain, but it's
currently only passed in the corp runners. Add a set on OSS so that we
can catch regressions earlier.
To specifically test sockstats this required adding a build tag to
explicitly enable them -- they're normally on for iOS, macOS and Android
only, and we don't run tests on those platforms normally.
Signed-off-by: Mihai Parparita <mihai@tailscale.com>
Github requires explicitly listing every single job within a workflow
that is required for status checks, instead of letting you list entire
workflows. This is ludicrous, and apparently this nonsense is the
workaround.
Signed-off-by: David Anderson <danderson@tailscale.com>
armv5 because that's what we ship to most downstreams right now,
armv7 becuase that's what we want to ship more of.
Fixes https://github.com/tailscale/tailscale/issues/7269
Signed-off-by: David Anderson <danderson@tailscale.com>
CI status doesn't collapse into "everything OK" if a job gets
skipped. Instead, always run the job, but skip its only step in PRs.
Signed-off-by: David Anderson <danderson@tailscale.com>
Replaces the former shell goop, which was a shell reimplementation
of a subset of version/mkversion.
Signed-off-by: David Anderson <danderson@tailscale.com>
OSS-Fuzz doesn't update their version of Go as quickly as we do, so
we sometimes end up with OSS-Fuzz being unable to build our code for
a few weeks. We don't want CI to be red for that entire time, but
we also don't want to forget to reenable fuzzing when OSS-Fuzz does
start working again.
This change makes two configurations worthy of a CI pass:
- Fuzzing works, and we expected it to work. This is a normal
happy state.
- Fuzzing didn't compile, and we expected it to not compile. This
is the "OSS-Fuzz temporarily broken" state.
If fuzzing is unexpectedly broken, or unexpectedly not broken, that's
a CI failure because we need to either address a fuzz finding, or
update TS_FUZZ_CURRENTLY_BROKEN to reflect the state of OSS-Fuzz.
Signed-off-by: David Anderson <danderson@tailscale.com>
Github's matrix runner formats the race variant as '(amd64, true)' if we
use race=true. So, change the way the variable is defined so that it says
'(amd64, race)' even if that makes the if statements a bit more complex.
Signed-off-by: David Anderson <danderson@tailscale.com>
Instead of having a dozen files that contribute CI steps with
inconsistent configs, this one file lists out everything that,
for us, constitutes "a CI run". It also enables the slack
notification webhook to notify us exactly once on a mass breakage,
rather than once for every sub-job that fails.
Signed-off-by: David Anderson <danderson@tailscale.com>