Commit Graph

97 Commits

Author SHA1 Message Date
Joe Tsai
49015b00fe
logtail: fix race condition with sockstats label (#8578)
Updates tailscale/corp#8427

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2023-07-11 10:51:51 -07:00
Brad Fitzpatrick
67e912824a all: adjust some build tags for wasi
A start.

Updates #8320

Change-Id: I64057f977be51ba63ce635c56d67de7ecec415d1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-06-11 09:45:46 -07:00
Joe Tsai
84c99fe0d9
logtail: be less aggressive about re-uploads (#8117)
The retry logic was pathological in the following ways:

* If we restarted the logging service, any pending uploads
would be placed in a retry-loop where it depended on backoff.Backoff,
which was too aggresive. It would retry failures within milliseconds,
taking at least 10 retries to hit a delay of 1 second.

* In the event where a logstream was rate limited,
the aggressive retry logic would severely exacerbate the problem
since each retry would also log an error message.
It is by chance that the rate of log error spam
does not happen to exceed the rate limit itself.

We modify the retry logic in the following ways:

* We now respect the "Retry-After" header sent by the logging service.

* Lacking a "Retry-After" header, we retry after a hard-coded period of
30 to 60 seconds. This avoids the thundering-herd effect when all nodes
try reconnecting to the logging service at the same time after a restart.

* We do not treat a status 400 as having been uploaded.
This is simply not the behavior of the logging service.

Updates #tailscale/corp#11213

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2023-05-11 12:52:35 -07:00
Mihai Parparita
4722f7e322 all: move network monitoring from wgengine/monitor to net/netmon
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).

Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.

Updates #7621
Updates #7850

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-20 10:15:59 -07:00
Mihai Parparita
edb02b63f8 net/sockstats: pass in logger to sockstats.WithSockStats
Using log.Printf may end up being printed out to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).

However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.

Updates tailscale/corp#9230

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-04-12 18:40:03 -07:00
Will Norris
62a1e9a44f log/sockstatlog: add delay before writing logs to disk
Split apart polling of sockstats and logging them to disk.  Add a 3
second delay before writing logs to disk to prevent an infinite upload
loop when uploading stats to logcatcher.

Fixes #7719

Signed-off-by: Will Norris <will@tailscale.com>
2023-03-29 13:10:42 -07:00
Mihai Parparita
a2be1aabfa logtail: remove unncessary response read
Effectively reverts #249, since the server side was fixed (with #251?)
to send a 200 OK/content-length 0 response.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-03-08 15:39:04 -08:00
Mihai Parparita
6ac6ddbb47 sockstats: switch label to enum
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-03-06 15:54:35 -08:00
Maisem Ali
1a30b2d73f all: use tstest.Replace more
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2023-03-04 12:24:55 -08:00
Joe Tsai
283a84724f
types/logid: simplify implementation (#7415)
Share the same underlying implementation for both PrivateID and PublicID.
For the shared methods, declare them in the same order.
Only keep documentation on methods without obvious meaning.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2023-03-02 13:18:04 -08:00
Joe Tsai
7e4788e383
logtail: delete ID types and functions (#7412)
These have been moved to the types/logid package.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2023-03-01 12:18:23 -08:00
Mihai Parparita
9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-03-01 12:09:31 -08:00
Joe Tsai
0d19f5d421
all: replace logtail.{Public,Private}ID with logid.{Public,Private}ID (#7404)
The log ID types were moved to a separate package so that
code that only depend on log ID types do not need to link
in the logic for the logtail client itself.
Not all code need the logtail client.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2023-02-28 19:00:00 -08:00
David Crawshaw
46467e39c2 logtail: allow multiple calls to Shutdown
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2023-02-24 19:18:32 +00:00
Mihai Parparita
f0f2b2e22b logtail: increase maximum log line size in low memory mode
The 255 byte limit was chosen more than 3 years ago (tailscale/corp@929635c9d9),
when iOS was operating under much more significant memory constraints.
With iOS 15 the network extension has an increased limit, so increasing
it to 4K should be fine.

The motivating factor was that the network interfaces being logged
by linkChange in wgengine/userspace.go were getting truncated, and it
would be useful to know why in some cases we're choosing the pdp_ip1
cell interface instead of the pdp_ip0 one.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-02-07 22:00:14 -08:00
Will Norris
71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
2023-01-27 15:36:29 -08:00
Brad Fitzpatrick
b657187a69 cmd/tailscale, logtail: add 'tailscale debug daemon-logs' logtail mechanism
Fixes #6836

Change-Id: Ia6eb39ff8972e1aa149aeeb63844a97497c2cf04
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-01-15 11:23:28 -08:00
Brad Fitzpatrick
69c0b7e712 ipn/ipnlocal: add c2n handler to flush logtail for support debugging
Updates tailscale/corp#8564

Change-Id: I0c619d4007069f90cffd319fba66bd034d63e84d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2023-01-05 12:06:07 -08:00
Mihai Parparita
8f2bc0708b logtail: make logs flush delay dynamic
Instead of a static FlushDelay configuration value, use a FlushDelayFn
function that we invoke every time we decide send logs. This will allow
mobile clients to be more dynamic about when to send logs.

Updates #6768

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2023-01-04 16:59:25 -08:00
Joe Tsai
35c10373b5
types/logid: move logtail ID types here (#6336)
Many packages reference the logtail ID types,
but unfortunately pull in the transitive dependencies of logtail.
Fix this problem by putting the log ID types in its own package
with minimal dependencies.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2022-11-28 15:25:47 -08:00
Joe Tsai
eff62b7b1b
logtail: remove MustParsePublicID (#6335)
This function is no longer necessary as you can trivially rewrite:

	logtail.MustParsePublicID(...)

with:

	must.Get(logtail.ParsePublicID(...))

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2022-11-16 15:38:27 -08:00
Brad Fitzpatrick
da8def8e13 all: remove old +build tags
The //go:build syntax was introduced in Go 1.17:

https://go.dev/doc/go1.17#build-lines

gofmt has kept the +build and go:build lines in sync since
then, but enough time has passed. Time to remove them.

Done with:

    perl -i -npe 's,^// \+build.*\n,,' $(git grep -l -F '+build')

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-11-04 07:25:42 -07:00
Brad Fitzpatrick
a04f1ff9e6 logtail: default to 2s log flush delay on all platforms
Per chat. This is close enough to realtime but massively reduces
number of HTTP requests. (which you can verify with
TS_DEBUG_LOGTAIL_WAKES and watching tailscaled run at start)

By contrast, this is set to 2 minutes on mobile.

Change-Id: Id737c7924d452de5c446df3961f5e94a43a33f1f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-10-15 09:25:12 -07:00
Brad Fitzpatrick
a315336287 logtail: change batched upload mechanism to not use CPU when idle
The mobile implementation had a 2 minute ticker going all the time
to do a channel send. Instead, schedule it as needed based on activity.

Then we can be actually idle for long periods of time.

Updates #3363

Change-Id: I0dba4150ea7b94f74382fbd10db54a82f7ef6c29
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-10-13 14:45:05 -07:00
Joe Tsai
3af0d4d0f2
logtail: always record timestamps in UTC (#5732)
Upstream optimizations to the Go time package will make
unmarshaling of time.Time 3-6x faster. See:
* https://go.dev/cl/425116
* https://go.dev/cl/425197
* https://go.dev/cl/429862

The last optimization avoids a []byte -> string allocation
if the timestamp string less than than 32B.
Unfortunately, the presence of a timezone breaks that optimization.
Drop recording of timezone as this is non-essential information.

Most of the performance gains is upon unmarshal,
but there is also a slight performance benefit to
not marshaling the timezone as well.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2022-10-05 12:27:52 -07:00
Joe Tsai
c321363d2c
logtail: support a copy ID (#5851)
The copy ID operates similar to a CC in email where
a message is sent to both the primary ID and also the copy ID.
A given log message is uploaded once, but the log server
records it twice for each ID.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2022-10-05 12:25:10 -07:00
Josh Soref
d4811f11a0 all: fix spelling mistakes
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-09-29 13:36:13 -07:00
Eng Zer Jun
f0347e841f refactor: move from io/ioutil to io and os packages
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Reference: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-09-15 21:45:53 -07:00
Joe Tsai
21cd402204
logtail: do not log when backing off (#5485) 2022-08-30 06:21:03 -07:00
Mihai Parparita
f371a1afd9 cmd/tsconnect: make logtail uploading work
Initialize logtail and provide an uploader that works in the
browser (we make a no-cors cross-origin request to avoid having to
open up the logcatcher servers to CORS).

Fixes #5147

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2022-08-04 09:10:20 -07:00
Brad Fitzpatrick
4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-08-04 07:47:59 -07:00
Brad Fitzpatrick
5381437664 logtail, net/portmapper, wgengine/magicsock: use fmt.Appendf
Fixes #5206

Change-Id: I490bb92e774ce7c044040537e2cd864fcf1dbe5a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-08-03 21:35:51 -07:00
Brad Fitzpatrick
48e73e147a logtail,logpolicy: tweak minor cosmetic things
Just reading the code again in prep for some alloc reductions.

Change-Id: I065226ea794b7ec7144c2b15942d35131c9313a8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-07-27 21:13:46 -07:00
Joe Tsai
96f73b3894
logtail: do not panic in PrivateID.PublicID (#4815)
It is not idiomatic for Go code to panic for situations that
can be normal. For example, if a server receives PrivateID
from a client, it is normal for the server to call
PrivateID.PublicID to validate that the PublicID matches.
However, doing so would panic prior to this change.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2022-06-07 17:25:05 -07:00
Brad Fitzpatrick
02e580c1d2 logtail: use http.NewRequestWithContext
Saves some allocs. Not hot, but because we can now.

And a const instead of a var.

Change-Id: Ieb2b64534ed38051c36b2c0aa2e82739d9d0e015
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-05-27 16:33:43 -07:00
Mihai Parparita
3222bce02d logtail: add instance metadata to the entry logtail
Allows instances that are running with the same machine ID (due to
cloning) to be distinguished.

Also adds sequence numbers to detect duplicates.

For tailscale/corp#5244

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2022-05-18 13:57:14 -07:00
Brad Fitzpatrick
9f1dd716e8 tailcfg, logtail: provide Debug bit to disable logtail
For people running self-hosted control planes who want a global
opt-out knob instead of running their own logcatcher.

Change-Id: I7f996c09f45850ff77b58bfd5a535e197971725a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-04-18 13:53:13 -07:00
Josh Bleecher Snyder
0868329936 all: use any instead of interface{}
My favorite part of generics.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2022-03-17 11:35:09 -07:00
Brad Fitzpatrick
5f529d1359 logtail: add Logger.PrivateID accessor
For the control plane to use.

Change-Id: I0f02321fc4fa3a41c3ece3b51eee729ea9770905
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-03-14 20:59:04 -07:00
Brad Fitzpatrick
84138450a4 types/logger, logtail: add mechanism to do structured JSON logs
e.g. the change to ipnlocal in this commit ultimately logs out:

{"logtail":{"client_time":"2022-02-17T20:40:30.511381153-08:00","server_time":"2022-02-18T04:40:31.057771504Z"},"type":"Hostinfo","val":{"GoArch":"amd64","Hostname":"tsdev","IPNVersion":"1.21.0-date.20220107","OS":"linux","OSVersion":"Debian 11.2 (bullseye); kernel=5.10.0-10-amd64"},"v":1}

Change-Id: I668646b19aeae4a2fed05170d7b279456829c844
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-02-18 12:42:06 -08:00
Josh Bleecher Snyder
1dc4151f8b logtail: add MustParsePublicID
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2022-02-14 16:00:17 -08:00
Brad Fitzpatrick
5d9ab502f3 logtail: don't strip verbose level on upload
For analysis of log spam.

Bandwidth is ~unchanged from had we not stripped the "[vN] " from
text; it just gets restructed intot he new "v":N, field.  I guess it
adds one byte.

Updates #1548

Change-Id: Ie00a4e0d511066a33d10dc38d765d92b0b044697
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-02-13 11:30:37 -08:00
Brad Fitzpatrick
86a902b201 all: adjust some log verbosity
Updates #1548

Change-Id: Ia55f1b5dc7dfea09a08c90324226fb92cd10fa00
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2022-02-12 08:51:16 -08:00
Josh Bleecher Snyder
e45d51b060 logtail: add a few new methods to PublicID
These are for use in our internal systems.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2022-01-31 14:14:10 -08:00
Josh Bleecher Snyder
9fe5ece833 logtail: cap the buffer size in encodeText
This started as an attempt to placate GitHub's code scanner,
but it's also probably generally a good idea.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2022-01-13 14:37:27 -08:00
Brad Fitzpatrick
40e2b312b6 ipn/ipnserver, logpolicy: move Windows disk logging up earlier
This moves the Windows-only initialization of the filelogger into
logpolicy. Previously we only did it when babysitting the tailscaled
subprocess, but this meant that log messages from the service itself
never made it to disk. Examples that weren't logged to disk:

* logtail unable to dial out,
* DNS flush messages from the service
* svc.ChangeRequest messages (#3581)

This is basically the same fix as #3571 but staying in the Logf type,
and avoiding build-tagged file (which wasn't quite a goal, but
happened and seemed nice)

Fixes #3570

Co-authored-by: Aaron Klotz <aaron@tailscale.com>
Change-Id: Iacd80c4720b7218365ec80ae143339d030842702
2021-12-16 12:33:04 -08:00
Brad Fitzpatrick
3b541c833e util/clientmetric, logtail: log metric changes
Updates #3307

Change-Id: I1399ebd786f6ff7defe6e11c0eb651144c071574
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-11-16 08:06:31 -08:00
Maisem Ali
05e55f4a0b logtail/filch: limit buffer file size to 50MB
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2021-10-29 13:31:30 -07:00
Josh Bleecher Snyder
94fb42d4b2 all: use testingutil.MinAllocsPerRun
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which dynamically calculate the allocations
target based on a different AllocsPerRun run.

This also allows us to tighten the "no allocs"
test in wgengine/filter.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-10-28 12:48:37 -07:00
Josh Bleecher Snyder
0c038b477f logtail: add a re-usable buffer for uploads
This avoids a per-upload alloc (which in practice
often means per-log-line), up to 4k.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-08-17 12:56:57 -07:00