This moves the Windows-only initialization of the filelogger into
logpolicy. Previously we only did it when babysitting the tailscaled
subprocess, but this meant that log messages from the service itself
never made it to disk. Examples that weren't logged to disk:
* logtail unable to dial out,
* DNS flush messages from the service
* svc.ChangeRequest messages (#3581)
This is basically the same fix as #3571 but staying in the Logf type,
and avoiding build-tagged file (which wasn't quite a goal, but
happened and seemed nice)
Fixes#3570
Co-authored-by: Aaron Klotz <aaron@tailscale.com>
Change-Id: Iacd80c4720b7218365ec80ae143339d030842702
The code goes to some effort to send a single JSON object
when there's only a single line and a JSON array when there
are multiple lines.
It makes the code more complex and more expensive;
when we add a second line, we have to use a second buffer
to duplicate the first one after adding a leading square brackets.
The savings come to two bytes. Instead, always send an array.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
When tailscaled starts up, these lines run:
func run() error {
// ...
pol := logpolicy.New("tailnode.log.tailscale.io")
pol.SetVerbosityLevel(args.verbose)
// ...
}
If there are old log entries present, they immediate start getting uploaded. This races with the call to pol.SetVerbosityLevel.
This manifested itself as a test failure in tailscale.com/tstest/integration
when run with -race:
WARNING: DATA RACE
Read at 0x00c0001bc970 by goroutine 24:
tailscale.com/logtail.(*Logger).Write()
/Users/josh/t/corp/oss/logtail/logtail.go:517 +0x27c
log.(*Logger).Output()
/Users/josh/go/ts/src/log/log.go:184 +0x2b8
log.Printf()
/Users/josh/go/ts/src/log/log.go:323 +0x94
tailscale.com/logpolicy.newLogtailTransport.func1()
/Users/josh/t/corp/oss/logpolicy/logpolicy.go:509 +0x36c
net/http.(*Transport).dial()
/Users/josh/go/ts/src/net/http/transport.go:1168 +0x238
net/http.(*Transport).dialConn()
/Users/josh/go/ts/src/net/http/transport.go:1606 +0x21d0
net/http.(*Transport).dialConnFor()
/Users/josh/go/ts/src/net/http/transport.go:1448 +0xe4
Previous write at 0x00c0001bc970 by main goroutine:
tailscale.com/logtail.(*Logger).SetVerbosityLevel()
/Users/josh/t/corp/oss/logtail/logtail.go:131 +0x98
tailscale.com/logpolicy.(*Policy).SetVerbosityLevel()
/Users/josh/t/corp/oss/logpolicy/logpolicy.go:463 +0x60
main.run()
/Users/josh/t/corp/oss/cmd/tailscaled/tailscaled.go:178 +0x50
main.main()
/Users/josh/t/corp/oss/cmd/tailscaled/tailscaled.go:163 +0x71c
Goroutine 24 (running) created at:
net/http.(*Transport).queueForDial()
/Users/josh/go/ts/src/net/http/transport.go:1417 +0x4d8
net/http.(*Transport).getConn()
/Users/josh/go/ts/src/net/http/transport.go:1371 +0x5b8
net/http.(*Transport).roundTrip()
/Users/josh/go/ts/src/net/http/transport.go:585 +0x7f4
net/http.(*Transport).RoundTrip()
/Users/josh/go/ts/src/net/http/roundtrip.go:17 +0x30
net/http.send()
/Users/josh/go/ts/src/net/http/client.go:251 +0x4f0
net/http.(*Client).send()
/Users/josh/go/ts/src/net/http/client.go:175 +0x148
net/http.(*Client).do()
/Users/josh/go/ts/src/net/http/client.go:717 +0x1d0
net/http.(*Client).Do()
/Users/josh/go/ts/src/net/http/client.go:585 +0x358
tailscale.com/logtail.(*Logger).upload()
/Users/josh/t/corp/oss/logtail/logtail.go:367 +0x334
tailscale.com/logtail.(*Logger).uploading()
/Users/josh/t/corp/oss/logtail/logtail.go:289 +0xec
Rather than complicate the logpolicy API,
allow the verbosity to be adjusted concurrently.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Part of overall effort to clean up, unify, use link monitoring more,
and make Tailscale quieter when all networks are down. This is especially
bad on macOS where we can get killed for not being polite it seems.
(But we should be polite in any case)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Log levels can now be specified with "[v1] " or "[v2] " substrings
that are then stripped and filtered at the final logger. This follows
our existing "[unexpected]" etc convention and doesn't require a
wholesale reworking of our logging at the moment.
cmd/tailscaled then gets a new --verbose=N flag to take a log level
that controls what gets logged to stderr (and thus systemd, syslog,
etc). Logtail is unaffected by --verbose.
This commit doesn't add annotations to any existing log prints. That
is in the next commit.
Updates #924
Updates #282
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add content length hints to headers.
The server can use these hints to more efficiently select buffers.
Stop attempting to compress tiny requests.
The bandwidth savings are negligible (and sometimes negative!),
and it makes extra work for the server.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Also, bit of behavior change: on non-nil err but expired context,
don't reset the consecutive failure count. I don't think the old
behavior was intentional.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We want to run bo.Backoff() after every upload, regardless. If
upload==true but err!=nil, we weren't backing off, which caused some
very-high-throughput log upload retries in bad network conditions.
Updates #282.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.
This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.
To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
We'll be fixing the server so this won't trigger in practice,
but it demos the connection reuse problem.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Its semantics has changed slightly, this will let us use it to
drive batched logging in special circumstances.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>