This commit adds tests to validate that there are
issues with how we propagate tag changes in the system.
This replicates #2389
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
This PR restructures the integration tests and prebuilds all common assets used in all tests:
Headscale and Tailscale HEAD image
hi binary that is used to run tests
go cache is warmed up for compilation of the test
This essentially means we spend 6-10 minutes building assets before any tests starts, when that is done, all tests can just sprint through.
It looks like we are saving 3-9 minutes per test, and since we are limited to running max 20 concurrent tests across the repo, that means we had a lot of double work.
There is currently 113 checks, so we have to do five runs of 20, and the saving should be quite noticeable! I think the "worst case" saving would be 20+min and "best case" probably towards an hour.
This PR investigates, adds tests and aims to correctly implement Tailscale's model for how Tags should be accepted, assigned and used to identify nodes in the Tailscale access and ownership model.
When evaluating in Headscale's policy, Tags are now only checked against a nodes "tags" list, which defines the source of truth for all tags for a given node. This simplifies the code for dealing with tags greatly, and should help us have less access bugs related to nodes belonging to tags or users.
A node can either be owned by a user, or a tag.
Next, to ensure the tags list on the node is correctly implemented, we first add tests for every registration scenario and combination of user, pre auth key and pre auth key with tags with the same registration expectation as observed by trying them all with the Tailscale control server. This should ensure that we implement the correct behaviour and that it does not change or break over time.
Lastly, the missing parts of the auth has been added, or changed in the cases where it was wrong. This has in large parts allowed us to delete and simplify a lot of code.
Now, tags can only be changed when a node authenticates or if set via the CLI/API. Tags can only be fully overwritten/replaced and any use of either auth or CLI will replace the current set if different.
A user owned device can be converted to a tagged device, but it cannot be changed back. A tagged device can never remove the last tag either, it has to have a minimum of one.
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.
There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.
Updates #2417Closes#2619
On Windows, if the user clicks the Tailscale icon in the system tray,
it opens a login URL in the browser.
When the login URL is opened, `state/nonce` cookies are set for that particular URL.
If the user clicks the icon again, a new login URL is opened in the browser,
and new cookies are set.
If the user proceeds with auth in the first tab,
the redirect results in a "state did not match" error.
This patch ensures that each opened login URL sets an individual cookie
that remains valid on the `/oidc/callback` page.
`TestOIDCMultipleOpenedLoginUrls` illustrates and tests this behavior.
This PR addresses some consistency issues that was introduced or discovered with the nodestore.
nodestore:
Now returns the node that is being put or updated when it is finished. This closes a race condition where when we read it back, we do not necessarily get the node with the given change and it ensures we get all the other updates from that batch write.
auth:
Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work.
A comprehensive auth test set has been created so we do not have to run integration tests to validate auth and it has allowed us to generate test cases for all the branches we currently know of.
integration:
added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching correct state)
This PR was assisted, particularly tests, by claude code.
- tailscale client gets a new AuthUrl and sets entry in the regcache
- regcache entry expires
- client doesn't know about that
- client always polls followup request а gets error
When user clicks "Login" in the app (after cache expiry), they visit
invalid URL and get "node not found in registration cache". Some clients
on Windows for e.g. can't get a new AuthUrl without restart the app.
To fix that we can issue a new reg id and return user a new valid
AuthUrl.
RegisterNode is refactored to be created with NewRegisterNode() to
autocreate channel and other stuff.
We are already being punished by github actions, there seem to be
little value in running all the tests for both databases, so only
run a few key tests to check postgres isnt broken.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* cmd/hi: add integration test runner CLI tool
Add a new CLI tool 'hi' for running headscale integration tests
with Docker automation. The tool replaces manual Docker command
composition with an automated solution.
Features:
- Run integration tests in golang:1.24 containers
- Docker context detection (supports colima and other contexts)
- Test isolation with unique run IDs and isolated control_logs
- Automatic Docker image pulling and container management
- Comprehensive cleanup operations for containers, networks, images
- Docker volume caching for Go modules
- Verbose logging and detailed test artifact reporting
- Support for PostgreSQL/SQLite selection and various test flags
Usage: go run ./cmd/hi run TestPingAllByIP --verbose
The tool uses creachadair/command and flax for CLI parsing and
provides cleanup subcommands for Docker resource management.
Updates flake.nix vendorHash for new Go dependencies.
* ci: update integration tests to use hi CLI tool
Replace manual Docker command composition in GitHub Actions
workflow with the new hi CLI tool for running integration tests.
Changes:
- Replace complex docker run command with simple 'go run ./cmd/hi run'
- Remove manual environment variable setup (handled by hi tool)
- Update artifact paths for new timestamped log directory structure
- Simplify command from 15+ lines to 3 lines
- Maintain all existing functionality (postgres/sqlite, timeout, test patterns)
The hi tool automatically handles Docker context detection, container
management, volume mounting, and environment variable setup that was
previously done manually in the workflow.
* makefile: remove test integration
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* fix issue auto approve route on register bug
This commit fixes an issue where routes where not approved
on a node during registration. This cause the auto approval
to require the node to readvertise the routes.
Fixes#2497Fixes#2485
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* hsic: only set db policy if exist
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* policy: calculate changed based on policy and filter
v1 is a bit simpler than v2, it does not pre calculate the auto approver map
and we cannot tell if it is changed.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* utility iterator for ipset
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* split policy -> policy and v1
This commit split out the common policy logic and policy implementation
into separate packages.
policy contains functions that are independent of the policy implementation,
this typically means logic that works on tailcfg types and generic formats.
In addition, it defines the PolicyManager interface which the v1 implements.
v1 is a subpackage which implements the PolicyManager using the "original"
policy implementation.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* use polivyv1 definitions in integration tests
These can be marshalled back into JSON, which the
new format might not be able to.
Also, just dont change it all to JSON strings for now.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* formatter: breaks lines
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* remove compareprefix, use tsaddr version
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* remove getacl test, add back autoapprover
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* use policy manager tag handling
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* rename display helper for user
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* introduce policy v2 package
policy v2 is built from the ground up to be stricter
and follow the same pattern for all types of resolvers.
TODO introduce
aliass
resolver
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* wire up policyv2 in integration testing
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* split policy v2 tests into seperate workflow to work around github limit
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* add policy manager output to /debug
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* update changelog
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* set state and nounce in oidc to prevent csrf
Fixes#2276
* try to fix new postgres issue
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Docker releases a patch release which changed the required permissions to be able to do tun devices in containers, this caused all containers to fail in tests causing us to fail all tests. This fixes it, and adds some tools for debugging in the future.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Setup mike to provide versioned builds of the documentation.
The goal is to have versioned docs for stable releases (0.23.0, 0.24.0)
and development docs that can progress along with the code. This allows
us to tailor docs to the next upcoming version as we no longer need to
care about diversion between rendered docs and the latest release.
Versions:
* development (alias: unstable) on each push to the main branch
* MAJOR.MINOR.PATCH (alias: stable, latest for the newest version)
* for each "final" release tag
* for each push to doc maintenance branches: doc/MAJOR.MINOR.PATCH
The default version should the current stable version. The doc
maintenance branches may be used to update the version specific
documentation when issues arise after a release.