This repository was archived by the owner on Jul 13, 2025. It is now read-only.

Fork Sync: Update from parent repository #36

Open
github-actions[bot] wants to merge 1667 commits into MultiMx:main from tailscale:main

Conversation

@github-actions


No description provided.

franbull and others added 30 commits May 8, 2026 08:12
If a DNS query for a domain that should be routed through a connector
results in CNAME records in the response, collapse the CNAME chain to an
A/AAAA record for the domain -> magic IP.

Fixes tailscale/corp#39978

Signed-off-by: Fran Bull <fran@tailscale.com>
…9660)

When a peer is not able to connect to control after a restart and is
using a cached netmap, that node should be able to connect to another
peer in its tailnet (given that the home DERP of that peer has not
changed in the meantime).

Add test that starts two peers and connects them to a tailnet with
caching enabled. Then blackhole traffic to control from one peer and
restart it. Verify that the connection between the two ends up direct.

Adds facilities for expecting a certain path type between nodes.

Updates: #19597

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Updates tailscale/corp#39975

Signed-off-by: Fran Bull <fran@tailscale.com>
Make it possible to remove the least recently used expired address
assignment from addrAssignments.
Before checking out a new address from the IP pools, return a handful of
expired addresses.

Updates tailscale/corp#39975

Signed-off-by: Fran Bull <fran@tailscale.com>
There is a 30-second timeout set on client TLS connections but the handshake was
called on the wrong connection and so the timeout was never used in practice.

Signed-off-by: Francois Marier <francois@fmarier.org>
…-generic core

Splits SubscriberFunc[T] into:

  - SubscriberFunc[T]: a thin user-facing facade that holds only a
    pointer to a non-generic core. It exposes Close() to user code,
    which forwards to the core.
  - subscriberFuncCore: a non-generic struct that owns all the
    subscriber state (stop flag, unregister, logf, slow timer,
    cached reflect.Type) and implements the bus's package-private
    subscriber interface. Its dispatch() invokes a closure
    captured at construction time that performs the
    vals.Peek().Event.(T) type assertion and runs the user
    callback on the unboxed value.

The bus's outputs map and subscriber-interface itab are
parameterized only by *subscriberFuncCore, not by T, eliminating
both the per-T itab and the per-T generic dictionary that
previously scaled with the number of subscribed event types.
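
A minimal sketch of the facade-plus-core split described above, not the
eventbus code itself. The names NewSubscriberFunc, call, and the shape of the
subscriber interface are assumptions made for illustration; only the idea
(one non-generic type implements the interface, the generic part is a thin
wrapper holding a closure) comes from the text.

```go
package main

import (
	"fmt"
	"reflect"
)

// subscriber is the non-generic interface the bus stores (hypothetical shape).
type subscriber interface {
	subscribeType() reflect.Type
	dispatch(ev any) bool
}

// subscriberCore owns the state and implements subscriber exactly once,
// no matter how many event types are subscribed.
type subscriberCore struct {
	typ  reflect.Type
	call func(ev any) bool // closure built by the generic facade
}

func (c *subscriberCore) subscribeType() reflect.Type { return c.typ }
func (c *subscriberCore) dispatch(ev any) bool        { return c.call(ev) }

// SubscriberFunc is the thin generic facade: it holds only a pointer.
type SubscriberFunc[T any] struct{ core *subscriberCore }

// NewSubscriberFunc does the two T-bound things: compute the type and
// build the closure that performs the type assertion.
func NewSubscriberFunc[T any](f func(T)) (*SubscriberFunc[T], subscriber) {
	core := &subscriberCore{
		typ: reflect.TypeFor[T](),
		call: func(ev any) bool {
			v, ok := ev.(T)
			if ok {
				f(v)
			}
			return ok
		},
	}
	return &SubscriberFunc[T]{core: core}, core
}

func main() {
	_, sub := NewSubscriberFunc(func(s string) { fmt.Println("got", s) })
	fmt.Println(sub.subscribeType(), sub.dispatch("hello"), sub.dispatch(7))
}
```

Because only *subscriberCore satisfies the interface, the compiler needs one
itab for it rather than one per instantiation of the generic type.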

Measured impact (util/eventbus/sizetest):

  total per-flow binary cost:
    linux/amd64:  3039.2 B/flow -> 2252.8 B/flow  (-786.4 B / -25.9%)
    linux/arm64:  3145.7 B/flow -> 2228.2 B/flow  (-917.5 B / -29.2%)

  SubscriberFunc per-receiver attribution:
    linux/amd64:   840.8 B/flow ->  300.8 B/flow  (-540.0 B / -64.2%)
    linux/arm64:   849.9 B/flow ->  303.8 B/flow  (-546.1 B / -64.3%)

Dropped per-T symbols (200-flow eventbus binary):

  - (*SubscriberFunc[T]).dispatch     was 26,639 B total (130 B/T)
  - (*SubscriberFunc[T]).subscribeType was  3,600 B total ( 18 B/T)
  - .dict.SubscriberFunc[T]            was 14,400 B total ( 72 B/T)
  - go:itab.*SubscriberFunc[T],...     was  9,600 B total ( 48 B/T)

Of the original 841 B/flow attributed to SubscriberFunc, 540 B/flow
is now gone, dropping the receiver to 301 B/flow.

Behavior is unchanged: BenchmarkBasicThroughput is within noise
(1955 -> 1941 ns/op on the test box) and all eventbus tests pass.

Updates #12614

Change-Id: I646b3b05fd8d95f9afead59bfd0f69cd18b7a709
Signed-off-by: James Tucker <james@tailscale.com>
…ic core

Mirrors the same refactor previously applied to SubscriberFunc:

  - Publisher[T]: a thin user-facing facade. Holds a pointer to a
    non-generic publisherCore and exposes Publish/Close/ShouldPublish.
  - publisherCore: a non-generic struct that owns the *Client back-
    pointer, stop flag, and cached reflect.Type. It implements the
    package-private publisher interface (publishType, Close).
    The bus's per-Client publisher set is set.Set[publisher] keyed
    on this single non-generic type.

The publisher interface only exists to support diagnostic
introspection (Debugger.PublishTypes returning the list of types a
client publishes). Previously, satisfying that diagnostic-only
interface forced *Publisher[T] to be the implementor and cost a
per-T itab, generic dictionary, and equality function on every
event type ever passed through Publish[T]. Moving the
implementation to a non-generic core lets the diagnostic surface
work unchanged while charging zero per-T cost for the
diagnostic-driven generic interface.

Publisher[T].Publish is also slimmed: the channel/select/stopFlag
loop is now a non-generic publish() helper that takes the value as
'any'. The per-T body is reduced to forwarding the boxed value to
the helper.

Measured impact (util/eventbus/sizetest):

  total per-flow binary cost:
    linux/amd64:  2252.8 B/flow -> 1900.5 B/flow  (-352.3 B / -15.6%)
    linux/arm64:  2228.2 B/flow -> 1835.0 B/flow  (-393.2 B / -17.6%)

  Publisher per-receiver attribution:
    linux/amd64:   635.2 B/flow ->  369.6 B/flow  (-265.6 B / -41.8%)
    linux/arm64:   751.7 B/flow ->  373.2 B/flow  (-378.5 B / -50.4%)

Cumulative reduction from the original baseline (5167ff412):
    linux/amd64:  3096.6 B/flow -> 1900.5 B/flow  (-1196.1 B / -38.6%)
    linux/arm64:  3145.7 B/flow -> 1835.0 B/flow  (-1310.7 B / -41.7%)

Dropped per-T symbols (200-flow eventbus binary):

  - .dict.Publisher[T]                   was 14,400 B (72 B/T)
  - type:.eq.Publisher[T]                was 11,832 B (58 B/T)
  - go:itab.*Publisher[T],publisher      was  8,000 B (40 B/T)
  - (*Publisher[T]).Close shape stencils collapsed to 1

Behavior is unchanged: BenchmarkBasicThroughput is within noise
(2018 -> 2038 ns/op at -benchtime=2s) and all eventbus tests pass.

Updates #12614

Change-Id: I61979c2bf95d2a711c2321e6e0b4b7d15980e9f5
Signed-off-by: James Tucker <james@tailscale.com>
The natlab vmtest suite (tstest/natlab/vmtest) and the integration nat
tests are gated behind --run-vm-tests because they need KVM and are
slow. Until now nothing in CI exercised them apart from a single
canary TestEasyEasy run on every PR.

Add .github/workflows/natlab-test.yml that runs the full opt-in suite
on demand (workflow_dispatch), on PRs labeled "natlab", and on main
every 12 hours via cron. The workflow has two phases:

  - "prepare" builds the gokrazy VM image, downloads the Ubuntu and
    FreeBSD cloud images once via the new natlabprep tool, and emits
    a dynamic JSON matrix of every TestX function it finds in the two
    opt-in packages.
  - "test" is a per-test matrix that depends on prepare. Each matrix
    job restores the shared caches and runs a single test, so adding
    a new TestFoo is automatically picked up on the next run without
    any workflow edits.

Rename the existing natlab-integrationtest.yml to natlab-basic.yml
since it's the small smoke variant (just TestEasyEasy on every PR);
the new natlab-test.yml is the bigger suite. The job inside is
renamed to EasyEasy for the same reason.

Move the macOS arm64 host check from vmtest.Env.Start into
vmtest.Env.AddNode so a test that adds a vmtest.MacOS node skips
immediately on a non-macOS host, and add an explicit
skipIfNotMacOSArm64 helper at the top of the two macOS-only tests
so the platform requirement is obvious to readers.

Quiet the takeAgentConnOne miss log in tstest/natlab/vnet by default
(it was the overwhelming majority of bytes in CI logs, with no signal
in healthy runs) and replace it with a periodic "still waiting" line
that only fires after 10s, so a truly stuck agent connection still
surfaces.

Updates #13038

Change-Id: I4582098d8865200fd5a73a9b696942319ccf3bf0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
startCloudQEMU hardcoded -machine q35,accel=kvm and -cpu host,
which fails on any host without KVM (notably macOS). Replace
with a qemuAccelArgs helper that probes /dev/kvm and falls back
to QEMU's TCG software emulation, matching the pattern already
used by tstest/integration/nat. Also wire the helper into
startGokrazyQEMU so gokrazy VMs pick up KVM when available.
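
A rough sketch of the probe-and-fall-back idea, not the qemuAccelArgs helper
added by this change. The function names, probing by opening /dev/kvm
read-write, and the `-cpu max` choice under TCG are assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// kvmAvailable reports whether /dev/kvm can be opened read-write.
// (Assumed probe; the real helper may check differently.)
func kvmAvailable() bool {
	f, err := os.OpenFile("/dev/kvm", os.O_RDWR, 0)
	if err != nil {
		return false
	}
	f.Close()
	return true
}

// accelArgs returns QEMU machine/CPU flags for the given KVM availability.
func accelArgs(haveKVM bool) []string {
	if haveKVM {
		return []string{"-machine", "q35,accel=kvm", "-cpu", "host"}
	}
	// TCG is QEMU's software emulation; "-cpu host" is not valid without KVM.
	return []string{"-machine", "q35,accel=tcg", "-cpu", "max"}
}

func main() {
	fmt.Println(accelArgs(kvmAvailable()))
}
```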

Updates #13038

Change-Id: I7745518db823279b1880957bb14ca2ffdaab4c50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
macOS limits Unix socket paths to 104 bytes. The Go test TempDir
path (e.g. /var/folders/.../TestDirectConnection...679197086/001/)
easily exceeds that, causing "bind: invalid argument". Create a
short /tmp/vmtest* directory for all socket files (vnet, QMP,
dgram) so the paths stay well under the limit on every platform.
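
A small sketch of the short-directory approach, assuming a /tmp that exists
and is writable. The helper names shortSockDir and sockPath are hypothetical;
the 104-byte limit and the /tmp/vmtest* prefix come from the text above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// maxSockPath is the macOS Unix socket path limit cited above.
const maxSockPath = 104

// shortSockDir creates a short /tmp/vmtest* directory for socket files.
func shortSockDir() (dir string, cleanup func(), err error) {
	dir, err = os.MkdirTemp("/tmp", "vmtest")
	if err != nil {
		return "", nil, err
	}
	return dir, func() { os.RemoveAll(dir) }, nil
}

// sockPath joins dir and name and rejects paths that would not fit.
func sockPath(dir, name string) (string, error) {
	p := filepath.Join(dir, name)
	if len(p) >= maxSockPath {
		return "", fmt.Errorf("socket path is %d bytes, limit is %d", len(p), maxSockPath)
	}
	return p, nil
}

func main() {
	dir, cleanup, err := shortSockDir()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()
	p, err := sockPath(dir, "qmp.sock")
	fmt.Println(len(p) < maxSockPath, err)
}
```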

Updates #13038

Change-Id: I721d24561d1766aaa964692bc77f40a131aa9455
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
…d cache type name

Two changes that share the same intent of reducing per-T duplication
in code that doesn't actually depend on T:

1. Hoist the non-generic portion of newSubscriberFunc[T] into a
   newSubscriberFuncCore() helper. The hoisted work is the
   timer setup, the subscriberFuncCore allocation, and the
   unregister closure (which captures only the non-generic
   reflect.Type and *subscribeState). The generic body now does
   only the two T-bound things it has to: compute reflect.TypeFor[T]
   and create the dispatch closure.

   Effect on the per-shape-stencil body of newSubscriberFunc[T]:
     before: 523 B per shape (in synthetic test)
     after:  293 B per shape (-230 B per shape; -44% on this body)

2. Cache reflect.Type.String() once at construction (in core.typeName)
   instead of recomputing it every time the dispatch closure runs.
   The dispatch closure also now takes the *subscriberFuncCore directly
   rather than building an intermediate dispatchFuncState struct on
   every call.

   Effect on the dispatch closure body (newSubscriberFunc[T].func1):
     before: 581 B per shape
     after:  480 B per shape (-101 B per shape; -17%)

Combined effect on tailscaled (linux/amd64):
  named-symbol savings via symcost: ~7 KB
  stripped binary delta:            -8 KB (page-quantized)
  arm64 binary delta:                0 (page-quantized)

  cumulative reduction from baseline (5167ff412):
    linux/amd64:  -110,592 bytes (-0.391%)
    linux/arm64:  -131,072 bytes (-0.499%)

Throughput is also improved by the typeName cache: BenchmarkBasic
goes from 2018 ns/op to 1864 ns/op (-7.6%) because the dispatch hot
path no longer allocates a string on every event.

Updates #12614

Change-Id: Ib3a3d6796785e16506330ec034e1144580d467a3
Signed-off-by: James Tucker <james@tailscale.com>
…onnectivity (#19699)

Add new clientmetric counters for establishing contact with peers while using
cached network map data. To do this, instrument the magicsock.Conn with a bit
to indicate whether its peer data came from a cached netmap. If so, there are
two conditions we will count as establishing connectivity to a peer:

  - Receipt of a CallMeMaybe from a peer via disco.
  - Establishing a valid endpoint address for a peer.

In vmtest, add Env.ClientMetrics to scrape metrics from the specified node.
Use this to check that counters were updated in caching tests.

Updates tailscale/projects#13
Updates #12639

Change-Id: Ie8cf3244ac8af4f5bcfe4d0d944078da2ba08990
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Fixes #12778

Change-Id: If9f8b299cef0cb68f93b344845b5c6a5b7554d2c
Signed-off-by: DeedleFake <deedlefake@users.noreply.github.com>
…services

Adds two new cap resolution methods alongside the existing PeerCaps:

PeerCapsForService(src netip.Addr, svcName tailcfg.ServiceName) resolves
the service name to its VIP addresses via the node's service IP mappings
and returns caps scoped to that service. Exposed on /v0/whois via the
svc_name query parameter and on client/local.Client as WhoIsForService.

PeerCapsForIP(src, dst netip.Addr) resolves caps against an arbitrary
destination IP. Exposed on /v0/whois via the svc_addr query parameter
and on client/local.Client as WhoIsForIP.

svc_name takes priority over svc_addr when both are present. Invalid
values for either return 400. The existing PeerCaps/WhoIs path is
unchanged: without a service parameter, WhoIs returns only host-level
caps.
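
The parameter handling can be sketched roughly as below. This is not the
/v0/whois handler: the whoisTarget type, the function names, the "svc:"
prefix check, and ignoring svc_addr entirely when svc_name is present are all
assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/netip"
	"net/url"
	"strings"
)

// whoisTarget is a hypothetical result type; the zero value means
// "no service parameter", i.e. host-level caps only.
type whoisTarget struct {
	svcName string
	svcAddr netip.Addr
}

// parseWhoisTarget applies the priority rule: svc_name wins over svc_addr.
func parseWhoisTarget(q url.Values) (whoisTarget, int) {
	if q.Has("svc_name") {
		name := q.Get("svc_name")
		// Assumed validity rule: non-empty name with a "svc:" prefix.
		if !strings.HasPrefix(name, "svc:") || len(name) == len("svc:") {
			return whoisTarget{}, http.StatusBadRequest
		}
		return whoisTarget{svcName: name}, http.StatusOK
	}
	if q.Has("svc_addr") {
		a, err := netip.ParseAddr(q.Get("svc_addr"))
		if err != nil {
			return whoisTarget{}, http.StatusBadRequest
		}
		return whoisTarget{svcAddr: a}, http.StatusOK
	}
	return whoisTarget{}, http.StatusOK
}

// parseQuery is a convenience wrapper for raw query strings.
func parseQuery(raw string) (whoisTarget, int) {
	q, err := url.ParseQuery(raw)
	if err != nil {
		return whoisTarget{}, http.StatusBadRequest
	}
	return parseWhoisTarget(q)
}

func main() {
	fmt.Println(parseQuery("svc_name=svc:db&svc_addr=100.100.1.2"))
	fmt.Println(parseQuery("svc_addr=not-an-ip"))
}
```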

Updates tailscale/corp#41632

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
Replace the process-global Server.mu lookup in the packet send hot path
with a global hashtriemap mirror of local clientSet entries. The
authoritative clients map remains guarded by Server.mu; clientsAtomic is
only a lock-free fast path for active local clients.

Misses, stale inactive client sets, duplicate accounting, and mesh
forwarding still fall back to lookupDestUncached. This avoids taking
Server.mu for the common local active-client send path, at the cost of
adding one global concurrent map that mirrors Server.clients for local
peers.

The benchmark uses four destination peers. The before run sets
TS_DEBUG_DERP_DISABLE_PEER_HASHTRIE=true to force the old mutex lookup
path; the after run uses the hashtrie fast path.
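
The mirror-plus-fallback pattern looks roughly like the sketch below. It
swaps in the standard library's sync.Map as a stand-in for the hashtriemap
used by the change, and the server, client, and lookup names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type client struct{ name string }

type server struct {
	mu      sync.Mutex
	clients map[string]*client // authoritative, guarded by mu
	fast    sync.Map           // lock-free mirror: string -> *client
}

func newServer() *server { return &server{clients: make(map[string]*client)} }

// register updates the authoritative map and the mirror under mu.
func (s *server) register(key string, c *client) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.clients[key] = c
	s.fast.Store(key, c)
}

func (s *server) unregister(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.clients, key)
	s.fast.Delete(key)
}

// lookup tries the mirror first and only takes mu on a miss.
func (s *server) lookup(key string) (c *client, fastPath bool) {
	if v, ok := s.fast.Load(key); ok {
		return v.(*client), true
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.clients[key], false
}

func main() {
	s := newServer()
	s.register("peerA", &client{name: "a"})
	c, fast := s.lookup("peerA")
	fmt.Println(c.name, fast)
}
```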

    goos: linux
    goarch: amd64
    pkg: tailscale.com/derp/derpserver
    cpu: Intel(R) Xeon(R) 6975P-C
                          │    before     │                after                │
                          │    sec/op     │   sec/op     vs base                │
    LookupDestHashTrie-16   176.050n ± 1%   1.904n ± 6%  -98.92% (p=0.000 n=10)

                          │   before   │             after              │
                          │    B/op    │    B/op     vs base            │
    LookupDestHashTrie-16   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
    ¹ all samples are equal

                          │   before   │             after              │
                          │ allocs/op  │ allocs/op   vs base            │
    LookupDestHashTrie-16   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
    ¹ all samples are equal

Updates #3560 (very indirectly, historically)
Updates #19713 (as an alternative to that PR)

Change-Id: Ifb72e5c9854ad00e938cd24c6ab9c27312f297e8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes a log message where ipn/ipnlocal.shouldUseOneCGNATRoute
would claim that an Android machine was actually macOS.

Updates #cleanup
Updates #19652

Signed-off-by: Simon Law <sfllaw@tailscale.com>
…19721)

This patch fixes a data race in wgengine/netstack that surfaced while
running both TestTCPForwardLimits and TestTCPForwardLimits_PerClient.
Because these two tests both set up the TS_DEBUG_NETSTACK envknob, a
race happens because netstack.Impl.Close leaked its inject goroutine.
The inject goroutine also reads the TS_DEBUG_NETSTACK envknob, so if
it is still running when the next test starts, then it will break.

This patch also cleans up the tests a bit, ensuring that neither of
them runs in T.Parallel. It also adds a T.Cleanup call to clear the
envknob.

Fixes #19720

Signed-off-by: Simon Law <sfllaw@tailscale.com>
Fixes tailscale/corp#40250

Signed-off-by: Fran Bull <fran@tailscale.com>
)

Instead of having two entry points for running natlab tests, start
converting the connectivity tests to use the vmtest framework.

Grid and pair tests have yet to be moved over.

Updates #13038

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
A missing hosts file is not a fatal error. We should log it, but still proceed
and create a new one instead of failing the DNS reconfiguration completely.
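
A minimal sketch of treating a missing file as non-fatal, assuming the check
is done with errors.Is against fs.ErrNotExist. The function name
readHostsOrEmpty is hypothetical and this is not the DNS manager's code.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
)

// readHostsOrEmpty returns the file contents, or empty contents and no
// error when the file does not exist. Other errors are still returned.
func readHostsOrEmpty(path string) ([]byte, error) {
	b, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		log.Printf("hosts file %q is missing; a new one will be created", path)
		return nil, nil
	}
	return b, err
}

func main() {
	b, err := readHostsOrEmpty("/nonexistent-dir-for-sketch/hosts")
	fmt.Println(len(b), err)
}
```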

Fixes #19733

Signed-off-by: Nick Khyl <nickk@tailscale.com>
Adds a new NoiseRoundTripper field to tsd.Sys
to expose an http.RoundTripper to make requests
over the control plane Noise connection.

This will be used in PAM use cases soon.

Updates tailscale/corp#41800

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
…ns unchanged

Warnables with a non-zero TimeToVisible are only published on the eventbus when
they remain unhealthy long enough to become visible.

However, we still publish a health.Change when a warning that was never visible
(and was never published to the eventbus) becomes healthy.

This PR fixes that and reduces churn when there is no actual state change. In
particular, it avoids unnecessary IPN bus notifications sent to GUI/CLI clients,
captive portal detection, etc.

Updates tailscale/corp#39759 (noticed while working on it)

Signed-off-by: Nick Khyl <nickk@tailscale.com>
Server.clientsAtomic was introduced in 6b72979 as a lock-free
mirror of Server.clients to skip Server.mu on the packet send hot
path. This drops the non-concurrent map and makes all the existing
callers of the old plain map just use the concurrent map, but still
holding Server.mu.

BenchmarkLookupDestHashTrie is unchanged at ~2ns/op.

Fixes #19726

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I0894e4d86914d152b9b5fef969a3184bcb96f678
…etry

Brings Subscriber[T] in line with the same non-generic-core pattern already
applied to SubscriberFunc[T] and Publisher[T]:

  - Renames subscriberFuncCore to subscriberCore and shares it between
    Subscriber[T] and SubscriberFunc[T]. Both typed facades hold a
    *subscriberCore plus their respective per-T delivery state
    (Subscriber: chan T; SubscriberFunc: nothing, the user callback is
    captured in the dispatch closure).

  - The bus's outputs map and subscriber-interface itab key on
    *subscriberCore for both subscriber kinds, so adding a new Subscribe[T]
    call site no longer pays a per-T itab, dictionary, or equality function
    for the subscriber-interface side.

  - Subscribe[T] now hoists the non-generic constructor portion into
    newSubscriberCore (timer setup, core allocation, cached type/typeName,
    unregister method-value), matching SubscribeFunc.

The dispatch loop is intentionally NOT extracted to a non-generic helper for
Subscriber[T], unlike SubscriberFunc[T]. The reason is the typed channel send
'case s.read <- t:' must appear lexically inside the select; the only way to
lift it into a non-generic loop is to bridge typed and untyped via a per-event
goroutine, which costs ~2.7x throughput on BenchmarkBasicThroughput. We keep
dispatchTyped on the generic facade and accept the per-shape stencil cost as
the cheaper alternative.

Symbol-level effect on tailscaled (linux/amd64, measured via
`go tool nm -size`):

  Before:
    (*Subscriber[T]).dispatch
      2 shape stencils:        1,682 + 1,549 = 3,231 B
      3 thin per-T wrappers:   124 B each   =   372 B
      2 deferwrap1 helpers:    62 B each    =   124 B
      total:                                 3,727 B

  After:
    (*Subscriber[T]).dispatchTyped
      2 shape stencils:        1,678 + 1,582 = 3,260 B
      0 per-T wrappers (replaced by closure stored on core)
      2 deferwrap1 helpers:    62 B each    =   124 B
      total:                                 3,384 B

  dispatch path .text delta:                   -343 B (-9.2%)

Per-shape stencils are ~1,600 B (.text body) + ~1,100 B (pclntab) =
~2,700 B each on production tailscaled. The shape count matches before/after
(two distinct GC shapes for the Subscriber[T] event types in this binary).
What changes is that the per-T thin wrappers are eliminated because
Subscriber[T] no longer implements the subscriber interface directly.

Whole-binary section deltas:

  .text:        -2,304 B  (includes the dispatch savings plus other
                            small downstream effects)
  .rodata:        +512 B  (additional closure-type metadata)
  .gopclntab:   -2,981 B  (fewer per-T compiled functions => less metadata)

Stripped tailscaled (linux/amd64): no change at the file level (the savings
fall below the linker's section-alignment boundary). Unstripped builds shrink
by ~2,900 B.

Behavior is unchanged:
  BenchmarkBasicThroughput:       2,161 ns/op,  0 B/op,  0 allocs/op
  BenchmarkBasicFuncThroughput:   2,493 ns/op, 144 B/op, 2 allocs/op
  BenchmarkSubsThroughput:        3,727 ns/op,  0 B/op,  0 allocs/op

Updates #12614

Change-Id: I97918ec68bd2cdb15958bbfd7687592b39663efe
Signed-off-by: James Tucker <james@tailscale.com>
…eck (#19725)

Fix the following issues:

1. Endianness Bug: The nftables runner used hardcoded
   big-endian byte arrays for firewall mark values (0xff0000, etc.), breaking
   bitwise operations on little-endian systems (all x86/x64, ARM). This caused
   connmark save/restore rules to silently fail. Fixed by using
   binary.NativeEndian to generate correct byte order for the host system.

2. Connmark Restore Conditional Check: The connmark restore
   mechanism unconditionally overwrote packet marks, even when Tailscale
   hadn't set any mark bits in conntrack. This destroyed mark bits set by
   other systems (VPNs, policy routing, vendor flags), breaking coexistence.
   Fixed by adding a conditional check to only restore when (ct mark &
   0xff0000) != 0, preventing the worst case of wiping all marks to zero.

Changes:
- util/linuxfw/linuxfw.go: Added nativeEndianUint32() helper and updated
  all mask functions to use native byte order instead of hardcoded bytes
- util/linuxfw/nftables_runner.go: Added conditional check in
  makeConnmarkRestoreExprs() to only restore when ct mark has Tailscale
  bits set; added detailed comment about bit preservation limitations
- util/linuxfw/iptables_runner.go: Added conditional check using -m
  connmark ! --mark to match nftables behavior
- Tests updated: Fixed byte-level regression tests to expect little-endian
  byte sequences and verify the new conditional check

Note: Perfect bit preservation in nftables remains challenging
due to nftables expression VM limitations. The current implementation
prevents the critical case of wiping marks with zero.
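
The two fixes can be illustrated with the sketch below. The helper name
nativeEndianUint32 appears in the change list above, but its signature here
is an assumption, and shouldRestoreMark is a hypothetical plain-Go rendering
of the (ct mark & 0xff0000) != 0 condition rather than the nftables or
iptables rule itself.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// tailscaleMarkMask is the mask value mentioned in the text above.
const tailscaleMarkMask uint32 = 0xff0000

// nativeEndianUint32 encodes v in the host's byte order
// (assumed signature; binary.NativeEndian needs Go 1.21 or later).
func nativeEndianUint32(v uint32) []byte {
	b := make([]byte, 4)
	binary.NativeEndian.PutUint32(b, v)
	return b
}

// shouldRestoreMark reports whether conntrack carries any of the masked
// bits, which is the condition for restoring the packet mark.
func shouldRestoreMark(ctMark uint32) bool {
	return ctMark&tailscaleMarkMask != 0
}

func main() {
	fmt.Printf("% x\n", nativeEndianUint32(tailscaleMarkMask))
	fmt.Println(shouldRestoreMark(0x040000), shouldRestoreMark(0x1))
}
```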

Updates #3310
Fixes #11803
Related to #8555

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
The codegen path for map-of-slice-of-pointer fields skipped
nil-valued entries. That dropped the key from the map.

This broke how dns.Config.Routes uses nil values as sentinels.
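
A hand-written sketch of the behavior the generated clone code needs, assuming
a map from string to a slice of pointers. The Resolver type and cloneRoutes
function are hypothetical stand-ins, not the generated code.

```go
package main

import "fmt"

// Resolver is a hypothetical element type.
type Resolver struct{ Addr string }

// cloneRoutes deep-copies src and keeps keys whose value is nil.
func cloneRoutes(src map[string][]*Resolver) map[string][]*Resolver {
	if src == nil {
		return nil
	}
	dst := make(map[string][]*Resolver, len(src))
	for k, v := range src {
		if v == nil {
			dst[k] = nil // keep the key: a nil value is meaningful
			continue
		}
		s := make([]*Resolver, len(v))
		for i, p := range v {
			if p != nil {
				c := *p
				s[i] = &c
			}
		}
		dst[k] = s
	}
	return dst
}

func main() {
	src := map[string][]*Resolver{
		"corp.example.": nil,
		"ts.example.":   {{Addr: "100.100.100.100"}},
	}
	dst := cloneRoutes(src)
	_, ok := dst["corp.example."]
	fmt.Println(len(dst), ok)
}
```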

Fixes #19730
Fixes #19732
Fixes #19746
Fixes #19744

Change-Id: Ic6400227f4ab21b3ca0e8c0eeecf9b83d145a9ab

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
The label "natlab" is a bit confusing and also used for other things.
Instead, change the trigger label to "run-natlab-tests".

Updates #13038

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
In a lot of places, we construct an error to End a step, then immediately log
it to the governing test as test fatal. Save ourselves a bit of boilerplate by
putting methods on Step for that.

There are a couple cases this doesn't cover, e.g., where we construct the Step
outside a subtest that wants to fail individually, but it helps enough to pay
for its lines.

Updates #13038

Change-Id: I71f9900942962de16609b6b198d3ba13d6958a5f
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
…#19758)

Their version scheme is different, even though the OS is based on
Ubuntu. We need to check Zorin's version numbers to pick the right
APT_KEY_TYPE.

Updates #18925

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Add a VM-based natlab test that exercises the peer-relay feature
(feature/relayserver) end-to-end across three Tailscale nodes whose
network topology makes a direct A<->B UDP path impossible: both peers
are behind HardNAT (FreeBSD/pfSense-style endpoint-dependent NAT) with
no port-mapping services, while the relay node is behind One2OneNAT so
its STUN-discovered WAN endpoint is reachable from both peers. The
test enables the relay server via EditPrefs, then waits for an a->b
PingDisco whose PingResult.PeerRelay is set (proving magicsock chose
the peer-relay path, not DERP), and finally asserts that the relay's
DebugPeerRelaySessions LocalAPI reports the session.

The existing TestPeerRelayPing in tstest/integration runs three
tailscaled processes on the loopback interface with no NATs; this new
vmtest covers peer relay through real per-VM kernels and NATs.

To wire control-server capabilities into vmtest, also add a
PeerRelayGrants() EnvOption (sibling of AllOnline,
SameTailnetUser) that flips testcontrol.Server.PeerRelayGrants so the
wildcard packet filter grants tailcfg.PeerCapabilityRelay and
PeerCapabilityRelayTarget; without those caps magicsock won't consider
any peer a candidate relay.

Updates #13038

Change-Id: Ib3440b83ec442da0d3b89ffa48ceea9398ea9062
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
knyar and others added 30 commits June 22, 2026 12:28
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report is incremental,
the suggestion fell back to a random choice among exit nodes that are far away.

Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so should cover all DERP regions.
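
One way to picture the merged history, assuming "best recent latency" means
the minimum per region across retained reports. The report type and
bestRecentLatency function are hypothetical and do not mirror the
netcheck.Client API.

```go
package main

import (
	"fmt"
	"time"
)

// report maps a DERP region ID to its measured latency. Incremental
// reports simply omit the regions they did not probe.
type report map[int]time.Duration

// bestRecentLatency merges a history of reports, keeping the lowest
// latency seen for each region.
func bestRecentLatency(history []report) map[int]time.Duration {
	best := make(map[int]time.Duration)
	for _, r := range history {
		for region, d := range r {
			if cur, ok := best[region]; !ok || d < cur {
				best[region] = d
			}
		}
	}
	return best
}

func main() {
	full := report{1: 20 * time.Millisecond, 2: 150 * time.Millisecond}
	incremental := report{1: 25 * time.Millisecond}
	// Region 2 is still covered because the full report is in the history.
	fmt.Println(bestRecentLatency([]report{full, incremental}))
}
```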

Updates tailscale/corp#17516

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
suggestExitNodeLocked now ranks exit node candidates using the per-region
latency tracked by the netcheck Client (RecentRegionLatency), which merges
the reports retained in c.prev. That history is only useful for far-away
regions if it contains a full netcheck report, since incremental reports
only re-probe the home region and a handful of the fastest ones.

The full-report cadence in GetReport and the c.prev retention window were
two independent 5-min constants - the way we schedule netchecks ensured
that the history always contained a full report, but it was not a strong
contract and we did not have any checks around this.

Now the full report interval and retention window are driven by the same
var, and a test confirms that the history contains a full report.

Updates tailscale/corp#17516

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Fix leaking peers that failed to complete the handshake.

Updates #20183

Change-Id: I84f7ea0484f05b090d963a7d12c135a66a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Outbound packets produced by netstack (used by tailscaled with
--tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies)
enter the wrapper via InjectOutbound{,PacketBuffer} and take the
injectedRead path, which bypasses Filter.RunOut.

RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple
into the connection-tracking LRU so that Filter.RunIn admits inbound
replies that no explicit ACL rule covers. Skipping it on the injected
path meant a netstack-side dial of UDP would send fine but the reply
would be dropped as "no matching rule". The kernel-TUN path was
already fine because it goes through RunOut.
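
A toy model of the side effect described above, assuming a plain map in place
of the connection-tracking LRU. The flow and filter types and the runOut and
runIn methods are hypothetical simplifications, not the Filter API.

```go
package main

import (
	"fmt"
	"net/netip"
)

// flow identifies one direction of a UDP conversation.
type flow struct{ src, dst netip.AddrPort }

func (f flow) reverse() flow { return flow{src: f.dst, dst: f.src} }

// mustFlow builds a flow from two "ip:port" strings.
func mustFlow(src, dst string) flow {
	return flow{netip.MustParseAddrPort(src), netip.MustParseAddrPort(dst)}
}

// filter tracks expected reply flows (a map stands in for the LRU).
type filter struct{ seen map[flow]bool }

func newFilter() *filter { return &filter{seen: make(map[flow]bool)} }

// runOut records the reverse tuple so the reply can be admitted later.
func (f *filter) runOut(out flow) { f.seen[out.reverse()] = true }

// runIn admits an inbound packet only if its flow was recorded.
// (The real filter also consults ACL rules; that part is omitted.)
func (f *filter) runIn(in flow) bool { return f.seen[in] }

func main() {
	f := newFilter()
	out := mustFlow("100.64.0.1:5000", "100.64.0.2:53")
	fmt.Println(f.runIn(out.reverse())) // reply dropped: runOut was skipped
	f.runOut(out)
	fmt.Println(f.runIn(out.reverse())) // reply admitted
}
```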

Fixes #14229
Fixes #20064

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b
Both tests started flaking after my 9107354 ("tstest/natlab/vnet:
send unsolicited IPv6 Router Advertisements") added background RA
traffic on v6-enabled networks.

TestPacketSideEffects races the periodic unsolicited-RA goroutine
against its synchronous packet-count assertions: when the multicast
RA fires after the test has registered its sinks, both sinks receive
it and "got 1 packet, want N" becomes "got N+2".

TestProtocolQEMU's reader was doing raw Read on the SOCK_STREAM unix
socket and comparing the whole result to the expected length-prefixed
packet. The kernel is free to coalesce the on-register RA frame and
the test packet into one Read, in which case bytes.Equal fails and
the entire chunk (including the test packet's bytes) gets discarded
as "unexpected", leading to a 5s i/o timeout. Parse the QEMU uint32
length-prefix framing with io.ReadFull instead so we read exactly one
frame per iteration regardless of how the kernel buffers them. The
SOCK_DGRAM path (TestProtocolUnixDgram) keeps the original raw Read
since datagram boundaries are preserved.
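
A sketch of frame-at-a-time reading with io.ReadFull, assuming a 4-byte
big-endian length prefix. The helper names and the 1 MiB sanity cap are
hypothetical, and this is not the test's actual reader.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const maxFrame = 1 << 20 // sanity cap for this sketch

// appendFrame appends a length-prefixed frame to dst.
func appendFrame(dst, payload []byte) []byte {
	dst = binary.BigEndian.AppendUint32(dst, uint32(len(payload)))
	return append(dst, payload...)
}

// readFrame reads exactly one frame, however the stream was chunked.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxFrame {
		return nil, fmt.Errorf("frame of %d bytes exceeds cap", n)
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// readAllFrames splits a coalesced byte stream back into frames.
func readAllFrames(stream []byte) ([][]byte, error) {
	r := bytes.NewReader(stream)
	var frames [][]byte
	for {
		f, err := readFrame(r)
		if err == io.EOF {
			return frames, nil
		}
		if err != nil {
			return frames, err
		}
		frames = append(frames, f)
	}
}

func main() {
	// Two frames delivered as a single coalesced chunk.
	stream := appendFrame(appendFrame(nil, []byte("ra")), []byte("test"))
	frames, err := readAllFrames(stream)
	fmt.Println(len(frames), err)
}
```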

These were the top two flakes in oss on the flakes dashboards.

Updates #13038

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I32983656b692921a0f43a4a5e9a8a6ab2555ee49
The ProxyGroup HA Service reconciler's validateService scanned every
Service in the cluster with shouldExpose=true for duplicate hostnames.
With multi-tailnet (Tailnet CRD) support, that scan reaches across
tailnet boundaries:

  * A Service exposed via the single-proxy path (tailscale.com/expose)
    on the primary tailnet would block a ProxyGroup ingress Service
    for the same hostname on a secondary tailnet, even though the two
    live in different reconcilers and different tailnet DNS namespaces.

  * Two ProxyGroups joined to different tailnets via spec.tailnet
    would also block one another for shared hostnames, again despite
    living in separate DNS namespaces.

In both cases the ProxyGroup ingress Service was silently dropped
(IngressSvcInvalid event raised, queue cleared, ConfigMap never
written, ProxyGroup never serves the backend).

This change tightens the check in two ways:

  * Skip Services that aren't themselves managed by the ProxyGroup
    reconciler (use isTailscaleService instead of shouldExpose).
  * For ProxyGroup-managed Services attached to a different
    ProxyGroup, look up that ProxyGroup and skip the duplicate
    report when spec.Tailnet differs from the current one. Fall
    through and flag the collision on lookup failure so genuine
    duplicates are not silently allowed.

Adds regression tests covering both the single-proxy and the
different-tailnet cases. Updates the existing TestValidateService
expected error to reflect the rephrased message.

Updates #20069

Signed-off-by: tsushanth <78000697+tsushanth@users.noreply.github.com>
aa5da2e (in the 1.99.x dev series, unstable) introduced some bugs,
only some of which were later fixed. This fixes another. As of that
change, tkaFilterNetmapLocked ran only on full netmaps through
LocalBackend.setClientStatusLocked and not on peer upserts via new or
changed peers. The later ae74364 fixed a regression in the
Engine layer but didn't make the tkaFilter code re-run on
upserts.

This adds a tkaFilterDeltaMutsLocked pass before
nodeBackend.UpdateNetmapDelta. For each NodeMutationUpsert whose
peer fails the same signature check tkaFilterNetmapLocked applies,
rewrite the upsert in place into a NodeMutationRemove targeting the
same node ID, so magicsock's per-mutation dispatch and
nodeBackend.peers both drop the peer, matching the prior full-netmap
semantics.

New tsnet tests added:

  - TestTailnetLockFiltersUnsignedDeltaPeer covers the new-peer
    case.
  - TestTailnetLockFiltersUnsignedDeltaPeerReplacement covers the
    existing-peer-replacement case, to an empty signature.
  - TestTailnetLockFiltersDeltaPeerWithInvalidSignature is like the
    above but with a bogus signature.

Updates #12542
Updates tailscale/corp#43767

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib35d0391541fee654867c26489847dbc5b7e2ae8
The test transferred only 64 KiB over loopback, which can complete
within a single clock tick on fast CI machines, causing
time.Since(start).Seconds() to return 0 and the
"transfer_time_seconds_total > 0" assertion to fail.

Increase the payload to 1 MiB so zero is genuinely implausible, and
retry up to 3 additional times. If the metric is still zero after 4
total attempts, fail hard — at that size it means the timing logic is
actually broken.

Fixes #20213

Change-Id: I3fab510ce8c567506fea5ad803d35acf40d65700
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This applies the same treatment from 8f21045 (netlog) to wglog,
ending use of netmap.NetworkMap and instead getting the canonical data
from LocalBackend/nodeBackend.

This is a prerequisite for removing the netmap.NetworkMap from
upstream callers, like wgengine.Engine in general.

Updates #12542

Change-Id: Icb5af0799322def048a6f594b49f7d11273f025d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This applies the same treatment from PR #20162 (netlog) and
PR #20171 (wglog) to the local Taildrive filesystem wiring, ending the
per-netmap-update O(n) rebuild of the drive remotes list.

This moves the O(n peers) taildrive-remote list rebuild from every
peer change (which previously happened regardless of whether you were
even using taildrive) to instead happen only as needed.

That rebuild ran on every netmap update and was a contributor to the
broader quadratic behavior we want to eliminate when a single peer is
added or removed.

Instead, this introduces drive.RemoteSource, a small interface the
Taildrive filesystem pulls from lazily on incoming WebDAV requests,
and caches by a generation counter. ipn/ipnlocal installs a
driveRemoteSource once at NewLocalBackend time and bumps
LocalBackend.driveGen on the three events that can actually flip the
drive-capable peer set: full netmap installs (domain + self caps),
UpdateNetmapDelta (peer add/remove or per-peer address changes), and
updatePacketFilter (since PeerCapability values are derived from the
packet filter rules, not from peer.CapMap).
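
The generation-counter caching can be sketched roughly as follows.
remoteCache, Bump, and Remotes are hypothetical names for illustration;
this is not the actual drive.RemoteSource interface or its
implementation:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// remoteCache is a hypothetical illustration of caching by generation:
// the expensive list is rebuilt only if the generation changed since
// the last read.
type remoteCache struct {
	gen   atomic.Uint64   // bumped by writers on relevant events
	build func() []string // the O(n peers) rebuild

	mu        sync.Mutex
	valid     bool
	cachedGen uint64
	cached    []string
}

// Bump invalidates the cache cheaply; no rebuild happens here.
func (c *remoteCache) Bump() { c.gen.Add(1) }

// Remotes returns the cached list, rebuilding lazily when stale.
func (c *remoteCache) Remotes() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Load the generation before building, so a Bump that races with
	// the build forces another rebuild on the next call.
	g := c.gen.Load()
	if !c.valid || g != c.cachedGen {
		c.cached = c.build()
		c.cachedGen = g
		c.valid = true
	}
	return c.cached
}

func main() {
	builds := 0
	c := &remoteCache{build: func() []string {
		builds++
		return []string{"peer-a", "peer-b"}
	}}
	c.Remotes()
	c.Remotes() // served from cache
	c.Bump()    // e.g. a netmap delta arrived
	c.Remotes() // rebuilt once
	fmt.Println("rebuilds:", builds)
}
```

With this shape, writers pay only an atomic increment per event, and
the rebuild cost is paid only by readers that actually use the list.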

The hook itself is kept but narrowed: it no longer takes a
*netmap.NetworkMap and its only remaining job is to re-notify IPN bus
listeners of the current local shares list on full installs.

This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.

(Also add a bunch more tests)

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I7e3d2f5b4a9c8e1d6f0a3b7c9e2d4f8a1b6c5e9d
Detect Hetzner via /sys/class/dmi/id/sys_vendor == "Hetzner" and wire
up Hetzner's public recursive DNS resolvers (185.12.64.1, 185.12.64.2)
for use as a cloud host resolver.
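
A minimal sketch of that detection, assuming the DMI file carries a
trailing newline that needs trimming. isHetznerVendor and
hetznerResolvers are hypothetical names, not the identifiers used in
the actual change:

```go
package main

import (
	"fmt"
	"net/netip"
	"os"
	"strings"
)

// isHetznerVendor reports whether the contents of a DMI sys_vendor file
// identify Hetzner. Surrounding whitespace is trimmed before comparing.
func isHetznerVendor(contents string) bool {
	return strings.TrimSpace(contents) == "Hetzner"
}

// hetznerResolvers holds the resolver addresses named in the commit
// message above.
var hetznerResolvers = []netip.Addr{
	netip.MustParseAddr("185.12.64.1"),
	netip.MustParseAddr("185.12.64.2"),
}

func main() {
	b, err := os.ReadFile("/sys/class/dmi/id/sys_vendor")
	if err != nil {
		fmt.Println("no DMI vendor info:", err)
		return
	}
	if isHetznerVendor(string(b)) {
		fmt.Println("Hetzner detected; resolvers:", hetznerResolvers)
	} else {
		fmt.Println("not Hetzner")
	}
}
```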

Fixes #20217

Change-Id: I24a4c51956adfdd5731f62c937e3c7a4a733ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Pin govulncheck to resolve panics in the most recent version.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
The watchdog (ipn/ipnlocal/watchdog.go) was abusing PeerForIP with an
invalid netip.Addr as a way to acquire and release the engine's
internal locks for deadlock detection. This does the TODO to break it out
into its own method like all the other similarly named methods.

Splitting this out as a prerequisite for a follow-up rewrite of
PeerForIP itself; not having to preserve the lock-probe overload in
the new implementation keeps that follow-up smaller.

Updates #12542
Updates #cleanup

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I25cbffd11aeb65600d9128845404c4918ef88ead
I'm not keen on us having to deal with the bad side effects of the
autocrlf default, but alas, if it makes things easier.

Fixes #16175
Closes #16176

Signed-off-by: James Tucker <james@tailscale.com>
…ression

Otherwise we may never handshake a new peer relay server endpoint
around remote client restarts and/or disco key rotation.

Updates #20215

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Another baby step toward removing slices of peers from the engine.

getStatus iterated peerSequence (a key snapshot built in Reconfig
from cfg.Peers) and then asked wgdev for each peer's stats; peers
that weren't active in wgdev silently fell out. Iterate active wgdev
peers directly via RemoveMatchingPeers(returnFalse) instead.

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I3abd348abc30db706db29b3a785179259e48abda
userspaceEngine.PeerForIP read from e.netMap.Peers and
e.lastCfgFull.Peers, both of which go stale when peers arrive via
netmap deltas (which skip Engine.SetNetworkMap and Engine.Reconfig).
Every PeerForIP caller (Engine.Ping, the TSMP disco-key handler,
pendopen diagnostics, tsdial.Dialer.UseNetstackForIP, and
LocalBackend.GetPeerEndpointChanges) would report "no matching peer"
for freshly-added peers.

Fix it the same way SetPeerByIPPacketFunc fixed the outbound packet
hot path: have LocalBackend install a callback that reads the live
nodeBackend. nb.NodeByAddr is built from both SelfNode and Peers
(updateNodeByAddrLocked), so a single lookup covers the common case
with IsSelf set when the matched node ID is SelfNode's. The subnet-
route / exit-node-default-route slow path goes through a new
Engine.PeerKeyForIP that exposes the engine's AllowedIPs BART table
(the same table the outbound packet hot path already consults, with
exit-node selection honored), and resolves the matched key back to a
NodeView via the live nodeBackend.

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I0d4b0d8997c8e796b7367c46b49b61d4fdc717b0
The logging added in 12188c0 was generating excessive spam in
backend logs. This may have been exacerbated by the
Tailscale GUI<->backend architecture on certain platforms like
Windows, where the GUI polls for exit node suggestions rather
than listening on the IPN bus.

Change this to log on error or if the current suggestion differs
from the previous suggestion.

Updates tailscale/corp#43691
Updates #20194

Signed-off-by: Amal Bansode <amal@tailscale.com>
Most of our flag descriptions start with a lowercase word (except proper
nouns); fix the handful which do not.

Fixes #20230

Change-Id: I00aaac171254c050ad0b75c2cf8746590c8c4d8f
Signed-off-by: Alex Chan <alexc@tailscale.com>
Add a retry loop with BatchMode=yes to absorb the race window
between Env.Start() returning (when tta reports the tailscale
backend as Running) and cloud-init finishing the user/SSH-key
setup. In CI, the second VM's tta agent has been observed
connecting only a few hundred milliseconds before the test SSHes
in, which is inside the window where /root/.ssh/authorized_keys
hasn't fully landed yet. SSH key auth then fails and ssh(1) falls
back to interactive password prompts (3x), wasting time and
producing a confusing "Permission denied (publickey,password)"
error.

BatchMode=yes makes the client fail fast on auth failure instead
of prompting, and the retry loop handles SSH transport-level
errors (exit code 255) for up to 30 seconds with 500ms backoff.
Remote command non-zero exits still pass through unchanged.
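
The retry policy can be sketched as below. retrySSH and its run
callback are hypothetical; a real runner would exec ssh with
"-o BatchMode=yes" and return the process exit code:

```go
package main

import (
	"fmt"
	"time"
)

// sshTransportExit is the status ssh(1) exits with when ssh itself hit
// an error, as opposed to the remote command's own exit status.
const sshTransportExit = 255

// retrySSH calls run until it returns something other than 255 or the
// timeout would be exceeded. Any other exit code, including a non-zero
// one from the remote command, is returned unchanged.
func retrySSH(run func() int, timeout, backoff time.Duration) (int, error) {
	deadline := time.Now().Add(timeout)
	for {
		code := run()
		if code != sshTransportExit {
			return code, nil
		}
		if time.Now().Add(backoff).After(deadline) {
			return code, fmt.Errorf("ssh transport errors persisted for %v", timeout)
		}
		time.Sleep(backoff)
	}
}

func main() {
	attempts := 0
	// Fake runner: two transport-level failures, then the remote
	// command succeeds.
	run := func() int {
		attempts++
		if attempts < 3 {
			return sshTransportExit
		}
		return 0
	}
	code, err := retrySSH(run, 30*time.Second, 500*time.Millisecond)
	fmt.Println("exit:", code, "err:", err, "attempts:", attempts)
}
```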

Fixes #20228

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I17f7422e9e27bf7b995f505c0184cbb2b230ed81
Env.Start boots all VM nodes in parallel; each calls
createCloudInitISO -> ensureDebugSSHKey concurrently. When
/tmp/vmtest_key doesn't yet exist, the first goroutine creates it
with os.WriteFile, which opens with O_CREATE|O_TRUNC and briefly
leaves the file existing-but-empty between the open and the
subsequent write. A concurrent goroutine that hits that window
sees ReadFile succeed with zero bytes, then fails ssh.ParsePrivateKey
with "ssh: no key found", causing boot to fail with:

  boot: creating cloud-init ISO: parse /tmp/vmtest_key: ssh: no key found

Observed in CI on TestSiteToSite (3 nodes). Wrap the function in
a package-level Mutex so the first caller fully writes the key
before any other caller reads it.
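
A sketch of the fix, using hypothetical names (ensureKey, keyMu, demo)
and fake key material rather than the real ensureDebugSSHKey code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
)

// keyMu serializes the whole read-or-create sequence, so no caller can
// observe the file between its creation and the write completing.
var keyMu sync.Mutex

// ensureKey returns the key at path, creating it with generate if it
// does not exist yet.
func ensureKey(path string, generate func() []byte) ([]byte, error) {
	keyMu.Lock()
	defer keyMu.Unlock()
	if b, err := os.ReadFile(path); err == nil {
		return b, nil
	} else if !os.IsNotExist(err) {
		return nil, err
	}
	b := generate()
	if err := os.WriteFile(path, b, 0o600); err != nil {
		return nil, err
	}
	return b, nil
}

// demo runs n concurrent callers against a fresh temp dir and reports
// how many distinct keys they observed. One means every caller saw the
// same fully written key.
func demo(n int) (distinct int, err error) {
	dir, err := os.MkdirTemp("", "keydemo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "key")

	var gens atomic.Int32
	generate := func() []byte { return []byte(fmt.Sprintf("fake-key-%d", gens.Add(1))) }

	results := make([]string, n)
	errs := make([]error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			b, err := ensureKey(path, generate)
			results[i], errs[i] = string(b), err
		}(i)
	}
	wg.Wait()

	seen := map[string]bool{}
	for i := range results {
		if errs[i] != nil {
			return 0, errs[i]
		}
		seen[results[i]] = true
	}
	return len(seen), nil
}

func main() {
	fmt.Println(demo(8))
}
```

An alternative would be writing to a temp file and renaming it into
place, but a mutex is simpler when all callers live in one process.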

Updates #20228

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ie6399dcba0c397bb8041931d3de1c6063a11c568
tsdial.Dialer.SetNetMap rebuilt an O(n peers) map of MagicDNS names on
every netmap change. As we move toward per-peer incremental deltas,
this becomes quadratic. This removes it and replaces it with
SetResolveMagicDNS, a callback into LocalBackend that looks up
hostnames from nodeBackend's new nodeByName index (populated alongside
nodeByAddr/nodeByKey on both full and delta paths). The index stores
both FQDNs and short names as keys.
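
A rough sketch of an index keyed by both FQDN and short name,
maintained incrementally. nameIndex, nodeID, and the example domain
are hypothetical, not the real nodeByName index:

```go
package main

import (
	"fmt"
	"strings"
)

// nodeID is a hypothetical stand-in for a node identifier.
type nodeID int

// nameIndex maps both FQDNs and short names to a node.
type nameIndex map[string]nodeID

// canonical lowercases a name and drops any trailing dot.
func canonical(name string) string {
	return strings.ToLower(strings.TrimSuffix(name, "."))
}

// upsert records a node under its FQDN and under its first label.
func (ix nameIndex) upsert(fqdn string, id nodeID) {
	fqdn = canonical(fqdn)
	ix[fqdn] = id
	if short, _, ok := strings.Cut(fqdn, "."); ok {
		ix[short] = id
	}
}

// remove drops a node's FQDN entry, and its short-name entry if that
// entry still points at the same node.
func (ix nameIndex) remove(fqdn string) {
	fqdn = canonical(fqdn)
	id, ok := ix[fqdn]
	if !ok {
		return
	}
	delete(ix, fqdn)
	if short, _, found := strings.Cut(fqdn, "."); found && ix[short] == id {
		delete(ix, short)
	}
}

// resolve looks up either form of name in O(1).
func (ix nameIndex) resolve(name string) (nodeID, bool) {
	id, ok := ix[canonical(name)]
	return id, ok
}

func main() {
	ix := nameIndex{}
	ix.upsert("laptop.example.ts.net.", 1)
	fmt.Println(ix.resolve("laptop"))
	fmt.Println(ix.resolve("laptop.example.ts.net"))
	ix.remove("laptop.example.ts.net")
	fmt.Println(ix.resolve("laptop"))
}
```

Each peer add or remove touches at most two keys, so a single-peer
delta costs O(1) instead of rebuilding the whole map.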

This is the same treatment applied to netlog (8f21045), wglog
(988b090), and drive (1d69894): stop pushing *netmap.NetworkMap
into subsystems and instead have them pull from LocalBackend's live
data via callbacks.

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I24557ab0c8a27636e08e4779bcfd3ec633db0a78
Add zizmor GitHub Actions linting on changes to .github/workflows.

Updates tailscale/corp#28760

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
…20199)

Router.Set reconciled tailscale0's addresses only against the in-memory
r.addrs map, which starts empty each run. After a restart the kernel can
still hold the addresses a previous profile put on tailscale0. With no
record of them, Set never removed them, leaving two tailnets' CGNAT
addresses on the interface. That broke connectivity, because the kernel
could source traffic from the wrong IP.

Fix this by scanning the addresses actually on the interface and, after
reconciling the desired set, removing any in Tailscale's CGNAT/ULA ranges
that aren't in the config. Non-Tailscale addresses are never touched,
and IPv6 addresses are skipped when IPv6 is unavailable, since delAddress
no-ops there. To avoid a netlink dump on every Set, the scan runs only on
the first Set and when the desired address set changes.
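
The stale-address selection can be sketched as follows. staleAddrs and
mustPrefixes are hypothetical helpers, and the two prefixes are my
assumption of the CGNAT/ULA ranges the message refers to:

```go
package main

import (
	"fmt"
	"net/netip"
)

// Assumed Tailscale ranges: the CGNAT block and the Tailscale ULA block.
var (
	cgnatRange = netip.MustParsePrefix("100.64.0.0/10")
	ulaRange   = netip.MustParsePrefix("fd7a:115c:a1e0::/48")
)

// staleAddrs returns the addresses found on the interface that fall in
// the ranges above but are not in the desired set. Addresses outside
// those ranges are never returned, so they are never removed.
func staleAddrs(onIface, desired []netip.Prefix) []netip.Prefix {
	want := make(map[netip.Prefix]bool, len(desired))
	for _, p := range desired {
		want[p] = true
	}
	var stale []netip.Prefix
	for _, p := range onIface {
		a := p.Addr()
		if !cgnatRange.Contains(a) && !ulaRange.Contains(a) {
			continue // not ours; leave it alone
		}
		if !want[p] {
			stale = append(stale, p)
		}
	}
	return stale
}

// mustPrefixes parses each string as a prefix, panicking on bad input.
func mustPrefixes(ss ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(ss))
	for _, s := range ss {
		out = append(out, netip.MustParsePrefix(s))
	}
	return out
}

func main() {
	onIface := mustPrefixes("100.64.0.1/32", "100.100.1.2/32", "192.168.1.5/24")
	desired := mustPrefixes("100.100.1.2/32")
	fmt.Println(staleAddrs(onIface, desired))
}
```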

This also needs the iptables DelLoopbackRule to tolerate a missing rule:
an orphan left by a previous instance never went through AddLoopbackRule
here, and iptables (unlike nftables) errors when deleting an absent
rule, which would otherwise block the address delete.

Fixes #19974

Signed-off-by: Brendan Creane <bcreane@gmail.com>
The primary purpose is that return packets from the target app get
properly SNATed on connectors with --tun=userspace-networking, matching
the NAT behavior in the kernel tun path.

This is also necessary but not sufficient for clients of connectors in
userspace networking mode. The hook will DNAT MagicIPs, but won't
actually be sent MagicIPs until conn25 app connector DNS works with
userspace networking.

Fixes tailscale/corp#43201

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
The engine only used the netmap to look up self addresses and the
self node's primary routes, so pass it the self node directly
rather than the whole netmap.

Updates #12542

Change-Id: I13c0028eed65d2177baf4cf6c449f5e441845a18
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
setWebClientAtomicBoolLocked and setDebugLogsByCapabilityLocked
each only need the node capabilities to decide what to do, so
take a set.Set[tailcfg.NodeCapability] directly as part of
getting rid of netmap.NetworkMap.

Updates #12542

Change-Id: If7c30b6354fd42dfe82ed6d2e2fe3439de401315
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
No code changes needed; this is to rule out cmpver as the source of any
version-comparison issues.

Updates #20238

Change-Id: Ib8765dd042e994549d9e2c03859a5f769a856704
Signed-off-by: Alex Chan <alexc@tailscale.com>
364b952 switched containerboot to partial netmap fetching, but
stopped refreshing `DNS.ExtraRecords`, so Tailscale Services created
after pod boot were invisible to resolveTailnetFQDN. To fix this, we
watch for SelfChange IPN bus notifications and refetch dns-config via
LocalAPI to get a fresh set of `DNS.ExtraRecords`.

Fixes #20233

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
… receive extensions" (#20257)

* Revert "control/controlclient: continue map poll during key expiry to receive extensions"

This reverts commit 6a822dc. This commit
has caused test failures in the corp repo by unexpectedly changing the
login behaviour when nodes have a valid node key.

Updates tailscale/corp#43705
Updates #19326

Signed-off-by: Alex Chan <alexc@tailscale.com>

* Revert "tsnet: test key extension after server restart"

This reverts commit 3172013. This test
relies on changes in 6a822dc, which is
also being reverted because it causes test failures in corp.

Updates tailscale/corp#43705
Updates #19326

Signed-off-by: Alex Chan <alexc@tailscale.com>

---------

Signed-off-by: Alex Chan <alexc@tailscale.com>