Skip to content

feat(dns): add dual-stack DNS resolution with dnsdualstack+ scheme#8651

Open
elidhu wants to merge 5 commits intothanos-io:mainfrom
elidhu:feature/dualstack-dns
Open

feat(dns): add dual-stack DNS resolution with dnsdualstack+ scheme#8651
elidhu wants to merge 5 commits intothanos-io:mainfrom
elidhu:feature/dualstack-dns

Conversation

@elidhu
Copy link
Copy Markdown

@elidhu elidhu commented Jan 28, 2026

Add dnsdualstack+ DNS scheme that resolves both A and AAAA records, enabling IPv4/IPv6 failover for store connections. Failover is handled automatically by gRPC's built-in health checking and round_robin balancer.

Usage:
--store=dnsdualstack+store.example.com:10901

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

  • Add ADualStack resolver type that queries both A and AAAA DNS records
  • Add LookupIPAddrDualStack to godns and miekgdns resolvers
  • Add dnsdualstack+ scheme prefix for service discovery
  • Document new scheme in service-discovery.md

Verification

  • go test ./pkg/discovery/dns/... -v - all tests pass
  • go build ./cmd/thanos/... - builds clean

@elidhu elidhu force-pushed the feature/dualstack-dns branch 3 times, most recently from 411292a to 7af0be1 Compare January 28, 2026 11:10
@MichaHoffmann
Copy link
Copy Markdown
Contributor

Note that this is only useful for endpoint groups!

select {
case <-ctx.Done():
if len(result) > 0 {
return result, nil
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the deadline is exceeded then this can silently return partial results, right? That's probably not what the user expects. Probably worth just doing:

Suggested change
return result, nil
return nil, ctx.Err()

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also very observant, I notice the partial result pattern in miekgdns, but maybe that's different? -- I didn't look into it too much.

elidhu added a commit to elidhu/thanos that referenced this pull request Feb 8, 2026
- Use --endpoint-group in docs (--store is deprecated), note endpoint-group context
- Never return partial results on ctx cancellation (both godns and miekgdns)
- Use package-level errNoSuchHost sentinel instead of hand-crafted net.DNSError
- Remove dead code: IsDualStackNode (unused)
- Fix CHANGELOG placeholder with actual PR number thanos-io#8651
- Update Resolver interface comment to include all qtypes
- Fix misleading test: mock IsNotFound properly, split into not-found vs error paths

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
@elidhu elidhu force-pushed the feature/dualstack-dns branch from af04e8b to b14daa8 Compare February 8, 2026 01:46
elidhu added a commit to elidhu/thanos that referenced this pull request Feb 8, 2026
- Use --endpoint-group in docs (--store is deprecated), note endpoint-group context
- Never return partial results on ctx cancellation (both godns and miekgdns)
- Use package-level errNoSuchHost sentinel instead of hand-crafted net.DNSError
- Remove dead code: IsDualStackNode (unused)
- Fix CHANGELOG placeholder with actual PR number thanos-io#8651
- Update Resolver interface comment to include all qtypes
- Fix misleading test: mock IsNotFound properly, split into not-found vs error paths

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
elidhu added a commit to elidhu/thanos that referenced this pull request Feb 8, 2026
- Use --endpoint-group in docs (--store is deprecated), note endpoint-group context
- Never return partial results on ctx cancellation (both godns and miekgdns)
- Use package-level errNoSuchHost sentinel instead of hand-crafted net.DNSError
- Remove dead code: IsDualStackNode (unused)
- Fix CHANGELOG placeholder with actual PR number thanos-io#8651
- Update Resolver interface comment to include all qtypes
- Fix misleading test: mock IsNotFound properly, split into not-found vs error paths

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
@elidhu elidhu force-pushed the feature/dualstack-dns branch 3 times, most recently from 4b13e0f to c3c4fef Compare February 8, 2026 04:18
elidhu added a commit to elidhu/thanos that referenced this pull request Feb 8, 2026
- Use --endpoint-group in docs (--store is deprecated), note endpoint-group context
- Never return partial results on ctx cancellation (both godns and miekgdns)
- Use package-level errNoSuchHost sentinel instead of hand-crafted net.DNSError
- Remove dead code: IsDualStackNode (unused)
- Fix CHANGELOG placeholder with actual PR number thanos-io#8651
- Update Resolver interface comment to include all qtypes
- Fix misleading test: mock IsNotFound properly, split into not-found vs error paths

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
@elidhu elidhu force-pushed the feature/dualstack-dns branch from c3c4fef to 13ff0a8 Compare February 8, 2026 05:14
@elidhu
Copy link
Copy Markdown
Author

elidhu commented Feb 8, 2026

I modified it slightly to pull out the common orchestration logic, preventing duplication in every single resolver.

The interface still needed a new method LookupIPAddrByNetwork(ctx context.Context, network, host string) ([]net.IPAddr, error) but this is IMO a much cleaner design.

Still addressed all original comments.

Looks like some flaky tests?

elidhu added a commit to elidhu/thanos that referenced this pull request Feb 17, 2026
- Use --endpoint-group in docs (--store is deprecated), note endpoint-group context
- Never return partial results on ctx cancellation (both godns and miekgdns)
- Use package-level errNoSuchHost sentinel instead of hand-crafted net.DNSError
- Remove dead code: IsDualStackNode (unused)
- Fix CHANGELOG placeholder with actual PR number thanos-io#8651
- Update Resolver interface comment to include all qtypes
- Fix misleading test: mock IsNotFound properly, split into not-found vs error paths

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
@elidhu elidhu force-pushed the feature/dualstack-dns branch from 48c4940 to 5e78793 Compare February 17, 2026 00:46
@elidhu elidhu force-pushed the feature/dualstack-dns branch from 11cb5e6 to 1a6c0f4 Compare March 11, 2026 13:23
elidhu and others added 4 commits March 11, 2026 21:54
Add dnsdualstack+ DNS scheme that resolves both A and AAAA records,
enabling IPv4/IPv6 failover for store connections. Failover is handled
automatically by gRPC's built-in health checking and round_robin balancer.
Usage:
  --store=dnsdualstack+store.example.com:10901

Signed-off-by: Kevin Glasson <kglasson@cloudflare.com>
Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
- Use --endpoint-group in docs (--store is deprecated), note endpoint-group context
- Never return partial results on ctx cancellation (both godns and miekgdns)
- Use package-level errNoSuchHost sentinel instead of hand-crafted net.DNSError
- Remove dead code: IsDualStackNode (unused)
- Fix CHANGELOG placeholder with actual PR number thanos-io#8651
- Update Resolver interface comment to include all qtypes
- Fix misleading test: mock IsNotFound properly, split into not-found vs error paths

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
…work primitive

Replace composed LookupIPAddrDualStack with primitive LookupIPAddrByNetwork
on the ipLookupResolver interface. Dual-stack orchestration (iterate families,
error handling) now lives in the parent resolver's ADualStack case.

- godns: thin wrapper around net.Resolver.LookupIP
- miekgdns: network-to-dns.Type mapping with CNAME recursion
- Removed sort, dedup map, per-family counting from orchestration
- Mock is now network-aware for realistic per-family test coverage

Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
Signed-off-by: Kevin Glasson <kevinglasson@gmail.com>
@elidhu elidhu force-pushed the feature/dualstack-dns branch from 1a6c0f4 to cb22ea8 Compare March 11, 2026 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants