Pass tenant context to downstream queriers in distributed mode#8722
Pass tenant context to downstream queriers in distributed mode#8722ViciousEagle03 wants to merge 4 commits intothanos-io:mainfrom
Conversation
GiedriusS
left a comment
There was a problem hiding this comment.
Could we add a test for this case? We try to do that to ensure that this behavior will not change accidentally in the future.
7ed50f1 to
ab608b0
Compare
Done, I've added a test case for this fix. Let me know if the approach looks right to you or if it needs any tweaks. |
pkg/query/remote_engine.go
Outdated
|
|
||
| if tenant := ctx.Value(tenancy.TenantKey); tenant != nil { | ||
| if tenantStr, ok := tenant.(string); ok { | ||
| qctx = metadata.AppendToOutgoingContext(qctx, tenancy.DefaultTenantHeader, tenantStr) |
There was a problem hiding this comment.
I also realized that this is a bit tricky - the downstream might have a different "tenant header" setting set and then this won't work, right? Seems like we should move the "tenant" field into querypb.QueryRequest and then read it in the handler to make this foolproof. What do you think?
There was a problem hiding this comment.
Thanks for the review. Apologies for the delay. I was unwell, so it took me some time to get myself up and running again.
You are correct to point out that we should be using a protobuf field for the tenant ID to keep it robust.
To sum up my recent commit,
- Added the tenant field to the
querypb.QueryRequestand thequerypb.QueryRangeRequest. - Updated the Query handler in
pkg/api/query/grpc.goto pull the tenant out from the protobuf and inject it into the local context. - Updated the test in
remote_engine_test.go. - Also rebased my branch over the recent changes of the
main
…d query mode Signed-off-by: Piyush Sharma <piyushsharma04321@gmail.com>
Signed-off-by: Piyush Sharma <piyushsharma04321@gmail.com>
Signed-off-by: Piyush Sharma <piyushsharma04321@gmail.com>
…er mismatch Signed-off-by: Piyush Sharma <piyushsharma04321@gmail.com>
ab608b0 to
1fca96f
Compare
|
I've also rebased my branch over |
Changes
This PR fixes a bug in distributed query mode where the THANOS-TENANT context was not included in the outgoing gRPC metadata during the fanout to downstream query nodes.
While the StoreAPI (
proxy.go) correctly propagates the tenant, the QueryAPI (remote_engine.go) was leaving the outgoing gRPC metadata empty. This PR adds the missing logic toremote_engine.goby extracting the tenant from the local context and passing it tometadata.AppendToOutgoingContextbefore executing the instant and range queries.Note: I am pretty new to the Thanos codebase, so I am not sure if a test needs to be added to
remote_engine_test.gosince this is more like a bug in the distributed query mode. Happy to add one if it is required.Verification
--query.tenant-headerflag enabled.Below is the screenshot of the same
Before the Patch:
As seen in the debug output below, just before the Global Querier makes the gRPC call to the downstream node, the
THANOS-TENANTkey is missing from the outgoing gRPC metadata.After the Patch:
The tenant ID is packed into the outgoing gRPC metadata envelope before the network call is executed.

Reproduction Steps:
Had Prometheus scrape itself for metrics by running the below command (
./prometheus --config.file=prometheus.yml),Then, I spun up the following topology:
# Starting Thanos Sidecar ./.bin/thanos sidecar \ --tsdb.path=../prometheus-2.51.0.linux-amd64/data \ --prometheus.url=http://localhost:9090 \ --http-address=0.0.0.0:10906 \ --grpc-address=0.0.0.0:10905make test-local.Fixes: 8632