Search Sharded+LoadBalancing: base feature branch#806
Draft
This stack of pull requests is managed by Graphite.
This was referenced Feb 18, 2026
Closed
MCK 1.8.0 Release Notes (New Features, Bug Fixes, Other Changes)
This was referenced Feb 20, 2026
0a2cc17 to
dda0ae1
Compare
Closed
706638a to
d36531b
Compare
427e0fd to
ad08030
Compare
ad08030 to
0021270
Compare
This was referenced Feb 25, 2026
3 tasks
0021270 to
2143c73
Compare
3 tasks
lsierant
added a commit
that referenced
this pull request
Mar 3, 2026
<!-- start git-machete generated -->

# Based on PR #817

## Chain of upstream PRs as of 2026-03-03

* PR #806: `master` ← `search/base`
* PR #816: `search/base` ← `search/sharded-cluster`
* PR #817: `search/sharded-cluster` ← `search/multiple-mongot`
* **PR #853 (THIS ONE)**: `search/multiple-mongot` ← `search/lsierant/revert-rs-cluster-index`

<!-- end git-machete generated -->
This was referenced Mar 13, 2026
lsierant
added a commit
that referenced
this pull request
Mar 16, 2026
<!-- start git-machete generated -->

# Based on PR #806

## Chain of upstream PRs as of 2026-03-03

* PR #806: `master` ← `search/base`
* **PR #816 (THIS ONE)**: `search/base` ← `search/sharded-cluster`

<!-- end git-machete generated -->

# Summary

MCK already supported deploying a single instance of the mongot process through the `MongoDBSearch` resource, with a MongoDB replica set as the source. That meant customers could run search queries against a ReplicaSet deployment.

This PR adds support for using a sharded cluster as the source of the `MongoDBSearch` resource, so that search can be used with sharded clusters as well. To achieve this, we added a new field to the `ExternalMongoDBSource` type that configures the details of the sharded cluster used as the source.

```go
type ExternalMongoDBSource struct {
	// ShardedCluster contains configuration for external sharded MongoDB clusters.
	// Mutually exclusive with HostAndPorts.
	// +optional
	ShardedCluster *ExternalShardedClusterConfig `json:"shardedCluster,omitempty"`

	// (other fields not shown in this excerpt)
}
```

The `ExternalShardedClusterConfig` is mainly used to generate the mongot config, so that mongot knows how to talk to the mongod processes. The rest of the changes revolve around this.

## Proof of Work

TBD

---------

Co-authored-by: Vivek Singh <vsingh.ggits.2010@gmail.com>
Co-authored-by: Julien-Ben <33035980+Julien-Ben@users.noreply.github.com>
Co-authored-by: Julien Benhaim <julien.benhaim@mongodb.com>
Co-authored-by: Vivek Singh <vivek.s@mongodb.com>
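The field comment in the PR #816 description says `ShardedCluster` is mutually exclusive with `HostAndPorts`. A minimal sketch of how such a check might look follows. The trimmed-down struct definitions, the `RouterHosts` field, and the `validateSource` helper are assumptions made for illustration and are not taken from the operator's code.

```go
package main

import (
	"errors"
	"fmt"
)

// ExternalShardedClusterConfig is a hypothetical, trimmed-down stand-in
// for the real type; only a router host list is modeled here.
type ExternalShardedClusterConfig struct {
	RouterHosts []string
}

// ExternalMongoDBSource models only the two fields relevant to the check.
type ExternalMongoDBSource struct {
	HostAndPorts   []string
	ShardedCluster *ExternalShardedClusterConfig
}

// validateSource (hypothetical helper) rejects a source that sets both
// a replica set host list and a sharded cluster configuration.
func validateSource(s ExternalMongoDBSource) error {
	hasReplicaSet := len(s.HostAndPorts) > 0
	hasSharded := s.ShardedCluster != nil
	if hasReplicaSet && hasSharded {
		return errors.New("hostAndPorts and shardedCluster are mutually exclusive")
	}
	return nil
}

func main() {
	both := ExternalMongoDBSource{
		HostAndPorts: []string{"rs0.example.com:27017"},
		ShardedCluster: &ExternalShardedClusterConfig{
			RouterHosts: []string{"mongos-0.example.com:27017"},
		},
	}
	// Setting both fields is reported as an error.
	fmt.Println(validateSource(both))
}
```

In a real CRD this kind of rule could also be expressed declaratively in the schema; the function form is used here only because it is easy to read and test.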
lsierant
added a commit
that referenced
this pull request
Mar 16, 2026
PR #853: `search/multiple-mongot` ← `search/lsierant/revert-rs-cluster-index` (based on PR #817)
lsierant
added a commit
that referenced
this pull request
Mar 16, 2026
# Summary

Introduce Astral's [ty](https://docs.astral.sh/ty/) static type checker to catch simple syntax and type errors in e2e tests. For now, `ty check` is applied only to search files and runs as part of pre-commit.

<!-- start git-machete generated -->

# Based on PR #817

## Chain of upstream PRs as of 2026-03-16

* PR #806: `master` ← `search/base`
* PR #817: `search/base` ← `search/multiple-mongot`
* **PR #896 (THIS ONE)**: `search/multiple-mongot` ← `search/lsierant/mypy`

<!-- end git-machete generated -->
anandsyncs
added a commit
that referenced
this pull request
Mar 18, 2026
…es (#886)

<!-- start git-machete generated -->

# Based on PR #817

## Chain of upstream PRs as of 2026-03-13

* PR #806: `master` ← `search/base`
* PR #817: `search/base` ← `search/multiple-mongot`
* **PR #886 (THIS ONE)**: `search/multiple-mongot` ← `search/validate-shardname-tls-san`

<!-- end git-machete generated -->

# Summary

Adds validation for `shardName` in MongoDBSearch sharded cluster configurations to ensure generated Kubernetes resource names comply with naming constraints.

**Changes:**

- Validate `shardName` as an RFC 1123 DNS label (lowercase, alphanumeric, hyphens, max 63 chars)
- Validate uniqueness of `shardName` values across shards
- Predictively validate generated resource names (StatefulSet, Service, ConfigMap, Secrets) against their respective Kubernetes naming standards
- Provide actionable error messages with character counts when validation fails

## Proof of Work

```
$ go test -v ./api/v1/search/... ./controllers/searchcontroller/...
--- PASS: TestValidateShardNames (0.00s)
--- PASS: TestShardedExternalSearchSource_Validate (0.00s)
PASS
```

## Checklist

- [x] Have you linked a jira ticket and/or is the ticket in the title?
- [ ] Have you checked whether your jira ticket required DOCSP changes?
- [ ] Have you added changelog file?

---------
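The validation steps listed in the PR #886 summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the operator's implementation: the `validateShardNames` function is hypothetical, only the per-shard headless Service name is checked predictively, and its `{name}-search-0-{shard}-svc` pattern is borrowed from the naming table in the PR #817 description.

```go
package main

import (
	"fmt"
	"regexp"
)

// dnsLabel matches an RFC 1123 DNS label: lowercase alphanumerics and
// hyphens, starting and ending with an alphanumeric character.
var dnsLabel = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

const maxLabelLen = 63

// validateShardNames (hypothetical) checks label syntax, uniqueness, and
// the length of one generated resource name per shard.
func validateShardNames(searchName string, shardNames []string) error {
	seen := make(map[string]bool, len(shardNames))
	for _, shard := range shardNames {
		if len(shard) > maxLabelLen || !dnsLabel.MatchString(shard) {
			return fmt.Errorf("shardName %q is not a valid RFC 1123 DNS label (%d characters, max %d)",
				shard, len(shard), maxLabelLen)
		}
		if seen[shard] {
			return fmt.Errorf("shardName %q is used more than once", shard)
		}
		seen[shard] = true

		// Predictive check: the generated Service name must also fit in a label.
		svc := fmt.Sprintf("%s-search-0-%s-svc", searchName, shard)
		if len(svc) > maxLabelLen {
			return fmt.Errorf("generated Service name %q has %d characters, max %d",
				svc, len(svc), maxLabelLen)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateShardNames("mdb", []string{"shard-0", "shard-1"}))
	fmt.Println(validateShardNames("mdb", []string{"Shard_0"}))
	fmt.Println(validateShardNames("mdb", []string{"shard-0", "shard-0"}))
}
```

Including the character count in each error message mirrors the "actionable error messages" item in the change list, since it tells the user how far over the limit a name is.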
Draft
lsierant
added a commit
that referenced
this pull request
Mar 23, 2026
PR #816: `search/base` ← `search/sharded-cluster` (based on PR #806)
lsierant
added a commit
that referenced
this pull request
Mar 23, 2026
PR #853: `search/multiple-mongot` ← `search/lsierant/revert-rs-cluster-index` (based on PR #817)
lsierant
added a commit
that referenced
this pull request
Mar 23, 2026
PR #896: `search/multiple-mongot` ← `search/lsierant/mypy` (based on PR #817)
lsierant
pushed a commit
that referenced
this pull request
Mar 23, 2026
…es (#886): `search/multiple-mongot` ← `search/validate-shardname-tls-san` (based on PR #817)
lsierant
added a commit
that referenced
this pull request
Mar 26, 2026
PR #816: `search/base` ← `search/sharded-cluster` (based on PR #806)
lsierant
added a commit
that referenced
this pull request
Mar 26, 2026
PR #853: `search/multiple-mongot` ← `search/lsierant/revert-rs-cluster-index` (based on PR #817)
lsierant
added a commit
that referenced
this pull request
Mar 26, 2026
PR #896: `search/multiple-mongot` ← `search/lsierant/mypy` (based on PR #817)
lsierant
pushed a commit
that referenced
this pull request
Mar 26, 2026
…es (#886): `search/multiple-mongot` ← `search/validate-shardname-tls-san` (based on PR #817)
lsierant
added a commit
that referenced
this pull request
Mar 27, 2026
…er (#817)

<!-- start git-machete generated -->

# Based on PR #806

## Chain of upstream PRs as of 2026-03-23

* PR #806: `master` <- `search/base`
* **PR #817 (THIS ONE)**: `search/base` <- `search/multiple-mongot`

<!-- end git-machete generated -->

# Summary

This PR adds support for **multiple mongot instances** per MongoDB source, with L7 load balancing and sharded cluster integration. It builds on `search/base` (#806) and introduces four major capabilities:

1. **Multiple mongot replicas** with managed (Envoy) or unmanaged (BYO) load balancing
2. **Sharded cluster support** for both operator-managed and external MongoDB
3. **X509 client certificate authentication** for mongot-to-MongoDB sync source connections
4. **JVM flags** for mongot process configuration

---

## API/CRD Changes

The `MongoDBSearch` CRD gains the following new fields:

```yaml
spec:
  # New: Load balancer configuration (exactly one of managed/unmanaged must be set)
  loadBalancer:                          # was: spec.lb with mode enum
    managed:                             # operator-deployed Envoy proxy
      externalHostname: "..."            # SNI hostname, supports {shardName} placeholder
      resourceRequirements: { ... }      # Envoy container resources (default: 100m/128Mi req, 500m/512Mi lim)
      deployment: { ... }                # Envoy Deployment overrides (same convention as spec.statefulSet)
    unmanaged:                           # BYO L7 load balancer
      endpoint: "lb.example.com:27029"   # supports {shardName} placeholder for sharded

  source:
    # New: X509 client certificate auth for sync source (mutually exclusive with username/password)
    x509:
      clientCertificateSecretRef:
        name: "mongot-x509-cert"         # Secret with tls.crt, tls.key (required), tls.keyFilePassword (optional)

    # New: External sharded cluster source
    external:
      shardedCluster:
        router:
          hosts: ["mongos-0.example.com:27017"]
        shards:
          - shardName: "shard-0"
            hosts: ["shard0-rs0.example.com:27017"]
          - shardName: "shard-1"
            hosts: ["shard1-rs0.example.com:27017"]

  # New: JVM flags for mongot
  jvmFlags: ["-Xms2g", "-Xmx2g"]
```

Key structural change: `spec.lb.mode` (Managed/Unmanaged enum) was replaced with `spec.loadBalancer.managed` / `spec.loadBalancer.unmanaged` (mutually exclusive sub-objects, enforced via CEL validation).

---

## Resource Naming Conventions

All search resource names include a hardcoded `-0-` cluster index to reserve the naming scheme for future multi-cluster support.
### ReplicaSet (Non-Sharded) Resources

| Resource | Name Pattern |
|----------|-------------|
| StatefulSet | `{name}-search` |
| Headless Service | `{name}-search-svc` |
| ConfigMap | `{name}-search-config` |
| TLS Secret | `[{prefix}-]{name}-search-0-cert` |
| TLS Client Secret | `[{prefix}-]{name}-search-0-client-cert` |
| X509 Client Cert Secret | `{name}-x509-client-cert` |
| Proxy Service (Envoy) | `{name}-search-0-proxy-svc` |
| LB Deployment | `{name}-search-lb-0` |
| LB ConfigMap | `{name}-search-lb-0-config` |
| LB Server Cert | `[{prefix}-]{name}-search-lb-0-cert` |
| LB Client Cert | `[{prefix}-]{name}-search-lb-0-client-cert` |

### Sharded Resources (Per-Shard)

| Resource | Name Pattern |
|----------|-------------|
| StatefulSet | `{name}-search-0-{shard}` |
| Headless Service | `{name}-search-0-{shard}-svc` |
| ConfigMap | `{name}-search-0-{shard}-config` |
| TLS Secret | `[{prefix}-]{name}-search-0-{shard}-cert` |
| Proxy Service (Envoy) | `{name}-search-0-{shard}-proxy-svc` |
| LB Server Cert (per-shard) | `[{prefix}-]{name}-search-lb-0-{shard}-cert` |

---

## Architecture

### Reconciliation Paths

The `MongoDBSearchReconcileHelper` dispatches to:

- **`reconcileNonSharded`**: Creates a single set of resources (1 StatefulSet, 1 Service, 1 ConfigMap). All mongot pods connect to the same RS host seeds.
- **`reconcileSharded`**: Creates per-shard resources -- one StatefulSet, Service, ConfigMap, and TLS secret per shard. Each shard's mongot connects to that shard's mongod hosts. A `Router` section in the mongot config points to mongos for query routing.
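To make the per-shard resource set created by `reconcileSharded` more concrete, here is a small sketch that derives the names from the Sharded Resources table above. The `ShardResourceNames` struct and the `shardResourceNames` function are hypothetical names chosen for this example; only the name patterns come from the PR description.

```go
package main

import "fmt"

// ShardResourceNames groups the per-shard names from the naming table.
// The type itself is an illustrative assumption.
type ShardResourceNames struct {
	StatefulSet  string
	Service      string
	ConfigMap    string
	TLSSecret    string
	ProxyService string
}

// shardResourceNames builds the names for one shard. The "0" segment is the
// hardcoded cluster index; certPrefix is optional and may be empty.
func shardResourceNames(name, shard, certPrefix string) ShardResourceNames {
	base := fmt.Sprintf("%s-search-0-%s", name, shard)
	tlsSecret := base + "-cert"
	if certPrefix != "" {
		tlsSecret = certPrefix + "-" + tlsSecret
	}
	return ShardResourceNames{
		StatefulSet:  base,
		Service:      base + "-svc",
		ConfigMap:    base + "-config",
		TLSSecret:    tlsSecret,
		ProxyService: base + "-proxy-svc",
	}
}

func main() {
	for _, shard := range []string{"shard-0", "shard-1"} {
		fmt.Printf("%+v\n", shardResourceNames("mdb", shard, "certs"))
	}
}
```

Deriving every name from one shared base string keeps the cluster index in a single place, which is presumably what makes the scheme easy to extend once the index stops being hardcoded.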
### MongoDBSearchEnvoyReconciler (New Controller)

A dedicated controller that manages the Envoy proxy infrastructure when `spec.loadBalancer.managed` is set:

- **ReplicaSet**: 1 Envoy Deployment + 1 ConfigMap + 1 ClusterIP proxy Service
- **Sharded**: 1 Envoy Deployment + 1 ConfigMap + N ClusterIP proxy Services (one per shard)

For sharded clusters, all shard routes are multiplexed through a **single Envoy deployment** using SNI-based routing. Each shard gets a dedicated proxy Service that resolves to the same Envoy pods. Envoy reads the SNI hostname from the TLS ClientHello to route traffic to the correct shard's mongot backend.

The controller:

- Watches `MongoDBSearch`, `MongoDB`, and `MongoDBCommunity` resources
- Owns Deployment and ConfigMap resources
- Returns early with `Pending` status if no routes can be configured
- Updates LB sub-status on the MongoDBSearch CR at every return path
- Uses TLS 1.3 exclusively for Envoy-to-mongot connections

### X509 Client Certificate Authentication

When `spec.source.x509` is configured, mongot authenticates to the MongoDB sync source using x509 client certificates instead of username/password:

- **Mutually exclusive** with `spec.source.passwordSecretRef` and `spec.source.username` (enforced by validation)
- **Requires TLS** to be enabled on the sync source
- The operator reads the user-provided Secret (containing `tls.crt`, `tls.key`, and optionally `tls.keyFilePassword`), combines them into an operator-managed Secret (`{name}-x509-client-cert`), and mounts it into the mongot container
- Mongot config is modified to clear username/password fields and set `authSource: $external` with `x509.tlsCertificateKeyFile` pointing to the mounted cert
- For sharded clusters, x509 config is applied to both the ReplicaSet sync source and the Router section
- Optional key password support: if `tls.keyFilePassword` is present in the Secret, it is mounted separately and referenced via `tlsCertificateKeyFilePasswordFile`

The `x509AuthResource` adapter implements the `TLSConfigurableResource` interface, allowing reuse of the existing `tls.EnsureTLSSecret` infrastructure.

### Endpoint Resolution

How `mongod.setParameter.mongotHost` resolves for each topology + LB mode:

| Topology | LB Mode | mongotHost |
|----------|---------|------------|
| RS | None (single replica) | `{name}-search-svc.{ns}.svc.{domain}:27028` |
| RS | Unmanaged | User-provided `spec.loadBalancer.unmanaged.endpoint` |
| RS | Managed | `{name}-search-0-proxy-svc.{ns}.svc.{domain}:27029` |
| Sharded | None | `{name}-search-0-{shard}-svc.{ns}.svc.{domain}:27028` (per shard) |
| Sharded | Unmanaged | Endpoint template with `{shardName}` substituted per shard |
| Sharded | Managed | `{name}-search-0-{shard}-proxy-svc.{ns}.svc.{domain}:27029` (per shard) |

### Sharded Controller Integration

The `mongodbshardedcluster_controller` is extended to watch MongoDBSearch resources via `lookupCorrespondingSearchResource()`. It applies search config to each shard via `applySearchParametersForShards()`:

- Per-shard mongod config points to each shard's own mongot endpoint
- Mongos config gets `mongotHost` pointing to the first shard's endpoint

---

## Test Coverage

### E2E Tests (Python)

| # | Topology | Source | Mongot Count | LB Mode | Test File |
|---|----------|--------|-------------|---------|-----------|
| 1 | RS | External | Single | None | `search_replicaset_external_mongodb_single_mongot.py` |
| 2 | RS | External | Multi | Unmanaged | `search_replicaset_external_mongodb_multi_mongot_unmanaged_lb.py` |
| 3 | RS | Internal | Multi | Unmanaged | `search_replicaset_internal_mongodb_multi_mongot_unmanaged_lb.py` |
| 4 | RS | External | Multi | Managed | `search_replicaset_external_mongodb_multi_mongot_managed_lb.py` |
| 5 | RS | Internal | Multi | Managed | `search_replicaset_internal_mongodb_multi_mongot_managed_lb.py` |
| 6 | RS | External | - | Proxy Svc | `search_replicaset_external_mongodb_proxy_service.py` |
| 7 | RS | - | - | X509 | `search_mongot_replicaset_x509_auth.py` |
| 8 | RS | Community | Multi | Managed | `search_community_auto_embedding_multi_mongot.py` |
| 9 | Sharded | Internal | Single | None | `search_sharded_internal_mongodb_single_mongot.py` |
| 10 | Sharded | External | Single | None | `search_sharded_external_mongodb_single_mongot.py` |
| 11 | Sharded | Internal | Multi | Unmanaged | `search_sharded_internal_mongodb_multi_mongot_unmanaged_lb.py` |
| 12 | Sharded | External | Multi | Unmanaged | `search_sharded_external_mongodb_multi_mongot_unmanaged_lb.py` |
| 13 | Sharded | Internal | Multi | Managed | `search_sharded_internal_mongodb_multi_mongot_managed_lb.py` |
| 14 | Sharded | External (Ent) | Multi | Managed | `search_sharded_enterprise_external_mongod_managed_lb.py` |

### Unit Tests (Go)

| File | Lines | Coverage |
|------|-------|----------|
| `mongodbsearch_reconcile_helper_test.go` | 2,313 | Per-shard mongod/mongos config, LB endpoint resolution, validation |
| `sharded_external_search_source_test.go` | 452 | External sharded source, shard name validation, host lists |
| `enterprise_search_source_test.go` | 373 | Enterprise RS source, error returns (no panic) |
| `envoy_config_builder_test.go` | 318 | Envoy JSON generation, TLS/non-TLS, single/multi route |
| `mongodbsearch_validation_test.go` | 300 | Shard names, X509 auth config |
| `mongodbsearchenvoy_controller_test.go` | 255 | Route building, pod spec, deployment overrides |
| `mongodbsearch_validation_test.go` (api/) | 212 | JVM flags, LB config, endpoint template |
| `search_construction_test.go` | 209 | StatefulSet construction, JVM flags generation |
| `mongodbsearch_types_test.go` | 132 | Resource name generation |

---

## Other Changes

- **Error handling**: `panic()` calls in `HostSeeds()` implementations replaced with `fmt.Errorf` returns across `enterprise_search_source.go`, `external_search_source.go`, `community_search_source.go`
- **Keyfile handling**: Extracted `ensureKeyfileModification()` helper to deduplicate keyfile blocks between `reconcileNonSharded` and `reconcileSharded`
- **Endpoint resolution**: Extracted `mongotEndpointForShard()` helper for consistent endpoint resolution across managed/unmanaged/direct modes
- **File renames**: `sharded_enterprise_search_source.go` renamed to `sharded_internal_search_source.go` (struct `ShardedInternalSearchSource`)
- **Changelog**: Consolidated all search changes into a single entry (`20260318_feature_mongodbsearch_improvements.md`)
- **RBAC**: Added `deployments` permission to `apps` API group for Envoy Deployment management
- **Helm chart**: New `MDB_ENVOY_IMAGE` env var for configuring the Envoy container image
- **Readiness probe**: Added for mongot containers
- **Auto-heap sizing**: JVM `-Xms`/`-Xmx` set to 50% of memory request when not explicitly provided
- **Duplicate function consolidation**: `verify_rs_mongod_parameters` consolidated into `replicaset_search_helper.py`
- **IsShardedEndpoint()**: Extracted helper method for checking sharded endpoint configuration
- **Status updates**: `updateLBStatus()` patches LB sub-status on MongoDBSearch CR at every reconcile return path
- **Test infrastructure**: `SearchDeploymentHelper`, `SearchTester` factories, `search_resource_names` utilities, `ToolsPod` for in-cluster mongorestore

---

## Known Limitations / Follow-ups

- **Envoy log level**: Currently hardcoded to `"info"` -- needs a configurable field in `ManagedLBConfig`
- **`rs_search_helper.py`**: Should be consolidated into `replicaset_search_helper.py` and removed
- **Reconcile loop tests**: `mongodbsearchenvoy_controller_test.go` covers route building and pod spec but lacks full reconcile loop integration tests
- **Test file naming**: Some test files don't follow a consistent naming convention from the test matrix

## Checklist

- [ ] Have you linked a jira ticket and/or is the ticket in the title?
- [ ] Have you checked whether your jira ticket required DOCSP changes?
- [ ] Have you added changelog file?
    - use `skip-changelog` label if not needed
    - refer to [Changelog files and Release Notes](https://github.com/mongodb/mongodb-kubernetes/blob/master/CONTRIBUTING.md#changelog-files-and-release-notes) section in CONTRIBUTING.md for more details

---------

Co-authored-by: Julien Benhaim <julien.benhaim@mongodb.com>
Co-authored-by: Vivek Singh <vsingh.ggits.2010@gmail.com>
Co-authored-by: Julien-Ben <33035980+Julien-Ben@users.noreply.github.com>
Co-authored-by: Anand <13899132+anandsyncs@users.noreply.github.com>
Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: Vivek Singh <vivek.s@mongodb.com>
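The Endpoint Resolution table in the PR #817 description maps each topology and LB mode to a `mongotHost` value. The sketch below encodes that table as a single function. It is only an illustration: the `LBMode` type, the `mongotHost` function, and its parameters are assumptions made for this example and do not reproduce the operator's `mongotEndpointForShard()` helper, whose signature is not shown in the description.

```go
package main

import (
	"fmt"
	"strings"
)

// LBMode is a hypothetical enum for the three modes in the table.
type LBMode int

const (
	LBNone LBMode = iota
	LBUnmanaged
	LBManaged
)

const (
	mongotPort = 27028 // direct connection to the mongot headless Service
	proxyPort  = 27029 // connection through the Envoy proxy Service
)

// mongotHost resolves the endpoint per the table. An empty shard name means
// a replica set; unmanagedEndpoint is used only in unmanaged mode.
func mongotHost(name, ns, domain, shard string, mode LBMode, unmanagedEndpoint string) string {
	sharded := shard != ""
	switch mode {
	case LBUnmanaged:
		if sharded {
			// The user-provided template carries a {shardName} placeholder.
			return strings.ReplaceAll(unmanagedEndpoint, "{shardName}", shard)
		}
		return unmanagedEndpoint
	case LBManaged:
		svc := name + "-search-0-proxy-svc"
		if sharded {
			svc = fmt.Sprintf("%s-search-0-%s-proxy-svc", name, shard)
		}
		return fmt.Sprintf("%s.%s.svc.%s:%d", svc, ns, domain, proxyPort)
	default:
		svc := name + "-search-svc"
		if sharded {
			svc = fmt.Sprintf("%s-search-0-%s-svc", name, shard)
		}
		return fmt.Sprintf("%s.%s.svc.%s:%d", svc, ns, domain, mongotPort)
	}
}

func main() {
	fmt.Println(mongotHost("mdb", "ns", "cluster.local", "", LBManaged, ""))
	fmt.Println(mongotHost("mdb", "ns", "cluster.local", "shard-0", LBUnmanaged, "lb-{shardName}.example.com:27029"))
}
```

Keeping all six cases in one function reflects the stated goal of the extracted helper, namely consistent endpoint resolution across managed, unmanaged, and direct modes.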
…er (#817) <!-- start git-machete generated --> # Based on PR #806 ## Chain of upstream PRs as of 2026-03-23 * PR #806: `master` <- `search/base` * **PR #817 (THIS ONE)**: `search/base` <- `search/multiple-mongot` <!-- end git-machete generated --> # Summary This PR adds support for **multiple mongot instances** per MongoDB source, with L7 load balancing and sharded cluster integration. It builds on `search/base` (#806) and introduces three major capabilities: 1. **Multiple mongot replicas** with managed (Envoy) or unmanaged (BYO) load balancing 2. **Sharded cluster support** for both operator-managed and external MongoDB 3. **X509 client certificate authentication** for mongot-to-MongoDB sync source connections 4. **JVM flags** for mongot process configuration --- ## API/CRD Changes The `MongoDBSearch` CRD gains the following new fields: ```yaml spec: # New: Load balancer configuration (exactly one of managed/unmanaged must be set) loadBalancer: # was: spec.lb with mode enum managed: # operator-deployed Envoy proxy externalHostname: "..." # SNI hostname, supports {shardName} placeholder resourceRequirements: { ... } # Envoy container resources (default: 100m/128Mi req, 500m/512Mi lim) deployment: { ... 
} # Envoy Deployment overrides (same convention as spec.statefulSet) unmanaged: # BYO L7 load balancer endpoint: "lb.example.com:27029" # supports {shardName} placeholder for sharded # New: X509 client certificate auth for sync source (mutually exclusive with username/password) source: x509: clientCertificateSecretRef: name: "mongot-x509-cert" # Secret with tls.crt, tls.key (required), tls.keyFilePassword (optional) # New: JVM flags for mongot jvmFlags: ["-Xms2g", "-Xmx2g"] # New: External sharded cluster source source: external: shardedCluster: router: hosts: ["mongos-0.example.com:27017"] shards: - shardName: "shard-0" hosts: ["shard0-rs0.example.com:27017"] - shardName: "shard-1" hosts: ["shard1-rs0.example.com:27017"] ``` Key structural change: `spec.lb.mode` (Managed/Unmanaged enum) was replaced with `spec.loadBalancer.managed` / `spec.loadBalancer.unmanaged` (mutually exclusive sub-objects, enforced via CEL validation). --- ## Resource Naming Conventions All search resource names include a hardcoded `-0-` cluster index to reserve the naming scheme for future multi-cluster support. 
### ReplicaSet (Non-Sharded) Resources | Resource | Name Pattern | |----------|-------------| | StatefulSet | `{name}-search` | | Headless Service | `{name}-search-svc` | | ConfigMap | `{name}-search-config` | | TLS Secret | `[{prefix}-]{name}-search-0-cert` | | TLS Client Secret | `[{prefix}-]{name}-search-0-client-cert` | | X509 Client Cert Secret | `{name}-x509-client-cert` | | Proxy Service (Envoy) | `{name}-search-0-proxy-svc` | | LB Deployment | `{name}-search-lb-0` | | LB ConfigMap | `{name}-search-lb-0-config` | | LB Server Cert | `[{prefix}-]{name}-search-lb-0-cert` | | LB Client Cert | `[{prefix}-]{name}-search-lb-0-client-cert` | ### Sharded Resources (Per-Shard) | Resource | Name Pattern | |----------|-------------| | StatefulSet | `{name}-search-0-{shard}` | | Headless Service | `{name}-search-0-{shard}-svc` | | ConfigMap | `{name}-search-0-{shard}-config` | | TLS Secret | `[{prefix}-]{name}-search-0-{shard}-cert` | | Proxy Service (Envoy) | `{name}-search-0-{shard}-proxy-svc` | | LB Server Cert (per-shard) | `[{prefix}-]{name}-search-lb-0-{shard}-cert` | --- ## Architecture ### Reconciliation Paths The `MongoDBSearchReconcileHelper` dispatches to: - **`reconcileNonSharded`**: Creates a single set of resources (1 StatefulSet, 1 Service, 1 ConfigMap). All mongot pods connect to the same RS host seeds. - **`reconcileSharded`**: Creates per-shard resources -- one StatefulSet, Service, ConfigMap, and TLS secret per shard. Each shard's mongot connects to that shard's mongod hosts. A `Router` section in the mongot config points to mongos for query routing. 
### MongoDBSearchEnvoyReconciler (New Controller)

A dedicated controller that manages the Envoy proxy infrastructure when `spec.loadBalancer.managed` is set:

- **ReplicaSet**: 1 Envoy Deployment + 1 ConfigMap + 1 ClusterIP proxy Service
- **Sharded**: 1 Envoy Deployment + 1 ConfigMap + N ClusterIP proxy Services (one per shard)

For sharded clusters, all shard routes are multiplexed through a **single Envoy deployment** using SNI-based routing. Each shard gets a dedicated proxy Service that resolves to the same Envoy pods. Envoy reads the SNI hostname from the TLS ClientHello to route traffic to the correct shard's mongot backend.

The controller:

- Watches `MongoDBSearch`, `MongoDB`, and `MongoDBCommunity` resources
- Owns Deployment and ConfigMap resources
- Returns early with `Pending` status if no routes can be configured
- Updates LB sub-status on the MongoDBSearch CR at every return path
- Uses TLS 1.3 exclusively for Envoy-to-mongot connections

### X509 Client Certificate Authentication

When `spec.source.x509` is configured, mongot authenticates to the MongoDB sync source using x509 client certificates instead of username/password:

- **Mutually exclusive** with `spec.source.passwordSecretRef` and `spec.source.username` (enforced by validation)
- **Requires TLS** to be enabled on the sync source
- The operator reads the user-provided Secret (containing `tls.crt`, `tls.key`, and optionally `tls.keyFilePassword`), combines them into an operator-managed Secret (`{name}-x509-client-cert`), and mounts it into the mongot container
- Mongot config is modified to clear username/password fields and set `authSource: $external` with `x509.tlsCertificateKeyFile` pointing to the mounted cert
- For sharded clusters, x509 config is applied to both the ReplicaSet sync source and the Router section
- Optional key password support: if `tls.keyFilePassword` is present in the Secret, it is mounted separately and referenced via `tlsCertificateKeyFilePasswordFile`

The `x509AuthResource` adapter implements the `TLSConfigurableResource` interface, allowing reuse of the existing `tls.EnsureTLSSecret` infrastructure.

### Endpoint Resolution

How `mongod.setParameter.mongotHost` resolves for each topology + LB mode:

| Topology | LB Mode | mongotHost |
|----------|---------|------------|
| RS | None (single replica) | `{name}-search-svc.{ns}.svc.{domain}:27028` |
| RS | Unmanaged | User-provided `spec.loadBalancer.unmanaged.endpoint` |
| RS | Managed | `{name}-search-0-proxy-svc.{ns}.svc.{domain}:27029` |
| Sharded | None | `{name}-search-0-{shard}-svc.{ns}.svc.{domain}:27028` (per shard) |
| Sharded | Unmanaged | Endpoint template with `{shardName}` substituted per shard |
| Sharded | Managed | `{name}-search-0-{shard}-proxy-svc.{ns}.svc.{domain}:27029` (per shard) |

### Sharded Controller Integration

The `mongodbshardedcluster_controller` is extended to watch MongoDBSearch resources via `lookupCorrespondingSearchResource()`. It applies search config to each shard via `applySearchParametersForShards()`:

- Per-shard mongod config points to each shard's own mongot endpoint
- Mongos config gets `mongotHost` pointing to the first shard's endpoint

---

## Test Coverage

### E2E Tests (Python)

| # | Topology | Source | Mongot Count | LB Mode | Test File |
|---|----------|--------|-------------|---------|-----------|
| 1 | RS | External | Single | None | `search_replicaset_external_mongodb_single_mongot.py` |
| 2 | RS | External | Multi | Unmanaged | `search_replicaset_external_mongodb_multi_mongot_unmanaged_lb.py` |
| 3 | RS | Internal | Multi | Unmanaged | `search_replicaset_internal_mongodb_multi_mongot_unmanaged_lb.py` |
| 4 | RS | External | Multi | Managed | `search_replicaset_external_mongodb_multi_mongot_managed_lb.py` |
| 5 | RS | Internal | Multi | Managed | `search_replicaset_internal_mongodb_multi_mongot_managed_lb.py` |
| 6 | RS | External | - | Proxy Svc | `search_replicaset_external_mongodb_proxy_service.py` |
| 7 | RS | - | - | X509 | `search_mongot_replicaset_x509_auth.py` |
| 8 | RS | Community | Multi | Managed | `search_community_auto_embedding_multi_mongot.py` |
| 9 | Sharded | Internal | Single | None | `search_sharded_internal_mongodb_single_mongot.py` |
| 10 | Sharded | External | Single | None | `search_sharded_external_mongodb_single_mongot.py` |
| 11 | Sharded | Internal | Multi | Unmanaged | `search_sharded_internal_mongodb_multi_mongot_unmanaged_lb.py` |
| 12 | Sharded | External | Multi | Unmanaged | `search_sharded_external_mongodb_multi_mongot_unmanaged_lb.py` |
| 13 | Sharded | Internal | Multi | Managed | `search_sharded_internal_mongodb_multi_mongot_managed_lb.py` |
| 14 | Sharded | External (Ent) | Multi | Managed | `search_sharded_enterprise_external_mongod_managed_lb.py` |

### Unit Tests (Go)

| File | Lines | Coverage |
|------|-------|----------|
| `mongodbsearch_reconcile_helper_test.go` | 2,313 | Per-shard mongod/mongos config, LB endpoint resolution, validation |
| `sharded_external_search_source_test.go` | 452 | External sharded source, shard name validation, host lists |
| `enterprise_search_source_test.go` | 373 | Enterprise RS source, error returns (no panic) |
| `envoy_config_builder_test.go` | 318 | Envoy JSON generation, TLS/non-TLS, single/multi route |
| `mongodbsearch_validation_test.go` | 300 | Shard names, X509 auth config |
| `mongodbsearchenvoy_controller_test.go` | 255 | Route building, pod spec, deployment overrides |
| `mongodbsearch_validation_test.go` (api/) | 212 | JVM flags, LB config, endpoint template |
| `search_construction_test.go` | 209 | StatefulSet construction, JVM flags generation |
| `mongodbsearch_types_test.go` | 132 | Resource name generation |

---

## Other Changes

- **Error handling**: `panic()` calls in `HostSeeds()` implementations replaced with `fmt.Errorf` returns across `enterprise_search_source.go`, `external_search_source.go`, `community_search_source.go`
- **Keyfile handling**: Extracted `ensureKeyfileModification()` helper to deduplicate keyfile blocks between `reconcileNonSharded` and `reconcileSharded`
- **Endpoint resolution**: Extracted `mongotEndpointForShard()` helper for consistent endpoint resolution across managed/unmanaged/direct modes
- **File renames**: `sharded_enterprise_search_source.go` renamed to `sharded_internal_search_source.go` (struct `ShardedInternalSearchSource`)
- **Changelog**: Consolidated all search changes into a single entry (`20260318_feature_mongodbsearch_improvements.md`)
- **RBAC**: Added `deployments` permission to `apps` API group for Envoy Deployment management
- **Helm chart**: New `MDB_ENVOY_IMAGE` env var for configuring the Envoy container image
- **Readiness probe** for mongot containers
- **Auto-heap sizing**: JVM `-Xms`/`-Xmx` set to 50% of memory request when not explicitly provided
- **Duplicate function consolidation**: `verify_rs_mongod_parameters` consolidated into `replicaset_search_helper.py`
- **IsShardedEndpoint()**: Extracted helper method for checking sharded endpoint configuration
- **Status updates**: `updateLBStatus()` patches LB sub-status on MongoDBSearch CR at every reconcile return path
- **Test infrastructure**: `SearchDeploymentHelper`, `SearchTester` factories, `search_resource_names` utilities, `ToolsPod` for in-cluster mongorestore

---

## Known Limitations / Follow-ups

- **Envoy log level**: Currently hardcoded to `"info"` -- needs a configurable field in `ManagedLBConfig`
- **`rs_search_helper.py`**: Should be consolidated into `replicaset_search_helper.py` and removed
- **Reconcile loop tests**: `mongodbsearchenvoy_controller_test.go` covers route building and pod spec but lacks full reconcile loop integration tests
- **Test file naming**: Some test files don't follow a consistent naming convention from the test matrix

## Checklist

- [ ] Have you linked a jira ticket and/or is the ticket in the title?
- [ ] Have you checked whether your jira ticket required DOCSP changes?
- [ ] Have you added changelog file?
    - use `skip-changelog` label if not needed
    - refer to [Changelog files and Release Notes](https://github.com/mongodb/mongodb-kubernetes/blob/master/CONTRIBUTING.md#changelog-files-and-release-notes) section in CONTRIBUTING.md for more details

---------

Co-authored-by: Julien Benhaim <julien.benhaim@mongodb.com>
Co-authored-by: Vivek Singh <vsingh.ggits.2010@gmail.com>
Co-authored-by: Julien-Ben <33035980+Julien-Ben@users.noreply.github.com>
Co-authored-by: Anand <13899132+anandsyncs@users.noreply.github.com>
Co-authored-by: Claude Code <noreply@anthropic.com>
Co-authored-by: Vivek Singh <vivek.s@mongodb.com>
