Skip to content

[fix][broker]The partitions of the topic was wrongly created even though this cluster was not allowed to access it#25443

Open
poorbarcode wants to merge 5 commits intoapache:masterfrom
poorbarcode:fix/create_missing_partitions_without_allowed_clusters
Open

[fix][broker]The partitions of the topic was wrongly created even though this cluster was not allowed to access it#25443
poorbarcode wants to merge 5 commits intoapache:masterfrom
poorbarcode:fix/create_missing_partitions_without_allowed_clusters

Conversation

@poorbarcode
Copy link
Copy Markdown
Contributor

Motivation

Env:

  • cluster-1 and cluster-2 share configuration metadata store
  • create a partitioned topic public/default/tp1
    • The ZK node /admin/partitioned-topics/{topic} will be created on the shared configuration metadata store.
    • The topic will enable a binary way replication between cluster-1 and cluster-2
  • Remove cluster-2 from the topic-level replication policies
    • The partition public/default/tp1-partition-0 will be deleted automatically.

Issue: call pulsar-admin topics create-missed-partitions <topic>, the partition public/default/tp1-partition-0 was loaded up again.

Modifications

Fix the issue

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@poorbarcode poorbarcode added this to the 5.0.0 milestone Mar 31, 2026
@poorbarcode poorbarcode self-assigned this Mar 31, 2026
@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Mar 31, 2026
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java Outdated
@poorbarcode
Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@poorbarcode poorbarcode force-pushed the fix/create_missing_partitions_without_allowed_clusters branch from d3ca28b to 877b734 Compare April 22, 2026 02:40
@poorbarcode poorbarcode requested a review from Denovo1998 April 23, 2026 13:34
@lhotari lhotari modified the milestones: 5.0.0, 5.0.0-M1 Apr 30, 2026
Copy link
Copy Markdown
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, left a few comments.

}

/**
* @return Triple [namespace policies, global topic policies, topic policies].
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale copy-paste from getCombinedTopicPolicies — this method returns CompletableFuture<Boolean>, not a Triple.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good pointer, I fixed it

Optional<TopicPolicies> globalTopicP = triple.getMiddle();
Optional<Policies> nsPolicies = triple.getLeft();
// Disabled a cluster for a namespace manually.
if (nsPolicies.isPresent() && !isCurrentClusterAllowed(topicName.getNamespaceObject(), nsPolicies.get())) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible regression: isCurrentClusterAllowed(NamespaceName, Policies) returns false when namespace replication_clusters excludes the current cluster (with empty allowed_clusters). But topic-level replicationClusters is supposed to override namespace-level defaults — e.g. ns replication_clusters=[c1], topic replicationClusters=[c1,c2] should be allowed on c2, but this short-circuit returns false.

Suggest only short-circuiting on the allowed_clusters hard gate (PIP-321), then deferring to topic-level checks before falling back to ns replication_clusters.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible regression: isCurrentClusterAllowed(NamespaceName, Policies) returns false when namespace replication_clusters excludes the current cluster (with empty allowed_clusters). But topic-level replicationClusters is supposed to override namespace-level defaults — e.g. ns replication_clusters=[c1], topic replicationClusters=[c1,c2] should be allowed on c2, but this short-circuit returns false.

This situation does not exist. Let's review all possible use cases:

  1. Both clusters use a shared metadata store
    a. The topic will never be allowed to be accessed by the current cluster, even if the topic-level replication policy contains {current cluster}.
    b. The PIP-321 Introduce allowed-cluster at the namespace level introduces concept namespace level allowed_clusters to let clusters that using shared metadata store can enable topic level replication.
  2. Both clusters use their own metadata store
    a. All namespace-level replication_clutsers will contain the current cluster, and namespace-level replication cluster can not be set to a empty value.
    b. No one will remove current cluster from namespace-level replication_clusters, which is meaningless

@poorbarcode poorbarcode dismissed Denovo1998’s stale review May 6, 2026 14:56

I have ansered him for a long time, he did not reply

@poorbarcode poorbarcode force-pushed the fix/create_missing_partitions_without_allowed_clusters branch from 86d621f to 8d9cabc Compare May 6, 2026 15:21
@poorbarcode poorbarcode requested a review from codelipenghui May 6, 2026 15:25
@poorbarcode poorbarcode closed this May 6, 2026
@poorbarcode poorbarcode reopened this May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants