Description
After the merging of #200 and #213, rebalancing a topology no longer does anything. When a rebalance happens there are no offers from which slots can be created, unless other topologies also happen to need assignments at the same time.
This is a result of the way that Nimbus handles the TopologiesMissingAssignments component. A quick rundown of what now happens:
- `storm-mesos` does scheduling of topologies until no topologies need assignments
- since no topologies need assignments, offers are suppressed
- `storm-mesos` doesn't do anything in `MesosNimbus` because no topologies need assignments (and offers are already suppressed)
- a rebalance command comes in and is registered by Nimbus, and a `:do-rebalance` event is scheduled some number of seconds in the future
- that number of seconds later there is finally a topology that needs assignment (i.e. the one that was just rebalanced), but there are no offers buffered
- since there are no offers buffered and there are topologies needing assignments, offers are revived
- `allSlotsAvailableForScheduling` returns after reviving offers
- `Nimbus` wants slots for the rebalancing topology immediately, and there's no time for offers to come in and be used in the next `allSlotsAvailableForScheduling` call
- since there are no slots available for the workers to be rescheduled onto, they don't get rescheduled and the rebalance therefore does nothing
Notably, if there are other topologies needing assignments at the same time as the :do-rebalance is executed, then the rebalance should work as expected.
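To make the timing problem concrete, here is a small toy model of the sequence described above. It is only a sketch: `ToyMesosNimbus`, `offers_arrive`, `do_rebalance`, and the two scenario functions are hypothetical names made up for illustration, and the real `MesosNimbus` logic is more involved. The one assumption the model encodes is the one from the rundown: when no offers are buffered, the call revives offers and returns right away with no slots.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMesosNimbus:
    """Hypothetical stand-in for the scheduler side; not the real MesosNimbus."""
    buffered_offers: list = field(default_factory=list)
    offers_suppressed: bool = True  # idle cluster: offers already suppressed

    def all_slots_available_for_scheduling(self, topologies_missing_assignments):
        if not topologies_missing_assignments:
            # Nothing needs assignment, so offers stay (or become) suppressed.
            self.offers_suppressed = True
            return []
        if not self.buffered_offers:
            # Revive offers, but return immediately: revived offers only
            # arrive asynchronously, after this call has already returned.
            self.offers_suppressed = False
            return []
        slots = [f"slot-from-{offer}" for offer in self.buffered_offers]
        self.buffered_offers.clear()
        return slots

    def offers_arrive(self, offers):
        # Offers are only delivered while they are not suppressed.
        if not self.offers_suppressed:
            self.buffered_offers.extend(offers)


def do_rebalance(nimbus, topology):
    """Model of the single slot request made when :do-rebalance fires."""
    slots = nimbus.all_slots_available_for_scheduling([topology])
    return bool(slots)  # True only if workers had somewhere to go


def idle_cluster_rebalance():
    nimbus = ToyMesosNimbus()
    rescheduled = do_rebalance(nimbus, "mytopology")
    nimbus.offers_arrive(["offer-1"])  # arrives too late for the rebalance
    return rescheduled


def busy_cluster_rebalance():
    nimbus = ToyMesosNimbus()
    # Another topology needed assignment earlier, so offers were revived...
    nimbus.all_slots_available_for_scheduling(["other-topology"])
    nimbus.offers_arrive(["offer-1"])  # ...and are buffered by now.
    return do_rebalance(nimbus, "mytopology")


if __name__ == "__main__":
    print("idle cluster rescheduled:", idle_cluster_rebalance())
    print("busy cluster rescheduled:", busy_cluster_rebalance())
```

In this model the idle-cluster rebalance finds no slots, while the rebalance that follows another topology's scheduling request does, which matches the behavior reported here.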
Note that this refers only to the Storm UI "Rebalance" action and its associated command. I have not tested this with the type of rebalance mentioned in the Storm documentation:
    ## Reconfigure the topology "mytopology" to use 5 worker processes,
    ## the spout "blue-spout" to use 3 executors and
    ## the bolt "yellow-bolt" to use 10 executors.
    $ storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10
However, I fully expect that it hits the same logic in Nimbus, so the same behavior (or something similar) should happen that way too.