NHC Helpers vs. Unknown Slurm States #126
Open
Labels: dev-onboarding? (Identifies Issues that might make a good starting point for new devs (quick/simple/self-contained)), enhancement, usability (Confusing, strange, misleading, or otherwise problematic UX)
nhc/helpers/node-mark-offline
Line 88 in 375e7e0
nhc/helpers/node-mark-online
Line 81 in 375e7e0
At present, the handling of unknown node states in Slurm is somewhat undefined/unspecified, but it shouldn't be. (It just echoes a message and continues with whatever comes next.) The user should be able to control whether NHC considers unknown states to be errors or whether they should be ignored.

What to do? Add either `NHC_IGNORE_UNKNOWN_STATE` or `NHC_FAIL_UNKNOWN_STATE` as a new config variable (preferably one or the other, not both) to allow the helpers to online/offline a node even if the node's state isn't recognized as valid.

For a solid, production-quality, commercially supported product, Slurm is still innovating at a fairly rapid pace. Since that frequently involves adding new node states, I think being more explicit and giving the user control over this behavior would improve usability.
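
To make the proposal concrete, here is a rough sketch of how a helper's fall-through branch might consult such a variable. This is not the existing helper code: the function name `handle_node_state`, the short list of "known" state patterns, and the choice of `NHC_FAIL_UNKNOWN_STATE` (defaulting to off) are all assumptions made for illustration only.

```shell
# Hypothetical sketch; NOT the actual node-mark-offline/node-mark-online logic.
# NHC_FAIL_UNKNOWN_STATE is the variable proposed in this issue.
# handle_node_state and the state patterns below are illustrative assumptions.

handle_node_state() {
    local state="$1"
    case "$state" in
        idle*|alloc*|mix*|down*|drain*)
            # A state the helper recognizes; normal handling would go here.
            echo "known state: $state"
            return 0
            ;;
        *)
            # Unrecognized state: let the user decide whether this is an error.
            if [ "${NHC_FAIL_UNKNOWN_STATE:-0}" = "1" ]; then
                echo "ERROR: unrecognized node state \"$state\"" >&2
                return 1
            fi
            echo "Ignoring unrecognized node state \"$state\""
            return 0
            ;;
    esac
}
```

With the variable unset, an unrecognized state is reported and ignored (return status 0); with `NHC_FAIL_UNKNOWN_STATE=1`, the same state yields a non-zero status that the caller could treat as a failure. Using a single variable keeps the two behaviors mutually exclusive, in line with the "one or the other, not both" preference above.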