Action chains implementation
This page describes the implementation of action chains for regular Salt minions and for SSH minions.
The action chain implementation for Salt uses the same database tables used by the traditional clients.
    +------------------------------+                    +----------------------------------------+
    |        rhnActionChain        |                    |          rhnActionChainEntry           |
    +------------------------------+                    +----------------------------------------+
    | id                           |                    | actionchain_id -> rhnActionChain(id)   |
    | label                        | 1                * | action_id -> rhnAction(id)             |
    | user_id -> web_contact(id)   |<-------------------| server_id -> rhnServer(id)             |
    | created                      |                    | sort_order                             |
    | modified                     |                    | created                                |
    |                              |                    | modified                               |
    +------------------------------+                    +----------------------------------------+
An action chain has one or more entries. Each entry points to an Action and to a Server entity. A separate Action is created for each server, even if the actions are added from SSM and target multiple servers. The Action doesn't have any corresponding ServerAction until it's executed. The target server for the action is stored in the ActionChainEntry.
The entries are ordered according to the field sort_order.
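The relationship between the two tables can be sketched in a few lines of Python. This is only an illustration of the data model described above, not the actual Java entities: the class and method names are hypothetical, the timestamp columns are omitted, and it is an assumption that entries of different servers share the same sort_order values.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ActionChainEntry:
    """One row of rhnActionChainEntry (created/modified omitted)."""
    action_id: int
    server_id: int
    sort_order: int


@dataclass
class ActionChain:
    """One row of rhnActionChain together with its entries."""
    id: int
    label: str
    user_id: int
    entries: List[ActionChainEntry] = field(default_factory=list)

    def actions_by_server(self) -> Dict[int, List[int]]:
        """Return the action ids of each server, ordered by sort_order."""
        result: Dict[int, List[int]] = {}
        for entry in sorted(self.entries, key=lambda e: e.sort_order):
            result.setdefault(entry.server_id, []).append(entry.action_id)
        return result


# Two steps added from SSM for two servers: one Action per server and step.
chain = ActionChain(id=45, label="example chain", user_id=1, entries=[
    ActionChainEntry(action_id=12, server_id=1001, sort_order=1),
    ActionChainEntry(action_id=10, server_id=1000, sort_order=0),
    ActionChainEntry(action_id=11, server_id=1001, sort_order=0),
    ActionChainEntry(action_id=13, server_id=1000, sort_order=1),
])
print(chain.actions_by_server())
```

Grouping by server mirrors the fact that the target server lives in the entry, not in the Action.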
When the action chain is executed by Taskomatic:
- ServerActions are created to store the result of the execution.
- For each minion an .sls file is generated that contains states for each action in the chain. This .sls file is then applied to the target minion by a custom module, mgractionchains. Under the hood this module does a state.sls <generated sls>.
- The action chain is deleted from the database.
The steps are the following:
1. The user creates the action chain by adding one or more actions to it. ActionChainEntry objects are stored in the database, one for each target server. No ServerActions are created yet.
2. The user schedules the execution at a certain date and time.
3. Taskomatic executes the action chain at the scheduled time by invoking MinionActionChainExecutor:

   1. For each minion, the ServerAction and LocalCall objects are created.

   2. The LocalCall objects are converted into SaltState objects. Reboot actions are converted into SaltSystemReboot objects, while all other actions are converted into SaltModuleRun objects.

   3. For salt-ssh minions, include additional files in the state tarball that gets copied over to the SSH minion.

      By default, salt-ssh tries to figure out which files to include in the state tarball besides the .sls to be applied. However, Uyuni typically uses mgrcompat.module_run + state.apply to apply the .sls file that corresponds to a particular action type. E.g.:

          mgr_actionchain_131_action_1_chunk_1:
            mgrcompat.module_run:
              - name: state.apply
              - mods: remotecommands
              . . .

      Salt can't inspect a state applied this way for things like includes, salt://... URLs, etc. Therefore any additional files referenced by the state applied via mgrcompat.module_run + state.apply need to be included explicitly.

   4. The SaltState objects are rendered into .sls files. The resulting .sls files contain one state per Action.

      If one of the actions is a reboot action or a package upgrade action that touches the salt-minion package, the resulting .sls is split into two or more chunks, depending on the number of reboot/package upgrade actions.

      The resulting files are written to /srv/susemanager/salt/actionchains/. The filename has the format actionchain_<CHAIN_ID>_<MACHINE_ID>_<CHUNK>.sls

      Example 1: An action on two minions is added from SSM to an action chain. The resulting files are:

          /srv/susemanager/salt/actionchains/actionchain_45_df00a3b56f3aa159746b8c835eaaeede_1.sls
          /srv/susemanager/salt/actionchains/actionchain_45_dec27e1bc3cffa7749c965c15eaca15c_1.sls

      Example 2: A simple action chain:

      - Action 1: Run script
      - Action 2: Apply highstate

      is rendered to:

          mgr_actionchain_131_action_1_chunk_1:
            mgrcompat.module_run:
              - name: state.apply
              - mods: remotecommands
              - kwargs:
                  pillar:
                    mgr_remote_cmd_runas: joe
                    mgr_remote_cmd_script: salt://scripts/script_1.sh
          mgr_actionchain_131_action_2_chunk_1:
            mgrcompat.module_run:
              - name: state.apply
              - require:
                - mgrcompat: mgr_actionchain_131_action_1_chunk_1

      Example 3: An action chain containing a reboot action:

      - Action 1: Run script
      - Action 2: Reboot
      - Action 3: Apply highstate

      is rendered to:

      Chunk 1 (actionchain_45_df00a3b56f3aa159746b8c835eaaeede_1.sls):

          mgr_actionchain_131_action_1_chunk_1:
            mgrcompat.module_run:
              - name: state.apply
              - mods: remotecommands
              - kwargs:
                  pillar:
                    mgr_remote_cmd_runas: foobar
                    mgr_remote_cmd_script: salt://scripts/script_1.sh
          mgr_actionchain_131_action_2_chunk_1:
            mgrcompat.module_run:
              - name: system.reboot
              - at_time: 1
              - require:
                - mgrcompat: mgr_actionchain_131_action_1_chunk_1
          schedule_next_chunk:
            mgrcompat.module_run:
              - name: mgractionchains.next
              - actionchain_id: 131
              - chunk: 2
              - next_action_id: 3
              - require:
                - mgrcompat: mgr_actionchain_131_action_2_chunk_1

      Chunk 2 (actionchain_45_df00a3b56f3aa159746b8c835eaaeede_2.sls):

          mgr_actionchain_131_action_3_chunk_2:
            mgrcompat.module_run:
              - name: state.apply
   5. Split the minions into regular minions and salt-ssh minions.

   6. Invoke Salt to execute the action chain by calling the custom module mgractionchains.start. This calculates the name of the chunk based on the action chain id and the machine id of the minion and does a state.sls <CHUNK> to apply the file generated in step 4.

      For regular minions the execution happens asynchronously.

      For salt-ssh minions the execution is synchronous.

   7. The action chain is deleted from the database.

      For regular minions this happens once all the Salt calls are made. There is no waiting for the Salt jobs to return because regular minions operate asynchronously.

      For SSH minions the delete happens once all the salt-ssh calls complete and the response is returned. This is because the salt-api works synchronously for SSH minions.
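The chunking and naming rules from the rendering step can be illustrated with a small Python sketch. It is not the MinionActionChainExecutor code (which is Java); the function names are hypothetical, only reboots split the chain here (the salt-minion upgrade case is left out), and it is an assumption that a reboot at the very end of a chain does not open an empty chunk.

```python
from typing import List, Tuple

# One action of a chain for a single minion: (action id, action type).
Action = Tuple[int, str]


def split_into_chunks(actions: List[Action]) -> List[List[Action]]:
    """Start a new chunk after every reboot that is not the last action.

    Simplification: only reboots split the chain. For regular minions the
    real implementation also splits on upgrades of the salt-minion package.
    """
    chunks: List[List[Action]] = [[]]
    for index, action in enumerate(actions):
        chunks[-1].append(action)
        if action[1] == "reboot" and index < len(actions) - 1:
            chunks.append([])
    return chunks


def chunk_filename(chain_id: int, machine_id: str, chunk: int) -> str:
    """File name of one chunk, following the documented format."""
    return f"actionchain_{chain_id}_{machine_id}_{chunk}.sls"


def state_ids(chain_id: int, chunks: List[List[Action]]) -> List[List[str]]:
    """State ids per chunk; chunks are numbered from 1 as in the examples."""
    return [
        [f"mgr_actionchain_{chain_id}_action_{action_id}_chunk_{number}"
         for action_id, _ in chunk]
        for number, chunk in enumerate(chunks, start=1)
    ]


# The chain from Example 3: run script, reboot, apply highstate.
chain = [(1, "script"), (2, "reboot"), (3, "highstate")]
chunks = split_into_chunks(chain)
ids = state_ids(131, chunks)
files = [chunk_filename(131, "df00a3b56f3aa159746b8c835eaaeede", n)
         for n in range(1, len(chunks) + 1)]
for name, chunk_ids in zip(files, ids):
    print(name, chunk_ids)
```

The reboot stays in the first chunk, so the state that schedules the next chunk can require it, as in Example 3.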
As mentioned above, the generated .sls file is split into multiple chunks if the action chain contains reboot actions or, for regular minions, upgrade actions that affect the salt-minion package.
The resume mechanism is different for regular minions and for SSH minions.
For regular minions:

1. The first chunk is executed/applied by mgractionchains.start.

2. If there are multiple chunks, the next chunk to be applied is saved in a local file on the minion (in /etc/salt/minion.d/_mgractionchains.conf). This is done by adding a call to mgractionchains.next as the last state in the .sls file of the chunk:

       ...
       schedule_next_chunk:
         mgrcompat.module_run:
           - name: mgractionchains.next
           - actionchain_id: <CHAIN_ID>
           - chunk: <NEXT_CHUNK>
           - next_action_id: <FIRST_ACTION_ID_IN_THE_NEXT_CHUNK>
           - require:
             - mgrcompat: <LAST_ACTION_IN_THIS_CHUNK>
3. After a reboot or salt-minion package upgrade, the minion service is started and the minion/start event is fired. This triggers a reactor on the master which calls the mgractionchains.resume module on the minion.

   The mgractionchains.resume module reads the file /etc/salt/minion.d/_mgractionchains.conf to get the <NEXT_CHUNK>, deletes the file and then does a state.sls <NEXT_CHUNK>.

   The reactor is configured in /etc/salt/master.d/susemanager.conf:

       ...
       reactor:
         - 'salt/minion/*/start':
           - /usr/share/susemanager/reactor/resume_action_chain.sls
       ...

   If the next chunk again contains a reboot, steps 2 and 3 are repeated.
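The save-then-resume handshake for regular minions can be sketched as follows. This is a simplified model, not the mgractionchains module itself: the function names are hypothetical, the JSON file format is an assumption (the text above does not describe the real format of _mgractionchains.conf), and a temporary directory stands in for /etc/salt/minion.d.

```python
import json
import os
import tempfile
from typing import Optional


def save_next_chunk(conf_path: str, actionchain_id: int, chunk: int,
                    next_action_id: int) -> None:
    """Remember which chunk to apply once the minion is back (cf. next)."""
    with open(conf_path, "w") as handle:
        json.dump({"actionchain_id": actionchain_id,
                   "chunk": chunk,
                   "next_action_id": next_action_id}, handle)


def pop_next_chunk(conf_path: str) -> Optional[dict]:
    """Read and delete the pending entry (cf. resume); None if nothing is pending."""
    if not os.path.exists(conf_path):
        return None
    with open(conf_path) as handle:
        pending = json.load(handle)
    # Deleting the file before applying the chunk prevents a second resume.
    os.remove(conf_path)
    return pending


with tempfile.TemporaryDirectory() as minion_d:
    conf = os.path.join(minion_d, "_mgractionchains.conf")
    save_next_chunk(conf, actionchain_id=131, chunk=2, next_action_id=3)
    first = pop_next_chunk(conf)   # what a resume after minion/start would see
    second = pop_next_chunk(conf)  # nothing left to resume
print(first, second)
```

Because the file is removed as soon as it is read, a later minion/start event with no pending chunk becomes a no-op in this model.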
For SSH minions, the generated .sls is split into chunks only if there's a reboot action present in the chain. Upgrading the salt-minion package doesn't affect an SSH minion, so there's no need to split in that case.
1. Check first if there are any pending action chains to be resumed by calling mgractionchains.get_pending_resume.
2. If there's no pending action chain to be resumed, execute the first chunk by calling mgractionchains.start synchronously. This may trigger a reboot. If there's already an action chain to be resumed on a minion, set the ServerActions of that minion to failed to avoid concurrent execution.
3. Handle the execution result and update the corresponding ServerActions. The reboot action is set to STATUS_PICKED_UP.
4. The SSHPushWorkerSalt periodically checks for SSH minions that have any of these:
   - reboot actions older than 4 minutes
   - queued actions that have a completed reboot action as prerequisite
5. Call mgractionchains.get_pending_resume on each of the minions found in the previous step to find out which action chain and chunk to resume.
6. Call actionchains.resumessh synchronously and handle the results, updating the corresponding ServerActions.
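The selection in step 4 boils down to a predicate over the pending actions of each SSH minion. The sketch below is hypothetical throughout: the PendingAction type and its fields are invented for illustration, and measuring the age of a reboot action from its pickup time is an assumption, since the text only says "older than 4 minutes".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

REBOOT_AGE_THRESHOLD = timedelta(minutes=4)


@dataclass
class PendingAction:
    """Hypothetical view of one action of an SSH minion."""
    minion_id: str
    is_reboot: bool
    status: str                              # e.g. "QUEUED" or "PICKED_UP"
    pickup_time: Optional[datetime] = None   # assumption: age is measured from here
    prerequisite_reboot_completed: bool = False


def should_check_for_resume(action: PendingAction, now: datetime) -> bool:
    """True if the minion of this action is a candidate for resuming."""
    if action.is_reboot:
        return (action.pickup_time is not None
                and now - action.pickup_time > REBOOT_AGE_THRESHOLD)
    return action.status == "QUEUED" and action.prerequisite_reboot_completed


now = datetime(2024, 1, 1, 12, 0)
actions = [
    PendingAction("ssh-1", True, "PICKED_UP", pickup_time=now - timedelta(minutes=10)),
    PendingAction("ssh-2", True, "PICKED_UP", pickup_time=now - timedelta(minutes=1)),
    PendingAction("ssh-3", False, "QUEUED", prerequisite_reboot_completed=True),
    PendingAction("ssh-4", False, "QUEUED"),
]
to_check = sorted({a.minion_id for a in actions if should_check_for_resume(a, now)})
print(to_check)
```

Only the minions returned here would then be asked for their pending chunk in step 5.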
Independent of steps 4-6, the SSHPushWorkerSalt always updates system information like the kernel version and the uptime. It also sets reboot actions to STATUS_COMPLETED if one of the following holds:
- the ServerAction is in STATUS_PICKED_UP and the boot time is after the action pickup time
- the ServerAction is in STATUS_PICKED_UP but the pickup time is missing, and the boot time is after the scheduled time of the action (Action.earliestAction)
- the action is in STATUS_QUEUED and the boot time is after Action.earliestAction
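The three conditions above can be condensed into one small function. This is a sketch of the decision rule only, not the SSHPushWorkerSalt code; the function name, the parameter names and the string constants are hypothetical stand-ins for the Java fields and status objects.

```python
from datetime import datetime
from typing import Optional

STATUS_QUEUED = "STATUS_QUEUED"
STATUS_PICKED_UP = "STATUS_PICKED_UP"
STATUS_COMPLETED = "STATUS_COMPLETED"


def reboot_completed(status: str, boot_time: datetime,
                     earliest_action: datetime,
                     pickup_time: Optional[datetime] = None) -> bool:
    """Decide whether a reboot action can be set to STATUS_COMPLETED."""
    if status == STATUS_PICKED_UP:
        # Fall back to the scheduled time when the pickup time is missing.
        reference = pickup_time if pickup_time is not None else earliest_action
        return boot_time > reference
    if status == STATUS_QUEUED:
        return boot_time > earliest_action
    return False


scheduled = datetime(2024, 1, 1, 12, 0)
picked_up = datetime(2024, 1, 1, 12, 1)
booted = datetime(2024, 1, 1, 12, 3)

print(reboot_completed(STATUS_PICKED_UP, booted, scheduled, picked_up))
print(reboot_completed(STATUS_PICKED_UP, scheduled, scheduled, picked_up))
print(reboot_completed(STATUS_QUEUED, booted, scheduled))
```

In all three cases the rule is the same comparison: the system must have booted after the point in time at which the reboot was requested.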
For regular minions, the generated .sls files are removed after the job return event is processed.
For SSH minions, the generated .sls files are removed after the synchronous calls finish and the result is processed.
Possible future improvements:

- Asynchronous execution for SSH minions. One option would be to create a new job type to execute mgractionchains.start and to schedule a job execution for each SSH minion instead of making a synchronous call to the salt-api. A similar approach is used for executing regular actions on SSH minions: instead of making a sync call, an ssh-minion-action-executor job is scheduled for each SSH minion, and the number of parallel jobs is controlled with the taskomatic.com.redhat.rhn.taskomatic.task.SSHMinionActionExecutor.parallel_threads parameter.

- Improve error handling. See issue https://github.com/SUSE/spacewalk/issues/12826.

  One solution would be to keep the action chain in the DB but add a flag to the rhnActionChain table and set this flag to true for the action chains that have already been started. This way only the action chains not yet scheduled can be shown in the UI while allowing for better error handling.

- Rename the SSHPushWorkerSalt to something more appropriate (see issue https://github.com/SUSE/spacewalk/issues/12914).