A group effort on getting the Precipitation Runoff Modeling Simulation to run on Pegasus. We have attempted to execute a prms simulation as a simple process workflow on pegasus.
Important directory/files for any modeling simulation:
control/- contains the control files for the said simulationinput/- contains all input files for the said simulationoutput/- contains all output files for the said simulationdaxgen.py- python DAX generatorgenerate_dax.py- shell script that adds Pegasus python library to path and generates DAX usingdaxgen.pyplan_dax.py- plans the workflow and submits it to the right specified job pool, condorpool.rc.txt- replication control file. Contains the fully qualified paths for files used in the context of the workflow.tc.txt- transformation control file. Contains the paths to executables, along with architecture requirements of executables used in the context of the workflow.sites.xml- specifies, the directories to be used for scratch, or directories to write output to. etc.pegasus.properties- Tells Pegasus where to find rc.txt, tc.txt and other important config files.
-
Step 1: Generate dax (Direct-Acylic Graph XML) for workflow using command:
./generate_dax.sh <name_of_daxFile> <name_of_controlFile> [name_of_outputFile]. -
Step 2: Edit
tc.txtto reflect the new fully qualified path for the prms executable. -
Step 3: Edit
rc.txtto reflect the fully qualified path of input control file -
Step 4: Edit
sites.xmlto reflect correct fully qualified scratch and output space -
Step 5: Run
./plan_dax <name_of_daxFile>to submit workflow to HTCondor
You would need to create a tunnel to achieve this, since we only have remote access to the server running pegasus. Recall that Pegasus is running on dewey.cs.pdx.edu
-
Step 1: create a tunnel like so
ssh -L <sourcePort>:127.0.0.1:<destinationPort> <user>@dewey.cs.pdx.edu -
Step 2: Start up a pegasus service using
pegasus-service. This starts the service on port 5000. So destination port above should be set to 5000. To start service on a custom port runpegasus-service -p <port-no>. Remember to use the destinationPort as specified in step-1 above. -
Step 3: View dashboard on your machine at
https://localhost:<sourcePort>
For a single workflow run, enter the sagehen-on-pegasus directory and simply execute ./plan_dax prms.dax
For multiple simulations in a workflow (concurrent/parallel runs), enter the sample-with-multiple-simulations directory and run ./plan_dax.sh prms2.dax
Remember to always change the paths in the input files to their fully qualified paths.