Description
Hello,
I've run the pipeline without HTCondor up to the results-processing step. I assume that step is not currently possible outside HTCondor unless I write a custom script that takes the non-HTCondor energize_output and packages it into a database that metl can read.
From my understanding, it's infeasible to generate a good enough training set without parallelizing the computation of Rosetta's energy parameters for all variants. I've set up my own HTCondor instance, to which I can connect a few execute nodes, and I would like to run metl-sim on this cluster. The part I don't understand is: do I really need to upload Rosetta and Python to OSDF/Squid if I'm running the algorithm only on my own machines? Or is there another way (such as adding the Rosetta and Python environments to all execute nodes through my docker-compose)?
I might be wrong, but it seems I would only need to upload to Squid if I were connecting to a highly distributed HTCondor cluster on which I don't have admin privileges, right?
Where in the scripts are the OSDF Python/Rosetta environments accessed? Is there a workaround to skip that step and use a local install instead?