COVID work has slowed down a bit, so now I have time to think about how to better integrate colvars with GPUs and avoid data transfers across the relatively slow PCI bus. As a starting point for discussion, a few parts of colvars, as currently designed, aren't ideal from a GPU perspective:
- Right now, quantities like the atomic coordinates are stored as arrays of rvectors, so the data is stored xyzxyzxyz, etc. On the GPU, the coordinates are stored as independent arrays (xxxyyyzzz), since this better exposes the parallelism on the hardware. rvector seems to be used everywhere within the module, so in the short term it may be worth rearranging the data within the GPU every timestep into an array of `double3` (or `float3`, depending on the `cvm::real` typedef).
- There are some code patterns that would probably vectorize well, but instead need to resort to simple for loop patterns because `std::transform` and their `thrust` equivalents are verboten until VMD allows us to use C++11.
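To make the rearrangement in the first point concrete, here is a minimal host-side sketch of the index mapping, written as a plain loop so it stays within pre-C++11 constraints. The names `real_t`, `vec3`, and `soa_to_aos` are hypothetical stand-ins (not colvars or VMD identifiers), and `real_t` is assumed to be double precision. On the GPU, the same mapping would presumably run as one thread per atom writing into a `double3` or `float3` array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the cvm::real typedef (assumed double here).
typedef double real_t;

// Hypothetical stand-in for an rvector-like type: three reals per atom.
struct vec3 {
  real_t x, y, z;
};

// Repack structure-of-arrays coordinates (xxx..., yyy..., zzz...) into an
// array of structures (xyzxyz...). Output element i depends only on input
// index i, so every iteration of the loop is independent of the others.
std::vector<vec3> soa_to_aos(const std::vector<real_t> &x,
                             const std::vector<real_t> &y,
                             const std::vector<real_t> &z) {
  assert(x.size() == y.size() && y.size() == z.size());
  std::vector<vec3> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i].x = x[i];
    out[i].y = y[i];
    out[i].z = z[i];
  }
  return out;
}
```

Because each iteration touches only index `i`, this is the kind of loop that maps directly onto a per-atom kernel, or onto a transform-style call once C++11 is available.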
My inclination is to use as much of the existing code base as is practical, at the expense of potential GPU performance, so long as no memory needs to move across the PCI bus. This is still in the early stages, so it would be useful to gather feedback. Thoughts and opinions welcome!