It looks like the |
@HanatoK I made a quick edit about LAMMPS to your message at the top (which we could probably keep editing to keep the information organized, since you have a great starting point there?)

Regarding the point on the atom groups: the main idea behind #655 (see also your comment here) was to allow sharing the atomic coordinate buffers between different Beyond that, the longer-term "plan" was less about improving the data structure of the atom groups, and more about trying to have the CVCs be more agnostic to the details of that data structure. This was the goal of the second point of #655, a good chunk of which you have also implemented in #788. Ideally, some of the member functions of the CVCs could become templates that are instantiated differently in each scenario (sequential, shared memory, domain decomposition).

I had originally thought that major refactoring would be required to run all CVCs efficiently, but #783 shows that this is probably not needed for every feature. At this point, it would definitely make sense to have separate implementations of |
After some explorations and micro-benchmarks, I think it would be better to:
In addition, for the time being, it is difficult to make a base class for different implementations of |
This is a draft plan to continue #652 and #655.
`colvarproxy`

MD engines

Investigate how Colvars interoperates with GROMACS, LAMMPS and Tinker-HP in case of the GPU-resident mode.

LAMMPS support

LAMMPS uses GPUs primarily in two ways (see https://docs.lammps.org/Speed_packages.html):

- `GPU` package, which supports offload;
- `KOKKOS` package, which is "GPU-resident" but also uses more abstract syntax; KOKKOS should be interoperable with the underlying languages (CUDA, HIP, SYCL, ...) but probably not all their specialized features.

GPU buffers

- Move `atoms_masses`, `atoms_charge`, `atoms_positions`, `atoms_total_forces` and `atoms_new_colvar_forces` to the subclasses of `colvarproxy`, and allocate device memory if a subclass of `colvarproxy` supports the GPU-resident mode.

Stream/Queue management

- `colvarproxy_gpu` class to create, synchronize and delete the streams (CUDA and HIP) or queues (SYCL).

`colvarmodule`

- `smp gpu`.

`cvm::atom_group`

GPU buffers

- `atoms_pos`, `atoms_charge`, `atoms_vel`, `atoms_mass`, `atoms_grad`, `atoms_total_force` and `atoms_weight` on device memory;
- `read_positions`, `read_velocities` and `read_total_forces` on GPU.

GPU kernels of atom-group calculations

Basically we need to implement everything in `calc_required_properties` with GPU kernels:

- `calc_center_of_mass` on GPU;
- `calc_center_of_geometry` on GPU;
- `calc_apply_roto_translation` on GPU;
- `colvarmodule::rotation` on GPU;
- `calc_optimal_rotation_soa` on GPU.

Question: should we have a separate `cvm::atom_group_base` for the CPU and GPU implementations?

`colvar::cvc`

GPU kernels for CVCs

- `calc_value_gpu` and `calc_gradients_gpu` for all CVCs;
- If `smp gpu` is used, then `calc_value_gpu` and `calc_gradients_gpu` will be called.

Tests

- `run_colvars_test.cpp` on GPU;
- `colvarproxy_stub_gpu` on GPU.
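
To make the `colvar::cvc` item of the plan a bit more concrete, here is a minimal host-only sketch of how the `smp gpu` setting could select between the CPU and GPU entry points. Apart from the four method names taken from the plan (`calc_value`, `calc_gradients`, `calc_value_gpu`, `calc_gradients_gpu`), everything here is hypothetical: `smp_mode`, `cvc_sketch`, `ported_cvc`, `unported_cvc` and `calc_cvc` are illustration names, not Colvars classes, and the "GPU" functions only record that they were called. The fallback from the GPU entry points to the CPU ones is an assumption of this sketch, not something the plan prescribes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the user-selected SMP mode ("smp gpu" in the plan).
enum class smp_mode { cpu, gpu };

// Hypothetical, heavily simplified CVC interface (not the real colvar::cvc).
struct cvc_sketch {
  std::vector<std::string> log;  // records which entry points were called
  virtual ~cvc_sketch() = default;
  virtual void calc_value() { log.push_back("calc_value"); }
  virtual void calc_gradients() { log.push_back("calc_gradients"); }
  // Assumption of this sketch: GPU entry points default to the CPU code,
  // so that CVCs could be ported one at a time.
  virtual void calc_value_gpu() { calc_value(); }
  virtual void calc_gradients_gpu() { calc_gradients(); }
};

// A CVC that provides its own GPU entry points (here they only log).
struct ported_cvc : cvc_sketch {
  void calc_value_gpu() override { log.push_back("calc_value_gpu"); }
  void calc_gradients_gpu() override { log.push_back("calc_gradients_gpu"); }
};

// A CVC that has not been ported and relies on the fallback.
struct unported_cvc : cvc_sketch {};

// Call the GPU entry points when the GPU mode is selected, else the CPU ones.
inline void calc_cvc(cvc_sketch &cvc, smp_mode mode) {
  if (mode == smp_mode::gpu) {
    cvc.calc_value_gpu();
    cvc.calc_gradients_gpu();
  } else {
    cvc.calc_value();
    cvc.calc_gradients();
  }
}
```

With this layout, calling `calc_cvc(c, smp_mode::gpu)` on a ported CVC reaches the GPU functions, while an unported one quietly runs its CPU code. Whether a silent fallback or an explicit error is preferable for unported CVCs is a separate design choice.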
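
For the test items, one possible approach is to keep a plain CPU reference for each atom-group kernel and compare the GPU result against it within a tolerance. The sketch below does this for `calc_center_of_mass`. It assumes a structure-of-arrays layout in which all x coordinates come first, then all y, then all z. That layout, and the names `vec3`, `center_of_mass_soa` and `close_enough`, are assumptions made for illustration and may not match the actual Colvars buffers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct vec3 { double x, y, z; };

// CPU reference for a mass-weighted center.
// Assumed layout (hypothetical): pos_soa = [x0..x(n-1), y0..y(n-1), z0..z(n-1)].
inline vec3 center_of_mass_soa(const std::vector<double> &pos_soa,
                               const std::vector<double> &mass) {
  const std::size_t n = mass.size();
  assert(pos_soa.size() == 3 * n);
  double sx = 0.0, sy = 0.0, sz = 0.0, total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double m = mass[i];
    sx += m * pos_soa[i];          // x block
    sy += m * pos_soa[n + i];      // y block
    sz += m * pos_soa[2 * n + i];  // z block
    total += m;
  }
  assert(total > 0.0);
  return vec3{sx / total, sy / total, sz / total};
}

// Component-wise comparison, meant for checking a GPU result against the
// CPU reference, since the summation order on a GPU generally differs.
inline bool close_enough(const vec3 &a, const vec3 &b, double tol) {
  return std::fabs(a.x - b.x) <= tol &&
         std::fabs(a.y - b.y) <= tol &&
         std::fabs(a.z - b.z) <= tol;
}
```

A GPU test could then copy the kernel output back to the host and call `close_enough` against this reference, with a tolerance chosen according to the floating-point precision used on the device.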