Parallelize sampling external sources and threadsafe rejection counters #3830
Open
eepeterson wants to merge 11 commits into openmc-dev:develop
- Add SOURCE_SAMPLE_BATCH_SIZE (default 1M = ~104 MB) to bound memory
usage when sampling more sites than the batch size. The function
internally loops over batches, reusing the same cached ctypes buffer.
Seed offsets ensure bitwise-identical results regardless of batch size.
- Value-initialize SourceSite (SourceSite site {}) to zero progeny_id,
parent_id, parent_nuclide fields that are not set during sampling.
Fixes non-deterministic garbage across threads from uninitialized stack.
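
The batching and seed-offset idea in the first bullet can be sketched in Python with only the standard library. This is a toy illustration under stated assumptions, not OpenMC's implementation: `SiteSketch`, `sample_one`, and `sample_sites` are hypothetical names, and `random.Random` stands in for OpenMC's own random number streams. Each sample is seeded from the base seed plus its global index, so the batch size only changes how much buffer memory is held at once, not the values produced.

```python
import ctypes
import random


class SiteSketch(ctypes.Structure):
    """Hypothetical, much smaller stand-in for a source-site struct."""
    _fields_ = [("x", ctypes.c_double), ("E", ctypes.c_double)]


def sample_one(seed, index, site):
    # Seed offset: the stream depends only on the global sample index,
    # never on which batch the sample happens to fall in.
    rng = random.Random(seed + index)
    site.x = rng.uniform(-1.0, 1.0)
    site.E = rng.expovariate(1.0)


def sample_sites(n, seed, batch_size):
    # One buffer of at most batch_size sites, reused for every batch.
    buffer = (SiteSketch * min(n, batch_size))()
    results = []
    for start in range(0, n, batch_size):
        count = min(batch_size, n - start)
        for i in range(count):
            sample_one(seed, start + i, buffer[i])
        # Copy values out before the next batch overwrites the buffer.
        results.extend((buffer[i].x, buffer[i].E) for i in range(count))
    return results
```

Calling `sample_sites(10, 42, 3)` and `sample_sites(10, 42, 10)` should give identical lists, which is the property the commit message describes.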
Description
This PR parallelizes sampling external sources from Python through `openmc.lib` and the C API by adding an optional `threads` argument to the `openmc_sample_external_source` function to batch the work across multiple OpenMP threads if desired. It also provides the option to return results from the `sample_external_source` method as a NumPy structured array with a new dtype that mirrors the `_SourceSite` struct, avoiding the expensive conversion (in both time and memory) to a `ParticleList` of `SourceParticle` objects. Along the way I noticed that the source site rejection and acceptance counters, which we use to determine whether source sampling is too inefficient and should error out, were `static int`, so I made them global-scope atomics and added a `max_source_rejections_per_sample` setting so we can keep a local rejection counter to error out rather than relying solely on the global check.

Performance results for two source examples are shown below. One is a simple isotropic, point, Watt spectrum `IndependentSource` and the other is a custom compiled source that simulates the Frascati Neutron Generator (FNG). I sampled both sources for 100k, 1M, 10M, and 50M particles on different numbers of threads, using the new output option of a NumPy structured array. On my laptop, neither the existing implementation on the `develop` branch nor my implementation that returns a `ParticleList` can sample 50M particles because the memory requirements eat up all 32 GB of RAM (largely due to Python object overhead). Returning the NumPy array instead buys about a factor of 3 in memory savings and allows me to sample over 100M particles, but I didn't bother pushing it further.

The plan is then to implement batching source sites to file for very large sample sizes that wouldn't fit in memory, and to implement the ability to generate histograms or source distributions from very large numbers of discrete samples, but that will be done in a separate PR. The parallelization implemented here is in support of these use cases.
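
As a rough illustration of the counter change described above, here is a minimal Python sketch of thread-safe global counters combined with a per-sample rejection limit. It is not the PR's C++ code: `Counters`, `sample_site`, and `sample_parallel` are hypothetical names, a `threading.Lock` stands in for the C++ atomics, and the acceptance test is a toy. Only the name `max_source_rejections_per_sample` comes from the description.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor


class Counters:
    """Shared accept/reject tallies; a lock plays the role of atomics."""

    def __init__(self):
        self._lock = threading.Lock()
        self.accepted = 0
        self.rejected = 0

    def add(self, accepted, rejected):
        with self._lock:
            self.accepted += accepted
            self.rejected += rejected


def sample_site(index, seed, counters, max_source_rejections_per_sample):
    rng = random.Random(seed + index)  # per-sample stream (assumption)
    local_rejections = 0               # local counter, no sharing needed
    while True:
        x = rng.uniform(-1.0, 1.0)
        if abs(x) <= 0.5:              # toy acceptance test
            counters.add(1, local_rejections)
            return x
        local_rejections += 1
        if local_rejections >= max_source_rejections_per_sample:
            counters.add(0, local_rejections)
            raise RuntimeError("too many rejections for a single sample")


def sample_parallel(n, seed, threads, max_source_rejections_per_sample=1000):
    counters = Counters()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results match across thread counts.
        sites = list(pool.map(
            lambda i: sample_site(i, seed, counters,
                                  max_source_rejections_per_sample),
            range(n)))
    return sites, counters
```

The local counter lets a single pathological sample fail fast without waiting for the global acceptance ratio to degrade, while the shared tallies stay consistent no matter how many threads update them.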
Fixes # (issue)
N/A
Checklist