Create unique, sequence, and fill wrappers in cugraph to reduce binary size #5559

Open
ChuckHastings wants to merge 6 commits into rapidsai:main from ChuckHastings:reduce_binary_size_unique_sequence_fill

@ChuckHastings ChuckHastings commented Jun 11, 2026

Collaborator

Next effort in binary size reduction.

This PR:

  • Renames the sort_wrapper back to sort using the workaround we tried for scan
  • Creates wrappers for thrust::unique, thrust::sequence, thrust::fill
| Stage | libcugraph_common.so | libcugraph.so | libcugraph_mg.so | Combined |
|---|---|---|---|---|
| CUDA 12 Before | 98 MB | 752 MB | 935 MB | 1785 MB |
| CUDA 12 After | 99 MB (+1, +1.2%) | 736 MB (-16, -2.1%) | 919 MB (-16, -1.8%) | 1754 MB (-31, -1.8%) |
| --- | --- | --- | --- | --- |
| CUDA 13 Before | 50 MB | 382 MB | 469 MB | 901 MB |
| CUDA 13 After | 50 MB (+0.5, +0.96%) | 374 MB (-8, -2.1%) | 461 MB (-8, -1.8%) | 885 MB (-16, -1.8%) |
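
For readers unfamiliar with the wrapper approach, here is a minimal host-only sketch of the general idea: callers see only a declaration, and the algorithm body is compiled once per explicitly instantiated type instead of once per call site. It uses `std::fill`, `std::iota`, and `std::unique` as stand-ins for `thrust::fill`, `thrust::sequence`, and `thrust::unique`; the `sketch` namespace, the pointer-only signatures, and `demo()` are assumptions for illustration, not the code in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

namespace sketch {

// Would live in a header: declarations only, so including it does not
// pull the algorithm body into every translation unit.
template <typename T>
void fill(T* first, T* last, T const& value);
template <typename T>
void sequence(T* first, T* last, T init);
template <typename T>
T* unique(T* first, T* last);

// Would live in one source file: the definitions (std:: stand-ins here).
template <typename T>
void fill(T* first, T* last, T const& value) { std::fill(first, last, value); }
template <typename T>
void sequence(T* first, T* last, T init) { std::iota(first, last, init); }
template <typename T>
T* unique(T* first, T* last) { return std::unique(first, last); }

// Explicit instantiations: the only types the wrappers are built for.
// In a split header/source layout, any other type would fail to link.
template void fill<std::int32_t>(std::int32_t*, std::int32_t*, std::int32_t const&);
template void sequence<std::int32_t>(std::int32_t*, std::int32_t*, std::int32_t);
template std::int32_t* unique<std::int32_t>(std::int32_t*, std::int32_t*);

}  // namespace sketch

// Hypothetical usage: sequence, then fill a sub-range, then drop adjacent duplicates.
inline std::vector<std::int32_t> demo()
{
  std::vector<std::int32_t> v(6);
  sketch::sequence(v.data(), v.data() + v.size(), std::int32_t{0});  // 0 1 2 3 4 5
  sketch::fill(v.data() + 2, v.data() + 5, std::int32_t{9});         // 0 1 9 9 9 5
  auto* new_end = sketch::unique(v.data(), v.data() + v.size());     // 0 1 9 5
  v.resize(static_cast<std::size_t>(new_end - v.data()));
  assert(v.size() == 4);
  return v;
}
```

The trade-off is that every supported type has to be listed explicitly, which is presumably why the review below focuses on the guards that decide when the out-of-line path is taken.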

copy-pr-bot Bot commented Jun 11, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@ChuckHastings ChuckHastings marked this pull request as ready for review June 12, 2026 13:45
@ChuckHastings ChuckHastings requested a review from a team as a code owner June 12, 2026 13:45
@ChuckHastings ChuckHastings added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Jun 12, 2026
T const& value) const
{
using value_t = detail::fill_iterator_value_t<ForwardIterator>;
if constexpr (detail::fill_supported_iterator_v<ForwardIterator>) {

Contributor

Don't we need to check for exec_policy (i.e. detail::is_rmm_exec_policy_v)?

Collaborator Author

We instantiate both of the exec policies that we currently use (rmm::exec_policy, rmm::exec_policy_nosync), so everything compiles and executes properly. If we tried to use a different exec policy, the compilation would fail, so at least it wouldn't be a runtime error.

We could add a safety check in case we want things to keep compiling in those cases (if we somehow add a different exec policy). But if we do that I suspect we should do it for all of the functions in this file.
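
A rough sketch of what such a safety check could look like, assuming a compile-time trait over the policy type: supported policies take the out-of-line path, anything else falls back to an inline call so the code keeps compiling. The policy structs below are hypothetical stand-ins for `rmm::exec_policy` and `rmm::exec_policy_nosync`, `std::fill` stands in for `thrust::fill`, and the integer return value exists only to make the chosen path observable.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>

namespace sketch {

// Hypothetical stand-ins for the two rmm policies plus an unsupported one.
struct exec_policy {};
struct exec_policy_nosync {};
struct other_policy {};

namespace detail {

// True only for the policies that have explicit instantiations below.
template <typename P>
inline constexpr bool is_preinstantiated_policy_v =
  std::is_same_v<std::decay_t<P>, exec_policy> ||
  std::is_same_v<std::decay_t<P>, exec_policy_nosync>;

// Out-of-line path; in a real build the definition would sit in one source file.
template <typename P, typename T>
int fill_impl(P const&, T* first, T* last, T const& value)
{
  std::fill(first, last, value);
  return 1;  // 1 = out-of-line path (illustration only)
}
template int fill_impl<exec_policy, int>(exec_policy const&, int*, int*, int const&);
template int fill_impl<exec_policy_nosync, int>(exec_policy_nosync const&, int*, int*, int const&);

}  // namespace detail

template <typename P, typename T>
int fill(P const& policy, T* first, T* last, T const& value)
{
  if constexpr (detail::is_preinstantiated_policy_v<P>) {
    return detail::fill_impl(policy, first, last, value);
  } else {
    std::fill(first, last, value);  // inline fallback keeps unknown policies compiling
    return 0;                       // 0 = inline fallback (illustration only)
  }
}

}  // namespace sketch
```

Without the `else` branch, an unsupported policy is rejected at build time, which matches the behavior described above; with it, the call still builds but loses the binary-size benefit for that policy.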

RandomAccessIterator last) const
{
using value_t = detail::sort_iterator_value_t<RandomAccessIterator>;
if constexpr (detail::sort_supported_arithmetic_scalar_v<value_t> ||

Contributor

Why are we using sort_supported_arithmetic_scalar_v here but checking for iterator in other places? (e.g. fill_supported_iterator_v). And no need to check for exec_policy?

Will this work if RandomAccessIterator is not a pointer?

Collaborator Author

Yes, this was an oversight in the original implementation. I will fix this.
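
To illustrate the gap the question points at: a guard on the value type alone also accepts non-pointer iterators whose elements happen to be arithmetic. A minimal sketch, assuming the out-of-line implementation only handles raw, writable pointers; both trait names below are hypothetical and not the ones used in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <type_traits>

namespace sketch::detail {

// Value-type-only check: says nothing about the iterator itself.
template <typename It>
inline constexpr bool value_type_is_arithmetic_v =
  std::is_arithmetic_v<typename std::iterator_traits<It>::value_type>;

// Iterator check: false by default, true only for raw pointers to
// non-const arithmetic scalars.
template <typename It>
inline constexpr bool supported_iterator_v = false;
template <typename T>
inline constexpr bool supported_iterator_v<T*> =
  std::is_arithmetic_v<T> && !std::is_const_v<T>;

// Raw pointers pass both checks.
static_assert(value_type_is_arithmetic_v<std::int32_t*>);
static_assert(supported_iterator_v<std::int32_t*>);

// A non-pointer iterator over ints passes the value-type check
// but is rejected by the iterator check.
using rev_it = std::reverse_iterator<std::int32_t*>;
static_assert(value_type_is_arithmetic_v<rev_it>);
static_assert(!supported_iterator_v<rev_it>);

// Pointers to const cannot be sorted in place, so they are rejected too.
static_assert(!supported_iterator_v<std::int32_t const*>);

}  // namespace sketch::detail
```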

…ighten guard around iterators, add nosync exec policy support for scan, saving more binary size

@seunghwak seunghwak left a comment

Contributor

LGTM besides a few minor cosmetic issues

* matching explicit instantiation (otherwise a call through @ref cugraph::sort_wrapper can fail at
* matching explicit instantiation (otherwise a call through @ref cugraph::sort can fail at
* link
* time).

Contributor

Something very minor, but "link" and "time" can go on the same line. No need for a line break here.

{
if constexpr ((detail::is_rmm_exec_policy_v<ExecutionPolicy> ||
detail::is_rmm_exec_policy_nosync_v<ExecutionPolicy>) &&
detail::sort_supported_iterator_v<RandomAccessIterator>) {

Contributor

Isn't this a misnomer?

If sort and unique support the same types of iterators, wouldn't it be better to define

using iterator_supported_iterator_v = sort_supported_iterator_v;

?

Collaborator Author

Added additional using.
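
One caveat on spelling, in case the traits are variable templates: an alias-declaration (`using X = ...;`) can only name a type, so a `_v` trait would need to be forwarded by a second variable template instead. A small sketch under that assumption; `unique_supported_iterator_v` and the placeholder pointer rule are hypothetical and may not match what was actually added.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch::detail {

// Placeholder rule standing in for the real sort trait.
template <typename It>
inline constexpr bool sort_supported_iterator_v = std::is_pointer_v<It>;

// Forwarding variable template: gives unique its own name while
// sharing the sort rule, so call sites no longer look like a misnomer.
template <typename It>
inline constexpr bool unique_supported_iterator_v = sort_supported_iterator_v<It>;

static_assert(unique_supported_iterator_v<int*> == sort_supported_iterator_v<int*>);
static_assert(!unique_supported_iterator_v<int>);

}  // namespace sketch::detail
```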

namespace detail {

/** @brief Whether iterator type @p T has an out-of-line @ref sort_impl lexicographic sort. */
/** @brief True for scalar value types dispatched to @ref sort_impl / @ref unique_impl. */

Contributor

Something minor, but what's the rule for placing these type trait templates?

We have

sort related templates
scan related templates
rmm_exec_policy related templates
more scan related templates
scan_impl
sort_impl
unique_impl
more sort related templates
fill related templates
sequence related templates
(and they alternate a few times)
fill_impl
sequence_impl
...

This looks inconsistent and makes it a bit difficult to navigate.

We may do something like

all type trait templates
all implementations

or

sort related templates
sort impl
unique related templates
unique impl
scan related templates
scan impl
...

and so on.

Collaborator Author

Reordered

@ChuckHastings

Collaborator Author

/merge

3 participants