Improve corrcoef performance with simpler reduction algorithm#74

Merged
keichi merged 2 commits into master from use-kokkos-math-functions on Jan 6, 2026

@keichi (Owner) commented Jan 6, 2026

Summary

  • Add CorrcoefSimpleState using sum-of-products instead of Welford's algorithm for faster parallel reduction
  • Use CorrcoefSimpleState for single corrcoef() calculation (improved performance)
  • Replace std::min/abs/sqrt with Kokkos equivalents for CUDA compatibility
  • Remove unused volatile operator+= overload
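The sum-of-products idea in the first bullet can be sketched in plain C++ without Kokkos. This is a minimal illustration, not the PR's code: `SimpleCorrState`, its field names, and `corrcoef_two_halves` are hypothetical, and the actual `CorrcoefSimpleState` may be laid out differently. The point it shows is that joining two partial results is just element-wise addition.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical accumulator (not the PR's CorrcoefSimpleState): six raw sums.
struct SimpleCorrState {
    double n = 0, sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;

    // Fold one (x, y) sample into the running sums.
    void add(double x, double y) {
        n += 1;
        sx += x;
        sy += y;
        sxx += x * x;
        syy += y * y;
        sxy += x * y;
    }

    // Joining two partial states is element-wise addition, with no
    // mean correction, division, or branching.
    SimpleCorrState& operator+=(const SimpleCorrState& o) {
        n += o.n;
        sx += o.sx;
        sy += o.sy;
        sxx += o.sxx;
        syy += o.syy;
        sxy += o.sxy;
        return *this;
    }

    // Pearson correlation from the raw sums.
    double corrcoef() const {
        const double cov = n * sxy - sx * sy;
        const double vx = n * sxx - sx * sx;
        const double vy = n * syy - sy * sy;
        return cov / std::sqrt(vx * vy);
    }
};

// Illustrative driver: accumulate two halves independently, then join them,
// mimicking what a parallel reduction does with per-thread partial states.
double corrcoef_two_halves(const std::vector<double>& x,
                           const std::vector<double>& y) {
    assert(x.size() == y.size());
    SimpleCorrState a, b;
    const std::size_t mid = x.size() / 2;
    for (std::size_t i = 0; i < mid; ++i) a.add(x[i], y[i]);
    for (std::size_t i = mid; i < x.size(); ++i) b.add(x[i], y[i]);
    a += b;
    return a.corrcoef();
}
```

The general trade-off is that raw sums of squares can lose precision through cancellation when the mean is large relative to the spread, which is the case Welford-style updates are designed to handle.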

Test plan

  • C++ tests pass
  • Python tests pass

@keichi force-pushed the use-kokkos-math-functions branch from 269453f to 9b560c3 on January 6, 2026 13:37
@keichi changed the title from "Use Kokkos math functions instead of std:: for CUDA compatibility" to "Improve corrcoef performance with simpler reduction algorithm" on Jan 6, 2026
@keichi force-pushed the use-kokkos-math-functions branch 2 times, most recently from 504aa42 to b1ad0a7, on January 6, 2026 13:45
keichi added 2 commits January 6, 2026 22:46
…bility

- Replace std::min/abs with Kokkos::min/abs in device code
- Remove #ifndef KOKKOS_ENABLE_CUDA guards for using declarations
- Add CorrcoefSimpleState using sum-of-products instead of Welford's algorithm
- Use CorrcoefSimpleState for both corrcoef functions (faster parallel reduction)
- Remove unused volatile operator+= overload from CorrcoefState
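For contrast with the sum-of-products state named in the commit message, the sketch below shows what joining two Welford-style partial states typically involves, using the standard pairwise merge of means and co-moments. It is an illustration under assumptions: `WelfordCorrState`, its fields, and `welford_two_halves` are hypothetical names, and this is not the repository's `CorrcoefState`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical Welford-style state: running means plus centered (co-)moments.
struct WelfordCorrState {
    double n = 0, mx = 0, my = 0, m2x = 0, m2y = 0, cxy = 0;

    // Single-sample update: deltas are taken against the old means,
    // then combined with the residuals against the updated means.
    void add(double x, double y) {
        n += 1;
        const double dx = x - mx;
        const double dy = y - my;
        mx += dx / n;
        my += dy / n;
        m2x += dx * (x - mx);
        m2y += dy * (y - my);
        cxy += dx * (y - my);
    }

    // Pairwise merge: needs empty-state branches, a division, and
    // correction terms for the difference between the two means.
    WelfordCorrState& operator+=(const WelfordCorrState& o) {
        if (o.n == 0) return *this;
        if (n == 0) { *this = o; return *this; }
        const double total = n + o.n;
        const double dx = o.mx - mx;
        const double dy = o.my - my;
        const double w = n * o.n / total;
        m2x += o.m2x + dx * dx * w;
        m2y += o.m2y + dy * dy * w;
        cxy += o.cxy + dx * dy * w;
        mx += dx * o.n / total;
        my += dy * o.n / total;
        n = total;
        return *this;
    }

    double corrcoef() const { return cxy / std::sqrt(m2x * m2y); }
};

// Illustrative driver: accumulate two halves separately, then merge.
double welford_two_halves(const std::vector<double>& x,
                          const std::vector<double>& y) {
    assert(x.size() == y.size());
    WelfordCorrState a, b;
    const std::size_t mid = x.size() / 2;
    for (std::size_t i = 0; i < mid; ++i) a.add(x[i], y[i]);
    for (std::size_t i = mid; i < x.size(); ++i) b.add(x[i], y[i]);
    a += b;
    return a.corrcoef();
}
```

The extra branches and arithmetic in the merge step are a plausible reason a plain-sum state reduces faster, though the actual speedup depends on the backend and data size.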
@keichi force-pushed the use-kokkos-math-functions branch from b1ad0a7 to bd6eeb1 on January 6, 2026 13:47
@keichi merged commit 3a221e3 into master on Jan 6, 2026
10 checks passed
@keichi deleted the use-kokkos-math-functions branch on January 6, 2026 13:53
keichi pushed a commit that referenced this pull request Jan 6, 2026
Resolved merge conflict by combining both changes:
- PR #75: Changed loop indices from int to size_t to prevent overflow
- PR #74: Kept CorrcoefSimpleState and Kokkos::min for performance