Improve Matern kernels runtime performance#405

Merged
relf merged 9 commits into master from matern-perf
Apr 3, 2026

@relf relf commented Apr 3, 2026

This PR achieves roughly a 2x speedup when Egor uses the Matern32 or Matern52 kernels.

relf added 9 commits April 3, 2026 13:45
57% reduction in benchmark runtime: 237 ms → 102 ms.

Two optimizations applied to correlation_models.rs:

rval_from_distances: Replaced two-pass computation (separate a and b arrays with inner mapv().product() allocations) with a single-pass scalar loop — no intermediate array allocations.

_jac_helper → _jac_from_r: Replaced the O(n·d²·h²) nested "product-excluding-one-factor" loop + einsum with a closed-form O(n·d·h) formula. Because each Matern 5/2 polynomial factor is strictly positive, and therefore never zero, the excluded product can be computed by division: total_product / single_factor.
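The division trick can be illustrated on plain slices. This is a minimal sketch, not the code from correlation_models.rs: the function names are hypothetical, and the per-dimension factor assumes the usual Matern 5/2 polynomial 1 + √5·t + (5/3)·t² with t = θ·|d|.

```rust
/// Hypothetical per-dimension Matern 5/2 polynomial factor.
/// With t = theta * |d| >= 0 the result is always >= 1, so it is never zero.
fn matern52_factor(theta: f64, d: f64) -> f64 {
    let t = theta * d.abs();
    1.0 + 5f64.sqrt() * t + 5.0 / 3.0 * t * t
}

/// Product of all factors except factor k, recomputed for every k.
/// Each k costs O(d), so the whole vector costs O(d^2).
fn excluded_products_naive(factors: &[f64]) -> Vec<f64> {
    (0..factors.len())
        .map(|k| {
            factors
                .iter()
                .enumerate()
                .filter(|(j, _)| *j != k)
                .map(|(_, f)| *f)
                .product::<f64>()
        })
        .collect()
}

/// Same quantity from one total product and one division per k: O(d) overall.
/// Only valid when no factor is zero, which holds for the factor above.
fn excluded_products_by_division(factors: &[f64]) -> Vec<f64> {
    let total: f64 = factors.iter().product();
    factors.iter().map(|f| total / f).collect()
}
```

Both functions return the same values up to floating-point rounding. The division form does not carry over to a kernel whose per-dimension factor can reach zero, which is why the positivity argument matters.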
The changes for SE and AE show only marginal improvement (criterion reports "change within noise threshold" for SE and "no change in performance detected" for AE). This is expected, since these kernels were already simpler than the Matern ones. The main optimization (powf(2.) → v * v and shared theta_w computation) eliminates unnecessary allocations and expensive transcendental calls, but the exponential kernels lack the O(n·d²·h²) product-excluding-one-factor structure that made Matern so costly, so the absolute gains are modest.

Summary of optimizations applied:

SquaredExponential: Replaced all powf(F::cast(2.)) with v * v (avoids log+exp internally); shared neg_theta_w_sq computation in jac and rval_with_jac instead of recomputing theta_w² + separate negation
AbsoluteExponential: Shared neg_theta_w in rval_with_jac, computing r and jr from the same intermediate; avoided redundant rval_from_distances call in rval_with_jac
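The shared-intermediate idea for SquaredExponential can be sketched as follows. This is an illustration under stated assumptions rather than the crate's implementation: the function name and signature are hypothetical, theta_w is simplified to one scale per dimension, and the kernel is taken to be r(d) = exp(-Σ_j (theta_w[j]·d[j])²).

```rust
/// Hypothetical squared-exponential correlation value and its gradient
/// with respect to the distance vector d, computed together.
/// Assumes r(d) = exp(-sum_j (theta_w[j] * d[j])^2).
fn se_rval_with_jac(theta_w: &[f64], d: &[f64]) -> (f64, Vec<f64>) {
    // Shared intermediate: -(theta_w[j])^2, squared with v * v instead of powf(2.0).
    let neg_theta_w_sq: Vec<f64> = theta_w.iter().map(|&v| -(v * v)).collect();

    // Exponent reuses the intermediate: sum_j -(theta_w[j])^2 * d[j]^2.
    let exponent: f64 = neg_theta_w_sq
        .iter()
        .zip(d)
        .map(|(&c, &x)| c * x * x)
        .sum();
    let r = exponent.exp();

    // Gradient reuses both the intermediate and r:
    // dr/dd_k = 2 * (-(theta_w[k])^2) * d[k] * r.
    let jac: Vec<f64> = neg_theta_w_sq
        .iter()
        .zip(d)
        .map(|(&c, &x)| 2.0 * c * x * r)
        .collect();

    (r, jac)
}
```

Computing the value and the gradient in one function means the negated squared scales and the exponential are each evaluated once, instead of once per quantity.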
@relf relf merged commit 5bbc327 into master Apr 3, 2026
19 checks passed