feat: add plugin interface#586

Draft
adil-soubki wants to merge 8 commits into astroautomata:master from adil-soubki:plugin-interface

Conversation

@adil-soubki

No description provided.

adil-soubki and others added 3 commits March 12, 2026 11:45

Adds a two-layer plugin system for research extensions:

**Layer 1** (pre-existing): algorithm overrides via `AbstractOptions`
dispatch — `compute_complexity`, `optimize_constants`, etc. All relevant
internal functions already use `::AbstractOptions` signatures.

**Layer 2** (new): lifecycle hooks + persistent mutable state via
`AbstractPlugin` / `AbstractPluginState`.

New hook call sites:
- `on_search_start!` — end of `_initialize_search!`
- `on_search_end!` — start of `_tear_down!`
- `on_generation_complete!` — after migration in `_main_search_loop!`
- `on_population_evaluated!` — end of `_dispatch_s_r_cycle`
- `init_member` — in `Population` constructor via `_init_tree` helper

Per-(output, population) `Ref{AbstractPluginState}`s are allocated outside
the main loop so that worker state persists across iterations; they are
lazily initialized on the first worker call. All new code paths are no-ops
for `NoPlugin` with zero overhead.

Note: a design simplification (removing `AbstractPlugin` in favour of
dispatching directly on `AbstractOptions`/`AbstractPluginState`) has
been decided and will be applied in a follow-up commit.

Introduces a two-layer extension system for SymbolicRegression.jl:

**Layer 1 — Algorithm overrides via AbstractOptions dispatch**
Any exported function that takes `options::AbstractOptions` can be overridden
by dispatching on a custom options subtype. Documents `compute_complexity`,
`eval_cost`, `optimize_constants`, and `mutate!` as explicit override points.

**Layer 2 — Lifecycle hooks + persistent per-worker state**
New `src/Plugin.jl` defines:
- `AbstractPluginState` / `NoPluginState` — per-worker mutable state
- `init_plugin_state(options, datasets)` — dispatches on AbstractOptions subtype
- Five hooks: `on_search_start!`, `on_search_end!`, `on_generation_complete!`,
  `on_population_evaluated!`, `init_member`

Hook call sites wired throughout the search loop:
- `SearchState` carries head-node plugin state
- Per-(output, population) worker state Refs allocated outside the main loop
  so state persists across iterations
- Lazy per-worker init on first `_dispatch_s_r_cycle` call via `nothing` sentinel
- `init_member` hooks into `Population` construction for custom tree initialization

**Design notes**
- No `AbstractPlugin`/`NoPlugin`/`get_plugin` indirection — plugin config lives
  directly in `MyOptions` fields; hooks dispatch on `AbstractPluginState` subtype
- `init_plugin_state` receives a `Vector{<:Dataset}` on the head node and a
  `Tuple{<:Dataset}` on workers (documented)
- All plugin types marked `public` (not exported); users use `import`

**Files changed**
- `src/Plugin.jl` (new) — all types, defaults, hook signatures
- `src/Core.jl` — include + re-export
- `src/SearchUtils.jl` — `plugin_state` field in `SearchState`
- `src/SymbolicRegression.jl` — hook call sites, worker Refs, exports, `_tear_down!`
- `src/Population.jl` — `_init_tree` helper + `plugin_state` kwarg
- `src/Mutate.jl` / `RegularizedEvolution.jl` / `SingleIteration.jl` — thread through
- `src/Complexity.jl`, `LossFunctions.jl`, `ConstantOptimization.jl` — Layer 1 docstrings
- `docs/src/plugin-guide.md` (new) — full development guide including LaSR-style
  worked example, threading/multiprocessing safety notes, package structure
- `test/unit/misc/test_plugin_interface.jl` (new) — lifecycle counter test + init_member test

All existing tests pass.
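As a rough illustration of how the two layers described above fit together, the following sketch combines a Layer 1 override with Layer 2 state. This is not code from the PR: `MyOptions` and `CountingState` are hypothetical, the complexity override is only an example, and the hook argument lists are abbreviated with `args...` since the exact signatures live in `src/Plugin.jl`.

```julia
# Sketch only — assumes the names described above
# (`init_plugin_state`, `on_population_evaluated!`) exist as documented.
using SymbolicRegression
import SymbolicRegression:
    AbstractOptions, AbstractPluginState, init_plugin_state,
    on_population_evaluated!, compute_complexity

# Layer 1: a custom options subtype, forwarding everything else to `Options`.
struct MyOptions{O<:Options} <: AbstractOptions
    base::O
end
Base.getproperty(o::MyOptions, k::Symbol) =
    k === :base ? getfield(o, :base) : getproperty(getfield(o, :base), k)

# Override an algorithm hook by dispatching on the new subtype.
function compute_complexity(tree, ::MyOptions; kws...)
    return count(_ -> true, tree)  # example: plain node count
end

# Layer 2: per-worker mutable state plus a lifecycle hook.
mutable struct CountingState <: AbstractPluginState
    n_cycles::Int
end
init_plugin_state(::MyOptions, datasets) = CountingState(0)

function on_population_evaluated!(state::CountingState, args...)
    state.n_cycles += 1  # runs once per s_r_cycle on the worker
    return nothing
end
```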
@github-actions
Contributor

github-actions bot commented Mar 13, 2026

Benchmark Results (Julia v1)

Time benchmarks

| Benchmark | master | b85ccd9... | master / b85ccd9... |
|---|---|---|---|
| search/multithreading | 14.6 ± 0.063 s | 14.4 ± 0.15 s | 1.01 ± 0.012 |
| search/serial | 31.9 ± 0.037 s | 31 ± 0.26 s | 1.03 ± 0.0087 |
| utils/best_of_sample | 1.72 ± 0.37 μs | 1.63 ± 0.34 μs | 1.06 ± 0.32 |
| utils/check_constraints_x10 | 16.7 ± 4.3 μs | 16.6 ± 4.2 μs | 1.01 ± 0.36 |
| utils/compute_complexity_x10/Float64 | 2.18 ± 0.09 μs | 2.17 ± 0.11 μs | 1 ± 0.066 |
| utils/compute_complexity_x10/Int64 | 2.05 ± 0.081 μs | 2.11 ± 0.09 μs | 0.972 ± 0.056 |
| utils/compute_complexity_x10/nothing | 1.52 ± 0.09 μs | 1.55 ± 0.1 μs | 0.981 ± 0.086 |
| utils/insert_random_op_x10 | 5.24 ± 1.7 μs | 5.05 ± 1.7 μs | 1.04 ± 0.48 |
| utils/next_generation_x100 | 0.437 ± 0.027 ms | 0.432 ± 0.028 ms | 1.01 ± 0.09 |
| utils/optimize_constants_x10 | 0.0337 ± 0.0082 s | 0.0328 ± 0.0075 s | 1.03 ± 0.34 |
| utils/randomly_rotate_tree_x10 | 8.06 ± 0.99 μs | 8.21 ± 1 μs | 0.982 ± 0.17 |
| time_to_load | 2.59 ± 0.0042 s | 2.54 ± 0.018 s | 1.02 ± 0.0074 |
Memory benchmarks

| Benchmark | master | b85ccd9... | master / b85ccd9... |
|---|---|---|---|
| search/multithreading | 0.202 G allocs: 51.5 GB | 0.206 G allocs: 54 GB | 0.953 |
| search/serial | 0.207 G allocs: 53.8 GB | 0.207 G allocs: 53.8 GB | 1 |
| utils/best_of_sample | 0.038 k allocs: 3.25 kB | 0.038 k allocs: 3.25 kB | 1 |
| utils/check_constraints_x10 | 0.034 k allocs: 0.875 kB | 0.034 k allocs: 0.875 kB | 1 |
| utils/compute_complexity_x10/Float64 | 0 allocs: 0 B | 0 allocs: 0 B | — |
| utils/compute_complexity_x10/Int64 | 0 allocs: 0 B | 0 allocs: 0 B | — |
| utils/compute_complexity_x10/nothing | 0 allocs: 0 B | 0 allocs: 0 B | — |
| utils/insert_random_op_x10 | 0.04 k allocs: 1.56 kB | 0.04 k allocs: 1.56 kB | 1 |
| utils/next_generation_x100 | 4.63 k allocs: 0.276 MB | 4.63 k allocs: 0.276 MB | 1 |
| utils/optimize_constants_x10 | 24.2 k allocs: 25.2 MB | 23.3 k allocs: 22.6 MB | 1.12 |
| utils/randomly_rotate_tree_x10 | 0.042 k allocs: 1.34 kB | 0.042 k allocs: 1.34 kB | 1 |
| time_to_load | 0.145 k allocs: 11 kB | 0.145 k allocs: 11 kB | 1 |

@codecov

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 96.77419% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/Population.jl | 80.00% | 1 Missing ⚠️ |


@extend_mutation_weights macro (MutationWeights.jl, Core.jl, SymbolicRegression.jl)

Adds a new exported macro that generates a full AbstractMutationWeights subtype
from a short declaration block, eliminating the need to manually copy all 14
standard MutationWeights fields:

    @extend_mutation_weights PerturbWeights begin
        perturb_all_constants::Float64 = 0.5
    end

The macro generates:
- A `Base.@kwdef mutable struct` with all standard fields pre-populated at their
  defaults, plus the declared extra fields
- `Base.copy` (collects all field values into the positional constructor)
- `sample_mutation` (uses `GlobalRef` to extend `MutationWeightsModule.sample_mutation`
  rather than defining a new function in the caller's module; stores field names in a
  module-level `const` to avoid per-call allocation, consistent with `v_mutations`)

All extra fields are validated to be `::Float64` at macro-expansion time, since
`_dispatch_mutations!` treats every fieldname as a mutation type symbol.
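For instance, the declared type could then be used as follows. This is a sketch of the behavior described above, not code from the PR; the standard field defaults come from `MutationWeights`.

```julia
using SymbolicRegression: @extend_mutation_weights

@extend_mutation_weights PerturbWeights begin
    perturb_all_constants::Float64 = 0.5
end

w = PerturbWeights()            # all 14 standard fields at their defaults
w.perturb_all_constants         # 0.5, the declared extra field

w2 = copy(w)                    # generated Base.copy: an independent copy
w2.perturb_all_constants = 1.0  # mutable struct, so fields can be tuned
w.perturb_all_constants == 0.5  # original unchanged
```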

examples/plugin_mutation.jl (new)

Self-contained runnable example demonstrating the full Layer 1 recipe:
- `@extend_mutation_weights` for the weights type
- custom `AbstractOptions` subtype with `getproperty` forwarding
- `mutate!` implementation using `get_scalar_constants`/`set_scalar_constants!`
- `equation_search` call

Run with: `julia --project=. examples/plugin_mutation.jl`

Public API additions (SymbolicRegression.jl)

- Export `@extend_mutation_weights`
- Mark `AbstractPopMember` as `public` so plugin authors can write
  `import SymbolicRegression: AbstractPopMember` instead of reaching into
  the internal `PopMemberModule`

Bug fix: warmup double-initializes worker plugin state (SymbolicRegression.jl)

`_warmup_search!` was passing `Ref{..}(nothing)` to `_dispatch_s_r_cycle`, which
triggered lazy `init_plugin_state` and `on_population_evaluated!` on a throw-away
state that is discarded before the main loop. This violated the documented
"initialized once per worker" contract and caused side effects (e.g., LaSR-style
hooks draining channels) during the JIT warmup pass.

Fix: pre-populate the warmup ref with `NoPluginState()` so the lazy-init
condition (`isnothing(ref[])`) is never true during warmup.

Documentation fixes (plugin-guide.md)

- Replace manual `mutate!` boilerplate snippet with `@extend_mutation_weights`
  usage; add `@docs` block and link to the runnable example
- Add `import SymbolicRegression: AbstractOptions` in three places where
  `AbstractOptions` was used after `using SymbolicRegression` alone (it is
  `public` but not exported, so the bare name was not in scope)
- Clarify `on_generation_complete!` semantics: fires once per population-cycle
  completion, not once per "global" generation step; update both prose and table
- Update `AbstractMutationWeights` docstring: replace the old manual example
  (which silently dropped all 14 standard mutations) with a pointer to
  `@extend_mutation_weights` and a `!!! note` on the Float64 constraint

Tests (test/unit/misc/test_plugin_interface.jl)

- Fix invalid `const init_count` inside `@testitem` scope
- Add test for `@extend_mutation_weights`: verifies standard field defaults,
  field count, `copy` independence, and that `sample_mutation` samples the
  correct field when weight is biased to 1.0
- Add test that macro raises `ErrorException` on non-Float64 field annotation
```julia
import SymbolicRegression: AbstractOptions, AbstractPopMember, MutationResult, mutate!
using DynamicExpressions: AbstractExpression

@extend_mutation_weights MyWeights begin
```
Collaborator


P.S., @atharvas how does LaSR handle this? Does it basically just write out all the mutation weights from scratch?

I am wondering what the right way to do this is. Feels like there could be a better API out there somewhere...

Collaborator


@adil-soubki what about just doing smth like this:

```julia
Base.@kwdef struct MyWeights
    my_mutation::Float64 = 1.0
    base::MutationWeights = MutationWeights()
end
```

And then you can forward calls to the base weights as needed
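The suggested composition could be completed with explicit property forwarding, sketched below. This assumes only that `MutationWeights` is constructible with defaults; whether `sample_mutation` and `_dispatch_mutations!` would accept such a wrapper is exactly the open question in this thread.

```julia
using SymbolicRegression: MutationWeights

Base.@kwdef struct MyWeights
    my_mutation::Float64 = 1.0
    base::MutationWeights = MutationWeights()
end

# Forward any property not defined on the wrapper to the wrapped weights.
function Base.getproperty(w::MyWeights, k::Symbol)
    k in (:my_mutation, :base) && return getfield(w, k)
    return getproperty(getfield(w, :base), k)
end
```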

**New file: `examples/plugin_operator_stats.jl`**

Self-contained, runnable Layer 2 demo — an operator frequency reporter
that counts how often each operator appears across all evaluated
populations and prints a ranked table at the end of the search. It does
not interfere with the search, making it easy to verify correctness by
reading the output.

Demonstrates:
- `AbstractPluginState` holding mutable worker-local data
- `on_population_evaluated!` — worker hook, fires after each `_dispatch_s_r_cycle`
- `on_search_end!` — head-node hook, fires once after all workers finish
- The Channel pattern: workers push `Dict{String,Int}` batches,
  head drains and aggregates in `on_search_end!`

Run with: `julia --project=. examples/plugin_operator_stats.jl`
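The Channel pattern itself can be shown in plain Julia, independent of SymbolicRegression.jl. The operator counts below are made up for illustration.

```julia
# Workers push per-batch operator counts; the head node drains and aggregates.
stats_channel = Channel{Dict{String,Int}}(Inf)

# Worker side (e.g. from inside on_population_evaluated!):
put!(stats_channel, Dict("+" => 12, "*" => 7))
put!(stats_channel, Dict("+" => 3, "cos" => 5))

# Head-node side (e.g. from inside on_search_end!): drain without blocking.
totals = Dict{String,Int}()
while isready(stats_channel)
    for (op, n) in take!(stats_channel)
        totals[op] = get(totals, op, 0) + n
    end
end
# totals now holds "+" => 15, "*" => 7, "cos" => 5
```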

**Bug fix (`src/SymbolicRegression.jl`)**: `on_search_end!` was called
as the very first thing in `_tear_down!` — before `close_reader!`,
before `rmprocs`, and before waiting for in-flight worker tasks. In
`:multithreading` mode this meant workers could still be writing to
Channels when the head node tried to drain them. Moved the call to after
the full worker-cleanup block so the hook always sees complete worker output.

**Docstring / documentation fixes**:

- `init_member` semantics were wrong throughout: it uses the **head
  node's** state (not a per-worker copy), during initial population
  creation only, and runs concurrently in `:multithreading` mode. Updated
  the `AbstractPluginState` type docstring, the `init_member` docstring
  (new `!!! note "State used"` block), the plugin-guide.md thread-safety
  table ("Worker" → "Head node (initial population creation only)"), and
  the Rules section.
- `on_search_start!`: clarify it fires before warmup.
- `on_search_end!`: "before tearing down workers" → "after all workers
  have completed, before tearing down processes/threads".
- `plugin-guide.md` point 6: add `seed_members` persistence note
  (persists across generations; must `empty!` for one-shot injection).
- `plugin-guide.md`: add link to `plugin_operator_stats.jl` after the
  LaSR worked example.

**Minor**:
- Remove trailing comma from `@compat public` import block.
- Improve warmup `NoPluginState()` comment.
- `examples/plugin_mutation.jl`: drop `result =` capture and
  `println(result)` — SR.jl already prints the HoF table.
- `README.md`: add brief Plugin Interface section.
- Add `on_mutation_evaluated!(plugin_state, mutation_type, accepted, dataset, options)`
  lifecycle hook, firing once per `next_generation` return with the mutation
  type and accept/reject outcome
- Wire 5 call sites in Mutate.jl (return_immediately, constraint failure,
  NaN loss, annealing rejection, normal acceptance)
- Re-export from Core.jl and add to @compat public in SymbolicRegression.jl
- Add test verifying hook fires with valid mutation types and both accepted
  and rejected outcomes
- Add examples/plugin_adaptive_weights.jl: EMA-based adaptive mutation
  weights using on_mutation_evaluated! + on_population_evaluated! +
  on_generation_complete!, with periodic per-generation weight logging
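The EMA signal in that example can be sketched in plain Julia. The smoothing factor and field names below are illustrative assumptions, not taken from the example file.

```julia
const α = 0.1  # assumed EMA smoothing factor

# One EMA of the acceptance rate per mutation type.
ema = Dict{Symbol,Float64}()

# Called once per on_mutation_evaluated! event (sketch).
function record!(ema::Dict{Symbol,Float64}, mutation_type::Symbol, accepted::Bool)
    x = accepted ? 1.0 : 0.0
    prev = get(ema, mutation_type, x)  # seed with the first observation
    ema[mutation_type] = (1 - α) * prev + α * x
    return ema
end

record!(ema, :mutate_constant, true)   # EMA stays at 1.0
record!(ema, :mutate_constant, false)  # decays to 0.9
```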
Pass `delta_loss = before_loss - after_loss` (a `Float64`) to the hook so
that plugin authors get a temperature-independent, continuous signal
instead of only the noisy accepted/rejected boolean.

- NaN on constraint failure or NaN loss paths
- Finite value on all valid evaluation paths (accepted or rejected)
- Example updated to use delta_loss > 0 as improvement criterion
- Intermediate weight tables now show prev/adapted columns with ▲/▼
…on_evaluated! hook

The on_mutation_evaluated! hook previously received a single `delta_loss`
(before - after), which loses information: you couldn't distinguish a valid
evaluation that was stochastically rejected (finite delta) from a constraint
failure or NaN eval (NaN delta).

The hook now receives `before_loss` and `after_loss` separately:
- `before_loss`: always finite (NaN losses are never propagated into the population)
- `after_loss`: NaN on constraint failure or NaN eval; finite if the tree
  was successfully evaluated (including annealing rejections)

This makes the independent information in `accepted` vs `after_loss` explicit:
a plugin can distinguish "valid tree, probabilistically rejected" from "invalid
tree, never evaluated" by checking `isnan(after_loss)`.
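Concretely, a plugin can classify the three outcome kinds from `accepted` and `after_loss` alone. This is a sketch of the contract described above, not code from the PR.

```julia
function classify(accepted::Bool, before_loss::Float64, after_loss::Float64)
    if isnan(after_loss)
        return :invalid             # constraint failure or NaN evaluation
    elseif accepted
        return :accepted            # valid tree, kept
    else
        return :valid_but_rejected  # valid tree, stochastically rejected
    end
end

classify(false, 1.0, NaN)  # :invalid
classify(true,  1.0, 0.5)  # :accepted
classify(false, 1.0, 1.2)  # :valid_but_rejected
```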

Changes:
- src/Plugin.jl: updated hook signature, default no-op, and docstring
- src/Mutate.jl: updated all four call sites to pass before_loss/after_loss
- test/unit/misc/test_plugin_interface.jl: updated to 4-tuple events, added
  isfinite(before_loss) assertion, added isnan(after_loss) assertion (with
  maxsize=5 to reliably trigger constraint failures)
- examples/plugin_adaptive_weights.jl: updated to use the new signature;
  switched signal metric from binary success rate to mean relative improvement
  (before - after) / before; fixed normalization to use current unobserved
  weights rather than defaults so total weight mass is exactly conserved;
  added :simplify to the skip list (EMA would unfairly penalize it for not
  maximally improving loss); wrapped Step 7 in _run_adaptive_example() with
  an abspath(PROGRAM_FILE) == @__FILE__ guard so the file can be safely
  included by other scripts without triggering the search
- examples/compare_adaptive_vs_baseline.jl: new script for side-by-side
  comparison of adaptive vs baseline mutation weights