feat: add plugin interface#586

Draft
adil-soubki wants to merge 8 commits into astroautomata:master from adil-soubki:plugin-interface

Conversation

@adil-soubki

No description provided.

adil-soubki and others added 3 commits March 12, 2026 11:45

Adds a two-layer plugin system for research extensions:

**Layer 1** (pre-existing): algorithm overrides via `AbstractOptions`
dispatch — `compute_complexity`, `optimize_constants`, etc. All relevant
internal functions already use `::AbstractOptions` signatures.

**Layer 2** (new): lifecycle hooks + persistent mutable state via
`AbstractPlugin` / `AbstractPluginState`.

New hook call sites:
- `on_search_start!` — end of `_initialize_search!`
- `on_search_end!` — start of `_tear_down!`
- `on_generation_complete!` — after migration in `_main_search_loop!`
- `on_population_evaluated!` — end of `_dispatch_s_r_cycle`
- `init_member` — in `Population` constructor via `_init_tree` helper

Per-(output, population) `Ref{AbstractPluginState}`s are allocated outside
the main loop so that worker state persists across iterations; they are
lazily initialized on the first worker call. All new code paths are no-ops
for `NoPlugin` with zero overhead.

Note: a design simplification (removing `AbstractPlugin` in favour of
dispatching directly on `AbstractOptions`/`AbstractPluginState`) has
been decided and will be applied in a follow-up commit.

Introduces a two-layer extension system for SymbolicRegression.jl:

**Layer 1 — Algorithm overrides via AbstractOptions dispatch**
Any exported function that takes `options::AbstractOptions` can be overridden
by dispatching on a custom options subtype. Documents `compute_complexity`,
`eval_cost`, `optimize_constants`, and `mutate!` as explicit override points.

**Layer 2 — Lifecycle hooks + persistent per-worker state**
New `src/Plugin.jl` defines:
- `AbstractPluginState` / `NoPluginState` — per-worker mutable state
- `init_plugin_state(options, datasets)` — dispatches on AbstractOptions subtype
- Five hooks: `on_search_start!`, `on_search_end!`, `on_generation_complete!`,
  `on_population_evaluated!`, `init_member`

Hook call sites wired throughout the search loop:
- `SearchState` carries head-node plugin state
- Per-(output, population) worker state Refs allocated outside the main loop
  so state persists across iterations
- Lazy per-worker init on first `_dispatch_s_r_cycle` call via `nothing` sentinel
- `init_member` hooks into `Population` construction for custom tree initialization

**Design notes**
- No `AbstractPlugin`/`NoPlugin`/`get_plugin` indirection — plugin config lives
  directly in `MyOptions` fields; hooks dispatch on `AbstractPluginState` subtype
- `init_plugin_state` receives a `Vector{<:Dataset}` on the head node and a
  `Tuple{<:Dataset}` on workers (documented)
- All plugin types marked `public` (not exported); users use `import`

**Files changed**
- `src/Plugin.jl` (new) — all types, defaults, hook signatures
- `src/Core.jl` — include + re-export
- `src/SearchUtils.jl` — `plugin_state` field in `SearchState`
- `src/SymbolicRegression.jl` — hook call sites, worker Refs, exports, `_tear_down!`
- `src/Population.jl` — `_init_tree` helper + `plugin_state` kwarg
- `src/Mutate.jl` / `RegularizedEvolution.jl` / `SingleIteration.jl` — thread through
- `src/Complexity.jl`, `LossFunctions.jl`, `ConstantOptimization.jl` — Layer 1 docstrings
- `docs/src/plugin-guide.md` (new) — full development guide including LaSR-style
  worked example, threading/multiprocessing safety notes, package structure
- `test/unit/misc/test_plugin_interface.jl` (new) — lifecycle counter test + init_member test

All existing tests pass.
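As a rough illustration of how the two layers described above fit together, the following sketch combines a Layer 1 override with Layer 2 state. This is not code from the PR: `MyOptions` and `CountingState` are hypothetical, the complexity override is only an example, and the hook argument lists are abbreviated with `args...` since the exact signatures live in `src/Plugin.jl`.

```julia
# Sketch only — assumes the names described above
# (`init_plugin_state`, `on_population_evaluated!`) exist as documented.
using SymbolicRegression
import SymbolicRegression:
    AbstractOptions, AbstractPluginState, init_plugin_state,
    on_population_evaluated!, compute_complexity

# Layer 1: a custom options subtype, forwarding everything else to `Options`.
struct MyOptions{O<:Options} <: AbstractOptions
    base::O
end
Base.getproperty(o::MyOptions, k::Symbol) =
    k === :base ? getfield(o, :base) : getproperty(getfield(o, :base), k)

# Override an algorithm hook by dispatching on the new subtype.
function compute_complexity(tree, ::MyOptions; kws...)
    return count(_ -> true, tree)  # example: plain node count
end

# Layer 2: per-worker mutable state plus a lifecycle hook.
mutable struct CountingState <: AbstractPluginState
    n_cycles::Int
end
init_plugin_state(::MyOptions, datasets) = CountingState(0)

function on_population_evaluated!(state::CountingState, args...)
    state.n_cycles += 1  # runs once per s_r_cycle on the worker
    return nothing
end
```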
@github-actions
Contributor

github-actions bot commented Mar 13, 2026

Benchmark Results (Julia v1)

Time benchmarks

| Benchmark | master | b85ccd9... | master / b85ccd9... |
|---|---|---|---|
| search/multithreading | 14.6 ± 0.063 s | 14.4 ± 0.15 s | 1.01 ± 0.012 |
| search/serial | 31.9 ± 0.037 s | 31 ± 0.26 s | 1.03 ± 0.0087 |
| utils/best_of_sample | 1.72 ± 0.37 μs | 1.63 ± 0.34 μs | 1.06 ± 0.32 |
| utils/check_constraints_x10 | 16.7 ± 4.3 μs | 16.6 ± 4.2 μs | 1.01 ± 0.36 |
| utils/compute_complexity_x10/Float64 | 2.18 ± 0.09 μs | 2.17 ± 0.11 μs | 1 ± 0.066 |
| utils/compute_complexity_x10/Int64 | 2.05 ± 0.081 μs | 2.11 ± 0.09 μs | 0.972 ± 0.056 |
| utils/compute_complexity_x10/nothing | 1.52 ± 0.09 μs | 1.55 ± 0.1 μs | 0.981 ± 0.086 |
| utils/insert_random_op_x10 | 5.24 ± 1.7 μs | 5.05 ± 1.7 μs | 1.04 ± 0.48 |
| utils/next_generation_x100 | 0.437 ± 0.027 ms | 0.432 ± 0.028 ms | 1.01 ± 0.09 |
| utils/optimize_constants_x10 | 0.0337 ± 0.0082 s | 0.0328 ± 0.0075 s | 1.03 ± 0.34 |
| utils/randomly_rotate_tree_x10 | 8.06 ± 0.99 μs | 8.21 ± 1 μs | 0.982 ± 0.17 |
| time_to_load | 2.59 ± 0.0042 s | 2.54 ± 0.018 s | 1.02 ± 0.0074 |
Memory benchmarks

| Benchmark | master | b85ccd9... | master / b85ccd9... |
|---|---|---|---|
| search/multithreading | 0.202 G allocs: 51.5 GB | 0.206 G allocs: 54 GB | 0.953 |
| search/serial | 0.207 G allocs: 53.8 GB | 0.207 G allocs: 53.8 GB | 1 |
| utils/best_of_sample | 0.038 k allocs: 3.25 kB | 0.038 k allocs: 3.25 kB | 1 |
| utils/check_constraints_x10 | 0.034 k allocs: 0.875 kB | 0.034 k allocs: 0.875 kB | 1 |
| utils/compute_complexity_x10/Float64 | 0 allocs: 0 B | 0 allocs: 0 B | — |
| utils/compute_complexity_x10/Int64 | 0 allocs: 0 B | 0 allocs: 0 B | — |
| utils/compute_complexity_x10/nothing | 0 allocs: 0 B | 0 allocs: 0 B | — |
| utils/insert_random_op_x10 | 0.04 k allocs: 1.56 kB | 0.04 k allocs: 1.56 kB | 1 |
| utils/next_generation_x100 | 4.63 k allocs: 0.276 MB | 4.63 k allocs: 0.276 MB | 1 |
| utils/optimize_constants_x10 | 24.2 k allocs: 25.2 MB | 23.3 k allocs: 22.6 MB | 1.12 |
| utils/randomly_rotate_tree_x10 | 0.042 k allocs: 1.34 kB | 0.042 k allocs: 1.34 kB | 1 |
| time_to_load | 0.145 k allocs: 11 kB | 0.145 k allocs: 11 kB | 1 |

@codecov

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 96.77419% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/Population.jl | 80.00% | 1 Missing ⚠️ |


@extend_mutation_weights macro (MutationWeights.jl, Core.jl, SymbolicRegression.jl)

Adds a new exported macro that generates a full AbstractMutationWeights subtype
from a short declaration block, eliminating the need to manually copy all 14
standard MutationWeights fields:

    @extend_mutation_weights PerturbWeights begin
        perturb_all_constants::Float64 = 0.5
    end

The macro generates:
- A `Base.@kwdef mutable struct` with all standard fields pre-populated at their
  defaults, plus the declared extra fields
- `Base.copy` (collects all field values into the positional constructor)
- `sample_mutation` (uses `GlobalRef` to extend `MutationWeightsModule.sample_mutation`
  rather than defining a new function in the caller's module; stores field names in a
  module-level `const` to avoid per-call allocation, consistent with `v_mutations`)

All extra fields are validated to be `::Float64` at macro-expansion time, since
`_dispatch_mutations!` treats every fieldname as a mutation type symbol.
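For instance, the declared type could then be used as follows. This is a sketch of the behavior described above, not code from the PR; the standard field defaults come from `MutationWeights`.

```julia
using SymbolicRegression: @extend_mutation_weights

@extend_mutation_weights PerturbWeights begin
    perturb_all_constants::Float64 = 0.5
end

w = PerturbWeights()            # all 14 standard fields at their defaults
w.perturb_all_constants         # 0.5, the declared extra field

w2 = copy(w)                    # generated Base.copy: an independent copy
w2.perturb_all_constants = 1.0  # mutable struct, so fields can be tuned
w.perturb_all_constants == 0.5  # original unchanged
```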

examples/plugin_mutation.jl (new)

Self-contained runnable example demonstrating the full Layer 1 recipe:
- `@extend_mutation_weights` for the weights type
- custom `AbstractOptions` subtype with `getproperty` forwarding
- `mutate!` implementation using `get_scalar_constants`/`set_scalar_constants!`
- `equation_search` call

Run with: `julia --project=. examples/plugin_mutation.jl`

Public API additions (SymbolicRegression.jl)

- Export `@extend_mutation_weights`
- Mark `AbstractPopMember` as `public` so plugin authors can write
  `import SymbolicRegression: AbstractPopMember` instead of reaching into
  the internal `PopMemberModule`

Bug fix: warmup double-initializes worker plugin state (SymbolicRegression.jl)

`_warmup_search!` was passing `Ref{..}(nothing)` to `_dispatch_s_r_cycle`, which
triggered lazy `init_plugin_state` and `on_population_evaluated!` on a throw-away
state that is discarded before the main loop. This violated the documented
"initialized once per worker" contract and caused side effects (e.g., LaSR-style
hooks draining channels) during the JIT warmup pass.

Fix: pre-populate the warmup ref with `NoPluginState()` so the lazy-init
condition (`isnothing(ref[])`) is never true during warmup.

Documentation fixes (plugin-guide.md)

- Replace manual `mutate!` boilerplate snippet with `@extend_mutation_weights`
  usage; add `@docs` block and link to the runnable example
- Add `import SymbolicRegression: AbstractOptions` in three places where
  `AbstractOptions` was used after `using SymbolicRegression` alone (it is
  `public` but not exported, so the bare name was not in scope)
- Clarify `on_generation_complete!` semantics: fires once per population-cycle
  completion, not once per "global" generation step; update both prose and table
- Update `AbstractMutationWeights` docstring: replace the old manual example
  (which silently dropped all 14 standard mutations) with a pointer to
  `@extend_mutation_weights` and a `!!! note` on the Float64 constraint

Tests (test/unit/misc/test_plugin_interface.jl)

- Fix invalid `const init_count` inside `@testitem` scope
- Add test for `@extend_mutation_weights`: verifies standard field defaults,
  field count, `copy` independence, and that `sample_mutation` samples the
  correct field when weight is biased to 1.0
- Add test that macro raises `ErrorException` on non-Float64 field annotation
```julia
import SymbolicRegression: AbstractOptions, AbstractPopMember, MutationResult, mutate!
using DynamicExpressions: AbstractExpression

@extend_mutation_weights MyWeights begin
```
Collaborator


P.S., @atharvas how does LaSR handle this? Does it basically just write out all the mutation weights from scratch?

I am wondering what the right way to do this is. Feels like there could be a better API out there somewhere...

Collaborator


@adil-soubki what about just doing smth like this:

```julia
Base.@kwdef struct MyWeights
    my_mutation::Float64 = 1.0
    base::MutationWeights = MutationWeights()
end
```

And then you can forward calls to the base weights as needed
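The suggested composition could be completed with explicit property forwarding, sketched below. This assumes only that `MutationWeights` is constructible with defaults; whether `sample_mutation` and `_dispatch_mutations!` would accept such a wrapper is exactly the open question in this thread.

```julia
using SymbolicRegression: MutationWeights

Base.@kwdef struct MyWeights
    my_mutation::Float64 = 1.0
    base::MutationWeights = MutationWeights()
end

# Forward any property not defined on the wrapper to the wrapped weights.
function Base.getproperty(w::MyWeights, k::Symbol)
    k in (:my_mutation, :base) && return getfield(w, k)
    return getproperty(getfield(w, :base), k)
end
```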

**New file: `examples/plugin_operator_stats.jl`**

Self-contained, runnable Layer 2 demo — an operator frequency reporter
that counts how often each operator appears across all evaluated
populations and prints a ranked table at the end of the search. It does
not interfere with the search, making it easy to verify correctness by
reading the output.

Demonstrates:
- `AbstractPluginState` holding mutable worker-local data
- `on_population_evaluated!` — worker hook, fires after each `_dispatch_s_r_cycle`
- `on_search_end!` — head-node hook, fires once after all workers finish
- The Channel pattern: workers push `Dict{String,Int}` batches,
  head drains and aggregates in `on_search_end!`

Run with: `julia --project=. examples/plugin_operator_stats.jl`
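The Channel pattern itself can be shown in plain Julia, independent of SymbolicRegression.jl. The operator counts below are made up for illustration.

```julia
# Workers push per-batch operator counts; the head node drains and aggregates.
stats_channel = Channel{Dict{String,Int}}(Inf)

# Worker side (e.g. from inside on_population_evaluated!):
put!(stats_channel, Dict("+" => 12, "*" => 7))
put!(stats_channel, Dict("+" => 3, "cos" => 5))

# Head-node side (e.g. from inside on_search_end!): drain without blocking.
totals = Dict{String,Int}()
while isready(stats_channel)
    for (op, n) in take!(stats_channel)
        totals[op] = get(totals, op, 0) + n
    end
end
# totals now holds "+" => 15, "*" => 7, "cos" => 5
```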

**Bug fix (`src/SymbolicRegression.jl`)**: `on_search_end!` was called
as the very first thing in `_tear_down!` — before `close_reader!`,
before `rmprocs`, and before waiting for in-flight worker tasks. In
`:multithreading` mode this meant workers could still be writing to
Channels when the head node tried to drain them. Moved the call to after
the full worker-cleanup block so the hook always sees complete worker output.

**Docstring / documentation fixes**:

- `init_member` semantics were wrong throughout: it uses the **head
  node's** state (not a per-worker copy), during initial population
  creation only, and runs concurrently in `:multithreading` mode. Updated
  the `AbstractPluginState` type docstring, the `init_member` docstring
  (new `!!! note "State used"` block), the plugin-guide.md thread-safety
  table ("Worker" → "Head node (initial population creation only)"), and
  the Rules section.
- `on_search_start!`: clarify it fires before warmup.
- `on_search_end!`: "before tearing down workers" → "after all workers
  have completed, before tearing down processes/threads".
- `plugin-guide.md` point 6: add `seed_members` persistence note
  (persists across generations; must `empty!` for one-shot injection).
- `plugin-guide.md`: add link to `plugin_operator_stats.jl` after the
  LaSR worked example.

**Minor**:
- Remove trailing comma from `@compat public` import block.
- Improve warmup `NoPluginState()` comment.
- `examples/plugin_mutation.jl`: drop `result =` capture and
  `println(result)` — SR.jl already prints the HoF table.
- `README.md`: add brief Plugin Interface section.
- Add `on_mutation_evaluated!(plugin_state, mutation_type, accepted, dataset, options)`
  lifecycle hook, firing once per `next_generation` return with the mutation
  type and accept/reject outcome
- Wire 5 call sites in Mutate.jl (return_immediately, constraint failure,
  NaN loss, annealing rejection, normal acceptance)
- Re-export from Core.jl and add to @compat public in SymbolicRegression.jl
- Add test verifying hook fires with valid mutation types and both accepted
  and rejected outcomes
- Add examples/plugin_adaptive_weights.jl: EMA-based adaptive mutation
  weights using on_mutation_evaluated! + on_population_evaluated! +
  on_generation_complete!, with periodic per-generation weight logging
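The EMA signal in that example can be sketched in plain Julia. The smoothing factor and field names below are illustrative assumptions, not taken from the example file.

```julia
const α = 0.1  # assumed EMA smoothing factor

# One EMA of the acceptance rate per mutation type.
ema = Dict{Symbol,Float64}()

# Called once per on_mutation_evaluated! event (sketch).
function record!(ema::Dict{Symbol,Float64}, mutation_type::Symbol, accepted::Bool)
    x = accepted ? 1.0 : 0.0
    prev = get(ema, mutation_type, x)  # seed with the first observation
    ema[mutation_type] = (1 - α) * prev + α * x
    return ema
end

record!(ema, :mutate_constant, true)   # EMA stays at 1.0
record!(ema, :mutate_constant, false)  # decays to 0.9
```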
Pass `delta_loss = before_loss - after_loss` (a `Float64`) to the hook so
that plugin authors get a temperature-independent, continuous signal
instead of only the noisy accepted/rejected boolean.

- NaN on constraint failure or NaN loss paths
- Finite value on all valid evaluation paths (accepted or rejected)
- Example updated to use delta_loss > 0 as improvement criterion
- Intermediate weight tables now show prev/adapted columns with ▲/▼
…on_evaluated! hook

The on_mutation_evaluated! hook previously received a single `delta_loss`
(before - after), which loses information: you couldn't distinguish a valid
evaluation that was stochastically rejected (finite delta) from a constraint
failure or NaN eval (NaN delta).

The hook now receives `before_loss` and `after_loss` separately:
- `before_loss`: always finite (NaN losses are never propagated into the population)
- `after_loss`: NaN on constraint failure or NaN eval; finite if the tree
  was successfully evaluated (including annealing rejections)

This makes the independent information in `accepted` vs `after_loss` explicit:
a plugin can distinguish "valid tree, probabilistically rejected" from "invalid
tree, never evaluated" by checking `isnan(after_loss)`.
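Concretely, a plugin can classify the three outcome kinds from `accepted` and `after_loss` alone. This is a sketch of the contract described above, not code from the PR.

```julia
function classify(accepted::Bool, before_loss::Float64, after_loss::Float64)
    if isnan(after_loss)
        return :invalid             # constraint failure or NaN evaluation
    elseif accepted
        return :accepted            # valid tree, kept
    else
        return :valid_but_rejected  # valid tree, stochastically rejected
    end
end

classify(false, 1.0, NaN)  # :invalid
classify(true,  1.0, 0.5)  # :accepted
classify(false, 1.0, 1.2)  # :valid_but_rejected
```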

Changes:
- src/Plugin.jl: updated hook signature, default no-op, and docstring
- src/Mutate.jl: updated all four call sites to pass before_loss/after_loss
- test/unit/misc/test_plugin_interface.jl: updated to 4-tuple events, added
  isfinite(before_loss) assertion, added isnan(after_loss) assertion (with
  maxsize=5 to reliably trigger constraint failures)
- examples/plugin_adaptive_weights.jl: updated to use the new signature;
  switched signal metric from binary success rate to mean relative improvement
  (before - after) / before; fixed normalization to use current unobserved
  weights rather than defaults so total weight mass is exactly conserved;
  added :simplify to the skip list (EMA would unfairly penalize it for not
  maximally improving loss); wrapped Step 7 in _run_adaptive_example() with
  an abspath(PROGRAM_FILE) == @__FILE__ guard so the file can be safely
  included by other scripts without triggering the search
- examples/compare_adaptive_vs_baseline.jl: new script for side-by-side
  comparison of adaptive vs baseline mutation weights