feat: add plugin interface #586
Draft
adil-soubki wants to merge 8 commits into astroautomata:master
Adds a two-layer plugin system for research extensions:
**Layer 1** (already works): algorithm overrides via `AbstractOptions`
dispatch — `compute_complexity`, `optimize_constants`, etc. All relevant
internal functions already use `::AbstractOptions` signatures.
**Layer 2** (new): lifecycle hooks + persistent mutable state via
`AbstractPlugin` / `AbstractPluginState`.
New hook call sites:
- `on_search_start!` — end of `_initialize_search!`
- `on_search_end!` — start of `_tear_down!`
- `on_generation_complete!` — after migration in `_main_search_loop!`
- `on_population_evaluated!` — end of `_dispatch_s_r_cycle`
- `init_member` — in `Population` constructor via `_init_tree` helper
Per-(output, population) `Ref{AbstractPluginState}` allocated outside
the main loop so worker state persists across iterations. Lazily
initialized on first worker call. All new code paths are no-ops for
`NoPlugin` with zero overhead.
Note: a design simplification (removing `AbstractPlugin` in favour of
dispatching directly on `AbstractOptions`/`AbstractPluginState`) has
been decided and will be applied in a follow-up commit.
Introduces a two-layer extension system for SymbolicRegression.jl:
**Layer 1 — Algorithm overrides via AbstractOptions dispatch**
Any exported function that takes `options::AbstractOptions` can be overridden
by dispatching on a custom options subtype. Documents `compute_complexity`,
`eval_cost`, `optimize_constants`, and `mutate!` as explicit override points.
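A minimal sketch of the Layer 1 idea, for readers unfamiliar with the pattern. Everything here is a toy stand-in: `ToyAbstractOptions`, `HeavyMulOptions`, and `toy_complexity` are hypothetical names and not part of SymbolicRegression.jl. It only illustrates how a more specific method on an options subtype replaces a library default without touching the caller.

```julia
# Toy stand-ins (hypothetical names, not the SymbolicRegression.jl API).
abstract type ToyAbstractOptions end
struct ToyDefaultOptions <: ToyAbstractOptions end
struct HeavyMulOptions <: ToyAbstractOptions end

# "Library" default: every node costs 1.
toy_complexity(tree::Vector{Symbol}, ::ToyAbstractOptions) = length(tree)

# "Plugin" override: multiplication nodes cost 3, everything else costs 1.
function toy_complexity(tree::Vector{Symbol}, ::HeavyMulOptions)
    return sum(op -> op === :* ? 3 : 1, tree; init=0)
end

# Internal code only sees the abstract type, so it picks up the override
# automatically through dispatch.
toy_search_step(tree, options::ToyAbstractOptions) = toy_complexity(tree, options)

tree = [:+, :*, :x, :y]
default_cost = toy_search_step(tree, ToyDefaultOptions())
plugin_cost = toy_search_step(tree, HeavyMulOptions())
```

With the four-node toy tree above, the default method counts 4 and the override counts 6, because the single `:*` node is weighted 3.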
**Layer 2 — Lifecycle hooks + persistent per-worker state**
New `src/Plugin.jl` defines:
- `AbstractPluginState` / `NoPluginState` — per-worker mutable state
- `init_plugin_state(options, datasets)` — dispatches on AbstractOptions subtype
- Five hooks: `on_search_start!`, `on_search_end!`, `on_generation_complete!`,
`on_population_evaluated!`, `init_member`
Hook call sites wired throughout the search loop:
- `SearchState` carries head-node plugin state
- Per-(output, population) worker state Refs allocated outside the main loop
so state persists across iterations
- Lazy per-worker init on first `_dispatch_s_r_cycle` call via `nothing` sentinel
- `init_member` hooks into `Population` construction for custom tree initialization
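The lazy initialization and persistence described above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `LazyToyState`, `LazyCounterState`, `lazy_toy_hook!`, and `lazy_toy_cycle!` are hypothetical names, and the real hooks take more arguments.

```julia
# Toy stand-ins (hypothetical names, not the SymbolicRegression.jl API).
abstract type LazyToyState end
mutable struct LazyCounterState <: LazyToyState
    cycles::Int
end

# Default hook is a no-op; a plugin adds a method for its own state type.
lazy_toy_hook!(::LazyToyState) = nothing
lazy_toy_hook!(state::LazyCounterState) = (state.cycles += 1; nothing)

# One worker cycle: initialize the state on first use, then fire the hook.
function lazy_toy_cycle!(ref::Ref{Union{Nothing,LazyToyState}}, make_state)
    if isnothing(ref[])          # `nothing` is the "not yet initialized" sentinel
        ref[] = make_state()
    end
    lazy_toy_hook!(ref[])
    return nothing
end

function lazy_toy_search(niterations::Int, npopulations::Int)
    # Refs are allocated once, outside the loop, so state survives iterations.
    refs = [Ref{Union{Nothing,LazyToyState}}(nothing) for _ in 1:npopulations]
    for _ in 1:niterations, ref in refs
        lazy_toy_cycle!(ref, () -> LazyCounterState(0))
    end
    return [ref[].cycles for ref in refs]
end

cycle_counts = lazy_toy_search(3, 2)
```

Each of the two toy populations ends with a count of 3, which shows that the state was created once per ref and then reused rather than rebuilt on every iteration.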
**Design notes**
- No `AbstractPlugin`/`NoPlugin`/`get_plugin` indirection — plugin config lives
directly in `MyOptions` fields; hooks dispatch on `AbstractPluginState` subtype
- `init_plugin_state` receives a `Vector{<:Dataset}` on the head node and a
`Tuple{<:Dataset}` on workers (documented)
- All plugin types marked `public` (not exported); users use `import`
**Files changed**
- `src/Plugin.jl` (new) — all types, defaults, hook signatures
- `src/Core.jl` — include + re-export
- `src/SearchUtils.jl` — `plugin_state` field in `SearchState`
- `src/SymbolicRegression.jl` — hook call sites, worker Refs, exports, `_tear_down!`
- `src/Population.jl` — `_init_tree` helper + `plugin_state` kwarg
- `src/Mutate.jl` / `RegularizedEvolution.jl` / `SingleIteration.jl` — thread through
- `src/Complexity.jl`, `LossFunctions.jl`, `ConstantOptimization.jl` — Layer 1 docstrings
- `docs/src/plugin-guide.md` (new) — full development guide including LaSR-style
worked example, threading/multiprocessing safety notes, package structure
- `test/unit/misc/test_plugin_interface.jl` (new) — lifecycle counter test + init_member test
All existing tests pass.
@extend_mutation_weights macro (MutationWeights.jl, Core.jl, SymbolicRegression.jl)
Adds a new exported macro that generates a full AbstractMutationWeights subtype
from a short declaration block, eliminating the need to manually copy all 14
standard MutationWeights fields:
    @extend_mutation_weights PerturbWeights begin
        perturb_all_constants::Float64 = 0.5
    end
The macro generates:
- A `Base.@kwdef mutable struct` with all standard fields pre-populated at their
defaults, plus the declared extra fields
- `Base.copy` (collects all field values into the positional constructor)
- `sample_mutation` (uses GlobalRef to extend MutationWeightsModule.sample_mutation,
not create a new function in the caller; stores field names in a module-level
`const` to avoid per-call allocation, consistent with `v_mutations`)
All extra fields are validated to be `::Float64` at macro-expansion time, since
`_dispatch_mutations!` treats every fieldname as a mutation type symbol.
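To make the expansion-time validation concrete, here is a much smaller toy macro in the same spirit. It is a sketch under assumptions and not the implementation in this PR: `@toy_extend_weights` and `TOY_STANDARD_FIELDS` are hypothetical, only two placeholder "standard" fields are used instead of the real 14, and the generated `copy` and `sample_mutation` methods are omitted.

```julia
# Placeholder "standard" fields with made-up defaults (the real type has 14).
const TOY_STANDARD_FIELDS = [
    :(mutate_constant::Float64 = 0.5),
    :(add_node::Float64 = 2.0),
]

macro toy_extend_weights(name, block)
    Meta.isexpr(block, :block) || error("expected a `begin ... end` block")
    extras = filter(ex -> !(ex isa LineNumberNode), block.args)
    for ex in extras
        # Accept only declarations of the form `field::Float64 = default`.
        valid = Meta.isexpr(ex, :(=)) &&
            Meta.isexpr(ex.args[1], :(::)) &&
            ex.args[1].args[2] === :Float64
        valid || error("expected `field::Float64 = default`, got `$ex`")
    end
    fields = vcat(deepcopy(TOY_STANDARD_FIELDS), extras)
    return esc(quote
        Base.@kwdef mutable struct $name
            $(fields...)
        end
    end)
end

@toy_extend_weights ToyPerturbWeights begin
    perturb_all_constants::Float64 = 0.5
end

toy_weights = ToyPerturbWeights(; add_node=1.0)
```

Because the check runs inside the macro body, a declaration such as `flag::Bool = true` fails while the macro is being expanded, before any struct is defined.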
examples/plugin_mutation.jl (new)
Self-contained runnable example demonstrating the full Layer 1 recipe:
- `@extend_mutation_weights` for the weights type
- custom `AbstractOptions` subtype with `getproperty` forwarding
- `mutate!` implementation using `get_scalar_constants`/`set_scalar_constants!`
- `equation_search` call
Run with: `julia --project=. examples/plugin_mutation.jl`
Public API additions (SymbolicRegression.jl)
- Export `@extend_mutation_weights`
- Mark `AbstractPopMember` as `public` so plugin authors can write
`import SymbolicRegression: AbstractPopMember` instead of reaching into
the internal `PopMemberModule`
Bug fix: warmup double-initializes worker plugin state (SymbolicRegression.jl)
`_warmup_search!` was passing `Ref{..}(nothing)` to `_dispatch_s_r_cycle`, which
triggered lazy `init_plugin_state` and `on_population_evaluated!` on a throw-away
state that is discarded before the main loop. This violated the documented
"initialized once per worker" contract and caused side effects (e.g., LaSR-style
hooks draining channels) during the JIT warmup pass.
Fix: pre-populate the warmup ref with `NoPluginState()` so the lazy-init
condition (`isnothing(ref[])`) is never true during warmup.
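A small sketch of why pre-populating the ref avoids the problem. All names (`WarmupToyState`, `WarmupNoState`, `DrainerState`, `warmup_toy_cycle!`) are hypothetical stand-ins, and the init counter exists only to make the effect observable.

```julia
# Toy stand-ins (hypothetical names, not the SymbolicRegression.jl API).
abstract type WarmupToyState end
struct WarmupNoState <: WarmupToyState end
mutable struct DrainerState <: WarmupToyState
    drains::Int
end

const TOY_INIT_CALLS = Ref(0)
toy_init_state() = (TOY_INIT_CALLS[] += 1; DrainerState(0))

warmup_toy_hook!(::WarmupToyState) = nothing                  # no-op default
warmup_toy_hook!(s::DrainerState) = (s.drains += 1; nothing)  # has side effects

function warmup_toy_cycle!(ref)
    isnothing(ref[]) && (ref[] = toy_init_state())
    warmup_toy_hook!(ref[])
    return nothing
end

# Warmup pass: the ref already holds a no-op state, so lazy init never fires
# and the hook with side effects is never reached.
warmup_ref = Ref{Union{Nothing,WarmupToyState}}(WarmupNoState())
warmup_toy_cycle!(warmup_ref)

# Main loop: the ref starts at `nothing`, so the state is created exactly once.
worker_ref = Ref{Union{Nothing,WarmupToyState}}(nothing)
warmup_toy_cycle!(worker_ref)
warmup_toy_cycle!(worker_ref)
```

After this runs, the init counter is 1 and the worker state has seen two cycles, while the warmup ref still holds the no-op state.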
Documentation fixes (plugin-guide.md)
- Replace manual `mutate!` boilerplate snippet with `@extend_mutation_weights`
usage; add `@docs` block and link to the runnable example
- Add `import SymbolicRegression: AbstractOptions` in three places where
`AbstractOptions` was used after `using SymbolicRegression` alone (it is
`public` but not exported, so the bare name was not in scope)
- Clarify `on_generation_complete!` semantics: fires once per population-cycle
completion, not once per "global" generation step; update both prose and table
- Update `AbstractMutationWeights` docstring: replace the old manual example
(which silently dropped all 14 standard mutations) with a pointer to
`@extend_mutation_weights` and a `!!! note` on the Float64 constraint
Tests (test/unit/misc/test_plugin_interface.jl)
- Fix invalid `const init_count` inside `@testitem` scope
- Add test for `@extend_mutation_weights`: verifies standard field defaults,
field count, `copy` independence, and that `sample_mutation` samples the
correct field when weight is biased to 1.0
- Add test that macro raises `ErrorException` on non-Float64 field annotation
    import SymbolicRegression: AbstractOptions, AbstractPopMember, MutationResult, mutate!
    using DynamicExpressions: AbstractExpression

    @extend_mutation_weights MyWeights begin
Collaborator
P.S., @atharvas how does LaSR handle this? Does it basically just write out all the mutation weights from scratch?
I am wondering what the right way to do this is. Feels like there could be a better API out there somewhere...
Collaborator
@adil-soubki what about just doing smth like this:
    Base.@kwdef struct MyWeights
        my_mutation::Float64 = 1.0
        base::MutationWeights = MutationWeights()
    end

And then you can forward calls to the base weights as needed
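One way the forwarding could look, sketched with a toy base type so the snippet runs on its own. `ToyBaseWeights` and `ComposedWeights` are hypothetical, and the two base fields are placeholders rather than the real `MutationWeights` fields. Whether the search internals would accept such a wrapper is a separate question that this sketch does not answer.

```julia
# Placeholder for the standard weights (hypothetical fields and defaults).
Base.@kwdef struct ToyBaseWeights
    mutate_constant::Float64 = 0.5
    add_node::Float64 = 2.0
end

# Composition: the custom weight lives next to a wrapped base object.
Base.@kwdef struct ComposedWeights
    my_mutation::Float64 = 1.0
    base::ToyBaseWeights = ToyBaseWeights()
end

# Forward unknown properties to the wrapped base weights.
function Base.getproperty(w::ComposedWeights, name::Symbol)
    if name === :my_mutation || name === :base
        return getfield(w, name)
    end
    return getproperty(getfield(w, :base), name)
end

# Report own and forwarded names together, so generic code can enumerate them.
function Base.propertynames(w::ComposedWeights)
    return (:my_mutation, propertynames(getfield(w, :base))...)
end

composed = ComposedWeights(; base=ToyBaseWeights(; add_node=3.0))
```

With this in place, `composed.add_node` reads through to the base object while `composed.my_mutation` reads the wrapper's own field.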
**New file: `examples/plugin_operator_stats.jl`**
Self-contained, runnable Layer 2 demo — an operator frequency reporter
that counts how often each operator appears across all evaluated
populations and prints a ranked table at the end of the search. It does
not interfere with the search, making it easy to verify correctness by
reading the output.
Demonstrates:
- `AbstractPluginState` holding mutable worker-local data
- `on_population_evaluated!` — worker hook, fires after each s_r_cycle
- `on_search_end!` — head-node hook, fires once after all workers finish
- The Channel pattern: workers push `Dict{String,Int}` batches,
head drains and aggregates in `on_search_end!`
Run with: `julia --project=. examples/plugin_operator_stats.jl`
**Bug fix (`src/SymbolicRegression.jl`)**: `on_search_end!` was called
as the very first thing in `_tear_down!` — before `close_reader!`,
before `rmprocs`, and before waiting for in-flight worker tasks. In
`:multithreading` mode this meant workers could still be writing to
Channels when the head node tried to drain them. Moved the call to after
the full worker-cleanup block so the hook always sees complete worker output.
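The Channel pattern and the ordering requirement behind this fix can be sketched in a few lines. This is a toy illustration, not the code in `examples/plugin_operator_stats.jl`: `aggregate_operator_counts` is a hypothetical helper, and spawned tasks stand in for search workers.

```julia
# Toy sketch of "workers push batches, head drains after they finish".
function aggregate_operator_counts(batches::Vector{Dict{String,Int}})
    channel = Channel{Dict{String,Int}}(Inf)  # unbounded, so put! never blocks

    # "Workers": each task pushes one batch of operator counts.
    @sync for batch in batches
        Threads.@spawn put!(channel, batch)
    end
    # @sync has returned, so every worker has finished writing. Draining
    # before this point could miss batches that are still in flight.

    totals = Dict{String,Int}()
    while isready(channel)
        for (op, n) in take!(channel)
            totals[op] = get(totals, op, 0) + n
        end
    end
    # Ranked table: most frequent operator first.
    return sort!(collect(totals); by=last, rev=true)
end

ranked_ops = aggregate_operator_counts([
    Dict("+" => 2, "*" => 1),
    Dict("+" => 1, "cos" => 4),
])
```

The drain loop only runs once all tasks have completed, which mirrors moving `on_search_end!` to after the worker-cleanup block.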
**Docstring / documentation fixes**:
- `init_member` semantics were wrong throughout: it uses the **head
node's** state (not a per-worker copy), is called during initial
population creation only, and runs concurrently in `:multithreading`
mode. Updated the `AbstractPluginState` type docstring, `init_member`
docstring (new `!!! note "State used"` block), plugin-guide.md thread
safety table ("Worker" → "Head node (initial population creation only)"),
and Rules section.
- `on_search_start!`: clarify it fires before warmup.
- `on_search_end!`: "before tearing down workers" → "after all workers
have completed, before tearing down processes/threads".
- `plugin-guide.md` point 6: add `seed_members` persistence note
(persists across generations; must `empty!` for one-shot injection).
- `plugin-guide.md`: add link to `plugin_operator_stats.jl` after the
LaSR worked example.
**Minor**:
- Remove trailing comma from `@compat public` import block.
- Improve warmup `NoPluginState()` comment.
- `examples/plugin_mutation.jl`: drop `result =` capture and
`println(result)` — SR.jl already prints the HoF table.
- `README.md`: add brief Plugin Interface section.
- Add `on_mutation_evaluated!(plugin_state, mutation_type, accepted, dataset, options)` lifecycle hook, firing once per `next_generation` return with the mutation type and accept/reject outcome
- Wire 5 call sites in `Mutate.jl` (`return_immediately`, constraint failure, NaN loss, annealing rejection, normal acceptance)
- Re-export from `Core.jl` and add to `@compat public` in `SymbolicRegression.jl`
- Add test verifying hook fires with valid mutation types and both accepted and rejected outcomes
- Add `examples/plugin_adaptive_weights.jl`: EMA-based adaptive mutation weights using `on_mutation_evaluated!` + `on_population_evaluated!` + `on_generation_complete!`, with periodic per-generation weight logging
Pass `delta_loss = before_loss - after_loss` (Float64) to the hook so plugin authors get a temperature-independent, continuous signal instead of the noisy accepted/rejected boolean alone.
- NaN on constraint failure or NaN loss paths
- Finite value on all valid evaluation paths (accepted or rejected)
- Example updated to use `delta_loss > 0` as improvement criterion
- Intermediate weight tables now show prev/adapted columns with ▲/▼
…on_evaluated! hook

The `on_mutation_evaluated!` hook previously received a single `delta_loss` (before - after), which loses information: you couldn't distinguish a valid evaluation that was stochastically rejected (finite delta) from a constraint failure or NaN eval (NaN delta).

The hook now receives `before_loss` and `after_loss` separately:
- `before_loss`: always finite (NaN losses are never propagated into the population)
- `after_loss`: NaN on constraint failure or NaN eval; finite if the tree was successfully evaluated (including annealing rejections)

This makes the independent information in `accepted` vs `after_loss` explicit: a plugin can distinguish "valid tree, probabilistically rejected" from "invalid tree, never evaluated" by checking `isnan(after_loss)`.

Changes:
- `src/Plugin.jl`: updated hook signature, default no-op, and docstring
- `src/Mutate.jl`: updated all four call sites to pass `before_loss`/`after_loss`
- `test/unit/misc/test_plugin_interface.jl`: updated to 4-tuple events, added `isfinite(before_loss)` assertion, added `isnan(after_loss)` assertion (with `maxsize=5` to reliably trigger constraint failures)
- `examples/plugin_adaptive_weights.jl`:
  - updated to use the new signature
  - switched signal metric from binary success rate to mean relative improvement `(before - after) / before`
  - fixed normalization to use current unobserved weights rather than defaults so total weight mass is exactly conserved
  - added `:simplify` to the skip list (EMA would unfairly penalize it for not maximally improving loss)
  - wrapped Step 7 in `_run_adaptive_example()` with an `abspath(PROGRAM_FILE) == @__FILE__` guard so the file can be safely included by other scripts without triggering the search
- `examples/compare_adaptive_vs_baseline.jl`: new script for side-by-side comparison of adaptive vs baseline mutation weights
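A rough sketch of how a plugin might consume `accepted`, `before_loss`, and `after_loss`. The helper names (`classify_mutation`, `relative_improvement`, `ema_update`, `update_signal!`) and the smoothing factor are assumptions made for illustration; this is not the code in `examples/plugin_adaptive_weights.jl`, and it ignores the hook's other arguments.

```julia
# Three-way outcome, using the NaN convention described in the commit message.
function classify_mutation(accepted::Bool, before_loss::Float64, after_loss::Float64)
    isnan(after_loss) && return :invalid           # constraint failure or NaN eval
    return accepted ? :accepted : :rejected_valid  # finite loss either way
end

# Relative improvement; positive when the loss went down.
# Assumes before_loss is finite and nonzero.
relative_improvement(before_loss, after_loss) = (before_loss - after_loss) / before_loss

# Exponential moving average with an assumed smoothing factor.
ema_update(previous, observation; alpha=0.1) = (1 - alpha) * previous + alpha * observation

# Update a per-mutation signal, skipping trees that were never evaluated.
function update_signal!(signal::Dict{Symbol,Float64}, mutation_type::Symbol,
                        accepted::Bool, before_loss::Float64, after_loss::Float64)
    classify_mutation(accepted, before_loss, after_loss) === :invalid && return signal
    previous = get(signal, mutation_type, 0.0)
    signal[mutation_type] = ema_update(previous, relative_improvement(before_loss, after_loss))
    return signal
end

toy_signal = Dict{Symbol,Float64}()
update_signal!(toy_signal, :add_node, true, 2.0, 1.0)   # valid, improved
update_signal!(toy_signal, :add_node, false, 2.0, NaN)  # invalid, ignored
```

Note that a rejected mutation with a finite `after_loss` still contributes to the signal, which is the distinction the separate loss arguments make possible.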