Description
Problem
Profiling the TTFX of a standard MTK DAE case (Cartesian pendulum) on Julia 1.12 reveals ~127s total time-to-first-solve, with the dominant cost being recompilation from method invalidations:
- `@mtkbuild` (structural simplification): 66.5s (85% recompilation)
- `ODEProblem` construction: 31.5s (58% recompilation)
- First `solve`: 15.7s
- Package loading: 13.2s
Using `SnoopCompile.@snoop_invalidations`, we identified 70,261 total invalidations across 621 trees when loading the MTK + OrdinaryDiffEq stack. The top 3 sources account for 57% of all invalidations:
| # | Package | Method | Invalidations | % of Total |
|---|---|---|---|---|
| 1 | SparseMatrixColorings.jl | `eltype(::SparsityPatternCSC{T})` | 24,906 | 35.4% |
| 2 | BlockArrays.jl | `getindex(::AbstractArray{T,N}, ::BlockIndices{N})` | 12,082 | 17.2% |
| 3 | StaticArrays.jl | `any(f::Function, a::StaticArray; dims)` | 3,081 | 4.4% |
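Counts like the ones in the table above can be gathered roughly as follows. This is a minimal sketch, assuming a recent SnoopCompile release in which `@snoop_invalidations` lives in SnoopCompileCore; the exact counting method behind the 70,261 figure is not stated in this issue, so `uinvalidated` is only one plausible way to total them.

```julia
# Run in a fresh Julia session so that nothing is loaded before recording starts.
using SnoopCompileCore

# Record every invalidation triggered while loading the stack.
invs = @snoop_invalidations using ModelingToolkit, OrdinaryDiffEq

# Load the analysis code only after recording, so it does not affect the measurement.
using SnoopCompile

# Group the raw log into trees, one per method definition that triggered invalidations.
trees = invalidation_trees(invs)

println("unique invalidated instances: ", length(uinvalidated(invs)))
println("invalidation trees:           ", length(trees))

# The trees are sorted in ascending order of impact, so the largest offenders come last.
for tree in last(trees, 3)
    println(tree.method)
end
```

Each printed method is the definition whose loading invalidated previously compiled code, which is the information the "Method" column above summarizes.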
Upstream PRs
Fixes have been submitted to all three packages:
- SparseMatrixColorings.jl: gdalle/SparseMatrixColorings.jl#304
  - `Base.eltype(::SparsityPatternCSC{Ti})` was returning `Ti` (the index type) instead of `Bool`, invalidating all `eltype(::AbstractArray)` backedges
  - Fix: Changed to `SparseArrays.indtype` since `Ti` is the index type, not the element type
- BlockArrays.jl: JuliaArrays/BlockArrays.jl#499
  - `getindex(::AbstractArray{T,N}, ::BlockIndices{N})` invalidated all compiled `getindex` instances because `BlockIndices <: AbstractArray`
  - Fix: Removed the `AbstractArray` fallback and added specialized `unblock` methods that decompose `BlockIndices` into block-level + sub-indexing for non-blocked axes
- StaticArrays.jl: JuliaArrays/StaticArrays.jl#1334
  - `any(f::Function, a::StaticArray; dims)` invalidated `Base.any(::Function, ::AbstractArray)`
  - Fix: Override `Base._any`/`Base._all` (internal dispatch targets) instead of the entry-point methods
All three fixes pass their respective full test suites locally.
Expected impact
Eliminating ~40,069 invalidations (57% of total) should significantly reduce the recompilation overhead in `@mtkbuild` and `ODEProblem` construction, which together account for ~98s of the ~127s TTFX.
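The figures in this estimate follow directly from the numbers reported earlier. A short sanity check, using only values copied from the table and the TTFX summary (the variable names are illustrative):

```julia
# Invalidation counts copied from the top-3 table.
total_invalidations = 70_261
top3 = [
    "SparseMatrixColorings.jl" => 24_906,
    "BlockArrays.jl"           => 12_082,
    "StaticArrays.jl"          => 3_081,
]

# Sum the counts (the second element of each pair).
removed = sum(last, top3)
share = round(100 * removed / total_invalidations; digits = 1)
remaining = total_invalidations - removed

# Time spent in the two phases dominated by recompilation (seconds).
recompile_heavy = 66.5 + 31.5

println("removed = $removed ($share% of total), remaining = $remaining")
println("@mtkbuild + ODEProblem = $(recompile_heavy)s")
```

This reproduces the 40,069 total, the 57% share, and the ~98s attributed to `@mtkbuild` plus `ODEProblem` construction, and leaves roughly 30k invalidations from other sources.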
Full profiling details
TTFX breakdown (Julia 1.12.4, MTK v11.12.0)
| Phase | Time | Compile% | Recompile% |
|---|---|---|---|
| Package loading (MTK) | ~11.0s | - | - |
| Package loading (ODE solvers) | ~0.7s | - | - |
| Model definition | ~1.3s | 99.72% | - |
| `@mtkbuild` (structural simpl.) | ~66.5s | 99.93% | 85% |
| `ODEProblem` construction | ~31.5s | 99.82% | 58% |
| First solve | ~15.7s | 99.96% | - |
| Second solve | ~0.001s | - | - |
| TOTAL TTFX | ~127s | | |
Top 10 import times (@time_imports)
| Package | Time (ms) | % of total |
|---|---|---|
| SymbolicUtils | 870 | 12.1% |
| BlockArrays | 716 | 9.9% |
| ArrayLayouts | 680 | 9.4% |
| NonlinearSolveFirstOrder | 583 | 8.1% |
| Pkg→REPLExt | 577 | 8.0% |
| MutableArithmetics | 528 | 7.3% |
| SciMLBase | 474 | 6.6% |
| ModelingToolkit | 330 | 4.6% |
| OrdinaryDiffEqNonlinearSolve | 325 | 4.5% |
| FillArrays | 308 | 4.3% |
Remaining invalidation sources (after top 3 fixed)
The remaining ~30k invalidations come from many smaller sources. The next largest contributors should be profiled once the top 3 fixes are merged to see the updated landscape.