TTFX: Upstream invalidation fixes to reduce compilation time by ~40k invalidations #4358

@ChrisRackauckas-Claude

Problem

Profiling the TTFX of a standard MTK DAE case (Cartesian pendulum) on Julia 1.12 reveals ~127s total time-to-first-solve, with the dominant cost being recompilation from method invalidations:

  • @mtkbuild (structural simplification): 66.5s (85% recompilation)
  • ODEProblem construction: 31.5s (58% recompilation)
  • First solve: 15.7s
  • Package loading: 13.2s

Using SnoopCompile.@snoop_invalidations, we identified 70,261 total invalidations across 621 trees when loading the MTK + OrdinaryDiffEq stack. The top 3 sources account for 57% of all invalidations:

#  Package                   Method                                             Invalidations  % of Total
─  ────────────────────────  ─────────────────────────────────────────────────  ─────────────  ──────────
1  SparseMatrixColorings.jl  eltype(::SparsityPatternCSC{T})                    24,906         35.4%
2  BlockArrays.jl            getindex(::AbstractArray{T,N}, ::BlockIndices{N})  12,082         17.2%
3  StaticArrays.jl           any(f::Function, a::StaticArray; dims)             3,081          4.4%

Upstream PRs

Fixes have been submitted to all three packages:

  1. SparseMatrixColorings.jl: gdalle/SparseMatrixColorings.jl#304

    • Base.eltype(::SparsityPatternCSC{Ti}) was returning Ti (the index type) instead of Bool, invalidating all eltype(::AbstractArray) backedges
    • Fix: Replaced the eltype method with SparseArrays.indtype, since Ti is the index type, not the element type
  2. BlockArrays.jl: JuliaArrays/BlockArrays.jl#499

    • getindex(::AbstractArray{T,N}, ::BlockIndices{N}) invalidated all compiled getindex instances because BlockIndices <: AbstractArray
    • Fix: Removed the AbstractArray fallback and added specialized unblock methods that decompose BlockIndices into block-level + sub-indexing for non-blocked axes
  3. StaticArrays.jl: JuliaArrays/StaticArrays.jl#1334

    • any(f::Function, a::StaticArray; dims) invalidated Base.any(::Function, ::AbstractArray)
    • Fix: Override Base._any/Base._all (internal dispatch targets) instead of the entry-point methods
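To make the first fix concrete, here is a minimal sketch of the difference between a matrix's element type and its index type. ToyPatternCSC and toy_indtype are hypothetical stand-ins, not the package's code, and the sketch assumes (as the description above implies) that the pattern type is an AbstractMatrix{Bool} parameterized by its index type Ti.

```julia
# Hypothetical stand-in for a CSC sparsity pattern: the entries are Bool,
# while Ti is only the integer type used to store indices.
struct ToyPatternCSC{Ti<:Integer} <: AbstractMatrix{Bool}
    m::Int
    n::Int
    colptr::Vector{Ti}
    rowval::Vector{Ti}
end

Base.size(S::ToyPatternCSC) = (S.m, S.n)

# An entry is `true` when row i is stored in column j.
function Base.getindex(S::ToyPatternCSC, i::Integer, j::Integer)
    for k in S.colptr[j]:(S.colptr[j + 1] - 1)
        S.rowval[k] == i && return true
    end
    return false
end

# Hypothetical accessor playing the role of SparseArrays.indtype:
# it exposes Ti without adding any method to Base.eltype.
toy_indtype(::ToyPatternCSC{Ti}) where {Ti} = Ti

# 2x2 pattern with stored entries on the diagonal.
S = ToyPatternCSC{Int32}(2, 2, Int32[1, 2, 3], Int32[1, 2])

element_type = eltype(S)       # comes from the AbstractMatrix{Bool} fallback
index_type = toy_indtype(S)    # the index type Ti
```

Because eltype is left to the generic AbstractArray fallback here, no new eltype method is added, so the sketch has nothing that could invalidate code compiled against it.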

All three fixes pass their respective full test suites locally.
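The third fix follows a general pattern: leave the public entry point alone and specialize the internal function it forwards to. The sketch below imitates that pattern with hypothetical names (toy_any, _toy_any, ToyStaticVec); it is an illustration under those assumptions and does not reproduce the actual Base or StaticArrays code.

```julia
# Public entry point: a single generic method that only forwards.
# Callers are compiled against this method.
toy_any(f, a; dims=:) = _toy_any(f, a, dims)

# Internal dispatch target with a generic fallback.
function _toy_any(f, a, ::Colon)
    for x in a
        f(x) && return true
    end
    return false
end

# Hypothetical fixed-size vector standing in for a StaticArray.
struct ToyStaticVec{N,T} <: AbstractVector{T}
    data::NTuple{N,T}
end
Base.size(::ToyStaticVec{N}) where {N} = (N,)
Base.getindex(v::ToyStaticVec, i::Int) = v.data[i]

# Specialize the internal target instead of adding a second toy_any method.
_toy_any(f, v::ToyStaticVec, ::Colon) = any(f, v.data)

v = ToyStaticVec((1, 3, 4))
has_even = toy_any(iseven, v)

# The entry point's method table is unchanged by the specialization.
n_entry_methods = length(methods(toy_any))
n_internal_methods = length(methods(_toy_any))
```

The specialized behavior is still reached through toy_any, but the entry point keeps exactly one method; only the internal function gains a new one.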

Expected impact

Eliminating ~40,069 invalidations (57% of total) should significantly reduce the recompilation overhead in @mtkbuild and ODEProblem construction, which together account for ~98s of the ~127s TTFX.
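As a rough cross-check of these figures, the arithmetic can be reproduced from the numbers reported in this issue. The sketch assumes the Recompile% values are fractions of compilation time, as in the output format of Julia's @time; the variable names are illustrative only.

```julia
# Invalidation counts reported above.
total_invalidations = 70_261
top3 = [24_906, 12_082, 3_081]

removed = sum(top3)                          # invalidations addressed by the three PRs
share = removed / total_invalidations        # fraction of all invalidations
remaining = total_invalidations - removed    # left for later profiling

# (phase, seconds, compile fraction, recompile fraction of compile time)
phases = [
    ("@mtkbuild", 66.5, 0.9993, 0.85),
    ("ODEProblem construction", 31.5, 0.9982, 0.58),
]

# Time spent in the two phases, and the part of it attributed to recompilation.
phase_seconds = sum(t for (_, t, _, _) in phases)
recompile_seconds = sum(t * c * r for (_, t, c, r) in phases)

println("removed = ", removed, " (", round(100 * share; digits=1), "%)")
println("remaining = ", remaining)
println("recompilation = ", round(recompile_seconds; digits=1), "s of ",
        round(phase_seconds; digits=1), "s")
```

Under that assumption, roughly 75s of the ~98s is recompilation. That is an upper bound on what removing every invalidation could save; the share recovered by the top 3 fixes need not be proportional to their 57% share of the invalidation count.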

Full profiling details

TTFX breakdown (Julia 1.12.4, MTK v11.12.0)

Phase                          Time      Compile%  Recompile%
─────────────────────────────  ────────  ────────  ──────────
Package loading (MTK)          ~11.0s    -         -
Package loading (ODE solvers)  ~0.7s     -         -
Model definition               ~1.3s     99.72%    -
@mtkbuild (structural simpl.)  ~66.5s    99.93%    85%
ODEProblem construction        ~31.5s    99.82%    58%
First solve                    ~15.7s    99.96%    -
Second solve                   ~0.001s   -         -
─────────────────────────────  ────────  ────────  ──────────
TOTAL TTFX                     ~127s

Top 10 import times (@time_imports)

Package                        Time (ms)  % of total
──────────────────────────     ─────────  ──────────
SymbolicUtils                     870      12.1%
BlockArrays                       716       9.9%
ArrayLayouts                      680       9.4%
NonlinearSolveFirstOrder          583       8.1%
Pkg→REPLExt                       577       8.0%
MutableArithmetics                528       7.3%
SciMLBase                         474       6.6%
ModelingToolkit                   330       4.6%
OrdinaryDiffEqNonlinearSolve      325       4.5%
FillArrays                        308       4.3%

Remaining invalidation sources (after top 3 fixed)

The remaining ~30k invalidations come from many smaller sources. The next largest contributors should be profiled once the top 3 fixes are merged to see the updated landscape.
