island solver#1303

Open
thowell wants to merge 1 commit into google-deepmind:main from thowell:island_solver

thowell (Collaborator) commented Apr 20, 2026

add island solver

  • jacobian: dense, sparse
  • solver: newton, cg
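For readers unfamiliar with constraint islands, here is a minimal sketch of the idea, not the code in this PR. It assumes an island is a connected component of a graph whose nodes are kinematic trees and whose edges are active constraints coupling two trees. That assumption is suggested by the `tree_edges` and `flood_fill` stages in the event traces below. The function name `find_islands` and its inputs are hypothetical.

```python
from collections import deque


def find_islands(ntree, edges):
    """Label each tree with an island id using a breadth-first flood fill.

    ntree: number of kinematic trees (graph nodes); hypothetical input.
    edges: iterable of (tree_a, tree_b) pairs, one per active constraint.
           A constraint acting within a single tree is the pair (t, t).
    Returns (island, nisland); unconstrained trees keep island id -1.
    """
    # Build an undirected adjacency list from the constraint edges.
    adjacency = [[] for _ in range(ntree)]
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    island = [-1] * ntree
    nisland = 0
    for seed in range(ntree):
        # Skip trees that are already labeled or have no constraints.
        if island[seed] != -1 or not adjacency[seed]:
            continue
        island[seed] = nisland
        queue = deque([seed])
        while queue:
            tree = queue.popleft()
            for neighbor in adjacency[tree]:
                if island[neighbor] == -1:
                    island[neighbor] = nisland
                    queue.append(neighbor)
        nisland += 1
    return island, nisland


# Trees 0-1-2 are coupled, tree 3 has only a self constraint, tree 4 is free.
labels, count = find_islands(5, [(0, 1), (1, 2), (3, 3)])
print(labels, count)
```

Each island can then be solved as an independent, smaller problem, which is the motivation for comparing against the monolithic solve in the benchmarks below.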

humanoid: monolithic, newton, dense

mjwarp-testspeed benchmarks/humanoid/humanoid.xml --nworld=8192 --nconmax=24 --njmax=64 -o "opt.use_islands=False" -o "opt.solver="newton"" -o "opt.jacobian="dense"" --event_trace --memory --info --measure_alloc --measure_solver

Summary for 8192 parallel rollouts

Total JIT time: 0.37 s
Total simulation time: 1.99 s
Total steps per second: 4,107,666
Total realtime factor: 20,538.33 x
Total time per step: 243.45 ns
Total converged worlds: 8192 / 8192
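The summary metrics above appear to be related by simple arithmetic. The sketch below checks that relation against the numbers reported for this run. The 0.005 s timestep is an assumption inferred from the ratio of realtime factor to steps per second; it is not stated in the log. The helper names are hypothetical.

```python
ASSUMED_TIMESTEP_S = 0.005  # assumption: inferred from the reported ratio


def steps_per_second(time_per_step_ns):
    """Aggregate steps per second across all worlds from time per step."""
    return 1e9 / time_per_step_ns


def realtime_factor(steps_per_sec, timestep_s=ASSUMED_TIMESTEP_S):
    """Simulated seconds per wall-clock second, summed over all worlds."""
    return steps_per_sec * timestep_s


# Values copied from the summary above.
reported_time_per_step_ns = 243.45
reported_steps_per_second = 4_107_666
reported_realtime_factor = 20_538.33

print(steps_per_second(reported_time_per_step_ns))
print(realtime_factor(reported_steps_per_second))
```

The first value agrees with the reported steps per second up to rounding of the time per step, and the same relation holds for the other runs in this comment.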

Event trace:

step: 241.46
  
    solve: 122.45
      mul_m: 5.97
    sensor_acc: 3.18

nacon alloc:

mean    std      min     max   
-------------------------------
125330  42559.5   49152  188823
184071    12864  151170  193299
190897  1166.86  189094  192776
191825  703.674  189959  192596
190507  688.169  189219  191474
189070  288.936  188687  189631
189313  184.752  188949  189676
188363  218.536  188108  188994
188766   495.19  188152  189892
191547  934.322  189962  193035

nefc alloc:

mean   std       min  max
-------------------------
36.32   12.1679   11   49
 48.7   1.23693   47   51
51.65   1.26787   50   55
   55         0   55   55
   55         0   55   55
   55         0   55   55
   55         0   55   55
55.31  0.462493   55   56
55.86  0.632772   55   57
56.41  0.491833   56   57

solver niter:

mean     std       min  max
---------------------------
 1.9597  0.991898    1    8
1.81859  0.863628    1    7
1.27772  0.552244    1    8
1.27851   0.61241    1    8
1.34154  0.623869    1    8
1.32148  0.576217    1    7
1.38368  0.641771    1    8
1.39315  0.645936    1    8
1.35529  0.615481    1    8
1.35027  0.610405    1    8

Model memory 0.03 MiB (0.01% of used memory):
 (no field >= 1% of used memory)
Data memory 247.91 MiB (67.37% of used memory):

Other memory: 120.06 MiB (32.62% of used memory)
Total memory: 368.00 MiB (0.76% of total device memory)

humanoid: island, newton, dense

mjwarp-testspeed benchmarks/humanoid/humanoid.xml --nworld=8192 --nconmax=24 --njmax=64 -o "opt.use_islands=True" -o "opt.solver="newton"" -o "opt.jacobian="dense"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 8192 parallel rollouts

Total JIT time: 0.29 s
Total simulation time: 6.85 s
Total steps per second: 1,195,074
Total realtime factor: 5,975.37 x
Total time per step: 836.77 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 834.76

      island: 3.88
        tree_edges: 2.42
        flood_fill: 0.66
    
    solve: 711.57
      compute_island_mapping: 9.28
      gather_island_inputs: 34.33
      _solve_islands: 664.56
        mul_m_island: 8.40
        _update_constraint_island: 6.85
        _update_gradient_island_newton: 159.25
      scatter_island_results: 2.08

nacon alloc:

mean    std      min     max   
-------------------------------
125330  42559.6   49152  188823
184071  12863.9  151171  193289
190897  1165.86  189112  192771
191837  706.376  189961  192618
190522  690.813  189255  191488
189088  292.398  188699  189657
189348  178.284  188998  189702
188391  227.155  188125  189025
188766   483.26  188171  189902
191529  938.436  189930  193036

nefc alloc:

mean   std       min  max
-------------------------
36.32   12.1679   11   49
 48.7   1.23693   47   51
51.65   1.26787   50   55
   55         0   55   55
55.03  0.298496   55   58
   55         0   55   55
   55         0   55   55
55.62  0.596322   55   57
55.79  0.604897   55   57
56.34  0.751266   56   58

solver niter:

mean        std       min  max
------------------------------
 0.0260046  0.291834    0    8
0.00376831  0.138823    0    7
0.00198242  0.104657    0    8
 0.0025061  0.122892    0    8
0.00230591  0.113185    0    8
0.00231934  0.110295    0    7
0.00319702  0.129615    0    8
0.00237427  0.113947    0    9
0.00185547  0.105415    0    9
0.00218384  0.112463    0    9

Model memory 0.03 MiB (0.01% of used memory):
 (no field >= 1% of used memory)
Data memory 247.91 MiB (57.65% of used memory):

Other memory: 182.06 MiB (42.34% of used memory)
Total memory: 430.00 MiB (0.88% of total device memory)

three humanoids: monolithic, newton, sparse

mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192 -o "opt.use_islands=False" -o "opt.solver="newton"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 8192 parallel rollouts

Total JIT time: 10.86 s
Total simulation time: 11.71 s
Total steps per second: 699,736
Total realtime factor: 3,498.68 x
Total time per step: 1429.11 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 1426.72

    solve: 1014.63
      mul_m: 6.75

nacon alloc:

mean    std      min     max   
-------------------------------
224216  59100.8  150536  331934
288903  31706.9  230056  330433
377805  29018.2  330938  416367
436762  10552.4  416516  458325
484967  14660.3  458858  508841
526886  8554.28  509286  538758
544362  2906.76  538925  548691
552115  2043.23  548664  555391
558604  1947.16  555447  562082
565599  1798.22  562115  568480

nefc alloc:

mean    std       min  max
--------------------------
104.39   14.3833   78  126
 120.1   3.02159  115  132
145.11   13.4855  130  171
175.76    1.3647  168  176
175.98  0.198997  174  176
175.74   0.54074  174  176
175.87  0.559553  175  177
176.83  0.375633  176  177
176.07  0.815537  173  177
175.04   1.78841  169  177

solver niter:

mean     std       min  max
---------------------------
2.61624   1.16035    1   10
2.78211    1.0825    1   10
3.31995   1.10896    1   12
3.38718   1.14931    1   11
 3.1064   1.17156    1   10
2.96407   1.18365    1   10
 2.6378   1.07454    1   10
2.43273   1.00831    1    9
2.31909  0.965218    1    9
2.28188  0.946758    1   10

Model memory 0.15 MiB (0.01% of used memory):
 (no field >= 1% of used memory)
Data memory 886.13 MiB (55.38% of used memory):

Other memory: 713.72 MiB (44.61% of used memory)
Total memory: 1600.00 MiB (3.29% of total device memory)

three humanoids: island, newton, sparse

mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192 -o "opt.use_islands=True" -o "opt.solver="newton"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 8192 parallel rollouts

Total JIT time: 0.28 s
Total simulation time: 26.98 s
Total steps per second: 303,619
Total realtime factor: 1,518.09 x
Total time per step: 3293.60 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 3290.98
 
      island: 5.08
        tree_edges: 3.43
        flood_fill: 0.85
      transmission: 4.84
    
    solve: 2878.30
      compute_island_mapping: 28.13
      gather_island_inputs: 57.49
      _solve_islands: 2785.06
        mul_m_island: 8.56
        _update_constraint_island: 34.67
        _update_gradient_island_newton: 530.51
      scatter_island_results: 6.10

nacon alloc:

mean    std      min     max   
-------------------------------
224217  59100.9  150535  331936
288904  31709.5  230058  330465
377806  29025.7  330944  416372
436713  10558.6  416531  458334
485029  14697.6  458819  509005
526952  8575.69  509425  538741
544390  2922.58  538876  548758
552263  2008.89  548850  555522
558674  1896.63  555596  562175
565614  1783.01  562196  568487

nefc alloc:

mean    std       min  max
--------------------------
104.38    14.372   78  126
 120.1   3.02159  115  132
145.15   13.5058  130  171
175.72   1.41478  168  176
175.98  0.198997  174  176
175.84  0.463033  174  176
175.85   0.68374  172  177
176.83  0.375633  176  177
176.14  0.693109  173  177
175.02   1.84922  169  177

solver niter:

mean        std       min  max
------------------------------
0.00486572  0.164696    0    9
0.00425171  0.172917    0   10
0.00282959  0.154298    0   11
0.00293213  0.155729    0   11
0.00326416  0.161586    0   11
0.00332397  0.163264    0   10
0.00326904  0.155707    0    9
 0.0025415  0.132411    0    9
0.00355957  0.151931    0    9
0.00260254  0.131827    0    9

Model memory 0.15 MiB (0.01% of used memory):
 (no field >= 1% of used memory)
Data memory 886.13 MiB (52.81% of used memory):

Other memory: 791.72 MiB (47.18% of used memory)
Total memory: 1678.00 MiB (3.45% of total device memory)

three humanoids: monolithic, cg, sparse

mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192 -o "opt.use_islands=False" -o "opt.solver="cg"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 8192 parallel rollouts

Total JIT time: 0.33 s
Total simulation time: 33.74 s
Total steps per second: 242,806
Total realtime factor: 1,214.03 x
Total time per step: 4118.51 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 4115.81
  
    solve: 3687.33
      mul_m: 6.94
      solve_m: 15.27

nacon alloc:

mean    std      min     max   
-------------------------------
224217  59101.7  150512  331930
288882  31718.9  230056  330323
377776  28998.7  330903  416225
436806  10619.1  416174  458339
484965  14652.1  458917  508970
526865  8572.93  509354  538756
544557  2922.69  538917  548920
552382  2029.46  548924  555606
558565  1879.09  555643  562005
565493  1817.36  562007  568497

nefc alloc:

mean    std       min  max
--------------------------
104.38    14.372   78  126
120.07   3.10243  114  133
145.06   13.8281  131  177
175.84      1.12  168  176
175.96      0.28  174  176
175.72  0.735935  174  178
175.84  0.703136  174  177
 176.7  0.458258  176  177
175.81   0.85668  172  177
175.15   1.82414  171  177

solver niter:

mean     std      min  max
--------------------------
27.5409  10.0002   10  100
 28.498  9.08936   11  100
29.4747   6.8135   11  100
29.6585  7.92503   11  100
27.8478  9.50745    9  100
25.5289  9.20017    9  100
21.9026  6.38348    9  100
20.1549  5.39458    9  100
 19.209  4.89946    8   89
18.8773  4.70942    8   90

Model memory 0.15 MiB (0.01% of used memory):
 (no field >= 1% of used memory)
Data memory 886.13 MiB (80.41% of used memory):

Other memory: 215.72 MiB (19.58% of used memory)
Total memory: 1102.00 MiB (2.27% of total device memory)

three humanoids: island, cg, sparse

mjwarp-testspeed benchmarks/humanoid/three_humanoids.xml --nworld=8192 --nconmax=100 --njmax=192 -o "opt.use_islands=True" -o "opt.solver="cg"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 8192 parallel rollouts

Total JIT time: 0.29 s
Total simulation time: 47.24 s
Total steps per second: 173,406
Total realtime factor: 867.03 x
Total time per step: 5766.81 ns
Total converged worlds: 8192 / 8192

Event trace:

step: 5763.58
  
      island: 5.23
        tree_edges: 3.53
        flood_fill: 0.88
      transmission: 5.48
  
    solve: 5339.11
      compute_island_mapping: 29.21
      gather_island_inputs: 58.13
      _solve_islands: 5246.01
        mul_m_island: 8.73
        _update_constraint_island: 35.15
        _update_gradient_island: 19.75
          solve_m_island: 16.78
            solve_m: 14.50
      scatter_island_results: 4.20

nacon alloc:

mean    std      min     max   
-------------------------------
224211  59095.7  150512  331915
288909  31710.1  230070  330434
377803  29019.5  330932  416423
436818  10637.2  416465  458623
484883  14598.4  459070  508660
526722   8530.3  509045  538608
544214  2895.52  538823  548552
552003  2021.19  548496  555328
558471  1939.71  555388  562051
565668  1776.79  562198  568506

nefc alloc:

mean    std       min  max
--------------------------
104.39   14.3833   78  126
120.03   3.04123  115  132
144.92   13.1967  131  170
175.72   1.41478  168  176
175.96      0.28  174  176
175.64  0.574804  174  176
175.84  0.595315  175  177
176.82  0.384187  176  177
175.86   0.76184  172  178
174.98   1.68511  172  177

solver niter:

mean        std       min  max
------------------------------
 0.0083606  0.679955    0   99
 0.0097876  0.829331    0  100
0.00935303  0.817928    0  100
 0.0113013  0.965254    0  100
  0.013302   1.10927    0  100
  0.016488   1.25677    0  100
0.00970459  0.872545    0  100
0.00866699  0.756244    0  100
 0.0072522   0.64194    0  100
0.00717285  0.632332    0  100

Model memory 0.15 MiB (0.01% of used memory):
 (no field >= 1% of used memory)
Data memory 886.13 MiB (57.17% of used memory):

Other memory: 663.72 MiB (42.82% of used memory)
Total memory: 1550.00 MiB (3.19% of total device memory)

mujoco/model/humanoid/22_humanoids.xml: monolithic, newton, sparse

mjwarp-testspeed ../mujoco/model/humanoid/22_humanoids.xml --nworld=4096 --nconmax=1000 --njmax=4000 --nstep=100 -o "opt.use_islands=False" -o "opt.solver="newton"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 4096 parallel rollouts

Total JIT time: 10.52 s
Total simulation time: 53.80 s
Total steps per second: 7,614
Total realtime factor: 38.07 x
Total time per step: 131344.97 ns
Total converged worlds: 4096 / 4096

Event trace:

step: 131266.59
 
    solve: 127252.48
      mul_m: 54.63

nacon alloc:

mean    std      min     max   
-------------------------------
711248  18989.1  655360  720896
706247  5654.76  696549  712704
703544  8848.23  696059  719440
702937  2632.66  701227  710060
700257  749.916  699387  701810
706327  2852.91  701757  709159
704144  3620.84  696059  708303
723380  52129.3  684685  841684
929273  16399.2  889214  945130
812146  66219.7  698727  903521

nefc alloc:

mean   std      min  max
------------------------
695.2  18.7446  640  704
695.5  1.56525  692  698
705.7  11.6795  692  725
710.4  2.87054  707  717
719.3  5.25452  712  728
  739  5.31037  728  745
746.5  4.29535  740  752
749.5  3.10644  747  758
807.3  15.7991  771  824
785.9  6.39453  778  802

solver niter:

mean     std       min  max
---------------------------
1.78958  0.752719    1    5
1.75471  0.716038    1    6
3.00054  0.865771    1    7
2.46475  0.808027    1    7
3.43237   1.00803    1    8
3.58306  0.951092    1    8
3.33528   1.02608    1    9
3.98877   1.12086    1   10
4.93096   1.30705    2   13
4.38918   1.15871    2   12

Model memory 4.57 MiB (0.02% of used memory):
 (no field >= 1% of used memory)
Data memory 6381.59 MiB (34.50% of used memory):

Other memory: 12109.83 MiB (65.47% of used memory)
Total memory: 18496.00 MiB (38.03% of total device memory)

mujoco/model/humanoid/22_humanoids.xml: island, newton, sparse

mjwarp-testspeed ../mujoco/model/humanoid/22_humanoids.xml --nworld=4096 --nconmax=1000 --njmax=4000 --nstep=100 -o "opt.use_islands=True" -o "opt.solver="newton"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 4096 parallel rollouts

Total JIT time: 0.28 s
Total simulation time: 14.68 s
Total steps per second: 27,905
Total realtime factor: 139.52 x
Total time per step: 35836.30 ns
Total converged worlds: 4096 / 4096

Event trace:

step: 35772.01
 
      island: 47.53
        tree_edges: 26.21
        flood_fill: 19.21
      transmission: 105.94
  
    solve: 31673.63
      compute_island_mapping: 327.25
      gather_island_inputs: 505.67
      _solve_islands: 30792.58
        mul_m_island: 81.59
        _update_constraint_island: 308.61
        _update_gradient_island_newton: 7285.73
      scatter_island_results: 43.99

nacon alloc:

mean    std      min     max   
-------------------------------
711248  18989.1  655360  720896
706247  5654.45  696549  712704
703543  8848.42  696059  719440
702938  2631.69  701237  710059
700260  754.304  699373  701824
706332  2852.75  701764  709191
704143  3620.04  696085  708282
723386  52124.1  684743  841685
929272  16396.7  889218  945129
812140  66219.4  698725  903517

nefc alloc:

mean   std      min  max
------------------------
695.2  18.7446  640  704
695.5  1.56525  692  698
705.7  11.6795  692  725
710.4  2.87054  707  717
719.3  5.25452  712  728
  739  5.31037  728  745
746.5  4.29535  740  752
749.5  3.10644  747  758
807.7     14.9  775  824
785.9  6.39453  778  802

solver niter:

mean        std       min  max
------------------------------
 0.0313477  0.354097    0    5
0.00649414  0.167725    0    5
0.00952148   0.23135    0    8
0.00568848   0.17425    0    7
0.00717773    0.2074    0    6
 0.0119141  0.269647    0    7
0.00888672  0.234415    0    7
 0.0154785  0.314454    0    8
 0.0114014  0.296203    0    9
0.00786133  0.246036    0    9

Model memory 4.57 MiB (0.03% of used memory):
 (no field >= 1% of used memory)
Data memory 6381.59 MiB (37.88% of used memory):

Other memory: 10459.83 MiB (62.09% of used memory)
Total memory: 16846.00 MiB (34.63% of total device memory)

mujoco/model/humanoid/22_humanoids.xml: monolithic, cg, sparse

mjwarp-testspeed ../mujoco/model/humanoid/22_humanoids.xml --nworld=4096 --nconmax=1000 --njmax=4000 --nstep=100 -o "opt.use_islands=False" -o "opt.solver="cg"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 4096 parallel rollouts

Total JIT time: 0.33 s
Total simulation time: 34.06 s
Total steps per second: 12,025
Total realtime factor: 60.13 x
Total time per step: 83157.11 ns
Total converged worlds: 4096 / 4096

Event trace:

step: 83131.55
  
    solve: 78942.95
      mul_m: 55.65
      solve_m: 143.38

nacon alloc:

mean    std      min     max   
-------------------------------
711246  18988.9  655360  720896
706233   5654.1  696546  712704
703546  8851.73  696059  719444
702935  2634.98  701203  710061
700263  750.976  699378  701807
706384  2879.24  701762  709328
704160   3597.8  696117  708285
723624  52396.3  684669  842352
929384  16242.5  889907  945167
811813    66394  698012  903398

nefc alloc:

mean   std      min  max
------------------------
695.2  18.7446  640  704
695.5  1.56525  692  698
705.7  11.6795  692  725
710.4  2.87054  707  717
719.4   5.2192  712  728
738.4  4.88262  728  745
746.5  4.29535  740  752
749.6  3.44093  746  759
807.6  15.6346  771  823
785.9  6.39453  778  802

solver niter:

mean     std      min  max
--------------------------
18.1097  4.81382   12   33
 18.736  5.88986   10   39
  27.64  4.47904   14   47
23.9244  3.99587   13   64
35.5156   10.787   13   88
31.9945  7.14788   17   87
30.5227  5.96464   16   59
39.2051  8.93398   17   74
64.0329  6.79207   39   90
57.0233  6.86902   25   81

Model memory 4.57 MiB (0.06% of used memory):
 (no field >= 1% of used memory)
Data memory 6381.59 MiB (89.65% of used memory):

Other memory: 731.83 MiB (10.28% of used memory)
Total memory: 7118.00 MiB (14.63% of total device memory)

mujoco/model/humanoid/22_humanoids.xml: island, cg, sparse

mjwarp-testspeed ../mujoco/model/humanoid/22_humanoids.xml --nworld=4096 --nconmax=1000 --njmax=4000 --nstep=100 -o "opt.use_islands=True" -o "opt.solver="cg"" -o "opt.jacobian="sparse"" --event_trace --memory --info --measure_alloc --measure_solver
Summary for 4096 parallel rollouts

Total JIT time: 0.28 s
Total simulation time: 15.59 s
Total steps per second: 26,279
Total realtime factor: 131.39 x
Total time per step: 38053.35 ns
Total converged worlds: 4096 / 4096

Event trace:

step: 38010.27

      island: 50.80
        tree_edges: 27.72
        flood_fill: 20.85
      transmission: 114.01
   
    solve: 33755.45
      compute_island_mapping: 343.38
      gather_island_inputs: 510.59
      _solve_islands: 32857.43
        mul_m_island: 83.20
        _update_constraint_island: 316.31
        _update_gradient_island: 171.70
          solve_m_island: 155.55
            solve_m: 144.50
      scatter_island_results: 39.68

nacon alloc:

mean    std      min     max   
-------------------------------
711246  18988.8  655360  720896
706240  5654.66  696547  712704
703548  8853.29  696058  719440
702935  2631.29  701209  710060
700264  744.981  699382  701798
706346  2862.43  701767  709176
704148  3605.46  696116  708294
723398  52154.6  684666  841734
929282  16377.2  889299  945133
812101  66248.8  698645  903521

nefc alloc:

mean   std      min  max
------------------------
695.2  18.7446  640  704
695.5  1.56525  692  698
705.7  11.6795  692  725
710.4  2.87054  707  717
719.3  5.25452  712  728
739.4  4.90306  728  745
746.6  4.15211  741  752
749.7  3.34813  747  759
807.5  15.3704  775  824
785.5  6.91737  778  802

solver niter:

mean        std       min  max
------------------------------
0.00898437  0.428897    0   32
 0.0114258  0.592817    0   43
 0.0178223  0.816561    0   51
 0.0133057  0.836102    0   89
  0.019458   1.25304    0  100
 0.0177002   1.05081    0   75
 0.0158447   0.86123    0   57
 0.0179932   1.05533    0   75
 0.0167969   1.07978    0   82
 0.0204346   1.25942    0  100

Model memory 4.57 MiB (0.04% of used memory):
 (no field >= 1% of used memory)
Data memory 6381.59 MiB (54.87% of used memory):

Other memory: 5243.83 MiB (45.09% of used memory)
Total memory: 11630.00 MiB (23.91% of total device memory)

summary: monolithic v island (total steps per second)

humanoid newton dense: 4,107,666 v 1,195,074

three humanoids newton sparse: 699,736 v 303,619
three humanoids cg sparse: 242,806 v 173,406

22 humanoids newton sparse: 7,614 v 27,905
22 humanoids cg sparse: 12,025 v 26,279
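To make the comparison easier to read, the short script below turns the steps-per-second pairs from the summary into island-over-monolithic ratios. The numbers are copied from the runs above; the dictionary layout and the `island_speedup` helper are only illustrative.

```python
# Total steps per second copied from the summary: (monolithic, island).
results = {
    "humanoid newton dense": (4_107_666, 1_195_074),
    "three humanoids newton sparse": (699_736, 303_619),
    "three humanoids cg sparse": (242_806, 173_406),
    "22 humanoids newton sparse": (7_614, 27_905),
    "22 humanoids cg sparse": (12_025, 26_279),
}


def island_speedup(monolithic, island):
    """Ratio above 1 means the island solver ran more steps per second."""
    return island / monolithic


speedups = {name: island_speedup(m, i) for name, (m, i) in results.items()}
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

In these runs the island solver is slower on humanoid and three humanoids (ratios below 1) and faster on 22 humanoids (ratios above 1) for both newton and cg.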


mjwarp-viewer ../mujoco/model/humanoid/22_humanoids.xml --nconmax=1000 --njmax=4000 -o "opt.use_islands=True"
22_humanoids_island_solver.mov
