
GH-5672 introduce simple learned join optimization that tracks fanout metrics #5673

Draft
hmottestad wants to merge 32 commits into develop from GH-5672-explore-learned-join-optimization


@hmottestad
Contributor

GitHub issue resolved: #5672

Briefly describe the changes proposed in this PR:


PR Author Checklist (see the contributor guidelines for more details):

  • my pull request is self-contained
  • I've added tests for the changes I made
  • I've applied code formatting (you can use mvn process-resources to format from the command line)
  • I've squashed my commits where necessary
  • every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change

@hmottestad
Contributor Author

Ran parts of some benchmarks and found that it generally helps for the LMDB store, but can make things worse for the MemoryStore. Kinda makes sense since the MemoryStore has very accurate statistics.
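For readers new to the idea in the PR title, here is a minimal sketch of what "tracking fanout metrics" could look like. It is not the code from this PR: the class name, the smoothing factor `alpha`, and the definition of fanout as output rows per input row are all assumptions made for illustration.

```python
class FanoutTracker:
    """Hypothetical sketch: learn the observed fanout of each join step.

    Fanout is assumed here to mean rows produced per row fed into the step.
    """

    def __init__(self, alpha=0.2):
        self.alpha = alpha   # weight of the newest observation (assumed value)
        self.fanout = {}     # pattern key -> smoothed fanout

    def record(self, pattern, rows_in, rows_out):
        """Fold one execution's observation into an exponential moving average."""
        if rows_in <= 0:
            return
        observed = rows_out / rows_in
        previous = self.fanout.get(pattern)
        if previous is None:
            self.fanout[pattern] = observed
        else:
            self.fanout[pattern] = (1 - self.alpha) * previous + self.alpha * observed

    def estimate(self, pattern, static_estimate):
        """Prefer the learned fanout; fall back to the store's own estimate."""
        return self.fanout.get(pattern, static_estimate)


def greedy_order(patterns, tracker, static_estimates):
    """Order patterns by ascending estimated fanout (cheapest first)."""
    return sorted(patterns, key=lambda p: tracker.estimate(p, static_estimates[p]))
```

Under this sketch, a store whose static estimates are already accurate gains little from the learned values, which would be consistent with the MemoryStore observation above.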

@hmottestad force-pushed the GH-5672-explore-learned-join-optimization branch from f4400aa to 63ae8d3 on January 26, 2026 14:02
@hmottestad
Contributor Author

LMDB

Benchmark                             (themeName)  (z_queryIndex)  Mode  Cnt    Score   Error  Units
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               0  avgt        49.647          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               1  avgt       158.159          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               2  avgt        14.243          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               3  avgt        82.263          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               4  avgt       109.864          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               5  avgt        63.425          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               6  avgt        58.501          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               7  avgt        49.383          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               8  avgt        54.302          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               9  avgt       271.530          ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS              10  avgt       208.109          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               0  avgt        43.350          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               1  avgt        51.632          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               2  avgt         3.568          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               3  avgt       320.512          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               4  avgt         3.511          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               5  avgt         4.964          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               6  avgt        88.501          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               7  avgt        20.093          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               8  avgt        13.274          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID               9  avgt         4.670          ms/op
ThemeQueryBenchmark.executeQuery  ELECTRICAL_GRID              10  avgt       534.293          ms/op

@hmottestad
Contributor Author

Benchmark                             (themeName)  (z_queryIndex)  Mode  Cnt    Score    Error  Units
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               0  avgt    5   54.043 ±  3.694  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               1  avgt    5  172.954 ±  4.662  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               2  avgt    5   19.069 ±  2.153  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               3  avgt    5  103.410 ±  8.854  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               4  avgt    5  136.816 ±  7.925  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               5  avgt    5   76.785 ±  3.890  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               6  avgt    5   67.350 ±  3.459  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               7  avgt    5   56.240 ±  1.586  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               8  avgt    5   67.391 ±  4.450  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS               9  avgt    5  322.706 ± 18.317  ms/op
ThemeQueryBenchmark.executeQuery  MEDICAL_RECORDS              10  avgt    5  206.549 ± 10.093  ms/op

@hmottestad
Contributor Author

hmottestad commented Jan 31, 2026

LMDB Store - develop branch

Benchmark                              (themeName)  (z_queryIndex)  Mode  Cnt       Score   Error  Units
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               0  avgt    2      51.256          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               1  avgt    2     186.645          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               2  avgt    2      50.005          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               3  avgt    2      98.563          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               4  avgt    2     121.535          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               5  avgt    2      64.928          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               6  avgt    2      61.191          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               7  avgt    2      51.656          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               8  avgt    2      58.814          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               9  avgt    2     272.421          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS              10  avgt    2  293429.747          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               0  avgt    2       0.061          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               1  avgt    2       6.016          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               2  avgt    2       0.076          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               3  avgt    2       0.069          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               4  avgt    2       0.087          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               5  avgt    2     861.699          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               6  avgt    2       0.088          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               7  avgt    2       5.986          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               8  avgt    2     737.210          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               9  avgt    2       5.751          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA              10  avgt    2       2.608          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               0  avgt    2     665.671          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               1  avgt    2     254.135          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               2  avgt    2      36.678          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               3  avgt    2      42.249          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               4  avgt    2     120.662          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               5  avgt    2       9.535          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               6  avgt    2   23784.702          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               7  avgt    2   91930.684          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               8  avgt    2      74.209          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               9  avgt    2     151.046          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY              10  avgt    2     183.888          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               0  avgt    2     233.185          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               1  avgt    2     303.684          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               2  avgt    2       1.289          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               3  avgt    2     150.705          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               4  avgt    2      60.455          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               5  avgt    2       2.357          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               6  avgt    2     248.904          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               7  avgt    2       4.436          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               8  avgt    2       2.528          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               9  avgt    2       3.786          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING              10  avgt    2       1.922          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               0  avgt    2     323.203          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               1  avgt    2    1165.947          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               2  avgt    2     548.036          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               3  avgt    2      99.026          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               4  avgt    2     248.351          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               5  avgt    2     109.557          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               6  avgt    2    1463.763          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               7  avgt    2     152.188          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               8  avgt    2    1170.030          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               9  avgt    2    1481.964          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED              10  avgt    2  126334.595          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               0  avgt    2      37.937          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               1  avgt    2      86.435          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               2  avgt    2       8.738          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               3  avgt    2     162.916          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               4  avgt    2     138.137          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               5  avgt    2      26.519          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               6  avgt    2      97.552          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               7  avgt    2      51.524          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               8  avgt    2     269.601          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               9  avgt    2     262.040          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN              10  avgt    2     226.000          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               0  avgt    2      42.103          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               1  avgt    2      87.022          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               2  avgt    2       4.939          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               3  avgt    2     328.922          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               4  avgt    2       4.621          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               5  avgt    2      11.560          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               6  avgt    2      95.600          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               7  avgt    2      19.453          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               8  avgt    2      13.925          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               9  avgt    2       5.615          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID              10  avgt    2     524.353          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               0  avgt    2       0.303          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               1  avgt    2       1.736          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               2  avgt    2      39.557          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               3  avgt    2      14.988          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               4  avgt    2      31.892          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               5  avgt    2       0.408          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               6  avgt    2       5.374          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               7  avgt    2      22.562          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               8  avgt    2      32.379          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               9  avgt    2      18.538          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA              10  avgt    2   13561.666          ms/op

LMDB - current branch

Benchmark                              (themeName)  (z_queryIndex)  Mode  Cnt       Score   Error  Units
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               0  avgt           47.744          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               1  avgt          163.881          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               2  avgt           13.534          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               3  avgt           76.088          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               4  avgt          118.056          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               5  avgt           41.171          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               6  avgt           52.954          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               7  avgt           40.042          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               8  avgt           50.850          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               9  avgt          260.544          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS              10  avgt          197.068          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               0  avgt            0.062          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               1  avgt            6.802          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               2  avgt            0.078          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               3  avgt            0.069          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               4  avgt            0.090          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               5  avgt          878.428          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               6  avgt            0.094          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               7  avgt            5.957          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               8  avgt         1418.638          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               9  avgt            6.528          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA              10  avgt            2.906          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               0  avgt          635.159          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               1  avgt          281.900          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               2  avgt           38.960          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               3  avgt           39.309          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               4  avgt           50.105          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               5  avgt            6.886          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               6  avgt        24480.634          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               7  avgt         1073.785          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               8  avgt           72.580          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               9  avgt          127.586          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY              10  avgt          181.650          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               0  avgt          209.665          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               1  avgt          261.191          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               2  avgt            1.079          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               3  avgt          118.136          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               4  avgt          123.813          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               5  avgt            1.477          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               6  avgt          189.156          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               7  avgt            3.752          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               8  avgt            2.084          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               9  avgt            2.338          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING              10  avgt            1.533          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               0  avgt          338.324          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               1  avgt         1100.662          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               2  avgt          467.560          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               3  avgt          100.586          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               4  avgt          224.992          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               5  avgt          116.484          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               6  avgt         1265.802          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               7  avgt          114.089          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               8  avgt         1087.505          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               9  avgt         1242.152          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED              10  avgt       141286.116          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               0  avgt           32.008          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               1  avgt           81.153          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               2  avgt            7.534          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               3  avgt          144.042          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               4  avgt          130.057          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               5  avgt           19.557          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               6  avgt           90.876          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               7  avgt           43.371          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               8  avgt          247.294          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               9  avgt          224.832          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN              10  avgt          177.129          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               0  avgt           37.270          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               1  avgt           81.124          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               2  avgt            3.337          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               3  avgt          319.407          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               4  avgt           46.699          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               5  avgt            7.177          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               6  avgt           76.785          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               7  avgt           18.051          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               8  avgt           12.992          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               9  avgt            4.230          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID              10  avgt          463.624          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               0  avgt            0.253          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               1  avgt            1.393          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               2  avgt           42.382          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               3  avgt           10.249          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               4  avgt           33.044          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               5  avgt            0.375          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               6  avgt           35.152          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               7  avgt           22.053          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               8  avgt           23.804          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               9  avgt           14.635          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA              10  avgt          245.959          ms/op
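
The two tables above are easier to compare as ratios. The following is a small sketch for doing that, assuming the JMH text layout shown in this thread (the `Cnt` column is blank in some tables). The function names are made up for this example, and the sample rows are copied from the develop and current-branch tables.

```python
def parse_jmh(text):
    """Map (theme, query index) -> score in ms/op from JMH text output."""
    scores = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 6 or not parts[0].startswith("ThemeQueryBenchmark."):
            continue  # skips the header row and blank lines
        theme, index = parts[1], int(parts[2])
        rest = parts[4:]  # everything after the Mode column
        # The Cnt column is blank in some tables; skip it when present.
        score = float(rest[1]) if rest[0].isdigit() else float(rest[0])
        scores[(theme, index)] = score
    return scores


def speedups(baseline, candidate):
    """Ratio baseline/candidate per query; values above 1 mean the candidate is faster."""
    shared = baseline.keys() & candidate.keys()
    return {key: baseline[key] / candidate[key] for key in shared}


DEVELOP = """\
ThemeQueryBenchmark.executeQuery           LIBRARY               7  avgt    2   91930.684          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA              10  avgt    2   13561.666          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               8  avgt    2     737.210          ms/op
"""

BRANCH = """\
ThemeQueryBenchmark.executeQuery           LIBRARY               7  avgt         1073.785          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA              10  avgt          245.959          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               8  avgt         1418.638          ms/op
"""

ratios = speedups(parse_jmh(DEVELOP), parse_jmh(BRANCH))
for (theme, index), ratio in sorted(ratios.items()):
    print(f"{theme} {index}: {ratio:.2f}x")
```

On these three rows the ratio is well above 1 for LIBRARY 7 and PHARMA 10 and below 1 for SOCIAL_MEDIA 8. The tables carry no error column, so small differences should be read with caution.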

@hmottestad
Contributor Author

[Image: Untitled 12 numbers-Sheet 2]

@hmottestad
Contributor Author

Greedy only

Benchmark                              (themeName)  (z_queryIndex)  Mode  Cnt       Score   Error  Units
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               0  avgt           47.184          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               1  avgt          167.937          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               2  avgt           44.550          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               3  avgt           81.024          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               4  avgt          112.538          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               5  avgt           44.401          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               6  avgt           58.372          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               7  avgt           45.652          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               8  avgt           52.312          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS               9  avgt          276.430          ms/op
ThemeQueryBenchmark.executeQuery   MEDICAL_RECORDS              10  avgt          248.024          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               0  avgt            0.063          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               1  avgt            6.157          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               2  avgt            0.085          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               3  avgt            0.071          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               4  avgt            0.091          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               5  avgt          884.975          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               6  avgt            0.097          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               7  avgt            5.997          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               8  avgt          832.997          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA               9  avgt            6.501          ms/op
ThemeQueryBenchmark.executeQuery      SOCIAL_MEDIA              10  avgt            2.827          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               0  avgt          666.685          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               1  avgt          285.595          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               2  avgt           39.355          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               3  avgt           41.051          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               4  avgt          123.073          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               5  avgt            7.158          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               6  avgt        23766.987          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               7  avgt        50752.718          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               8  avgt           92.749          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY               9  avgt          149.571          ms/op
ThemeQueryBenchmark.executeQuery           LIBRARY              10  avgt          184.438          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               0  avgt          220.602          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               1  avgt          250.381          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               2  avgt            1.095          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               3  avgt          123.884          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               4  avgt          131.283          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               5  avgt            1.587          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               6  avgt          205.274          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               7  avgt            3.843          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               8  avgt            2.205          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING               9  avgt            2.321          ms/op
ThemeQueryBenchmark.executeQuery       ENGINEERING              10  avgt            1.547          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               0  avgt          350.279          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               1  avgt         1105.783          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               2  avgt          843.457          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               3  avgt          100.418          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               4  avgt          240.476          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               5  avgt          118.305          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               6  avgt         1357.955          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               7  avgt          116.490          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               8  avgt         1119.168          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED               9  avgt         1331.186          ms/op
ThemeQueryBenchmark.executeQuery  HIGHLY_CONNECTED              10  avgt       143214.190          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               0  avgt           33.528          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               1  avgt           81.513          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               2  avgt            7.768          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               3  avgt          147.244          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               4  avgt          131.114          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               5  avgt           18.858          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               6  avgt           88.361          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               7  avgt           44.738          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               8  avgt          212.916          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN               9  avgt          228.640          ms/op
ThemeQueryBenchmark.executeQuery             TRAIN              10  avgt          182.709          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               0  avgt           38.748          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               1  avgt           79.991          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               2  avgt            3.558          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               3  avgt          330.158          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               4  avgt            3.365          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               5  avgt            7.067          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               6  avgt           83.344          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               7  avgt           18.608          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               8  avgt           12.851          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID               9  avgt            4.956          ms/op
ThemeQueryBenchmark.executeQuery   ELECTRICAL_GRID              10  avgt          474.914          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               0  avgt            0.256          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               1  avgt            2.033          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               2  avgt           37.610          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               3  avgt           11.456          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               4  avgt           27.441          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               5  avgt            0.380          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               6  avgt            4.666          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               7  avgt           19.869          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               8  avgt           24.607          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA               9  avgt           14.113          ms/op
ThemeQueryBenchmark.executeQuery            PHARMA              10  avgt          246.578          ms/op

@kenwenzel
Contributor

@hmottestad I really like this - thank you :-)
Do you also have some ideas for improving the existing cardinality estimation within LmdbStore?

@hmottestad
Contributor Author

> @hmottestad I really like this - thank you :-) Do you also have some ideas for improving the existing cardinality estimation within LmdbStore?

How does it work today?

@kenwenzel
Contributor

kenwenzel commented Feb 2, 2026

It tries to estimate the cardinality based on the distance between the keys. For the first n keys, the distance between each pair of consecutive keys is measured.
Then more samples are taken, e.g., in the middle of the existing key range and at the end.
Finally, an estimate is computed from the measured distance between two keys.
(Sounds a bit complicated. Maybe I need to create a formula for this ;-))
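
One way to read the description above is the following rough sketch. It is not the LmdbStore code: integer keys, a sorted list standing in for a cursor, the window size `n`, and both function names are assumptions made for illustration. The estimate is taken as the key range divided by the mean sampled gap, plus one.

```python
from bisect import bisect_left


def sample_gaps(keys, start_key, n):
    """Seek to start_key and return up to n gaps between consecutive keys."""
    i = bisect_left(keys, start_key)
    window = keys[i:i + n + 1]
    return [b - a for a, b in zip(window, window[1:])]


def estimate_cardinality(keys, n=4):
    """Estimate how many keys lie in the range from sampled key distances.

    keys is a sorted list of distinct integers (a stand-in for index keys).
    """
    if len(keys) < 2:
        return float(len(keys))
    first, last = keys[0], keys[-1]
    gaps = sample_gaps(keys, first, n)                  # start of the range
    gaps += sample_gaps(keys, (first + last) // 2, n)   # middle of the key range
    tail = keys[-(n + 1):]                              # end of the range
    gaps += [b - a for a, b in zip(tail, tail[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return (last - first) / mean_gap + 1
```

With evenly spaced keys this sketch recovers the exact count. With a skewed key distribution the sampled gaps are not representative and the estimate can be far off.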

BTW, libmdbx, a fork of LMDB, has something like the NativeStore's B-tree node estimation built in.

@hmottestad
Contributor Author

What happens if you don't have a perfect match for the index? Does it account for the fact that a lot of results will be discarded very quickly?
