Fix 1770, O3mda8 model maps, selectively for countour maps #1774

charlienegri · 2025-12-18T15:33:26Z

Change Summary

Calculation of contour model maps for conco3mda8 in the map engine.

Related issue number

Checklist

Start with a draft-PR
The PR title is a good summary of the changes
PR is set to AeroTools and a tentative milestone
Documentation reflects the changes where applicable
Tests for the changes exist where applicable
Tests pass locally
Tests pass on CI
At least 1 reviewer is selected
Make PR ready to review

codecov · 2025-12-18T15:38:34Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.50%. Comparing base (357b9b4) to head (603ef16).
⚠️ Report is 37 commits behind head on main-dev.

Additional details and impacted files

@@             Coverage Diff              @@
##           main-dev    #1774      +/-   ##
============================================
+ Coverage     78.35%   78.50%   +0.15%     
============================================
  Files           176      176              
  Lines         23414    23445      +31     
============================================
+ Hits          18346    18406      +60     
+ Misses         5068     5039      -29

Flag	Coverage Δ
unittests	`78.50% <100.00%> (+0.15%)`	⬆️

charlienegri · 2026-01-08T08:13:59Z

charlienegri · 2026-01-08T11:55:29Z

@dulte @heikoklein this approach makes it possible to compute the maps for a single full season, which is what Svetlana needs if I understood correctly, but for 11 seasons it takes close to 40h so I would not use the flag in that case...

if you see a smarter way of implementing it let me know

heikoklein

I don't have any better ideas, but I don't understand why the mda8 calculation of 3d data should take 40 hours for a year. Maybe you have a simple test case where we can check performance?

Possible pitfalls might be that the dataset is computed several times. An hourly dataset will be too much for memory, while a daily dataset should work fine, so maybe try to compute before moving back to iris?

Maybe try some dask parallelization? (Also easier to test with a simpler test case.)

heikoklein · 2026-01-15T14:27:02Z

pyaerocom/aeroval/modelmaps_engine.py

+        data2.rolling(time=8, center=False, min_periods=6)
+        .mean("time")
+        .resample(time="24h", origin="start_day", label="left", offset="1h")
+        .reduce(lambda x, axis: np.apply_along_axis(min_periods_max, 0, x, min_periods=18))


I find the code in stats/mda8/mda8.py easier to read:
mda8 = _daily_max(_rolling_average_8hr(data))

Maybe you can reuse that part (and/or put your code there)?

It cannot be reused directly, at least not the _daily_max part, because the structure of the object handled is different, and so the axis to apply the lambda function to is different (if I got it right...)

Ah, I see, mean("time") <-> mean() and np.apply_along_axis(min_periods_max, 0, x, min_periods=18) <-> np.apply_along_axis(min_periods_max, 1, x, min_periods=18)

What I like in

pyaerocom/pyaerocom/stats/mda8/mda8.py

Line 91 in 4c3bda6

mda8 = _daily_max(_rolling_average_8hr(data))

is that the 2 map-reduce operations are split and somehow documented by function names. Putting your code in the same module, as e.g. mda8_3d and rolling_average_8h_3d, would make the code clearer and would show that you tried to implement exactly the same mda8 as we do for collocated data.

Sure, this we can do, I will refactor the code.
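
For readers following the thread, the two steps discussed above (a trailing 8-hour rolling mean, then a daily maximum, each with a minimum number of valid values) can be sketched with plain NumPy on a (time, lat, lon) array. This is only an illustration under stated assumptions, not the pyaerocom implementation: the function names `rolling_mean_8h`, `daily_max` and `mda8_3d` are hypothetical, the thresholds 6 and 18 are taken from the diff excerpt above, days are assumed to be consecutive 24-hour blocks starting at the first time step, and the 1h resampling offset from the excerpt is left out.

```python
import numpy as np


def rolling_mean_8h(hourly, min_periods=6):
    """Trailing 8-hour mean along axis 0 (time).

    A time step is NaN when fewer than `min_periods` of the values
    in its window are finite.
    """
    hourly = np.asarray(hourly, dtype=float)
    out = np.full(hourly.shape, np.nan)
    for t in range(hourly.shape[0]):
        window = hourly[max(0, t - 7): t + 1]
        valid = np.isfinite(window)
        count = valid.sum(axis=0)
        total = np.where(valid, window, 0.0).sum(axis=0)
        # np.maximum avoids a division by zero where count == 0
        out[t] = np.where(count >= min_periods, total / np.maximum(count, 1), np.nan)
    return out


def daily_max(rolled, min_periods=18):
    """Maximum over consecutive 24-hour blocks along axis 0.

    A day is NaN when fewer than `min_periods` of its 24 values are finite.
    Trailing hours that do not fill a whole day are dropped.
    """
    n_days = rolled.shape[0] // 24
    blocks = rolled[: n_days * 24].reshape(n_days, 24, *rolled.shape[1:])
    valid = np.isfinite(blocks)
    count = valid.sum(axis=1)
    maxima = np.where(valid, blocks, -np.inf).max(axis=1)
    return np.where(count >= min_periods, maxima, np.nan)


def mda8_3d(hourly):
    """Daily maximum of the 8-hour rolling mean for a (time, lat, lon) array."""
    return daily_max(rolling_mean_8h(hourly))
```

Splitting the two reductions into named functions mirrors the `_daily_max(_rolling_average_8hr(data))` structure quoted above; only the axis handling differs for gridded data.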

charlienegri · 2026-01-15T18:55:56Z

> I don't have any better ideas, but I don't understand why the mda8 calculation of 3d data should take 40 hours for a year. Maybe you have a simple test case where we can check performance?
>
> Possible pitfalls might be that the dataset is computed several times. An hourly dataset will be too much for memory, while a daily dataset should work fine, so maybe try to compute before moving back to iris?
>
> Maybe try some dask parallelization? (Also easier to test with a simpler test case.)

40h is the total runtime for all models and 11 seasons, which is almost 3 years...
The conco3mda calculation for a single model and 11 seasons (33 months) takes a little under 2h, if I remember correctly from that test.
I think I have leveraged the lazy loading as much as possible, but maybe I am not seeing some easy optimization.
I can further test performance with a simple case.
But anyway, 1 season is doable, and a single model and 1 year might also be doable... the code is still ugly though.

charlienegri · 2026-01-15T19:28:21Z

heikoklein · 2026-01-16T08:21:33Z

> 40h is the total runtime for all models and 11 seasons, which is almost 3 years... The conco3mda calculation for a single model and 11 seasons (33 months) takes a little under 2h, if I remember correctly from that test. I think I have leveraged the lazy loading as much as possible, but maybe I am not seeing some easy optimization. I can further test performance with a simple case. But anyway, 1 season is doable, and a single model and 1 year might also be doable... the code is still ugly though.

From this I see that the maps for a normal component take ~20min, while mda8/conco3 takes ~115min. Considering 24x the amount of data which needs to be read, this looks pretty good. Or are all variables read hourly?

charlienegri · 2026-01-16T08:26:17Z

> > 40h is the total runtime for all models and 11 seasons, which is almost 3 years... The conco3mda calculation for a single model and 11 seasons (33 months) takes a little under 2h, if I remember correctly from that test. I think I have leveraged the lazy loading as much as possible, but maybe I am not seeing some easy optimization. I can further test performance with a simple case. But anyway, 1 season is doable, and a single model and 1 year might also be doable... the code is still ugly though.
>
> From this I see that the maps for a normal component take ~20min, while mda8/conco3 takes ~115min. Considering 24x the amount of data which needs to be read, this looks pretty good. Or are all variables read hourly?

All data is read hourly, so there is no extra reading time at all; it's all purely computation, alas.

heikoklein · 2026-01-16T08:48:44Z

> > From this I see that the maps for a normal component take ~20min, while mda8/conco3 takes ~115min. Considering 24x the amount of data which needs to be read, this looks pretty good. Or are all variables read hourly?
>
> All data is read hourly, so there is no extra reading time at all; it's all purely computation, alas.

If all data is read hourly, and all data is resampled to daily plots, then 6 times the time just for a rolling-average resampling sounds like a lot (pure gut feeling, no proof yet). Or is conco3 also plotted as other things (o3dailymax/o3dailymean)? Do you have an example input file?

charlienegri · 2026-01-16T08:59:26Z

> > > From this I see that the maps for a normal component take ~20min, while mda8/conco3 takes ~115min. Considering 24x the amount of data which needs to be read, this looks pretty good. Or are all variables read hourly?
> >
> > All data is read hourly, so there is no extra reading time at all; it's all purely computation, alas.
>
> If all data is read hourly, and all data is resampled to daily plots, then 6 times the time just for a rolling-average resampling sounds like a lot (pure gut feeling, no proof yet). Or is conco3 also plotted as other things (o3dailymax/o3dailymean)? Do you have an example input file?

The rolling average alone is quite fast, based on my tests; it's the rest that is slow...
One way to run a test is to use the CLI with type season and whatever time window you want, for example:

cams2_83 forecast season 2025-11-01 2025-11-30 --model-path /lustre/storeB/project/fou/kl/CAMS2_83/model --obs-path /lustre/storeB/project/fou/kl/CAMS2_83/obs --data-path /lustre/storeB/users/heikok/something/data --coldata-path /lustre/storeB/users/heikok/something/coldata --cache /lustre/storeB/users/heikok/something/_cache_test --name 'TEST' --id test-conco3mda-contours --description 'test'  -p 2 --onlymap --conco3mda8contours

(note that this will not produce an experiment output valid for aeroval)

charlienegri · 2026-01-16T13:59:40Z

I am testing the latest code with as much overlap as possible with the mda8 calculation already existing and it's even slower.... I think the reason is that the time dimension shift + filtering baked into _calc_mda8 are too expensive, even if they make sense also for the mapengine calculation...

for 1 season is ~ 9 min per model
I am not sure how much slower, may also be very little, I'll do some more testing and then refactor again if significant

heikoklein

Nice improvements in readability.

pyaerocom/stats/mda8/mda8.py

heikoklein · 2026-01-16T17:39:05Z

pyaerocom/aeroval/modelmaps_engine.py

    VariableDefinitionError,
    VarNotAvailableError,
 )
+from pyaerocom.stats.mda8.mda8 import _calc_mda8


In theory, functions starting with _ should be considered private and should not be imported (except in tests). Consider renaming _calc_mda8 to calc_mda8 to make it officially public.

charlienegri · 2026-01-18T17:01:28Z

timing with latest code for 11 seasons for 1 model is ~ 1.5h

heikoklein · 2026-01-19T07:46:56Z

> timing with latest code for 11 seasons for 1 model is ~ 1.5h

This is better than the 115min you had before, but the concno2 numbers are also reduced from 9min to 6min. Unless you have made considerable code changes (the last changes were more about readability), I would rather say the performance improvement comes from a faster node?

charlienegri · 2026-01-19T07:56:30Z

> > timing with latest code for 11 seasons for 1 model is ~ 1.5h
>
> This is better than the 115min you had before, but the concno2 numbers are also reduced from 9min to 6min. Unless you have made considerable code changes (the last changes were more about readability), I would rather say the performance improvement comes from a faster node?

Yes indeed, so far we have just moved things around; performance has not been improved (it was actually made worse before 164e8f3).

charlienegri · 2026-01-21T11:38:40Z

https://aeroval-test.met.no/charlien/pages/maps/?project=cams2-83&experiment=forecast-SON2025&parameter=conco3mda8

note for cams2_83: menu.json is now written compatibly with a non-only-map experiment.. it will need to be rsynced from the only-map experiments too on the top of the contour folder

charlienegri · 2026-01-22T09:27:31Z

@heikoklein if no objections I will merge this, we can consider performance improvement if there is a need at some point..

charlienegri added 3 commits December 18, 2025 14:35

this approach bets on a conversion to xarray, all happens in the modelmaps engine

7a10825

handle vert_code in the case of conco3mda8 when it's not in the obs vars list, i.e. has been computed in the map engine, fix obs_name in the onlymap case for cams283 where first_with_mod_name[0] gives the key name which is EEA while we want EEA-UTD

d892814

fix wrong key

3e1f8b9

charlienegri added 2 commits December 19, 2025 09:35

make the conco3mda8 contours calculation happening selectively

16870f6

fix tests

34253e4

This was referenced Dec 21, 2025

O3mda8 model maps #1770

Closed

Fix 1770 #1773

Closed

charlienegri added 4 commits December 22, 2025 16:49

Merge branch 'main-dev' into fix_1770_approach_2

1f31d36

drop all nans in conco3mda8 computation, add tests

143dcef

remove duplicate code

1a3de9e

calculate conco3mda8 in process_countour_map only if GriddedData

c1e1139

charlienegri self-assigned this Jan 5, 2026

charlienegri added enhancement ✨ New feature or request CAMS2_83 Issues related to the CAMS2_83 contract labels Jan 5, 2026

the dropna is extremely slow and also memory expensive

4c3bda6

charlienegri added this to the m2026-02 milestone Jan 8, 2026

charlienegri changed the title ~~Fix 1770 approach 2~~ Fix 1770 selectively for countour maps Jan 8, 2026

charlienegri requested a review from dulte January 8, 2026 11:55

heikoklein changed the title ~~Fix 1770 selectively for countour maps~~ Fix 1770, O3mda8 model maps, selectively for countour maps Jan 15, 2026

heikoklein reviewed Jan 15, 2026

fix ts_type in conco3mda8 GriddedData object

9779ceb

charlienegri added 3 commits January 16, 2026 14:12

even more recycling of the mda8 calculation tooling used for the colocated data object

7de6f7a

correct check is done on dimension

a816a68

coherence with mda8 in colocated data should be now already ensured at this point

b80fa11

time shift and filtering is too expensive for the mda8 calculation in the case of griddeddata, skip it

164e8f3

heikoklein approved these changes Jan 16, 2026

charlienegri and others added 2 commits January 17, 2026 15:50

accepts review proposed changes

a648bcc

Co-authored-by: Heiko Klein <Heiko.Klein@met.no>

addressing review, making function public

861055e

charlienegri added 9 commits January 19, 2026 11:00

add modelmaps_opts for the frontend (Augustin's request)

3e2293f

fix extra_map_vars option

4e9b3c2

Revert "fix extra_map_vars option"

2bd51d1

This reverts commit 4e9b3c2.

fix extra_map_vars option

20cb912

try to add conco3mda to the menu.json in the case of only_model_map=True

bf0bd35

try to add conco3mda to the menu.json in the case of only_model_map=True

2682019

simplify the creation of the menu.json entry for conco3mda8 for the case only_model_map=True and compute_conco3mda8_contours=True

ea7b90a

fix typo in variable name, avoid modifying original dictionary entry for conco3 by copying it, fix bug in the case uri.meta['obsvar'] == 'conco3mda8': continue instead of break, add test

3f025cc

make sure the menu entry for conco3mda8 in the only_model_map=True case is valid, i.e. is such that all the tabs work in aeroval

603ef16

charlienegri marked this pull request as ready for review January 21, 2026 11:39

charlienegri merged commit f492a4e into main-dev Jan 22, 2026
8 checks passed
