A comprehensive statistics and utility library for atmospheric sciences, optimized for the Pangeo ecosystem and fully Aero Protocol compliant.
Architect scientific pipelines that balance four competing goals:
- Speed: Aggressive vectorization (NumPy/Xarray) and lazy evaluation (Dask).
- Maintainability: Strictly typed code with NumPy-style docstrings.
- Provenance: Automatically track data lineage (what happened to the data) via attrs['history'].
- Visualization: A hybrid approach (Matplotlib for papers, hvPlot for interaction).
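
The provenance goal above can be illustrated with a minimal sketch. It is not the library's implementation: `append_history` is a hypothetical helper, the timestamp format is an assumption, and a plain `dict` stands in for the `attrs` mapping that an Xarray object carries.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def append_history(
    attrs: Dict[str, Any], message: str, now: Optional[datetime] = None
) -> Dict[str, Any]:
    """Return a copy of ``attrs`` with one timestamped line added to 'history'.

    Hypothetical helper for illustration; ``now`` is assumed to be in UTC.
    """
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = f"{stamp}: {message}"
    previous = attrs.get("history", "")
    updated = dict(attrs)  # copy so the caller's dict is not mutated
    # Append rather than overwrite, so earlier steps stay visible.
    updated["history"] = f"{previous}\n{entry}" if previous else entry
    return updated


attrs = {"units": "ppb", "history": "2024-01-01T00:00:00Z: opened obs.nc"}
new_attrs = append_history(attrs, "computed mean bias over 'time'")
print(new_attrs["history"])
```

Returning a copy instead of mutating in place keeps the input's lineage intact, which matters when the same input feeds several transformations.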
pip install monet-stats

Optional dependencies for the Pangeo stack:
pip install monet-stats[dask,docs,test]

import xarray as xr
import numpy as np
from monet_stats.error_metrics import MB, RMSE
# Assume all data > RAM. Use dask chunks immediately.
obs = xr.open_dataset('obs.nc', chunks={'time': 100})['variable']
mod = xr.open_dataset('mod.nc', chunks={'time': 100})['variable']
# Compute Mean Bias map over the time dimension
bias = MB(obs, mod, axis='time')
# Automatic provenance tracking
print(bias.attrs['history'])

from monet_stats.contingency_metrics import HSS, ETS
# Evaluate Heidke Skill Score at a specific threshold
skill = HSS(obs, mod, minval=50.0)

Full API documentation and tutorials are available at: https://noaa-oar-arl.github.io/monet-stats
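
For readers unfamiliar with threshold-based skill scores, the following sketch shows the textbook 2x2 contingency-table definitions of the Heidke Skill Score and the Equitable Threat Score. It is pure Python and is not the monet-stats implementation: the function names are hypothetical, and it assumes an event means "value >= threshold", which may differ from how the library interprets `minval`.

```python
from typing import Sequence, Tuple


def contingency_table(
    obs: Sequence[float], mod: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Count (hits, false_alarms, misses, correct_negatives).

    Assumption: an event occurs when a value is >= threshold.
    """
    hits = false_alarms = misses = correct_negatives = 0
    for o, m in zip(obs, mod):
        obs_event, mod_event = o >= threshold, m >= threshold
        if mod_event and obs_event:
            hits += 1
        elif mod_event:
            false_alarms += 1
        elif obs_event:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def heidke_skill_score(a: int, b: int, c: int, d: int) -> float:
    """HSS = 2(ad - bc) / [(a + c)(c + d) + (a + b)(b + d)]."""
    denom = (a + c) * (c + d) + (a + b) * (b + d)
    return 2.0 * (a * d - b * c) / denom if denom else float("nan")


def equitable_threat_score(a: int, b: int, c: int, d: int) -> float:
    """ETS = (a - a_r) / (a + b + c - a_r), with a_r = (a + b)(a + c) / n."""
    n = a + b + c + d
    if n == 0:
        return float("nan")
    a_random = (a + b) * (a + c) / n  # hits expected by chance
    denom = a + b + c - a_random
    return (a - a_random) / denom if denom else float("nan")


# Made-up sample values, for illustration only.
obs_values = [60.0, 40.0, 55.0, 30.0, 70.0, 20.0]
mod_values = [65.0, 52.0, 45.0, 25.0, 80.0, 10.0]
table = contingency_table(obs_values, mod_values, threshold=50.0)
hss = heidke_skill_score(*table)
ets = equitable_threat_score(*table)
print(table, round(hss, 4), round(ets, 4))
```

Both scores are 1 for a perfect forecast and 0 for a forecast no better than chance; for the same table they are related by ETS = HSS / (2 - HSS).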
This project uses a comprehensive CI/CD pipeline with the following quality checks:
- Testing: Tests run on Python 3.8 through 3.12 with at least 60% coverage
- Code Formatting: Black and Ruff formatting enforcement
- Linting: Ruff and pycodestyle linting
- Type Checking: mypy static type analysis
- Provenance: Mandatory history attribute updates on all transformations
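
A provenance rule like the last check can be enforced in a test suite. The sketch below is one possible shape for such a test, not the project's actual CI code: `history_was_updated`, `double_values`, `silent_double`, and the dict-based `Record` are hypothetical stand-ins for real transformations on Xarray objects.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a data object: {"values": [...], "attrs": {...}}
Record = Dict[str, Any]


def history_was_updated(
    transform: Callable[[Record], Record], record: Record
) -> bool:
    """True if ``transform`` extended 'history' and kept the earlier text."""
    before = record["attrs"].get("history", "")
    after = transform(record)["attrs"].get("history", "")
    return len(after) > len(before) and after.startswith(before)


def double_values(record: Record) -> Record:
    """Compliant example: records what it did in 'history'."""
    attrs = dict(record["attrs"])
    previous = attrs.get("history", "")
    note = "double_values: multiplied values by 2"
    attrs["history"] = f"{previous}\n{note}" if previous else note
    return {"values": [2 * v for v in record["values"]], "attrs": attrs}


def silent_double(record: Record) -> Record:
    """Non-compliant example: changes the data but leaves 'history' as is."""
    return {
        "values": [2 * v for v in record["values"]],
        "attrs": dict(record["attrs"]),
    }


sample = {"values": [1.0, 2.0], "attrs": {"history": "opened obs.nc"}}
print(history_was_updated(double_values, sample))  # True
print(history_was_updated(silent_double, sample))  # False
```

Checking that the new history starts with the old one also catches transformations that overwrite earlier entries instead of appending to them.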