stac2cube converts SpatioTemporal Asset Catalogs (STAC) into Analysis-Ready Data (ARD) Cubes for efficient Earth Observation (EO) processing.
ARD cubes for Sentinel-2 are built from three major components:
- Cloud masks are generated from user-defined thresholds on s2cloudless cloud probabilities, so the user can choose the strength of the cloud detection and export multiple cloud-masked data cubes. The traditional, less flexible (but faster) methods remain available: filtering by max_cc from the STAC metadata and masking with the Scene Classification Layer.
- Scenes are spatially co-registered with AROSICS, which resolves global X/Y shifts of 1-2 pixels between consecutive Sentinel-2 scenes. Sub-pixel shifts (below 10 m) can still occur.
- The RGBN bands are super-resolved to 2.5 m with SEN2SR.
The result is a final data cube that is cloud-masked with a customized threshold, pixel-aligned, and represented at a higher spatial resolution.
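The threshold-based cloud masking described above can be sketched with plain NumPy. This is a minimal illustration, not the stac2cube implementation: the function names `masks_from_probability` and `apply_mask` are hypothetical, and the cloud probabilities are assumed to be floats scaled from 0 to 1.

```python
import numpy as np


def masks_from_probability(cloud_prob, thresholds):
    """Return {threshold: boolean mask}; True marks pixels treated as cloud."""
    return {t: cloud_prob >= t for t in thresholds}


def apply_mask(cube, mask):
    """Set cloudy pixels to NaN in a (band, y, x) float cube."""
    # The (y, x) mask broadcasts across the leading band axis.
    return np.where(mask, np.nan, cube)


# Toy 2x2 probability map and a two-band reflectance cube (assumed values).
cloud_prob = np.array([[0.05, 0.35], [0.55, 0.90]])
reflectance = np.ones((2, 2, 2), dtype=float)

masks = masks_from_probability(cloud_prob, thresholds=(0.3, 0.5))
strict = apply_mask(reflectance, masks[0.3])   # lower threshold masks more pixels
lenient = apply_mask(reflectance, masks[0.5])  # higher threshold masks fewer pixels
```

A lower threshold flags more pixels as cloud, so several masked cubes with different strictness can be derived from one probability map.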
This tool is designed to run both on a local machine and on HPC systems using SLURM jobs.
- Feature Overview
- Installation
- Examples
- How to run on HPC (terrabyte users only)
- What's upcoming?
- Access and Licensing Details for STAC Catalogs
- How to cite
Below is an example of 2 animations showing before and after ARD cube generation.
- stac2cube.get_stac_layers
  - Collects images from STAC catalogs for the selected mission based on user-defined parameters.
  - Automatically preprocesses spectral/radar values according to the specifications of the selected mission.
  - Generates multi-dimensional data cubes suitable for time-series analysis.
  - The data cubes can be updated at any time without regenerating them from scratch.
  - Available missions: Sentinel-2 L2A, Sentinel-2 L1C, Sentinel-1 RTC, Landsat C2 L2, COP DEM Glo-30 (single time)
- stac2cube.get_cloud_layers
  - Collects Sentinel-2 L1C images and automatically applies the s2cloudless cloud probability algorithm to the data cube structure.
  - The result contains cloud probability maps and user-defined binary cloud mask layers.
  - When selected, clouds are automatically masked out of the generated data cube.
  - Can be updated at any time.
- stac2cube.coregister_cube
  - Applies a co-registration algorithm to Sentinel-2 data cubes.
  - The co-registration algorithm is provided by the AROSICS package: Scheffler, D. (2017, July 3). AROSICS: An Automated and Robust Open-Source Image Co-Registration Software for Multi-Sensor Satellite Data (Version 0.12.1). Zenodo. https://doi.org/10.5281/zenodo.3742909
  - Fixes the global X/Y shift between consecutive Sentinel-2 items.
- stac2cube.super_resolve_cube
  - Applies a super-resolution algorithm to Sentinel-2 data cubes.
  - The DNN-based super-resolution algorithm is provided by the SEN2SR package: Aybar, C., Contreras, J., Donike, S., Portalés-Julià, E., Mateo-García, G., & Gómez-Chova, L. (2026). A radiometrically and spatially consistent super-resolution framework for Sentinel-2. Remote Sensing of Environment, 334, 115222. https://doi.org/10.1016/j.rse.2025.115222
  - Currently super-resolves the 10 m RGBN bands to 2.5 m (the 20 m bands will soon be super-resolved to 2.5 m as well).
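To illustrate what a global X/Y shift between two scenes means, the sketch below estimates an integer pixel shift with phase correlation in NumPy. It is a simplified stand-in and not the AROSICS algorithm or API: the function name `estimate_global_shift` is hypothetical, and the method assumes both arrays have the same shape and differ only by a circular integer shift.

```python
import numpy as np


def estimate_global_shift(reference, target):
    """Estimate the integer (dy, dx) roll that aligns `target` with `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(target))
    # Keep only the phase; the small floor avoids division by zero.
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= s // 2 else int(p) - s
        for p, s in zip(peak, correlation.shape)
    )


rng = np.random.default_rng(0)
reference = rng.random((64, 64))
# Simulate a misaligned scene: 2 pixels down, 1 pixel to the left.
target = np.roll(reference, shift=(2, -1), axis=(0, 1))

dy, dx = estimate_global_shift(reference, target)
aligned = np.roll(target, shift=(dy, dx), axis=(0, 1))
```

Real scenes differ in content, illumination, and cloud cover, which is why a dedicated package such as AROSICS is used rather than a bare correlation like this one.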
Installation is possible with a package manager such as Micromamba or Anaconda.
$ cd "path/to/stac2cube/"
$ micromamba env create -n stac2cube2 -f environment.yml
$ micromamba env create -n stac2cube -f environment.yml; micromamba install -n stac2cube -c conda-forge vs2015_runtime
Jupyter notebooks showing how to use the stac2cube features and how to process the data cube structure can be found in the interactive folder.
Documentation on how to run the stac2cube features on terrabyte's HPC, for compute-intensive processes and faster processing, can be found in the slurm folder.
- Sentinel-2 co-registration
- Merge Landsat TM and OLI missions for terrabyte catalogues.
- Batch processing tools for all the steps (under development).
- Caching mechanism to automatically update the missing scenes: get_stac_layers, get_cloud_layers
- More advanced interactive tools for better experience
- Sentinel-1: Orbit-state selection: Ascending/Descending
- Sentinel-1: Automatic preprocessing, e.g. with SNAP tools, or replacement with NRB
- Add new spectral indices: EVI, Built-up Index (More upon request!)
- Add SLURM job array to submit multiple JSON files at once (good for enormous areas; "divide and conquer")
- Silent parameter for get_stac_layers that will automatically switch to terrabyte STAC catalogs when run on HPC
- Import bbox list with projected coords: proj to geographic transformation (Under development)
- Improvements for get_cloud_layers function: mask calculation function, mask l2a data directly
- Cloud shadow detection and masking for Sentinel-2
- Native cloud masking for Landsat and Sentinel-2 scenes (Under development)
- Switch python package setup from setup.py to pyproject.toml: enables uv install besides pip
- Quiet mode
- Verbose mode
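The "divide and conquer" idea behind the planned SLURM job array can be sketched as splitting one large bounding box into tiles, with one JSON payload per tile. This is only an illustration under assumptions: `split_bbox` and the `bbox` key are hypothetical and do not reflect the actual stac2cube JSON schema.

```python
import json


def split_bbox(bbox, n_cols, n_rows):
    """Split (min_x, min_y, max_x, max_y) into n_cols * n_rows equal tiles."""
    min_x, min_y, max_x, max_y = bbox
    dx = (max_x - min_x) / n_cols
    dy = (max_y - min_y) / n_rows
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            tiles.append((
                min_x + col * dx,
                min_y + row * dy,
                min_x + (col + 1) * dx,
                min_y + (row + 1) * dy,
            ))
    return tiles


# Example area in geographic coordinates (assumed values).
tiles = split_bbox((9.0, 49.0, 11.0, 50.0), n_cols=2, n_rows=1)

# One JSON payload per tile; each could become one task of a job array.
payloads = [json.dumps({"bbox": list(tile)}) for tile in tiles]
```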
- Important: terrabyte STAC catalogs can only be used when working in a terrabyte environment.
- However, the stac2cube package is designed to work both on a local machine without a terrabyte connection and within the terrabyte HPC environment.
- Therefore, a silent parameter will enable the terrabyte STAC catalogs when a SLURM job is active.
- The default setup (terrabyte disabled) uses STAC catalogs that provide "open-access data" (not open-source).
- Note that the stac2cube package therefore cannot guarantee unlimited access to these open-access data catalogs in the future!
| Provider | Service | STAC API | License | Open-Access | Open-Source |
|---|---|---|---|---|---|
| DLR | terrabyte | https://stac.terrabyte.lrz.de/public/api/ | MIT License Copyright (c) 2024 Deutsches Zentrum für Luft- und Raumfahrt e.V. | No | No |
| Element 84 | Earth Search | https://earth-search.aws.element84.com/v1/ | Apache License 2.0 | Yes | Yes |
| Microsoft | Planetary Computer | https://planetarycomputer.microsoft.com/api/stac/v1 | MIT License Copyright (c) Microsoft Corporation. | Yes | No |
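The catalog switch described above could look roughly like the following sketch. It is not the stac2cube implementation: the function names are hypothetical, the endpoints are copied from the table, and the detection relies on the `SLURM_JOB_ID` environment variable that SLURM sets inside a job. The choice of Earth Search as the default is also an assumption.

```python
import os

# STAC API endpoints as listed in the provider table.
STAC_APIS = {
    "terrabyte": "https://stac.terrabyte.lrz.de/public/api/",
    "earth-search": "https://earth-search.aws.element84.com/v1/",
    "planetary-computer": "https://planetarycomputer.microsoft.com/api/stac/v1",
}


def running_inside_slurm(environ=None):
    """Return True if the process appears to run inside a SLURM job."""
    env = os.environ if environ is None else environ
    return "SLURM_JOB_ID" in env


def select_stac_api(environ=None, default="earth-search"):
    """Pick the terrabyte catalog inside a SLURM job, else an open-access one."""
    key = "terrabyte" if running_inside_slurm(environ) else default
    return key, STAC_APIS[key]


name, url = select_stac_api()
print(f"Using {name}: {url}")
```

Note that this check cannot tell a terrabyte SLURM job from a job on any other cluster, so a real implementation would need an additional test for the terrabyte environment.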
Why do terrabyte users collect data from the terrabyte STAC catalog instead of the open-source Earth Search?
- The data provided by Element 84 is stored in AWS S3 services.
- The data provided by DLR is stored on the servers of the Leibniz Supercomputing Centre (LRZ) in Garching/Munich.
- When working in a terrabyte environment, the data query is answered from the same servers instead of connecting to AWS.
- daterange: ["2017-01-01", "2025-03-28"]
- polygon: Nord Hubland/Würzburg/Germany
| Service | Returned Scenes | Processing Time (s) |
|---|---|---|
| terrabyte | 1134 | 24.0 |
| Earth Search | 1038 | 140.5 |
| Planetary Computer | 1133 | 12.2 |
- In this test*, the terrabyte catalog answered much faster than Earth Search (24.0 s vs. 140.5 s), which supports querying terrabyte when working in a terrabyte environment; Planetary Computer was the fastest (12.2 s).
- Most importantly, the results indicate that the Earth Search archive is missing some scenes (1038 returned vs. 1134 from terrabyte).
- In addition, the Earth Search STAC definitions are sometimes faulty (especially for Sentinel-2 L1C), so as the developer of this package, I prefer working with the terrabyte API.
* Each query was repeated 10 times per service, and the average time per run was calculated (timeit module).
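The timing procedure in the footnote can be reproduced along these lines with the standard `timeit` module. This is a sketch only: `average_runtime` is a hypothetical helper, and `fake_query` is a placeholder for a real STAC search against one of the services.

```python
import timeit


def average_runtime(query, repeats=10):
    """Run `query` `repeats` times and return the mean wall time per run in seconds."""
    total = timeit.timeit(query, number=repeats)
    return total / repeats


def fake_query():
    """Placeholder workload; replace with an actual STAC search call."""
    return sum(range(1000))


avg = average_runtime(fake_query, repeats=10)
print(f"{avg:.6f} s per run")
```

Averaging over several runs smooths out network jitter, although the first run may still be slower than the rest if connections or responses are cached afterwards.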
to be announced
Contact: https://www.geographie.uni-wuerzburg.de/en/earthobservation/staff/baturalp-arisoy/


