diff --git a/paper/paper.md b/paper/paper.md index b9804a7..4a57d92 100644 --- a/paper/paper.md +++ b/paper/paper.md @@ -14,7 +14,7 @@ authors: affiliations: - name: NORCE Research AS, Bergen, Norway index: 1 -date: 28 January 2026 +date: 11 May 2026 bibliography: paper.bib --- @@ -26,17 +26,20 @@ Reservoir simulations help the energy industry make better decisions by predicti # Statement of need -The first step in reservoir simulations is to chose a simulation model, which serves as the computational representation of a geological model, incorporating properties such as heterogeinity, physics, fluid properties, boundary conditions, and wells. Once the spatial model is designed, it is discretized into cells containing average properties of the continuous reservoir model. All this information is then comunicated to the simulator, which internally solves conservation equations (mass, momentum, and energy) and constitutive equations (e.g., saturation functions, well models) to perform the predictions. OPM Flow is an open-source simulator for subsurface applications such as hydrocarbon recovery, CO$_2$ storage, and H$_2$ storage [@Rassmussen:2021]. The input files of OPM Flow follows the standar-industry Eclipse format. The different functionality is defined using keywords in a main input deck with extension .DATA and additional information is usually added in additional .INC files such as tables (saturation functions, PVT) and the grid discretization. We refer to the OPM Flow manual [OPM Flow manual](https://opm-project.org/?page_id=955) for an introduction to OPM Flow and all suported keywords. +The first step in reservoir simulations is to choose a simulation model, which serves as the computational representation of a geological model, incorporating properties such as heterogeneity, physics, fluid properties, boundary conditions, and wells. 
Once the spatial model is designed, it is discretized into cells containing average properties of the continuous reservoir model. All this information is then communicated to the simulator, which internally solves conservation equations (mass, momentum, and energy) and constitutive equations (e.g., saturation functions, well models) to perform the predictions. OPM Flow is an open-source simulator for subsurface applications such as hydrocarbon recovery, CO$_2$ storage, and H$_2$ storage [@Rassmussen:2021]. The input files of OPM Flow follow the industry-standard Eclipse format. The different functionalities are defined using keywords in a main input deck with extension .DATA, and further information, such as tables (saturation functions, PVT) and the grid discretization, is usually placed in separate .INC files. We refer to the [OPM Flow manual](https://opm-project.org/?page_id=955) for an introduction to OPM Flow and all supported keywords. -Simulation models can be substantial, typically encompassing millions of cells, and can be quite complex due to the number of wells and faults, defined by cell indices in the x, y, and z direction (i,j,k nomenclature). While manually modifying small input decks is feasible, it becomes impractical for large models. In addition, these models commonly rely on corner‑point grids defined through pillars and horizons, and they may include further geometric modifications specified through deck keywords. Such representations are not intuitive to manipulate, particularly for users who are not familiar with the internal structure of simulation decks. +Simulation models can be substantial, typically encompassing millions of cells, and can be quite complex due to the number of wells and faults, defined by cell indices in the x, y, and z directions (i, j, and k nomenclature). While manually modifying small input decks is feasible, it becomes impractical for large models. 
In addition, these models commonly rely on corner‑point grids defined through pillars and horizons, and they may include further geometric modifications specified through deck keywords. Such representations are not intuitive to manipulate, particularly for users who are not familiar with the internal structure of simulation decks. -These challenges inspired the development of `pycopm`, a user-friendly Python tool designed to taylor geological models from provided input decks. `pycopm` is intended for researchers, engineers, and students who need to apply model transformations such as coarsening, refinement, submodel extraction, and structural adjustments. The coarsening and refinement capabilities are especially relevant in current workflows, since multi‑fidelity modeling has become an active and widely adopted research direction that requires the flexibility to generate models with different levels of complexity. +These challenges inspired the development of `pycopm`, a user-friendly Python tool designed to tailor geological models from provided input decks. `pycopm` is intended for researchers, engineers, and students who need to apply model transformations such as coarsening, refinement, submodel extraction, and structural adjustments. The coarsening and refinement capabilities are especially relevant in current workflows, since multi‑fidelity modeling has become an active and widely adopted research direction that requires the flexibility to generate models with different levels of complexity. # State of the field -Two key properties in a reservoir model are its storage capacity, measured by pore volume, and the ability of fluids to flow between cells, known as transmissibilities. Therefore, these properties must be properly handled when generating a new model. While grid refinements and transformations do not pose a significant issue, submodels and grid coarsening present challenges due to lack of unique methods for adressing these properties. 
In other words, the approach depends on the specific model, and while there are a few methods in literature, this remains an active are of research. +Two key properties in a reservoir model are its storage capacity, measured by pore volume, and the ability of fluids to flow between cells, known as transmissibilities. Therefore, these properties must be properly handled when generating a new model. While grid refinements and transformations do not pose a significant issue, submodels and grid coarsening present challenges due to the lack of unique methods for addressing these properties. In other words, the approach depends on the specific model, and while there are a few methods in the literature, this remains an active area of research. + + +Two of the most widely used commercial software suites in reservoir modeling are [Petrel](https://www.slb.com/products-and-services/delivering-digital-at-scale/software/petrel-subsurface-software) and [Aspen RMS](https://www.aspentech.com/en/products/sse/aspen-rms). Petrel provides an integrated, end-to-end environment that spans workflows from seismic interpretation to reservoir simulation, enabling users to construct simulation-ready models with customized gridding strategies. Aspen RMS, while also capable of handling 3D and 4D seismic data and integrating these into reservoir models, is more strongly focused on geological modeling and reservoir characterization. Despite their extensive capabilities, both platforms are highly complex and demand significant user expertise, often involving steep learning curves. Additionally, their use is constrained by proprietary licensing models, which can limit accessibility and reproducibility, particularly in academic and open research contexts. 
[opm-upscaling](https://github.com/OPM/opm-upscaling) is part of the [OPM initiative](https://opm-project.org/?page_id=23), and provides a set of C++ tools focused on single-phase and steady-state upscaling of capillary pressure and relative permeability. However, it does not include functionality for grid refinement or affine transformations, and its upscaling routines operate mainly on the grid structure. As a result, users must manually adjust the remaining components of the deck to match the new i, j, and k indices. Examples include updating the locations of wells, numerical aquifers, and other model elements. @@ -49,10 +52,21 @@ To the author's knowledge, prior to the development of `pycopm` there was no int # Software design -`pycopm` leverages well-stablished and excellent Python libraries. The Python package numpy [@2020NumPy-Array] forms the basis for performing arrays operations. The pandas package [@the_pandas_development_team], is used for handling cell clusters, specifically employing the methods in pandas.Series.groupby. The Shapely package [@Gillies_Shapely_2025], particularly the contains_xy, is fundamental for submodel implementation to locate grid cells within a given polygon. To parse the output binary files of OPM Flow, the [opm](https://pypi.org/project/opm/) Python libraries is utilized. The primary methods developed in `pycopm` include handling of corner-point grids, upscaling transmissibilities in complex models with faults (non-neighbouring connections) and inactive cells, projecting pore volumes on submodel boundaries, interpolating to extend definition of i,j,k dependent properties (e.g., wells, faults) in grid refinement, and parsing and writing input decks. +`pycopm` leverages well-established and excellent Python libraries. The Python package numpy [@2020NumPy-Array] forms the basis for performing array operations. 
The pandas package [@the_pandas_development_team] is used for handling cell clusters, specifically employing the methods in pandas.Series.groupby. The Shapely package [@Gillies_Shapely_2025], particularly its contains_xy function, is fundamental to the submodel implementation, where it is used to locate grid cells within a given polygon. To parse the output binary files of OPM Flow, the [opm](https://pypi.org/project/opm/) Python package is used. The primary methods developed in `pycopm` include handling of corner-point grids, upscaling transmissibilities in complex models with faults (non-neighboring connections) and inactive cells, projecting pore volumes on submodel boundaries, interpolating to extend the definition of i-, j-, and k-dependent properties (e.g., wells, faults) in grid refinement, and parsing and writing input decks. + + +While graphical user interfaces (GUIs) are generally more intuitive for beginners, command-line interfaces (CLIs) offer advantages in speed and efficiency, enable powerful automation and scripting, provide fine-grained control, require minimal computational resources, and integrate naturally with modern AI-driven workflows. Therefore, interaction with the tool is performed through a terminal executable named `pycopm`, which provides a set of command-line flags (27 at the time of writing; see the online documentation for the [current list](https://cssr-tools.github.io/pycopm/introduction.html#overview)). These flags control the desired functionality, such as specifying the input deck, defining how the model should be modified, and selecting the output file name. This design enables users to chain multiple operations by further editing the generated decks. For example, a user may refine a model first and subsequently extract a submodel. An illustrative example is provided in [test_4_submodel.py](https://github.com/cssr-tools/pycopm/blob/main/tests/test_4_submodel.py) in the project repository. 
Advanced users who are familiar with Python can access the underlying functionality directly through Python scripts. This provides greater flexibility for integrating the tool into more sophisticated workflows and for customizing model transformations to meet specific research or engineering needs. + +Internally, `pycopm` is structured as a modular Python package with clearly defined subpackages, each responsible for a specific aspect of the workflow. These include: + +* input parsing and deck processing, +* grid construction and transformations, +* property mapping and upscaling, +* file generation and output writing, and +* execution and workflow control. +This separation of concerns enhances maintainability and extensibility, allowing developers to modify or extend individual components without affecting the overall system. The primary design goal of `pycopm` is to provide flexibility for handling and modifying reservoir models, rather than focusing on support for multiple data deck formats or maximizing raw computational performance. This flexibility is enabled by its Python-based implementation and modular architecture. -Interaction with the tool is performed through a terminal executable named `pycopm`, which provides a set of command-line flags (27 at the time of writing; see the online documentation for the [current list](https://cssr-tools.github.io/pycopm/introduction.html#overview)). These flags control the desired functionality, such as specifying the input deck, defining how the model should be modified, and selecting the output file name. This design enables users to chain multiple operations by further editing the generated decks. For example, a user may refine a model first and subsequently extract a submodel. An illustrative example is provided in [test_4_submodel.py](https://github.com/cssr-tools/pycopm/blob/main/tests/test_4_submodel.py) in the project repository. 
Advanced users who are familiar with Python can access the underlying functionality directly through Python scripts. This provides greater flexibility for integrating the tool into more sophisticated workflows and for customizing model transformations to meet specific research or engineering needs. # Research impact statement @@ -65,9 +79,9 @@ The software has supported several research publications, including: * @Sandve2025, where submodel extraction and coarsening were applied to the Troll aquifer model to analyze pressure interference, and * @landamarbán2025, which used coarsening of the Troll aquifer model to optimize well placement in CO$_2$ storage simulations. -The `pycopm` is part of the software suite developed within the [Centre for Sustainable Subsurface Resources](https://cssr.no) and maintained under the [cssr-tools](https://github.com/cssr-tools) GitHub organization. A key objective of these tools is to support research outputs that adhere to the FAIR principles (Findable, Accessible, Interoperable, Reusable) originally formalized in @Wilkinson2016. These principles have not been consistently implemented in subsurface research in recent years [@liu2025], limiting the long-term impact and reproducibility of published results. To address this, significant effort has been dedicated to building comprehensive online documentation that enables users to reproduce figures, tables, and computational workflows from recent publications. For example, the [TCCS-13](https://cssr-tools.github.io/expreccs/tccs-13.html#) documentation includes step‑by‑step terminal commands required to generate the results presented in [@landamarbán2025]. This ensures that published work is not only transparent but also directly reusable by other researchers, enhancing scientific rigor and accelerating future developments. 
+`pycopm` is part of the software suite developed within the [Centre for Sustainable Subsurface Resources](https://cssr.no) and maintained under the [cssr-tools](https://github.com/cssr-tools) GitHub organization. A key objective of these tools is to support research outputs that adhere to the FAIR principles (Findable, Accessible, Interoperable, Reusable) originally formalized in @Wilkinson2016. These principles have not been consistently implemented in subsurface research in recent years [@liu2025], limiting the long-term impact and reproducibility of published results. To address this, significant effort has been dedicated to building comprehensive online documentation that enables users to reproduce figures, tables, and computational workflows from recent publications. For example, the [TCCS-13](https://cssr-tools.github.io/expreccs/tccs-13.html#) documentation includes step‑by‑step terminal commands required to generate the results presented in @landamarbán2025. This ensures that published work is not only transparent but also directly reusable by other researchers, enhancing scientific rigor and accelerating future developments. -Looking ahead to increase the research impact, the plan for `pycopm`'s future development includes extending its functionality to support additional keywords from input decks beyond those in geological models, which `pycopm` has been sucessfully tested on ([Drogon](https://github.com/OPM/opm-tests/tree/master/drogon), [Norne](https://github.com/OPM/opm-tests/tree/master/norne), [Smeaheia](https://co2datashare.org/dataset/smeaheia-dataset), [SPE10](https://github.com/OPM/opm-data/tree/master/spe10model2), [Troll aquifer model](https://arxiv.org/abs/2508.08670)). This support will be added as `pycopm` is applied in further models, and external contributions to the tool are welcomed. 
Additionally, extending `pycopm`'s capabilities includes implementing a feature to generate a single input deck by combining geological models from different input decks. +Looking ahead to increase the research impact, the plan for `pycopm`'s future development includes extending its functionality to support additional keywords from input decks beyond those in geological models, on which `pycopm` has been successfully tested ([Drogon](https://github.com/OPM/opm-tests/tree/master/drogon), [Norne](https://github.com/OPM/opm-tests/tree/master/norne), [Smeaheia](https://co2datashare.org/dataset/smeaheia-dataset), [SPE10](https://github.com/OPM/opm-data/tree/master/spe10model2), [Troll aquifer model](https://arxiv.org/abs/2508.08670)). This support will be added as `pycopm` is applied in further models, and external contributions to the tool are welcomed. Additionally, extending `pycopm`'s capabilities includes implementing a feature to generate a single input deck by combining geological models from different input decks. # AI usage disclosure