Introduction
LORIS Python is a codebase whose code lives in a single package made up of the lib and scripts directories. This codebase contains several pipelines whose isolation from each other varies greatly: the DICOM importer code lives almost exclusively in the lib/import_dicom_study directory, the BIDS converter code mostly lives in the lib/dcm2bids_imaging_pipeline_lib directory, and the (current) BIDS importer lives in several files spread out across the codebase (lib/bidsreader.py, lib/candidate.py, lib/eeg.py, lib/mri.py, lib/session.py...).
I believe this is not a sustainable way to do development: since all the pipelines live in a single package, the boundaries between them are often blurry in practice, the pipelines themselves are less discoverable, and adding or modifying a pipeline becomes very hard to review and to merge as it risks impacting the whole codebase (which partly explains the BIDS importer refactor situation). Moreover, I personally want to add new pipelines to LORIS Python in the near or far future, notably for MEG support and imaging upload. Some of these pipelines may contain consequential features (probably an HTTP server), so I want those to be fully isolated, opt-in, and disabled by default.
In order to enforce better segmentation and modularity in the codebase, I propose dividing LORIS Python into several packages, which can be installed, reviewed, or replaced independently.
Architecture
The new architecture I propose, given the current pipelines in the codebase, looks like this:
Notice that the lib and scripts directories have been replaced by several loris_x packages, each of which declares its own dependencies, including on other packages. There are basically four kinds of packages:
Packages that do not require a LORIS installation (loris_util, loris_bids_reader, loris_eeg_chunker).
The core package used to interact with the LORIS environment or database (loris_core).
The pipeline-specific packages that contain the libraries and scripts relevant to a pipeline (loris_dicom_importer, loris_bids_importer).
The main LORIS package, which is declared at the root level and contains the other packages.
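The dependency relationships between these packages can be sketched as a small graph. This is an illustration only: the root package name `loris` is a placeholder, and the edges from `loris_core` to `loris_util` and from `loris_bids_importer` to `loris_bids_reader` are assumptions inferred from the package descriptions. Only the dependency of the two importers on `loris_core` is stated in this proposal. The sketch uses the standard-library `graphlib` module to check that the graph has no cycles and to derive a valid install order.

```python
from graphlib import TopologicalSorter

# Maps each package to the packages it depends on.
# Edges marked "assumed" are not stated in the proposal.
DEPENDENCIES: dict[str, set[str]] = {
    "loris_util": set(),
    "loris_bids_reader": set(),
    "loris_eeg_chunker": set(),
    "loris_core": {"loris_util"},  # assumed
    "loris_dicom_importer": {"loris_core"},
    "loris_bids_importer": {"loris_core", "loris_bids_reader"},  # reader edge assumed
}
# Hypothetical name for the root package, which pulls in all the others.
DEPENDENCIES["loris"] = set(DEPENDENCIES)


def install_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the packages in an order where dependencies come first.

    Raises graphlib.CycleError if packages depend on each other circularly.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(install_order(DEPENDENCIES))
```

A check like this could also serve as a cheap guard against accidentally introducing a circular dependency between packages.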
Each package should follow the conventional src-layout, which looks like this:
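As a rough illustration of that convention, the sketch below checks whether a directory follows the src-layout as commonly described in Python packaging guides: a `pyproject.toml` at the package root and the importable code under `src/<import_name>/`. The helper names, the example package name `loris_example`, and the requirement of an `__init__.py` file are assumptions made for this example, not part of the proposal.

```python
import tempfile
from pathlib import Path


def follows_src_layout(package_root: Path, import_name: str) -> bool:
    """Check for <root>/pyproject.toml and <root>/src/<import_name>/__init__.py."""
    metadata = package_root / "pyproject.toml"
    init_file = package_root / "src" / import_name / "__init__.py"
    return metadata.is_file() and init_file.is_file()


def make_example_package(parent: Path, import_name: str) -> Path:
    """Create an empty src-layout skeleton under `parent` and return its root."""
    root = parent / import_name
    (root / "src" / import_name).mkdir(parents=True)
    (root / "pyproject.toml").touch()
    (root / "src" / import_name / "__init__.py").touch()
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = make_example_package(Path(tmp), "loris_example")
        print(follows_src_layout(root, "loris_example"))
```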
Migration
A major question is obviously how to move from the current monolithic architecture to the modular one. It is actually not that hard, and can be accomplished in the following few steps:
Make the EEG chunker directory into a real package (see Make the EEG chunker into a standalone Python package #1338). This is also convenient because I learned that C-BRAIN is currently pulling the whole of LORIS-MRI / LORIS Python to use the chunker, which could be avoided if it is made into a standalone package.
Extract lib/util out of lib into its own loris_util package, so that it can be used by any other package. This is not a hard breaking change because this module was only added in the last version, which has not yet been adopted by any project.
Extract the BIDS reader (loris_bids_reader) out of the new BIDS importer (BIDS importer refactor #1325) into its own package. The BIDS importer is basically made up of two parts: one that reads the BIDS dataset, and another that ingests its data into the database. The former part is completely independent of LORIS, and should therefore be put in its own package to make it more isolated and reusable.
Make the lib and scripts directories into the LORIS core package (loris_core). This can be accomplished simply by putting the scripts directory inside lib, renaming the latter to loris_core, and adding the package metadata (pyproject.toml).
Extract the DICOM importer and the new BIDS importer into their own loris_dicom_importer and loris_bids_importer packages, which depend on loris_core. Those are pipelines that I wrote and that are already very well isolated, so this is rather trivial.
Extract the other pipelines into their own specific packages (such as the DICOM archive to BIDS converter). This is an optional step, as these pipelines can also simply live in the loris_core package until we decide to move them out.
I can write the PRs for all the steps except the last one myself (which, as said, is not required anyway). While it seems like a lot of breaking changes at first glance, this is mostly just moving code around and adding package metadata. From a current user's perspective, this mainly means updating imports, which is a simple find-and-replace operation (and is also checked by CI!):
from lib import x → from loris_core import x
from lib.util import x → from loris_util import x
from lib.imaging_lib.bids import x → from loris_bids_reader import x
from chunking import x → from loris_eeg_chunker.chunking import x (already done)
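The find-and-replace above could be automated along the following lines. This is a minimal sketch, assuming only the four `from ... import` forms listed above need rewriting; the function name `rewrite_import` is hypothetical, and other forms (such as `import lib` or deeper submodules like `lib.util.something`) are deliberately left untouched for manual review.

```python
import re

# Ordered from most to least specific, so that `lib.util` and
# `lib.imaging_lib.bids` are handled before the generic `lib` rule.
REWRITES = [
    (re.compile(r"^(\s*from\s+)lib\.imaging_lib\.bids(\s+import\b)"),
     r"\g<1>loris_bids_reader\g<2>"),
    (re.compile(r"^(\s*from\s+)lib\.util(\s+import\b)"),
     r"\g<1>loris_util\g<2>"),
    (re.compile(r"^(\s*from\s+)chunking(\s+import\b)"),
     r"\g<1>loris_eeg_chunker.chunking\g<2>"),
    (re.compile(r"^(\s*from\s+)lib(\s+import\b)"),
     r"\g<1>loris_core\g<2>"),
]


def rewrite_import(line: str) -> str:
    """Apply the first matching rewrite rule to a single source line."""
    for pattern, replacement in REWRITES:
        new_line, count = pattern.subn(replacement, line)
        if count:
            return new_line
    return line


if __name__ == "__main__":
    for old in ["from lib import x", "from lib.util import x"]:
        print(old, "->", rewrite_import(old))
```

Lines that match no rule are returned unchanged, so the function can be applied to every line of a file.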
Use
To install LORIS Python with this modular architecture, simply use pip install $LORIS_MRI_DIR_PATH, exactly as it works now in CI (and as I also do on my VMs). This installs the main LORIS package and all the other packages it depends on (that is, loris_core and all the pipelines). The only subtlety is that to edit a specific package, you probably need to use pip install -e $YOUR_PACKAGE_PATH (-e stands for --editable) so that changes in the code are reflected in the package behavior. This will likely need to be documented in a docs/python/Packaging.md file (just as I have written some for the database abstraction and tooling).
Moreover, I also plan to provide an optional UV configuration that allows installing all the packages as editable at once using uv pip install -e $LORIS_MRI_DIR_PATH. For those who don't know, UV is a third-party package manager by the same people who made Ruff (the linter we use), which has gained a lot of traction in recent years, notably for its speed and multi-package management. In any case, note that this configuration is fully optional and that the whole installation also works with PIP.
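After installing, a quick way to see which packages ended up in the environment is to ask Python whether each one is importable. The sketch below is only an illustration: the function name `importable_packages` is hypothetical, the package list simply repeats the names used in this proposal, and it does not tell whether a package was installed as editable.

```python
from importlib.util import find_spec

# Top-level package names taken from this proposal.
PACKAGES = [
    "loris_util",
    "loris_bids_reader",
    "loris_eeg_chunker",
    "loris_core",
    "loris_dicom_importer",
    "loris_bids_importer",
]


def importable_packages(names: list[str]) -> dict[str, bool]:
    """Map each top-level package name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in importable_packages(PACKAGES).items():
        print(f"{name}: {'installed' if available else 'missing'}")
```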
Overrides
There is a potential problem in that the current LORIS-MRI Python override process is not documented anywhere and is therefore very hard to fully understand. I would like to schedule a call with @cmadjar so she can show me what it looks like in practice and I can better address her needs.
In any case, any package in this modular architecture can be replaced by an alternative version of that package that contains all the overrides needed for a project.
Checklist
Modify CI to test each package individually instead of only as a whole
Conclusion
As said in the introduction, this is a rather critical project for me, as I need that modularity to develop the new MEG pipeline, and as I think the BIDS importer refactor mess has shown how much we need more isolation between the pipelines.
Thank you for reading. Please share your thoughts and questions in the issue comments.