Fix ICEBERG forward prediction on pip-install / CPU / pytorch_lightning 2.x #35

Open
hugogontijomachado wants to merge 1 commit into coleygroup:main from hugogontijomachado:fix/pip-install-cpu-pl2-compat

@hugogontijomachado

Summary

Three small, backward-compatible fixes so ICEBERG forward prediction
(iceberg_prediction() -> dag_pred/predict_smis.py) runs end-to-end when ms_pred is
pip-installed, on CPU, and with pytorch_lightning ≥ 2.0. None of them change
behavior for an editable checkout / GPU / PL 1.x.

1. dag_pred/iceberg_elucidation.py — resolve predict_smis.py relative to the package

iceberg_prediction() builds the subprocess with a cwd-relative path:

cmd = f'''{python_path} src/ms_pred/dag_pred/predict_smis.py ...'''

That only resolves when run from the repo root. On a pip install there is no src/, so the
subprocess fails (can't open file '.../src/ms_pred/dag_pred/predict_smis.py') and
load_pred_spec then raises on the missing preds.hdf5. Fixed by locating the script via
Path(__file__).resolve().parent / "predict_smis.py" (the two files live side by side).
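
A minimal sketch of the package-relative lookup described above. The helper name `build_predict_cmd` and its `module_file` parameter are hypothetical, introduced only so the idea can be shown outside the package. In iceberg_elucidation.py the module's own `__file__` would be used, and the remaining predict_smis.py arguments are omitted here.

```python
from pathlib import Path
from typing import List


def build_predict_cmd(python_path: str, module_file: str) -> List[str]:
    """Build the start of the subprocess command for predict_smis.py.

    module_file stands in for __file__ of iceberg_elucidation.py; the
    script is assumed to sit in the same directory, as the PR states.
    """
    # Resolve against the module's own location, not the current working directory.
    script = Path(module_file).resolve().parent / "predict_smis.py"
    return [python_path, str(script)]
```

Because the path no longer depends on the working directory, the same command works for an editable checkout and for a pip install.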

2. dag_pred/predict_smis.py: use pl.seed_everything

pytorch_lightning.utilities.seed.seed_everything was removed in PL 2.0
(AttributeError: module 'pytorch_lightning.utilities.seed' has no attribute 'seed_everything').
pl.seed_everything is the public API and exists in both PL 1.x and 2.x.
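
The swap can be sketched as follows. The helper `seed_all` is hypothetical, and the import guard is an addition for illustration so the snippet also runs where pytorch_lightning is not installed. The PR itself only replaces the call.

```python
import importlib.util


def seed_all(seed: int) -> bool:
    """Seed via the top-level pl.seed_everything; return False if PL is absent."""
    if importlib.util.find_spec("pytorch_lightning") is None:
        return False  # assumption for this sketch: skip seeding when PL is missing
    import pytorch_lightning as pl

    # Old call, gone in PL 2.0 per the PR: pl.utilities.seed.seed_everything(seed)
    pl.seed_everything(seed)
    return True
```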

3. dag_pred/predict_smis.py — guard torch.cuda.set_device on CPU

In producer_func, torch.cuda.set_device(gpu_id) runs unconditionally, but gpu_id is
only assigned in the GPU branch — on CPU this raises NameError (and there is no CUDA
device to select). Guarded with if gpu and avail_gpu_num > 0:.
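
A sketch of the guard, written without a torch dependency so it runs anywhere. The helper `pick_device`, the injected `set_device` callable (standing in for torch.cuda.set_device), and the round-robin choice of `gpu_id` are all assumptions for illustration, not the code in producer_func.

```python
from typing import Callable, Optional


def pick_device(
    gpu: bool,
    avail_gpu_num: int,
    worker_id: int,
    set_device: Callable[[int], None],
) -> Optional[int]:
    """Select a CUDA device only when GPUs are requested and available."""
    if gpu and avail_gpu_num > 0:
        # Assumption: workers are spread over devices round-robin.
        gpu_id = worker_id % avail_gpu_num
        set_device(gpu_id)  # in real code: torch.cuda.set_device(gpu_id)
        return gpu_id
    # CPU path: gpu_id is never defined, so set_device must not be called.
    return None
```

Keeping the assignment and the set_device call under the same condition means the CPU path never touches an undefined name.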

How these surfaced

These surfaced while predicting MS/MS from SMILES with the public MassSpecGym checkpoints
on macOS (CPU) with pytorch_lightning 2.6.5; each fix revealed the next failure.

Testing

After all three, iceberg_prediction(...) runs end-to-end on CPU and produces preds.hdf5
(verified by predicting glyphosate MS/MS for [M-H]- and [M+H]+).

…ng 2.x

Three small, backward-compatible fixes so iceberg_prediction() ->
dag_pred/predict_smis.py runs end-to-end outside an editable repo checkout:

1. iceberg_elucidation.py: build the subprocess command with predict_smis.py
   resolved relative to the package (Path(__file__).parent) instead of the
   cwd-relative "src/ms_pred/dag_pred/predict_smis.py", which only exists when
   run from the repo root (fails on a pip install).

2. predict_smis.py: use pl.seed_everything instead of
   pl.utilities.seed.seed_everything, which was removed in pytorch_lightning 2.0.
   pl.seed_everything is the public API and works in PL 1.x and 2.x.

3. predict_smis.py: guard torch.cuda.set_device(gpu_id) with
   `if gpu and avail_gpu_num > 0`; on CPU gpu_id is undefined (NameError) and
   there is no CUDA device to select.