enhance PythonPackage and PythonBundle to verify Python package names and versions with pip list with sanity_check_pip_list parameter#4049

Merged
boegel merged 30 commits into easybuilders:develop from smoors:sanity_pip_list
Apr 8, 2026

@smoors
Contributor

@smoors smoors commented Jan 20, 2026

cf. the conf call discussion: https://github.com/easybuilders/easybuild/wiki/Conference-call-notes-20260114

  • check that the normalized name and version of each Python package extension in the easyconfig match the normalized name and version in the pip list output.
  • show close matches from the pip list output if no name matches.

implemented for Python, PythonPackage, PythonBundle easyblocks (+ derivatives).
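
A minimal sketch of the comparison described above, assuming the expected extensions and the pip list entries are both available as (name, version) pairs. The function names and the sample data are hypothetical and not taken from the easyblock (only the kiwisolver versions appear later in this thread). Versions are compared as plain strings here, which simplifies whatever version normalization the PR applies, and the close-match suggestion is left out.

```python
import re


def normalize_name(name):
    # PyPA name normalization: runs of '-', '_' and '.' become a single '-', lowercased
    return re.sub(r"[-_.]+", "-", name).lower()


def compare_with_pip_list(expected, installed):
    """Return (missing_names, version_mismatches) for expected vs. installed packages.

    Both arguments are iterables of (name, version) tuples (hypothetical interface).
    """
    installed_versions = {normalize_name(name): version for name, version in installed}
    missing, mismatches = [], []
    for name, version in expected:
        found = installed_versions.get(normalize_name(name))
        if found is None:
            # no entry in 'pip list' with the same normalized name
            missing.append(name)
        elif found != version:
            # same normalized name, but a different version is installed
            mismatches.append((name, version, found))
    return missing, mismatches


# illustrative data only; 'example-pkg' and the semantic-version version are invented
expected = [("semantic-version", "2.10.0"), ("kiwisolver", "1.4.5"), ("example-pkg", "1.0")]
installed = [("semantic_version", "2.10.0"), ("kiwisolver", "1.4.4")]
missing, mismatches = compare_with_pip_list(expected, installed)
print(missing)
print(mismatches)
```

In this sketch a name that differs only in separators or case is treated as a match, so only genuinely missing names and differing versions are reported.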

UPDATE 2026-03-22:

  • by default, the extended pip list sanity check runs only if --upload-test-report is enabled. you can override this by setting easyconfig parameter sanity_check_pip_list to True or False
  • the pip list check for 0.0.0 versions is still done by default as before this PR, but is now done in run_pip_list() instead of run_pip_check().
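
The default behaviour described in this update can be sketched as a small decision function. This is only an illustration: the function names are hypothetical, and representing "parameter not set" as None is an assumption, not something confirmed by the PR.

```python
def should_run_extended_check(sanity_check_pip_list, upload_test_report):
    """Decide whether the extended name/version check runs.

    sanity_check_pip_list: True, False, or None (assumed to mean "not set in the easyconfig").
    upload_test_report: whether --upload-test-report is enabled.
    """
    if sanity_check_pip_list is None:
        # no explicit setting: follow --upload-test-report
        return bool(upload_test_report)
    # an explicit True/False in the easyconfig overrides the default
    return bool(sanity_check_pip_list)


def zero_versions(pip_list_entries):
    """Return names of packages that 'pip list' reports with version 0.0.0."""
    return [name for name, version in pip_list_entries if version == "0.0.0"]
```

With this sketch, `should_run_extended_check(None, True)` enables the check, while `should_run_extended_check(False, True)` disables it even when test reports are uploaded.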

@boegel changed the title from "add sanity_pip_list parameter to verify Python package names and versions with 'pip list'" to "add sanity_pip_list parameter to verify Python package names and versions with pip list" on Jan 28, 2026
@smoors
Contributor Author

smoors commented Jan 28, 2026

hm, pip list doesn't always give the same name as advertised on pypi:

ERROR: Installation of Python-bundle-PyPI-2025.04-GCCcore-14.2.0.eb failed: The following Python packages were likely specified with a wrong name because they are missing from the 'pip list' output:
semantic-version (close matches: ['semantic_version']) -> wrong
importlib-metadata (close matches: ['importlib_metadata']) -> wrong
backports.entry-points-selectable (close matches: ['backports.entry_points_selectable']) -> wrong
importlib-resources (close matches: ['importlib_resources']) -> wrong
py-expression-eval (close matches: ['py_expression_eval']) -> wrong
backports.functools-lru-cache (close matches: ['backports.functools_lru_cache']) -> wrong
Babel (close matches: ['babel']) -> correct
tomli-w (close matches: ['tomli_w', 'tomli', 'toml']) -> wrong
rapidfuzz (close matches: ['RapidFuzz']) -> correct

we might have to query pypi to get the correct name:

$ python -c 'import requests; print(requests.get("https://pypi.org/pypi/semantic_version/json").json()["info"]["name"])'
semantic-version

@boegel what do you think?

@smoors
Contributor Author

smoors commented Jan 28, 2026

i may have a better idea.

pypa normalizes the names as follows (see https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization)

import re

def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()

this produces a canonical name for all the allowed variations.
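
As a quick check, the normalization can be applied to some of the name pairs from the error message in the previous comment. This sketch only uses the `normalize` function quoted above and names that already appear in this thread.

```python
import re


def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()


# (easyconfig name, name reported by 'pip list') pairs from the error message
pairs = [
    ("semantic-version", "semantic_version"),
    ("backports.entry-points-selectable", "backports.entry_points_selectable"),
    ("Babel", "babel"),
    ("rapidfuzz", "RapidFuzz"),
    ("tomli-w", "tomli_w"),
]

for easyconfig_name, pip_name in pairs:
    # each pair collapses to the same canonical name
    assert normalize(easyconfig_name) == normalize(pip_name)

# genuinely different projects stay distinct
assert normalize("tomli") != normalize("tomli_w")

print(normalize("backports.entry_points_selectable"))  # backports-entry-points-selectable
```

Every pair that failed the raw string comparison matches after normalization, while tomli and tomli-w remain different names.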

the idea is to let EB normalize the names on the fly and write the normalized extension names in the module file. sanity_pip_list checks that the normalized extension names are equal to the normalized pip list names, which will (almost) always be true, so very few easyconfigs will need fixing (if any at all).

the upside is that there is guaranteed to be only 1 extension name in the modules, making lookup (e.g. with module spider) much easier.

the downside is that some extension names will look a bit different from before and from what's shown on pypi (no dots, no underscores). in case some sites care about that, we could make writing the normalized extension names in the module file optional, e.g. with a build option.

thoughts?

@smoors
Contributor Author

smoors commented Jan 29, 2026

new plan after short discussion with @boegel:

  • normalize extension names and pip list names, and compare them as part of the sanity_pip_list check
  • check the connection to pypi.org
  • if the connection fails, print a warning and write the easyconfig extension name in the module file
  • if the connection succeeds but querying the name on pypi.org fails, error (extension name is wrong)
  • if querying the name on pypi.org succeeds, write that name in the module file.

EDIT: after thinking about this more, i'm not sure if we should be querying pypi.org for the display name of every package, because project authors can change this, so it's not guaranteed to stay the same. in any case, that will be for another PR.
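
For reference, the plan listed above (deferred to a later PR, per the edit) could look roughly like the sketch below. It is not part of this PR. The function names and the injectable `fetch` parameter are hypothetical, the connection check is folded into the lookup request itself, and treating an HTTP 404 as "name is wrong" is an assumption.

```python
import json
import urllib.error
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/%s/json"


def fetch_pypi_name(name, timeout=10):
    """Return the project name as recorded on pypi.org (raises on HTTP or connection errors)."""
    with urllib.request.urlopen(PYPI_JSON_URL % name, timeout=timeout) as response:
        return json.load(response)["info"]["name"]


def module_extension_name(name, fetch=fetch_pypi_name):
    """Pick the extension name to write in the module file, following the plan above."""
    try:
        # lookup succeeded: use the name reported by pypi.org
        return fetch(name)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # pypi.org was reachable but does not know this name
            raise ValueError("extension name %r not found on pypi.org" % name) from err
        raise
    except (urllib.error.URLError, OSError) as err:
        # no connection: warn and fall back to the name from the easyconfig
        print("WARNING: could not reach pypi.org (%s), keeping %r" % (err, name))
        return name
```

Passing a stub as `fetch` makes the three branches testable without network access.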

@smoors changed the title from "add sanity_pip_list parameter to verify Python package names and versions with pip list" to "verify Python package names and versions with pip list in sanity_pip_check" on Feb 14, 2026
@smoors changed the title from "verify Python package names and versions with pip list in sanity_pip_check" to "verify Python package names and versions with pip list as part of sanity_pip_check" on Feb 14, 2026
@smoors
Contributor Author

smoors commented Feb 14, 2026

Test report by @smoors

Overview of tested easyconfigs (in order)

  • SUCCESS Python-3.13.1-GCCcore-14.2.0.eb

  • FAIL Python-bundle-PyPI-2025.04-GCCcore-14.2.0.eb (build issue)
    (partial log available at https://gist.github.com/smoors/90a0a2774dea6565c769db019e5d5641)

  • SUCCESS psutil-7.0.0-GCCcore-14.2.0.eb

  • SUCCESS SciPy-bundle-2025.06-gfbf-2025a.eb

  • SUCCESS Biopython-1.85-gfbf-2025a.eb

Build succeeded for 4 out of 5 (total: 2 mins 35 secs) (5 easyconfigs in total)
node700.hydra.os - Linux Rocky Linux 9.7, x86_64, AMD EPYC 9535 64-Core Processor (zen5), Python 3.9.25
See https://gist.github.com/smoors/bb4f30ceb589a8cf8887c4cfcee5cd05 for a full test report.

@smoors
Contributor Author

smoors commented Feb 14, 2026

Test report by @smoors

Overview of tested easyconfigs (in order)

  • SUCCESS archspec-0.2.5-GCCcore-14.2.0.eb

Build succeeded for 1 out of 1 (total: 18 secs) (1 easyconfigs in total)
node700.hydra.os - Linux Rocky Linux 9.7, x86_64, AMD EPYC 9535 64-Core Processor (zen5), Python 3.9.25
See https://gist.github.com/smoors/2ccd9a9b15da8de0d01f340ece0e6a37 for a full test report.

@smoors
Contributor Author

smoors commented Feb 14, 2026

Test report by @smoors

Overview of tested easyconfigs (in order)

  • SUCCESS Python-bundle-PyPI-2025.04-GCCcore-14.2.0.eb

Build succeeded for 1 out of 1 (total: 1 min 57 secs) (1 easyconfigs in total)
node700.hydra.os - Linux Rocky Linux 9.7, x86_64, AMD EPYC 9535 64-Core Processor (zen5), Python 3.9.25
See https://gist.github.com/smoors/0051b4667be7a594b9f1d307cebc98c5 for a full test report.

@smoors changed the title from "verify Python package names and versions with pip list as part of sanity_pip_check" to "verify Python package names and versions with pip list with sanity_pip_check" on Feb 15, 2026
@smoors changed the title from "verify Python package names and versions with pip list with sanity_pip_check" to "verify Python package names and versions with pip list with sanity_check_pip_list" on Mar 21, 2026
@smoors changed the title from "verify Python package names and versions with pip list with sanity_check_pip_list" to "verify Python package names and versions with pip list with sanity_check_pip_list parameter" on Mar 21, 2026
@boegel
Member

boegel commented Apr 7, 2026

@boegelbot please test @ jsc-zen3
CORE_CNT=16
EB_ARGS="--installpath /tmp/$USER/pr4049 Python-3.14.2-GCCcore-15.2.0.eb Python-bundle-PyPI-2023.06-GCCcore-12.3.0.eb yaff-1.6.0-foss-2023b.eb matplotlib-3.9.2-gfbf-2024a.eb"

@boegelbot

@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=4049 EB_ARGS="--installpath /tmp/$USER/pr4049 Python-3.14.2-GCCcore-15.2.0.eb Python-bundle-PyPI-2023.06-GCCcore-12.3.0.eb yaff-1.6.0-foss-2023b.eb matplotlib-3.9.2-gfbf-2024a.eb" EB_CONTAINER= EB_REPO=easybuild-easyblocks EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_4049 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10139

Test results coming soon (I hope)...

Details

- notification for comment with ID 4200667368 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot

Test report by @boegelbot

Overview of tested easyconfigs (in order)

Build succeeded for 3 out of 4 (total: 2 hours 8 mins 34 secs) (4 easyconfigs in total)
jsczen3c2.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.7, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.25
See https://gist.github.com/boegelbot/92a0f0cc3b4780c439b2e4f818956e73 for a full test report.

@boegel
Member

boegel commented Apr 7, 2026

== 2026-04-07 19:01:57,291 build_log.py:233 ERROR EasyBuild encountered an error: 
The following Python packages were likely specified with a wrong version because they have another version in the 'pip list' output:
kiwisolver 1.4.5 (version in 'pip list' output: 1.4.4) (at easybuild/easyblocks/python.py:359 in run_pip_list)

This is expected, it's being fixed in:

@boegel
Member

boegel commented Apr 7, 2026

@boegelbot please test @ jsc-zen3
CORE_CNT=16
EB_ARGS="--installpath /tmp/$USER/pr4049 matplotlib-3.9.2-gfbf-2024a.eb"

@boegelbot

@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=4049 EB_ARGS="--installpath /tmp/$USER/pr4049 matplotlib-3.9.2-gfbf-2024a.eb" EB_CONTAINER= EB_REPO=easybuild-easyblocks EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_4049 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 10142

Test results coming soon (I hope)...

Details

- notification for comment with ID 4201753494 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot

Test report by @boegelbot

Overview of tested easyconfigs (in order)

  • SUCCESS matplotlib-3.9.2-gfbf-2024a.eb

Build succeeded for 1 out of 1 (total: 4 mins 44 secs) (1 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.7, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.25
See https://gist.github.com/boegelbot/f58b1efea95f89136e8860a90e569f40 for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Apr 8, 2026

Test report by @boegel

Overview of tested easyconfigs (in order)

Build succeeded for 3 out of 4 (total: 23 mins 36 secs) (4 easyconfigs in total)
node4232.shinx.os - Linux RHEL 9.6, x86_64, AMD EPYC 9654 96-Core Processor (zen4), Python 3.9.21
See https://gist.github.com/boegel/f78fef597bbfc4f6883059be018e47c1 for a full test report.

@boegel
Member

boegel commented Apr 8, 2026

@smoors Another actual problem uncovered:

The following Python packages were likely specified with a wrong name because they are missing in the 'pip list' output:
netcdf4-python (close matches in 'pip list' output: netcdf4 (at easybuild/easyblocks/python.py:360 in run_pip_list)

That shows that this all works as designed, so good!
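
For illustration, a close-match hint like the one in this error can be produced with the standard library's difflib. This is a sketch under assumptions: whether the easyblock uses difflib is not stated in this thread, the function name is hypothetical, and the pip list names other than netcdf4 are invented for the example. Names are assumed to be normalized already.

```python
import difflib


def missing_name_message(name, pip_list_names):
    """Format a hint listing 'pip list' names that resemble the missing name."""
    # get_close_matches returns up to 3 candidates with similarity ratio >= 0.6 by default
    matches = difflib.get_close_matches(name, pip_list_names)
    hint = ", ".join(matches) if matches else "none"
    return "%s (close matches in 'pip list' output: %s)" % (name, hint)


# illustrative 'pip list' names; only netcdf4 comes from the error above
print(missing_name_message("netcdf4-python", ["netcdf4", "numpy", "cftime"]))
```

The default cutoff keeps unrelated packages out of the hint, so only plausible renames are suggested.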

Member

@boegel boegel left a comment

lgtm

@boegel boegel merged commit 32dae03 into easybuilders:develop Apr 8, 2026
22 checks passed
@boegel changed the title from "verify Python package names and versions with pip list with sanity_check_pip_list parameter" to "enhance PythonPackage and PythonBundle to verify Python package names and versions with pip list with sanity_check_pip_list parameter" on Apr 9, 2026