Skip to content

Bug fixes, correctness improvements, and performance optimisations#9

Merged
bjmorgan merged 18 commits intomainfrom
code-review-fixes
Feb 21, 2026
Merged

Bug fixes, correctness improvements, and performance optimisations#9
bjmorgan merged 18 commits intomainfrom
code-review-fixes

Conversation

@bjmorgan
Copy link
Copy Markdown
Owner

Summary

  • Fix mutable default argument, stale caches, infinite loop risk, wrong class_str, sys.exit() in library code, and subclass type loss in __mul__
  • Optimise permutation matrix inversion (transpose), index_mapping/as_vector (argmax), symmetry vector deduplication (set), and species relabelling (vectorised)
  • Clean up outdated docstrings, unused imports, duplicate code, and deprecated tqdm_notebook

Bug fixes

  • Mutable default argument in SymmetryGroup.__init__: all instances shared the same list, risking cross-contamination.
  • Stale caches after extend()/append(): unique_index_mappings and stacked_index_mappings were not invalidated when new operations were added.
  • Infinite loop risk in random_unique_configurations: added a max_attempts parameter with RuntimeError when the configuration space appears exhausted.
  • Wrong class_str in SpaceGroup: was set to 'SymmetryGroup' instead of 'SpaceGroup'.
  • sys.exit() in library code: bsym.bsym called sys.exit() on import; replaced with raise ImportError(...).
  • SymmetryGroup.__mul__ lost subclass type: SpaceGroup * SpaceGroup returned a plain SymmetryGroup; now uses type(self)(...).

Type annotation fixes

  • Fixed labels property return type (list[str | None], not list[SymmetryOperation]).
  • Added proper type annotations to cache fields in SymmetryGroup.

Performance optimisations

  • Matrix transpose for permutation inversion instead of np.linalg.inv (15-26x micro-benchmark speedup).
  • np.argmax for index_mapping and as_vector instead of Python loops with .index() (2-47x micro-benchmark speedup).
  • Set-based deduplication of symmetry operation vectors in the pymatgen interface (~3x micro-benchmark speedup).
  • Vectorised apply_species_mapping using numpy fancy indexing (1.4-5x micro-benchmark speedup).
  • End-to-end: 3.9x speedup on ternary composition enumeration, 2.4x on random sampling.

Code hygiene

  • Removed outdated docstrings referencing numpy.matrix subclassing.
  • Removed unused imports (copy, SpacegroupOperations, Lattice) from pymatgen interface.
  • Removed duplicate permutation_as_config_number function.
  • Simplified SymmetryOperation.__init__ type checking.
  • Replaced deprecated tqdm_notebook with tqdm.auto.

Replace the shared mutable default list with None sentinel to prevent
instances created without arguments from sharing the same list object.
extend() and append() now reset _stacked_mappings and
_unique_mappings so that cached values are recomputed after
the operation list changes.
Add max_attempts parameter to limit consecutive failed attempts.
Raises RuntimeError when the configuration space appears exhausted,
preventing an infinite loop when n exceeds the number of available
inequivalent configurations.
- Fix labels property return type: list[str | None] not list[SymmetryOperation]
- Add proper type annotations to cache fields (_stacked_mappings, _unique_mappings)
- Fix SpaceGroup.class_str from 'SymmetryGroup' to 'SpaceGroup'

All pre-existing mypy errors in symmetry_group.py are now resolved.
For permutation matrices, the inverse equals the transpose. This
replaces np.linalg.inv (O(n^3) general inversion with floating-point
rounding) with a simple transpose, giving 15-26x speedup depending
on matrix size.
Use type(self)(...) instead of hardcoded SymmetryGroup(...) so that
SpaceGroup * SpaceGroup returns a SpaceGroup, etc.
Calling sys.exit() at import time kills the entire process, which is
too aggressive for a library. An ImportError with a clear message is
the standard way to signal an incompatible import.
- Fix outdated Configuration docstring (no longer subclasses numpy.matrix)
- Add proper docstring for unique_configurations_by_composition
- Remove duplicate permutation_as_config_number (identical to as_number)
- Remove redundant molecule centring in configuration_space_from_molecule
- Remove unused imports: copy, Lattice, SpacegroupOperations
Replace O(n^2) Python loop with vectorised np.argmax for computing
index_mapping (2-47x speedup) and as_vector (2-12x speedup for
realistic matrix sizes).
Replace O(n^2) list membership test with a set of tuples for ~3x
speedup when deduplicating symmetry operations from pymatgen.
Replace Python list comprehension with numpy fancy indexing for
1.4-5x speedup depending on configuration size.
Collapse three identical isinstance branches into a single check
with np.asarray (which avoids copying when already an ndarray).
tqdm_notebook was deprecated in favour of tqdm.auto.tqdm, which
automatically selects the appropriate widget for terminals and
Jupyter notebooks.
Removes the push trigger to avoid duplicate workflow runs when a
branch has an open PR.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on improving correctness and performance in bsym’s symmetry/configuration handling, while also updating tests and some integration utilities.

Changes:

  • Fixes correctness issues in symmetry group behavior (mutable default argument, cache invalidation, subclass preservation in __mul__, SpaceGroup.class_str) and replaces sys.exit() on import with ImportError.
  • Improves performance for symmetry operations and configuration manipulation (transpose-based inversion, argmax-based mappings/vectors, set-based deduplication, vectorized species mapping).
  • Updates/extends unit tests to cover the above behaviors and modernizes tqdm usage.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
bsym/symmetry_operation.py Uses argmax for mapping/vector extraction and transposes permutation matrices for inversion.
bsym/symmetry_group.py Fixes mutable default arg, adds cache invalidation on mutation, corrects typing, preserves subclass type in __mul__.
bsym/space_group.py Corrects class_str to 'SpaceGroup'.
bsym/configuration_space.py Replaces deprecated notebook tqdm usage, adds max_attempts to random sampling, vectorizes apply_species_mapping, removes duplicate helper.
bsym/interface/pymatgen.py Removes unused imports and speeds up symmetry vector deduplication with a set.
bsym/configuration.py Updates documentation to match internal numpy.int8 representation.
bsym/bsym.py Raises ImportError instead of exiting the interpreter at import time.
tests/unit_tests/test_symmetry_group.py Adds regression tests for default arg non-sharing and cache invalidation.
tests/unit_tests/test_space_group.py Adds test ensuring SpaceGroup * SpaceGroup preserves type.
tests/unit_tests/test_configuration_space.py Removes tests for deleted helper; adds test for exhaustion behavior in random sampling.
tests/unit_tests/test_bsym.py Updates expectations for legacy bsym.bsym import behavior.
.github/workflows/build.yml Changes CI trigger to run only on pull requests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/unit_tests/test_bsym.py Outdated
Comment thread tests/unit_tests/test_configuration_space.py
Comment thread tests/unit_tests/test_configuration_space.py Outdated
Comment thread tests/unit_tests/test_symmetry_group.py
Comment thread .github/workflows/build.yml
- Remove unused `import importlib` in test_bsym.py
- Remove unused `call_count` variable and pass `max_attempts=10`
  to keep exhaustion test fast
- Assert both `stacked_index_mappings` and `unique_index_mappings`
  in cache invalidation tests
Only increment consecutive_failures when a duplicate configuration
is found, not when the uniform-sampling acceptance test rejects a
novel configuration.  The previous behaviour could trigger a spurious
RuntimeError for highly degenerate configurations even when the space
was not exhausted.

Also add a docstring note to SymmetryGroup.__mul__ documenting that
the return type is determined by the left operand.
Copilot AI review requested due to automatic review settings February 21, 2026 12:52

This comment was marked as resolved.

Remove duplicate in-function import of compute_mapping_vector in
unique_configurations_by_composition (already imported at module scope).

Update SymmetryOperation.__init__ docstring to document that np.matrix
is accepted (deprecated) alongside np.ndarray and list, matching the
runtime isinstance check.
Copilot AI review requested due to automatic review settings February 21, 2026 13:12
@bjmorgan bjmorgan merged commit 3c96806 into main Feb 21, 2026
15 checks passed
@bjmorgan bjmorgan deleted the code-review-fixes branch February 21, 2026 13:16
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -5,7 +5,8 @@
import numpy as np
from itertools import combinations_with_replacement
from collections import Counter
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counter is imported but not used anywhere in this module. Please remove the unused import to avoid dead code and keep the import section clean.

Suggested change
from collections import Counter

Copilot uses AI. Check for mistakes.
Comment on lines +146 to +147
max_attempts: Maximum number of consecutive failed attempts before
raising RuntimeError. Default is 1000.
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for max_attempts says it counts “consecutive failed attempts”, but in the implementation uniform-sampling rejections (the acceptance test) intentionally do not increment the counter. Please clarify this in the max_attempts docstring (e.g., that only repeated already-seen configurations count as failures) to avoid surprising API behavior.

Suggested change
max_attempts: Maximum number of consecutive failed attempts before
raising RuntimeError. Default is 1000.
max_attempts: Maximum number of consecutive failed attempts (i.e.,
attempts that generate configurations already seen or excluded)
before raising RuntimeError. In 'uniform' sampling, rejections
from the acceptance test (the 1/degeneracy filter) do not count
toward this limit. Default is 1000.

Copilot uses AI. Check for mistakes.
Comment thread bsym/symmetry_group.py
) -> list[SymmetryOperation]:
def labels(self) -> list[str | None]:
"""
A list of labels for each :any:`SymmetryOperation` in this spacegroup.
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels is a property on SymmetryGroup, but the docstring says “in this spacegroup”. Please update the wording to “in this symmetry group” (or similar) to avoid confusing readers.

Suggested change
A list of labels for each :any:`SymmetryOperation` in this spacegroup.
A list of labels for each :any:`SymmetryOperation` in this symmetry group.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants