Decouple AtomData from frames, unify 2-D shape validation, drop top-level re-exports by bjmorgan · Pull Request #71 · bjmorgan/hofmann

bjmorgan · 2026-04-11T16:24:12Z

Closes #69.

Makes the per-atom metadata container frame-agnostic, tightens the scene's write API, and unifies 2-D shape validation into a single prospective-state walk inside the container.

Container changes

AtomData is no longer re-exported from hofmann or hofmann.model. The class is still public-named and documented in docs/api.rst so users can inspect instances returned by scene.atom_data; construction is considered an internal implementation detail and scene.atom_data is the only supported way to obtain one. Follows stdlib precedent -- types.MappingProxyType, dict_keys, dict_values, range, enumerate are all public, documented return types obtained via other APIs.
Constructor takes only n_atoms; frames= parameter and n_frames property are gone.
Inherits from collections.abc.Mapping (not MutableMapping), so mutation lives on private _set / _del / _clear_2d methods. Direct scene.atom_data[key] = arr and del scene.atom_data[key] raise TypeError (from Python's subscript bytecode, since Mapping defines no __setitem__ / __delitem__); .pop, .popitem, .setdefault, .update, .clear raise AttributeError (those methods do not exist on a Mapping).
__repr__ shows shapes in numpy tuple format, e.g. AtomData({'charge': (3,), 'energy': (5, 3)}).

Unified 2-D shape validation

AtomData._check_2d_consistency(expected, *, pending=None) validates the prospective post-operation state. pending is a mapping of keys being written; entries in pending override stored entries of the same key. Pending entries are checked first so user errors on a call-site array beat stale-stored errors.
AtomData._set(key, value, *, expected_frames: int) takes a required keyword frame count and calls _check_2d_consistency(expected_frames, pending={key: arr}). There is no separate scene-level shape check; all 2-D validation lives in this one walk.
Input validation (ndim, per-atom shape, dtype whitelist) runs before the state-dependent consistency walk. This ensures a bad-dtype write fires the dtype error directly instead of sending the user on an unnecessary clear_2d_atom_data() recovery for a stale stored entry that was going to stay valid anyway.

Single-entry in-place reassign

Reassigning a single 2-D entry at a new shape after scene.frames.append(...) now works without clear_2d_atom_data(). The stored version of the key is treated as overridden by the pending write and skipped during the consistency walk. Multi-entry scenes still require clear_2d_atom_data because the other stored entries are stale; the error names the actually-stale key so the user knows what to drop.

Scene write API

StructureScene gains del_atom_data(key) and clear_2d_atom_data(), alongside the existing set_atom_data.
set_atom_data and the __init__ atom_data loop pass expected_frames=len(self.frames) through to _set; there is no scene-side shape helper. The old _check_2d_atom_data_shape was deleted.
The @atom_data.setter is removed.

Render validation

Private StructureScene._validate_for_render() calls self._atom_data._check_2d_consistency(len(self.frames)) with no pending, validating the current stored state as-is. Invoked at the top of render_mpl, render_mpl_interactive, and render_animation as a backstop for the specific case where scene.frames is mutated after the last set_atom_data but before the next render.
Stale 2-D data (the original Detect stale 2-D atom_data after frames mutation #69 scenario) now raises a clear ValueError at the start of the render call, naming the stale key and pointing at clear_2d_atom_data() for recovery.

Documentation

AtomData class docstring describes the two enforcement sites honestly (write-time via _set, render-time via the scene's backstop). The prior draft's "the container itself does not know about the scene's trajectory" was aspirational; post-refactor the container enforces consistency on every write.
docs/api.rst documents AtomData via its fully qualified module path with an explicit :members: n_atoms, ranges, labels, so internal _set / _del / _clear_2d do not leak into the rendered reference.
docs/colouring.rst documents the single-entry in-place reassign shortcut alongside the multi-entry clear_2d_atom_data workflow.
docs/changelog.rst 0.19.0 entry reflects the final shipped design: no-re-export of AtomData, the unified validation walk, the single-entry shortcut, and the honest TypeError vs AttributeError split at the mutation surface. 0.17.0 "update values by reassigning the key" how-to (which would now raise AttributeError) has been rewritten to point at scene.set_atom_data.

Breaking API changes

from hofmann import AtomData raises ImportError (still importable via from hofmann.model.atom_data import AtomData for type annotations only; construction is not a supported workflow).
AtomData.n_frames removed; use len(scene.frames).
AtomData.__init__(frames=...) removed.
scene.atom_data[key] = arr and del scene.atom_data[key] raise TypeError.
scene.atom_data.pop(...) / .popitem() / .setdefault(...) / .update(...) / .clear() raise AttributeError.
scene.atom_data = new_container raises AttributeError (setter removed).

None of the removed paths were documented in the user guide or API reference. Migrations are mechanical one-line changes to the corresponding StructureScene methods, all covered in the 0.19.0 changelog.

…ng refs

The earlier commits in this branch renamed the container to _AtomData to signal "don't construct this", relying on the leading underscore as the hiding mechanism. On review that position is incoherent: the public StructureScene.atom_data property returns this object, users can and should iterate it, read membership, and access the derived ranges / labels / n_atoms metadata, yet the type they receive has a private name. A public accessor returning a privately-named type mismatches stdlib precedent (MappingProxyType, dict_keys, dict_values, range, enumerate are all public, documented return types whose constructors are effectively private), and forces a nitpick_ignore entry in Sphinx to paper over the inconsistency. The load-bearing defences against the original bugs (#69) are: 1. Removed @atom_data.setter — no externally-built container can be attached to a scene at all. 2. Decoupled AtomData from frames — the "stale frames reference" state is no longer representable. 3. Cross-entry 2-D shape invariant in _set. 4. _validate_for_render at the start of every public render_* method. None of these depend on the class-level underscore. Privatising the name was redundant with the real defences, and it pushed the class's type annotation into Sphinx-unresolvable territory. This commit: - Renames _AtomData -> AtomData in src/, tests/, and internal imports. The class is still kept out of hofmann.__all__ and hofmann.model.__all__, so `from hofmann import AtomData` and `from hofmann.model import AtomData` both fail as before. The only supported way to obtain an instance is `scene.atom_data`. - Adds AtomData to docs/api.rst via its fully qualified module path `hofmann.model.atom_data.AtomData`, so Sphinx can cross-link the return type of the atom_data property. Drops the nitpick_ignore entry that was covering the private name. - Updates the class docstring to state that the class is obtained via scene.atom_data and not instantiated directly. - Rewrites the changelog 0.19.0 entry: the breaking change is "no longer re-exported" rather than "renamed to _AtomData". - Addresses Copilot review comment on the atom_data property docstring: clarifies that 2-D shapes are *expected* to match len(self.frames) and are validated at render time, rather than implying the invariant always holds. - Declines the other Copilot review comment (annotate return as Mapping[str, np.ndarray]): doing so would erase the derived ranges / labels / n_atoms attributes from type information, which are useful for inspection. The right fix is the one above — expose the concrete type with a public name. Downstream effects: pytest 1024 passing, mypy clean, sphinx nitpicky warning count 87 (baseline was 88 — one fewer, none introduced).

Collapse the two duplicate 2-D shape checks (scene-level ``_check_2d_atom_data_shape`` and container-level ``_check_frames_compatibility``) into a single container method, ``AtomData._check_2d_consistency(expected, *, pending=None)``, that validates the prospective post-operation state. The new helper takes an optional ``pending`` mapping of keys to arrays being written. Pending entries replace stored entries of the same key (for reassignment) or add new entries (for first writes); stored entries whose keys appear in ``pending`` are skipped during the walk because they will no longer exist in the post-operation state. Pending entries are checked before stored entries so that a user passing a wrong-shape array gets a call-site error rather than a confusing stale-stored error. ``AtomData._set`` now takes a required ``expected_frames: int`` kwarg and calls ``_check_2d_consistency(expected_frames, pending={key: arr})`` for 2-D writes. ``StructureScene`` passes ``expected_frames=len(self.frames)`` at every write site (``set_atom_data`` and the ``__init__`` atom_data loop). The scene-level ``_check_2d_atom_data_shape`` helper is deleted; its job is now done entirely inside ``_set`` via the container check. ``StructureScene._validate_for_render`` now calls ``self._atom_data._check_2d_consistency(len(self.frames))`` with no ``pending``, which validates the current stored state (no write is happening at render time). Three substantive user-visible effects: 1. Error messages name the key in every failure mode. A wrong-shape write reports ``atom_data['foo'] has N rows but M frames were expected``; a stale stored entry reports ``atom_data has stale 2-D entry 'foo' sized for N frames, but M frames were expected; call StructureScene.clear_2d_atom_data() to recover before reassigning``. The old scene-level check and container-level check had divergent wording; both now flow from one helper. 2. Reassigning a single 2-D entry in place with a new shape now succeeds when it is the only 2-D entry. The stored version is overridden by the pending entry and skipped during the walk, so ``_set("energy", new_4_row_array, expected_frames=4)`` on a container that previously held ``"energy"`` at ``(3, n_atoms)`` now works directly without requiring ``clear_2d_atom_data``. Multi-entry cases still require recovery because the walk fires on the other stale entry, naming the actually-stale key. 3. ``_set`` cannot be called without an ``expected_frames`` kwarg. A required keyword argument is type-system enforced, closing the "direct caller forgets the scene check" bug surface that existed when the scene-level ``_check_2d_atom_data_shape`` was the outer wrapper. Test updates: - Every ``_set`` call gains an ``expected_frames=`` kwarg. For 1-D tests any value works; the kwarg is required regardless to keep the internal API uniform. - ``test_reassigning_only_2d_to_new_shape_raises`` was already removed in the previous commit; no further action needed. - New ``test_in_place_reassign_same_key_new_shape_succeeds`` pins the single-entry reassignment case. - New ``test_in_place_reassign_fails_when_other_2d_stale`` pins the multi-entry case and verifies the error names the actually-stale key. - ``test_set_detects_stale_stored_entry`` and ``test_set_rejects_2d_wrong_new_shape`` already match the new error wording since they were added in the previous commit. - Render-time stale-2-D integration tests updated to match the new ``stale 2-D entry 'energy'`` wording.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Previously, a 2-D write with an unsupported dtype could trigger a stale-stored-entry error before the dtype error ever ran, because _check_2d_consistency was called from inside the ndim==2 branch and ran before the dtype whitelist check. Concrete failure: container holds "energy" at shape (3, N), scene is extended to 4 frames, user calls set_atom_data("new_key", complex_4N) -- the dtype is bad, so the write was never going to land, but the cross-entry walk fires first on the now-stale "energy" and the error reads "call clear_2d_atom_data() to recover before reassigning". The user duly destroys "energy", retries, and only then gets the dtype error they should have seen on the first call. Reorder _set so all input-validation checks (ndim, per-atom shape, dtype) run before the state-dependent cross-entry check. Input errors now take priority, and the state check runs only after the input is known to be valid. Add a regression test pinning the ordering: a bad-dtype 2-D array assigned alongside a stale stored entry raises the dtype error, not the stale error.

Docstring and documentation accuracy: - AtomData class docstring no longer claims "the container itself does not know about the scene's trajectory". Post-3629f6d every _set takes a required expected_frames kwarg and the container enforces consistency on every write; the previous claim was aspirational but false. Rewrite to describe both enforcement sites (write-time via _set, render-time via _validate_for_render as a backstop). - StructureScene.set_atom_data Raises: section adds the previously undocumented unsupported-dtype case and the stale-stored-other- key case introduced by the unified consistency walk (the error names a different key from the one being written and points at clear_2d_atom_data). - StructureScene.set_atom_data body prose replaces the stale "in addition to the cross-entry ... check performed by the underlying container" framing (which described the deleted two- check design) with an honest description of the single prospective-state walk. - StructureScene.atom_data property docstring stops softening the invariant to "expected to match"; instead names the two enforcement sites explicitly and mentions the single-entry in-place reassign behaviour. - StructureScene._validate_for_render docstring trimmed from 16 lines to 2: a backstop helper does not need elaborate prose explaining what a backstop is. Naming: - AtomData.clear_2d -> AtomData._clear_2d. Only non-underscored mutation method on a class that says "do not instantiate directly"; renders as a public method in the API reference and provides a back door around the scene's single-write API. The scene's public clear_2d_atom_data is unaffected. - docs/api.rst autoclass directive now lists only the public read surface (``:members: n_atoms, ranges, labels``) instead of ``:members:`` wildcard, so internal ``_set`` / ``_del`` / ``_clear_2d`` do not leak into the rendered reference. Changelog: - 0.19.0 mutation-surface bullet now accurately describes the TypeError (for ``scene.atom_data[key] = arr`` and ``del scene.atom_data[key]``, from Python's STORE_SUBSCR / DELETE_SUBSCR bytecode) vs AttributeError (for ``pop``, ``popitem``, ``setdefault``, ``update``, ``clear``, which simply do not exist on a ``Mapping``). The earlier phrasing ("raises at the Python protocol level") was deliberately vague and one reviewer's suggested "all raise AttributeError" rewrite would have been wrong. - 0.19.0 adds a bullet documenting the single-entry in-place reassign shortcut introduced in 3629f6d. Previously the changelog taught clear_2d_atom_data as the unconditional recovery idiom; a user following it would have destroyed state they could have kept. - 0.19.0 reframes the "set_atom_data now performs an explicit check" bullet to reflect the single unified walk (the stale shape[0] check wording referenced a deleted scene-level helper). - 0.17.0 "update values by building a new array and reassigning the key" how-to would now raise AttributeError. Rewrite to point at scene.set_atom_data. docs/colouring.rst: - Document the single-entry in-place reassign shortcut explicitly, alongside the multi-entry clear_2d_atom_data recovery workflow. Tests: - Move test_set_bad_dtype_fires_before_stale_stored_check into the atom_data test suite from the earlier commit where it was added directly after the reorder. (Already landed.) - New container-level tests: test_set_2d_nested_list_coerced (2-D nested-list input), test_del_2d_preserves_constraint_when_others_remain (delete of one 2-D entry does not release the constraint when others remain), test_atom_data_not_importable_from_top_level (the re-export contract advertised in the changelog), test_bracket_del_raises (symmetric with test_bracket_assignment_raises). - Parametrised test_mutable_mapping_methods_absent covering the seven mutation entry points individually, as promised (but previously undelivered) by test_inherits_from_mapping_not_mutable_mapping. - test_set_detects_stale_stored_entry now pins the recovery-advice tail ("clear_2d_atom_data") via explicit substring assertions so a future reword cannot silently drop the user-actionable part of the error message. - New scene-level tests: test_render_mpl_succeeds_after_in_place_reassign (end-to-end single-entry workflow) and test_in_place_reassign_names_other_stale_key (the error names the actually-stale entry in the multi-entry case). These pin the main behavioural improvement of 3629f6d at the user-facing layer, not just the container layer.

Two small fixes on top of the previous review-findings commit: 1. Stale-stored error message: rephrase the recovery hint from "call StructureScene.clear_2d_atom_data() to recover" to "call scene.clear_2d_atom_data() to recover". The class-qualified form reads as a pseudo-classmethod call that users cannot literally type -- the method is an instance method and users call it via their scene variable (canonically named "scene" in the user guide, changelog, and README examples). The substring-based test assertions on the error message still pass because they match on "clear_2d_atom_data" alone. 2. TestValidateForRender previously had a "# --- Unit test directly on the helper ---" section header with no test under it, left over from an earlier refactor. Add test_validate_for_render_raises_on_stale_2d that calls the helper directly, bypassing the render stack. Faster and narrower than the end-to-end integration tests, and guards against a refactor that reshuffles how render_* call the helper without changing the helper itself.

The method's docstring still taught the unconditional ``frames.append`` -> ``clear_2d_atom_data`` -> ``set_atom_data`` -> ``render_mpl`` workflow, which is only accurate for multi-entry scenes. After 3629f6d, a single 2-D entry can be reassigned at a new shape directly via ``set_atom_data`` without clearing first, because the stored version is treated as overridden by the pending write. Three other documentation sites (``set_atom_data`` docstring, ``docs/colouring.rst``, the 0.19.0 changelog entry) already teach the nuanced story; this docstring was the last stale site. Rewrite it to match: state that the method is *required* for multi-entry recovery, note that single-entry reassignment can go through ``set_atom_data`` directly, and show the multi-entry workflow example explicitly (two ``set_atom_data`` calls after the clear, not one).

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

bjmorgan added 14 commits April 11, 2026 13:59

Privatise _AtomData and drop from public API

a01eb40

Fix Task 1 review issues: docstrings, user-doc phrasing, base-class test

7903195

Decouple _AtomData from frames and enforce cross-entry constraint

70f9700

Fix Task 2 review issues: clearer error phrasing and _set docstring

7b951ee

Add _AtomData.clear_2d helper for per-frame recovery

7f9a541

Tighten scene atom_data write API to three methods

2dff5cd

Fix Task 4 review issues: prune duplicate tests and docstring polish

cfe47af

Validate atom_data shape at scene render entry points

d20344d

Document privatised _AtomData and new scene write methods

ac84c96

Address deferred review nits from Tasks 5 and 6

1eb9053

Address deferred review nits: helper consolidation, comments, docstri…

1671a13

…ng refs

Revert cargo-cult comments; silence NaN and AtomStyle warnings

c152988

Prune redundant tests from atom_data refactor

73392ad

Prune helper unit test covered by render integration tests

2587fe4

Copilot AI review requested due to automatic review settings April 11, 2026 16:24

Copilot started reviewing on behalf of bjmorgan April 11, 2026 16:24 View session

This comment was marked as resolved.

Sign in to view

bjmorgan changed the title ~~Privatise _AtomData and decouple container from frames~~ Decouple AtomData container from frames and drop top-level re-exports Apr 11, 2026

Copilot AI review requested due to automatic review settings April 11, 2026 17:41

Copilot started reviewing on behalf of bjmorgan April 11, 2026 17:41 View session

Copilot AI reviewed Apr 11, 2026

View reviewed changes

Comment thread src/hofmann/model/atom_data.py Outdated

Comment thread src/hofmann/model/atom_data.py Outdated

Comment thread src/hofmann/model/structure_scene.py Outdated

bjmorgan added 2 commits April 11, 2026 18:55

Copilot AI review requested due to automatic review settings April 11, 2026 21:02

Copilot started reviewing on behalf of bjmorgan April 11, 2026 21:03 View session

bjmorgan changed the title ~~Decouple AtomData container from frames and drop top-level re-exports~~ Decouple AtomData from frames, unify 2-D shape validation, drop top-level re-exports Apr 11, 2026

This comment was marked as resolved.

Sign in to view

Copilot AI review requested due to automatic review settings April 11, 2026 21:17

Copilot started reviewing on behalf of bjmorgan April 11, 2026 21:17 View session

Copilot AI reviewed Apr 11, 2026

View reviewed changes

bjmorgan merged commit 2744adb into main Apr 11, 2026
9 checks passed

bjmorgan deleted the privatise-atom-data branch April 11, 2026 21:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decouple AtomData from frames, unify 2-D shape validation, drop top-level re-exports#71

Decouple AtomData from frames, unify 2-D shape validation, drop top-level re-exports#71
bjmorgan merged 20 commits intomainfrom
privatise-atom-data

bjmorgan commented Apr 11, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

bjmorgan commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Container changes

Unified 2-D shape validation

Single-entry in-place reassign

Scene write API

Render validation

Documentation

Breaking API changes

Uh oh!

This comment was marked as resolved.

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bjmorgan commented Apr 11, 2026 •

edited

Loading