My patches may2026#25
Open
ingcoder wants to merge 3 commits into
- martini.py: set cter=none when chains already end in NME (fixes ~20/45 martinize2 failures)
- martini.py: pass cnt_model not model_id for merge-file lookup (fixes trivalent crosslink detection)
- amber.py: match 'Protein' or 'Protein_chain_A' for GROMACS version compat
- colbuilder.py: defer stage imports to avoid startup import errors
- pyproject.toml: numpy>=1.26 (Python 3.12), vermouth==0.9.6 (match reference run)
… of model_id to processed_models
Problem
Running colbuilder with PYD (trivalent) crosslinks in geometry mode produced two categories of failures:
~20/45 martinize2 calls failed — the cap_pdb() function detected NME-capped chain ends but still passed -cter NME to martinize2, asking it to cap chains that were already capped. This caused martinize2 to error on those models.
Trivalent crosslinks were not detected — itp_.make_topology() was called with model_id when it needed cnt_model. The crosslink lookup reads the merged CG PDB as {cnt_model}.merge.pdb, so passing the wrong counter meant the merge file was never found and crosslink bonds were silently skipped.
Additionally, the final system .top listed #include "col_N.itp" for all N from 0 to total model count, even when some models failed or were skipped, causing GROMACS to fail on missing include files.
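The include problem described above can be illustrated with a minimal sketch. It assumes the per-model topologies are written as col_N.itp files in one working directory. The helper name `existing_includes` and its arguments are hypothetical and are not taken from colbuilder's code.

```python
from pathlib import Path


def existing_includes(workdir, n_models):
    """Return #include lines only for col_N.itp files present on disk.

    Hypothetical helper for illustration; the real
    write_system_topology() may be structured differently.
    """
    lines = []
    for n in range(n_models):
        itp = Path(workdir) / f"col_{n}.itp"
        # Skip models that failed or were skipped and left no file behind.
        if itp.is_file():
            lines.append(f'#include "{itp.name}"')
    return lines
```

Checking for the file itself is only one way to decide what to include; tracking the counters of successfully written models, as the Changes section describes, reaches the same result without touching the filesystem.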
Changes
martini.py
cap_pdb(): if all chains already end in NME, set cter=none instead of cter=NME. Prevents double-capping.
cap_pdb(): restore proper N-terminus detection (ACE, GLN, or ACE by default) instead of always setting it to none.
itp_.make_topology(): pass cnt_model (output file counter) instead of model_id. Fixes crosslink detection from merge file.
write_system_topology(): only #include topology files that were actually written, using the output counter, not the fibril model ID. Prevents GROMACS missing-include errors when models are skipped.
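The C-terminus decision in cap_pdb() could look roughly like the sketch below. It assumes standard fixed-column PDB records and is not the actual colbuilder implementation; the function names `last_residue_per_chain` and `choose_cter` are hypothetical.

```python
def last_residue_per_chain(pdb_lines):
    """Map chain ID -> name of the last residue seen in ATOM/HETATM records."""
    last = {}
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            resname = line[17:20].strip()  # PDB columns 18-20
            chain = line[21]               # PDB column 22
            last[chain] = resname
    return last


def choose_cter(pdb_lines):
    """Return 'none' if every chain already ends in NME, else 'NME'.

    The returned value is what would be handed to martinize2 as the
    C-terminal modification, per the PR description.
    """
    last = last_residue_per_chain(pdb_lines)
    if last and all(name == "NME" for name in last.values()):
        return "none"  # chains are already capped; avoid double-capping
    return "NME"
```

An input with no ATOM/HETATM records falls back to "NME" in this sketch; whether that is the right default for colbuilder is an assumption.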
amber.py
write_itp(): match both Protein and Protein_chain_A when extracting the topology block from GROMACS output. GROMACS renamed this in version 2023+.
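A minimal sketch of name matching that tolerates both spellings is shown below. It assumes the name appears as the first field of the entry under a `[ moleculetype ]` directive; the function name `find_protein_block_start` is hypothetical and the real write_itp() may locate the block differently.

```python
import re

# Accept the moleculetype name with or without the chain suffix.
PROTEIN_NAME = re.compile(r"^Protein(_chain_A)?$")


def find_protein_block_start(top_lines):
    """Return the index of the moleculetype entry naming the protein, or None."""
    in_moleculetype = False
    for i, line in enumerate(top_lines):
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            continue  # skip blank lines and comments
        if stripped.startswith("["):
            # Track whether the current directive is [ moleculetype ].
            in_moleculetype = stripped.replace(" ", "") == "[moleculetype]"
            continue
        if in_moleculetype:
            name = stripped.split()[0]
            if PROTEIN_NAME.match(name):
                return i
            in_moleculetype = False  # some other molecule; keep scanning
    return None
```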
colbuilder.py
Move stage imports (build_sequence, build_geometry_anywhere, build_topology) from module level into the functions that use them. Prevents import errors at startup when optional dependencies are missing.
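The deferred-import pattern can be sketched as follows. The stage names and the `run_stage` dispatcher are hypothetical; `math.sqrt` stands in for a real stage function only so the example runs without colbuilder installed, and `_not_installed_dependency_` is a deliberately missing module.

```python
def run_stage(name, *args):
    """Run one pipeline stage, importing its module only at call time."""
    if name == "demo":
        # Stand-in for a stage import such as build_sequence.
        from math import sqrt as stage_func
    elif name == "optional":
        # A missing optional dependency fails here, when the stage is
        # requested, rather than when the top-level module is imported.
        from _not_installed_dependency_ import stage_func
    else:
        raise ValueError(f"unknown stage: {name}")
    return stage_func(*args)
```

With this layout, importing the module that defines `run_stage` always succeeds, and an ImportError surfaces only for the stage whose dependency is absent.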
pyproject.toml
Bump numpy to >=1.26 (required for Python 3.12).
Pin vermouth==0.9.6 to match the reference run.
Testing
Verified against a full Rattus norvegicus fibril build with N_PYD_C_PYD trivalent crosslinks:
martinize2 failure rate dropped from ~20/45 to 0
Trivalent PYD crosslink bonds, angles and dihedrals present in all col_N.itp files
Final system .top includes only successfully built topology files