neurosnap.structure package#
Public structure package exports.
- class neurosnap.structure.Atom(x, y, z, chain_id, res_id, ins_code, res_name, hetero, atom_name, element, annotations=<factory>)[source]#
Bases: object
Immutable atom-level hierarchy view.
- class neurosnap.structure.Chain(chain_id, _residues)[source]#
Bases: object
Immutable chain-level hierarchy view.
A Chain is a read-only hierarchy view over the residues associated with one chain identifier in a single Structure. It provides chain-level traversal plus convenience helpers for sequence extraction and simple residue-number gap detection.
- chain_id#
Chain identifier represented by this view.
- __getitem__(res_id)[source]#
Return a residue view by residue ID, not by positional index.
- Parameters:
res_id (int) – Residue sequence number to retrieve.
- Return type:
- Returns:
The first Residue in this chain with the requested residue ID.
- Raises:
Notes
This method looks up residues by their residue ID rather than by list position. If multiple residues share the same residue ID, such as inserted residues distinguished by insertion codes, the first matching residue is returned and a warning is emitted.
- missing_residue_ids()[source]#
Return missing residue numbers inferred from gaps in the chain.
Hetero residues are ignored so ligand or solvent numbering does not create artificial gaps in the polymer residue sequence.
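The gap inference described above can be illustrated with a minimal sketch. It is not the library's implementation: the function name `infer_missing_ids` and the `(res_id, hetero)` tuple layout are assumptions made for illustration only.

```python
def infer_missing_ids(residues):
    """Return residue numbers absent between the lowest and highest polymer residue.

    `residues` is an iterable of (res_id, hetero) tuples, a hypothetical
    stand-in for the residue views held by a chain.
    """
    # Hetero residues (ligands, solvent) are dropped first so their numbering
    # cannot create artificial gaps.
    polymer_ids = sorted({res_id for res_id, hetero in residues if not hetero})
    if not polymer_ids:
        return []
    present = set(polymer_ids)
    first, last = polymer_ids[0], polymer_ids[-1]
    return [i for i in range(first, last + 1) if i not in present]


# Residues 3 and 4 are absent; the hetero residue numbered 900 is ignored.
residues = [(1, False), (2, False), (5, False), (6, False), (900, True)]
missing = infer_missing_ids(residues)
print(missing)  # [3, 4]
```

Because the sketch only compares residue numbers, residues missing before the first or after the last observed residue cannot be detected this way.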
- sequence(polymer_type='auto', include_modifications=False, modification_mode='inline', on_unknown_modified='raise')[source]#
Return the polymer sequence for this chain.
Protein, DNA, and RNA sequences are supported. Small molecules and other non-polymer residues in the chain are ignored. Modified residues can either be skipped, emitted inline as (CCD), or mapped to their parent sequence code when available.
- Parameters:
polymer_type (Literal['auto', 'protein', 'dna', 'rna', 'nucleotide']) – Polymer family to extract. "auto" infers the family from the chain contents. "nucleotide" accepts either DNA or RNA, but raises if both are present.
include_modifications (bool) – Whether modified residues should contribute to the sequence. If False, modified residues are skipped entirely.
modification_mode (Literal['inline', 'parent']) – How included modifications are emitted. "inline" inserts (CCD) tokens, while "parent" uses the inferred parent residue code.
on_unknown_modified (Literal['raise', 'unknown']) – Behavior when modification_mode="parent" is requested but no parent code can be inferred. "raise" raises a ValueError; "unknown" inserts "X".
- Return type:
- Returns:
Sequence string for the selected polymer family. Returns an empty string if the chain contains no residues from the requested polymer family.
- Raises:
ValueError – If the chain mixes polymer families in a way that conflicts with polymer_type or if an unknown modified residue cannot be mapped in "parent" mode.
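How include_modifications, modification_mode, and on_unknown_modified interact can be illustrated with a small stand-alone sketch. The lookup tables and the `build_sequence` helper are hypothetical, and the sketch treats every residue name outside its tiny table as "modified", which is a simplification rather than the library's classification logic.

```python
# Illustrative lookup tables only; the library's own tables are not shown here.
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "SER": "S", "MET": "M"}
PARENT_OF = {"MSE": "MET", "SEP": "SER"}  # modified residue -> parent residue


def build_sequence(res_names, include_modifications=False,
                   modification_mode="inline", on_unknown_modified="raise"):
    """Assemble a one-letter sequence from residue names (hypothetical helper)."""
    tokens = []
    for name in res_names:
        if name in THREE_TO_ONE:
            tokens.append(THREE_TO_ONE[name])
        elif not include_modifications:
            continue  # modified residues are skipped entirely
        elif modification_mode == "inline":
            tokens.append(f"({name})")  # emit the CCD code in parentheses
        elif name in PARENT_OF:
            tokens.append(THREE_TO_ONE[PARENT_OF[name]])
        elif on_unknown_modified == "unknown":
            tokens.append("X")
        else:
            raise ValueError(f"no parent residue known for {name!r}")
    return "".join(tokens)


names = ["ALA", "MSE", "GLY", "XYZ"]
skipped = build_sequence(names)
inline = build_sequence(names, include_modifications=True)
parent = build_sequence(names, include_modifications=True,
                        modification_mode="parent", on_unknown_modified="unknown")
print(skipped, inline, parent)  # AG A(MSE)G(XYZ) AMGX
```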
- class neurosnap.structure.Residue(chain_id, res_id, ins_code, res_name, hetero, _atoms, _atom_indices)[source]#
Bases: object
Immutable residue-level hierarchy view.
A Residue groups atoms that share the same chain identifier, residue number, insertion code, residue name, and hetero flag. The object is a lightweight read-only view over the parsed atom table, intended for traversal and analysis rather than in-place editing.
- chain_id#
Chain identifier containing the residue.
- res_id#
Residue sequence number.
- ins_code#
PDB insertion code for the residue.
- res_name#
Residue name / CCD code.
- hetero#
True for heterogens and False for polymer ATOM records.
- class neurosnap.structure.Structure(remove_annotations=True)[source]#
Bases: object
Single-model molecular structure container.
Coordinates are stored separately from per-atom annotations so geometry-heavy operations can work on compact numeric arrays while annotation schemas remain flexible.
- Parameters:
remove_annotations (bool) – If True, optional annotation columns that only contain default values are removed after initialization.
- add_annotation(name, dtype, values=None, *, fill_value=None, overwrite=False)[source]#
Add a new per-atom annotation column.
- Parameters:
name (str) – Annotation name to add.
dtype (Any) – NumPy-compatible scalar dtype for the annotation values.
values (Any) – Optional per-atom values for the annotation.
fill_value (Any) – Optional default value used when values is not supplied.
overwrite (bool) – Whether to replace an existing optional annotation of the same name.
- Raises:
ValueError – If the name is invalid, reserved, already present, or the supplied values do not match the atom count.
TypeError – If the supplied dtype is not a scalar per-atom dtype.
- calculate_center_of_mass(chains=None)[source]#
Calculate the center of mass for the selected atoms.
- Parameters:
chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
- Return type:
- Returns:
A length-3 NumPy array containing the center of mass in Å.
- Raises:
ValueError – If no atoms are found in the selected structure or if any selected atom has an unknown element mass.
- calculate_geometric_center(chains=None)[source]#
Calculate the geometric center for the selected atoms.
- Parameters:
chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
- Return type:
- Returns:
A length-3 NumPy array containing the arithmetic mean of the selected atom coordinates in Å.
- Raises:
ValueError – If no atoms are found in the selected structure.
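The difference between the geometric center and the center of mass can be shown with a short pure-Python sketch. The mass table and both function names are assumptions for illustration; the library works on its own coordinate arrays and element data.

```python
# Illustrative subset of atomic masses in daltons (not the library's table).
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "O": 15.999}


def geometric_center(coords):
    """Arithmetic mean of the coordinates."""
    if not coords:
        raise ValueError("no atoms selected")
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def center_of_mass(coords, elements):
    """Mass-weighted mean of the coordinates."""
    if not coords:
        raise ValueError("no atoms selected")
    try:
        masses = [ATOMIC_MASS[e] for e in elements]
    except KeyError as exc:
        raise ValueError(f"unknown element mass: {exc.args[0]}") from None
    total = sum(masses)
    return tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)
    )


coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
elements = ["C", "O"]
geo = geometric_center(coords)
com = center_of_mass(coords, elements)
print(geo, com)
```

For this two-atom example the geometric center lies at the midpoint, while the center of mass is pulled toward the heavier oxygen atom.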
- calculate_rog(chains=None, center=None)[source]#
Calculate the radius of gyration for the selected atoms.
- center_at(x=0.0, y=0.0, z=0.0, chains=None)[source]#
Translate selected atoms so their center of mass matches a target point.
- distances_from(point, chains=None)[source]#
Calculate distances from a point for the selected atoms.
- remove_annotation(name)[source]#
Remove a non-mandatory annotation column and return its values.
- Parameters:
name (str) – Annotation name to remove.
- Returns:
Copy of the removed annotation values.
- Raises:
KeyError – If the annotation does not exist.
ValueError – If the name is invalid or refers to a mandatory annotation.
- renumber(chain=None, start=1)[source]#
Renumber residues in-place.
- Parameters:
Notes
Renumbering treats inserted residues as ordinary sequential residues and clears their insertion codes. For example, residues 10, 10A, and 10B become 1, 2, and 3 (with empty insertion codes) when renumbered with start=1.
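The effect on inserted residues can be reproduced with a toy sketch. The dict-based residue records and the `renumber_in_place` helper are hypothetical and only mirror the behavior described in the note.

```python
def renumber_in_place(residues, start=1):
    """Assign consecutive residue numbers and clear insertion codes.

    `residues` is a list of dicts in structure order (hypothetical layout).
    """
    for offset, residue in enumerate(residues):
        residue["res_id"] = start + offset
        residue["ins_code"] = ""


# Residues 10, 10A, and 10B from the example in the note.
residues = [
    {"res_id": 10, "ins_code": ""},
    {"res_id": 10, "ins_code": "A"},
    {"res_id": 10, "ins_code": "B"},
]
renumber_in_place(residues, start=1)
print([(r["res_id"], r["ins_code"]) for r in residues])
# [(1, ''), (2, ''), (3, '')]
```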
- to_dataframe()[source]#
Export the structure as a pandas dataframe.
This dataframe is derived on demand from the current atom table and is never cached on the structure.
- Return type:
DataFrame
- class neurosnap.structure.StructureEnsemble(models=None, *, model_ids=None, metadata=None)[source]#
Bases: object
Ordered collection of independent Structure models.
Unlike StructureStack, models in an ensemble do not need to have the same atoms, annotations, or bonds.
- Parameters:
- __getitem__(index)[source]#
Return a model by model ID or a sliced sub-ensemble by position.
Integer access uses model_id lookup rather than positional indexing, so ensemble[5] returns the model whose ID is 5. Slice access keeps normal positional semantics to preserve standard Python iteration and slicing behavior.
- Raises:
KeyError – If an integer model ID is requested but not present.
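The split between ID-based integer access and positional slicing can be sketched with a toy container. `ModelCollection` is a hypothetical class written for this illustration; it stores plain strings instead of Structure objects and is not the library's implementation.

```python
class ModelCollection:
    """Toy container: integer keys are model IDs, slices are positional."""

    def __init__(self, models, model_ids):
        if len(models) != len(model_ids):
            raise ValueError("models and model_ids must have the same length")
        self._models = list(models)
        self._ids = list(model_ids)

    def __len__(self):
        return len(self._models)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices keep ordinary positional semantics and return a sub-collection.
            return ModelCollection(self._models[index], self._ids[index])
        try:
            position = self._ids.index(index)  # look up by model ID
        except ValueError:
            raise KeyError(index) from None
        return self._models[position]


ens = ModelCollection(["model_a", "model_b", "model_c"], model_ids=[5, 7, 9])
by_id = ens[5]      # the model whose ID is 5, not the sixth model
first_two = ens[0:2]  # positional slice
print(by_id, len(first_two), first_two[7])  # model_a 2 model_b
```

Note that `ens[0]` raises KeyError in this sketch because no model has ID 0, even though a model exists at position 0.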
- __init__(models=None, *, model_ids=None, metadata=None)[source]#
Initialize an ordered collection of independent models.
- first()[source]#
Return the first model in the ensemble.
- Return type:
- Returns:
The first Structure in stored order.
- Raises:
IndexError – If the ensemble is empty.
- renumber(start=1)[source]#
Renumber model identifiers in-place.
- Parameters:
start (int) – Starting model ID. Defaults to 1.
- to_dataframe()[source]#
Export the ensemble as a pandas dataframe with a model column.
This dataframe is derived on demand from the current models and is never cached on the ensemble.
- Return type:
DataFrame
- to_stack()[source]#
Convert the ensemble into a StructureStack.
- Raises:
ValueError – If the models are not stack-compatible.
- Return type:
- class neurosnap.structure.StructureStack(models=None, *, model_ids=None, metadata=None)[source]#
Bases: object
Shared-annotation, shared-bond multi-model fast path.
All models in a stack must share the same atom ordering, per-atom annotations, and bonds. Only the coordinates vary between models.
- Parameters:
- __getitem__(index)[source]#
Return a materialized model by model ID or a sliced sub-stack by position.
Integer access uses model_id lookup rather than positional indexing, so stack[5] returns the model whose ID is 5. Slice access keeps normal positional semantics to preserve standard Python slicing behavior.
- Raises:
KeyError – If an integer model ID is requested but not present.
- __init__(models=None, *, model_ids=None, metadata=None)[source]#
Initialize an empty or pre-populated stack of compatible models.
- append(model, *, model_id=None)[source]#
Append a stack-compatible model.
- Parameters:
- Raises:
ValueError – If the candidate model is not compatible with the existing stack.
- first()[source]#
Return the first model in the stack.
- Return type:
- Returns:
The first Structure in stored order.
- Raises:
IndexError – If the stack is empty.
- classmethod from_ensemble(ensemble)[source]#
Build a stack from an ensemble of compatible models.
- Return type:
- renumber(start=1)[source]#
Renumber model identifiers in-place.
- Parameters:
start (int) – Starting model ID. Defaults to 1.
- to_dataframe()[source]#
Export the stack as a pandas dataframe with a model column.
This dataframe is derived on demand from the current stack contents and is never cached on the stack.
- Return type:
DataFrame
- neurosnap.structure.align(reference, mobile, chains1=None, chains2=None)[source]#
Align a mobile structure onto a reference structure using polymer backbone atoms.
When both chains1 and chains2 are provided, they are interpreted as explicit pairwise chain mappings in matching order.
- Parameters:
- Returns:
None. The mobile structure is transformed in-place.
- neurosnap.structure.animate_frames(frames, output_fpath, *, title='', subtitles=None, interval=200, repeat=True, background_color=(255, 255, 255))[source]#
Animate a sequence of frames using Pillow only and write to disk.
- Parameters:
frames (Iterable[Union[Image, ndarray]]) – Iterable of frames to animate (Pillow Images or arrays convertible to images).
output_fpath (Union[str, Path]) – Path where the animation will be written; the format is inferred from the extension (gif, webp, mp4).
title (str) – Title text to display above the animation; omitted if empty.
subtitles (Optional[Iterable[str]]) – Iterable of subtitle strings, one per frame (must match the number of frames).
interval (int) – Delay between frames in milliseconds.
repeat (bool) – Whether the animation repeats when the sequence of frames is completed (loop=0 if True else 1 for gif/webp; ignored for mp4).
background_color (Tuple[int, int, int]) – RGB background color used for the entire canvas (including the title/subtitle band).
- neurosnap.structure.ca_distance_matrix(structure, chain=None)[source]#
Alias for calculate_distance_matrix().
- neurosnap.structure.calculate_bsa(structure, chain_group_1, chain_group_2, *, level='R')[source]#
Calculate buried surface area between two chain groups.
- The buried surface area (BSA) is computed as:
(SASA(group 1) + SASA(group 2)) - SASA(complex)
- Parameters:
- Return type:
- Returns:
Buried surface area in Ų.
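The BSA formula above amounts to simple arithmetic on three SASA values. The sketch below uses made-up SASA numbers and a hypothetical helper name; it does not call the library.

```python
def buried_surface_area(sasa_group_1, sasa_group_2, sasa_complex):
    """Apply (SASA(group 1) + SASA(group 2)) - SASA(complex), all in square angstroms."""
    return (sasa_group_1 + sasa_group_2) - sasa_complex


# Made-up values for illustration: each group measured alone, then together.
bsa = buried_surface_area(5200.0, 4800.0, 8600.0)
print(bsa)  # 1400.0
```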
- neurosnap.structure.calculate_distance_matrix(structure, chain=None)[source]#
Calculate the CA-atom distance matrix for a single structure.
- neurosnap.structure.calculate_hydrogen_bonds(structure, chain=None, chain_other=None, *, donor_acceptor_cutoff=3.5, angle_cutoff=120.0)[source]#
Count hydrogen bonds using explicit hydrogens and simple geometric cutoffs.
- Parameters:
chain (Optional[str]) – Optional donor-chain ID. When omitted, all chains are searched.
chain_other (Optional[str]) – Optional acceptor-chain ID for inter-chain counting.
donor_acceptor_cutoff (float) – Maximum donor-acceptor distance in Å.
angle_cutoff (float) – Minimum donor-H-acceptor angle in degrees.
- Return type:
- Returns:
Total number of hydrogen bonds that satisfy the geometric cutoffs.
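The two geometric cutoffs can be illustrated for a single donor, hydrogen, and acceptor triple. This is a sketch under stated assumptions: the helper names are hypothetical, the comparisons are taken to be inclusive, and how the library selects donor and acceptor atoms is not covered.

```python
import math


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees (assumes neither arm has zero length)."""
    v1 = [a[i] - vertex[i] for i in range(3)]
    v2 = [b[i] - vertex[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     donor_acceptor_cutoff=3.5, angle_cutoff=120.0):
    """Check the distance and angle criteria for one D-H...A triple."""
    close_enough = math.dist(donor, acceptor) <= donor_acceptor_cutoff
    straight_enough = angle_deg(donor, hydrogen, acceptor) >= angle_cutoff
    return close_enough and straight_enough


donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
linear = is_hydrogen_bond(donor, hydrogen, (2.9, 0.0, 0.0))  # 2.9 Å, 180 degrees
bent = is_hydrogen_bond(donor, hydrogen, (1.0, 1.9, 0.0))    # close, but 90 degrees
far = is_hydrogen_bond(donor, hydrogen, (4.5, 0.0, 0.0))     # 4.5 Å apart
print(linear, bent, far)  # True False False
```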
- neurosnap.structure.calculate_interface_hydrogen_bonding_residues(structure, chain=None, chain_other=None, *, donor_acceptor_cutoff=3.5, angle_cutoff=120.0)[source]#
Count unique residues that participate in inter- or intra-chain hydrogen bonds.
- Parameters:
chain (Optional[str]) – Optional donor-chain ID. When omitted, all chains are searched.
chain_other (Optional[str]) – Optional acceptor-chain ID for inter-chain counting.
donor_acceptor_cutoff (float) – Maximum donor-acceptor distance in Å.
angle_cutoff (float) – Minimum donor-H-acceptor angle in degrees.
- Return type:
- Returns:
Number of unique residues that participate in at least one qualifying hydrogen bond.
- neurosnap.structure.calculate_protein_volume(structure, chain=None)[source]#
Estimate protein volume from atom van der Waals spheres.
The calculation sums the volumes of van der Waals spheres for atoms belonging to residues classified as protein. It is therefore a simple geometric estimate rather than an excluded-volume or solvent-corrected measurement.
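Summing van der Waals sphere volumes reduces to adding 4/3·π·r³ per atom. The sketch below assumes Bondi radii for a few elements and a hypothetical helper name; the radii actually used by the library may differ.

```python
import math

# Bondi van der Waals radii in angstroms (assumed for this sketch).
VDW_RADIUS = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def vdw_sphere_volume(elements):
    """Sum of per-atom sphere volumes in cubic angstroms; overlaps are not subtracted."""
    return sum(4.0 / 3.0 * math.pi * VDW_RADIUS[e] ** 3 for e in elements)


# Backbone heavy atoms of one residue: N, CA, C, O.
volume = vdw_sphere_volume(["N", "C", "C", "O"])
print(round(volume, 2))
```

Because bonded atoms overlap, a plain sum counts the shared regions more than once, which is one reason the result is only a simple geometric estimate.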
- neurosnap.structure.calculate_rmsd(reference, mobile, chains1=None, chains2=None, align_structures=True)[source]#
Calculate backbone RMSD between two structures.
- Parameters:
- Return type:
- Returns:
Backbone RMSD in Å using the same residue/atom correspondence as align().
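Once two coordinate sets are paired atom by atom, the RMSD itself is the square root of the mean squared distance. The sketch below covers only that final step; the `rmsd` helper is hypothetical and performs no superposition or atom matching.

```python
import math


def rmsd(reference, mobile):
    """Root-mean-square deviation between two equally long, paired coordinate lists."""
    if not reference or len(reference) != len(mobile):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(reference, mobile))
    return math.sqrt(squared / len(reference))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom moved 1 Å along z
value = rmsd(reference, shifted)
print(value)  # 1.0
```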
- neurosnap.structure.calculate_surface_area(structure, level='R', probe_radius=1.4, n_sphere_points=96)[source]#
Estimate solvent-accessible surface area using a simple Shrake-Rupley approximation.
The returned total SASA is the same regardless of level; the parameter is kept for compatibility with the public surface-area API.
- Parameters:
level (str) – Compatibility flag matching the historical public API. The returned total SASA is always a structure-level scalar, regardless of this value. Must be one of "A", "R", "C", "M", or "S".
probe_radius (float) – Solvent probe radius in Å used to inflate atom radii during the accessibility calculation.
n_sphere_points (int) – Number of surface points sampled per atom for the Shrake-Rupley approximation.
- Return type:
- Returns:
Estimated solvent-accessible surface area in Ų.
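The roles of probe_radius and n_sphere_points can be seen in a compact Shrake-Rupley sketch. It is a brute-force illustration under stated assumptions: the golden-spiral point layout, the helper names, and the plain coordinate and radius lists are choices made here, not details taken from the library.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        ring = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((ring * math.cos(theta), ring * math.sin(theta), z))
    return points


def shrake_rupley(coords, radii, probe_radius=1.4, n_sphere_points=96):
    """Total SASA: each atom's exposed fraction times its inflated sphere area."""
    unit = sphere_points(n_sphere_points)
    inflated = [r + probe_radius for r in radii]  # atom radius plus probe
    total = 0.0
    for i, (center, radius) in enumerate(zip(coords, inflated)):
        exposed = 0
        for ux, uy, uz in unit:
            point = (center[0] + radius * ux,
                     center[1] + radius * uy,
                     center[2] + radius * uz)
            # A test point is buried if it falls inside any other inflated sphere.
            buried = any(
                j != i and math.dist(point, other) < other_radius
                for j, (other, other_radius) in enumerate(zip(coords, inflated))
            )
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * radius ** 2 * exposed / n_sphere_points
    return total


isolated = shrake_rupley([(0.0, 0.0, 0.0)], [1.7])
overlapping = shrake_rupley([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [1.7, 1.7])
print(round(isolated, 2), overlapping < 2 * isolated)
```

An isolated atom keeps all of its test points, so its SASA equals the full area of the inflated sphere; two bonded atoms shield part of each other and the total drops below twice that value. Raising n_sphere_points reduces sampling error at a proportional cost in run time.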
- neurosnap.structure.extract_non_biopolymers(structure, output_dir, min_atoms=0)[source]#
Extract non-biopolymer fragments from a structure and write them as SDF files.
Biopolymer residues are removed using the same residue-name logic as the old implementation: any residue present in AA_RECORDS or STANDARD_NUCLEOTIDES is treated as part of a protein or nucleotide polymer, except UNK, which is preserved. The remaining atoms are written to a temporary PDB file, read into RDKit, split into disconnected fragments, and then exported as individual SDF files.
- Parameters:
- Returns:
None. Matching fragments are written to output_dir as .sdf files.
- neurosnap.structure.find_contacts(atoms1, atoms2, cutoff=4.5)[source]#
Identify atom-atom contacts between two atom sets using a distance cutoff.
- neurosnap.structure.find_disulfide_bonds(structure, chain=None, threshold=2.05)[source]#
Find disulfide bonds between cysteine residues using SG-SG distance.
- Parameters:
- Return type:
- Returns:
List of (residue1, residue2) cysteine pairs that satisfy the distance cutoff.
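The SG-SG distance test can be sketched on a few labeled coordinates. The residue labels, the dict layout, and the `find_disulfides` helper are hypothetical, and the inclusive comparison against the threshold is an assumption.

```python
import math
from itertools import combinations


def find_disulfides(sg_atoms, threshold=2.05):
    """Return label pairs whose SG atoms lie within `threshold` angstroms.

    `sg_atoms` maps a residue label to its SG coordinate (hypothetical layout).
    """
    return [
        (label_a, label_b)
        for (label_a, pos_a), (label_b, pos_b) in combinations(sg_atoms.items(), 2)
        if math.dist(pos_a, pos_b) <= threshold
    ]


sg_atoms = {
    "A:CYS22": (0.0, 0.0, 0.0),
    "A:CYS65": (2.03, 0.0, 0.0),  # 2.03 Å from CYS22: bonded
    "A:CYS90": (10.0, 0.0, 0.0),  # too far from both
}
pairs = find_disulfides(sg_atoms)
print(pairs)  # [('A:CYS22', 'A:CYS65')]
```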
- neurosnap.structure.find_hydrophobic_residues(structure, chain=None)[source]#
Return hydrophobic residues from a single structure.
- neurosnap.structure.find_interface_contacts(structure, chain1, chain2, *, cutoff=4.5, hydrogens=True)[source]#
Identify atom-atom contacts between two chains using a distance cutoff.
- Parameters:
- Return type:
- Returns:
List of contacting (atom1, atom2) pairs.
- neurosnap.structure.find_interface_residues(structure, chain1, chain2, *, cutoff=4.5, hydrogens=True)[source]#
Identify unique residue-residue contacts between two chains.
Multiple atom-atom contacts between the same residue pair are collapsed into one output pair.
- Parameters:
- Return type:
- Returns:
List of unique contacting (residue1, residue2) pairs.
- neurosnap.structure.find_non_interface_hydrophobic_patches(structure, chain_pairs, target_chains=None, *, cutoff_interface=4.5, hydrogens=True, patch_cutoff=6.0, min_patch_area=40.0)[source]#
Identify solvent-exposed hydrophobic patches outside specified interfaces.
Hydrophobic residues are first filtered to remove interface residues and buried residues, then clustered by CA-CA proximity into connected components.
- Parameters:
chain_pairs (Iterable[Tuple[str, str]]) – Iterable of chain-ID pairs whose interfaces should be excluded from patch detection.
target_chains (Optional[Iterable[str]]) – Optional chain IDs to search for patches. If None, all chains are considered.
cutoff_interface (float) – Distance cutoff in Å used to classify interface contacts.
hydrogens (bool) – Whether hydrogen atoms should be included in the interface contact search.
patch_cutoff (float) – CA-CA distance cutoff in Å used to connect hydrophobic residues into the same patch.
min_patch_area (float) – Minimum summed SASA in Ų required for a connected component to be returned.
- Return type:
- Returns:
List of residue lists, where each list represents one hydrophobic patch.
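The clustering step, which links residues by CA-CA proximity and keeps components with enough summed SASA, can be sketched as a connected-components search. The residue labels, the input dicts, and the `cluster_patches` helper are hypothetical; the earlier filtering of interface and buried residues is assumed to have happened already.

```python
import math


def cluster_patches(ca_coords, sasa, patch_cutoff=6.0, min_patch_area=40.0):
    """Group residues into connected components and keep the large ones.

    `ca_coords` maps residue label -> CA coordinate and `sasa` maps
    residue label -> SASA in square angstroms (hypothetical layouts).
    """
    labels = list(ca_coords)
    visited = set()
    patches = []
    for seed in labels:
        if seed in visited:
            continue
        visited.add(seed)
        stack, component = [seed], []
        while stack:  # depth-first walk over the proximity graph
            current = stack.pop()
            component.append(current)
            for other in labels:
                if other in visited:
                    continue
                if math.dist(ca_coords[current], ca_coords[other]) <= patch_cutoff:
                    visited.add(other)
                    stack.append(other)
        if sum(sasa[label] for label in component) >= min_patch_area:
            patches.append(sorted(component))
    return patches


ca_coords = {
    "L10": (0.0, 0.0, 0.0),
    "V11": (5.0, 0.0, 0.0),
    "I12": (10.0, 0.0, 0.0),  # linked to L10 only through V11
    "F50": (30.0, 0.0, 0.0),  # isolated
}
sasa = {"L10": 20.0, "V11": 15.0, "I12": 25.0, "F50": 30.0}
patches = cluster_patches(ca_coords, sasa)
print(patches)  # [['I12', 'L10', 'V11']]
```

L10 and I12 are 10 Å apart yet end up in the same patch because both are within the cutoff of V11; the isolated F50 is dropped because its 30 Ų falls below min_patch_area.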
- neurosnap.structure.find_salt_bridges(structure, chain=None, cutoff=4.0)[source]#
Identify salt bridges using CA-CA distance as a simple proxy.
- Parameters:
- Return type:
- Returns:
List of (positive_residue, negative_residue) pairs that satisfy the distance cutoff.
- neurosnap.structure.fix_nucleic_termini(structure, *, strip_3prime=False, chain=None)[source]#
Normalize nucleotide phosphate names and strip terminal phosphate atoms.
- neurosnap.structure.get_backbone(structure, chains=None, *, include_nucleotides=True)[source]#
Extract ordered backbone coordinates from a single structure.
Protein residues contribute N, CA, and C atoms. When include_nucleotides is enabled, DNA and RNA residues contribute their sugar-phosphate backbone atoms in a deterministic order. Non-polymers are ignored.
- Parameters:
- Return type:
- Returns:
A NumPy array of backbone coordinates with shape (n_atoms, 3).
- neurosnap.structure.remove_atoms(structure, predicate, *, chain=None)[source]#
Remove atoms from a structure in-place when they match a predicate.
- neurosnap.structure.remove_chains(structure, predicate)[source]#
Remove chains from a structure in-place when they match a predicate.
- neurosnap.structure.remove_non_biopolymers(structure, *, chain=None)[source]#
Remove non-protein and non-nucleotide residues from a structure in-place.
- Parameters:
- Returns:
None. The input structure is modified in-place.
Notes
Hetero residues are always removed by this filter, even if their residue names (such as UNK) also appear in the amino-acid or nucleotide dictionaries.
- neurosnap.structure.remove_nucleotides(structure, *, chain=None)[source]#
Remove DNA and RNA residues from a structure in-place.
- neurosnap.structure.remove_waters(structure, *, chain=None)[source]#
Remove water residues from a structure in-place.
- neurosnap.structure.render_pseudo3D(segments, *, c=None, sizes=None, cmap='gist_rainbow', cmin=None, cmax=None, image_size=(800, 800), padding=20, line_w=2.0, shadow=0.95, background_color=(255, 255, 255), upsample=2, chainbreak=5)[source]#
Render a pseudo-3D projection of a protein using Pillow.
Adapted from an algorithm originally written by Dr. Sergey Ovchinnikov.
- Parameters:
segments (Iterable[Union[ndarray, DataFrame]]) – Iterable of XYZ coordinates, where each element is a segment/molecule to draw separately.
c (Optional[Iterable[ndarray]]) – Iterable of 1D arrays used to color the protein, aligned one-to-one with segments; defaults to residue index.
sizes (Optional[Iterable[ndarray]]) – Iterable of 1D arrays of radii/size values, aligned one-to-one with segments; interpreted in the same units as the coordinates.
cmap (str) – Color map name or callable used for coloring the protein.
cmin (Optional[float]) – Minimum value for coloring; calculated automatically if None.
cmax (Optional[float]) – Maximum value for coloring; calculated automatically if None.
image_size (Tuple[int, int]) – Final image size in pixels (width, height).
padding (int) – Padding in pixels around the drawing region.
line_w (float) – Line width (interpreted in data space; converted to pixels).
shadow (float) – Shadow intensity between 0 and 1 inclusive; lower values give darker, more intense shadows.
background_color (Tuple[int, int, int]) – Background RGB color.
upsample (int) – Factor by which to draw at higher resolution before downsampling for antialiasing.
chainbreak (int) – Minimum distance in Å between chains/segments before the gap is considered a chain break.
- Return type:
Image
- Returns:
Pillow Image containing the rendering
- neurosnap.structure.render_structure_pseudo3D(structure, *, style='residue_id', use_radii=False, image_size=(576, 432), padding=20, shadow=0.95, upsample=2, chainbreak=5)[source]#
Render a single structure using the pseudo-3D Pillow renderer.
- Parameters:
style (str) – Coloring mode (residue_id, chain_id, b-factor, pLDDT, residue_type).
use_radii (bool) – If True, apply van der Waals radii as per-atom sizes.
image_size (Tuple[int, int]) – Output image size (width, height).
padding (int) – Padding in pixels around the drawing region.
upsample (int) – Supersampling factor for antialiasing.
chainbreak (int) – Distance threshold for breaking segments.
shadow (float) – Shadow intensity between 0 and 1.
- Return type:
Image
- Returns:
Pillow Image containing the rendering
- neurosnap.structure.select_residues(structure, selectors, invert=False)[source]#
Select residues from a structure using a chain/residue selector string.
- Supported selector forms include:
"A" for an entire chain
"A10" or "A10-20" for compact single-character chain selectors
"AB:10" or "AB:10-20" for multi-character chain IDs
Submodules#
- neurosnap.structure.analysis module
- neurosnap.structure.compare module
- neurosnap.structure.filters module
- neurosnap.structure.interactions module
- neurosnap.structure.interface module
- neurosnap.structure.rendering module
- neurosnap.structure.selectors module
- neurosnap.structure.structure module
  - Atom
  - Chain
  - Residue
  - Structure: Structure.__getitem__(), Structure.__init__(), Structure.__iter__(), Structure.__len__(), Structure.__repr__(), Structure.add_annotation(), Structure.calculate_center_of_mass(), Structure.calculate_geometric_center(), Structure.calculate_rog(), Structure.center_at(), Structure.chain_ids(), Structure.chains(), Structure.distances_from(), Structure.remove_annotation(), Structure.renumber(), Structure.to_dataframe(), Structure.translate()
  - StructureEnsemble: StructureEnsemble.__getitem__(), StructureEnsemble.__init__(), StructureEnsemble.__iter__(), StructureEnsemble.__len__(), StructureEnsemble.__repr__(), StructureEnsemble.append(), StructureEnsemble.first(), StructureEnsemble.models(), StructureEnsemble.remove_model(), StructureEnsemble.renumber(), StructureEnsemble.to_dataframe(), StructureEnsemble.to_stack()
  - StructureStack: StructureStack.__getitem__(), StructureStack.__init__(), StructureStack.__iter__(), StructureStack.__len__(), StructureStack.__repr__(), StructureStack.append(), StructureStack.atom_count, StructureStack.first(), StructureStack.from_ensemble(), StructureStack.models(), StructureStack.remove_model(), StructureStack.renumber(), StructureStack.to_dataframe(), StructureStack.to_ensemble()