neurosnap.structure package#

Public structure package exports.

class neurosnap.structure.Atom(x, y, z, chain_id, res_id, ins_code, res_name, hetero, atom_name, element, annotations=<factory>)[source]#

Bases: object

Immutable atom-level hierarchy view.

annotations: Mapping[str, Any]#

atom_name: str#

chain_id: str#

property coord: ndarray#: Return the atom coordinates as a length-3 NumPy array.

element: str#

hetero: bool#

ins_code: str#

res_id: int#

res_name: str#

x: float#

y: float#

z: float#

class neurosnap.structure.Chain(chain_id, _residues)[source]#

Bases: object

Immutable chain-level hierarchy view.

A Chain is a read-only hierarchy view over the residues associated with one chain identifier in a single Structure. It provides chain-level traversal plus convenience helpers for sequence extraction and simple residue-number gap detection.

chain_id#: Chain identifier represented by this view.

__getitem__(res_id)[source]#

Return a residue view by residue ID, not by positional index.

Parameters:

res_id (int) – Residue sequence number to retrieve.

Return type:

Residue

Returns:

The first Residue in this chain with the requested residue ID.

Raises:

TypeError – If res_id is not an integer residue ID.
KeyError – If no residue with the requested ID is present in the chain.

Notes

This method looks up residues by their residue ID rather than by list position. If multiple residues share the same residue ID, such as inserted residues distinguished by insertion codes, the first matching residue is returned and a warning is emitted.
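The lookup-by-ID behavior described in the notes can be sketched without neurosnap itself. In the example below, `find_residue` and the `(res_id, ins_code, res_name)` tuples are hypothetical stand-ins for the chain and its residue views; only the documented semantics (first match wins, a warning on duplicates, `TypeError` and `KeyError` on bad input) are taken from the text above.

```python
import warnings


def find_residue(residues, res_id):
    """Return the first (res_id, ins_code, res_name) tuple matching res_id."""
    if not isinstance(res_id, int):
        raise TypeError("res_id must be an integer residue ID")
    matches = [res for res in residues if res[0] == res_id]
    if not matches:
        raise KeyError(res_id)
    if len(matches) > 1:
        # Insertion codes (10, 10A, ...) make a bare residue number ambiguous.
        warnings.warn(f"{len(matches)} residues share ID {res_id}; returning the first")
    return matches[0]


# Hypothetical chain that contains an inserted residue 10A.
chain = [(9, "", "GLY"), (10, "", "ALA"), (10, "A", "SER"), (11, "", "LYS")]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    hit = find_residue(chain, 10)  # the plain residue 10, plus one warning
```

Because the lookup is by residue number, `find_residue(chain, 0)` raises `KeyError` in this sketch rather than returning the first list element.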

__iter__()[source]#

Iterate over residues in residue order.

Return type: Iterator[Residue]

chain_id: str#

missing_residue_ids()[source]#

Return missing residue numbers inferred from gaps in the chain.

Hetero residues are ignored so ligand or solvent numbering does not create artificial gaps in the polymer residue sequence.

Return type: List[int]
Returns: Sorted list of integer residue IDs that are absent between observed non-hetero residue numbers.
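As a rough illustration of the gap inference described above, the following sketch works on plain `(res_id, hetero)` pairs rather than on a real Chain. The helper name `missing_ids` is hypothetical, and the actual implementation may differ in detail.

```python
def missing_ids(residues):
    """Return sorted residue numbers absent between observed polymer residues.

    residues: iterable of (res_id, hetero) pairs (stand-ins for residue views).
    """
    # Hetero residues are dropped first so ligand/solvent numbering is ignored.
    observed = sorted({res_id for res_id, hetero in residues if not hetero})
    if not observed:
        return []
    present = set(observed)
    return [i for i in range(observed[0], observed[-1] + 1) if i not in present]


# Polymer residues 1, 2, 5, 6 plus a ligand numbered 900.
example = [(1, False), (2, False), (5, False), (6, False), (900, True)]
gaps = missing_ids(example)
```

Here `gaps` contains only 3 and 4; the ligand at 900 does not create a gap from 7 to 899.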

residues()[source]#

Return the residues that belong to this chain.

Return type: List[Residue]
Returns: List of immutable Residue views in residue order.

sequence(polymer_type='auto', include_modifications=False, modification_mode='inline', on_unknown_modified='raise')[source]#

Return the polymer sequence for this chain.

Protein, DNA, and RNA sequences are supported. Small molecules and other non-polymer residues in the chain are ignored. Modified residues can either be skipped, emitted inline as (CCD), or mapped to their parent sequence code when available.

Parameters:

polymer_type (Literal['auto', 'protein', 'dna', 'rna', 'nucleotide']) – Polymer family to extract. "auto" infers the family from the chain contents. "nucleotide" accepts either DNA or RNA, but raises if both are present.
include_modifications (bool) – Whether modified residues should contribute to the sequence. If False, modified residues are skipped entirely.
modification_mode (Literal['inline', 'parent']) – How included modifications are emitted. "inline" inserts (CCD) tokens, while "parent" uses the inferred parent residue code.
on_unknown_modified (Literal['raise', 'unknown']) – Behavior when modification_mode="parent" is requested but no parent code can be inferred. "raise" raises a ValueError; "unknown" inserts "X".

Return type:

str

Returns:

Sequence string for the selected polymer family. Returns an empty string if the chain contains no residues from the requested polymer family.

Raises:

ValueError – If the chain mixes polymer families in a way that conflicts with polymer_type or if an unknown modified residue cannot be mapped in "parent" mode.
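The interaction between include_modifications, modification_mode, and on_unknown_modified can be illustrated with a small stand-alone sketch. It is not the library's implementation: `build_sequence`, the two lookup tables, and the made-up residue name `XYZ` are assumptions for the example, and every name missing from the standard table is simply treated as a modified residue.

```python
# Tiny illustrative tables; a real implementation would cover far more codes.
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "MET": "M", "SER": "S"}
PARENT_OF = {"MSE": "MET", "SEP": "SER"}  # modified residue -> parent residue


def build_sequence(res_names, include_modifications=False,
                   modification_mode="inline", on_unknown_modified="raise"):
    """Build a one-letter protein sequence from residue names (sketch only)."""
    out = []
    for name in res_names:
        if name in THREE_TO_ONE:
            out.append(THREE_TO_ONE[name])
        elif not include_modifications:
            continue  # modified residues are skipped entirely
        elif modification_mode == "inline":
            out.append(f"({name})")  # emit the residue code as an inline token
        elif name in PARENT_OF:
            out.append(THREE_TO_ONE[PARENT_OF[name]])
        elif on_unknown_modified == "unknown":
            out.append("X")
        else:
            raise ValueError(f"no parent code known for {name}")
    return "".join(out)


names = ["MET", "MSE", "GLY", "XYZ"]  # "XYZ" is a made-up modified residue
skipped = build_sequence(names)
inline = build_sequence(names, include_modifications=True)
parent = build_sequence(names, include_modifications=True,
                        modification_mode="parent", on_unknown_modified="unknown")
```

With these inputs the three calls give `"MG"`, `"M(MSE)G(XYZ)"`, and `"MMGX"` respectively; switching on_unknown_modified back to `"raise"` in the last call raises `ValueError` for `XYZ`.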

class neurosnap.structure.Residue(chain_id, res_id, ins_code, res_name, hetero, _atoms, _atom_indices)[source]#

Bases: object

Immutable residue-level hierarchy view.

A Residue groups atoms that share the same chain identifier, residue number, insertion code, residue name, and hetero flag. The object is a lightweight read-only view over the parsed atom table, intended for traversal and analysis rather than in-place editing.

chain_id#: Chain identifier containing the residue.

res_id#: Residue sequence number.

ins_code#: PDB insertion code for the residue.

res_name#: Residue name / CCD code.

hetero#: True for heterogens and False for polymer ATOM records.

__eq__(other)[source]#: Compare two residue views by stable identity.

__hash__()[source]#: Return a hash derived from key().

atom_indices()[source]#

Return atom-table indices for the atoms in this residue.

Return type: List[int]
Returns: List of integer atom indices in atom-table order.

atoms()[source]#

Return the atoms that belong to this residue.

Return type: List[Atom]
Returns: List of immutable Atom views in atom-table order.

chain_id: str#

hetero: bool#

ins_code: str#

key()[source]#

Return a stable residue identity tuple.

The returned key is suitable for dictionary/set membership when residue identity needs to be tracked outside the object itself.

Return type: Tuple[str, int, str, str, bool]
Returns: (chain_id, res_id, ins_code, res_name, hetero)
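The pattern of deriving equality and hashing from one identity tuple can be sketched as follows. `ResidueId` is a hypothetical stand-in written for this example; it mirrors the documented key layout but is not the library's Residue class.

```python
class ResidueId:
    """Hypothetical stand-in for a residue view with a stable identity key."""

    def __init__(self, chain_id, res_id, ins_code, res_name, hetero):
        self.chain_id = chain_id
        self.res_id = res_id
        self.ins_code = ins_code
        self.res_name = res_name
        self.hetero = hetero

    def key(self):
        return (self.chain_id, self.res_id, self.ins_code, self.res_name, self.hetero)

    def __eq__(self, other):
        if not isinstance(other, ResidueId):
            return NotImplemented
        return self.key() == other.key()

    def __hash__(self):
        # Hash and equality agree because both are derived from key().
        return hash(self.key())


ala_10 = ResidueId("A", 10, "", "ALA", False)
same_ala_10 = ResidueId("A", 10, "", "ALA", False)
ser_10a = ResidueId("A", 10, "A", "SER", False)

visited = {ala_10}  # set membership works across distinct view objects
```

Two separately constructed views of the same residue compare equal and land in the same set slot, while residue 10A stays distinct because its insertion code is part of the key.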

res_id: int#

res_name: str#

class neurosnap.structure.Structure(*, remove_annotations=True)[source]#

Bases: object

Single-model molecular structure container.

Coordinates are stored separately from per-atom annotations so geometry-heavy operations can work on compact numeric arrays while annotation schemas remain flexible.

Parameters: remove_annotations (bool) – If True, optional annotation columns that only contain default values are removed after initialization.

__getitem__(chain_id)[source]#

Return a chain view by chain ID.

Parameters:

chain_id (str) – Chain identifier to retrieve.

Return type:

Chain

Returns:

The matching Chain view.

Raises:

TypeError – If chain_id is not a string.
KeyError – If the requested chain is not present in the structure.

__init__(*, remove_annotations=True)[source]#: Initialize an empty single-model structure.

__iter__()[source]#

Iterate over chains in atom-table order.

Return type: Iterator[Chain]

__len__()[source]#

Return the number of atoms in the structure.

Return type: int

__repr__()[source]#

Return a compact string summary of the structure.

Return type: str

add_annotation(name, dtype, values=None, *, fill_value=None, overwrite=False)[source]#

Add a new per-atom annotation column.

Parameters:

name (str) – Annotation name to add.
dtype (Any) – NumPy-compatible scalar dtype for the annotation values.
values (Any) – Optional per-atom values for the annotation.
fill_value (Any) – Optional default value used when values is not supplied.
overwrite (bool) – Whether to replace an existing optional annotation of the same name.

Raises:

ValueError – If the name is invalid, reserved, already present, or the supplied values do not match the atom count.
TypeError – If the supplied dtype is not a scalar per-atom dtype.

calculate_center_of_mass(chains=None)[source]#

Calculate the center of mass for the selected atoms.

Parameters: chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
Return type: ndarray
Returns: A length-3 NumPy array containing the center of mass in Å.
Raises: ValueError – If no atoms are found in the selected structure or if any selected atom has an unknown element mass.

calculate_geometric_center(chains=None)[source]#

Calculate the geometric center for the selected atoms.

Parameters: chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
Return type: ndarray
Returns: A length-3 NumPy array containing the arithmetic mean of the selected atom coordinates in Å.
Raises: ValueError – If no atoms are found in the selected structure.
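The difference between the geometric center and the center of mass is easiest to see on a two-atom example. The sketch below uses plain tuples and a small hand-written mass table; `geometric_center`, `center_of_mass`, and `MASSES` are hypothetical names for this illustration, not part of neurosnap.

```python
# Approximate atomic masses in daltons for a few common elements.
MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def geometric_center(coords):
    """Unweighted arithmetic mean of the coordinates."""
    if not coords:
        raise ValueError("no atoms selected")
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def center_of_mass(coords, elements):
    """Mass-weighted mean of the coordinates."""
    if not coords:
        raise ValueError("no atoms selected")
    try:
        masses = [MASSES[element] for element in elements]
    except KeyError as exc:
        raise ValueError(f"unknown element mass: {exc.args[0]}") from None
    total = sum(masses)
    return tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)
    )


# A hydrogen at the origin and an oxygen 2 Å away along x.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
elements = ["H", "O"]
gc = geometric_center(coords)
com = center_of_mass(coords, elements)
```

The geometric center sits at the midpoint (x = 1.0), whereas the center of mass is pulled toward the heavier oxygen (x is roughly 1.88).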

calculate_rog(chains=None, center=None)[source]#

Calculate the radius of gyration for the selected atoms.

Parameters:

chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
center (Optional[ndarray]) – Optional reference point. If None, the center of mass is used.

Return type:

float

Returns:

Radius of gyration in Å.
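For orientation, the radius of gyration is the root-mean-square distance of the atoms from a reference point. The sketch below uses the unweighted form and falls back to the unweighted centroid when no center is given; both are assumptions made for the example, since the text above only states that the default center is the center of mass and does not say whether distances are mass-weighted. `radius_of_gyration` is a hypothetical helper.

```python
import math


def radius_of_gyration(coords, center=None):
    """Root-mean-square distance of coords from center (unweighted sketch)."""
    if not coords:
        raise ValueError("no atoms selected")
    if center is None:
        # Unweighted centroid used here as a stand-in for the center of mass.
        n = len(coords)
        center = tuple(sum(c[i] for c in coords) / n for i in range(3))
    squared = [sum((c[i] - center[i]) ** 2 for i in range(3)) for c in coords]
    return math.sqrt(sum(squared) / len(squared))


# Four points on the corners of a 2 Å x 2 Å square in the xy-plane.
square = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
rog_centered = radius_of_gyration(square)
rog_corner = radius_of_gyration(square, center=(1.0, 1.0, 0.0))
```

Measured from the centroid, every corner is sqrt(2) Å away, so the result is sqrt(2); measured from one corner, the value grows to 2.0, which shows how an explicit center changes the result.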

center_at(x=0.0, y=0.0, z=0.0, chains=None)[source]#

Translate selected atoms so their center of mass matches a target point.

Parameters:

x (float) – Target x-coordinate for the center of mass.
y (float) – Target y-coordinate for the center of mass.
z (float) – Target z-coordinate for the center of mass.
chains (Optional[List[str]]) – Optional chain IDs to center. If None, all atoms are used.

chain_ids()[source]#

Return all chain IDs found in the structure.

Return type: List[str]
Returns: List of chain ID strings, one per chain.

chains()[source]#

Return all chains in the structure as immutable hierarchy views.

Return type: List[Chain]
Returns: List of Chain objects in atom-table order.

distances_from(point, chains=None)[source]#

Calculate distances from a point for the selected atoms.

Parameters:

point (ndarray) – Reference point as an array-like object with shape (3,).
chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.

Return type:

ndarray

Returns:

A 1D NumPy array containing Euclidean distances in atom-table order.

remove_annotation(name)[source]#

Remove a non-mandatory annotation column and return its values.

Parameters:

name (str) – Annotation name to remove.

Returns:

Copy of the removed annotation values.

Raises:

KeyError – If the annotation does not exist.
ValueError – If the name is invalid or refers to a mandatory annotation.

renumber(chain=None, start=1)[source]#

Renumber residues in-place.

Parameters:

chain (Optional[str]) – Chain ID to renumber. If None, all chains are renumbered in chain order using one continuous counter.
start (int) – Starting residue number.

Notes

Renumbering treats inserted residues as ordinary sequential residues and clears their insertion codes. For example, residues 10, 10A, and 10B become 1, 2, and 3 (with empty insertion codes) when renumbered with start=1.
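The renumbering rule in the notes can be reproduced on plain `(res_id, ins_code)` pairs. The helper below, `renumber_pairs`, is a hypothetical illustration that returns a new list instead of editing a Structure in-place.

```python
def renumber_pairs(residues, start=1):
    """Assign consecutive residue numbers and clear insertion codes.

    residues: ordered list of (res_id, ins_code) pairs.
    """
    return [(start + offset, "") for offset, _ in enumerate(residues)]


# Residues 10, 10A, 10B, 11 in chain order.
old = [(10, ""), (10, "A"), (10, "B"), (11, "")]
new = renumber_pairs(old, start=1)
```

The inserted residues 10A and 10B each receive their own number, so the four residues become 1, 2, 3, and 4 with empty insertion codes.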

save_cif(cif, *, minimal=False)[source]#

Write the structure directly as an mmCIF file.

This is a convenience wrapper around neurosnap.io.mmcif.save_cif(). It preserves the current atom table and metadata exactly as stored on the structure.

Parameters:

cif – Output filepath or open writable file handle.
minimal (bool) – If True, emit compact atom-site-only mmCIF output. If False (default), include entity/polymer/subchain metadata.

save_pdb(pdb)[source]#

Write the structure directly as a PDB file.

This is a convenience wrapper around neurosnap.io.pdb.save_pdb(). It is especially useful after select(), because the selected structure can be exported without rebuilding a new container manually.

Parameters: pdb – Output filepath or open writable file handle.

select(*, chains=None, residues=None, predicate=None)[source]#

Return an independent atom-level subset of the structure.

The returned structure preserves the selected atoms exactly as parsed: coordinates, atom serials, residue identifiers, optional annotations, and any bonds whose endpoints remain in the subset. Bond indices are remapped onto the new atom table automatically so the subset can be exported directly with save_pdb() or save_cif().

Parameters:

chains (Optional[Sequence[str]]) – Optional chain IDs to keep. If None, atoms from all chains remain eligible for selection.
residues (Optional[Sequence[Union[int, Residue, Tuple[Any, ...]]]]) –
Optional residue selectors to keep. Supported selector forms are:
- integer residue IDs, matched across all selected chains
- Residue objects
- (chain_id, res_id) tuples
- (chain_id, res_id, ins_code) tuples
- full residue-key tuples (chain_id, res_id, ins_code, res_name, hetero)
predicate (Optional[Callable[[Atom], bool]]) – Optional atom-level predicate. When provided, each atom is exposed as an immutable Atom view and kept only if the predicate returns a truthy value.

Return type:

Structure

Returns:

A new Structure containing only atoms that satisfy every provided filter.

Raises:

ValueError – If a requested chain or residue selector is not present in the structure.
TypeError – If predicate is not callable or a residue selector has an unsupported type/shape.
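How the three filters combine (an atom is kept only if it passes every filter that was supplied) can be sketched on a list of simple records. `AtomRow`, `matches_residue`, and `select_atoms` are hypothetical names for this example; the sketch supports only integer, `(chain_id, res_id)`, and `(chain_id, res_id, ins_code)` selectors and checks only chain IDs for presence, so it covers a subset of the behavior documented above.

```python
from typing import NamedTuple


class AtomRow(NamedTuple):
    chain_id: str
    res_id: int
    ins_code: str
    atom_name: str


def matches_residue(atom, selector):
    """Check one atom against one residue selector."""
    if isinstance(selector, int):
        return atom.res_id == selector
    if isinstance(selector, tuple) and len(selector) == 2:
        return (atom.chain_id, atom.res_id) == selector
    if isinstance(selector, tuple) and len(selector) == 3:
        return (atom.chain_id, atom.res_id, atom.ins_code) == selector
    raise TypeError(f"unsupported residue selector: {selector!r}")


def select_atoms(atoms, *, chains=None, residues=None, predicate=None):
    """Keep atoms that satisfy every provided filter (logical AND)."""
    if predicate is not None and not callable(predicate):
        raise TypeError("predicate must be callable")
    if chains is not None:
        missing = set(chains) - {atom.chain_id for atom in atoms}
        if missing:
            raise ValueError(f"chains not present: {sorted(missing)}")
    kept = []
    for atom in atoms:
        if chains is not None and atom.chain_id not in chains:
            continue
        if residues is not None and not any(matches_residue(atom, s) for s in residues):
            continue
        if predicate is not None and not predicate(atom):
            continue
        kept.append(atom)
    return kept


atoms = [
    AtomRow("A", 1, "", "N"),
    AtomRow("A", 1, "", "CA"),
    AtomRow("A", 2, "", "CA"),
    AtomRow("B", 1, "", "CA"),
]
ca_of_a1 = select_atoms(atoms, chains=["A"], residues=[1],
                        predicate=lambda atom: atom.atom_name == "CA")
chain_b_res1 = select_atoms(atoms, residues=[("B", 1)])
```

Note that the integer selector `1` on its own would match residue 1 in both chains; it is the additional `chains=["A"]` filter that narrows the first call to a single atom.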

to_dataframe()[source]#

Export the structure as a pandas dataframe.

This dataframe is derived on demand from the current atom table and is never cached on the structure.

Return type: DataFrame

translate(x=0.0, y=0.0, z=0.0, chains=None)[source]#

Translate selected atoms in-place by a fixed vector.

Parameters:

x (float) – Translation along the x-axis.
y (float) – Translation along the y-axis.
z (float) – Translation along the z-axis.
chains (Optional[List[str]]) – Optional chain IDs to translate. If None, all atoms are translated.

class neurosnap.structure.StructureEnsemble(models=None, *, model_ids=None, metadata=None)[source]#

Bases: object

Ordered collection of independent Structure models.

Unlike StructureStack, models in an ensemble do not need to have the same atoms, annotations, or bonds.

Parameters:

models (Optional[List[Structure]]) – Optional initial list of models.
model_ids (Optional[List[int]]) – Optional identifiers corresponding to models.
metadata (Optional[Mapping[str, Any]]) – Optional ensemble-level metadata dictionary.

__getitem__(index)[source]#

Return a model by model ID or a sliced sub-ensemble by position.

Integer access uses model_id lookup rather than positional indexing, so ensemble[5] returns the model whose ID is 5. Slice access keeps normal positional semantics to preserve standard Python iteration and slicing behavior.

Raises: KeyError – If an integer model ID is requested but not present.
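The split between ID-based integer access and positional slicing is unusual enough to merit a small sketch. `TinyEnsemble` below is a hypothetical toy container that stores strings instead of Structure objects; it only mimics the indexing rules described above.

```python
class TinyEnsemble:
    """Toy container: integers look up model IDs, slices work by position."""

    def __init__(self, models, model_ids):
        if len(models) != len(model_ids):
            raise ValueError("models and model_ids must have the same length")
        self._models = list(models)
        self._ids = list(model_ids)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Positional semantics: slice models and their IDs together.
            return TinyEnsemble(self._models[index], self._ids[index])
        try:
            position = self._ids.index(index)
        except ValueError:
            raise KeyError(index) from None
        return self._models[position]

    def __iter__(self):
        # An explicit __iter__ avoids the legacy fallback that would call
        # __getitem__(0), __getitem__(1), ... with positional integers.
        return iter(self._models)

    def __len__(self):
        return len(self._models)


ens = TinyEnsemble(["model-a", "model-b", "model-c"], [5, 7, 9])
by_id = ens[5]      # model whose ID is 5, not the sixth model
tail = ens[1:]      # positional slice keeping IDs 7 and 9
```

In this sketch `ens[0]` raises `KeyError` because no model has ID 0, even though the container is not empty.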

__init__(models=None, *, model_ids=None, metadata=None)[source]#: Initialize an ordered collection of independent models.

__iter__()[source]#

Iterate over the stored models in order.

Return type: Iterator[Structure]

__len__()[source]#

Return the number of models in the ensemble.

Return type: int

__repr__()[source]#

Return a compact string summary of the ensemble.

Return type: str

append(model, *, model_id=None)[source]#

Append a validated model to the ensemble.

Parameters:

model (Structure) – Model to append.
model_id (Optional[int]) – Optional model identifier. Defaults to the next sequential model ID starting at 1.

first()[source]#

Return the first model in the ensemble.

Return type: Structure
Returns: The first Structure in stored order.
Raises: IndexError – If the ensemble is empty.

models()[source]#

Return the models as a shallow copied list.

Return type: List[Structure]

remove_model(model_id)[source]#

Remove and return a model by model ID.

Parameters: model_id (int) – Model identifier to remove.
Return type: Structure
Returns: The removed Structure.
Raises: KeyError – If the requested model ID is not present.

renumber(start=1)[source]#

Renumber model identifiers in-place.

Parameters: start (int) – Starting model ID. Defaults to 1.

save_cif(cif, *, minimal=False)[source]#

Write the ensemble directly as an mmCIF file.

Parameters:

cif – Output filepath or open writable file handle.
minimal (bool) – If True, emit compact atom-site-only mmCIF output. If False (default), include entity/polymer/subchain metadata.

save_pdb(pdb)[source]#

Write the ensemble directly as a PDB file.

Parameters: pdb – Output filepath or open writable file handle.

select(*, models=None, chains=None, residues=None, predicate=None)[source]#

Return a filtered ensemble of independently subsetted models.

Parameters:

models (Optional[Sequence[int]]) – Optional model IDs to keep. If None, all models are considered.
chains (Optional[Sequence[str]]) – Optional chain IDs to keep within each selected model.
residues (Optional[Sequence[Union[int, Residue, Tuple[Any, ...]]]]) – Optional residue selectors to keep within each selected model.
predicate (Optional[Callable[[Atom], bool]]) – Optional atom-level predicate applied independently inside each selected model.

Return type:

StructureEnsemble

Returns:

A new StructureEnsemble whose model IDs match the selected source models and whose per-model contents are the corresponding Structure.select() subsets.

Raises:

ValueError – If any requested model ID, chain, or residue selector is not present.
TypeError – If predicate is not callable or a residue selector is malformed.

to_dataframe()[source]#

Export the ensemble as a pandas dataframe with a model column.

This dataframe is derived on demand from the current models and is never cached on the ensemble.

Return type: DataFrame

to_stack()[source]#

Convert the ensemble into a StructureStack.

Raises: ValueError – If the models are not stack-compatible.
Return type: StructureStack

class neurosnap.structure.StructureStack(models=None, *, model_ids=None, metadata=None)[source]#

Bases: object

Shared-annotation, shared-bond multi-model fast path.

All models in a stack must share the same atom ordering, per-atom annotations, and bonds. Only the coordinates vary between models.

Parameters:

models (Optional[List[Structure]]) – Optional initial list of stack-compatible models.
model_ids (Optional[List[int]]) – Optional identifiers corresponding to models.
metadata (Optional[Mapping[str, Any]]) – Optional stack-level metadata dictionary.

__getitem__(index)[source]#

Return a materialized model by model ID or a sliced sub-stack by position.

Integer access uses model_id lookup rather than positional indexing, so stack[5] returns the model whose ID is 5. Slice access keeps normal positional semantics to preserve standard Python slicing behavior.

Raises: KeyError – If an integer model ID is requested but not present.

__init__(models=None, *, model_ids=None, metadata=None)[source]#: Initialize an empty or pre-populated stack of compatible models.

__iter__()[source]#

Iterate over the stack as materialized Structure models.

Return type: Iterator[Structure]

__len__()[source]#

Return the number of models in the stack.

Return type: int

__repr__()[source]#

Return a compact string summary of the stack.

Return type: str

append(model, *, model_id=None)[source]#

Append a stack-compatible model.

Parameters:

model (Structure) – Model to append.
model_id (Optional[int]) – Optional model identifier. Defaults to the next sequential model ID starting at 1.

Raises:

ValueError – If the candidate model is not compatible with the existing stack.

property atom_count: int#: Return the number of shared atoms per model.

first()[source]#

Return the first model in the stack.

Return type: Structure
Returns: The first Structure in stored order.
Raises: IndexError – If the stack is empty.

classmethod from_ensemble(ensemble)[source]#

Build a stack from an ensemble of compatible models.

Return type: StructureStack

models()[source]#

Materialize and return all models in the stack.

Return type: List[Structure]

remove_model(model_id)[source]#

Remove and return a model by model ID.

Parameters: model_id (int) – Model identifier to remove.
Return type: Structure
Returns: The removed Structure.
Raises: KeyError – If the requested model ID is not present.

renumber(start=1)[source]#

Renumber model identifiers in-place.

Parameters: start (int) – Starting model ID. Defaults to 1.

save_cif(cif, *, minimal=False)[source]#

Write the stack directly as an mmCIF file.

Parameters:

cif – Output filepath or open writable file handle.
minimal (bool) – If True, emit compact atom-site-only mmCIF output. If False (default), include entity/polymer/subchain metadata.

save_pdb(pdb)[source]#

Write the stack directly as a PDB file.

Parameters: pdb – Output filepath or open writable file handle.

select(*, models=None, chains=None, residues=None, predicate=None)[source]#

Return a filtered multi-model subset of the stack.

The selection is executed through the ensemble path so each chosen model is subsetted with the same semantics as Structure.select(). If the resulting models still share identical atom annotations and bonds, the return value is a StructureStack; otherwise it falls back to a StructureEnsemble.

Parameters:

models (Optional[Sequence[int]]) – Optional model IDs to keep. If None, all models are considered.
chains (Optional[Sequence[str]]) – Optional chain IDs to keep within each selected model.
residues (Optional[Sequence[Union[int, Residue, Tuple[Any, ...]]]]) – Optional residue selectors to keep within each selected model.
predicate (Optional[Callable[[Atom], bool]]) – Optional atom-level predicate applied independently inside each selected model.

Return type:

Union[StructureStack, StructureEnsemble]

Returns:

A StructureStack when the subset remains stack-compatible, otherwise a StructureEnsemble.

to_dataframe()[source]#

Export the stack as a pandas dataframe with a model column.

This dataframe is derived on demand from the current stack contents and is never cached on the stack.

Return type: DataFrame

to_ensemble()[source]#

Convert the stack into an independent StructureEnsemble.

Return type: StructureEnsemble

neurosnap.structure.align(reference, mobile, chains1=None, chains2=None)[source]#

Align a mobile structure onto a reference structure using polymer backbone atoms.

When both chains1 and chains2 are provided, they are interpreted as explicit pairwise chain mappings in matching order.

Parameters:

reference (Structure) – Reference single-model Structure.
mobile (Structure) – Mobile single-model Structure to transform in-place.
chains1 (Optional[Sequence[str]]) – Optional reference chain IDs to include in the alignment.
chains2 (Optional[Sequence[str]]) – Optional mobile chain IDs to include in the alignment.

Returns:

None. The mobile structure is transformed in-place.

neurosnap.structure.animate_frames(frames, output_fpath, *, title='', subtitles=None, interval=200, repeat=True, background_color=(255, 255, 255))[source]#

Animate a sequence of frames using Pillow only and write to disk.

Parameters:

frames (Iterable[Union[Image, ndarray]]) – Iterable of frames to animate (Pillow Images or arrays convertible to images).
output_fpath (Union[str, Path]) – Path where the animation will be written; the format is inferred from the extension (gif, webp, or mp4).
title (str) – Title text to display above the animation; omitted if empty.
subtitles (Optional[Iterable[str]]) – Iterable of subtitle strings, one per frame (must match the number of frames).
interval (int) – Delay between frames in milliseconds.
repeat (bool) – Whether the animation repeats after the last frame (loop=0 if True, otherwise loop=1, for gif/webp; ignored for mp4).
background_color (Tuple[int, int, int]) – RGB background color used for the entire canvas (including the title/subtitle band).

neurosnap.structure.ca_distance_matrix(structure, chain=None)[source]#

Alias for calculate_distance_matrix().

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional chain ID to restrict the calculation to.

Return type:

ndarray

Returns:

A square NumPy array of pairwise CA distances in Å.

neurosnap.structure.calculate_bsa(structure, chain_group_1, chain_group_2, *, level='R')[source]#

Calculate buried surface area between two chain groups.

The buried surface area (BSA) is computed as: BSA = (SASA(group 1) + SASA(group 2)) - SASA(complex)

Parameters:

structure (Structure) – Input complex as a single-model Structure.
chain_group_1 (List[str]) – Chain IDs for the first group.
chain_group_2 (List[str]) – Chain IDs for the second group.
level (str) – Surface-area aggregation level forwarded to calculate_surface_area().

Return type:

float

Returns:

Buried surface area in Å².
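A worked instance of the BSA formula, using made-up SASA values purely for illustration. The helper `buried_surface_area` is hypothetical and takes the three SASA terms as plain numbers; it does not compute any surface areas itself.

```python
def buried_surface_area(sasa_group_1, sasa_group_2, sasa_complex):
    """BSA = (SASA(group 1) + SASA(group 2)) - SASA(complex), all in Å²."""
    return (sasa_group_1 + sasa_group_2) - sasa_complex


# Illustrative numbers only: two isolated groups and their complex.
bsa = buried_surface_area(5200.0, 4800.0, 8600.0)
```

With these inputs the two groups expose 10000 Å² in isolation and 8600 Å² as a complex, so 1400 Å² is buried. The formula as documented reports the total over both partners and is not halved.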

neurosnap.structure.calculate_distance_matrix(structure, chain=None)[source]#

Calculate the CA-atom distance matrix for a single structure.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional chain ID to restrict the calculation to.

Return type:

ndarray

Returns:

A square NumPy array of pairwise CA distances in Å.
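The shape and symmetry of the result can be illustrated with a pure-Python sketch on three hypothetical CA positions. `pairwise_distances` returns nested lists rather than a NumPy array and is not the library function.

```python
import math


def pairwise_distances(points):
    """Return a symmetric n x n list-of-lists of Euclidean distances."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            matrix[i][j] = matrix[j][i] = d  # fill both triangles
    return matrix


# Three hypothetical CA positions in Å.
ca_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dm = pairwise_distances(ca_coords)
```

Entry `dm[i][j]` is the distance between residues i and j, the diagonal is zero, and the matrix equals its transpose.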

neurosnap.structure.calculate_hydrogen_bonds(structure, chain=None, chain_other=None, *, donor_acceptor_cutoff=3.5, angle_cutoff=120.0)[source]#

Count hydrogen bonds using explicit hydrogens and simple geometric cutoffs.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional donor-chain ID. When omitted, all chains are searched.
chain_other (Optional[str]) – Optional acceptor-chain ID for inter-chain counting.
donor_acceptor_cutoff (float) – Maximum donor-acceptor distance in Å.
angle_cutoff (float) – Minimum donor-H-acceptor angle in degrees.

Return type:

int

Returns:

Total number of hydrogen bonds that satisfy the geometric cutoffs.
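The two geometric cutoffs can be made concrete with a sketch that tests a single donor, hydrogen, acceptor triple. The helper names are hypothetical, the comparisons are assumed to be inclusive, and the real function additionally has to identify donors, acceptors, and their attached hydrogens, which is not shown here.

```python
import math


def angle_deg(a, b, c):
    """Angle at vertex b (in degrees) formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     donor_acceptor_cutoff=3.5, angle_cutoff=120.0):
    """True if the D...A distance and the D-H...A angle both pass the cutoffs."""
    close_enough = math.dist(donor, acceptor) <= donor_acceptor_cutoff
    straight_enough = angle_deg(donor, hydrogen, acceptor) >= angle_cutoff
    return close_enough and straight_enough


donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
linear = is_hydrogen_bond(donor, hydrogen, (2.9, 0.0, 0.0))   # 2.9 Å, 180 degrees
bent = is_hydrogen_bond(donor, hydrogen, (1.0, 2.0, 0.0))     # close, but 90 degrees
distant = is_hydrogen_bond(donor, hydrogen, (4.5, 0.0, 0.0))  # linear, but 4.5 Å
```

Only the first geometry passes both tests; the second fails the angle cutoff and the third fails the distance cutoff.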

neurosnap.structure.calculate_interface_hydrogen_bonding_residues(structure, chain=None, chain_other=None, *, donor_acceptor_cutoff=3.5, angle_cutoff=120.0)[source]#

Count unique residues that participate in inter- or intra-chain hydrogen bonds.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional donor-chain ID. When omitted, all chains are searched.
chain_other (Optional[str]) – Optional acceptor-chain ID for inter-chain counting.
donor_acceptor_cutoff (float) – Maximum donor-acceptor distance in Å.
angle_cutoff (float) – Minimum donor-H-acceptor angle in degrees.

Return type:

int

Returns:

Number of unique residues that participate in at least one qualifying hydrogen bond.

neurosnap.structure.calculate_protein_volume(structure, chain=None)[source]#

Estimate protein volume from atom van der Waals spheres.

The calculation sums the volumes of van der Waals spheres for atoms belonging to residues classified as protein. It is therefore a simple geometric estimate rather than an excluded-volume or solvent-corrected measurement.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional chain ID to restrict the calculation to.

Return type:

float

Returns:

Estimated protein volume in Å³.
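The estimate amounts to adding up one sphere volume, (4/3)πr³, per atom. The sketch below assumes Bondi van der Waals radii for a handful of elements; the radii table actually used by the library is not stated above, and `summed_sphere_volume` is a hypothetical helper.

```python
import math

# Bondi van der Waals radii in Å (an assumption for this sketch).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def summed_sphere_volume(elements):
    """Sum of (4/3)*pi*r^3 over atoms, ignoring overlap between spheres."""
    return sum(4.0 / 3.0 * math.pi * VDW_RADII[e] ** 3 for e in elements)


# Heavy atoms of a glycine backbone: N, CA, C, O.
glycine_volume = summed_sphere_volume(["N", "C", "C", "O"])
```

Because bonded atoms overlap substantially, a plain sum counts the shared regions more than once, which is why the result is best read as a simple geometric estimate.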

neurosnap.structure.calculate_rmsd(reference, mobile, chains1=None, chains2=None, align_structures=True)[source]#

Calculate backbone RMSD between two structures.

Parameters:

reference (Structure) – Reference single-model Structure.
mobile (Structure) – Mobile single-model Structure.
chains1 (Optional[Sequence[str]]) – Optional reference chain IDs to include.
chains2 (Optional[Sequence[str]]) – Optional mobile chain IDs to include.
align_structures (bool) – If True, align the mobile structure before computing RMSD.

Return type:

float

Returns:

Backbone RMSD in Å using the same residue/atom correspondence as align().
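Once two coordinate sets are paired atom by atom, RMSD is the square root of the mean squared displacement. The sketch below shows only that final step on plain tuples; it performs no superposition, so it corresponds to the align_structures=False case on already matched atoms. `rmsd_paired` is a hypothetical helper.

```python
import math


def rmsd_paired(coords_a, coords_b):
    """RMSD between two equally long, already paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # same shape, moved 2 Å along z
value = rmsd_paired(reference, shifted)
```

The two sets have identical shape but differ by a rigid 2 Å shift, so the unaligned RMSD is 2.0; superposing the mobile set first would bring it to zero.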

neurosnap.structure.calculate_surface_area(structure, level='R', probe_radius=1.4, n_sphere_points=96)[source]#

Estimate solvent-accessible surface area using a simple Shrake-Rupley approximation.

The returned total SASA is the same regardless of level; the parameter is kept for compatibility with the public surface-area API.

Parameters:

structure (Structure) – Input single-model Structure.
level (str) – Compatibility flag matching the historical public API. The returned total SASA is always a structure-level scalar, regardless of this value. Must be one of "A", "R", "C", "M", or "S".
probe_radius (float) – Solvent probe radius in Å used to inflate atom radii during the accessibility calculation.
n_sphere_points (int) – Number of surface points sampled per atom for the Shrake-Rupley approximation.

Return type:

float

Returns:

Estimated solvent-accessible surface area in Å².
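The idea behind a Shrake-Rupley estimate is to place test points on each atom's probe-inflated sphere and count the points that no neighboring inflated sphere covers. The following is a minimal quadratic-time sketch on plain tuples; the golden-spiral point layout, the helper names, and the radii passed in are assumptions for the example and may not match the library's internals.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        points.append((r * math.cos(golden_angle * k), r * math.sin(golden_angle * k), z))
    return points


def shrake_rupley(coords, radii, probe_radius=1.4, n_sphere_points=96):
    """Total SASA in Å² from per-atom accessible point fractions."""
    unit = sphere_points(n_sphere_points)
    inflated = [r + probe_radius for r in radii]
    total = 0.0
    for i, (center, big_r) in enumerate(zip(coords, inflated)):
        accessible = 0
        for u in unit:
            p = tuple(center[k] + big_r * u[k] for k in range(3))
            # A point is buried if it lies inside any other inflated sphere.
            buried = any(
                j != i and math.dist(p, coords[j]) < inflated[j]
                for j in range(len(coords))
            )
            if not buried:
                accessible += 1
        total += 4.0 * math.pi * big_r ** 2 * accessible / n_sphere_points
    return total


single = shrake_rupley([(0.0, 0.0, 0.0)], [1.7])
pair = shrake_rupley([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)], [1.7, 1.7])
```

An isolated atom keeps all of its points, so `single` equals the full inflated-sphere area 4π(1.7 + 1.4)². For the overlapping pair, each atom loses the points that face its neighbor, so `pair` falls between one and two isolated-atom areas. Raising n_sphere_points trades speed for a smoother estimate.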

neurosnap.structure.extract_non_biopolymers(structure, output_dir, min_atoms=0)[source]#

Extract non-biopolymer fragments from a structure and write them as SDF files.

Biopolymer residues are removed using the same residue-name logic as the old implementation: any residue present in AA_RECORDS or STANDARD_NUCLEOTIDES is treated as part of a protein or nucleotide polymer, except UNK which is preserved. The remaining atoms are written to a temporary PDB, read into RDKit, split into disconnected fragments, and then exported as individual SDF files.

Parameters:

structure (Structure) – Input single-model Structure.
output_dir (str) – Directory where SDF files will be written. Any existing directory at that path is replaced.
min_atoms (int) – Minimum fragment atom count required for export.

Returns:

None. Matching fragments are written to output_dir as .sdf files.

neurosnap.structure.find_contacts(atoms1, atoms2, cutoff=4.5)[source]#

Identify atom-atom contacts between two atom sets using a distance cutoff.

Parameters:

atoms1 (List[Atom]) – First set of atoms.
atoms2 (List[Atom]) – Second set of atoms.
cutoff (float) – Distance cutoff in Å.

Return type:

List[Tuple[Atom, Atom]]

Returns:

List of (atom1, atom2) pairs within the cutoff distance.
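A brute-force version of the cutoff search is sketched below on bare coordinates. `contact_pairs` is a hypothetical helper that returns index pairs instead of Atom pairs and treats the cutoff as inclusive, which is an assumption; a production implementation would typically use a spatial index rather than testing every pair.

```python
import math


def contact_pairs(coords1, coords2, cutoff=4.5):
    """Return (i, j) index pairs whose points lie within cutoff of each other."""
    return [
        (i, j)
        for i, a in enumerate(coords1)
        for j, b in enumerate(coords2)
        if math.dist(a, b) <= cutoff
    ]


set1 = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
set2 = [(3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
pairs = contact_pairs(set1, set2)
```

Only the first point of each set is within 4.5 Å of the other (3 Å apart), so a single pair is reported; widening the cutoff to 7.0 Å adds the pair formed by the second point of `set1` and the first of `set2`.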

neurosnap.structure.find_disulfide_bonds(structure, chain=None, threshold=2.05)[source]#

Find disulfide bonds between cysteine residues using SG-SG distance.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional chain ID to restrict the search to.
threshold (float) – Maximum SG-SG distance in Å used to classify a disulfide bond.

Return type:

List[Tuple[Residue, Residue]]

Returns:

List of (residue1, residue2) cysteine pairs that satisfy the distance cutoff.

neurosnap.structure.find_hydrophobic_residues(structure, chain=None)[source]#

Return hydrophobic residues from a single structure.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional chain ID to restrict the search to.

Return type:

List[Tuple[str, Residue]]

Returns:

List of (chain_id, residue) tuples for residues classified as hydrophobic.

neurosnap.structure.find_interface_contacts(structure, chain1, chain2, *, cutoff=4.5, hydrogens=True)[source]#

Identify atom-atom contacts between two chains using a distance cutoff.

Parameters:

structure (Structure) – Input single-model Structure.
chain1 (str) – First chain ID.
chain2 (str) – Second chain ID.
cutoff (float) – Contact cutoff distance in Å.
hydrogens (bool) – Whether hydrogen atoms should be included.

Return type:

List[Tuple[Atom, Atom]]

Returns:

List of contacting (atom1, atom2) pairs.

neurosnap.structure.find_interface_residues(structure, chain1, chain2, *, cutoff=4.5, hydrogens=True)[source]#

Identify unique residue-residue contacts between two chains.

Multiple atom-atom contacts between the same residue pair are collapsed into one output pair.

Parameters:

structure (Structure) – Input single-model Structure.
chain1 (str) – First chain ID.
chain2 (str) – Second chain ID.
cutoff (float) – Contact cutoff distance in Å.
hydrogens (bool) – Whether hydrogen atoms should be included in the contact search.

Return type:

List[Tuple[Residue, Residue]]

Returns:

List of unique contacting (residue1, residue2) pairs.

neurosnap.structure.find_non_interface_hydrophobic_patches(structure, chain_pairs, target_chains=None, *, cutoff_interface=4.5, hydrogens=True, patch_cutoff=6.0, min_patch_area=40.0)[source]#

Identify solvent-exposed hydrophobic patches outside specified interfaces.

Hydrophobic residues are first filtered to remove interface residues and buried residues, then clustered by CA-CA proximity into connected components.

Parameters:

structure (Structure) – Input single-model Structure.
chain_pairs (Iterable[Tuple[str, str]]) – Iterable of chain-ID pairs whose interfaces should be excluded from patch detection.
target_chains (Optional[Iterable[str]]) – Optional chain IDs to search for patches. If None, all chains are considered.
cutoff_interface (float) – Distance cutoff in Å used to classify interface contacts.
hydrogens (bool) – Whether hydrogen atoms should be included in the interface contact search.
patch_cutoff (float) – CA-CA distance cutoff in Å used to connect hydrophobic residues into the same patch.
min_patch_area (float) – Minimum summed SASA in Å² required for a connected component to be returned.

Return type:

List[List[Residue]]

Returns:

List of residue lists, where each list represents one hydrophobic patch.
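
The clustering stage (connect residues whose CA atoms lie within patch_cutoff, then keep components whose summed SASA reaches min_patch_area) can be sketched as a plain connected-components search. The dictionary keys `id`, `ca`, and `sasa`, the helper name `cluster_patches`, and the precomputed SASA values are hypothetical; the sketch assumes interface and buried residues have already been filtered out and makes no claim about how the library computes SASA.

```python
from math import dist

def cluster_patches(residues, patch_cutoff=6.0, min_patch_area=40.0):
    """Group residues into connected components by CA-CA distance.

    `residues` is a hypothetical list of dicts with keys 'id', 'ca', and 'sasa'.
    """
    n = len(residues)
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(residues[i]["ca"], residues[j]["ca"]) <= patch_cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)

    visited = set()
    patches = []
    for start in range(n):
        if start in visited:
            continue
        # Depth-first traversal collects one connected component.
        stack = [start]
        visited.add(start)
        component = []
        while stack:
            cur = stack.pop()
            component.append(cur)
            for nxt in neighbors[cur]:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append(nxt)
        area = sum(residues[i]["sasa"] for i in component)
        if area >= min_patch_area:  # drop components below the area threshold
            patches.append(sorted(residues[i]["id"] for i in component))
    return patches

exposed = [
    {"id": 1, "ca": (0.0, 0.0, 0.0), "sasa": 25.0},
    {"id": 2, "ca": (5.0, 0.0, 0.0), "sasa": 20.0},
    {"id": 3, "ca": (10.0, 0.0, 0.0), "sasa": 10.0},
    {"id": 9, "ca": (50.0, 0.0, 0.0), "sasa": 30.0},
]
patches = cluster_patches(exposed)
print(patches)
```

Residues 1 and 3 are 10 Å apart but still share a patch because each is within the cutoff of residue 2; the isolated residue 9 is dropped because its 30 Å² falls below the area threshold.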

neurosnap.structure.find_salt_bridges(structure, chain=None, cutoff=4.0)[source]#

Identify salt bridges using CA-CA distance as a simple proxy.

Parameters:

structure (Structure) – Input single-model Structure.
chain (Optional[str]) – Optional chain ID to restrict the search to.
cutoff (float) – Maximum CA-CA distance in Å used to classify a salt bridge.

Return type:

List[Tuple[Residue, Residue]]

Returns:

List of (positive_residue, negative_residue) pairs that satisfy the distance cutoff.
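
A rough sketch of the CA-CA proxy follows. The residue-name sets (ARG and LYS as positive, ASP and GLU as negative), the tuple layout `(res_id, res_name, ca_coord)`, and the helper name `ca_salt_bridges` are assumptions for illustration; the documentation above does not state which residue types the library treats as charged.

```python
from math import dist

# Assumed charge classes; the library's actual sets may differ.
POSITIVE = {"ARG", "LYS"}
NEGATIVE = {"ASP", "GLU"}

def ca_salt_bridges(residues, cutoff=4.0):
    """Pair positive and negative residues whose CA atoms lie within cutoff.

    `residues` is a hypothetical list of (res_id, res_name, (x, y, z)) tuples,
    where the coordinate is the residue's CA position.
    """
    positives = [r for r in residues if r[1] in POSITIVE]
    negatives = [r for r in residues if r[1] in NEGATIVE]
    return [
        (p[0], n[0])
        for p in positives
        for n in negatives
        if dist(p[2], n[2]) <= cutoff
    ]

residues = [
    (12, "LYS", (0.0, 0.0, 0.0)),
    (45, "GLU", (3.5, 0.0, 0.0)),
    (78, "ASP", (9.0, 0.0, 0.0)),
    (80, "ALA", (1.0, 0.0, 0.0)),
]
bridges = ca_salt_bridges(residues)
print(bridges)
```

The pair order mirrors the documented return value: the positive residue comes first, the negative residue second.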

neurosnap.structure.fix_nucleic_termini(structure, *, strip_3prime=False, chain=None)[source]#

Normalize nucleotide phosphate names and strip terminal phosphate atoms.

Parameters:

structure (Structure) – Input Structure.
strip_3prime (bool) – If True, also remove O3P and OP3 from 3’ termini.
chain (Optional[str]) – Optional chain ID to restrict processing to.

Returns:

None. The input structure is modified in-place.

neurosnap.structure.get_backbone(structure, chains=None, *, include_nucleotides=True)[source]#

Extract ordered backbone coordinates from a single structure.

Protein residues contribute N, CA, and C atoms. When include_nucleotides is enabled, DNA and RNA residues contribute their sugar-phosphate backbone atoms in a deterministic order. Non-polymers are ignored.

Parameters:

structure (Structure) – Input single-model Structure.
chains (Optional[Sequence[str]]) – Optional chain IDs to include. If None, all chains are used.
include_nucleotides (bool) – If True, include DNA and RNA backbone atoms in addition to protein backbone atoms.

Return type:

ndarray

Returns:

A NumPy array of backbone coordinates with shape (n_atoms, 3).

neurosnap.structure.remove_atoms(structure, predicate, *, chain=None)[source]#

Remove atoms from a structure in-place when they match a predicate.

Parameters:

structure (Structure) – Input Structure.
predicate (Callable) – Callable that accepts an atom view and returns True when that atom should be removed.
chain (Optional[str]) – Optional chain ID to restrict atom removal to.

Returns:

None. The input structure is modified in-place.
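
Predicates for this function are ordinary callables over atom views. The sketch below uses a hypothetical stand-in class, `AtomStandIn`, that exposes a few of the attributes documented on Atom so the predicates can be exercised without a loaded structure. The element string "H" and the water residue name "HOH" are assumptions about how the data is encoded.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AtomStandIn:
    """Hypothetical stand-in exposing a subset of the documented Atom attributes."""
    atom_name: str
    element: str
    hetero: bool
    res_name: str

def is_hydrogen(atom) -> bool:
    """True for hydrogen atoms (assumes the element symbol is stored as 'H')."""
    return atom.element.upper() == "H"

def is_non_water_hetero(atom) -> bool:
    """True for hetero atoms outside water (assumes waters are named 'HOH')."""
    return atom.hetero and atom.res_name != "HOH"

atoms = [
    AtomStandIn("CA", "C", False, "ALA"),
    AtomStandIn("HA", "H", False, "ALA"),
    AtomStandIn("O", "O", True, "HOH"),
    AtomStandIn("C1", "C", True, "LIG"),
]

# Mimic the filtering effect on a plain list; the real function edits the structure in-place.
kept = [a.atom_name for a in atoms if not is_hydrogen(a)]
print(kept)

# With a real Structure, the documented signature suggests calls such as:
#   remove_atoms(structure, is_hydrogen)
#   remove_atoms(structure, is_non_water_hetero, chain="A")
```

Named functions are used instead of lambdas so that the same predicate can be reused across calls and tested in isolation.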

neurosnap.structure.remove_chains(structure, predicate)[source]#

Remove chains from a structure in-place when they match a predicate.

Parameters:

structure (Structure) – Input Structure.
predicate (Callable) – Callable that accepts a chain view and returns True when that chain should be removed.

Returns:

None. The input structure is modified in-place.

neurosnap.structure.remove_non_biopolymers(structure, *, chain=None)[source]#

Remove non-protein and non-nucleotide residues from a structure in-place.

Parameters:

structure (Structure) – Input Structure.
chain (Optional[str]) – Optional chain ID to restrict residue removal to.

Returns:

None. The input structure is modified in-place.

Notes

Hetero residues are always removed by this filter, even if their residue names (for example, UNK) also appear in the amino-acid or nucleotide dictionaries.

neurosnap.structure.remove_nucleotides(structure, *, chain=None)[source]#

Remove DNA and RNA residues from a structure in-place.

Parameters:

structure (Structure) – Input Structure.
chain (Optional[str]) – Optional chain ID to restrict residue removal to.

Returns:

None. The input structure is modified in-place.

neurosnap.structure.remove_waters(structure, *, chain=None)[source]#

Remove water residues from a structure in-place.

Parameters:

structure (Structure) – Input Structure.
chain (Optional[str]) – Optional chain ID to restrict residue removal to.

Returns:

None. The input structure is modified in-place.

neurosnap.structure.render_pseudo3D(segments, *, c=None, sizes=None, cmap='gist_rainbow', cmin=None, cmax=None, image_size=(800, 800), padding=20, line_w=2.0, shadow=0.95, background_color=(255, 255, 255), upsample=2, chainbreak=5)[source]#

Render the well-known pseudo-3D projection of a protein using Pillow.

Adapted from an algorithm originally written by Dr. Sergey Ovchinnikov.

Parameters:

segments (Iterable[Union[ndarray, DataFrame]]) – Iterable of XYZ coordinates, where each element is a segment or molecule to draw separately.
c (Optional[Iterable[ndarray]]) – Iterable of 1D arrays used to color the protein, aligned one-to-one with segments. Defaults to residue index.
sizes (Optional[Iterable[ndarray]]) – Iterable of 1D arrays of radius or size values, aligned one-to-one with segments and interpreted in the same units as the coordinates.
cmap (str) – Color map name or callable used for coloring the protein.
cmin (Optional[float]) – Minimum value for coloring. Calculated automatically if None.
cmax (Optional[float]) – Maximum value for coloring. Calculated automatically if None.
image_size (Tuple[int, int]) – Final image size in pixels as (width, height).
padding (int) – Padding in pixels around the drawing region.
line_w (float) – Line width, interpreted in data space and converted to pixels.
shadow (float) – Shadow intensity between 0 and 1 inclusive. Lower values produce darker, more intense shadows.
background_color (Tuple[int, int, int]) – Background RGB color.
upsample (int) – Factor by which to draw at higher resolution before downsampling for antialiasing.
chainbreak (int) – Minimum distance in Å between chains or segments for the gap to be considered a chain break.

Return type:

Image

Returns:

Pillow Image containing the rendering.
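
One way to picture the chainbreak parameter is as a threshold on the distance between consecutive points of a trace. The sketch below splits an ordered coordinate list wherever that distance exceeds the threshold. It is an illustration under that assumption only; the helper name `split_at_chainbreaks` is hypothetical, and the renderer may handle breaks differently, for example by hiding the connecting line instead of splitting the data.

```python
from math import dist

def split_at_chainbreaks(coords, chainbreak=5.0):
    """Split an ordered list of XYZ points into runs with no gap larger than chainbreak."""
    if not coords:
        return []
    runs = [[coords[0]]]
    for prev, cur in zip(coords, coords[1:]):
        if dist(prev, cur) > chainbreak:
            runs.append([cur])    # gap too large: start a new run
        else:
            runs[-1].append(cur)  # still connected to the current run
    return runs

trace = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (30.0, 0.0, 0.0),
    (33.8, 0.0, 0.0),
]
runs = split_at_chainbreaks(trace)
print([len(run) for run in runs])
```

The 22.4 Å jump between the third and fourth points exceeds the default threshold of 5, so the trace is divided into a run of three points and a run of two.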

neurosnap.structure.render_structure_pseudo3D(structure, *, style='residue_id', use_radii=False, image_size=(576, 432), padding=20, shadow=0.95, upsample=2, chainbreak=5)[source]#

Render a single structure using the pseudo-3D Pillow renderer.

Parameters:

structure (Structure) – Input single-model Structure.
style (str) – Coloring mode: residue_id, chain_id, b-factor, pLDDT, or residue_type.
use_radii (bool) – If True, apply van der Waals radii as per-atom sizes.
image_size (Tuple[int, int]) – Output image size in pixels as (width, height).
padding (int) – Padding in pixels around the drawing region.
shadow (float) – Shadow intensity between 0 and 1.
upsample (int) – Supersampling factor for antialiasing.
chainbreak (int) – Distance threshold for breaking segments.

Return type:

Image

Returns:

Pillow Image containing the rendering.

neurosnap.structure.select_residues(structure, selectors, invert=False)[source]#

Select residues from a structure using a chain/residue selector string.

Supported selector forms include:

"A" for an entire chain
"A10" or "A10-20" for compact single-character chain selectors
"AB:10" or "AB:10-20" for multi-character chain IDs

Parameters:

structure (Structure) – Input single-model Structure.
selectors (str) – Comma-delimited selector string.
invert (bool) – Whether to invert the selection within each chain.

Return type:

Dict[str, List[int]]

Returns:

Dictionary mapping chain IDs to sorted residue numbers.
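
The selector forms listed above can be parsed with a few string operations. The following is a minimal sketch of that grammar, not the library's parser: the helper name `parse_selectors` is hypothetical, whole-chain selections are represented as None because no structure is available to expand them, and inversion, negative residue numbers, and insertion codes are not handled.

```python
def parse_selectors(selectors):
    """Parse a comma-delimited selector string.

    Returns a dict mapping chain ID to a sorted list of residue numbers,
    or to None when the whole chain is selected (an assumption of this sketch).
    """
    selection = {}
    for token in selectors.split(","):
        token = token.strip()
        if not token:
            continue
        if ":" in token:
            # Multi-character chain form, e.g. "AB:10-20".
            chain, _, span = token.partition(":")
        else:
            # Compact form: first character is the chain, e.g. "A10-20" or "A".
            chain, span = token[0], token[1:]
        if not span:
            selection[chain] = None  # entire chain
            continue
        if selection.get(chain, ()) is None:
            continue  # whole chain already selected; nothing to add
        start, _, end = span.partition("-")
        first = int(start)
        last = int(end) if end else first
        selection.setdefault(chain, set()).update(range(first, last + 1))
    return {
        chain: None if ids is None else sorted(ids)
        for chain, ids in selection.items()
    }

parsed = parse_selectors("A10-12,AB:5,B")
print(parsed)
```

Ranges are treated as inclusive at both ends, which matches the usual reading of "A10-20" but is not confirmed by the documentation above.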
