neurosnap.structure.structure module#
Data structures for representing molecular coordinates and annotations.
This module provides a single-model Structure, immutable hierarchy
views (Chain, Residue, and Atom), an ordered
multi-model container (StructureEnsemble), and a shared-annotation
multi-model fast path (StructureStack).
The universal length unit is Å.
- class neurosnap.structure.structure.Atom(x, y, z, chain_id, res_id, ins_code, res_name, hetero, atom_name, element, annotations=<factory>)[source]#
Bases: object
Immutable atom-level hierarchy view.
- class neurosnap.structure.structure.Chain(chain_id, _residues)[source]#
Bases: object
Immutable chain-level hierarchy view.
A Chain is a read-only hierarchy view over the residues associated with one chain identifier in a single Structure. It provides chain-level traversal plus convenience helpers for sequence extraction and simple residue-number gap detection.
- chain_id#
Chain identifier represented by this view.
- __getitem__(res_id)[source]#
Return a residue view by residue ID, not by positional index.
- Parameters:
res_id (int) – Residue sequence number to retrieve.
- Return type:
- Returns:
The first Residue in this chain with the requested residue ID.
- Raises:
Notes
This method looks up residues by their residue ID rather than by list position. If multiple residues share the same residue ID, such as inserted residues distinguished by insertion codes, the first matching residue is returned and a warning is emitted.
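The lookup rule in the note above can be illustrated with a minimal, library-independent sketch. The ResidueView tuple and the get_by_res_id helper are hypothetical names introduced for illustration and are not part of neurosnap. The exception used for a missing ID is an assumption, since the Raises entry is not shown here.

```python
import warnings
from typing import List, NamedTuple


class ResidueView(NamedTuple):
    """Hypothetical stand-in for a residue view (not the neurosnap class)."""

    res_id: int
    ins_code: str
    res_name: str


def get_by_res_id(residues: List[ResidueView], res_id: int) -> ResidueView:
    """Return the first residue whose res_id matches, warning on duplicates."""
    matches = [r for r in residues if r.res_id == res_id]
    if not matches:
        # Assumption: a missing residue ID is reported with KeyError.
        raise KeyError(res_id)
    if len(matches) > 1:
        warnings.warn(
            f"{len(matches)} residues share res_id {res_id}; returning the first"
        )
    return matches[0]


# Residues 10 and 10A share the same residue number.
residues = [
    ResidueView(10, "", "ALA"),
    ResidueView(10, "A", "GLY"),
    ResidueView(11, "", "SER"),
]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = get_by_res_id(residues, 10)
```

Here first is the ALA residue without an insertion code, and one warning is recorded because residue number 10 is ambiguous.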
- missing_residue_ids()[source]#
Return missing residue numbers inferred from gaps in the chain.
Hetero residues are ignored so ligand or solvent numbering does not create artificial gaps in the polymer residue sequence.
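As a rough illustration of gap detection that ignores hetero residues, the sketch below works on plain (res_id, hetero) pairs. The missing_ids helper is hypothetical, and it assumes that gaps are the numbers absent between the lowest and highest polymer residue number; the library's exact rule is not described here.

```python
from typing import Iterable, List, Tuple


def missing_ids(residues: Iterable[Tuple[int, bool]]) -> List[int]:
    """Return residue numbers absent between the first and last polymer residue.

    Each item is a (res_id, hetero) pair. Hetero entries are skipped so that
    ligand or solvent numbering does not widen the scanned range.
    """
    polymer_ids = sorted({res_id for res_id, hetero in residues if not hetero})
    if not polymer_ids:
        return []
    present = set(polymer_ids)
    return [i for i in range(polymer_ids[0], polymer_ids[-1] + 1) if i not in present]


# Polymer residues 1, 2, 5, 6 plus a ligand numbered 301.
chain = [(1, False), (2, False), (5, False), (6, False), (301, True)]
gaps = missing_ids(chain)
```

With the ligand excluded, only residues 3 and 4 are reported; including it would have produced a spurious gap from 7 to 300.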
- sequence(polymer_type='auto', include_modifications=False, modification_mode='inline', on_unknown_modified='raise')[source]#
Return the polymer sequence for this chain.
Protein, DNA, and RNA sequences are supported. Small molecules and other non-polymer residues in the chain are ignored. Modified residues can either be skipped, emitted inline as (CCD), or mapped to their parent sequence code when available.
- Parameters:
polymer_type (Literal['auto','protein','dna','rna','nucleotide']) – Polymer family to extract. "auto" infers the family from the chain contents. "nucleotide" accepts either DNA or RNA, but raises if both are present.
include_modifications (bool) – Whether modified residues should contribute to the sequence. If False, modified residues are skipped entirely.
modification_mode (Literal['inline','parent']) – How included modifications are emitted. "inline" inserts (CCD) tokens, while "parent" uses the inferred parent residue code.
on_unknown_modified (Literal['raise','unknown']) – Behavior when modification_mode="parent" is requested but no parent code can be inferred. "raise" raises a ValueError; "unknown" inserts "X".
- Return type:
- Returns:
Sequence string for the selected polymer family. Returns an empty string if the chain contains no residues from the requested polymer family.
- Raises:
ValueError – If the chain mixes polymer families in a way that conflicts with polymer_type or if an unknown modified residue cannot be mapped in "parent" mode.
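The interaction between include_modifications, modification_mode, and on_unknown_modified can be sketched for the protein case as follows. This is an illustration under stated assumptions, not the library's implementation: protein_sequence is a hypothetical helper, the lookup tables are deliberately tiny, the input is assumed to contain only polymer residues, and "ZZZ" is a made-up residue code standing in for a modification with no known parent.

```python
from typing import List

# Tiny illustrative table; a real implementation would cover every residue.
ONE_LETTER = {"ALA": "A", "GLY": "G", "MET": "M", "SER": "S"}
# Parent residues for two common modifications (selenomethionine, phosphoserine).
PARENT = {"MSE": "MET", "SEP": "SER"}


def protein_sequence(
    res_names: List[str],
    include_modifications: bool = False,
    modification_mode: str = "inline",
    on_unknown_modified: str = "raise",
) -> str:
    """Build a one-letter sequence, handling modified residues per the options."""
    out = []
    for name in res_names:
        if name in ONE_LETTER:
            out.append(ONE_LETTER[name])
        elif not include_modifications:
            continue  # modified residues are skipped entirely
        elif modification_mode == "inline":
            out.append(f"({name})")  # emit the residue code as an inline token
        elif name in PARENT:
            out.append(ONE_LETTER[PARENT[name]])
        elif on_unknown_modified == "unknown":
            out.append("X")
        else:
            raise ValueError(f"no parent code for modified residue {name}")
    return "".join(out)


names = ["MET", "MSE", "ALA", "SEP", "ZZZ", "GLY"]
skipped = protein_sequence(names)
inline = protein_sequence(names, include_modifications=True)
parent = protein_sequence(names, True, "parent", "unknown")
```

For this input, skipped is "MAG", inline is "M(MSE)A(SEP)(ZZZ)G", and parent is "MMASXG". Calling the helper in "parent" mode with on_unknown_modified="raise" raises ValueError at "ZZZ".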
- class neurosnap.structure.structure.Residue(chain_id, res_id, ins_code, res_name, hetero, _atoms, _atom_indices)[source]#
Bases: object
Immutable residue-level hierarchy view.
A Residue groups atoms that share the same chain identifier, residue number, insertion code, residue name, and hetero flag. The object is a lightweight read-only view over the parsed atom table, intended for traversal and analysis rather than in-place editing.
- chain_id#
Chain identifier containing the residue.
- res_id#
Residue sequence number.
- ins_code#
PDB insertion code for the residue.
- res_name#
Residue name / CCD code.
- hetero#
True for heterogens and False for polymer ATOM records.
- class neurosnap.structure.structure.Structure(remove_annotations=True)[source]#
Bases: object
Single-model molecular structure container.
Coordinates are stored separately from per-atom annotations so geometry-heavy operations can work on compact numeric arrays while annotation schemas remain flexible.
- Parameters:
remove_annotations (bool) – If True, optional annotation columns that only contain default values are removed after initialization.
- add_annotation(name, dtype, values=None, *, fill_value=None, overwrite=False)[source]#
Add a new per-atom annotation column.
- Parameters:
name (str) – Annotation name to add.
dtype (Any) – NumPy-compatible scalar dtype for the annotation values.
values (Any) – Optional per-atom values for the annotation.
fill_value (Any) – Optional default value used when values is not supplied.
overwrite (bool) – Whether to replace an existing optional annotation of the same name.
- Raises:
ValueError – If the name is invalid, reserved, already present, or the supplied values do not match the atom count.
TypeError – If the supplied dtype is not a scalar per-atom dtype.
- calculate_center_of_mass(chains=None)[source]#
Calculate the center of mass for the selected atoms.
- Parameters:
chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
- Return type:
- Returns:
A length-3 NumPy array containing the center of mass in Å.
- Raises:
ValueError – If no atoms are found in the selected structure or if any selected atom has an unknown element mass.
- calculate_geometric_center(chains=None)[source]#
Calculate the geometric center for the selected atoms.
- Parameters:
chains (Optional[List[str]]) – Optional chain IDs to include. If None, all atoms are used.
- Return type:
- Returns:
A length-3 NumPy array containing the arithmetic mean of the selected atom coordinates in Å.
- Raises:
ValueError – If no atoms are found in the selected structure.
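The center of mass and the geometric center differ only in weighting: the former weights each coordinate by atomic mass, the latter treats all atoms equally. The sketch below shows both in plain Python. The helper names and the small MASS table are assumptions for illustration; the library itself returns NumPy arrays and uses its own element data.

```python
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]

# Approximate atomic masses in daltons for a few elements.
MASS: Dict[str, float] = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def geometric_center(coords: Sequence[Coord]) -> List[float]:
    """Arithmetic mean of the coordinates, in the same unit as the input."""
    if not coords:
        raise ValueError("no atoms selected")
    n = len(coords)
    return [sum(c[i] for c in coords) / n for i in range(3)]


def center_of_mass(coords: Sequence[Coord], elements: Sequence[str]) -> List[float]:
    """Mass-weighted mean of the coordinates."""
    if not coords:
        raise ValueError("no atoms selected")
    unknown = [e for e in elements if e not in MASS]
    if unknown:
        raise ValueError(f"unknown element mass: {unknown[0]}")
    masses = [MASS[e] for e in elements]
    total = sum(masses)
    return [sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)]


# A carbon at the origin and an oxygen 2 Å along x.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
elements = ["C", "O"]
gc = geometric_center(coords)
com = center_of_mass(coords, elements)
```

The geometric center sits at the midpoint, while the center of mass is pulled toward the heavier oxygen atom.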
- calculate_rog(chains=None, center=None)[source]#
Calculate the radius of gyration for the selected atoms.
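For orientation, the radius of gyration is commonly defined as the root-mean-square distance of the atoms from a center. The sketch below implements the unweighted form with an optional explicit center. Whether the library applies mass weighting, and how it chooses the default center, is not stated here, so both choices are assumptions, and radius_of_gyration is a hypothetical helper.

```python
import math
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Coord], center: Optional[Coord] = None) -> float:
    """Root-mean-square distance of the points from a center (unweighted)."""
    if not coords:
        raise ValueError("no atoms selected")
    if center is None:
        # Assumption: default to the arithmetic mean of the coordinates.
        n = len(coords)
        center = tuple(sum(c[i] for c in coords) / n for i in range(3))
    squared = sum(math.dist(c, center) ** 2 for c in coords)
    return math.sqrt(squared / len(coords))


# Two points 1 Å either side of the origin.
rog = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```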
- center_at(x=0.0, y=0.0, z=0.0, chains=None)[source]#
Translate selected atoms so their center of mass matches a target point.
- distances_from(point, chains=None)[source]#
Calculate distances from a point for the selected atoms.
- remove_annotation(name)[source]#
Remove a non-mandatory annotation column and return its values.
- Parameters:
name (str) – Annotation name to remove.
- Returns:
Copy of the removed annotation values.
- Raises:
KeyError – If the annotation does not exist.
ValueError – If the name is invalid or refers to a mandatory annotation.
- renumber(chain=None, start=1)[source]#
Renumber residues in-place.
- Parameters:
Notes
Renumbering treats inserted residues as ordinary sequential residues and clears their insertion codes. For example, residues 10, 10A, and 10B become 1, 2, and 3 (with empty insertion codes) when renumbered with start=1.
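The renumbering rule in the note can be sketched on plain (res_id, ins_code) pairs. The renumber_keys helper is hypothetical and returns a new list, whereas the documented method edits the structure in place.

```python
from typing import List, Tuple

# One (res_id, ins_code) pair per residue, in chain order.
ResKey = Tuple[int, str]


def renumber_keys(keys: List[ResKey], start: int = 1) -> List[ResKey]:
    """Assign consecutive numbers in order and clear every insertion code."""
    return [(start + offset, "") for offset, _ in enumerate(keys)]


# Residues 10, 10A, and 10B.
renumbered = renumber_keys([(10, ""), (10, "A"), (10, "B")], start=1)
```

The result is [(1, ""), (2, ""), (3, "")]: each inserted residue receives its own number and no insertion code remains.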
- to_dataframe()[source]#
Export the structure as a pandas dataframe.
This dataframe is derived on demand from the current atom table and is never cached on the structure.
- Return type:
DataFrame
- class neurosnap.structure.structure.StructureEnsemble(models=None, *, model_ids=None, metadata=None)[source]#
Bases: object
Ordered collection of independent Structure models.
Unlike StructureStack, models in an ensemble do not need to have the same atoms, annotations, or bonds.
- Parameters:
- __getitem__(index)[source]#
Return a model by model ID or a sliced sub-ensemble by position.
Integer access uses model_id lookup rather than positional indexing, so ensemble[5] returns the model whose ID is 5. Slice access keeps normal positional semantics to preserve standard Python iteration and slicing behavior.
- Raises:
KeyError – If an integer model ID is requested but not present.
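The split between ID-based integer access and positional slicing can be mimicked with a small stand-in container. ModelList is a hypothetical class written for this illustration; it stores strings instead of Structure objects and is not the neurosnap implementation.

```python
from typing import List, Sequence, Union


class ModelList:
    """Hypothetical container illustrating ID-based integer access."""

    def __init__(self, models: Sequence[str], model_ids: Sequence[int]) -> None:
        if len(models) != len(model_ids):
            raise ValueError("models and model_ids must have the same length")
        self._models: List[str] = list(models)
        self._ids: List[int] = list(model_ids)

    def __getitem__(self, index: Union[int, slice]):
        if isinstance(index, slice):
            # Slices are positional and return a new container.
            return ModelList(self._models[index], self._ids[index])
        try:
            # Integers are treated as model IDs, not positions.
            return self._models[self._ids.index(index)]
        except ValueError:
            raise KeyError(index) from None

    def __len__(self) -> int:
        return len(self._models)


ensemble = ModelList(["a", "b", "c"], model_ids=[5, 6, 9])
by_id = ensemble[5]  # model whose ID is 5, which sits at position 0
sub = ensemble[1:]   # positional slice: the models with IDs 6 and 9
```

Note that ensemble[0] raises KeyError in this sketch because no model has ID 0, even though a model exists at position 0.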
- __init__(models=None, *, model_ids=None, metadata=None)[source]#
Initialize an ordered collection of independent models.
- first()[source]#
Return the first model in the ensemble.
- Return type:
- Returns:
The first Structure in stored order.
- Raises:
IndexError – If the ensemble is empty.
- renumber(start=1)[source]#
Renumber model identifiers in-place.
- Parameters:
start (int) – Starting model ID. Defaults to 1.
- to_dataframe()[source]#
Export the ensemble as a pandas dataframe with a model column.
This dataframe is derived on demand from the current models and is never cached on the ensemble.
- Return type:
DataFrame
- to_stack()[source]#
Convert the ensemble into a StructureStack.
- Raises:
ValueError – If the models are not stack-compatible.
- Return type:
- class neurosnap.structure.structure.StructureStack(models=None, *, model_ids=None, metadata=None)[source]#
Bases: object
Shared-annotation, shared-bond multi-model fast path.
All models in a stack must share the same atom ordering, per-atom annotations, and bonds. Only the coordinates vary between models.
- Parameters:
- __getitem__(index)[source]#
Return a materialized model by model ID or a sliced sub-stack by position.
Integer access uses model_id lookup rather than positional indexing, so stack[5] returns the model whose ID is 5. Slice access keeps normal positional semantics to preserve standard Python slicing behavior.
- Raises:
KeyError – If an integer model ID is requested but not present.
- __init__(models=None, *, model_ids=None, metadata=None)[source]#
Initialize an empty or pre-populated stack of compatible models.
- append(model, *, model_id=None)[source]#
Append a stack-compatible model.
- Parameters:
- Raises:
ValueError – If the candidate model is not compatible with the existing stack.
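The compatibility requirement for append can be pictured as one shared atom table plus a separate coordinate set per model. The sketch below is a simplified stand-in under stated assumptions: CoordStack is a hypothetical class, the atom table is reduced to (chain_id, res_id, atom_name) tuples, bonds are omitted, and the default model ID policy (continue from the last ID, starting at 1) is a guess rather than documented behavior.

```python
from typing import List, Optional, Tuple

Coord = Tuple[float, float, float]
# Reduced per-atom annotation row: (chain_id, res_id, atom_name).
AtomKey = Tuple[str, int, str]


class CoordStack:
    """Hypothetical stack: one shared atom table, one coordinate set per model."""

    def __init__(self) -> None:
        self.atoms: Optional[List[AtomKey]] = None
        self.coords: List[List[Coord]] = []
        self.model_ids: List[int] = []

    def append(
        self,
        atoms: List[AtomKey],
        coords: List[Coord],
        model_id: Optional[int] = None,
    ) -> None:
        if len(atoms) != len(coords):
            raise ValueError("atoms and coords must have the same length")
        if self.atoms is None:
            self.atoms = list(atoms)  # the first model defines the shared table
        elif list(atoms) != self.atoms:
            raise ValueError("model is not compatible with the existing stack")
        if model_id is None:
            # Assumption: default IDs continue from the last one, starting at 1.
            model_id = self.model_ids[-1] + 1 if self.model_ids else 1
        self.coords.append(list(coords))
        self.model_ids.append(model_id)


atoms = [("A", 1, "N"), ("A", 1, "CA")]
stack = CoordStack()
stack.append(atoms, [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)])
stack.append(atoms, [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0)])

try:
    # Same atoms in a different order: rejected as incompatible.
    stack.append(list(reversed(atoms)), [(1.5, 0.0, 0.0), (0.0, 0.0, 0.0)])
    rejected = False
except ValueError:
    rejected = True
```

Storing the atom table once is what makes this layout a fast path: only the coordinates grow with the number of models.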
- first()[source]#
Return the first model in the stack.
- Return type:
- Returns:
The first Structure in stored order.
- Raises:
IndexError – If the stack is empty.
- classmethod from_ensemble(ensemble)[source]#
Build a stack from an ensemble of compatible models.
- Return type:
- renumber(start=1)[source]#
Renumber model identifiers in-place.
- Parameters:
start (int) – Starting model ID. Defaults to 1.
- to_dataframe()[source]#
Export the stack as a pandas dataframe with a model column.
This dataframe is derived on demand from the current stack contents and is never cached on the stack.
- Return type:
DataFrame