neurosnap.chemistry package#

Public chemistry package exports.

neurosnap.chemistry.align_molecule_to_reference(mol, ref_mol)[source]#

Aligns a molecule to a reference molecule and returns the aligned copy.

The alignment is performed using RDKit’s coordinate-based molecular alignment routine. The input molecule is copied before alignment, so the original object remains unchanged.

Parameters:
  • mol (Chem.Mol) – Molecule to align, with at least one conformer.

  • ref_mol (Chem.Mol) – Reference molecule defining the target orientation.

Returns:

A copy of mol aligned to ref_mol.

Return type:

Chem.Mol

Raises:

ValueError – If either molecule is None or lacks conformers.

neurosnap.chemistry.calculate_distance_matrix(mol)[source]#

Calculates the pairwise 3D distance matrix for a molecule.

Distances are computed from the atomic coordinates stored in the molecule’s active conformer. The returned matrix is square with one row and column per atom.

Parameters:

mol (Chem.Mol) – Input RDKit molecule with at least one conformer.

Returns:

A square NumPy array of shape (n_atoms, n_atoms) containing pairwise Euclidean distances in Angstroms.

Return type:

np.ndarray

Raises:

ValueError – If the input molecule is None or has no conformers.
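The shape and symmetry of such a matrix can be illustrated without RDKit. The sketch below is not the library's implementation; it assumes the coordinates have already been extracted into a plain list of (x, y, z) tuples, and the helper name distance_matrix is hypothetical.

```python
import math

def distance_matrix(coords):
    """Return an n x n nested list of Euclidean distances between 3D points."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            matrix[i][j] = d
            matrix[j][i] = d  # distances are symmetric
    return matrix

# Three hypothetical atom positions, in Angstroms.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
dm = distance_matrix(coords)
```

The diagonal is zero because each atom is at distance zero from itself, and the matrix is symmetric because the distance from atom i to atom j equals the distance from j to i.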

neurosnap.chemistry.calculate_rmsd(mol_a, mol_b)[source]#

Calculates the best-fit RMSD between two molecules.

This function uses RDKit’s alignment-based RMSD calculation, meaning the molecules are optimally superimposed before the RMSD value is reported. As a result, pure rigid-body translations and rotations do not by themselves increase the returned RMSD.

Parameters:
  • mol_a (Chem.Mol) – First RDKit molecule with at least one conformer.

  • mol_b (Chem.Mol) – Second RDKit molecule with at least one conformer.

Returns:

Best-fit root-mean-square deviation between the two molecules.

Return type:

float

Raises:

ValueError – If either molecule is None or lacks conformers.
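To see why rigid-body motion does not change a best-fit RMSD, the following sketch works through a simplified two-dimensional case, where the optimal rotation angle has a closed form. It is an illustration only: RDKit operates on 3D coordinates with its own alignment code, and the helper names here (centered, best_fit_rmsd_2d, rigid_move) are hypothetical.

```python
import math

def centered(points):
    """Shift points so that their centroid sits at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(x - cx, y - cy) for x, y in points]

def best_fit_rmsd_2d(a, b):
    """RMSD between paired 2D points after optimal translation and rotation."""
    a_c, b_c = centered(a), centered(b)
    # Closed-form least-squares rotation angle for the 2D case.
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a_c, b_c))
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a_c, b_c))
    theta = math.atan2(cross, dot)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    squared = 0.0
    for (ax, ay), (bx, by) in zip(a_c, b_c):
        rx = ax * cos_t - ay * sin_t
        ry = ax * sin_t + ay * cos_t
        squared += (rx - bx) ** 2 + (ry - by) ** 2
    return math.sqrt(squared / len(a_c))

def rigid_move(points, angle, shift):
    """Rotate points by angle (radians) about the origin, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s + shift[0], x * s + y * c + shift[1]) for x, y in points]

shape = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
moved = rigid_move(shape, 0.7, (5.0, -3.0))       # same shape, new pose
stretched = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]  # genuinely different shape

rmsd_moved = best_fit_rmsd_2d(shape, moved)
rmsd_stretched = best_fit_rmsd_2d(shape, stretched)
```

Here rmsd_moved is zero up to floating-point error, while rmsd_stretched is clearly positive because the shape itself changed.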

neurosnap.chemistry.canonicalize_smiles(smiles)[source]#

Converts a SMILES string into its canonical RDKit representation.

This is useful for normalizing equivalent SMILES strings into a stable text form for storage, comparison, or deduplication.

Parameters:

smiles (str) – Input SMILES string to canonicalize.

Returns:

Canonical SMILES string produced by RDKit.

Return type:

str

Raises:

ValueError – If the input string cannot be parsed as a valid SMILES.

neurosnap.chemistry.find_LCS(mol)[source]#

Finds the largest common substructure (LCS) across a set of conformers and aligns all conformers to it.

Parameters:

mol (Mol) – Input RDKit molecule object; conformers must already be present.

Return type:

Mol

Returns:

Resultant molecule object with all conformers aligned to the LCS

Raises:

Exception – If no LCS is detected.

neurosnap.chemistry.generate(input_mol, output_name='unique_conformers', write_multi=False, num_confs=1000, min_method='auto', max_atoms=500)[source]#

Generate conformers for an input molecule.

Performs the following actions in order:

  1. Generate conformers using the ETKDG method.
  2. Minimize the energy of all conformers and remove those below a dynamic threshold.
  3. Align all conformers and create an RMSD matrix.
  4. Cluster using the Butina method to remove structurally redundant conformers.
  5. Return the most energetically favorable conformers in each cluster.

Parameters:
  • input_mol (Any) – Input molecule can be a path to a molecule file, a SMILES string, or an instance of rdkit.Chem.rdchem.Mol

  • output_name (str) – Output name used when writing SDF files of passing conformers

  • write_multi (bool) – If True, writes all unique conformers to a single SDF file; if False, writes each unique conformer to a separate SDF file in output_name

  • num_confs (int) – Number of conformers to generate

  • min_method (Optional[str]) – Method for minimization, can be either “auto”, “UFF”, “MMFF94”, “MMFF94s”, or None for no minimization

  • max_atoms (int) – Maximum number of atoms allowed for the input molecule

Return type:

DataFrame

Returns:

A dataframe with all conformer statistics. Note that if energy minimization is disabled or fails, the energy column consists of None values.
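The clustering and selection steps can be sketched in plain Python. This is a simplified Butina-style procedure written for illustration, not the code used by generate; the function names, the cutoff value, and the example distances and energies are all assumptions.

```python
def butina_clusters(dist, cutoff):
    """Group item indices with a simplified Butina procedure."""
    n = len(dist)
    # Neighbors of each item: all other items within the distance cutoff.
    neighbors = [
        [j for j in range(n) if j != i and dist[i][j] <= cutoff]
        for i in range(n)
    ]
    # Visit items with the most neighbors first; ties keep index order.
    order = sorted(range(n), key=lambda i: -len(neighbors[i]))
    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue
        # The centroid plus any of its neighbors not yet claimed.
        members = [i] + [j for j in neighbors[i] if j not in assigned]
        assigned.update(members)
        clusters.append(members)
    return clusters

def lowest_energy_per_cluster(clusters, energies):
    """Pick the member with the lowest energy from each cluster."""
    return [min(members, key=lambda i: energies[i]) for members in clusters]

# Hypothetical pairwise RMSD values for four conformers.
rmsd = [
    [0.0, 0.2, 2.0, 2.0],
    [0.2, 0.0, 2.0, 2.0],
    [2.0, 2.0, 0.0, 0.3],
    [2.0, 2.0, 0.3, 0.0],
]
energies = [5.0, 3.0, 1.0, 4.0]  # hypothetical energies, kcal/mol

clusters = butina_clusters(rmsd, cutoff=0.5)
representatives = lowest_energy_per_cluster(clusters, energies)
```

With these inputs, conformers 0 and 1 fall into one cluster and conformers 2 and 3 into another, and the lower-energy member of each pair (1 and 2) is kept.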

neurosnap.chemistry.get_mol_center(mol, use_mass=False)[source]#

Computes the geometric center or center of mass of a molecule.

Parameters:
  • mol (Mol) – An RDKit molecule object with 3D coordinates.

  • use_mass (bool, optional) – If True, computes the center of mass using atomic masses. If False, computes the simple geometric center. Defaults to False.

Returns:

A NumPy array of shape (3,) representing the [x, y, z] center coordinates.

Returns None if the molecule has no conformers.

Return type:

np.ndarray

Raises:

ValueError – If no conformer is found in the molecule.
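The difference between the two center definitions is easiest to see on a small example. The sketch below uses plain tuples instead of an RDKit molecule and is not the library's code; the function names and the masses are illustrative assumptions.

```python
def geometric_center(coords):
    """Unweighted mean of the coordinates."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))

def center_of_mass(coords, masses):
    """Mass-weighted mean of the coordinates."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total
        for k in range(3)
    )

# Two hypothetical atoms on the x axis; the second is three times heavier.
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
masses = [1.0, 3.0]

geo = geometric_center(coords)        # midpoint between the atoms
com = center_of_mass(coords, masses)  # pulled toward the heavier atom
```

The geometric center lies halfway between the atoms, while the center of mass shifts toward the heavier one.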

neurosnap.chemistry.largest_fragment(mol)[source]#

Selects the largest fragment from a multi-component molecule.

This is typically useful for salts, mixtures, or counterion-containing inputs where only the primary chemical component should be retained.

Parameters:

mol (Chem.Mol) – Input RDKit molecule, which may contain multiple fragments.

Returns:

A copy containing only the largest fragment.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.

neurosnap.chemistry.minimize(mol, method='MMFF94', percentile=100.0)[source]#

Minimizes conformer energy (kcal/mol) using RDKit and filters out conformers based on energy percentile.

Parameters:
  • mol (Mol) – RDKit mol object (rdkit.Chem.rdchem.Mol) containing the conformers you want to minimize.

  • method (str) – Can be either UFF, MMFF94, or MMFF94s.

  • percentile (float) – Filters out conformers above a given energy percentile (0 to 100). For example, 10.0 will retain conformers within the lowest 10% energy.

Return type:

Tuple[float, Dict[int, float]]

Returns:

A tuple of the form (mol_filtered, energies):

  • mol_filtered: Molecule object with filtered conformers.

  • energies: Dictionary where keys are conformer IDs and values are calculated energies in kcal/mol.
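One plausible reading of the percentile filter is sketched below on a dictionary shaped like the documented energies value. This is an assumption about the behavior, not the library's code: the cutoff is computed with linear interpolation between sorted values, and the helper names and energies are hypothetical.

```python
import math

def percentile(values, pct):
    """Percentile with linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (pct / 100.0) * (len(ordered) - 1)
    lo = math.floor(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (rank - lo) * (ordered[hi] - ordered[lo])

def filter_by_energy(energies, pct):
    """Keep conformer IDs whose energy is at or below the percentile cutoff."""
    cutoff = percentile(energies.values(), pct)
    return {cid: e for cid, e in energies.items() if e <= cutoff}

# Hypothetical conformer ID -> energy (kcal/mol) mapping.
energies = {0: 12.5, 1: 10.0, 2: 15.0, 3: 11.0, 4: 20.0}

kept_half = filter_by_energy(energies, 50.0)   # lower-energy half
kept_all = filter_by_energy(energies, 100.0)   # default: nothing removed
```

Under this reading, the default percentile of 100.0 keeps every conformer, and smaller values keep progressively fewer low-energy conformers.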

neurosnap.chemistry.move_ligand_to_center(ligand_sdf_path, receptor_pdb_path, output_sdf_path, use_mass=False)[source]#

Moves the center of a ligand in an SDF file to match the center of a receptor in a PDB file.

This function reads a ligand from an SDF file and a receptor from a PDB file, calculates their respective centers (center of mass or geometric center), and translates the ligand such that its center aligns with the receptor’s center. The modified ligand is then saved to a new SDF file.

Parameters:
  • ligand_sdf_path (str) – Path to the input ligand SDF file.

  • receptor_pdb_path (str) – Path to the input receptor PDB file.

  • output_sdf_path (str) – Path where the adjusted ligand SDF will be saved.

  • use_mass (bool, optional) – If True, compute center of mass; otherwise use geometric center. Defaults to False.

Returns:

Path to the output SDF file with the translated ligand.

Return type:

str

Raises:

ValueError – If the ligand cannot be parsed from the input SDF file.
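The translation described above amounts to shifting the ligand by the difference between the two centers. A minimal sketch using geometric centers and plain coordinate tuples follows; it leaves out all file I/O, and the function names and coordinates are hypothetical rather than taken from the library.

```python
def centroid(coords):
    """Geometric center of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))

def shift_to_center(ligand_coords, receptor_coords):
    """Translate the ligand so that its centroid matches the receptor's."""
    lig_c = centroid(ligand_coords)
    rec_c = centroid(receptor_coords)
    delta = tuple(r - l for r, l in zip(rec_c, lig_c))
    return [tuple(x + d for x, d in zip(p, delta)) for p in ligand_coords]

# Hypothetical coordinates, in Angstroms.
ligand = [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
receptor = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]

moved_ligand = shift_to_center(ligand, receptor)
```

Only the ligand's position changes; the distances between its own atoms are unaffected by the shift.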

neurosnap.chemistry.neutralize_molecule(mol)[source]#

Neutralizes formal charges in a molecule where chemically supported.

This function uses RDKit’s uncharging logic to neutralize ionized atoms when a valid neutral form can be produced. Charges that cannot be safely neutralized are preserved.

Parameters:

mol (Chem.Mol) – Input RDKit molecule to neutralize.

Returns:

A copy of the molecule with reducible charges neutralized.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.

neurosnap.chemistry.remove_salts(mol)[source]#

Removes common salt fragments while retaining the main molecular component.

The function first strips recognized salts and small counterions using RDKit’s salt remover, then selects the largest remaining fragment to produce a single primary molecule.

Parameters:

mol (Chem.Mol) – Input RDKit molecule that may contain salts or counterions.

Returns:

A desalted copy of the molecule.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.

neurosnap.chemistry.sdf_to_smiles(fpath)[source]#

Converts molecules in an SDF file to SMILES strings.

Reads an input SDF file and extracts SMILES strings from its molecules. Invalid or unreadable molecules are skipped, with warnings logged.

Parameters:

fpath (str) – Path to the input SDF file.

Returns:

A list of SMILES strings corresponding to valid molecules in the SDF file.

Return type:

List[str]

neurosnap.chemistry.smiles_to_sdf(smiles, output_path)[source]#

Converts a SMILES string to an SDF file. Overwrites any existing file at the output path.

NOTE: This function does the bare minimum in terms of generating the SDF molecule. The neurosnap.chemistry.conformers module should be used in most cases.

Parameters:
  • smiles (str) – SMILES string to parse and convert

  • output_path (str) – Path to output SDF file, should end with .sdf

Return type:

None

neurosnap.chemistry.standardize_molecule(mol)[source]#

Standardizes a molecule using RDKit’s cleanup workflow.

The standardization process applies RDKit’s built-in molecular cleanup rules, which can normalize representations such as functional groups, charges, and related valence patterns into a more consistent form.

Parameters:

mol (Chem.Mol) – Input RDKit molecule to standardize.

Returns:

A standardized copy of the input molecule.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.

neurosnap.chemistry.translate_molecule(mol, vector)[source]#

Translates all atomic coordinates in a molecule by a vector.

The input molecule is not modified in place. Instead, a copy is made and every atom position in the first conformer is shifted by the provided [x, y, z] vector.

Parameters:
  • mol (Chem.Mol) – Input RDKit molecule with at least one conformer.

  • vector – Translation vector of length 3 containing the x, y, and z shifts.

Returns:

A translated copy of the input molecule.

Return type:

Chem.Mol

Raises:

ValueError – If the molecule is None, has no conformers, or the translation vector is not length 3.
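The copy-then-shift behavior and the length check on the vector can be mirrored on plain coordinate tuples. The sketch below is a stand-in for illustration, not the library's implementation; the function name translate and the example values are assumptions.

```python
def translate(coords, vector):
    """Return a new list of coordinates shifted by a length-3 vector."""
    if len(vector) != 3:
        raise ValueError("translation vector must have length 3")
    dx, dy, dz = vector
    # Build a new list so the caller's coordinates are left untouched.
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]

coords = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
moved = translate(coords, (1.0, -2.0, 0.5))
```

Passing a vector with fewer or more than three components raises ValueError in this sketch, in line with the documented error condition.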

neurosnap.chemistry.validate_smiles(smiles)[source]#

Validates a SMILES (Simplified Molecular Input Line Entry System) string.

Parameters:

smiles (str) – The SMILES string to validate.

Returns:

True if the SMILES string is valid, False otherwise.

Return type:

bool

Raises:

Exception – Logs any exception encountered during validation.

Submodules#