neurosnap.chemistry package
Public chemistry package exports.
- neurosnap.chemistry.align_molecule_to_reference(mol, ref_mol)
Aligns a molecule to a reference molecule and returns the aligned copy.
The alignment is performed using RDKit’s coordinate-based molecular alignment routine. The input molecule is copied before alignment, so the original object remains unchanged.
- Parameters:
mol (Chem.Mol) – Molecule to align, with at least one conformer.
ref_mol (Chem.Mol) – Reference molecule defining the target orientation.
- Returns:
A copy of mol aligned to ref_mol.
- Return type:
Chem.Mol
- Raises:
ValueError – If either molecule is None or lacks conformers.
- neurosnap.chemistry.calculate_distance_matrix(mol)
Calculates the pairwise 3D distance matrix for a molecule.
Distances are computed from the atomic coordinates stored in the molecule’s active conformer. The returned matrix is square with one row and column per atom.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule with at least one conformer.
- Returns:
A square NumPy array of shape (n_atoms, n_atoms) containing pairwise Euclidean distances in Angstroms.
- Return type:
np.ndarray
- Raises:
ValueError – If the input molecule is None or has no conformers.
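As a rough illustration of what a pairwise distance matrix contains, the sketch below computes one from a plain list of coordinates using only the standard library. The `distance_matrix` helper and the coordinates are hypothetical and not part of neurosnap; the package function itself works on an RDKit conformer and returns a NumPy array.

```python
import math


def distance_matrix(coords):
    """Return an n x n nested list of pairwise Euclidean distances."""
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # The matrix is symmetric, so fill both triangles at once.
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical coordinates (in Angstroms) for three atoms.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
dm = distance_matrix(coords)
```

The diagonal is zero and the matrix is symmetric, which matches the square, one-row-per-atom shape described above.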
- neurosnap.chemistry.calculate_rmsd(mol_a, mol_b)
Calculates the best-fit RMSD between two molecules.
This function uses RDKit’s alignment-based RMSD calculation, meaning the molecules are optimally superimposed before the RMSD value is reported. As a result, pure rigid-body translations and rotations do not by themselves increase the returned RMSD.
- Parameters:
mol_a (Chem.Mol) – First RDKit molecule with at least one conformer.
mol_b (Chem.Mol) – Second RDKit molecule with at least one conformer.
- Returns:
Best-fit root-mean-square deviation between the two molecules.
- Return type:
float
- Raises:
ValueError – If either molecule is None or lacks conformers.
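To make the "translations do not increase RMSD" point concrete, here is a minimal standard-library sketch. It covers only the translational half of superposition (removing each structure's centroid); finding the optimal rotation, which a full best-fit RMSD also requires, is deliberately omitted. All helper names and coordinates are hypothetical.

```python
import math


def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))


def centered(coords):
    """Return a copy of coords shifted so its centroid sits at the origin."""
    c = centroid(coords)
    return [tuple(p[k] - c[k] for k in range(3)) for p in coords]


def rmsd(a, b):
    """Plain RMSD between two equally ordered coordinate lists (no fitting)."""
    if not a or len(a) != len(b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))


# Hypothetical three-atom structure and a rigidly translated copy of it.
ref = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
shifted = [(x + 10.0, y - 4.0, z + 2.5) for x, y, z in ref]

raw_rmsd = rmsd(ref, shifted)                      # dominated by the translation
fit_rmsd = rmsd(centered(ref), centered(shifted))  # translation removed
```

Without any fitting the RMSD simply reports the size of the shift; once both structures are centered it drops to zero up to floating-point error.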
- neurosnap.chemistry.canonicalize_smiles(smiles)
Converts a SMILES string into its canonical RDKit representation.
This is useful for normalizing equivalent SMILES strings into a stable text form for storage, comparison, or deduplication.
- Parameters:
smiles (str) – Input SMILES string to canonicalize.
- Returns:
Canonical SMILES string produced by RDKit.
- Return type:
str
- Raises:
ValueError – If the input string cannot be parsed as a valid SMILES.
- neurosnap.chemistry.find_LCS(mol)
Finds the largest common substructure (LCS) between a set of conformers and aligns all conformers to the LCS.
- neurosnap.chemistry.generate(input_mol, output_name='unique_conformers', write_multi=False, num_confs=1000, min_method='auto', max_atoms=500)
Generate conformers for an input molecule.
Performs the following actions in order:
1. Generate conformers using the ETKDG method.
2. Minimize the energy of all conformers and remove those below a dynamic threshold.
3. Align all conformers and create their RMSD matrix.
4. Cluster using the Butina method to remove structurally redundant conformers.
5. Return the most energetically favorable conformers in each cluster.
- Parameters:
input_mol (Any) – Input molecule; can be a path to a molecule file, a SMILES string, or an instance of rdkit.Chem.rdchem.Mol.
output_name (str) – Output location for the SDF files of passing conformers.
write_multi (bool) – If True, writes all unique conformers to a single SDF file; if False, writes each unique conformer to a separate SDF file in output_name.
num_confs (int) – Number of conformers to generate.
min_method (Optional[str]) – Method for minimization; can be "auto", "UFF", "MMFF94", "MMFF94s", or None for no minimization.
max_atoms (int) – Maximum number of atoms allowed for the input molecule.
- Return type:
DataFrame
- Returns:
A DataFrame with all conformer statistics. Note that if energy minimization is disabled or fails, the energy column will consist of None values.
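The clustering and selection steps of this pipeline can be pictured with a small standard-library sketch. It is a simplified, Butina-style greedy clustering on a precomputed RMSD matrix followed by picking the lowest-energy member of each cluster. It is not neurosnap's or RDKit's implementation; the function names, the cutoff, the tie-breaking rule, and all numbers are assumptions made for illustration.

```python
def butina_style_clusters(dist, cutoff):
    """Greedy clustering: repeatedly take the item with the most unassigned
    neighbours within `cutoff` as a cluster seed (ties go to the lowest index)."""
    n = len(dist)
    neighbours = {
        i: {j for j in range(n) if j != i and dist[i][j] <= cutoff}
        for i in range(n)
    }
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        seed = max(sorted(unassigned), key=lambda i: len(neighbours[i] & unassigned))
        members = {seed} | (neighbours[seed] & unassigned)
        clusters.append(sorted(members))
        unassigned -= members
    return clusters


def lowest_energy_per_cluster(clusters, energies):
    """Pick the conformer index with the lowest energy in each cluster."""
    return [min(cluster, key=lambda i: energies[i]) for cluster in clusters]


# Hypothetical RMSD matrix (Angstroms) for five conformers.
rmsd_matrix = [
    [0.0, 0.3, 0.4, 2.5, 2.6],
    [0.3, 0.0, 0.2, 2.4, 2.7],
    [0.4, 0.2, 0.0, 2.8, 2.9],
    [2.5, 2.4, 2.8, 0.0, 0.1],
    [2.6, 2.7, 2.9, 0.1, 0.0],
]
# Hypothetical energies (kcal/mol), indexed by conformer.
energies = [12.0, 10.5, 11.0, 9.0, 9.4]

clusters = butina_style_clusters(rmsd_matrix, cutoff=0.5)
representatives = lowest_energy_per_cluster(clusters, energies)
```

Here conformers 0 to 2 form one cluster and 3 to 4 another, so only two representatives survive out of five inputs.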
- neurosnap.chemistry.get_mol_center(mol, use_mass=False)
Computes the geometric center or center of mass of a molecule.
- Parameters:
mol (Mol) – An RDKit molecule object with 3D coordinates.
use_mass (bool, optional) – If True, computes the center of mass using atomic masses. If False, computes the simple geometric center. Defaults to False.
- Returns:
A NumPy array of shape (3,) representing the [x, y, z] center coordinates.
Returns None if the molecule has no conformers.
- Return type:
np.ndarray
- Raises:
ValueError – If no conformer is found in the molecule.
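The difference between the two center definitions is easiest to see on a toy system. The following standard-library sketch uses hypothetical helper names, coordinates, and masses; it only illustrates the arithmetic and does not reproduce how the package reads coordinates or masses from an RDKit molecule.

```python
def geometric_center(coords):
    """Unweighted mean of the atomic positions."""
    if not coords:
        raise ValueError("no coordinates given")
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))


def center_of_mass(coords, masses):
    """Mass-weighted mean of the atomic positions."""
    if not coords or len(coords) != len(masses):
        raise ValueError("coords and masses must be non-empty and equal in length")
    total = sum(masses)
    return tuple(
        sum(m * p[k] for m, p in zip(masses, coords)) / total for k in range(3)
    )


# Hypothetical two-atom system on the x axis: a light atom at 0 and a heavy atom at 4.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
masses = [1.0, 3.0]

geo = geometric_center(coords)        # midpoint between the atoms
com = center_of_mass(coords, masses)  # pulled toward the heavier atom
```

The geometric center sits at the midpoint, while the center of mass shifts toward the heavier atom.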
- neurosnap.chemistry.largest_fragment(mol)
Selects the largest fragment from a multi-component molecule.
This is typically useful for salts, mixtures, or counterion-containing inputs where only the primary chemical component should be retained.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule, which may contain multiple fragments.
- Returns:
A copy containing only the largest fragment.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
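One way to think about fragments is as connected components of the bond graph. The sketch below finds the largest component by atom count using only the standard library. The helper names and the toy bond list are hypothetical, and measuring size by atom count is an assumption; the package may rank fragments differently.

```python
def connected_components(n_atoms, bonds):
    """Group atom indices into connected components of the bond graph."""
    adjacency = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first traversal from each not-yet-visited atom.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for neighbour in adjacency[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


def largest_fragment_atoms(n_atoms, bonds):
    """Return the atom indices of the component with the most atoms."""
    if n_atoms == 0:
        raise ValueError("molecule has no atoms")
    return max(connected_components(n_atoms, bonds), key=len)


# Toy salt: heavy atoms of acetate (indices 0-3) plus an unbonded sodium ion (index 4).
bonds = [(0, 1), (1, 2), (1, 3)]
fragments = connected_components(5, bonds)
largest = largest_fragment_atoms(5, bonds)
```

The unbonded counterion forms its own single-atom component and is dropped when only the largest component is kept.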
- neurosnap.chemistry.minimize(mol, method='MMFF94', percentile=100.0)
Minimizes conformer energies (kcal/mol) using RDKit and filters out conformers based on energy percentile.
- Parameters:
mol (Mol) – RDKit mol object (rdkit.Chem.rdchem.Mol) containing the conformers you want to minimize.
method (str) – Can be either UFF, MMFF94, or MMFF94s.
percentile (float) – Filters out conformers above a given energy percentile (0 to 100). For example, 10.0 will retain conformers within the lowest 10% energy.
- Return type:
tuple
- Returns:
A tuple of the form (mol_filtered, energies):
- mol_filtered: Molecule object with filtered conformers.
- energies: Dictionary where keys are conformer IDs and values are calculated energies in kcal/mol.
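The percentile filter can be sketched on a plain dictionary of conformer energies. How the package computes its percentile cutoff is not stated here, so the nearest-rank rule below, the choice to always keep at least one conformer, and the function name are all assumptions made for illustration.

```python
import math


def filter_by_energy_percentile(energies, percentile=100.0):
    """Keep the lowest-energy conformers up to the given percentile.

    `energies` maps conformer ID -> energy in kcal/mol.
    """
    if not 0.0 <= percentile <= 100.0:
        raise ValueError("percentile must be between 0 and 100")
    if not energies:
        return {}
    ranked = sorted(energies.items(), key=lambda item: item[1])
    # Nearest-rank cutoff (an assumption): keep ceil(p% of n), at least one.
    keep = max(1, math.ceil(percentile * len(ranked) / 100.0))
    return dict(ranked[:keep])


# Hypothetical energies (kcal/mol) for five conformers.
energies = {0: 5.2, 1: 3.1, 2: 9.8, 3: 4.0, 4: 7.5}

lowest_40 = filter_by_energy_percentile(energies, percentile=40.0)
everything = filter_by_energy_percentile(energies, percentile=100.0)
```

With the default of 100.0 nothing is removed, which is consistent with the signature's default acting as "no filtering".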
- neurosnap.chemistry.move_ligand_to_center(ligand_sdf_path, receptor_pdb_path, output_sdf_path, use_mass=False)
Moves the center of a ligand in an SDF file to match the center of a receptor in a PDB file.
This function reads a ligand from an SDF file and a receptor from a PDB file, calculates their respective centers (center of mass or geometric center), and translates the ligand such that its center aligns with the receptor’s center. The modified ligand is then saved to a new SDF file.
- Parameters:
ligand_sdf_path (str) – Path to the input ligand SDF file.
receptor_pdb_path (str) – Path to the input receptor PDB file.
output_sdf_path (str) – Path where the adjusted ligand SDF will be saved.
use_mass (bool, optional) – If True, compute center of mass; otherwise use geometric center. Defaults to False.
- Returns:
Path to the output SDF file with the translated ligand.
- Return type:
str
- Raises:
ValueError – If the ligand cannot be parsed from the input SDF file.
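The core of this operation is a single translation: the shift vector is the receptor center minus the ligand center. The standard-library sketch below shows that arithmetic on plain coordinate lists using geometric centers. File parsing and writing are left out, and every name and coordinate is hypothetical.

```python
def geometric_center(coords):
    """Unweighted mean of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))


def translate(coords, vector):
    """Return a shifted copy of coords; the input list is left untouched."""
    if len(vector) != 3:
        raise ValueError("translation vector must have length 3")
    return [tuple(p[k] + vector[k] for k in range(3)) for p in coords]


def move_to_center(ligand, receptor):
    """Translate the ligand so that its center coincides with the receptor's."""
    ligand_center = geometric_center(ligand)
    receptor_center = geometric_center(receptor)
    shift = tuple(receptor_center[k] - ligand_center[k] for k in range(3))
    return translate(ligand, shift)


# Hypothetical coordinates in Angstroms.
ligand = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
receptor = [(10.0, 10.0, 10.0), (12.0, 14.0, 10.0), (14.0, 12.0, 16.0)]

moved = move_to_center(ligand, receptor)
```

Only the position changes; the ligand's internal geometry (here, the 2 Angstrom separation) is preserved because every atom receives the same shift.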
- neurosnap.chemistry.neutralize_molecule(mol)
Neutralizes formal charges in a molecule where chemically supported.
This function uses RDKit’s uncharging logic to neutralize ionized atoms when a valid neutral form can be produced. Charges that cannot be safely neutralized are preserved.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule to neutralize.
- Returns:
A copy of the molecule with reducible charges neutralized.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
- neurosnap.chemistry.remove_salts(mol)
Removes common salt fragments while retaining the main molecular component.
The function first strips recognized salts and small counterions using RDKit’s salt remover, then selects the largest remaining fragment to produce a single primary molecule.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule that may contain salts or counterions.
- Returns:
A desalted copy of the molecule.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
- neurosnap.chemistry.sdf_to_smiles(fpath)
Converts molecules in an SDF file to SMILES strings.
Reads an input SDF file and extracts SMILES strings from its molecules. Invalid or unreadable molecules are skipped, with warnings logged.
- Parameters:
fpath (str) – Path to the input SDF file.
- Returns:
A list of SMILES strings corresponding to valid molecules in the SDF file.
- Return type:
List[str]
- Raises:
FileNotFoundError – If the SDF file cannot be found.
IOError – If the file cannot be read.
- neurosnap.chemistry.smiles_to_sdf(smiles, output_path)
Converts a SMILES string to an SDF file. Existing output is overwritten.
NOTE: This function does the bare minimum in terms of generating the SDF molecule. The neurosnap.chemistry.conformers module should be used in most cases.
- neurosnap.chemistry.standardize_molecule(mol)
Standardizes a molecule using RDKit’s cleanup workflow.
The standardization process applies RDKit’s built-in molecular cleanup rules, which can normalize representations such as functional groups, charges, and related valence patterns into a more consistent form.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule to standardize.
- Returns:
A standardized copy of the input molecule.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
- neurosnap.chemistry.translate_molecule(mol, vector)
Translates all atomic coordinates in a molecule by a vector.
The input molecule is not modified in place. Instead, a copy is made and every atom position in the first conformer is shifted by the provided [x, y, z] vector.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule with at least one conformer.
vector – Translation vector of length 3 containing the x, y, and z shifts.
- Returns:
A translated copy of the input molecule.
- Return type:
Chem.Mol
- Raises:
ValueError – If the molecule is None, has no conformers, or the translation vector is not length 3.
- neurosnap.chemistry.validate_smiles(smiles)
Validates a SMILES (Simplified Molecular Input Line Entry System) string.