neurosnap.chemistry.smiles module
Utilities for SMILES and SDF conversion.
- neurosnap.chemistry.smiles.canonicalize_smiles(smiles)
Converts a SMILES string into its canonical RDKit representation.
This is useful for normalizing equivalent SMILES strings into a stable text form for storage, comparison, or deduplication.
- Parameters:
smiles (str) – Input SMILES string to canonicalize.
- Returns:
Canonical SMILES string produced by RDKit.
- Return type:
str
- Raises:
ValueError – If the input string cannot be parsed as a valid SMILES.
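As a rough illustration of the behaviour described above, the following sketch canonicalizes SMILES with RDKit directly. The helper name `canonicalize_sketch` is hypothetical and is not part of neurosnap; the actual implementation may differ in its options and error messages.

```python
from rdkit import Chem


def canonicalize_sketch(smiles: str) -> str:
    """Hypothetical stand-in showing canonicalization with plain RDKit."""
    mol = Chem.MolFromSmiles(smiles)  # returns None when parsing fails
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)  # RDKit emits canonical SMILES by default


# Three spellings of ethanol collapse to a single canonical string,
# which is what makes the result usable as a deduplication key.
variants = ["OCC", "C(C)O", "CCO"]
canonical = {canonicalize_sketch(s) for s in variants}
```

Collecting the results in a set shows the deduplication use case: equivalent inputs map to one entry.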
- neurosnap.chemistry.smiles.largest_fragment(mol)
Selects the largest fragment from a multi-component molecule.
This is typically useful for salts, mixtures, or counterion-containing inputs where only the primary chemical component should be retained.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule, which may contain multiple fragments.
- Returns:
A copy containing only the largest fragment.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
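One way to picture fragment selection is to split the molecule with RDKit and keep the fragment with the most heavy atoms. This is a minimal sketch under that assumption: `largest_fragment_sketch` is a hypothetical name, and the size criterion neurosnap actually uses (atom count, heavy atoms, or mass) is not stated in this reference.

```python
from rdkit import Chem


def largest_fragment_sketch(mol: Chem.Mol) -> Chem.Mol:
    """Hypothetical stand-in: keep the fragment with the most heavy atoms."""
    if mol is None:
        raise ValueError("Input molecule is None.")
    # asMols=True returns each disconnected component as its own Mol copy.
    frags = Chem.GetMolFrags(mol, asMols=True)
    if not frags:  # empty molecule: nothing to choose from
        return Chem.Mol(mol)
    return max(frags, key=lambda frag: frag.GetNumHeavyAtoms())


# Sodium acetate: the acetate anion is kept, the sodium counterion is dropped.
salt = Chem.MolFromSmiles("CC(=O)[O-].[Na+]")
parent = largest_fragment_sketch(salt)
```

Note that the retained fragment keeps its formal charge; selecting a fragment does not neutralize it.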
- neurosnap.chemistry.smiles.neutralize_molecule(mol)
Neutralizes formal charges in a molecule where chemically supported.
This function uses RDKit’s uncharging logic to neutralize ionized atoms when a valid neutral form can be produced. Charges that cannot be safely neutralized are preserved.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule to neutralize.
- Returns:
A copy of the molecule with reducible charges neutralized.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
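The uncharging behaviour can be sketched with RDKit's `rdMolStandardize.Uncharger`, which is assumed here to be the "uncharging logic" the description refers to. The helper name `neutralize_sketch` is hypothetical. The example contrasts a charge that can be removed by adding a proton with one that cannot be removed without breaking a bond.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def neutralize_sketch(mol: Chem.Mol) -> Chem.Mol:
    """Hypothetical stand-in built on RDKit's Uncharger."""
    if mol is None:
        raise ValueError("Input molecule is None.")
    # uncharge() returns a new molecule and leaves the input untouched.
    return rdMolStandardize.Uncharger().uncharge(mol)


# A carboxylate can be protonated to give the neutral acid.
acetate = Chem.MolFromSmiles("CC(=O)[O-]")
neutral = neutralize_sketch(acetate)

# A quaternary ammonium has no hydrogen to remove, so its charge is preserved.
quat = Chem.MolFromSmiles("C[N+](C)(C)C")
still_charged = neutralize_sketch(quat)
```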
- neurosnap.chemistry.smiles.remove_salts(mol)
Removes common salt fragments while retaining the main molecular component.
The function first strips recognized salts and small counterions using RDKit’s salt remover, then selects the largest remaining fragment to produce a single primary molecule.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule that may contain salts or counterions.
- Returns:
A desalted copy of the molecule.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
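The two-step procedure described above (strip known salts, then keep the largest remaining fragment) might look roughly like this with RDKit's `SaltRemover` and its default salt definitions. The name `remove_salts_sketch` is hypothetical, and the heavy-atom tie-breaking is an assumption rather than documented neurosnap behaviour.

```python
from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover


def remove_salts_sketch(mol: Chem.Mol) -> Chem.Mol:
    """Hypothetical stand-in: strip default salts, then keep the largest fragment."""
    if mol is None:
        raise ValueError("Input molecule is None.")
    # Step 1: remove fragments matching RDKit's default salt definitions.
    stripped = SaltRemover().StripMol(mol)
    # Step 2: of whatever remains, keep the fragment with the most heavy atoms.
    frags = Chem.GetMolFrags(stripped, asMols=True)
    if not frags:  # everything was recognized as salt
        return stripped
    return max(frags, key=lambda frag: frag.GetNumHeavyAtoms())


# Trimethylamine hydrochloride, written as two disconnected components.
hcl_salt = Chem.MolFromSmiles("CN(C)C.Cl")
parent = remove_salts_sketch(hcl_salt)
```

The second step matters for inputs whose counterion is not in the salt definitions: the stripper leaves it in place, and fragment selection then discards it.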
- neurosnap.chemistry.smiles.sdf_to_smiles(fpath)
Converts molecules in an SDF file to SMILES strings.
Reads an input SDF file and extracts SMILES strings from its molecules. Invalid or unreadable molecules are skipped, with warnings logged.
- Parameters:
fpath (str) – Path to the input SDF file.
- Returns:
A list of SMILES strings corresponding to valid molecules in the SDF file.
- Return type:
List[str]
- Raises:
FileNotFoundError – If the SDF file cannot be found.
IOError – If the file cannot be read.
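A minimal sketch of the read-and-skip behaviour, assuming RDKit's `SDMolSupplier` does the parsing. The helper name `sdf_to_smiles_sketch`, the logger, and the warning text are all hypothetical. The demo writes a small temporary SDF file first so that the example is self-contained.

```python
import logging
import os
import tempfile
from typing import List

from rdkit import Chem

logger = logging.getLogger(__name__)


def sdf_to_smiles_sketch(fpath: str) -> List[str]:
    """Hypothetical stand-in: one SMILES per readable SDF record."""
    if not os.path.isfile(fpath):
        raise FileNotFoundError(f"SDF file not found: {fpath}")
    smiles = []
    # SDMolSupplier yields None for records it cannot parse.
    for index, mol in enumerate(Chem.SDMolSupplier(fpath)):
        if mol is None:
            logger.warning("Skipping unreadable record %d in %s", index, fpath)
            continue
        smiles.append(Chem.MolToSmiles(mol))
    return smiles


# Demo: write two molecules to a temporary SDF file, then read them back.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.sdf")
    writer = Chem.SDWriter(demo_path)
    for smi in ("CCO", "c1ccccc1"):
        writer.write(Chem.MolFromSmiles(smi))
    writer.close()
    result = sdf_to_smiles_sketch(demo_path)
```

Because bad records are skipped, the returned list can be shorter than the number of records in the file, so positions in the list do not necessarily match record indices.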
- neurosnap.chemistry.smiles.smiles_to_sdf(smiles, output_path)
Converts a SMILES string to an SDF file. Any existing file at output_path is overwritten.
NOTE: This function does only the bare minimum to generate the SDF molecule. In most cases, the neurosnap.chemistry.conformers module should be used instead.
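For orientation, a bare-minimum conversion could be sketched with RDKit's `SDWriter` as below. This is an assumption about what "bare minimum" means: `smiles_to_sdf_sketch` is a hypothetical name, and the sketch generates no coordinates or conformers. The demo also shows the overwrite behaviour by writing to the same path twice.

```python
import os
import tempfile

from rdkit import Chem


def smiles_to_sdf_sketch(smiles: str, output_path: str) -> None:
    """Hypothetical stand-in: write one molecule to an SDF file, no conformers."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # SDWriter opens the path for writing, replacing any existing file.
    writer = Chem.SDWriter(output_path)
    writer.write(mol)
    writer.close()


with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "out.sdf")
    smiles_to_sdf_sketch("CCO", out_path)
    smiles_to_sdf_sketch("CCN", out_path)  # second call replaces the first file
    records = [m for m in Chem.SDMolSupplier(out_path) if m is not None]
    n_records = len(records)
    written = Chem.MolToSmiles(records[0])
```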
- neurosnap.chemistry.smiles.standardize_molecule(mol)
Standardizes a molecule using RDKit’s cleanup workflow.
The standardization process applies RDKit’s built-in molecular cleanup rules, which can normalize representations such as functional groups, charges, and related valence patterns into a more consistent form.
- Parameters:
mol (Chem.Mol) – Input RDKit molecule to standardize.
- Returns:
A standardized copy of the input molecule.
- Return type:
Chem.Mol
- Raises:
ValueError – If the input molecule is None.
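Assuming "RDKit's cleanup workflow" refers to `rdMolStandardize.Cleanup`, the effect can be sketched as follows. The helper name `standardize_sketch` is hypothetical. The example uses sodium benzoate drawn with a covalent Na-O bond, which the cleanup step is expected to rewrite as a separated ion pair.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def standardize_sketch(mol: Chem.Mol) -> Chem.Mol:
    """Hypothetical stand-in built on rdMolStandardize.Cleanup."""
    if mol is None:
        raise ValueError("Input molecule is None.")
    # Cleanup returns a new molecule; the input is left unchanged.
    return rdMolStandardize.Cleanup(mol)


# Sodium benzoate written with a covalent bond between Na and O.
raw = Chem.MolFromSmiles("[Na]OC(=O)c1ccccc1")
clean = standardize_sketch(raw)

# Count disconnected components before and after standardization.
n_before = len(Chem.GetMolFrags(raw))
n_after = len(Chem.GetMolFrags(clean))
atom_charges = sorted(atom.GetFormalCharge() for atom in clean.GetAtoms())
```

Standardization here changes the representation, not the composition: the counterion is still present and can be removed afterwards with a desalting step if only the parent structure is wanted.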