neurosnap.chemistry.smiles module

Utilities for SMILES and SDF conversion.

neurosnap.chemistry.smiles.canonicalize_smiles(smiles)

Converts a SMILES string into its canonical RDKit representation.

This is useful for normalizing equivalent SMILES strings into a stable text form for storage, comparison, or deduplication.

Parameters:

smiles (str) – Input SMILES string to canonicalize.

Returns:

Canonical SMILES string produced by RDKit.

Return type:

str

Raises:

ValueError – If the input string cannot be parsed as a valid SMILES.
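
A minimal sketch of this behaviour using RDKit directly. The helper name canonicalize_sketch is hypothetical and not part of neurosnap, and the real implementation may pass different options to RDKit.

```python
from rdkit import Chem


def canonicalize_sketch(smiles: str) -> str:
    """Hypothetical stand-in for canonicalize_smiles, built on plain RDKit."""
    mol = Chem.MolFromSmiles(smiles)  # returns None when parsing fails
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)  # RDKit writes canonical SMILES by default


# Two different spellings of ethanol collapse to the same string.
a = canonicalize_sketch("OCC")
b = canonicalize_sketch("C(O)C")
print(a == b)
```

Because equivalent inputs map to one string, the result can be used as a dictionary key or a deduplication key.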

neurosnap.chemistry.smiles.largest_fragment(mol)

Selects the largest fragment from a multi-component molecule.

This is typically useful for salts, mixtures, or counterion-containing inputs where only the primary chemical component should be retained.

Parameters:

mol (Chem.Mol) – Input RDKit molecule, which may contain multiple fragments.

Returns:

A copy containing only the largest fragment.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.
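
A minimal sketch of fragment selection with RDKit. It assumes "largest" means the most heavy atoms, which may not match the criterion neurosnap uses; largest_fragment_sketch is a hypothetical name.

```python
from rdkit import Chem


def largest_fragment_sketch(mol: Chem.Mol) -> Chem.Mol:
    """Hypothetical stand-in: keep the fragment with the most heavy atoms."""
    if mol is None:
        raise ValueError("Input molecule is None")
    # asMols=True returns each disconnected component as a separate Mol.
    frags = Chem.GetMolFrags(mol, asMols=True)
    return max(frags, key=lambda frag: frag.GetNumHeavyAtoms())


salt = Chem.MolFromSmiles("CC(=O)[O-].[Na+]")  # sodium acetate
main = largest_fragment_sketch(salt)
print(Chem.MolToSmiles(main))  # the acetate component, without sodium
```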

neurosnap.chemistry.smiles.neutralize_molecule(mol)

Neutralizes formal charges in a molecule where chemically supported.

This function uses RDKit’s uncharging logic to neutralize ionized atoms when a valid neutral form can be produced. Charges that cannot be safely neutralized are preserved.

Parameters:

mol (Chem.Mol) – Input RDKit molecule to neutralize.

Returns:

A copy of the molecule with reducible charges neutralized.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.
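
A minimal sketch of RDKit's uncharging logic, assuming it is reached through rdMolStandardize.Uncharger (the exact call neurosnap makes is not stated here). It shows one charge that can be neutralized and one that cannot.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

uncharger = rdMolStandardize.Uncharger()

# A carboxylate can be protonated, so it is neutralized.
acetate = Chem.MolFromSmiles("CC(=O)[O-]")
neutral = uncharger.uncharge(acetate)  # returns a new Mol
print(Chem.MolToSmiles(neutral), Chem.GetFormalCharge(neutral))

# A quaternary ammonium has no proton to remove, so its charge is preserved.
quat = Chem.MolFromSmiles("C[N+](C)(C)C")
kept = uncharger.uncharge(quat)
print(Chem.MolToSmiles(kept), Chem.GetFormalCharge(kept))
```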

neurosnap.chemistry.smiles.remove_salts(mol)

Removes common salt fragments while retaining the main molecular component.

The function first strips recognized salts and small counterions using RDKit’s salt remover, then selects the largest remaining fragment to produce a single primary molecule.

Parameters:

mol (Chem.Mol) – Input RDKit molecule that may contain salts or counterions.

Returns:

A desalted copy of the molecule.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.
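
The two steps described above can be sketched as follows. This assumes RDKit's SaltRemover with its default salt definitions and a heavy-atom count for picking the largest fragment; remove_salts_sketch is a hypothetical name, not the neurosnap implementation.

```python
from rdkit import Chem
from rdkit.Chem.SaltRemover import SaltRemover


def remove_salts_sketch(mol: Chem.Mol) -> Chem.Mol:
    """Hypothetical stand-in: strip known salts, then keep the largest fragment."""
    if mol is None:
        raise ValueError("Input molecule is None")
    stripped = SaltRemover().StripMol(mol)  # uses RDKit's default salt list
    frags = Chem.GetMolFrags(stripped, asMols=True)
    if not frags:  # everything was recognized as a salt
        return stripped
    return max(frags, key=lambda frag: frag.GetNumHeavyAtoms())


hcl_salt = Chem.MolFromSmiles("CCN.Cl")  # ethylamine with a chloride component
desalted = remove_salts_sketch(hcl_salt)
print(Chem.MolToSmiles(desalted))
```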

neurosnap.chemistry.smiles.sdf_to_smiles(fpath)

Converts molecules in an SDF file to SMILES strings.

Reads an input SDF file and extracts SMILES strings from its molecules. Invalid or unreadable molecules are skipped, with warnings logged.

Parameters:

fpath (str) – Path to the input SDF file.

Returns:

A list of SMILES strings corresponding to valid molecules in the SDF file.

Return type:

List[str]
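
A minimal sketch of the read-and-skip loop, assuming RDKit's SDMolSupplier and the standard logging module. The helper sdf_to_smiles_sketch and the demo file name are hypothetical.

```python
import logging
import os
import tempfile
from typing import List

from rdkit import Chem

logger = logging.getLogger(__name__)


def sdf_to_smiles_sketch(fpath: str) -> List[str]:
    """Hypothetical stand-in: collect SMILES for every readable SDF record."""
    smiles_list = []
    for idx, mol in enumerate(Chem.SDMolSupplier(fpath)):
        if mol is None:  # SDMolSupplier yields None for unreadable records
            logger.warning("Skipping unreadable molecule at index %d", idx)
            continue
        smiles_list.append(Chem.MolToSmiles(mol))
    return smiles_list


# Write a small two-molecule SDF file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.sdf")
    writer = Chem.SDWriter(path)
    for smi in ("CCO", "c1ccccc1"):
        writer.write(Chem.MolFromSmiles(smi))
    writer.close()
    result = sdf_to_smiles_sketch(path)

print(result)
```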

neurosnap.chemistry.smiles.smiles_to_sdf(smiles, output_path)

Converts a SMILES string to an SDF file. Any existing file at output_path is overwritten.

NOTE: This function generates only a minimal SDF molecule. In most cases, use the neurosnap.chemistry.conformers module instead.

Parameters:
  • smiles (str) – SMILES string to parse and convert.

  • output_path (str) – Path to the output SDF file; should end with .sdf.

Return type:

None
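
A minimal sketch of a bare SMILES-to-SDF conversion with RDKit. It assumes that "bare minimum" means no added hydrogens and no generated 3D coordinates, and that invalid input raises ValueError; neither is confirmed here. smiles_to_sdf_sketch is a hypothetical name.

```python
import os
import tempfile

from rdkit import Chem


def smiles_to_sdf_sketch(smiles: str, output_path: str) -> None:
    """Hypothetical stand-in: write one molecule to an SDF file as-is."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # assumption: reject unparsable input
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    writer = Chem.SDWriter(output_path)  # replaces any existing file
    writer.write(mol)
    writer.close()


with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "ethanol.sdf")
    smiles_to_sdf_sketch("CCO", out)
    with open(out) as handle:
        text = handle.read()

print("$$$$" in text)  # SDF records end with a $$$$ delimiter line
```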

neurosnap.chemistry.smiles.standardize_molecule(mol)

Standardizes a molecule using RDKit’s cleanup workflow.

The standardization process applies RDKit’s built-in molecular cleanup rules, which can normalize representations such as functional groups, charges, and related valence patterns into a more consistent form.

Parameters:

mol (Chem.Mol) – Input RDKit molecule to standardize.

Returns:

A standardized copy of the input molecule.

Return type:

Chem.Mol

Raises:

ValueError – If the input molecule is None.
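
A minimal sketch of RDKit's cleanup workflow, assuming it corresponds to rdMolStandardize.Cleanup (the exact call neurosnap makes is not stated here). The input draws sodium benzoate with a covalent Na-O bond, which cleanup separates into ions.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

raw = Chem.MolFromSmiles("[Na]OC(=O)c1ccccc1")
clean = rdMolStandardize.Cleanup(raw)  # returns a new Mol; raw is unchanged

# Count disconnected components before and after cleanup.
print(len(Chem.GetMolFrags(raw)), len(Chem.GetMolFrags(clean)))
print(Chem.MolToSmiles(clean))
```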

neurosnap.chemistry.smiles.validate_smiles(smiles)

Validates a SMILES (Simplified Molecular Input Line Entry System) string.

Parameters:

smiles (str) – The SMILES string to validate.

Returns:

True if the SMILES string is valid, False otherwise.

Return type:

bool

Raises:

Exception – Any exception encountered during validation is logged.
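
A minimal sketch of SMILES validation with RDKit. It assumes that exceptions are caught, logged, and reported as False, which is one reading of the description above; validate_sketch is a hypothetical name.

```python
import logging

from rdkit import Chem

logger = logging.getLogger(__name__)


def validate_sketch(smiles: str) -> bool:
    """Hypothetical stand-in: True when RDKit can parse the SMILES string."""
    try:
        # MolFromSmiles returns None (rather than raising) on a parse failure.
        return Chem.MolFromSmiles(smiles) is not None
    except Exception as exc:  # e.g. a non-string argument
        logger.error("SMILES validation failed: %s", exc)
        return False


print(validate_sketch("CCO"))   # a well-formed SMILES string
print(validate_sketch("C1CC"))  # ring opened but never closed
```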