neurosnap.database.ccd module

Chemical Component Dictionary metadata helpers.

class neurosnap.database.ccd.CCD(code, name, smiles)

Bases: object

Minimal Chemical Component Dictionary entry.

code

CCD identifier, typically 1-5 characters.

name

Human-readable component name.

smiles

SMILES string for the component. wwPDB canonicalizes this string, but its canonicalization algorithm differs from RDKit's, so the value may not match RDKit's canonical form.

code: str
name: str
smiles: str
smiles_canonical()

Return the RDKit-canonicalized SMILES string for this CCD entry.

Return type:

str

to_mol()

Return an RDKit molecule parsed from the canonical SMILES string.

Return type:

Mol

Returns:

RDKit molecule for the CCD entry.

Raises:

ValueError – If the stored canonical SMILES cannot be parsed.

neurosnap.database.ccd.get_ccd(code, *, cache_path='~/.cache/neurosnap/ccd_entries.json', overwrite=False, max_age_days=7, timeout=30)

Return a CCD entry by its component code.

Return type:

Optional[CCD]
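
Because the return type is Optional[CCD], callers should expect None for a code that is not present. The sketch below illustrates that pattern only. It does not call neurosnap: CCDStub, ENTRIES, lookup and describe are hypothetical stand-ins, and the two entries are illustrative rather than taken from the real payload.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CCDStub:
    """Hypothetical stand-in with the same three fields as the documented CCD class."""
    code: str
    name: str
    smiles: str


# Illustrative entries standing in for the cached CCD payload (assumption).
ENTRIES: Dict[str, CCDStub] = {
    "GLY": CCDStub("GLY", "GLYCINE", "NCC(=O)O"),
    "HOH": CCDStub("HOH", "WATER", "O"),
}


def lookup(code: str, entries: Dict[str, CCDStub] = ENTRIES) -> Optional[CCDStub]:
    """Return the entry for `code`, or None when the code is absent."""
    return entries.get(code)


def describe(code: str) -> str:
    """Format an entry, handling the None case explicitly."""
    entry = lookup(code)
    if entry is None:
        return f"{code}: not found"
    return f"{code}: {entry.name} ({entry.smiles})"


print(describe("HOH"))
print(describe("XXX"))
```

Checking for None before touching entry.name or entry.smiles avoids an AttributeError on unknown codes.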

neurosnap.database.ccd.get_ccd_entries(*, cache_path='~/.cache/neurosnap/ccd_entries.json', overwrite=False, max_age_days=7, timeout=30)

Fetch and cache CCD metadata entries.

The CCD payload is cached locally. It is refreshed when its age, computed from the embedded created_at timestamp, exceeds max_age_days.

Parameters:
  • cache_path (str) – Local cache file path for the raw JSON payload.

  • overwrite (bool) – If True, force a fresh download.

  • max_age_days (int) – Maximum accepted payload age in days.

  • timeout (int) – HTTP timeout in seconds for the download request.

Return type:

Dict[str, CCD]

Returns:

Dictionary mapping CCD code to CCD.
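
The refresh rule for the cached payload can be sketched as follows. This is not the library's implementation: payload_is_stale and needs_download are hypothetical helpers, and the sketch assumes created_at is an ISO 8601 string, which this page does not specify.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def payload_is_stale(created_at: str, max_age_days: int = 7,
                     now: Optional[datetime] = None) -> bool:
    """Return True when the payload is older than max_age_days.

    Assumes created_at is an ISO 8601 string (an assumption of this sketch).
    """
    created = datetime.fromisoformat(created_at)
    if created.tzinfo is None:
        # Treat naive timestamps as UTC (also an assumption).
        created = created.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return current - created > timedelta(days=max_age_days)


def needs_download(created_at: Optional[str], overwrite: bool = False,
                   max_age_days: int = 7,
                   now: Optional[datetime] = None) -> bool:
    """Decide whether a fresh download is required.

    A download is needed when overwrite is set, when no cached payload
    exists (created_at is None), or when the cached payload is stale.
    """
    if overwrite or created_at is None:
        return True
    return payload_is_stale(created_at, max_age_days, now)


reference_time = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(needs_download("2024-01-01T00:00:00+00:00", now=reference_time))
print(needs_download("2024-01-05T00:00:00+00:00", now=reference_time))
```

Passing now explicitly keeps the check deterministic, which makes the rule easy to test.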

neurosnap.database.ccd.get_ccd_standard_aa(ccd, *, cache_path='~/.cache/neurosnap/ccd_entries.json', overwrite=False, max_age_days=7, timeout=30)

Return the most similar standard amino acid for a CCD entry.

If the input CCD code already has an explicit standard mapping in AA_RECORDS, that mapping is reused directly. Otherwise, the CCD entry is compared against the 20 canonical amino-acid CCD entries using RDKit Morgan fingerprints and the highest-similarity standard amino acid is returned.

Parameters:
  • ccd (Union[str, CCD]) – CCD code string or a CCD instance.

  • cache_path (str) – Local cache file path for the CCD JSON payload.

  • overwrite (bool) – If True, force a fresh CCD payload download.

  • max_age_days (int) – Maximum accepted payload age in days.

  • timeout (int) – HTTP timeout in seconds for CCD payload downloads.

Return type:

AARecord

Returns:

The best-matching standard amino-acid record.

Raises:
  • TypeError – If ccd is not a string or CCD.

  • ValueError – If the CCD code is unknown or its SMILES cannot be parsed.
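
The highest-similarity selection step can be illustrated with a small sketch. It is not the library's code: plain Python sets of integers stand in for RDKit Morgan fingerprints, the Tanimoto coefficient is an assumed similarity measure (this page does not name one), and tanimoto, most_similar and the bit sets are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(query: FrozenSet[int],
                 references: Dict[str, FrozenSet[int]]) -> Tuple[str, float]:
    """Return the reference key with the highest similarity to `query`."""
    if not references:
        raise ValueError("references must not be empty")
    best = max(references, key=lambda key: tanimoto(query, references[key]))
    return best, tanimoto(query, references[best])


# Made-up bit sets for illustration only; these are not real fingerprints.
REFERENCES: Dict[str, FrozenSet[int]] = {
    "ALA": frozenset({1, 2, 3, 4}),
    "SER": frozenset({1, 2, 3, 5, 6}),
    "PHE": frozenset({1, 2, 7, 8, 9, 10}),
}
query = frozenset({1, 2, 3, 5, 6, 11})

code, score = most_similar(query, REFERENCES)
print(code, round(score, 3))
```

With these made-up sets the query shares five of six bits with the SER reference, so SER is selected. In the documented function, an explicit mapping in AA_RECORDS takes precedence, so this comparison applies only to codes without one.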