neurosnap.algos.wolfpsort.wolfpsort module

Structured WoLF PSORT localization prediction API.

This submodule provides the public Python interface for the WoLF PSORT-style localization workflow bundled in neurosnap.algos.wolfpsort. It exposes helpers for computing the model feature vector and for running the bundled fungi, animal, and plant localization models with dictionary or DataFrame outputs.

This implementation is an independent Python reimplementation developed for the academic community. It uses the original WoLF PSORT project by Paul Horton and Kenta Nakai as its technical reference and credits that project accordingly. The project materials consulted during development are available from the public WoLF PSORT source distribution rehosted at:

That distribution includes the historical PSORT / WoLF PSORT command-line code, model assets, and accompanying project notices.

To cite the underlying software or prediction method, the project materials suggest the following original WoLF PSORT references:

  • Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K. “WoLF PSORT: protein localization predictor.” Nucleic Acids Research 35 (Web Server issue): W585-W587 (2007). https://doi.org/10.1093/nar/gkm259

  • Horton P, Park KJ, Obayashi T, Nakai K. “Protein Subcellular Localization Prediction with WoLF PSORT.” Asia-Pacific Bioinformatics Conference (APBC 2006).

class neurosnap.algos.wolfpsort.wolfpsort.WoLFPSortPredictor(organism_type='fungi')

Bases: object

Pure Python WoLF PSORT port with structured outputs.

VALID_ORGANISMS = {'animal', 'fungi', 'plant'}

__init__(organism_type='fungi')

Initialize a WoLF PSORT predictor for one bundled organism model.

Parameters:

organism_type (str) – Bundled model to use. Supported values are "fungi", "animal", and "plant".

Returns:

None. The predictor is initialized in place.
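
A minimal sketch of checking an organism name against the documented VALID_ORGANISMS set before constructing a predictor. The helper normalize_organism is hypothetical and not part of neurosnap; the set is copied from the class attribute shown above, and how the constructor itself reacts to an unsupported value is not documented here.

```python
# Mirrors the documented WoLFPSortPredictor.VALID_ORGANISMS value.
VALID_ORGANISMS = {"animal", "fungi", "plant"}


def normalize_organism(name: str) -> str:
    """Hypothetical helper: strip, lower-case, and validate an organism name."""
    cleaned = name.strip().lower()
    if cleaned not in VALID_ORGANISMS:
        raise ValueError(
            f"unsupported organism_type {name!r}; "
            f"expected one of {sorted(VALID_ORGANISMS)}"
        )
    return cleaned


organism = normalize_organism(" Plant ")

try:
    # Requires neurosnap to be installed; the import path follows the
    # class name documented above.
    from neurosnap.algos.wolfpsort.wolfpsort import WoLFPSortPredictor

    predictor = WoLFPSortPredictor(organism_type=organism)
except ImportError:
    predictor = None  # neurosnap is not available in this environment
```

Validating up front keeps a misspelled organism name from surfacing later as a less obvious failure inside a batch run.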

compute_features(sequences)

Compute WoLF PSORT features for one or more protein sequences.

Parameters:

sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

Return type:

List[Dict[str, object]]

Returns:

List of dictionaries containing id and every WoLF PSORT feature used by the bundled models.
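
Every function on this page takes an iterator of (identifier, sequence) tuples. The sketch below shows one way to build such records from FASTA text. The iter_fasta helper and the example sequences are hypothetical and are not part of neurosnap.

```python
from typing import Iterator, List, Tuple


def iter_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Hypothetical helper: yield (identifier, sequence) tuples from FASTA text."""
    identifier = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the id.
            header = line[1:].split()
            identifier = header[0] if header else ""
            chunks = []
        elif identifier is not None:
            chunks.append(line.upper())
    if identifier is not None:
        yield identifier, "".join(chunks)


fasta = """>protA example protein
MKTAYIAKQR
QISFVKSHFS
>protB
mslltevetp
"""

# Materialize the generator so the same records can feed several calls.
records = list(iter_fasta(fasta))
```

A generator is exhausted after one pass, so reusing the same generator object for a feature call and then a prediction call would leave the second call with no input. Keeping the records in a list and passing iter(records) gives each call a fresh iterator.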

compute_features_dataframe(sequences)

Compute WoLF PSORT features and return them as a DataFrame.

Parameters:

sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

Return type:

DataFrame

Returns:

DataFrame with one row per sequence and one column per feature.

predict(sequences, include_features=False, include_neighbors=False)

Predict localization scores for one or more protein sequences.

Parameters:
  • sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

  • include_features (bool) – When True, attach the computed feature dictionary to each prediction record.

  • include_neighbors (bool) – When True, include the ranked training neighbors used during kNN scoring.

Return type:

List[Dict[str, object]]

Returns:

List of dictionaries containing the predicted class, ranked class scores, human-readable labels, best k value, and optional feature / neighbor details.
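
To make "ranked training neighbors" and "ranked class scores" concrete, here is a generic k-nearest-neighbor sketch on toy data. It illustrates plain kNN voting with Euclidean distance and is not the scoring used by the bundled models; the feature values, labels, and function names are invented for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Toy training set: (feature vector, localization label). Values are invented.
TRAINING: List[Tuple[Sequence[float], str]] = [
    ((0.9, 0.1), "nucl"),
    ((0.8, 0.2), "nucl"),
    ((0.1, 0.9), "mito"),
    ((0.2, 0.8), "mito"),
    ((0.5, 0.5), "cyto"),
]


def rank_neighbors(query: Sequence[float]) -> List[Tuple[float, str]]:
    """Return (distance, label) pairs sorted from nearest to farthest."""
    return sorted((math.dist(query, vec), label) for vec, label in TRAINING)


def knn_scores(query: Sequence[float], k: int) -> Dict[str, int]:
    """Count the labels among the k nearest neighbors, highest count first."""
    votes = Counter(label for _, label in rank_neighbors(query)[:k])
    return dict(votes.most_common())


neighbors = rank_neighbors((0.85, 0.15))
scores = knn_scores((0.85, 0.15), k=3)
```

In this toy setup the neighbor list plays the role of the optional neighbor details, and the vote counts play the role of the class scores. The choice of k changes which neighbors get a vote, which is why a prediction record reporting the k it used is helpful when interpreting the scores.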

predict_dataframe(sequences, include_features=False)

Predict localization scores and return a tabular summary.

Parameters:
  • sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

  • include_features (bool) – When True, expand the computed features into additional DataFrame columns.

Return type:

DataFrame

Returns:

DataFrame with one row per sequence plus the top prediction metadata and full score dictionary.

neurosnap.algos.wolfpsort.wolfpsort.compute_features(sequences)

Compute WoLF PSORT features using the fungi feature definition.

Parameters:

sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

Return type:

List[Dict[str, object]]

Returns:

List of dictionaries containing structured feature values.

neurosnap.algos.wolfpsort.wolfpsort.compute_features_dataframe(sequences)

Compute WoLF PSORT features and return them in DataFrame form.

Parameters:

sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

Return type:

DataFrame

Returns:

DataFrame with one row per sequence and one column per feature.

neurosnap.algos.wolfpsort.wolfpsort.predict_localization(sequences, organism_type='fungi', include_features=False, include_neighbors=False, as_dataframe=True)

Predict WoLF PSORT localization scores for one or more sequences.

Parameters:
  • sequences (Iterator[Tuple[str, str]]) – Iterator yielding (identifier, sequence) tuples.

  • organism_type (str) – Bundled organism model to use. Supported values are "fungi", "animal", and "plant".

  • include_features (bool) – When True, include the computed feature dictionary in dictionary output or expand feature columns in DataFrame output.

  • include_neighbors (bool) – When True, include the ranked training neighbors in dictionary output. This is not supported in DataFrame mode.

  • as_dataframe (bool) – When True, return a DataFrame summary. When False, return a list of dictionaries.

Returns:

DataFrame or list of dictionaries, depending on as_dataframe.
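
Because neighbor details are documented for dictionary output only, a caller who wants them has to turn off DataFrame mode. The sketch below wraps that rule in a small function. run_localization and fake_predict are hypothetical names; fake_predict is a stand-in that copies the documented signature so the example runs without neurosnap installed.

```python
from typing import Any, Callable, Iterable, Tuple


def run_localization(
    records: Iterable[Tuple[str, str]],
    predict_fn: Callable[..., Any],
    organism_type: str = "fungi",
    want_neighbors: bool = False,
) -> Any:
    """Hypothetical wrapper around a predict_localization-style callable."""
    return predict_fn(
        iter(records),
        organism_type=organism_type,
        include_neighbors=want_neighbors,
        # Neighbors are documented for dictionary output only, so leave
        # DataFrame mode whenever they are requested.
        as_dataframe=not want_neighbors,
    )


def fake_predict(sequences, organism_type="fungi", include_features=False,
                 include_neighbors=False, as_dataframe=True):
    """Stand-in with the documented signature; returns the arguments it saw."""
    return {
        "ids": [identifier for identifier, _ in sequences],
        "organism_type": organism_type,
        "include_neighbors": include_neighbors,
        "as_dataframe": as_dataframe,
    }


records = [("protA", "MKTAYIAKQRQISFVKSHFS")]
call = run_localization(
    records, fake_predict, organism_type="animal", want_neighbors=True
)
```

With neurosnap installed, the function imported from neurosnap.algos.wolfpsort.wolfpsort as predict_localization could be passed in place of fake_predict; the shape of its return value is as described above and is not reproduced by the stand-in.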