Use ProteinMPNN

Overview

ProteinMPNN is an inverse folding model that predicts amino acid sequences for a given protein structure, whether that is a whole protein, selected chains, or a multi-chain complex. It can also be used to create functional homologs or mutants of existing proteins by inverse folding their structures and sampling the sequence space.

Features

  • Predecessor to LigandMPNN. Note that this service has been largely replaced by LigandMPNN and should no longer be used.
  • Uses the faster, more feature-rich ColabDesign implementation of ProteinMPNN.
  • Supports SolubleMPNN.
  • Allows you to specify fixed chains and positions.
  • Allows you to inverse fold any protein or complex of proteins.
  • Supports homo-oligomers.
  • Supports different sampling techniques to better explore the protein landscape.
  • Includes per-sequence metrics such as an overall score and sequence recovery.
  • Includes amino acid probabilities by position.
  • Includes amino acid probabilities by position, adjusted for the sampling temperature.
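Sequence recovery is commonly defined as the fraction of positions at which a designed sequence matches the native one. The sketch below assumes that definition; the function name is hypothetical, and the exact computation used by this service (for instance, whether fixed positions are excluded) is not stated here.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Count identical residues position by position.
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)
```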

Configuration & Options

Model Inputs

The input protein structure for which to predict an amino acid sequence. Accepted input file formats include PDB.

Design Options

Specify the names of the chains that you want to provide to the model. The names must match the chain names in the PDB file and must be comma-delimited. For example, if you provide a PDB file containing two chains named A and B, simply enter "A,B". If you leave this empty, we will automatically inverse fold all chains in the entire protein.
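To check a chain selection before submitting, you could compare it against the chain identifiers in the PDB file. The sketch below is an illustration only: it assumes the standard fixed-column PDB layout, in which the chain identifier sits in column 22 of ATOM records, and all function names are hypothetical rather than part of this service.

```python
def parse_chain_spec(spec: str) -> list[str]:
    """Split a comma-delimited chain list such as "A,B" into ["A", "B"]."""
    return [token.strip() for token in spec.split(",") if token.strip()]


def chains_in_pdb(pdb_text: str) -> list[str]:
    """Collect chain identifiers from ATOM records, in order of first appearance."""
    seen: list[str] = []
    for line in pdb_text.splitlines():
        # Column 22 (index 21) holds the chain identifier in the PDB format.
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]
            if chain not in seen:
                seen.append(chain)
    return seen


def select_chains(spec: str, pdb_text: str) -> list[str]:
    """Return the requested chains, or every chain when the spec is empty."""
    available = chains_in_pdb(pdb_text)
    requested = parse_chain_spec(spec)
    if not requested:
        return available
    missing = [c for c in requested if c not in available]
    if missing:
        raise ValueError(f"chains not found in structure: {missing}")
    return requested
```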

Specify whether the structure is a homo-oligomer (homomer). All chains should have the same length for symmetric tying to work correctly.
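The equal-length requirement follows from how tying pairs up equivalent positions across chains. As a rough sketch (the function name and the 1-based position convention are assumptions, not this service's implementation), tied groups could be built like this:

```python
def tied_positions(chain_lengths: dict[str, int]) -> list[list[tuple[str, int]]]:
    """Group position i of every chain together so each group shares one residue choice."""
    lengths = set(chain_lengths.values())
    if len(lengths) != 1:
        raise ValueError("all chains must have the same length for symmetric tying")
    (n,) = lengths
    # One group per position, containing that position in every chain.
    return [[(chain, i) for chain in chain_lengths] for i in range(1, n + 1)]
```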

Allows you to specify which chains, amino acids, and residue ranges to fix. To keep positions fixed in the sequence, use a value such as "1,2-10". To target specific chains, use "A1-10,B1-20". To fix entire chains, for example in binder design, use "A". Note that fixed positions must use the same residue indices as the PDB/mmCIF file.
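The three forms of the fixed-position syntax can be read as comma-separated tokens, each with an optional chain letter followed by an optional residue number or range. The parser below is a sketch under that assumption; it is not the service's own parser, it assumes single-letter chain names and non-negative residue numbers, and `FixedSpan` and `parse_fixed_positions` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Optional chain letter, then an optional residue number or "start-end" range.
_TOKEN = re.compile(r"^([A-Za-z])?(?:(\d+)(?:-(\d+))?)?$")


class FixedSpan(NamedTuple):
    chain: Optional[str]   # None means no chain was given
    start: Optional[int]   # None means the whole chain is fixed
    end: Optional[int]


def parse_fixed_positions(spec: str) -> list[FixedSpan]:
    spans: list[FixedSpan] = []
    for raw in spec.split(","):
        token = raw.strip()
        if not token:
            continue
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"cannot parse token {token!r}")
        chain, start, end = match.groups()
        if start is None:
            # A bare chain name such as "A" fixes the entire chain.
            spans.append(FixedSpan(chain, None, None))
            continue
        lo = int(start)
        hi = int(end) if end is not None else lo
        if hi < lo:
            raise ValueError(f"range end before start in {token!r}")
        spans.append(FixedSpan(chain, lo, hi))
    return spans
```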

Invert the selected fixed positions above. For example, if you fix A1-10 above and enable this mode, everything except A1-10 will be fixed instead.
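Inversion amounts to taking the complement of the selection over all residues in the structure. A minimal sketch, assuming positions are represented as (chain, residue index) pairs; the function name and the representation are illustrative only:

```python
def invert_fixed(
    all_positions: list[tuple[str, int]],
    selected: list[tuple[str, int]],
) -> list[tuple[str, int]]:
    """Return every position that was NOT selected, preserving structure order."""
    selected_set = set(selected)
    return [pos for pos in all_positions if pos not in selected_set]


# Chain A has residues 1-15; selecting A1-10 with inversion fixes A11-15 instead.
chain_a = [("A", i) for i in range(1, 16)]
selection = [("A", i) for i in range(1, 11)]
now_fixed = invert_fixed(chain_a, selection)
```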

The number of output sequences to generate.

Specify the sampling temperature. Lower values produce higher-probability sequences; higher values produce more diverse sequences. Temperatures above 1.0 flatten the amino acid distribution, so sampling becomes increasingly random.
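One common way to apply a sampling temperature is to divide the model's logits by it before the softmax. The sketch below assumes that formulation; the function name and the example logits are illustrative and not taken from this service.

```python
import math


def temperature_softmax(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]                  # made-up scores for three residues
cold = temperature_softmax(logits, 0.1)   # mass concentrates on the top residue
hot = temperature_softmax(logits, 10.0)   # distribution moves toward uniform
```

At a low temperature almost all probability falls on the highest-scoring residue, while at a high temperature the three options become nearly equally likely.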

Advanced Settings

Select the model weights to use for sequence design. The first number represents the number of edges (48); the second represents the level of noise added to the backbone coordinates during training, in ångströms (020 = 0.2 Å). The best performing option is usually either v_48_030 or v_48_020 (according to the ProteinMPNN paper).
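The naming scheme can be decoded mechanically. The helper below is a hypothetical illustration that assumes names always follow the `v_<edges>_<three digits>` pattern, with the last field read as hundredths of an ångström:

```python
import re


def parse_model_name(name: str) -> tuple[int, float]:
    """Split a name like "v_48_020" into (edge count, value in ångströms)."""
    match = re.fullmatch(r"v_(\d+)_(\d{3})", name)
    if match is None:
        raise ValueError(f"unexpected model name: {name!r}")
    edges = int(match.group(1))
    # "020" encodes 0.20 Å, so divide the integer value by 100.
    angstroms = int(match.group(2)) / 100
    return edges, angstroms
```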

Select whether to use the original ProteinMPNN weights (default) or the newer SolubleMPNN weights, a version of ProteinMPNN trained only on soluble proteins. If your goal is to design soluble proteins, SolubleMPNN might be more useful.

Specify amino acid(s) to exclude (example: "C,A,T").
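An exclusion list like this can be validated against the 20 standard one-letter codes and then used to remove those residues from a per-position distribution. How this service applies the exclusion internally is not documented here, so the sketch below is an assumption: it drops the excluded residues and renormalizes the remaining probabilities, and both function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def parse_excluded(spec: str) -> set[str]:
    """Parse a comma-delimited list such as "C,A,T" into a set of residue codes."""
    codes = [token.strip().upper() for token in spec.split(",") if token.strip()]
    bad = [c for c in codes if len(c) != 1 or c not in AMINO_ACIDS]
    if bad:
        raise ValueError(f"not standard one-letter amino acid codes: {bad}")
    return set(codes)


def mask_probabilities(probs: dict[str, float], excluded: set[str]) -> dict[str, float]:
    """Drop excluded residues and renormalize what remains to sum to 1."""
    kept = {aa: p for aa, p in probs.items() if aa not in excluded}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no probability mass left after exclusion")
    return {aa: p / total for aa, p in kept.items()}
```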

Ready to submit your job?

Once you're done, just hit the submit button below and let us do the rest.
