Estimated Cost: Around $1.20 for a medium-sized protein (~250 AA)
A protein structure prediction model that takes an amino acid sequence, a multiple sequence alignment (MSA), and a template structure as inputs. The MSA and the template structure are both optional, so inference can run on the sequence alone. Options are also available to generate either one automatically.
Features
Includes interactive 3D visualizations of all your folded proteins.
Includes interactive visualizations of the pLDDT and PAE metrics, as well as downloads.
Everything from the protein structure to the MSA used is available for download.
Supports monomers and complexes.
Supports the faster MMseqs2 MSA algorithm.
Supports different MSA databases, single_sequence, and custom MSA.
Supports template detection, no templates, as well as custom templates.
Supports Amber structure relaxation / refinement.
Supports different pairing modes (unpaired+paired, paired, unpaired).
Supports different model types (ptm, multimer-v1, multimer-v2).
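As a rough illustration of working with a downloaded structure, the sketch below averages the per-residue confidence of a model. It assumes the download is a PDB-format file that stores pLDDT in the B-factor column, as AlphaFold-style predictors commonly do; the actual file layout is not stated here, and `mean_plddt` is a hypothetical helper name.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column over C-alpha atoms of a PDB-format string.

    Assumption: the B-factor column holds pLDDT (0-100), one value per residue.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha atoms found")
    return sum(scores) / len(scores)
```

Reading the file contents into a string and passing them to `mean_plddt` gives a single number that can be used to rank several predicted models of the same sequence.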
Estimated Cost: Around $0.10 for 100 samples of a medium-sized protein (~250 AA)
ProteinMPNN is a powerful inverse folding model that can design amino acid sequences not only for a whole protein structure but also for selected chains and for complexes. It can also be used to create functional homologs or mutants of existing proteins by inverse folding their structures and sampling the sequence space.
Features
Allows you to inverse fold any protein or complex of proteins.
Supports homo-oligomers.
Includes options to control which chains to design and which to keep fixed.
Supports different sampling techniques to better explore the protein landscape.
Includes per-sequence metrics such as an overall score and sequence recovery.
Includes amino acid probabilities by position.
Includes amino acid probabilities by position adjusted for the sampling temperature.
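To make the idea of temperature-adjusted probabilities concrete, here is a minimal sketch of the generic technique: divide the model's per-position scores (logits) by the sampling temperature, then apply a softmax. This is an illustration under that assumption, not necessarily how the tool computes its output, and `temperature_adjusted_probs` is a hypothetical name.

```python
import math

def temperature_adjusted_probs(logits, temperature=0.1):
    """Softmax of logits / temperature for a single sequence position."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]
```

Lower temperatures concentrate probability on the top-scoring residue, so sampled sequences become more conservative. Higher temperatures flatten the distribution and increase diversity.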
DiffDock is a new state-of-the-art method for molecular docking and drug binding structure prediction. The model takes a protein structure and a ligand as input and returns the ligand docked onto the protein.
Features
Predict ligand binding to a protein target of interest.
Can be used for drug design as well as domain identification.
Includes an interactive 3D protein viewer with the docked ligand.
Estimated Cost: Around $3.87 for 100 samples of a medium-to-large protein (~512 AA)
ProGen2 is a generative protein language model capable of generating de novo proteins and protein variants from an input sequence. Another version of ProGen2, progen2-oas, can be used to generate antibody sequences.
Features
Allows you to generate novel proteins or extend existing proteins.
Includes options to control the number of output sequences.
Supports different sampling techniques to better explore the protein landscape and to diversify prompt outputs.
Allows you to specify max length.
Includes an output FASTA file of all the sequences.
Includes an output plot of the most common residues at each position, which can be further interpreted as residue probabilities per position.
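The per-position residue summary described above can be approximated from the generated sequences themselves. The sketch below counts residues at each position and converts the counts to fractions. It assumes the sequences are compared position by position without alignment, which may differ from how the plot is actually produced, and `residue_frequencies` is a hypothetical name.

```python
from collections import Counter

def residue_frequencies(sequences):
    """Return one {residue: fraction} dict per position across the sequences."""
    longest = max((len(s) for s in sequences), default=0)
    profile = []
    for i in range(longest):
        # Only sequences long enough to reach position i contribute.
        column = Counter(s[i] for s in sequences if len(s) > i)
        total = sum(column.values())
        profile.append({aa: n / total for aa, n in column.items()})
    return profile
```

With many samples, the fraction of sequences carrying a residue at a position serves as an empirical estimate of that residue's probability there.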
Estimated Cost: Around $2.27 for 100 samples of a medium-sized protein (~250 AA)
ESM-IF1 is an inverse folding model that predicts protein sequences from their backbone atom coordinates. It was trained on 12M protein structures predicted by AlphaFold2.
Features
Allows you to inverse fold any protein or complex of proteins.
Includes options to control which chains to design and which to keep fixed.
Supports different sampling techniques to better explore the protein landscape.
Includes per-sequence metrics such as an overall score and sequence recovery.
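Sequence recovery is commonly defined as the fraction of positions where the designed sequence matches the native one. A minimal sketch of that definition follows. It assumes both sequences have the same length and are already in register, and `sequence_recovery` is a hypothetical name rather than part of any tool's API.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions at which the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)
```

A recovery near 1.0 means the design closely reproduces the native sequence. Lower values indicate a more divergent design for the same backbone.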