Generative Antibody Design: Exploring DiffAb on the Neurosnap Platform

Written by Danial Gharaie Amirabadi

Published 2024-9-21

Preview

Antibodies are crucial immune proteins that defend the body by binding to specific antigens, such as viruses or bacteria. This interaction is largely driven by the complementarity-determining regions (CDRs) of the antibody. To streamline the design of novel CDRs for specific antigens, we offer a solution based on DiffAb, a deep generative model that jointly models both the sequences and structures of CDRs. Built on diffusion probabilistic models and equivariant neural networks, DiffAb is described by its authors as the first deep learning-based method to generate antibodies explicitly targeting specific antigen structures. Because it co-designs sequence and structure, the model can produce results that are competitive in binding affinity, as evaluated by biophysical energy functions and other protein design metrics.

Brief intro to Antibodies

Antibodies are essential immune proteins produced during an immune response, designed to recognize and neutralize pathogens. Each antibody consists of two heavy chains and two light chains, and this overall architecture is largely the same from one antibody to the next. The specificity of an antibody for its target antigen is determined by six hypervariable regions known as complementarity-determining regions (CDRs). These CDRs (H1, H2, and H3 on the heavy chain; L1, L2, and L3 on the light chain) are critical for antigen recognition. The precise design of CDRs is therefore a key step in developing effective therapeutic antibodies.

Schematic of an antibody, from the paper Pre-training Antibody Language Models for Antigen-Specific Computational Antibody Design

Brief intro to Antibody design

As in other protein design tasks, the search space for CDRs is vast. A CDR of n amino acids has 20^n possible sequences, which makes it impractical to test all combinations experimentally and necessitates computational methods. Traditional computational approaches typically sample protein sequences and structures guided by complex biophysical energy functions. However, these methods are often time-consuming and can become trapped in local optima.
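To get a feel for how quickly this space grows, the short Python sketch below counts the possible sequences for a few CDR lengths. The function name and the example lengths are illustrative choices, not values taken from the DiffAb paper.

```python
def cdr_sequence_space(n, alphabet_size=20):
    """Count the distinct sequences of length n over the standard amino acids."""
    return alphabet_size ** n


# Example lengths are arbitrary illustrations, not taken from any dataset.
for n in (5, 10, 15):
    print(f"length {n}: {cdr_sequence_space(n):.2e} sequences")
```

Even at 10 residues the count exceeds 10^13, far more than can be screened experimentally.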

DiffAb: Generative Antibody Design

To tackle the challenges of antibody design, DiffAb proposes a diffusion-based generative model capable of jointly sampling antibody CDR sequences and their corresponding structures. Crucially, this model allows the joint distribution of a CDR sequence and its structure to be directly conditioned on antigen structures.

Given an input protein complex comprising an antigen and an antibody framework, DiffAb begins by initializing the CDR with arbitrary sequences, positions, and orientations. The diffusion model first aggregates information from both the antigen and the antibody framework. It then iteratively updates the amino acid type, position, and orientation of each amino acid within the CDR. Finally, the model reconstructs the CDR structure at the atomic level using side-chain packing algorithms based on the predicted orientations.
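The control flow described above can be illustrated with a deliberately simplified Python sketch. This is not DiffAb's implementation: `toy_denoiser` is a hypothetical stand-in that returns a fixed target instead of running a learned network, orientations are omitted, and the update rule is an arbitrary interpolation chosen only to show an arbitrary starting state being refined step by step.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_denoiser(state, context):
    """Hypothetical stand-in for the learned network.

    A real model would predict residue types and coordinates from the
    antigen and framework context; here we simply return a fixed target.
    """
    return context["target"]


def sample_cdr(context, length, steps=50, seed=0):
    rng = random.Random(seed)
    # Start from arbitrary residue types and positions.
    state = [
        {"aa": rng.choice(AMINO_ACIDS), "pos": [rng.gauss(0, 10) for _ in range(3)]}
        for _ in range(length)
    ]
    for t in range(steps, 0, -1):
        pred = toy_denoiser(state, context)
        w = 1.0 / t  # weight on the prediction grows as t approaches 1
        for res, tgt in zip(state, pred):
            # Move each coordinate part of the way toward the prediction.
            res["pos"] = [(1 - w) * p + w * q for p, q in zip(res["pos"], tgt["pos"])]
            # Resample the residue type with probability w.
            if rng.random() < w:
                res["aa"] = tgt["aa"]
    return state
```

Because the state is updated over many small steps, each step is a natural place to apply extra constraints, which is the property the next paragraph highlights.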

One of the primary advantages of selecting a diffusion-based model over other generative approaches, such as generative adversarial networks and variational autoencoders, is its iterative generation of CDR candidates within the sequence-structure space. This allows for the imposition of constraints during the sampling process, thereby accommodating a broader range of design tasks and enhancing the model's capability.

Overview of DiffAb from the paper Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures

Using DiffAb on Neurosnap

To use DiffAb on Neurosnap, navigate to the services page. You can find DiffAb Antibody Design either by using the search function or by selecting the "Antibody" filter. Once on the DiffAb page, you will be prompted to input your antigen structure as a PDB file and provide an optional seed. You can leave the seed input as is to use a random seed.

DiffAb job submission panel on the Neurosnap platform

For the antigen structure, it is recommended to clean the PDB file beforehand to prevent any interference with the design process. You can use PyMOL to remove unnecessary sections of the protein, followed by PDBFixer for any necessary adjustments. This preparation helps optimize the input for the best results during the design phase.

As an example, we use PDB entry 7PEE as our input antigen. It is the crystal structure of the extracellular part of human TROP2. TROP2 (tumor-associated calcium signal transducer 2) is a protein that is overexpressed in many types of cancer and is associated with increased tumor aggressiveness and metastasis.

Structure of 7PEE

First, we trim the terminal sections of the protein, then run PDBFixer to prepare the antigen PDB file for DiffAb. Once the structure is ready, we upload it and run the service.
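If you prefer to script the trimming step, a minimal standard-library sketch might look like the following. It assumes a standard fixed-column PDB file and ignores insertion codes; the function name `trim_pdb` is our own, and it is not a substitute for PyMOL or PDBFixer, which handle far more cases.

```python
def trim_pdb(lines, chain_id, first_res, last_res):
    """Keep ATOM records of one chain whose residue number lies in
    [first_res, last_res]; HETATM records (waters, ligands) and all
    other record types are dropped."""
    kept = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        # PDB fixed columns: chain ID is column 22, residue number is 23-26.
        if line[21] != chain_id:
            continue
        res_num = int(line[22:26])
        if first_res <= res_num <= last_res:
            kept.append(line.rstrip("\n"))
    kept.append("END")
    return kept
```

For instance, `trim_pdb(open("antigen.pdb"), "A", 30, 270)` would keep residues 30 to 270 of chain A. The file name, chain, and residue range here are placeholders rather than the values used for 7PEE, and the output would still need PDBFixer to repair any missing atoms or residues.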

Interpreting the Results

Once your DiffAb job finishes, you will be presented with a 3D protein viewer and a table listing the designed sequences alongside RMSD values for both the light and heavy chains. The RMSD (root-mean-square deviation) values, measured in angstroms, quantify how far each designed structure deviates from the template structure; lower values mean the design stays closer to the template.
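For readers unfamiliar with the metric, here is a minimal sketch of how an RMSD between two matched sets of coordinates is computed. It assumes both structures are already in the same reference frame and that atoms are paired one to one; which atoms Neurosnap compares, and whether it superimposes the structures first, is not specified here, so treat this as an illustration of the formula only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates, in the same units as the input."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Identical structures give an RMSD of 0, and the value grows as paired atoms move further apart.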

3D viewer and results table

In the 3D protein viewer, you can assess where each designed CDR binds to your antigen, as well as visualize how the template CDR binds to the antigen. In the table, you can evaluate the sequences of the designed CDRs and compare their structural deviations based on the RMSD values.

Next Steps

Antibody design remains a challenging task, often requiring extensive and costly wet lab evaluations. One of the key limitations, as highlighted in the literature, is the uncertainty of whether the generated antibodies can be successfully produced in the lab and bind effectively to their target.

To address these challenges, we recommend testing the following metrics to evaluate the quality of your DiffAb-generated designs:

  1. Structural Stability Evaluation: Assessing the structural stability of the CDR sequences is critical for determining their potential functional efficacy. Since CDRs are hypervariable and essential for antigen binding, structural integrity is paramount for designing functional antibodies. We suggest using our AlphaFold2 service for structure prediction. For each generated sequence, collect the predicted local distance difference test (pLDDT) score; a higher pLDDT indicates greater confidence in the predicted structure. The pLDDT score is also strongly correlated with the structural order of proteins, making it a useful proxy for the stability and proper folding of your antibody designs.
  2. Solubility and Aggregation Propensity: Monoclonal antibodies (mAbs) are prone to aggregation under non-native conditions, which can lead to a loss of functionality and increased toxicity. Therefore, evaluating the solubility and aggregation propensity of the generated CDR sequences is crucial for their developability. We recommend using our NetSolP-1.0 service, which predicts protein solubility directly from protein sequences. Additionally, we suggest using AGGRESCAN, a tool that measures the overall natural tendency of sequences to aggregate. These evaluations will help improve the likelihood that your designed antibodies are both functional and viable for production.
  3. Humanness Evaluation: A critical therapeutic property of antibodies is their resemblance to naturally occurring human antibodies, as this helps reduce the likelihood of triggering an immune response. To evaluate the "naturalness" of the designed CDR sequences, we suggest using the BioPhi tool. BioPhi assesses how common antibody sequences are within the human population. By comparing your generated CDR sequences with natural antibody repertoires and identifying their similarity to germline sequences, you can derive a humanness score, providing insight into the likelihood of immune compatibility.

By integrating these steps into your design process, you can better assess the potential of your antibodies before moving to experimental validation.
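As one way to automate the first metric, the sketch below averages pLDDT over the C-alpha atoms of a predicted structure. AlphaFold2 writes per-residue pLDDT into the B-factor column of its PDB output; the sketch assumes the file you download follows that convention and the standard fixed-column PDB layout. The function name `mean_plddt` is our own.

```python
def mean_plddt(pdb_lines, chain_id=None):
    """Average the B-factor column (pLDDT in AlphaFold2 output) over
    C-alpha atoms, optionally restricted to a single chain."""
    scores = []
    for line in pdb_lines:
        # Atom name is columns 13-16; keep one value per residue via CA.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        if chain_id is not None and line[21] != chain_id:
            continue
        # B-factor is columns 61-66.
        scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha atoms found")
    return sum(scores) / len(scores)
```

Running this over every predicted design gives a single number per candidate that can be sorted alongside the solubility, aggregation, and humanness scores.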

Conclusion

In conclusion, DiffAb provides an essential tool for researchers working on antibody design. By enabling the joint sampling of antibody CDR sequences and structures, DiffAb facilitates the creation of targeted and efficient antibody candidates, streamlining the design process. Integrated into the Neurosnap platform, DiffAb is readily accessible to researchers looking to optimize antibody development and enhance therapeutic discoveries.

You can check out an example run at: https://neurosnap.ai/job/66ed6b8d64bfe809b0a7ef5c?share=66eead86c52abe9f9fbda10e

Want to get started with DiffAb Antibody Design? Register here and run your own jobs!

Continue Reading

  1. Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures
  2. AbGPT: De Novo Antibody Design via Generative Language Modeling
