RFdiffusion3 Online: All-Atom De Novo Protein Binder Design

Written by Danial Gharaie Amirabadi | Published 2026-6-4

Introduction

Designing a protein from scratch to bind a specific target is one of the hardest problems in structural biology. Until recently, it required experimental screening of thousands of candidates. RFdiffusion changed that by demonstrating that a diffusion model fine-tuned on protein structures can generate high-affinity binders de novo [1]. RFdiffusion3 (RFD3) extends that capability to the full atomic level: it explicitly models side chains, ligands, nucleic acids, and metal ions during generation, enabling precise conditioning on atom-level constraints that are simply out of reach for backbone-only models [2].

On Neurosnap, you can run RFdiffusion3 online without installing model weights or managing a GPU. This walkthrough designs a de novo miniprotein binder to human interleukin-2 (IL-2), a key cancer immunotherapy target, and explains every setting along the way.

The RFdiffusion3 service page on Neurosnap.

What RFdiffusion3 Does

The original RFdiffusion (2023) fine-tuned the RoseTTAFold structure prediction network on protein backbone denoising tasks, producing a generative model that could design binders, oligomers, and enzyme scaffolds purely from backbone geometry [1]. A follow-up, RFdiffusion All-Atom (RFdiffusionAA), built on RoseTTAFold All-Atom, extended this to model small molecules alongside protein backbones, but still lacked explicit side-chain generation during the diffusion process [3].

RFdiffusion3 resolves this by making the model fully atomistic from the start. Every step of the diffusion trajectory explicitly models all polymer atoms, including protein side chains, DNA and RNA bases, small molecules, and metal ions. The key consequences are:

Side chains are generated jointly with the backbone. Because the model never works at backbone-only resolution, it can directly reason about hydrogen-bond donors and acceptors, solvent accessibility, and contact geometry during generation rather than deferring these to a downstream sequence design step.

Non-protein atoms are first-class inputs. A small molecule ligand, a DNA duplex, or a metal coordination site can be provided as a fixed conditioning input. The diffusion model then generates a protein scaffold that is geometrically and chemically compatible with those atoms rather than ignoring them.

Throughput is roughly ten times higher than RFdiffusion2. Despite the added atomic complexity, RFD3 uses a streamlined 168M-parameter architecture that runs roughly ten times faster than its predecessor on identical hardware [2]. On Neurosnap, this translates to faster job queuing and lower credit cost per design.

The RFD3 paper validated the model experimentally on two design tasks: DNA-binding proteins and cysteine hydrolase active site scaffolding. In both cases, designed proteins expressed, folded, and showed the intended biochemical activity [2].

RFdiffusion3 vs. Prior Approaches

| Model | All-atom generation | Ligand conditioning | DNA/RNA conditioning | Speed vs. RFD2 |
|---|---|---|---|---|
| RFdiffusion (2023) | No | No | No | Faster |
| RFdiffusion All-Atom | Partial (backbone only) | Yes (post-hoc) | No | Similar |
| RFdiffusion3 | Yes | Yes | Yes | ~10x faster |

The clearest practical difference between RFdiffusion3 and the original RFdiffusion is what you can condition on. Original RFdiffusion is outstanding for protein-only binder design and symmetric assemblies. RFdiffusion3 is the right choice whenever your target contains non-protein atoms that should influence the design, or when you want explicit side-chain packing during generation rather than relying entirely on a downstream sequence design tool like ProteinMPNN.

When to Use RFdiffusion3 Online

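RFdiffusion3 is most useful when the target includes non-protein atoms (ligands, nucleic acids, or metal ions) that should shape the design, or when you want side chains placed explicitly during generation instead of by a downstream sequence design step.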

For pure protein-only binder design where backbone-level geometry is sufficient, the original RFdiffusion service is also available on Neurosnap and remains highly capable.

You can open the RFdiffusion3 service here:

https://neurosnap.ai/service/RFdiffusion3

Example: De Novo Binder Design to IL-2

Human interleukin-2 is one of the most extensively studied proteins in immunology and a validated target for cancer immunotherapy. As a T cell growth factor, IL-2 signals through two forms of its receptor: a high-affinity three-subunit complex (IL-2Rα/β/γc, expressed on regulatory T cells) and an intermediate-affinity two-subunit complex (IL-2Rβ/γc, expressed on effector T cells and NK cells) [4].

The biology of this receptor system has major therapeutic implications. High-dose IL-2 was among the first FDA-approved cancer immunotherapies (for renal cell carcinoma and melanoma), but its clinical use is limited by severe toxicity arising from preferential activation of immunosuppressive regulatory T cells through the IL-2Rα-containing high-affinity complex [4]. Designing proteins, rather than relying on natural antibodies or small molecules, that selectively engage the intermediate-affinity IL-2Rβγ interface is an active area of protein engineering research. A landmark 2019 study demonstrated this is achievable: de novo designed IL-2 mimetics bound IL-2Rβγ with higher affinity than native IL-2 and elicited potent antitumor responses in mouse models, without Treg-associated toxicity [5].

For this tutorial, we use the structure of free IL-2 (PDB 1M47, 1.99 Å crystal structure) as the design target. The goal is to design a new miniprotein (60–80 residues) that contacts the IL-2Rβ binding surface, with hotspot conditioning on the key interface residues identified in the crystal structure of the IL-2/receptor complex.

The IL-2 sequence (chain A from PDB 1M47, 122 resolved residues):

The IL-2Rβ-binding face of IL-2 is formed by residues centered around Arg38, Thr41, Phe42, Tyr45, Leu72, and Asn88 (canonical IL-2 residue numbering). These residues form a hydrophobic and polar patch on the IL-2 surface that makes direct contacts with the IL-2Rβ extracellular domain in the receptor complex [5].

Configure the RFdiffusion3 Job

Open the RFdiffusion3 service on Neurosnap and configure it as follows.

Upload the Structure

Upload a PDB file containing just the IL-2 chain (chain A). PDB 1M47 has a few disordered loops with no electron density (residues 75, 76, and 99–102), which leave gaps in the residue numbering. RFdiffusion3 requires a contiguous residue index to avoid parsing errors, so the resolved residues are renumbered 1–122 before upload. Renumbering does not change the 3D coordinates, only the indices used in the contig and hotspot fields.
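
The renumbering can be checked with a few lines of Python. This is a minimal sketch, not part of the Neurosnap service: the function name is hypothetical, and it assumes the resolved residues span canonical positions 6–133, which is what the Arg38 to A33 hotspot mapping used later in this post implies.

```python
def build_renumbering(first, last, missing):
    """Map original residue numbers to a gap-free 1..N index."""
    missing = set(missing)
    resolved = [r for r in range(first, last + 1) if r not in missing]
    return {orig: new for new, orig in enumerate(resolved, start=1)}


# Assumption: resolved residues run from canonical 6 to 133.
# The missing residues are the ones listed for PDB 1M47 above.
mapping = build_renumbering(6, 133, [75, 76, 99, 100, 101, 102])

print(len(mapping))  # 122
for canonical in (38, 41, 42, 45, 72, 88):
    print(canonical, "->", mapping[canonical])
```

Under that assumption the six canonical hotspot residues map to 33, 36, 37, 40, 67, and 81, matching the renumbered indices used in the hotspot field.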

Contig String

The contig string defines which residues to hold fixed and how many new residues to generate:

A1-122,/0,60-80

This breaks down as:

- A1-122: hold the full IL-2 chain (renumbered 1–122) fixed as the target
- /0: chain break separator; the designed binder will be a separate chain
- 60-80: generate a new binder chain of 60 to 80 residues

Without the /0 separator, RFdiffusion3 would treat the target and the generated segment as one continuous chain. The /0 ensures they are output as independent chains in the final structure.
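
To make the three tokens concrete, here is a small illustrative parser. It is a sketch of how the string reads, not RFdiffusion3's own parsing code; the Segment class and parse_contig function are hypothetical names, and only the token forms used in this tutorial are handled.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str        # "fixed", "break", or "generated"
    chain: str = ""  # chain ID, only set for fixed segments
    start: int = 0   # first residue (fixed) or minimum length (generated)
    end: int = 0     # last residue (fixed) or maximum length (generated)


def parse_contig(contig):
    """Split a contig string into fixed, chain-break, and generated segments."""
    segments = []
    for token in contig.split(","):
        token = token.strip()
        if token == "/0":
            segments.append(Segment("break"))
        elif token[0].isalpha():
            # Leading letter is the chain ID, e.g. "A1-122"
            start, _, end = token[1:].partition("-")
            segments.append(Segment("fixed", token[0], int(start), int(end or start)))
        else:
            # Bare numbers give a length range to generate, e.g. "60-80"
            start, _, end = token.partition("-")
            segments.append(Segment("generated", "", int(start), int(end or start)))
    return segments


for segment in parse_contig("A1-122,/0,60-80"):
    print(segment)
```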

Hotspot Specification

Hotspots tell the model which atoms on the target must be contacted by the designed protein. Each line gives a chain ID and residue number, then a colon, then the atoms to select:

A33: ALL
A36: ALL
A37: ALL
A40: ALL
A67: ALL
A81: ALL

These correspond to canonical IL-2 residues Arg38, Thr41, Phe42, Tyr45, Leu72, and Asn88 in the renumbered PDB. ALL includes every heavy atom of that residue as a potential contact point. You can also use BKBN (backbone N, CA, C, O only) or TIP (the standard terminal heavy atom for that residue type) for more specific conditioning. Using ALL on a handful of key residues is a reliable starting point for most binder designs.
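
For longer hotspot lists it can help to generate these lines programmatically. The helper below is a hypothetical convenience function, not a Neurosnap API; it assumes a 122-residue target chain and simply formats and range-checks the entries.

```python
def hotspot_lines(chain, residues, atoms="ALL", chain_length=122):
    """Format hotspot entries such as 'A33: ALL'.

    `atoms` is either a selector keyword (ALL, BKBN, TIP) or a list of
    atom names, which is joined with commas.
    """
    for residue in residues:
        if not 1 <= residue <= chain_length:
            raise ValueError(f"residue {residue} is outside 1-{chain_length}")
    selector = atoms if isinstance(atoms, str) else ",".join(atoms)
    return [f"{chain}{residue}: {selector}" for residue in residues]


# Renumbered positions of the six hotspot residues used in this tutorial
print("\n".join(hotspot_lines("A", [33, 36, 37, 40, 67, 81])))
```

Passing a list of atom names instead of a keyword, for example hotspot_lines("A", [33], ["N", "CA", "C", "O"]), produces an atom-level entry in the same format.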

Full Job Settings

| Setting | Value | Rationale |
|---|---|---|
| Input Structure | 1M47_renum.pdb (chain A, renumbered 1–122) | IL-2 target, contiguous residue numbering required |
| Contig | A1-122,/0,60-80 | Fix IL-2, design 60–80 residue binder as separate chain |
| Hotspots | A33: ALL through A81: ALL | IL-2Rβ binding face (renumbered); canonical Arg38, Thr41, Phe42, Tyr45, Leu72, Asn88 |
| Number Designs | 5 | Five independent designs for comparison |

The completed public job used for this post:

https://neurosnap.ai/job/6a219069fb58bafccd97b4a8?share=6a219dc3fb58bafccd97b596

Reading the Results

Once the job completes, the Neurosnap results page shows five downloadable PDB files alongside a 3D structure viewer.

Output Files

RFdiffusion3 on Neurosnap outputs:

| File | Contents |
|---|---|
| design_1.pdb through design_5.pdb | All-atom 3D structures with both the fixed IL-2 chain (A) and the generated binder chain (B) |
| bfactors.csv | Per-residue B-factor (diffusion confidence) values for all residues across each design |

Each design PDB contains two chains: chain A is the original IL-2 structure held fixed during diffusion, and chain B is the newly generated binder. The binder includes explicit side chains, which is one of RFD3's key advances over backbone-only generation.

The B-factors in the CSV and PDB files represent RFD3's per-residue diffusion confidence on a 0–1 scale (lower values indicate the model converged to stable geometry for that residue). They are distinct from crystallographic B-factors (Ångström²) and from AlphaFold pLDDT scores.

For the five designs generated in this job, the binder lengths ranged from 71 to 80 residues. Design 1 (76 residues) shows tight geometric complementarity with IL-2: the minimum Cα–Cα distance between the two chains is 4.7 Å, and 54 Cα–Cα pairs fall within 8 Å, indicating an extensive protein-protein interface.
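
Interface statistics of this kind can be computed directly from a design PDB. The sketch below uses only the Python standard library and reads coordinates from the fixed-column ATOM records of the PDB format; the function names are hypothetical, and the chain IDs A and B follow the output convention described above.

```python
import math


def ca_coords(pdb_text, chain_id):
    """Collect C-alpha coordinates for one chain from PDB ATOM records."""
    coords = []
    for line in pdb_text.splitlines():
        if (line.startswith("ATOM")
                and line[12:16].strip() == "CA"
                and line[21] == chain_id):
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def interface_stats(pdb_text, target="A", binder="B", cutoff=8.0):
    """Return (minimum CA-CA distance, number of CA-CA pairs within cutoff)."""
    target_ca = ca_coords(pdb_text, target)
    binder_ca = ca_coords(pdb_text, binder)
    distances = [math.dist(p, q) for p in target_ca for q in binder_ca]
    return min(distances), sum(d <= cutoff for d in distances)


# Usage with a downloaded design:
# with open("design_1.pdb") as handle:
#     print(interface_stats(handle.read()))
```

The all-pairs loop is quadratic in chain length, which is fine for a 122-residue target and a binder of 60–80 residues.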

The Structure Viewer

The interactive viewer on the job page loads design 1 directly from the completed Neurosnap job. Chain A is IL-2 (fixed target); chain B is the 76-residue binder generated by RFD3.

When inspecting your own designs, check that:

  1. The binder (chain B, the smaller chain) is physically docked against IL-2 at the face where your hotspot residues are located
  2. The binder is compact and globular rather than an extended random coil
  3. The binder and target make contact only at the intended interface, with no steric clashes between the two chains

What to Do Next

RFdiffusion3 generates backbone and side-chain coordinates for each design, but the output sequence is not yet optimized for foldability or stability. The standard downstream workflow is:

  1. ProteinMPNN (available on Neurosnap): design amino acid sequences that are predicted to fold into each RFD3-generated backbone. Run multiple sequences (typically 8–16) per backbone.
  2. AlphaFold2 refolding (or RoseTTAFold2): validate that the designed sequences independently fold into the designed structure. The interface predicted aligned error (iPAE) between the binder and target is the most informative metric.

Designs where AF2 refolding independently produces a similar binder-target complex (low backbone RMSD, low interface PAE, high pLDDT on the binder) are the strongest candidates for experimental testing.

The original RFdiffusion paper demonstrated that requiring AF2-confirmed structures increases experimental hit rates by nearly tenfold compared to taking diffusion-generated backbones directly to synthesis [1].

Practical Tips

Start with 5–10 designs and review the distribution. Five designs from a single run are rarely enough for a full design campaign, but they are enough to evaluate whether the contig and hotspot setup is yielding geometrically diverse designs or converging on a single solution. If all five look nearly identical, try widening the binder length range or removing one hotspot residue to give the model more freedom.

Use specific atoms for constrained active sites. The ALL shorthand is convenient for hotspot residues on a protein target, but for enzyme design or ligand conditioning, specify individual atoms by name (e.g., A25: N,CA,C,O,CB) to enforce precise geometric constraints. The more specific the conditioning, the more the diffusion trajectory is guided toward a particular binding geometry.

Condition on small molecules using the Ligand field. One of RFD3's unique capabilities is direct conditioning on non-protein molecules. To design a scaffold around a cofactor or drug molecule, include the ligand in the input PDB as HETATM records and reference it in the Fixed Atoms or Contig fields. The model will generate a protein that wraps around the ligand's actual atom positions rather than treating it as a black box.

Use Fixed Atoms for catalytic precision. For enzyme active site scaffolding, the Fixed Atoms field lets you pin specific atoms (e.g., A25: N,CA,C,O,CB) while allowing the flanking backbone to be diffusion-generated. This is the mechanism the RFD3 authors used to design cysteine hydrolases with precisely positioned catalytic triads [2].

Sort by iPAE after AF2 refolding. Across the benchmark data in the RFdiffusion literature, interface PAE from AF2 refolding is the single strongest predictor of experimental binding affinity among the metrics computed automatically. After running ProteinMPNN + AF2 on your RFD3 designs, sort by iPAE first. Designs with iPAE below ~10 Å and binder pLDDT above 80 are the best candidates for experimental testing.

Check backbone RMSD after AF2 refolding. A backbone RMSD above ~2 Å between the RFD3 model and the AF2 refolded structure suggests the designed backbone is not independently stable. If the binder only folds correctly in the presence of the target, it may have a lower experimental hit rate than one that refolds well alone.
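
The two filtering tips above can be combined into one ranking step. This is a minimal sketch assuming you have already collected the refolding metrics into a list of dictionaries; the key names (ipae, binder_plddt, backbone_rmsd) and the example values are hypothetical placeholders, not output from any Neurosnap service.

```python
def rank_candidates(designs, max_ipae=10.0, min_plddt=80.0, max_rmsd=2.0):
    """Keep designs that pass all three thresholds, sorted by iPAE (best first)."""
    passing = [
        d for d in designs
        if d["ipae"] < max_ipae
        and d["binder_plddt"] > min_plddt
        and d["backbone_rmsd"] <= max_rmsd
    ]
    return sorted(passing, key=lambda d: d["ipae"])


if __name__ == "__main__":
    # Made-up values for illustration only
    example = [
        {"name": "design_1_seq3", "ipae": 6.2, "binder_plddt": 91.0, "backbone_rmsd": 0.9},
        {"name": "design_2_seq1", "ipae": 14.8, "binder_plddt": 88.0, "backbone_rmsd": 1.1},
        {"name": "design_4_seq7", "ipae": 5.1, "binder_plddt": 85.0, "backbone_rmsd": 1.6},
    ]
    for design in rank_candidates(example):
        print(design["name"], design["ipae"])
```

The thresholds are the approximate cutoffs quoted in the tips and are exposed as parameters so they can be loosened or tightened per campaign.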

Iterative design improves results. After an initial run, take the best backbone geometry, export the structure, and re-submit as an unindexed motif with slightly looser hotspot constraints. Each iteration tightens the design toward a more confident binding mode.

Sources

  1. De novo design of protein structure and function with RFdiffusion (Watson et al., 2023, Nature)
  2. De novo Design of All-atom Biomolecular Interactions with RFdiffusion3 (Butcher et al., 2025, bioRxiv)
  3. Generalized Biomolecular Modeling and Design with RoseTTAFold All-Atom (Krishna et al., 2023, bioRxiv)
  4. Engineering IL-2 for immunotherapy of autoimmunity and cancer (Hernandez et al., 2022, Nature Reviews Immunology)
  5. De novo design of potent and selective mimics of IL-2 and IL-15 (Silva et al., 2019, Nature)
