RoseTTAFold3 Online: All-Atom Protein-DNA Structure Prediction

Written by Danial Gharaie Amirabadi | Published 2026-6-3

Introduction

RoseTTAFold3 is an open-source all-atom foundation model for biomolecular structure prediction and generative design, built by the RosettaCommons group. Where earlier tools were limited to proteins alone, RoseTTAFold3 handles arbitrary combinations of proteins, nucleic acids, small molecules, metals, and covalent modifications in a single joint prediction. It is trained on the AtomWorks data framework, a general, reusable pipeline for developing state-of-the-art biomolecular foundation models, and released under a permissive BSD license [1].

On Neurosnap, you can run RoseTTAFold3 online without installing model weights, configuring databases, or managing GPU resources. This tutorial walks through a protein-DNA structure prediction using human p53 bound to its consensus DNA response element, a well-characterized system with a high-resolution crystal structure for comparison.

The RoseTTAFold3 service page on Neurosnap.

What RoseTTAFold3 Does

RoseTTAFold3 (RF3) is the successor to RoseTTAFold All-Atom (RFAA) [3], extended and retrained on the AtomWorks framework [1]. Like AlphaFold3 [4] and OpenFold3, it moves beyond single-protein folding toward joint co-prediction of multimodal complexes: the model takes every molecular component (protein chains, DNA or RNA strands, small molecule ligands, metal ions, modified residues) and predicts their combined three-dimensional structure in one shot.

The three-track architecture of RoseTTAFold remains at the core: sequence, pairwise distance, and three-dimensional coordinates are processed together across multiple network layers [5]. RF3 extends this to handle atomic-level representations of non-protein entities, not just residue-level tokens for amino acids.

Three features distinguish RF3 from prior open-source implementations:

Improved chirality treatment. Chirality errors (predicting a mirror-image geometry) are a subtle but consequential failure mode for small molecules and modified residues. RF3 narrows the gap between open-source implementations and the closed-source AlphaFold3 in this area [1].

AtomWorks framework. AtomWorks is a general data framework for training biomolecular foundation models across diverse tasks: structure prediction, generative protein design, and fixed-backbone sequence design. By releasing AtomWorks alongside RF3, RosettaCommons makes it straightforward to train new models on the same infrastructure rather than starting from scratch [1].

BSD license. Unlike AlphaFold3, which restricts commercial and high-throughput use, RF3 is released under a permissive BSD license. This makes it suitable for integration into computational pipelines, commercial drug discovery workflows, and downstream tool development without API restrictions.

RoseTTAFold3 vs. Other All-Atom Models

Model	Open source	License	Proteins	DNA/RNA	Small molecules	Metals
AlphaFold3	No (weights only)	Restricted	Yes	Yes	Yes	Yes
OpenFold3	Yes	Apache 2	Yes	Yes	Yes	Yes
RoseTTAFold All-Atom	Yes	BSD	Yes	Yes	Yes	Yes
RoseTTAFold3	Yes	BSD	Yes	Yes	Yes	Yes
Chai-1	Partially	Restricted	Yes	Yes	Yes	Yes
Boltz-2	Yes	MIT	Yes	Yes	Yes	Yes

RF3's key advantage over the prior RFAA is the improved training data pipeline and chirality handling. Its key advantage over AlphaFold3 and Chai-1 is the permissive license and full open weights. For a systematic benchmark of these models across nine prediction task categories, see FoldBench [2].

When to Use RoseTTAFold3 Online

RF3 is most useful when:

You need open-source, auditable model weights with no API restrictions, no rate limits, and no commercial-use prohibitions
Your system contains nucleic acids: DNA-protein complexes, RNA-protein interactions, transcription factor binding sites
Your system contains small molecules, metals, or covalent modifications alongside proteins
You want to run a large batch or build a pipeline around an all-atom model (BSD license allows commercial integration)
You want a second opinion alongside AlphaFold3 or OpenFold3 on the same complex

For pure protein-only predictions, a dedicated tool like AlphaFold2 or ESMFold may be faster. RF3's strength is in the multimodal case.

You can open the RoseTTAFold3 service directly here:

https://neurosnap.ai/service/RoseTTAFold3

Example: p53 Tumor Suppressor Bound to DNA

For this walkthrough, we use the p53 DNA-binding domain with a consensus p53 response element — the same system captured in PDB 1TUP, a 2.35 Å crystal structure of the p53 core domain bound to DNA [6]. This is one of the most-studied protein-DNA complexes in biology: p53 is the most frequently mutated gene in human cancer, and its ability to bind specific DNA sequences and activate target genes is central to its tumor suppressor function [7].

The example is a good RF3 tutorial because it involves two molecular classes (protein and DNA), has a high-resolution reference structure for comparison, and presents a question that pure protein-folding tools cannot answer: how does the p53 domain grip the DNA duplex?

Input Sequences

The p53 DNA-binding domain (219 residues, from PDB 1TUP chains C/D/E):

The p53 consensus response element is two half-sites of the form RRRCWWGYYY separated by a 0–13 bp spacer. PDB 1TUP uses a 21-mer duplex:

Strand A: TTTCCTAGACTTGCCCAATTA
Strand B: ATAATTGGGCAAGTCTAGGAA
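As a quick sanity check before submission, the half-site pattern above can be turned into a regular expression and scanned against each strand. This is a minimal sketch and not part of the Neurosnap workflow: the function name `find_half_sites` is hypothetical, and the IUPAC codes are expanded as R = A/G, W = A/T, Y = C/T.

```python
import re

# IUPAC codes needed for the p53 half-site pattern RRRCWWGYYY.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}


def find_half_sites(seq, pattern="RRRCWWGYYY"):
    """Return (0-based start, matched text) for every half-site in seq."""
    body = "".join(IUPAC[code] for code in pattern)
    # A zero-width lookahead lets overlapping matches be reported too.
    regex = re.compile(f"(?=({body}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(seq.upper())]


strand_a = "TTTCCTAGACTTGCCCAATTA"
strand_b = "ATAATTGGGCAAGTCTAGGAA"
for name, strand in (("A", strand_a), ("B", strand_b)):
    print(name, find_half_sites(strand))
```

Each 21-mer contains one match (AGACTTGCCC on strand A, GGGCAAGTCT on strand B). The two matches are reverse complements of each other, so they are the same half-site read from opposite strands.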

The two strands form a B-form DNA duplex. The p53 DBD uses two distinct DNA-contacting surfaces: the loop-sheet-helix motif reaches into the major groove (anchored by Arg273 and Lys120), while loop L3's Arg248 inserts into the minor groove [6].
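Before submitting, it can be worth confirming that the two strands really are complementary and in which register they pair. The sketch below slides strand B against the reverse complement of strand A and counts matching positions. The function names and the `max_shift` parameter are illustrative choices, not anything RF3 or Neurosnap requires.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_pairs(strand_a, strand_b, shift):
    """Count Watson-Crick pairs when strand_b is offset by `shift` bases."""
    target = reverse_complement(strand_a)
    pairs = 0
    for i, base in enumerate(target):
        j = i + shift
        if 0 <= j < len(strand_b) and strand_b[j] == base:
            pairs += 1
    return pairs


def best_register(strand_a, strand_b, max_shift=3):
    """Return (paired bases, shift) for the best-pairing register."""
    shifts = range(-max_shift, max_shift + 1)
    return max((count_pairs(strand_a, strand_b, s), s) for s in shifts)


print(best_register("TTTCCTAGACTTGCCCAATTA", "ATAATTGGGCAAGTCTAGGAA"))
```

For the two strands listed above, the best register pairs 20 of the 21 positions at a shift of one nucleotide, which leaves a single unpaired base at the 5′ end of each strand.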

Configure the RoseTTAFold3 Job

On the RoseTTAFold3 submission page, provide all three chains together in a single job.

Setting	Value
Input Sequences (protein)	p53 DBD, 219 residues
Input Sequences (DNA strand A)	`TTTCCTAGACTTGCCCAATTA`
Input Sequences (DNA strand B)	`ATAATTGGGCAAGTCTAGGAA`
Input Molecules	(none)
MSA Mode	`mmseqs2_uniref_env`
Number Recycles	10
Diffusion Steps	200 (default)

The most important setup detail is that all three chains (the protein and both DNA strands) are submitted together in the same job. RF3 is not being used here as a protein-folding step followed by docking; it reasons over the full protein-DNA system jointly.

The completed public job used for this post:

https://neurosnap.ai/job/6a20d98afb58bafccd97ab6c?share=6a2124e2fb58bafccd97ae60

Reading the Results

RoseTTAFold3 returned five ranked models, all with nearly identical scores — ranking scores span just 0.629 to 0.627 across ranks 1–5 with no clashes in any model. That level of convergence across independent diffusion trajectories is a good sign: the model found one stable solution rather than five distinct hypotheses.

Confidence Metrics

RoseTTAFold3 outputs confidence scores that follow the same general framework as other AlphaFold-lineage models, with a few important interpretive differences for protein-DNA systems.

pLDDT (predicted local distance difference test) is a per-residue local confidence score. For the p53 DBD, high pLDDT (above 80–90) across the protein core indicates the model is confident in the local geometry of the fold. For the DNA strands, pLDDT is less interpretable than for proteins: far fewer nucleic acid structures than protein structures are available for training, and terminal nucleotides frequently show lower confidence.

pTM (predicted TM-score) summarizes global fold confidence across the full complex. Values above 0.5 indicate a broadly correct fold; values in the 0.7–0.9 range for a protein-DNA complex indicate confident overall geometry.

ipTM (interface pTM) is the metric that matters most for the protein-DNA question. A high ipTM indicates the model is confident about how the protein and DNA chains are positioned relative to each other (the interface geometry), not just that each chain is locally well-folded.

PAE (predicted aligned error) shows residue-pair positional uncertainty as a matrix. In a well-predicted protein-DNA complex, the cross-chain PAE between protein residues and DNA bases should be low where the protein contacts the major groove, and higher in distal regions. This plot is one of the best tools for identifying which protein residues the model places confidently near the DNA.

The rank 1 model scores (chain order: p53 DBD → strand A → strand B):

Metric	Value
Overall pLDDT	85.7
pTM	0.83
ipTM	0.58
Ranking score	0.630
Clashes	None

Per-chain pTM:

Chain	pTM
p53 DBD (protein)	0.86
DNA strand A	0.85
DNA strand B	0.85

Cross-chain PAE (mean / minimum, Å):

Interface	Mean PAE (Å)	Min PAE (Å)
p53 DBD ↔ Strand A	13.0	5.7
p53 DBD ↔ Strand B	13.0	5.7
Strand A ↔ Strand B	3.9	1.5

The per-chain pTM scores are uniformly high (0.85–0.86), so each chain is individually well modelled. The ipTM of 0.58 is in a moderate range for a protein-DNA complex: it is not as high as a tight protein-protein interface, but protein-DNA ipTM values are systematically lower because the contact surface is smaller relative to the chain lengths. The mean protein–DNA PAE (~13 Å) averages positional uncertainty over every protein–DNA residue pair, including pairs far from the interface, while the minimum PAE (~5.7 Å) belongs to the most confidently placed cross-chain pair. The DNA duplex internal PAE is tight (mean 3.9 Å, min 1.5 Å), indicating that the model placed the two strands as a coherent double helix.
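Cross-chain summaries like the ones in the table can be recomputed from a full PAE matrix once the chain boundaries are known. The sketch below is one plausible way to do it, under two assumptions that are not confirmed by the source: the matrix has one row and column per residue or nucleotide in chain order, and both off-diagonal blocks (A→B and B→A) are pooled. Neurosnap may compute its reported values differently.

```python
from itertools import accumulate


def chain_ranges(chain_lengths):
    """Map chain lengths to index ranges along the PAE matrix axes."""
    ends = list(accumulate(chain_lengths))
    starts = [0] + ends[:-1]
    return [range(start, end) for start, end in zip(starts, ends)]


def cross_chain_pae(pae, chain_lengths, chain_a, chain_b):
    """Return (mean, min) PAE pooled over both blocks linking two chains."""
    ranges = chain_ranges(chain_lengths)
    rows, cols = ranges[chain_a], ranges[chain_b]
    values = [pae[i][j] for i in rows for j in cols]
    # PAE is not symmetric, so the transposed block is included as well.
    values += [pae[j][i] for i in rows for j in cols]
    return sum(values) / len(values), min(values)


# Toy 3-token example: chain 0 has two tokens, chain 1 has one.
toy_pae = [[0.0, 1.0, 5.0],
           [1.0, 0.0, 7.0],
           [6.0, 8.0, 0.0]]
print(cross_chain_pae(toy_pae, [2, 1], 0, 1))  # (6.5, 5.0)
```

For the job in this post, the chain lengths would be `[219, 21, 21]` under the one-token-per-residue assumption, with chain indices 0, 1, and 2 for the p53 DBD, strand A, and strand B.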

What to Check in a Protein-DNA Prediction

For a p53-DNA result specifically:

Is the protein fold reasonable? The p53 DBD is a well-characterized immunoglobulin-like beta-sandwich. The bulk of the domain should show high pLDDT and adopt the expected topology.
Are the key contact residues positioned correctly? The experimental 1TUP structure places Arg273 and Lys120 in the major groove and Arg248 in the minor groove [6]. A plausible prediction should have the loop-sheet-helix motif facing the major groove and loop L3 (carrying Arg248) positioned at the minor groove, not the reverse, and not floating free of the DNA.
Does ipTM support the interface? A high interface confidence score is necessary (but not sufficient) evidence that the predicted docking geometry is meaningful.
Does the PAE matrix show cross-chain confidence? Low cross-chain PAE between the protein contact residues and the central DNA bases is a positive sign.
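The contact-residue check can be made semi-quantitative by measuring how close each residue of interest sits to the DNA in the predicted model. The sketch below assumes the atom coordinates have already been extracted from the structure file into plain (x, y, z) tuples. The 4.0 Å cutoff and the function names are illustrative assumptions, and the check only reports proximity: it does not tell the major groove from the minor groove, which still needs visual inspection.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between two sets of (x, y, z) points."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def contact_report(residue_atoms, dna_atoms, cutoff=4.0):
    """Map each residue label to (closest distance in Å, within cutoff?).

    residue_atoms: dict such as {"Arg248": [(x, y, z), ...], ...}
    dna_atoms:     list of (x, y, z) tuples for all DNA atoms
    cutoff:        contact threshold in Å (an assumed, adjustable value)
    """
    report = {}
    for label, atoms in residue_atoms.items():
        distance = min_distance(atoms, dna_atoms)
        report[label] = (round(distance, 2), distance <= cutoff)
    return report


# Toy coordinates, not taken from any real structure.
toy_residues = {"near": [(0.0, 0.0, 0.0)], "far": [(10.0, 0.0, 0.0)]}
toy_dna = [(3.0, 0.0, 0.0)]
print(contact_report(toy_residues, toy_dna))
```

A residue that is expected to contact the DNA but sits well beyond the cutoff is a prompt to look at the model more closely, not proof that the prediction is wrong.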

Practical Interpretation

For a protein-DNA RF3 result, use this triage order:

Check that the protein fold is globally reasonable (pLDDT, pTM).
Check that the DNA strands form a recognizable double-helical structure.
Check that the protein is docked against the DNA, not floating free.
Check interface confidence (ipTM) and the cross-chain PAE matrix.
Compare predicted contact residues against known biochemistry or mutagenesis data.
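The scalar parts of this triage (the first and fourth checks) can be captured in a small helper. This is a hedged sketch: the pLDDT and pTM defaults echo the rough thresholds given under Confidence Metrics, while `iptm_min` is a placeholder to tune for your own system, not a value from the RF3 authors. The remaining checks need the structure itself and are not covered here.

```python
def triage(plddt, ptm, iptm, has_clashes,
           plddt_min=80.0, ptm_min=0.5, iptm_min=0.5):
    """Return a list of warnings; an empty list means no scalar red flags.

    All thresholds are adjustable assumptions, not fixed cut-offs.
    """
    warnings = []
    if plddt < plddt_min:
        warnings.append(f"overall pLDDT {plddt} is below {plddt_min}")
    if ptm < ptm_min:
        warnings.append(f"pTM {ptm} is below {ptm_min}")
    if iptm < iptm_min:
        warnings.append(f"ipTM {iptm} is below {iptm_min}")
    if has_clashes:
        warnings.append("model contains steric clashes")
    return warnings


# Rank 1 scores reported above for the p53-DNA job.
print(triage(plddt=85.7, ptm=0.83, iptm=0.58, has_clashes=False))
```

An empty list only means the numbers raise no flags. The DNA geometry, the docking pose, and the contact residues still have to be inspected in the model.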

Treat the result as a structural hypothesis. For p53, there is rich mutational and structural data to cross-reference: any prediction that misplaces Arg273 away from the major groove or Arg248 away from the minor groove should be viewed skeptically even if the global confidence scores look good.

Practical Tips

Submit all chains together. RF3 is a co-folding model. Do not fold the protein alone and then dock the DNA; submit everything in one job so the model can use the full molecular context during prediction.

Use both DNA strands. A duplex is two strands. Submitting only one strand gives the model an incomplete picture of the DNA geometry, particularly for the contacts that anchor p53 to the phosphodiester backbone.

The DNA pLDDT is less informative than the protein pLDDT. Short DNA strands, especially terminal nucleotides, commonly show low confidence scores even in good predictions. Evaluate the DNA geometry visually and via the cross-chain PAE rather than relying on per-nucleotide pLDDT alone.

Check known contact residues. For well-characterized DNA-binding proteins like p53, mutagenesis data and prior crystal structures define which residues contact the DNA and in which groove. Arg273 and Lys120 are major groove contacts; Arg248 is a minor groove contact [6]. Use these as anchor points; a prediction that swaps these assignments is almost certainly wrong regardless of confidence scores.

Use RF3 results to guide experiments. Identify predicted contact residues with high confidence, then design point mutations (e.g., Arg-to-Ala at predicted contacts) and test them computationally before committing to synthesis.
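When building mutant inputs for a follow-up run, the main pitfall is numbering: a domain construct rarely starts at residue 1, so a residue such as Arg273 in full-length numbering is not at position 273 of a 219-residue input. The helper below is a minimal sketch that makes the offset explicit and refuses to mutate if the expected wild-type residue is not found. The function name and the `first_residue_number` parameter are hypothetical, and the correct offset for your construct must come from its own annotation.

```python
def point_mutant(seq, residue_number, expected, replacement,
                 first_residue_number=1):
    """Return seq with one residue replaced, using construct numbering.

    residue_number:       position in the numbering scheme you are using
    expected:             wild-type one-letter code expected at that position
    replacement:          one-letter code to substitute
    first_residue_number: number of the first residue of seq in that scheme
    """
    index = residue_number - first_residue_number
    if not 0 <= index < len(seq):
        raise ValueError(f"residue {residue_number} is outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at {residue_number}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


# Toy example: a 4-residue fragment whose first residue is numbered 10.
print(point_mutant("MKRA", 12, "R", "A", first_residue_number=10))  # MKAA
```

The wild-type check catches off-by-one and wrong-offset mistakes before a mutant sequence is submitted.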

Conclusion

RoseTTAFold3 on Neurosnap brings open-source all-atom co-prediction to the browser. For protein-DNA problems such as predicting how a transcription factor grips its response element, or how a DNA-repair enzyme recognizes a lesion, this is a genuine capability that protein-only models cannot provide.

The key questions to ask of any RF3 protein-DNA result are:

Does the protein adopt a reasonable fold with high pLDDT?
Does the model place the protein docked against the DNA in a chemically sensible orientation?
Do the interface confidence (ipTM) and cross-chain PAE support the predicted contact geometry?

For the p53-DNA system shown here, these questions are backed by 30 years of structural and mutational data, making it a useful calibration case for any new all-atom prediction tool.

References

[1] Corley et al. Accelerating Biomolecular Modeling with AtomWorks and RF3. bioRxiv, 2025.
[2] Xu et al. Benchmarking all-atom biomolecular structure prediction with FoldBench. Nature Communications, 2025.
[3] Krishna et al. Generalized Biomolecular Modeling and Design with RoseTTAFold All-Atom. Science, 2024.
[4] Abramson et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 2024.
[5] Baek et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science, 2021.
[6] Cho et al. Crystal structure of a p53 tumor suppressor-DNA complex: understanding tumorigenic mutations. Science, 1994.
[7] Lane & Crawford. The p53 tumour suppressor gene. Nature, 1979.
