Use-Case Protocols

Boltz2 Parameter Generator playbooks for real modeling scenarios.

This page translates the full protocol guide into implementation-ready workflows: setup rules, validation guardrails, and step-by-step use cases for proteins, nucleic acids, ligands, templates, and constraints.

Coverage: 14 workflow protocols

From single-chain predictions to template-guided and constrained multi-entity jobs.

Input model: YAML-first builder

Supports protein, DNA, RNA, ligands, advanced constraints, and run parameter generation.

Quality guardrails: validation-aware flow

Includes ID rules, MSA modes, ligand constraints, affinity caveats, and troubleshooting patterns.

Overview

Main UI blocks and expected execution flow.

1. Chain Builder

Add entities with explicit IDs and sequence/value inputs.

Add Primary Protein, Add Protein, Add DNA, Add RNA, Add Ligand

2. Advanced Modeling

Optional but critical for guided or constrained prediction.

Structural Templates, Covalent Ligand Bond Builder, Pocket / Contact Conditioning

3. YAML to Run Flow

Use this order for reproducible runs.

  1. Fill in the entities and any optional advanced settings
  2. Click Save YAML
  3. Click Next
  4. Define run parameters
  5. Click OK to save run configuration

Standard Workflow

Five-step protocol from molecular system to run config.

01. Build the system

Add proteins, DNA/RNA, and ligands with unique normalized chain IDs.

02. Add advanced controls

Attach templates, covalent bonds, pocket guidance, or contact conditioning when needed.

03. Save YAML

Validate all entries and resolve missing ID, sequence, ligand, or MSA errors before continuing.

04. Set run parameters

Define job name, potentials, override behavior, and optional advanced performance settings.

05. Save run configuration

Commit run settings to file for launch and reproducible execution.

Global Rules

Validation and reliability constraints that should always hold.

IDs

Entity IDs and chain IDs

  • Every entity must use a unique ID.
  • IDs are normalized to uppercase.
  • Homomers can be entered as comma-separated IDs like A,B,C.
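
The ID rules above can be sketched as a small normalization helper. This is a minimal illustration, not the tool's own code: `normalize_ids` is a hypothetical name, and trimming whitespace around each ID is an assumption.

```python
def normalize_ids(raw_id_fields):
    """Split comma-separated ID fields, uppercase them, and reject duplicates.

    raw_id_fields: one string per entity block, e.g. ["a,b", "L"].
    Returns one list of normalized IDs per entity block.
    """
    seen = set()
    normalized = []
    for field in raw_id_fields:
        # Assumption: whitespace around an ID is ignored; empty tokens are errors.
        ids = [token.strip().upper() for token in field.split(",")]
        for chain_id in ids:
            if not chain_id:
                raise ValueError(f"missing ID in field {field!r}")
            if chain_id in seen:
                raise ValueError(f"duplicate ID: {chain_id}")
            seen.add(chain_id)
        normalized.append(ids)
    return normalized
```

For example, `normalize_ids(["a,b", "L"])` yields one grouped homomer block and one ligand block, while `normalize_ids(["A", "a"])` fails because both IDs normalize to `A`.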

MSA

Protein MSA modes

  • Server MSA: no upload required.
  • Custom MSA: upload an .a3m file for a single chain, or a paired .csv for multi-chain pairing.
  • Single-sequence mode (no MSA) lowers prediction accuracy and should be used only as a fallback.

LIG

Ligand rules

  • Ligand definition must use exactly one of CCD or SMILES.
  • Do not provide both fields and do not leave both empty.
  • Covalent constraints are most reliable with CCD ligands.
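
The exactly-one-of rule can be expressed as a short check. The sketch below uses a hypothetical helper, `ligand_entry`, and assumes the saved YAML stores ligands under a `ligand` key with `id` plus either `ccd` or `smiles`; verify the key names against a file produced by Save YAML.

```python
def ligand_entry(chain_id, ccd=None, smiles=None):
    """Return a ligand entity dict, enforcing exactly one of CCD or SMILES."""
    has_ccd = bool(ccd and ccd.strip())
    has_smiles = bool(smiles and smiles.strip())
    if has_ccd == has_smiles:
        # Both filled or both empty: neither is a valid ligand definition.
        raise ValueError("Ligands must be defined by exactly one of CCD or SMILES.")
    body = {"id": chain_id.strip().upper()}
    if has_ccd:
        body["ccd"] = ccd.strip()
    else:
        body["smiles"] = smiles.strip()
    return {"ligand": body}
```

Comparing the two booleans for equality catches both failure modes (both fields set, both fields empty) in one condition.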

AFF

Affinity mode constraints

  • Affinity selection supports one ligand chain only.
  • Intended for protein-small molecule systems.
  • Jobs with RNA, DNA, or co-factor targets may run, but their affinity output is unreliable.

TMP

Template constraints

  • Template files: .cif or .pdb.
  • Optional chain_id and template_id can be set explicitly.
  • If force is enabled, provide a threshold value.

CST

Conditioning constraints

  • Pocket/contact max_distance is typically 4-20 Å; 6 Å is a practical default.
  • Use force only when prior assumptions are high confidence.
  • These advanced features are YAML-only; the deprecated FASTA input format does not support them.

Core Use Cases

Primary protocols for proteins and protein-ligand systems.

1. Single Protein Prediction

Goal: Predict one protein chain.

  1. Add Primary Protein with ID A and sequence.
  2. Select MSA mode (server, custom .a3m, or single-sequence fallback).
  3. Click Save YAML, click Next, define run parameters, then save the run configuration.

What to enter: chain ID, sequence, optional .a3m.
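
As a rough sketch of what this protocol produces, the dictionary below mirrors a single-chain YAML file. The helper `protein_entry`, the `version` and `sequences` keys, the `msa: empty` marker for single-sequence mode, and the placeholder sequence are all assumptions for illustration; check them against the YAML the builder actually saves.

```python
def protein_entry(chain_id, sequence, msa_mode="server", msa_path=None):
    """Build one protein entity for the three MSA modes described above."""
    body = {"id": chain_id.strip().upper(), "sequence": sequence}
    if msa_mode == "custom":
        if not msa_path:
            raise ValueError("Custom MSA mode requires uploaded .a3m or paired .csv.")
        body["msa"] = msa_path
    elif msa_mode == "single":
        # Assumed marker for single-sequence (no MSA) mode.
        body["msa"] = "empty"
    elif msa_mode != "server":
        raise ValueError(f"unknown MSA mode: {msa_mode!r}")
    # Server mode: no msa key is written; the MSA is assumed to be requested at run time.
    return {"protein": body}


# Placeholder sequence, for illustration only.
single_chain = {
    "version": 1,
    "sequences": [protein_entry("a", "MVLSPADKTNVKAAW")],
}
```

Serializing `single_chain` with a YAML library would give a file with one `protein` block for chain `A`.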

2. Homo-Oligomer Prediction

Goal: Predict complexes with identical chains.

  1. Use one protein block with IDs like A,B (or A,B,C).
  2. Enter shared sequence once and choose MSA mode.
  3. Save the YAML, then define and save run parameters as usual.

Key note: Use grouped IDs for homomers instead of duplicate manual blocks.
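
A grouped ID field maps naturally onto one protein block whose `id` is a list. The sketch below assumes that list form is how the saved YAML represents homomers; `homomer_entry` is a hypothetical helper and the sequence is a placeholder.

```python
def homomer_entry(id_field, sequence):
    """Turn a grouped ID field such as "A,B,C" into a single protein block."""
    chain_ids = [t.strip().upper() for t in id_field.split(",") if t.strip()]
    if len(chain_ids) < 2:
        raise ValueError("a homomer needs at least two chain IDs")
    if len(set(chain_ids)) != len(chain_ids):
        raise ValueError("chain IDs must be unique")
    # The shared sequence is stored once and applies to every listed chain.
    return {"protein": {"id": chain_ids, "sequence": sequence}}


trimer = homomer_entry("a,b,c", "MKTAYIAKQR")
```

Keeping the sequence in one block avoids the copy-paste drift that can occur when identical chains are entered as separate manual blocks.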

3. Hetero-Oligomer Prediction

Goal: Multi-chain protein complexes with distinct sequences.

  1. Add primary protein A and sequence.
  2. Add additional protein blocks for B, C, and so on.
  3. Use server MSA, custom MSA, or paired .csv for multi-chain pairing workflows.

Run tip: choose an MSA pairing strategy (greedy or complete) in the run parameters.

4. Protein-Ligand Interaction Prediction

Goal: Build structural protein-small molecule complexes.

  1. Add protein block and sequence.
  2. Add ligand block with ID L and choose CCD or SMILES.
  3. Save YAML and proceed to run configuration.

Rule: exactly one ligand encoding mode (CCD or SMILES).

5. Protein-Ligand + Affinity Prediction

Goal: Request affinity output for one ligand chain.

  1. Create protein and ligand entities first.
  2. Enable Predict Ligand Affinity and select the ligand chain.
  3. Save YAML, then finalize run parameters.

Reliability note: intended for protein-small molecule systems only.
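
The single-ligand restriction can be checked before the YAML is saved. In this sketch, `affinity_property` is a hypothetical helper, and the `properties` / `affinity` / `binder` layout is assumed to match the saved YAML.

```python
def affinity_property(entities, binder):
    """Return a properties block requesting affinity for one ligand chain.

    entities: list of single-key dicts such as {"ligand": {"id": "L", "ccd": "ATP"}}.
    """
    ligand_ids = set()
    for entity in entities:
        for kind, body in entity.items():
            if kind == "ligand":
                ids = body["id"] if isinstance(body["id"], list) else [body["id"]]
                ligand_ids.update(ids)
    if binder not in ligand_ids:
        raise ValueError(f"affinity binder {binder!r} is not a ligand chain")
    # Exactly one binder is accepted, matching the one-ligand-chain rule.
    return [{"affinity": {"binder": binder}}]


system = [
    {"protein": {"id": "A", "sequence": "MKTAYIAKQR"}},  # placeholder sequence
    {"ligand": {"id": "L", "ccd": "ATP"}},
]
properties = affinity_property(system, "L")
```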

Advanced Modeling

Constraint-guided and chemistry-aware protocols.

6. Residue Modification on Protein, DNA, or RNA

Goal: Add PTMs and chemical residue edits.

  1. Add polymer entity and sequence.
  2. Open Modified Residues, add position (1-based) and CCD code.
  3. Save YAML and proceed.
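
Because positions are 1-based, an off-by-one check is worth doing before saving. The following sketch uses a hypothetical `add_modification` helper and assumes modified residues are stored as a `modifications` list of `position` / `ccd` pairs; the peptide is a placeholder.

```python
def add_modification(polymer, position, ccd):
    """Attach a modified residue to a polymer body (protein, DNA, or RNA)."""
    length = len(polymer["sequence"])
    if not 1 <= position <= length:
        raise ValueError(f"position {position} is outside 1..{length}")
    polymer.setdefault("modifications", []).append(
        {"position": position, "ccd": ccd}
    )
    return polymer


peptide = {"id": "A", "sequence": "GSGSEGS"}
# Residue 4 is a serine; SEP is the CCD code for phosphoserine.
add_modification(peptide, 4, "SEP")
```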

7. Cyclic Peptide or Cyclic Polymer

Goal: Model circular polymer topology.

  1. Add polymer block and sequence.
  2. Enable Cyclic Polymer.
  3. Save YAML and run settings.

Rule: cyclic applies to polymers, not ligands.

8. Template-Guided Protein Prediction

Goal: Bias structure generation with known template geometry.

  1. Build the chain system, then add .cif or .pdb template files.
  2. Optionally set explicit chain mapping (chain_id) and template_id.
  3. If force mode is enabled, provide a threshold in angstroms (Å).
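
The template rules can be sketched as a builder that refuses a forced template without a threshold. `template_entry` is a hypothetical helper; using the file extension (`cif` or `pdb`) as the key for the path is an assumption about the saved YAML layout.

```python
import os


def template_entry(path, chain_id=None, template_id=None, force=False, threshold=None):
    """Build one template entry from a .cif or .pdb file."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in (".cif", ".pdb"):
        raise ValueError("template files must be .cif or .pdb")
    entry = {ext.lstrip("."): path}
    if chain_id is not None:
        entry["chain_id"] = chain_id
    if template_id is not None:
        entry["template_id"] = template_id
    if force:
        if threshold is None:
            raise ValueError("force requires a threshold value (in angstroms)")
        entry["force"] = True
        entry["threshold"] = threshold
    return entry


forced = template_entry("reference.cif", chain_id="A", force=True, threshold=2.0)
```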

9. Covalent Protein-Ligand Complex

Goal: Represent explicit covalent attachment.

  1. Add protein and ligand (prefer CCD ligand).
  2. In Covalent Ligand Bond Builder, define atom pairs for both sides.
  3. Save YAML and continue.

Example: A:145:SG to L:1:C1.
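
The `chain:residue:atom` notation in the example can be parsed into an atom pair as follows. Both helper names are hypothetical, and the `bond` / `atom1` / `atom2` layout is an assumption about how the builder writes the constraint.

```python
def parse_atom(spec):
    """Parse "A:145:SG" into [chain ID, 1-based residue index, atom name]."""
    chain, residue, atom = spec.split(":")  # raises ValueError unless 3 parts
    return [chain.strip().upper(), int(residue), atom.strip()]


def bond_constraint(atom1, atom2):
    """Build a covalent bond constraint between two atom specifications."""
    return {"bond": {"atom1": parse_atom(atom1), "atom2": parse_atom(atom2)}}


bond = bond_constraint("A:145:SG", "L:1:C1")
```

Here the sulfur of residue 145 on chain A is linked to atom C1 of ligand L, which is treated as residue 1 of its own chain.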

10. Pocket-Constrained Ligand Binding

Goal: Guide placement toward a known pocket.

  1. Add protein and ligand entities.
  2. In Pocket / Contact Conditioning, set binder chain and contacts.
  3. Set max_distance (commonly 6 Å) and the optional force flag.
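
A pocket constraint for these steps might look like the sketch below. `pocket_constraint` is a hypothetical helper, the key names are assumed to match the saved YAML, and rejecting distances outside 4-20 Å is a deliberately strict choice for this example rather than documented tool behavior.

```python
def pocket_constraint(binder, contacts, max_distance=6.0, force=False):
    """Build a pocket constraint for one binder chain.

    contacts: iterable of (chain ID, residue index or atom name) pairs.
    """
    contacts = [[chain, token] for chain, token in contacts]
    if not contacts:
        raise ValueError("at least one pocket contact is required")
    if not 4.0 <= max_distance <= 20.0:
        raise ValueError("max_distance should lie within 4-20 angstroms")
    return {
        "pocket": {
            "binder": binder,
            "contacts": contacts,
            "max_distance": max_distance,
            "force": force,
        }
    }


# Residue numbers are placeholders.
pocket = pocket_constraint("L", [("A", 45), ("A", 87)])
```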

11. Contact-Constrained Interface Modeling

Goal: Enforce specific interface contacts.

  1. Add complete molecular system.
  2. Create contact constraints using chain plus residue/atom tokens.
  3. Apply optional max_distance and force only when justified.
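
A contact constraint pairs two tokens, each a chain ID plus a residue index or atom name. In this sketch the `contact` / `token1` / `token2` keys are assumed, `contact_constraint` is a hypothetical helper, and optional fields are written only when set so that the tool's defaults apply otherwise.

```python
def contact_constraint(token1, token2, max_distance=None, force=False):
    """Build a contact constraint between two (chain, residue-or-atom) tokens."""
    for token in (token1, token2):
        if len(token) != 2:
            raise ValueError(f"token {token!r} must be (chain ID, residue or atom)")
    body = {"token1": list(token1), "token2": list(token2)}
    if max_distance is not None:
        body["max_distance"] = max_distance
    if force:
        # Only set when the contact is known with high confidence.
        body["force"] = True
    return {"contact": body}


# Residue numbers are placeholders.
interface = contact_constraint(("A", 102), ("B", 37), max_distance=8.0)
```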

DNA / RNA

Nucleic-acid protocols for mixed and standalone systems.

12. Protein-DNA Complex

Goal: Model protein binding to DNA chains.

  1. Add protein entity then DNA entity with valid nucleotide sequence.
  2. Optionally add contact constraints for interface guidance.
  3. Save YAML and set run parameters.

13. Protein-RNA Complex

Goal: Model protein-RNA assemblies.

  1. Add protein and RNA chains with explicit IDs.
  2. Optionally apply contact or pocket-style conditioning.
  3. Save YAML and finalize run settings.

Note: affinity output is not considered reliable for RNA-target scenarios.

14. DNA-Only or RNA-Only Input

Goal: Run nucleic-acid-only predictions.

  1. Add DNA or RNA block.
  2. Provide IDs, sequences, and optional modifications.
  3. Save YAML and continue to run parameters.
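
For nucleic-acid blocks, a quick alphabet check catches the most common input mistake (T in RNA, U in DNA). This sketch assumes the strict four-letter alphabets and `dna` / `rna` block keys; `nucleic_entry` is a hypothetical helper, and the tool itself may accept additional symbols.

```python
ALPHABETS = {"dna": set("ACGT"), "rna": set("ACGU")}


def nucleic_entry(kind, chain_id, sequence):
    """Build a DNA or RNA entity after checking the nucleotide alphabet."""
    if kind not in ALPHABETS:
        raise ValueError("kind must be 'dna' or 'rna'")
    sequence = sequence.strip().upper()
    invalid = sorted(set(sequence) - ALPHABETS[kind])
    if not sequence or invalid:
        raise ValueError(f"invalid {kind.upper()} sequence; unexpected symbols: {invalid}")
    return {kind: {"id": chain_id.strip().upper(), "sequence": sequence}}


strand = nucleic_entry("dna", "c", "acgtacgt")
```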

Run Parameters

Required fields, common toggles, and advanced controls.

Required

  • Job Name is mandatory; spaces are normalized to underscores.
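
The job-name rule can be sketched in a few lines. `normalize_job_name` is a hypothetical helper; trimming leading and trailing whitespace first, and mapping each space to one underscore, are assumptions about how the normalization behaves.

```python
def normalize_job_name(name):
    """Reject empty job names and replace spaces with underscores."""
    name = name.strip()
    if not name:
        raise ValueError("Job Name is mandatory")
    return name.replace(" ", "_")
```

For instance, `normalize_job_name("kinase run 1")` returns `"kinase_run_1"`.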

Common toggles

  • Use Potentials: improves physical plausibility for many jobs.
  • Override: ignores existing outputs and reruns the job from a fresh output state.

Advanced options

  • Recycling Steps: 3
  • Sampling Steps: 200
  • Diffusion Samples: 1
  • Step Scale: 1.638
  • Max MSA Sequences: 8192
  • Subsample MSA and pairing strategy (greedy/complete)
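
To show how the listed values could map onto a launch command, the sketch below assembles an argument list. The flag names are assumed to match the `boltz predict` command line and should be confirmed with `boltz predict --help`; `predict_args` and `DEFAULTS` are hypothetical names, and the generator may store these settings differently in its run configuration file.

```python
# Values listed above; flag names are assumptions.
DEFAULTS = {
    "recycling_steps": 3,
    "sampling_steps": 200,
    "diffusion_samples": 1,
    "step_scale": 1.638,
    "max_msa_seqs": 8192,
}


def predict_args(yaml_path, use_potentials=False, override=False,
                 pairing="greedy", **overrides):
    """Assemble a command-line argument list from run parameters."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if pairing not in ("greedy", "complete"):
        raise ValueError("pairing strategy must be 'greedy' or 'complete'")
    params = {**DEFAULTS, **overrides}
    args = ["boltz", "predict", yaml_path]
    for name, value in params.items():
        args += [f"--{name}", str(value)]
    args += ["--msa_pairing_strategy", pairing]
    if use_potentials:
        args.append("--use_potentials")
    if override:
        args.append("--override")
    return args


command = predict_args("job.yaml", override=True, diffusion_samples=5)
```

Returning a list rather than a joined string keeps the result safe to pass to `subprocess.run` without shell quoting concerns.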

Validation & Troubleshooting

The most common validation errors, and recommended warning text for guardrails.

Common validation errors

  • Missing IDs, missing sequences, or duplicate IDs
  • Ligand block with invalid CCD/SMILES configuration
  • Custom MSA selected but no upload provided
  • Affinity enabled without valid ligand chain selection

Recommended warning messages

  • Each chain or molecule must have a unique ID.
  • Ligands must be defined by exactly one of CCD or SMILES.
  • Custom MSA mode requires uploaded .a3m or paired .csv.
  • Single-sequence mode is lower accuracy and should be fallback only.
  • Affinity prediction is intended for protein-small molecule systems.
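
The checks above can be collected into one pass that reports every problem at once instead of stopping at the first. This is a minimal sketch over a hypothetical flat form representation (dicts with `kind`, `id`, and optional `sequence`, `ccd`, `smiles`, `msa_mode`, `msa_path`); it is not the tool's own validator.

```python
def validate(entities, affinity_binder=None):
    """Return a list of warning messages; an empty list means the form is valid."""
    problems = []
    seen = set()
    ligand_ids = set()
    for entity in entities:
        chain_id = (entity.get("id") or "").strip().upper()
        if not chain_id or chain_id in seen:
            problems.append("Each chain or molecule must have a unique ID.")
        seen.add(chain_id)
        if entity["kind"] == "ligand":
            ligand_ids.add(chain_id)
            if bool(entity.get("ccd")) == bool(entity.get("smiles")):
                problems.append("Ligands must be defined by exactly one of CCD or SMILES.")
        else:
            if not entity.get("sequence"):
                problems.append(f"Missing sequence for {chain_id or 'unnamed entity'}.")
            if entity.get("msa_mode") == "custom" and not entity.get("msa_path"):
                problems.append("Custom MSA mode requires uploaded .a3m or paired .csv.")
    if affinity_binder is not None and affinity_binder.strip().upper() not in ligand_ids:
        problems.append("Affinity prediction requires a valid ligand chain selection.")
    return problems


form = [
    {"kind": "protein", "id": "A", "sequence": "MK", "msa_mode": "custom"},
    {"kind": "ligand", "id": "a", "ccd": "ATP", "smiles": "CCO"},
]
problems = validate(form, affinity_binder="L")
```

For the `form` shown, four problems are reported: the missing custom MSA upload, the duplicate ID after uppercasing, the ligand with both CCD and SMILES, and the affinity selection that does not point at a ligand chain.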