Boltz-2

Boltz structure and property prediction, wrapped for subseq.bio.

Inputs and outputs

Guided modes

Affinity metrics are produced only when the YAML contains an affinity entry under properties. In guided mode, that means using Build Assembly with a small molecule and enabling Predict Affinity. In Custom Input or Batch Folder mode, each YAML file that contains a valid affinity property produces its own affinity metrics.
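As a rough illustration of what such a YAML can look like, the sketch below assembles one as plain text. The layout follows the upstream Boltz YAML format as we understand it (a ligand chain plus a properties list whose affinity entry names the binder chain). The function name, the chain IDs, and the aspirin SMILES string are illustrative assumptions, not SubSeq defaults.

```python
# Sketch: build a Boltz-style YAML string that requests affinity prediction.
# affinity_yaml, the chain IDs, and the SMILES below are hypothetical examples.

def affinity_yaml(protein_seq: str, ligand_smiles: str,
                  protein_id: str = "A", ligand_id: str = "B") -> str:
    """Return YAML text for one protein, one ligand, and an affinity request."""
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        f"      id: {protein_id}",
        f'      sequence: "{protein_seq}"',
        "      msa: empty",
        "  - ligand:",
        f"      id: {ligand_id}",
        f"      smiles: '{ligand_smiles}'",
        "properties:",
        "  - affinity:",
        f"      binder: {ligand_id}",  # the small-molecule chain to score
    ]
    return "\n".join(lines) + "\n"


yaml_text = affinity_yaml("MKTAYIAKQRQISFVKSHFSRQDILDLI", "CC(=O)Oc1ccccc1C(=O)O")
print(yaml_text)
```

Saving the printed text as a .yaml file under /inputs would give a Custom Input candidate. Check the upstream schema before relying on the exact field names.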

Example 1 — basic run

Use the default pattern shown on the New Job form: point Boltz-2 at your inputs directory and turn on potentials.

/inputs
--use_potentials
--out_dir=/outputs

Example 2 — local configuration file

You can also drive Boltz-2 with an explicit configuration file that lives under /inputs.

/inputs/my_config.yaml
--out_dir=/outputs/my_config_run
--use_potentials

Minimal no-MSA YAML pattern:

sequences:
  - protein:
      id: A
      sequence: "MKTAYIAKQRQISFVKSHFSRQDILDLI"
      msa: empty

Modified residues can be entered in guided Build Assembly mode, or specified directly in custom YAML:

version: 1
sequences:
  - protein:
      id: A
      sequence: "MSTNPKPQRKTKRNTNRRPQDVKFPGG"
      msa: empty
      modifications:
        - position: 2
          ccd: MSE

Example 3 — multimer from YAML

Guided Build Assembly mode can generate multiple protein, DNA, or RNA chains. Protein chains use msa: empty. For full control, create a YAML input at /inputs/multimer.yaml with multiple polymer chains:

sequences:
  - protein:
      id: A
      sequence: "MKTAYIAKQRQISFVKSHFSRQDILDLI"
      msa: empty
  - dna:
      id: B
      sequence: "ATCGATCG"

A DNA duplex is specified as two DNA chains, with both strands entered in their own 5-prime to 3-prime direction. In guided mode, use Add Reverse Complement after entering a DNA or RNA chain to append the reverse-complement strand.

version: 1
sequences:
  - dna:
      id: A
      sequence: "ATGCCGTA"
  - dna:
      id: B
      sequence: "TACGGCAT"

Then run the multimer input:

/inputs/multimer.yaml
--use_potentials
--out_dir=/outputs/multimer
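For custom YAML, the second strand of a duplex has to be written by hand. The helper below is a minimal sketch of the reverse-complement rule described above. It is a hypothetical utility, not part of Boltz or SubSeq, and it only handles the four standard bases.

```python
# Sketch: reverse-complement a DNA or RNA strand so both chains of a duplex
# are written 5' to 3'. reverse_complement is a hypothetical helper.

DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the reverse complement; raises KeyError on non-standard bases."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


strand_a = "ATGCCGTA"
strand_b = reverse_complement(strand_a)
print(strand_b)  # TACGGCAT, matching chain B in the duplex example
```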

For de novo binder screening, a common pattern is to keep the designed binder chain as msa: empty, while giving the natural target chain a real MSA when available.
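A screening setup along these lines could be generated with a short script. The sketch below writes one YAML text per candidate binder, keeping the binder at msa: empty and pointing the target chain at a precomputed alignment. The target MSA path, the file names, and the binder names are hypothetical placeholders. It also assumes that the msa field accepts a path to an .a3m file, as in the upstream Boltz format.

```python
# Sketch: one YAML per designed binder; the target keeps a real MSA.
# The MSA path, file names, and binder names are hypothetical placeholders.

def binder_screen_yaml(binder_seq: str, target_seq: str, target_msa_path: str) -> str:
    """Return YAML text with an MSA-free binder (A) and an MSA-backed target (B)."""
    return "\n".join([
        "version: 1",
        "sequences:",
        "  - protein:",
        "      id: A",
        f'      sequence: "{binder_seq}"',
        "      msa: empty",               # designed binder: no MSA
        "  - protein:",
        "      id: B",
        f'      sequence: "{target_seq}"',
        f"      msa: {target_msa_path}",  # natural target: precomputed MSA
    ]) + "\n"


target = "MKTAYIAKQRQISFVKSHFSRQDILDLI"
binders = {
    "binder_001": "MSTNPKPQRKTKRNTNRRPQDVKFPGG",
    "binder_002": "MKTAYIAKQRQISFVKSHFSRQDILDLI",
}

# Map each output file name to its YAML text.
yaml_files = {
    f"{name}.yaml": binder_screen_yaml(seq, target, "/inputs/target.a3m")
    for name, seq in binders.items()
}
for file_name in yaml_files:
    print(file_name)
```

Each generated text could then be saved as its own file for a Batch Folder run.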

Quality-oriented sampling

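The upstream defaults are already the normal full prediction path: --sampling_steps=200, --recycling_steps=3, --diffusion_samples=1, --step_scale=1.5, and --max_msa_seqs=8192. The example below keeps most of those values explicit, raises --recycling_steps to 10, writes the full PAE and PDE matrices, and fixes the seed so the run can be reproduced.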

/inputs/my_complex.yaml
--out_dir=/outputs/my_complex_seed01
--use_potentials
--recycling_steps=10
--sampling_steps=200
--diffusion_samples=1
--max_parallel_samples=1
--step_scale=1.5
--max_msa_seqs=8192
--output_format=pdb
--write_full_pae
--write_full_pde
--seed=12345
--override
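The seed01 suffix in the output directory suggests repeating the run with different seeds. As a sketch of that pattern, the script below builds one argument list per seed, each with its own output directory. The function name and the extra seed values are assumptions for illustration. Each printed list is meant to be used as the arguments for a separate job.

```python
# Sketch: build one boltz predict argument list per seed.
# seed_sweep and the seed values after 12345 are hypothetical.

def seed_sweep(yaml_path: str, run_name: str, seeds: list[int]) -> list[list[str]]:
    """Return one argument list per seed, each with a distinct --out_dir."""
    runs = []
    for index, seed in enumerate(seeds, start=1):
        runs.append([
            yaml_path,
            f"--out_dir=/outputs/{run_name}_seed{index:02d}",
            "--use_potentials",
            "--recycling_steps=10",
            "--sampling_steps=200",
            "--diffusion_samples=1",
            f"--seed={seed}",
            "--override",
        ])
    return runs


runs = seed_sweep("/inputs/my_complex.yaml", "my_complex", [12345, 23456, 34567])
for args in runs:
    print("\n".join(args), end="\n\n")
```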

Additional CLI options

SubSeq passes arguments directly to boltz predict, while managing the model cache automatically.

Common boltz predict options:
--out_dir PATH
--recycling_steps INT
--sampling_steps INT
--diffusion_samples INT
--max_parallel_samples INT
--step_scale FLOAT
--output_format [pdb|mmcif]
--num_workers INT
--preprocessing-threads INT
--override
--seed INT
--max_msa_seqs INT
--subsample_msa
--num_subsampled_msa INT
--no_kernels
--use_potentials
--method STRING
--model [boltz1|boltz2]
--affinity_mw_correction
--sampling_steps_affinity INT
--diffusion_samples_affinity INT
--write_full_pae
--write_full_pde
--write_embeddings

For detailed defaults and the YAML schema, see the upstream Boltz prediction instructions.

Notes