ESMFold2

Overview

Rapid protein fold checks and designed-sequence triage.
Protein complex and binder pose screening.
Context-rich systems with nucleic acids, ligands, modifications, or precomputed MSA information.

Modes

Mode	Input shape	When to use it
`fasta_inputs` FASTA Inputs (default)	Consumes a folder or output set; useful for batches and pipeline handoffs.	Fold every FASTA input in the selected source. Multiple records in one FASTA become multiple chains.
`protein_sequence` Protein Sequence	No uploaded input is required by the mode itself.	Fold one typed protein sequence as a single chain.
`json_inputs` JSON Inputs	Consumes a folder or output set; useful for batches and pipeline handoffs.	Fold every structured JSON complex file in one selected folder.

Canonical Job Configuration

These are the fields exposed by the default job configuration for esmfold2. They are also returned by GET /api/v1/program/params?program=esmfold2 and submitted as the params JSON object to POST /api/v1/job/submit.

Parameter	Type	Modes	What it does
`protein_sequence` Protein Sequence	Sequence	Protein Sequence	Required. Single-chain amino-acid sequence. Whitespace is ignored.
`prediction_name` Prediction Name	Text	Protein Sequence	Optional output stem for a direct sequence job.
`prediction_profile` Prediction Profile	Text	All modes	Auto uses the fast model for routine sequence/FASTA jobs and the full model for structured JSON complexes. Default: Auto; Options: Auto, Fast, Full
`sampling_effort` Sampling Effort	Text	All modes	Auto uses routine sampling for sequence/FASTA jobs and higher-effort sampling for structured JSON complexes. Default: Auto; Options: Auto, Fast, Standard, Thorough
`num_structures` Structures	Integer	All modes	How many structures to generate for each input. More structures increase runtime and are mainly useful for ranking or diversity. Default: 1; Range: 1-25
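
The options and ranges in the table above can be checked client-side before a job is submitted. The Python sketch below is illustrative only: check_params is a hypothetical helper, not part of the service API, and it assumes that omitted fields fall back to the documented defaults and that an omitted mode means fasta_inputs.

```python
# Allowed values copied from the parameter table above.
PROFILES = {"Auto", "Fast", "Full"}
EFFORTS = {"Auto", "Fast", "Standard", "Thorough"}


def check_params(params: dict) -> list:
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    # Assumption: an omitted mode falls back to the default, fasta_inputs.
    mode = params.get("mode", "fasta_inputs")
    if mode == "protein_sequence":
        # Whitespace is ignored by the service, so drop it before checking.
        seq = "".join(str(params.get("protein_sequence", "")).split())
        if not seq:
            problems.append("protein_sequence is required in protein_sequence mode")
    if params.get("prediction_profile", "Auto") not in PROFILES:
        problems.append("prediction_profile must be Auto, Fast, or Full")
    if params.get("sampling_effort", "Auto") not in EFFORTS:
        problems.append("sampling_effort must be Auto, Fast, Standard, or Thorough")
    n = params.get("num_structures", 1)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or not 1 <= n <= 25:
        problems.append("num_structures must be an integer from 1 to 25")
    return problems
```

Calling check_params on a params object before submission surfaces out-of-range values locally; the service remains the authority on what it accepts.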

Advanced configuration fields

Parameter	Type	Modes	What it does
`split_fasta_chainbreaks` Split FASTA Chain Breaks	Yes/no	FASTA Inputs	When FASTA records contain ':' chain breaks, submit them to ESMFold2 as separate chains with stable IDs. Disable to preserve upstream ESMFold2's colon handling. Default: true
`random_seed` Random Seed	Integer	All modes	Set for reproducible sampling; leave blank for a random seed. Range: 0-999999999
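
To make the effect of split_fasta_chainbreaks concrete, the Python sketch below splits one FASTA record on ':' into separate chains. It is an illustration, not the service's implementation: split_chain_breaks is a hypothetical helper, and the ID scheme (record ID plus a 1-based index) is an assumption, since the actual stable IDs are not specified here.

```python
def split_chain_breaks(record_id: str, sequence: str) -> list:
    """Split a ':'-separated sequence into (chain_id, chain_sequence) pairs."""
    # Drop whitespace, split on ':', and discard empty segments.
    parts = [p for p in "".join(sequence.split()).split(":") if p]
    if len(parts) == 1:
        # No chain break: keep the record as a single chain under its own ID.
        return [(record_id, parts[0])]
    # Hypothetical ID scheme: record ID plus a 1-based chain index.
    return [(f"{record_id}_{i}", part) for i, part in enumerate(parts, start=1)]
```

For example, a record "binder" with sequence "MKT:AYI" would yield two chains under this scheme, while a record without ':' is passed through unchanged.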

Outputs And Metrics

Predicted mmCIF structures.
JSON confidence summaries for local and interface confidence.
plddt_mean is mean local confidence; higher is better.
ptm summarizes global fold confidence; iptm is most useful for multichain interfaces.
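
When several structures are generated per input, the metrics above can be used to rank them. The Python sketch below assumes each confidence summary has already been loaded into a dict containing plddt_mean, ptm, and (for multichain jobs) iptm; the file names and exact JSON layout are not specified here, and the "name" key and rank_by_confidence helper are hypothetical.

```python
def _score(summary: dict, field: str) -> float:
    """Return the metric value, or -inf when it is missing or null."""
    value = summary.get(field)
    return float("-inf") if value is None else value


def rank_by_confidence(summaries: list, multichain: bool = False) -> list:
    """Sort confidence summaries best-first.

    Single-chain jobs rank by ptm; multichain jobs rank by iptm.
    plddt_mean breaks ties. Higher is better for all three metrics.
    """
    primary = "iptm" if multichain else "ptm"
    return sorted(
        summaries,
        key=lambda s: (_score(s, primary), _score(s, "plddt_mean")),
        reverse=True,
    )
```

Ranking by a single score is a convenience for triage; the top-ranked structures are still worth inspecting directly.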

Common Examples

Single protein: Protein Sequence, Auto profile, Auto effort, one structure.
Two-chain FASTA: FASTA Inputs with one FASTA file containing two records, one per chain.
Structured JSON complex: JSON Inputs with explicit protein, ligand, nucleic-acid, or modification records.

Example API params

{
  "mode": "protein_sequence",
  "protein_sequence": "MKTAYIAKQRQISFVKSHFSRQDILDLI",
  "prediction_profile": "Auto",
  "sampling_effort": "Auto",
  "num_structures": 1
}

Caveats

Predictions are static structural hypotheses, not dynamics, affinity, or activity measurements.
Include relevant chains, ligands, modifications, and MSA context when biology depends on them.
For interfaces, evaluate the predicted interface geometry and iptm together.

Advanced Submit

Advanced submit is still available for direct program arguments through POST /api/v1/job/submit-advanced. Prefer canonical configuration unless you need exact low-level arguments or are reproducing a known command line.

Advanced submit can pass prepared JSON inputs and lower-level inference controls.
Per-input limits are 1-8 loops, 1-200 sampling steps, and 1-25 diffusion samples.
Use JSON Inputs for the richest molecular context.

For comparison, the canonical submission of the example params above uses POST /api/v1/job/submit:

curl -X POST https://subseq.bio/api/v1/job/submit \
  -H "Authorization: Bearer <api_key>" \
  -F program=esmfold2 \
  -F 'params={"mode":"protein_sequence","protein_sequence":"MKTAYIAKQRQISFVKSHFSRQDILDLI","prediction_profile":"Auto","sampling_effort":"Auto","num_structures":1}'
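
The same canonical submission can be assembled from Python. The sketch below uses only the standard library and builds a multipart/form-data request with the two form fields that the curl command sends with -F. build_submit_request is a hypothetical helper, and the response format is not documented here, so the sketch stops at constructing the request.

```python
import json
import urllib.request
import uuid

SUBMIT_URL = "https://subseq.bio/api/v1/job/submit"


def build_submit_request(api_key: str, program: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST, like `curl -F`."""
    boundary = uuid.uuid4().hex
    fields = {
        "program": program,
        # Compact JSON, matching the params string in the curl example.
        "params": json.dumps(params, separators=(",", ":")),
    }
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            "",
            value,
        ]
    # Closing boundary, followed by a final CRLF.
    lines += [f"--{boundary}--", ""]
    body = "\r\n".join(lines).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(SUBMIT_URL, data=body, headers=headers, method="POST")
```

To send the request, pass the returned object to urllib.request.urlopen and read the response body; replace the api_key argument with a real key.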