ESMFold2

Biohub ESMFold2 all-atom structure prediction using the ESMC-6B language model.

How SubSeq Runs It

Input

Upload one or more FASTA files under /inputs. A single-chain protein can be one record; a protein complex can be one FASTA file with multiple records.

>A
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
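A multi-record FASTA file for a complex can be generated with a small script. The sketch below is illustrative only: the helper name format_fasta and the chain IDs A and B are assumptions, and the second chain simply reuses the example sequence above to form a hypothetical homodimer.

```python
def format_fasta(records):
    """Return FASTA text with one header line and one sequence line per chain."""
    lines = []
    for chain_id, sequence in records:
        if not chain_id or not sequence:
            raise ValueError("each record needs a chain ID and a sequence")
        lines.append(f">{chain_id}")
        # Keep each sequence on a single line, matching the example above.
        lines.append(sequence.strip().upper())
    return "\n".join(lines) + "\n"


# Example sequence copied from the record above; chain B is a hypothetical second copy.
UBIQUITIN = "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
complex_fasta = format_fasta([("A", UBIQUITIN), ("B", UBIQUITIN)])
```

Saving complex_fasta to a file and uploading it under /inputs gives one FASTA file with two records.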

For richer ESMFold2 inputs, upload a JSON file with sequences entries for proteins, DNA, RNA, ligands, modifications, MSAs, pockets, distograms, or covalent bonds.

{
  "id": "protein_ligand",
  "sequences": [
    {"type": "protein", "id": "A", "sequence": "MKTAYIAKQRQISFVKSHFSRQDILDL"},
    {"type": "ligand", "id": "L", "smiles": "CCO"}
  ]
}
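The same JSON can be produced programmatically, which helps avoid quoting mistakes. This is a minimal sketch that uses only the fields shown in the example above (id, sequences, type, sequence, smiles); the helper name build_input is an assumption, and other entry types are not covered because their fields are not shown here.

```python
import json


def build_input(job_id, proteins, ligands=None):
    """Build a structured input dict from {chain_id: sequence} and {ligand_id: smiles}."""
    sequences = [
        {"type": "protein", "id": chain_id, "sequence": sequence}
        for chain_id, sequence in proteins.items()
    ]
    for ligand_id, smiles in (ligands or {}).items():
        sequences.append({"type": "ligand", "id": ligand_id, "smiles": smiles})
    return {"id": job_id, "sequences": sequences}


payload = build_input(
    "protein_ligand",
    {"A": "MKTAYIAKQRQISFVKSHFSRQDILDL"},
    {"L": "CCO"},
)
json_text = json.dumps(payload, indent=2)
```

Writing json_text to a file such as /inputs/complex.json gives an input that can be passed with -j.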

Example Arguments

-I=/inputs
--model-preset=fast
--num-sampling-steps=32
--num-diffusion-samples=1

Use -i=/inputs/example.fasta to fold one FASTA file explicitly, or -j=/inputs/complex.json for structured JSON.

--model-preset=full
--num-sampling-steps=50
--num-loops=3
--chunk-size=64
--esmc-precision=bf16

Use fast for typical single-sequence FASTA folding. Choose full for richer structured JSON inputs, especially when using MSAs, ligands, nucleic acids, modifications, pockets, distograms, or covalent bonds.
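The two argument sets above can be selected from the input type. The sketch below encodes the guidance as a simple heuristic, not as wrapper behavior: the helper name example_args is hypothetical, a .json path is assumed to want the full preset, and any path without a .fasta or .json suffix is assumed to be an input directory.

```python
from pathlib import PurePosixPath

# Argument sets copied from the examples above.
FAST_ARGS = [
    "--model-preset=fast",
    "--num-sampling-steps=32",
    "--num-diffusion-samples=1",
]
FULL_ARGS = [
    "--model-preset=full",
    "--num-sampling-steps=50",
    "--num-loops=3",
    "--chunk-size=64",
    "--esmc-precision=bf16",
]


def example_args(input_path):
    """Return an argument list for one input path (heuristic, see assumptions above)."""
    suffix = PurePosixPath(input_path).suffix.lower()
    if suffix == ".json":
        # Structured JSON input: pass with -j and use the full preset.
        return [f"-j={input_path}", *FULL_ARGS]
    if suffix == ".fasta":
        # Single FASTA file: pass with -i and use the fast preset.
        return [f"-i={input_path}", *FAST_ARGS]
    # Assumed to be a directory of FASTA files: pass with -I.
    return [f"-I={input_path}", *FAST_ARGS]
```

A plain JSON input without MSAs or ligands may still run well on fast, so treat the JSON-to-full mapping as a starting point.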

Arguments

Inputs

Model

Sampling

The normal setting for the fast preset is 32 steps, which suits routine FASTA jobs. Use 50 steps for the full preset, MSA-backed jobs, or quality checks. Try 100-200 steps only as an expensive retry when confidence is borderline; prefer adding an MSA or other context before increasing steps. Extra diffusion samples are useful for ranking and diversity, but they are not a reliable fix for missing context.
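The step-count guidance can be written as a small decision function. This is only a sketch of the rule of thumb above: the function name and its flags are hypothetical, and the retry value of 100 is simply the low end of the 100-200 range.

```python
def choose_sampling_steps(preset="fast", has_msa=False, quality_check=False,
                          borderline_retry=False):
    """Suggest a value for --num-sampling-steps following the guidance above."""
    if borderline_retry:
        # Expensive retry for borderline confidence; start at the low end of 100-200.
        return 100
    if preset == "full" or has_msa or quality_check:
        return 50
    # Routine FASTA job on the fast preset.
    return 32


steps_arg = f"--num-sampling-steps={choose_sampling_steps(preset='full')}"
```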

Validation

Server-Owned Paths

Outputs

The wrapper writes mmCIF structures and JSON confidence summaries, for example /outputs/example.cif and /outputs/example.json.

The JSON summary includes plddt_mean, ptm, iptm, and the output CIF path. It does not include Boltz-specific fields such as confidence_score or complex_ipde.
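The confidence values can be read from the summary with the standard json module. The sketch below assumes plddt_mean, ptm, and iptm are top-level keys and returns None for any that are absent; the helper name is hypothetical, the numbers in the sample string are placeholders rather than real results, and the key holding the CIF path is not read because its name is not given here.

```python
import json

# Field names taken from the description above.
METRIC_KEYS = ("plddt_mean", "ptm", "iptm")


def read_confidence(summary_text):
    """Parse a JSON confidence summary and return the three metrics (None if missing)."""
    summary = json.loads(summary_text)
    return {key: summary.get(key) for key in METRIC_KEYS}


# Placeholder content standing in for a file such as /outputs/example.json.
sample_summary = '{"plddt_mean": 80.5, "ptm": 0.7}'
metrics = read_confidence(sample_summary)
```

Using .get rather than direct indexing keeps the script from failing if a field is missing from a particular summary.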

Submit

Queue a run from New Job -> ESMFold2.