ESMC

Protein language model jobs for embedding extraction and masked-token sequence scoring.

How SubSeq Runs It

Input

Provide input in one of three ways: a protein sequence typed directly, a single FASTA file, or a folder of FASTA files. Sequences should be unaligned and written in amino-acid letters.

>example
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
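
A quick local check before submitting can catch aligned or malformed input early. The sketch below is only an illustration: the helper names `is_plain_protein` and `write_fasta` are hypothetical, and the 20-letter alphabet is an assumption, since this page does not state exactly which letters ESMC accepts (ambiguity codes such as X may or may not be allowed).

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the sequence is non-empty and made
# entirely of the 20 standard amino-acid letters (uppercase).
# ASSUMPTION: the exact alphabet accepted by SubSeq is not documented here.
is_plain_protein() {
  local seq="$1"
  [[ "$seq" =~ ^[ACDEFGHIKLMNPQRSTVWY]+$ ]]
}

# Hypothetical helper: write a single-record FASTA file after checking the
# sequence. Gap characters such as "-" fail the check, which is a simple
# way to reject aligned input.
write_fasta() {
  local name="$1" seq="$2" path="$3"
  if ! is_plain_protein "$seq"; then
    echo "rejecting $name: sequence contains non-amino-acid characters" >&2
    return 1
  fi
  printf '>%s\n%s\n' "$name" "$seq" > "$path"
}
```

For instance, `write_fasta example MQIFVKTLTGK example.fasta` would produce a two-line file with the header `>example`, while a sequence containing `-` would be rejected with a message on stderr.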

Example Arguments

Embedding

embed
--sequence=MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
--name=example
--pool=mean
--batch-size=1

Embedding a Folder

embed
--input-dir=/inputs/fastas
--recursive
--pool=both
--batch-size=1

Scoring

score
--fasta=/inputs/sequences.fasta
--batch-size=1
--max-score-length=1024

Raw API Submit

curl -X POST https://subseq.bio/api/v1/job/submit \
  -H "Authorization: Bearer sk-ss-admin-123" \
  -F program=esmc \
  -F args=embed \
  -F args=--sequence=MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG \
  -F args=--name=example \
  -F args=--pool=mean \
  -F args=--batch-size=1
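
The request above repeats one `-F args=...` field per argument, in the same order as the argument lists shown earlier. A small wrapper can build those fields from an ordinary argument list. This is a minimal sketch, not an official client: the function names and the `SUBSEQ_API_TOKEN` environment variable are hypothetical, while the endpoint and form field names are copied from the curl example.

```shell
#!/usr/bin/env bash

# Endpoint copied from the curl example on this page.
SUBSEQ_SUBMIT_URL="https://subseq.bio/api/v1/job/submit"

# Fill the global array ESMC_FORM with "-F program=esmc" followed by one
# "-F args=<value>" pair per argument, preserving the order given
# (mode first, then its flags).
esmc_form_fields() {
  ESMC_FORM=(-F program=esmc)
  local arg
  for arg in "$@"; do
    ESMC_FORM+=(-F "args=$arg")
  done
}

# Submit a job with the given arguments.
# ASSUMPTION: SUBSEQ_API_TOKEN is a hypothetical environment variable used
# here to keep the API key out of the script; the shell aborts with a
# message if it is unset.
submit_esmc() {
  esmc_form_fields "$@"
  curl -X POST "$SUBSEQ_SUBMIT_URL" \
    -H "Authorization: Bearer ${SUBSEQ_API_TOKEN:?set SUBSEQ_API_TOKEN first}" \
    "${ESMC_FORM[@]}"
}
```

With this in place, the embedding request above could be written as `submit_esmc embed --sequence=MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG --name=example --pool=mean --batch-size=1`.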

Arguments

Modes

Input

Embeddings

Scoring

Outputs

Submit

Queue a run from New Job -> ESMC.