ESMC
Protein language model jobs for embedding extraction and masked-token sequence scoring.
How SubSeq Runs It
- Jobs run /ref/bin/esmc_subseq.py with the ESM runtime.
- Model assets are managed by SubSeq.
- Network access is disabled for jobs, and output is forced to /outputs.
Input
Use typed protein sequence text, one FASTA file, or a folder of FASTA files. Inputs should be unaligned protein sequences using amino-acid letters.
>example
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
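Before submitting, it can help to check locally that a FASTA file holds unaligned protein sequences. The sketch below is a minimal example, not SubSeq's own validation: the helper names `parse_fasta` and `invalid_characters` are hypothetical, and the accepted alphabet (the 20 standard amino-acid letters) is an assumption, since the exact set of letters the job accepts is not stated here.

```python
# Assumption: only the 20 standard amino-acid letters count as valid.
# The job itself may accept more (for example X, B, Z, U); adjust as needed.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # sequence lines may wrap across rows
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def invalid_characters(sequence):
    """Return the sorted characters that are not standard residue letters."""
    return sorted(set(sequence.upper()) - VALID_RESIDUES)


fasta_text = """>example
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
"""

records = parse_fasta(fasta_text)
for name, seq in records:
    # An empty list means no gap or other unexpected characters were found.
    print(name, len(seq), invalid_characters(seq))
```

Gap characters such as `-` or `.` show up in the returned list, which is a quick way to spot aligned sequences before they reach a job.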
Example Arguments
Embedding
embed
--sequence=MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
--name=example
--pool=mean
--batch-size=1
Embedding a Folder
embed
--input-dir=/inputs/fastas
--recursive
--pool=both
--batch-size=1
Scoring
score
--fasta=/inputs/sequences.fasta
--batch-size=1
--max-score-length=1024
Raw API Submit
curl -X POST https://subseq.bio/api/v1/job/submit \
-H "Authorization: Bearer sk-ss-admin-123" \
-F program=esmc \
-F args=embed \
-F args=--sequence=MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG \
-F args=--name=example \
-F args=--pool=mean \
-F args=--batch-size=1
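The same request can be assembled from a script. The sketch below only builds the form fields and header that the curl command sends, so it runs without network access; `build_submit_fields` and `auth_header` are hypothetical helper names, and the key shown is a placeholder to be replaced with your own.

```python
SUBMIT_URL = "https://subseq.bio/api/v1/job/submit"  # endpoint from the curl example


def build_submit_fields(program, args):
    """Return ordered (field, value) pairs mirroring the curl -F flags.

    Each job argument becomes its own repeated "args" field, in order.
    """
    fields = [("program", program)]
    fields.extend(("args", arg) for arg in args)
    return fields


def auth_header(api_key):
    """Return the bearer-token header used by the curl example."""
    return {"Authorization": f"Bearer {api_key}"}


job_args = [
    "embed",
    "--sequence=MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG",
    "--name=example",
    "--pool=mean",
    "--batch-size=1",
]

fields = build_submit_fields("esmc", job_args)
headers = auth_header("YOUR_API_KEY")  # placeholder, not a real key

for key, value in fields:
    print(f"{key}={value}")
```

Any HTTP client that supports repeated multipart fields can then post `fields` to `SUBMIT_URL`. With the third-party `requests` library, for instance, passing `files=[(k, (None, v)) for k, v in fields]` together with `headers=headers` should produce an equivalent multipart body.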
Arguments
Modes
- embed: extract mean and/or per-token embeddings.
- score: compute masked-token pseudo-likelihood scores.
Input
- --sequence=<protein>: one typed protein sequence.
- --name=<label>: output label for typed sequence input.
- --fasta=/inputs/example.fasta: one FASTA file.
- --input-dir=/inputs/fastas: folder of FASTA files.
- --recursive: search subfolders when using a FASTA folder.
Embeddings
- --pool=mean|tokens|both: write one vector per sequence, per-token vectors, or both. Default: mean.
- --batch-size=<int>: embedding batch size. Default: 1.
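To illustrate what the three pooling choices mean, here is a toy sketch in plain Python. It assumes per-token embeddings arrive as a list of equal-length vectors, one per token. Whether the actual job includes special tokens in the mean is not stated here, so treat this as an illustration of the idea rather than the job's exact computation; `mean_pool` and `pool` are hypothetical names.

```python
def mean_pool(token_vectors):
    """Average per-token vectors (L x D) into a single vector of length D."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    length = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / length for d in range(dim)]


def pool(token_vectors, mode="mean"):
    """Return the arrays a given pooling mode would keep.

    mean   -> one vector per sequence
    tokens -> the per-token vectors unchanged
    both   -> both of the above
    """
    if mode == "mean":
        return {"mean": mean_pool(token_vectors)}
    if mode == "tokens":
        return {"tokens": token_vectors}
    if mode == "both":
        return {"mean": mean_pool(token_vectors), "tokens": token_vectors}
    raise ValueError(f"unknown pool mode: {mode}")


# Toy example: a 3-token sequence with 2-dimensional embeddings.
toy_tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = pool(toy_tokens, mode="both")
print(pooled["mean"])         # one vector of length 2
print(len(pooled["tokens"]))  # one vector per token
```

The practical trade-off is size: the mean output stays fixed per sequence, while per-token output grows with sequence length.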
Scoring
- --batch-size=<int>: masked-position scoring batch size. Default: 1.
- --max-score-length=<int>: maximum token count for pseudo-likelihood scoring. Default: 1024.
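For readers unfamiliar with masked-token pseudo-likelihood, the general idea is to mask each position in turn, ask the model for the probability of the true residue at that position, and sum the log-probabilities. The sketch below shows that loop with a stand-in model that returns uniform probabilities. It is not the ESMC scoring code: `uniform_model`, the `MASK` symbol, and the choice to raise an error above the length limit are all assumptions made for illustration.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
MASK = "#"  # hypothetical mask symbol for this toy example


def uniform_model(masked_tokens, position):
    """Stand-in for a masked language model: equal probability per residue."""
    return {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}


def pseudo_log_likelihood(sequence, predict, max_score_length=1024):
    """Sum log P(true residue | rest of sequence) over all positions.

    Assumption: sequences longer than max_score_length raise an error here;
    how the real job handles them is not specified in this document.
    """
    if len(sequence) > max_score_length:
        raise ValueError("sequence exceeds max_score_length")
    tokens = list(sequence)
    per_position = []
    for i, true_residue in enumerate(tokens):
        # Hide position i, leaving every other residue visible.
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        probs = predict(masked, i)
        per_position.append(math.log(probs[true_residue]))
    return sum(per_position), per_position


total, per_position = pseudo_log_likelihood("MQIF", uniform_model)
print(total, per_position)
```

Because every position needs its own masked forward pass, cost grows with sequence length, which is the usual motivation for a length cap and a batch size over masked positions.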
Outputs
- Embeddings: /outputs/embeddings_manifest.json and one or more .npy arrays.
- Scoring: /outputs/scores.tsv, /outputs/scores.json, and per-sequence masked-token detail TSVs.
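Once a job finishes, the text outputs can be read with the standard library. The sketch below makes no assumption about column names or manifest keys, since their schema is not described here; it returns whatever the files contain. `parse_tsv`, `load_scores`, and `load_manifest` are hypothetical helper names.

```python
import csv
import io
import json
from pathlib import Path


def parse_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def load_scores(path):
    """Read a scores TSV file and return its rows as dicts of strings."""
    return parse_tsv(Path(path).read_text(encoding="utf-8"))


def load_manifest(path):
    """Read the embeddings manifest JSON as-is (structure not assumed)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Only runs where the job outputs are actually present.
scores_path = Path("/outputs/scores.tsv")
if scores_path.exists():
    for row in load_scores(scores_path):
        print(row)
```

The .npy arrays listed in the manifest can be opened with `numpy.load`; inspect the manifest first to see which file belongs to which sequence.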
Submit
Queue a run from New Job -> ESMC.