September 13, 2022
ST.25 vs. ST.26 Sequence Listings
For the second article in our Big Bang Blog Series, we break down the differences between the WIPO Standards ST.25 and ST.26. Read along for details including a direct comparison of the old and the new.
What are the benefits of the new WIPO Standard ST.26?
- ST.26 allows standardization of sequence listing filing across multiple patent offices.
- Data could be lost when ST.25 sequence listings were transferred to sequence databases, whereas ST.26 is compliant with current public sequence database requirements.
- With ST.26 sequence listings, Offices and applicants will benefit from automated validation and comprehensive searching capabilities.
What are the main differences between ST.25 and ST.26?
- ST.25-compliant sequence listings could be filed as TXT or PDF files, while ST.26-compliant sequence listings are to be filed in XML (extensible markup language) format.
- ST.25 does not require inclusion of D-amino acids, linear portions of branched sequences, or nucleotide analogs, while ST.26 does.
- ST.25 permits sequences with fewer than 10 specifically defined nucleotides or fewer than 4 specifically defined amino acids, while ST.26 prohibits them.
- In ST.26, DNA and RNA molecule types must be further described with a mandatory mol_type qualifier.
- For more details on the changes from ST.25 to ST.26, see the helpful table below, reproduced from WIPO’s ST.26 introduction webinar, “WIPO ST.26: Introduction”.
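Before the table, it may help to see roughly what "XML with a mol_type qualifier" looks like in practice. The snippet below is a minimal sketch, not a valid sequence listing: the fragment is hand-written for illustration, the element names follow the INSDSeq-style naming that ST.26 is understood to use and should be checked against the official WIPO DTD, and `read_qualifiers` is a hypothetical helper, not part of any WIPO tool.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only. Element names are assumed from the
# INSDSeq-style naming of ST.26; a real listing contains more required
# elements and must validate against the official WIPO DTD.
ST26_FRAGMENT = """
<SequenceData sequenceIDNumber="1">
  <INSDSeq>
    <INSDSeq_length>12</INSDSeq_length>
    <INSDSeq_moltype>DNA</INSDSeq_moltype>
    <INSDSeq_division>PAT</INSDSeq_division>
    <INSDSeq_feature-table>
      <INSDFeature>
        <INSDFeature_key>source</INSDFeature_key>
        <INSDFeature_location>1..12</INSDFeature_location>
        <INSDFeature_quals>
          <INSDQualifier>
            <INSDQualifier_name>mol_type</INSDQualifier_name>
            <INSDQualifier_value>other DNA</INSDQualifier_value>
          </INSDQualifier>
          <INSDQualifier>
            <INSDQualifier_name>organism</INSDQualifier_name>
            <INSDQualifier_value>synthetic construct</INSDQualifier_value>
          </INSDQualifier>
        </INSDFeature_quals>
      </INSDFeature>
    </INSDSeq_feature-table>
    <INSDSeq_sequence>atgcatgcatgc</INSDSeq_sequence>
  </INSDSeq>
</SequenceData>
"""


def read_qualifiers(xml_text):
    """Return (molecule type, {qualifier name: value}) for one sequence.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.strip())
    qualifiers = {}
    for qualifier in root.iter("INSDQualifier"):
        name = qualifier.findtext("INSDQualifier_name")
        value = qualifier.findtext("INSDQualifier_value")
        qualifiers[name] = value
    return root.findtext("INSDSeq/INSDSeq_moltype"), qualifiers


moltype, qualifiers = read_qualifiers(ST26_FRAGMENT)
print(moltype, qualifiers)
```

Because every value sits in a named element rather than under a numeric identifier, software can pick out the molecule type and its qualifiers directly, which is the kind of structure that makes automated validation practical.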
| WIPO ST.25 | WIPO ST.26 |
| --- | --- |
| ASCII .txt with numeric identifiers | XML with elements and attributes |
| Not required to include: D-amino acids; linear portions of branched sequences; nucleotide analogs | Must include: D-amino acids; linear portions of branched sequences; nucleotide analogs |
| Annotation of sequences: feature keys only | Annotation of sequences: feature keys and qualifiers |
| Permitted to include sequences: < 10 specifically defined nucleotides; < 4 specifically defined amino acids | Prohibited sequences: < 10 specifically defined nucleotides; < 4 specifically defined amino acids |
| ALL priority application information may be included | ONLY the earliest priority application can be included |
| ALL applicant and inventor names may be included | ONLY one applicant AND optionally ONE inventor may be included |
| One invention title permitted | Multiple invention titles permitted, each one in a different language |
| Applicant/inventor names and invention titles must be in basic Latin characters | Applicant/inventor names may be included using any valid Unicode character along with a basic Latin translation or transliteration |
| Sequences identified as DNA, RNA, or PRT only | Sequences identified as DNA, RNA, or AA along with a mandatory mol_type qualifier to further describe the molecule |
| Organism names: Latin genus/species; virus name; “artificial sequence”; “unknown” | Organism names: Latin genus/species; virus name; “synthetic construct”; “unidentified” |
| “u” represents uracil in nucleotide sequences | “t” represents uracil in RNA sequences and thymine in DNA sequences |
| Amino acid sequences represented by three letter abbreviations | Amino acid sequences represented by one letter abbreviations |
| “n” and “Xaa” variables must have a definition provided in a feature | Default value assumed for “n” and “X” variables with no definition |
| Feature location format not clearly defined | Strictly defined feature location formats; permits use of “<” and “>” in all sequence types, and “^”, “join”, “order”, and “complement” in nucleotide sequences |
| “Mixed mode” sequences permitted: nucleotide sequence with amino acid translation shown below | NO “mixed mode”; nucleotide translations are included in “translation” qualifiers only |
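To make a few of the rows above concrete, here is a minimal Python sketch of three of the changes: three-letter to one-letter amino acid codes, “u” to “t” in RNA, and the minimum-length rule. All function names are hypothetical and exist only for this illustration. The sketch assumes that “specifically defined” means any symbol other than “n” (nucleotides) or “X” (amino acids), and it covers only the 20 standard amino acids plus Asx, Glx, and Xaa. It is not a substitute for WIPO's own conversion and validation tooling.

```python
# Three-letter (ST.25 style) to one-letter (ST.26 style) amino acid codes.
# Covers the 20 standard residues plus Asx, Glx, and the unknown Xaa.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B", "Glx": "Z", "Xaa": "X",
}


def protein_to_one_letter(st25_residues):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in st25_residues.split())


def rna_u_to_t(st25_nucleotides):
    """Replace 'u' with 't', since ST.26 writes uracil as 't' in RNA."""
    return st25_nucleotides.lower().replace("u", "t")


def count_specifically_defined(sequence, is_protein):
    """Count residues other than the placeholder ('X' or 'n').

    Assumption: this is what 'specifically defined' means in the table.
    """
    placeholder = "X" if is_protein else "n"
    return sum(1 for symbol in sequence if symbol != placeholder)


def meets_minimum_length(sequence, is_protein):
    """Check the 4 amino acid / 10 nucleotide minimum from the table."""
    minimum = 4 if is_protein else 10
    return count_specifically_defined(sequence, is_protein) >= minimum


peptide = protein_to_one_letter("Met Ala Xaa Gly")
print(peptide, meets_minimum_length(peptide, is_protein=True))
print(rna_u_to_t("augcuu"))
```

In this example the peptide has four residues but only three specifically defined ones, so under the assumptions above it would fall below the ST.26 minimum even though ST.25 permitted it.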