September 13, 2022

ST.25 vs. ST.26 Sequence Listings

For the second article in our Big Bang Blog Series, we break down the differences between the WIPO Standards ST.25 and ST.26. Read along for details including a direct comparison of the old and the new.

 

What are the benefits of the new WIPO Standard ST.26?

  • ST.26 allows standardization of sequence listing filing across multiple patent offices.
  • Data could be lost when ST.25 sequence listings were transferred to sequence databases, while ST.26 is compliant with current public sequence database requirements.
  • With ST.26 sequence listings, Offices and applicants will benefit from automated validation and comprehensive searching capabilities.

What are the main differences between ST.25 and ST.26?

  • ST.25-compliant sequence listings could be filed as TXT or PDF files, while ST.26-compliant sequence listings are to be filed in XML (extensible markup language) format.
  • ST.25 does not require inclusion of D-amino acids, linear portions of branched sequences, or nucleotide analogs, while ST.26 does.
  • ST.25 permits inclusion of sequences with fewer than 10 specifically defined nucleotides or fewer than 4 specifically defined amino acids, while such sequences are prohibited in ST.26.
  • In ST.26, DNA and RNA molecule types must be further described using a mandatory mol_type qualifier.
  • For more details on ST.25 to ST.26 changes, see the table below, reproduced from WIPO’s ST.26 introduction webinar, “WIPO ST.26: Introduction.”
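To make the move from TXT to XML a little more concrete, here is a minimal Python sketch that builds one sequence entry with a mol_type qualifier. The element names (SequenceData, INSDSeq, INSDFeature, and so on) reflect our reading of the ST.26 XML structure and should be checked against the standard itself; the function name build_sequence_entry and the example values are hypothetical. It produces a fragment only, not a complete or validated sequence listing.

```python
import xml.etree.ElementTree as ET


def build_sequence_entry(seq_id, residues, moltype, mol_type_value, organism):
    """Build one illustrative ST.26-style sequence entry (a fragment only).

    Element names are assumptions based on the ST.26 XML structure.
    """
    data = ET.Element("SequenceData", sequenceIDNumber=str(seq_id))
    seq = ET.SubElement(data, "INSDSeq")
    ET.SubElement(seq, "INSDSeq_length").text = str(len(residues))
    ET.SubElement(seq, "INSDSeq_moltype").text = moltype  # DNA, RNA, or AA
    ET.SubElement(seq, "INSDSeq_division").text = "PAT"

    # A "source" feature carries the organism and mol_type qualifiers.
    table = ET.SubElement(seq, "INSDSeq_feature-table")
    feature = ET.SubElement(table, "INSDFeature")
    ET.SubElement(feature, "INSDFeature_key").text = "source"
    ET.SubElement(feature, "INSDFeature_location").text = f"1..{len(residues)}"
    quals = ET.SubElement(feature, "INSDFeature_quals")
    for name, value in (("mol_type", mol_type_value), ("organism", organism)):
        qualifier = ET.SubElement(quals, "INSDQualifier")
        ET.SubElement(qualifier, "INSDQualifier_name").text = name
        ET.SubElement(qualifier, "INSDQualifier_value").text = value

    ET.SubElement(seq, "INSDSeq_sequence").text = residues
    return data


# Hypothetical example values, chosen only for illustration.
entry = build_sequence_entry(1, "atgcttagca", "DNA", "genomic DNA", "Homo sapiens")
print(ET.tostring(entry, encoding="unicode"))
```

Where ST.25 relied on numeric identifiers in a flat text file, each piece of information here sits in a named element or attribute, which is what makes automated validation practical.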

 

| WIPO ST.25 | WIPO ST.26 |
|---|---|
| ASCII .txt with numeric identifiers | XML with elements and attributes |
| Not required to include: D-amino acids; linear portions of branched sequences; nucleotide analogs | Must include: D-amino acids; linear portions of branched sequences; nucleotide analogs |
| Annotation of sequences: feature keys only | Annotation of sequences: feature keys and qualifiers |
| Permitted to include sequences with: < 10 specifically defined nucleotides; < 4 specifically defined amino acids | Prohibited sequences: < 10 specifically defined nucleotides; < 4 specifically defined amino acids |
| ALL priority application information may be included | ONLY the earliest priority application can be included |
| ALL applicant and inventor names may be included | ONLY one applicant AND optionally ONE inventor may be included |
| One invention title permitted | Multiple invention titles permitted, each one in a different language |
| Applicant/inventor names and invention titles must be in basic Latin characters | Applicant/inventor names may be included using any valid Unicode character along with a basic Latin translation or transliteration |
| Sequences identified as DNA, RNA, or PRT only | Sequences identified as DNA, RNA, or AA along with a mandatory mol_type qualifier to further describe the molecule |
| Organism names: Latin genus/species; virus name; “artificial sequence”; “unknown” | Organism names: Latin genus/species; virus name; “synthetic construct”; “unidentified” |
| “u” represents uracil in nucleotide sequences | “t” represents uracil in RNA sequences and thymine in DNA sequences |
| Amino acid sequences represented by three-letter abbreviations | Amino acid sequences represented by one-letter abbreviations |
| “n” and “Xaa” variables must have a definition provided in a feature | Default value assumed for “n” and “X” variables with no definition |
| Feature location format not clearly defined | Strictly defined feature location formats; permits use of “<” and “>” in all sequence types, and “^”, “join”, “order”, and “complement” in nucleotide sequences |
| “Mixed mode” sequences permitted (nucleotide sequence with amino acid translation shown below) | NO “mixed mode”; nucleotide translations are included in “translation” qualifiers only |
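
A few of the rows above are mechanical enough to sketch in code. The Python below is a rough illustration, not a conversion tool: it rewrites three-letter amino acid codes as one-letter codes, writes uracil as “t”, and counts specifically defined residues (anything other than “n” or “X”) against the 10-nucleotide and 4-amino-acid minimums. All function names are hypothetical, and the sketch assumes ST.25 protein input is given as space-separated three-letter codes.

```python
# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B", "Glx": "Z", "Xaa": "X",
}


def protein_to_one_letter(st25_protein):
    """Turn space-separated three-letter codes into a one-letter string."""
    return "".join(THREE_TO_ONE[code] for code in st25_protein.split())


def nucleotides_to_st26(st25_nucleotides):
    """Remove whitespace, lowercase, and write uracil as 't'."""
    compact = "".join(st25_nucleotides.split()).lower()
    return compact.replace("u", "t")


def count_specifically_defined(residues, is_protein):
    """Count residues other than the wildcard ('X' for protein, 'n' otherwise)."""
    wildcard = "X" if is_protein else "n"
    return sum(1 for residue in residues if residue != wildcard)


def meets_minimum_length(residues, is_protein):
    """Check the 4 amino acid / 10 nucleotide minimum."""
    minimum = 4 if is_protein else 10
    return count_specifically_defined(residues, is_protein) >= minimum


protein = protein_to_one_letter("Met Ala Xaa Gly")
rna = nucleotides_to_st26("augc uuag ca")
print(protein, meets_minimum_length(protein, is_protein=True))
print(rna, meets_minimum_length(rna, is_protein=False))
```

In this example the protein has only three specifically defined residues, so it would fall below the ST.26 minimum, while the ten-nucleotide RNA sequence just meets it. A real conversion also has to deal with features, qualifiers, and the other rows in the table, which this sketch leaves out.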