SGD Help: All Associated Sequences

The All Associated Sequences page is a compendium of all Saccharomyces cerevisiae sequence entries, for any allele or strain, that are found in various external databases, including UniProt, EMBL, and Entrez. Links to MIPS are generated by submitting the MIPS database ID to the MIPS search program.


  1. Accessing All Associated Sequences
  2. Using All Associated Sequences

Accessing All Associated Sequences

All Associated Sequences for a given locus can be accessed from the "All Associated Seq" link in the External Links section, which is found near the bottom of the Locus Summary page beneath the Analyze Sequence section.

Using All Associated Sequences

The results page lists sequence IDs and their source databases in a table. To retrieve a sequence record, click its Sequence ID to go to the corresponding source database. The ID types are:

DNA accession ID: DNA accession IDs come from the Entrez Nucleotide database, a collection of sequences from several sources, including GenBank, RefSeq, and PDB.
NCBI protein GI: Protein version IDs come from the Entrez Protein search and retrieval system, and have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank.
TPA Protein version ID: TPA protein version IDs come from the NCBI Third Party Annotation (TPA) database, which contains sequences that are derived or assembled from sequences in the International Nucleotide Sequence Database Collaboration (INSDC) databases (DDBJ, EMBL, and GenBank). NCBI's TPA database contains nucleotide sequences built from the existing primary data, with new annotation that has been published in a peer-reviewed scientific journal.
RefSeq Protein version ID: RefSeq protein version IDs come from the NCBI Reference Sequences collection, which provides a non-redundant set of DNA, RNA, and protein sequences for major research organisms. RefSeq entries for Saccharomyces cerevisiae are derived from SGD, and so should be identical to the 'reference' sequences stored at SGD. Note that the non-RefSeq sequences on the page may represent other versions of the protein, which can differ from the SGD reference sequence because of allele or strain differences.
UniParc ID: UniParc IDs come from the UniProt ARChive, a comprehensive non-redundant collection of protein sequences from many different publicly accessible sources, including the UniProt Consortium databases Swiss-Prot, TrEMBL, and PIR-PSD, translations from the EMBL/DDBJ/GenBank nucleotide sequence databases, the Protein Data Bank (PDB), NCBI's Reference Sequence Collection (RefSeq), and protein sequences from the European, American, and Japanese Patent Offices.
UniProt/Swiss-Prot ID: UniProt/Swiss-Prot IDs come from the UniProt Knowledgebase, which was created by merging data in the Swiss-Prot, TrEMBL, and PIR-PSD databases. The information in a UniProt/Swiss-Prot record is manually curated from computational analyses and literature.
UniProt/TrEMBL ID: UniProt/TrEMBL IDs come from the UniProt Knowledgebase, which was created by merging data in the Swiss-Prot, TrEMBL, and PIR-PSD databases. The information in a UniProt/TrEMBL record is computationally generated and not manually curated.
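
If you copy IDs from the table into your own notes or scripts, it can help to guess which kind of ID you have from its format. The Python sketch below is one possible way to do that. It is not how SGD builds the page. The function name guess_id_type and the pattern table are hypothetical, and the patterns are assumptions based on commonly seen identifier formats (UniParc IDs beginning with "UPI", RefSeq protein IDs with a two-letter prefix and underscore, all-digit GI numbers, and the UniProt accession format). DNA accession IDs and TPA protein IDs are left out because the sketch cannot tell them apart by format alone, and Swiss-Prot and TrEMBL accessions share one format, so they are reported together.

```python
import re

# Assumed, illustrative patterns; order matters because the first match wins.
ID_PATTERNS = [
    # UniParc: "UPI" followed by 10 hexadecimal characters (assumption).
    ("UniParc ID", re.compile(r"UPI[0-9A-F]{10}")),
    # RefSeq protein: prefix such as NP_, then digits, then ".version" (assumption).
    ("RefSeq Protein version ID", re.compile(r"(?:NP|XP|YP|AP|WP)_\d+\.\d+")),
    # NCBI GI: digits only (assumption).
    ("NCBI protein GI", re.compile(r"\d+")),
    # UniProt accession: same format for Swiss-Prot and TrEMBL entries.
    (
        "UniProt/Swiss-Prot or UniProt/TrEMBL ID",
        re.compile(
            r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
            r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
        ),
    ),
]


def guess_id_type(sequence_id: str) -> str:
    """Return a best-guess ID type label, or "unknown" if no pattern fits."""
    candidate = sequence_id.strip()
    for label, pattern in ID_PATTERNS:
        # fullmatch requires the whole string to fit the pattern.
        if pattern.fullmatch(candidate):
            return label
    return "unknown"


if __name__ == "__main__":
    # Format examples only; they are not taken from any particular locus page.
    for example in ["UPI000012A0B1", "NP_009400.1", "6319249", "P00330", "hello"]:
        print(example, "->", guess_id_type(example))
```

A format match only suggests where an ID may have come from; the source database shown in the results table remains the authoritative indication.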

If you know of other sequences for a particular locus that are not included on the appropriate "All Associated Sequences" page, please let us know by sending an email to