SGD Help: Fungal Sequence Alignment
Protein: This is the default option from both the synteny viewer and the locus page. It displays predicted translation products of spliced S. cerevisiae ORFs aligned with the predicted translation products of orthologous or highly similar ORFs from other fungi, when available. Ambiguous amino acids are indicated by an "X."

ORF DNA: This option displays the genomic DNA (i.e., introns have not been removed) of S. cerevisiae genes aligned with orthologous or highly similar predicted ORFs from other fungi, when available. Ambiguous nucleotides are indicated by an "N."

Upstream sequence: This option displays an alignment of the genomic DNA (i.e., introns have not been removed) of the 1 kb of sequence directly upstream of the orthologous or highly similar ORFs of interest. If a full 1 kb of sequence is not available (e.g., the ORF is near the end of a contig), all available sequence will be displayed. Ambiguous nucleotides are indicated by an "N."

Downstream sequence: This option displays an alignment of the genomic DNA (i.e., introns have not been removed) of the 1 kb of sequence directly downstream of the orthologous or highly similar ORFs of interest. If a full 1 kb of sequence is not available (e.g., the ORF is near the end of a contig), all available sequence will be displayed. Ambiguous nucleotides are indicated by an "N."

ORF DNA +1kb up/downstream: This option displays an alignment of the genomic DNA (i.e., introns have not been removed) of orthologous or highly similar ORFs of interest, along with 1 kb of sequence both up- and downstream. If a full 1 kb of sequence is not available (e.g., the ORF is near the end of a contig), all available sequence will be displayed. Ambiguous nucleotides are indicated by an "N."
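The truncation rule for the upstream and downstream options ("all available sequence will be displayed") can be illustrated with a short sketch. This is not SGD's code: the function name `extract_regions`, its parameters, the 0-based end-exclusive coordinates, and the restriction to an ORF on the forward strand are all assumptions made for illustration.

```python
def extract_regions(contig: str, orf_start: int, orf_end: int, flank: int = 1000) -> dict:
    """Slice an ORF and its flanking sequence out of a contig.

    Hypothetical helper, not part of SGD. Assumes the ORF lies on the
    forward strand and that coordinates are 0-based and end-exclusive.
    """
    # Clamp the flanks to the contig boundaries, so an ORF close to a
    # contig end yields less than the full 1 kb of flanking sequence.
    up_start = max(0, orf_start - flank)
    down_end = min(len(contig), orf_end + flank)
    return {
        "upstream": contig[up_start:orf_start],
        "orf": contig[orf_start:orf_end],          # genomic DNA, introns kept
        "downstream": contig[orf_end:down_end],
        "orf_plus_flanks": contig[up_start:down_end],
    }
```

For an ORF that starts 300 bases from the beginning of its contig, the `"upstream"` entry would hold only those 300 bases, while `"downstream"` would still hold the full 1,000 if the contig extends that far.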
| Color and Similarity | identical | strong similarity | weak similarity |
|---|---|---|---|
| Symbol | * | : | . |
| Conserved Amino Acid Groups | exact matches only | The conserved position contains amino acids from one of the "strong" groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW | The conserved position contains amino acids from one of the "weak" groups: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, FVLIM, HFY |
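As a rough illustration of how the symbols relate to the groups, the sketch below assigns a symbol to each column of a protein alignment. The group lists are the ones given on this page; the function names `column_symbol` and `conservation_line` are hypothetical, and the handling of gaps and of ambiguous "X" residues (no symbol) is an assumption, not a documented rule of the SGD display.

```python
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column: str) -> str:
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column.upper())
    # Assumption: columns with a gap or an ambiguous residue get no symbol.
    if "-" in residues or "X" in residues:
        return " "
    if len(residues) == 1:
        return "*"  # identical in every sequence
    # A column is conserved if all of its residues fit inside a single group.
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."
    return " "


def conservation_line(aligned_seqs: list[str]) -> str:
    """Build the symbol line for equal-length aligned sequences."""
    return "".join(column_symbol("".join(col)) for col in zip(*aligned_seqs))
```

Strong groups are checked before weak ones, so a column such as S/T/A, which fits both STA and STPA, is reported with the stronger symbol.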
| Abbreviation | Fungal Species | Group | Source of Sequence | Published Reference |
|---|---|---|---|---|
| SGD_Scer | Saccharomyces cerevisiae | sensu stricto | Current reference sequence stored in SGD | |
| MIT_Spar | Saccharomyces paradoxus | sensu stricto | Manolis Kellis, Eric Lander, Bruce Birren and coworkers at the Whitehead Institute at MIT | Kellis et al |
| MIT_Smik | Saccharomyces mikatae | sensu stricto | Manolis Kellis, Eric Lander, Bruce Birren and coworkers at the Whitehead Institute at MIT | Kellis et al |
| WashU_Smik | Saccharomyces mikatae | sensu stricto | Paul Cliften, Mark Johnston and coworkers at Washington University | Cliften et al |
| MIT_Sbay | Saccharomyces bayanus | sensu stricto | Manolis Kellis, Eric Lander, Bruce Birren and coworkers at the Whitehead Institute at MIT | Kellis et al |
| WashU_Sbay | Saccharomyces bayanus | sensu stricto | Paul Cliften, Mark Johnston and coworkers at Washington University and Manolis Kellis, Eric Lander, Bruce Birren and coworkers at the Whitehead Institute at MIT (note: this WashU assembly combined published sequencing data from both the WashU and MIT sequencing efforts) | Cliften et al; Kellis et al |
| WashU_Skud | Saccharomyces kudriavzevii | sensu stricto | Paul Cliften, Mark Johnston and coworkers at Washington University | Cliften et al |
| WashU_Scas | Saccharomyces castellii | sensu lato | Paul Cliften, Mark Johnston and coworkers at Washington University | Cliften et al |
| WashU_Sklu | Saccharomyces kluyveri | sensu lato | Paul Cliften, Mark Johnston and coworkers at Washington University | Cliften et al |
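The Saccharomyces genome sequences available here for alignments fall into two groups: the sensu stricto species (S. cerevisiae, S. paradoxus, S. mikatae, S. bayanus, and S. kudriavzevii) and the sensu lato species (S. castellii and S. kluyveri).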
If you wish to identify conserved non-protein-coding sequences, we recommend aligning the S. cerevisiae query sequence with only the sensu stricto species sequences. The sensu lato sequences have diverged so far from the S. cerevisiae sequence that very few of their non-protein-coding sequences will align to their S. cerevisiae orthologs.
If you wish to identify conserved sequences in proteins, we recommend
starting by aligning the S. cerevisiae sequence to just the sensu lato
species' sequences (S. castellii and S. kluyveri). In most cases this
will best reveal the conserved sequences. Alignment to the sensu stricto
species' sequences is not expected to provide much additional information
in most cases.