SGD Help: Getting Started with the Saccharomyces Genome Database

Basic information to help you begin using SGD. More specific help documents about the different tools and information in SGD can be accessed from the Help Resources page.


  1. Types of Information Contained in SGD
  2. Organization of SGD
  3. Using SGD to Find Information About a Gene or Group of Genes
  4. Services Provided by SGD
  5. Getting Help as You Use SGD

Types of Information Contained in SGD

SGD is an organized collection of genetic and molecular biological information about Saccharomyces cerevisiae, bakers' and brewers' yeast. It contains the sequences of yeast genes and proteins; descriptions and classifications of their biological roles, molecular functions, and subcellular localizations; information about their mutant phenotypes, regulation, and protein-protein interactions; links to published papers; tools for analysis of datasets; and tools for analysis and comparison of sequences. Overviews of the genome and its annotation are available through the Genome Snapshot page and the genome browser JBrowse. Data can be downloaded in multiple ways, as described on the Download page. The information in SGD is continually updated by SGD curators. The SGD Home page is the main entry point for the database.

SGD curators have created a suite of short video tutorials that describe specific features and tools of SGD. In addition, OpenHelix has developed an online tutorial that describes navigation of SGD and many features of the database. We encourage you to contact SGD curators through our web form or by email if you need help using the database.

SGD is designed for use by scientists; collected information about yeast for the non-scientist can be found on the General Topics page of the SGD wiki. SGD does not collect medical information, and SGD curators cannot answer health-related questions. There are no restrictions on academic use of the data in SGD, but they may not be repackaged or redistributed for profit-making enterprises. Contact SGD if you have questions.

Organization of SGD

An overview of the content and organization of SGD is presented in the Site Map. Many SGD pages, including tools, resources, data submission forms, and others, are accessible from the toolbar at the top of each page, including the SGD Home page. Current information about changes to SGD, such as new features or addition of new data, is displayed on the SGD Home page; this information is also archived.

The basic unit of SGD is the Locus Page (for an example, see ACT1). Each yeast gene or open reading frame has an individual Locus Page. Chromosomal features that do not encode proteins (such as centromeres, telomeres, tRNA genes) and genetic loci that are not mapped to a DNA sequence also have Locus Pages. All information relevant to a particular locus is either presented on or linked to the Locus Page.

Locus Pages can be accessed by entering a gene name (standard name, alias name, or systematic name) in the Search box, which is found at the top of the SGD Home page and most SGD pages. Sometimes you may need to investigate genes or proteins without knowing their names. You can search for a class of similarly named genes using the wildcard character (e.g., searching for 'pet*' brings up PET54, PET123, PET494...). You can also search with the name of a protein (e.g., profilin), a protein complex (e.g., cytochrome c oxidase), or a Gene Ontology term (e.g., aerobic respiration) to bring up a list of the Locus Pages where this text occurs. Each gene name in the list resulting from the search is hyperlinked to its corresponding Locus Page.
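The wildcard behavior described above can be illustrated with a minimal Python sketch. This is not SGD's search implementation; it only shows how a trailing '*' matches any run of characters, ignoring case. The names GENE_NAMES and wildcard_search are hypothetical, and the gene list is a hand-picked set of names mentioned on this page, not an export from the database.

```python
from fnmatch import fnmatchcase

# Hypothetical, hand-picked list of gene names (not an export from SGD).
GENE_NAMES = ["ACT1", "ADE2", "PET54", "PET123", "PET494"]


def wildcard_search(pattern, names):
    """Return the names that match a shell-style wildcard pattern, ignoring case."""
    pattern = pattern.upper()
    return [name for name in names if fnmatchcase(name.upper(), pattern)]


print(wildcard_search("pet*", GENE_NAMES))  # ['PET54', 'PET123', 'PET494']
```

A pattern without a wildcard, such as 'act1', matches only the exact name in this sketch.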

All of the information on the Locus Page is curated from published papers listed in the Literature Guide for that gene. The Literature Guide also shows other genes discussed in those papers.  Clicking on the Literature Topics links on the left of the page takes you to papers that address those topics for that gene.  (For example, see the list of papers in the topic "Non-Fungal Related Genes/Proteins" for ADE2.) Note: If you have published a paper about a particular gene and do not see your paper in the Literature Guide, please contact SGD curators.

Using SGD to Find Information About a Gene or Group of Genes

Interpreting the results of large-scale genomic or proteomic experiments that identify groups of genes with a common property (e.g., transcriptional coregulation, similar null phenotypes, etc.) presents a special challenge in integrating what is known about each gene to find the significance of the trends observed. Several tools in SGD can help with the analysis of gene lists. These tools can be accessed from the toolbar located at the top of the SGD home page and most other pages.

  • YeastMine - YeastMine is a powerful search tool that allows you to query SGD for a variety of data types for a single gene or a list of genes. YeastMine allows retrieval and analysis of chromosomal features, sequences, protein features, GO annotations, homologs, phenotypes, interaction data, expression data, pathway information, and curated literature; other data types will be added as they become available. These data can be accessed using pre-defined templates, or you can design custom queries using the QueryBuilder tool. The resulting tables can be rearranged to suit your work, and all data can be downloaded. The SGD tutorial page contains a variety of short tutorials to help you learn to use YeastMine.
  • Gene Ontology (GO) tools - The GO Slim Mapper and GO Term Finder can help you identify genes that share a function, role, or location. The GO Slim Mapper takes a list of genes and displays how many are annotated to each of the parent GO Slim terms, allowing visualization of the distribution of the input gene set over broad biological processes, biochemical functions, or subcellular localizations. The GO Term Finder tool also takes a list of genes as input, and identifies GO terms shared among members of the group. The difference between the two tools is that the GO Slim Mapper maps genes to broad parent GO terms, while the GO Term Finder identifies specific, granular terms shared by the group.
  • Sequence analysis tools - SGD contains several tools for sequence retrieval and analysis. BLAST searches S. cerevisiae sequences, and Fungal BLAST searches sequences from multiple fungal genomes. For query sequences of fewer than 20 residues, the Pattern Matching tool PatMatch works best. Restriction Mapper allows you to generate a restriction map of a given DNA sequence. Finally, you can design primers for use in sequencing and PCR with the Web Primer tool.
  • SPELL expression analysis tool - The SPELL (Serial Pattern of Expression Levels Locator) search tool can be used to obtain expression data for one or multiple genes. SPELL contains microarray data from a large number of expression studies. The tool identifies the datasets that are most informative for the query gene or genes, and then within those datasets it identifies additional genes with expression profiles that are most similar to the query.
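
The difference between the GO Slim Mapper and the GO Term Finder described in the list above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the tools' actual method: all gene names, term names, and the parent mapping below are invented, and the sketch leaves out any statistical evaluation that a real term-finding tool may perform. The function names slim_counts and shared_terms are hypothetical.

```python
from collections import Counter

# Toy data: gene and term names are invented for illustration only.
ANNOTATIONS = {
    "geneA": {"specific term 1"},
    "geneB": {"specific term 1", "specific term 2"},
    "geneC": {"specific term 1", "specific term 3"},
}

# Hypothetical mapping from each specific term to its broad parent ("slim") term.
SLIM_PARENT = {
    "specific term 1": "broad term X",
    "specific term 2": "broad term X",
    "specific term 3": "broad term Y",
}


def slim_counts(genes):
    """Slim-Mapper-style view: count the genes that map to each broad parent term."""
    counts = Counter()
    for gene in genes:
        # Use a set so a gene is counted once per broad term.
        counts.update({SLIM_PARENT[term] for term in ANNOTATIONS[gene]})
    return dict(counts)


def shared_terms(genes):
    """Term-Finder-style view: specific terms annotated to every gene in the list."""
    if not genes:
        return set()
    return set.intersection(*(ANNOTATIONS[gene] for gene in genes))


genes = ["geneA", "geneB", "geneC"]
print(slim_counts(genes))   # {'broad term X': 3, 'broad term Y': 1}
print(shared_terms(genes))  # {'specific term 1'}
```

The first function summarizes how the gene list is distributed over broad categories, while the second reports the granular terms the genes have in common.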

Services Provided by SGD

  • Proposing a New Gene Name - By consensus of the research community, SGD serves as the official arbiter of S. cerevisiae genetic nomenclature and maintains a gene name registry for new proposed gene names. Researchers who want to reserve a new S. cerevisiae gene name do so through SGD. The name is shown on the appropriate SGD Locus Page as 'reserved', and after publication the name becomes the standard gene name. SGD maintains a detailed list of guidelines for choosing and reserving new gene names, and for the resolution of conflicts over gene names. A web form is available for submitting gene name reservations. Occasionally, existing gene names are changed to more accurately reflect the function or role of the gene product. Such changes are only made if there is consensus among all researchers who have studied the gene. SGD curators can coordinate the process of proposing and discussing gene name changes.
  • Genomic Sequence of S288C - SGD also maintains the most up-to-date version of the complete genomic sequence of S. cerevisiae strain S288C. The most recent update of the S288C reference sequence occurred on February 2, 2011. This version, referred to as "S288C 2010", was provided by Fred Dietrich of Duke University, and was determined using new high-fidelity sequencing from an individual yeast colony. Contact SGD curators if you have questions about the reference sequence.
  • Data Download - All of SGD's data are freely available to the public for download. See the Download page for more information.
  • Community Information - Several different kinds of information for the yeast community are available through SGD. You can add your contact information to SGD's directory of yeast colleagues, which can be searched by last name. You can also search or browse a list of several hundred yeast labs. SGD maintains a list of upcoming conferences and past yeast meetings, some of which are also listed on the SGD Home page.

Getting Help as You Use SGD

All SGD help resources are listed on the Help Resources page. The 'Help' icon (a question mark in a red circle) on each tool and Locus page is linked directly to help documentation for that particular page. In addition, check the tutorials page to see if your question is covered in one of our short video tutorials.

The Glossary page lists definitions of genetic, bioinformatic, and other terms used in SGD.

SGD curators may be contacted by using the web form that can be accessed from many SGD pages, or by direct email. We welcome comments and questions from the yeast community! Remember to check the home page for announcements of new features and enhancements in SGD.