SGD Help: Gene Ontology (GO)




Description

The Gene Ontology (GO) project was established to provide a common language to describe aspects of a gene product's biology. The use of a consistent vocabulary allows genes from different species to be compared based on their GO annotations. The project started as a collaboration between three model organism databases: the Saccharomyces Genome Database (SGD), FlyBase (for Drosophila), and Mouse Genome Informatics (MGI). The GO Consortium has since expanded considerably to include many additional model organism databases and annotation groups. Depending on the nature of its affiliation, each group contributes to the development of the ontologies, the generation of GO annotation files, or the development of software tools that use GO.

Within SGD, GO annotations are used to describe what gene products do and where they are located. GO annotations therefore appear directly on the Locus Summary pages for both protein-coding and non-coding RNA genes. More detail about the GO annotations or the GO terms is available on additional pages. GO tools such as the GO Term Finder and the GO Slim Mapper let users analyze sets of genes with the GO annotations and identify common functions, processes, or locations.

What is GO?

The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process, and cellular component of gene products. The name and definition for each GO term and the parent-child relationships between terms are defined by the members of the GO Consortium. This combination of a controlled vocabulary of defined terms with a structure of relationships between items is referred to as an ontology.

See the GO Consortium's An Introduction to the Gene Ontology for a basic introduction to the Gene Ontologies, including descriptions of what comprises a GO term, the structure of the ontologies, and the types of relationships between terms.

This diagram shows a small portion of the biological process ontology. Terms at the top represent broader, more general concepts, while terms lower down represent more specific concepts. A term that has terms below it is referred to as a parent term, and the terms below it are referred to as child terms. Each term is therefore a parent with respect to the terms below it and a child with respect to the terms above it. There are two different relationship types between terms. SGD does not display the relationship types, but they are shown in the AmiGO browser.

Note that the Gene Ontologies themselves contain only information about terms in the ontology and their relationships to other terms. They do not contain gene products of any specific organism.
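The parent and child terminology above can be illustrated with a small sketch. The term names and the edge list are hypothetical placeholders, not real GO terms, and the layout is only one possible way to hold such relationships. Like the ontologies themselves, the structure stores terms and their relationships only, with no gene products.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical (child, parent) pairs; these are NOT real GO terms.
EDGES = [
    ("broad_term", "root_term"),
    ("specific_term", "broad_term"),
    ("other_specific_term", "broad_term"),
]

def build_maps(
    edges: Iterable[Tuple[str, str]]
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Build lookup tables of direct parents and direct children per term."""
    parents: Dict[str, Set[str]] = defaultdict(set)
    children: Dict[str, Set[str]] = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
        children[parent].add(child)
    return parents, children

parents, children = build_maps(EDGES)

# "broad_term" is a child of "root_term" and, at the same time,
# a parent of the two more specific terms.
print(sorted(parents["broad_term"]))
print(sorted(children["broad_term"]))
```

Storing the edges once and deriving both lookup tables keeps the two views consistent with each other.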

What is a GO Annotation?

Basic parts of a GO Annotation

To provide specific information about gene products, a GO term (e.g. cytokinesis) is associated with a gene or gene product (e.g. ACT1 or Act1p) to form a GO annotation. In addition to this association, a GO annotation must also carry a specific reference, an evidence code, and the date on which the annotation was made. Thus a basic GO annotation includes five pieces of information: the gene or gene product, the GO term, the reference, the evidence code, and the date.

These basic, essential parts of a GO annotation are all displayed on SGD's GO Evidence and References pages; see, for example, this one for the ACT1 GO Annotations.
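As a rough illustration, the essential parts of an annotation could be held in a small record type. The class name, field names, and the placeholder values for the reference, evidence code, and date are assumptions made for this sketch; they do not reflect SGD's actual data model or any real ACT1 annotation.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class GOAnnotation:
    """Hypothetical record holding the essential parts of a GO annotation."""
    gene_product: str    # gene or gene product, e.g. "ACT1" or "Act1p"
    go_term: str         # the associated GO term
    reference: str       # the reference supporting the annotation
    evidence_code: str   # the evidence code
    date_assigned: date  # the date on which the annotation was made

# The gene and term come from the example in the text; the other
# values are placeholders, not real data.
example = GOAnnotation(
    gene_product="ACT1",
    go_term="cytokinesis",
    reference="PLACEHOLDER_REFERENCE",
    evidence_code="PLACEHOLDER_EVIDENCE_CODE",
    date_assigned=date(2000, 1, 1),
)
print(example)
```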

This diagram shows a portion of the GO Biological Process ontology along with the GO Biological Process annotations of the genes BUD3, BUD4, and AXL2. As demonstrated by the annotations for BUD3, genes can be annotated with multiple GO terms which are at various levels within the ontology, depending on the experiments and type of evidence available to annotate each gene.


For a more complete guide to how GO terms are used to annotate gene products, please see the GO Consortium's GO Annotation Guide and Guide to GO Evidence Codes.

Optional additional parts of a GO Annotation

In addition to the basic, essential components of a GO annotation, some optional pieces of information may be associated with the annotation when appropriate. These include a qualifier and the "with" field.

Using GO Annotations

The use of GO terms to annotate gene products in many databases facilitates uniform queries across multiple species.

Using both the Gene Ontologies and GO annotations, tools can be built that display gene products alongside the GO terms to which they are annotated, or that find gene products involved in similar biological processes, molecular functions, or cellular components. The GO Consortium develops and maintains the AmiGO browser as a way to query either GO terms or the gene products annotated to them.

Annotation Methods

To differentiate annotations made from published small scale experiments, genome-wide or high-throughput experiments and computational predictions, we have separated GO annotations at SGD into three sets:

  1. Manually curated GO annotations

    Manually curated GO annotations reflect our best understanding of the basic molecular function, biological process, and cellular component for a gene product. They are assigned by SGD curators, who read the literature for each gene and make annotations from published papers when available. When published literature is available, such annotations may be based on experiments, sequence similarity, or other computational analyses described in the paper, or on statements made by the authors. When no published literature is available for a gene, annotations may be made on the basis of curatorial judgement. Curators periodically review all Manually curated GO annotations for accuracy and completeness and update them as necessary, adding new annotations to reflect advances in knowledge and removing any annotations that are no longer supported by the literature. The Last Reviewed on: date on the GO evidence and references page for a gene indicates when an SGD curator last reviewed all of the Manually curated GO annotations for that gene. In addition, SGD reviews and incorporates manual GO annotations for S. cerevisiae proteins from the GO Annotation (GOA) project at UniProt. These annotations can be identified at SGD by the source, e.g., 'Uniprot', 'MGI', 'HGNC' (GO Consortium members), displayed in the 'Assigned By' column of the GO evidence and references page.

  2. High-throughput GO Annotations

    GO annotations from high-throughput experiments are assigned based on a variety of large-scale experiments, including genome-wide experiments. Many of these annotations are based on GO annotations (or mappings to GO annotations) assigned by the authors rather than by SGD curators. While SGD curators read these publications and often work closely with authors to incorporate the information, not every annotation is individually reviewed by a curator. GO annotations from high-throughput experiments are assigned only when this type of data is available, and thus may not exist for all three aspects of the Gene Ontologies.

  3. Computational GO Annotations

    Computational GO annotations are made by a variety of computational methods, such as sequence similarity methods, including protein domain motifs, and keyword mapping files. When annotations based on computational methods are NOT reviewed by a curator, they are placed in the Computational GO annotations section. Note that the criterion for including a GO annotation in this section is whether or not it was reviewed by a curator; when annotations made by a computational method, such as sequence analysis, are reviewed by a curator, they may be found in the Manually curated section. Currently (as of 09/2007), all computational GO annotations for S. cerevisiae are assigned by an external source (for example, the Gene Ontology Annotation (GOA) project of the European Bioinformatics Institute (EBI)).
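The way annotations fall into the three sets can be summarized in a short sketch. The method labels ("literature", "high_throughput", "computational"), the function name, and the boolean flag are hypothetical names chosen for illustration. The rule for computational annotations follows the description above, where curator review is the deciding criterion; the actual SGD pipeline may apply further considerations.

```python
from enum import Enum

class AnnotationSet(Enum):
    MANUALLY_CURATED = "Manually curated"
    HIGH_THROUGHPUT = "High-throughput"
    COMPUTATIONAL = "Computational"

def assign_set(method: str, curator_reviewed: bool) -> AnnotationSet:
    """Pick a display set for an annotation (simplified, hypothetical rule)."""
    if method == "computational":
        # Curator review, not the method itself, decides the section.
        if curator_reviewed:
            return AnnotationSet.MANUALLY_CURATED
        return AnnotationSet.COMPUTATIONAL
    if method == "high_throughput":
        return AnnotationSet.HIGH_THROUGHPUT
    if method == "literature":
        return AnnotationSet.MANUALLY_CURATED
    raise ValueError(f"unknown method: {method!r}")

print(assign_set("computational", curator_reviewed=False).value)
print(assign_set("computational", curator_reviewed=True).value)
```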

In SGD, curators read the research literature and associate specific GO terms with the appropriate gene products to provide information about the state of knowledge of the yeast genome. We are constantly updating our GO annotations and always welcome suggestions for improvement or corrections when the understanding about a gene has changed since the last time we reviewed the literature for a given gene.

Accessing GO Annotations in SGD

Searching for GO terms within SGD

Users can search for GO terms in any of the three Gene Ontologies that match an alphanumerical query, e.g. "snRNP U1", using the Search box located in the tool bar found at the top of SGD pages or using the SGD Text Search option from the SGD Full Search page. The search result is a list of matches for the query term. Clicking on the Gene Ontology terms (GO terms, synonyms) link from the results page will provide a list of GO terms containing the query string.

Users can search for GO terms whose GOIDs (minus the "GO:" prefix and leading zeroes) match a purely numerical query, e.g. "5685", using the Search box located in the tool bar found at the top of SGD pages. The search result will usually be the GO term whose GOID matches the query. Occasionally, the search result will be a list of matches for the query term, where clicking on the Gene Ontology ID link will take you to the associated GO Term page.
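The relationship between a numerical query and a full GOID can be sketched as follows. GO identifiers are written as "GO:" followed by seven zero-padded digits; the function names are hypothetical and this is not SGD's search code.

```python
def goid_from_query(query: str) -> str:
    """Turn a purely numerical query such as "5685" into a full GOID."""
    digits = query.strip()
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 7:
        raise ValueError(f"not a numerical GOID query: {query!r}")
    # Restore the "GO:" prefix and the leading zeroes.
    return f"GO:{int(digits):07d}"

def query_from_goid(goid: str) -> str:
    """Strip the "GO:" prefix and leading zeroes from a full GOID."""
    prefix, _, digits = goid.partition(":")
    if prefix != "GO" or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a GOID: {goid!r}")
    return str(int(digits))

print(goid_from_query("5685"))
print(query_from_goid("GO:0005685"))
```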

Accessing GO annotations for specific genes or GO terms

At SGD, you can find GO annotations displayed at various levels of detail in three locations as described below.

GO Slim Mapping Tool at SGD

This tool identifies the major branches of the ontologies common to a list of genes or ORFs, based on their GO annotations. The GO terms that represent the major branches of the ontology are higher level terms, also known as the GO-slim terms. This is possible with GO because there are parent-child relationships recorded between the granular terms and the high level GO slim terms. For more information on this tool click here.
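The idea behind slim mapping can be sketched by walking the parent links from each annotated term up to any GO-slim term. The term names, gene names, and function names below are hypothetical, and the sketch is a simplified illustration rather than the tool's actual implementation.

```python
from typing import Dict, Set

def ancestors_or_self(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the term itself plus every term reachable via parent links."""
    found: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found

def map_to_slim(
    gene_annotations: Dict[str, Set[str]],
    parents: Dict[str, Set[str]],
    slim_terms: Set[str],
) -> Dict[str, Set[str]]:
    """Group genes under each slim term that lies at or above their annotations."""
    result: Dict[str, Set[str]] = {slim: set() for slim in slim_terms}
    for gene, terms in gene_annotations.items():
        for term in terms:
            for slim in ancestors_or_self(term, parents) & slim_terms:
                result[slim].add(gene)
    return result

# Toy data (not real GO terms or annotations).
toy_parents = {
    "slim_A": set(),
    "slim_B": set(),
    "mid": {"slim_A"},
    "leaf_1": {"mid"},
    "leaf_2": {"slim_B"},
}
toy_annotations = {
    "GENE1": {"leaf_1"},
    "GENE2": {"leaf_2"},
    "GENE3": {"leaf_1", "leaf_2"},
}
slim_map = map_to_slim(toy_annotations, toy_parents, {"slim_A", "slim_B"})
print({slim: sorted(genes) for slim, genes in slim_map.items()})
```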

GO Term Finder Tool at SGD

This tool searches for significant shared GO terms or parents of GO terms used to describe your set of genes or ORFs. This tool helps you understand what is common among the genes/ORFs you are studying. Results from this search are displayed in a graphic and table form. The graphic view shows the parent-child relationships (DAG view) of the GO terms that are used to annotate the genes/ORFs. For more information on this tool, please click here.
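This page does not state how significance is calculated. One common way to ask whether a GO term is shared by more genes in a list than expected by chance is a hypergeometric test, sketched below as an assumption; the function and parameter names are hypothetical, and no multiple-testing correction is included.

```python
from math import comb

def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k annotated genes by chance.

    N: genes in the background set
    K: background genes annotated to the term
    n: genes in the query list
    k: query genes annotated to the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k+1, ..., upper hits.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers: 5 of 10 background genes carry the term,
# and all 5 query genes carry it as well.
p = hypergeom_pvalue(k=5, n=5, K=5, N=10)
print(p)
```

A small value suggests the term is over-represented in the list; in practice a correction for testing many terms at once would also be needed.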

Microarray Data and GO at SGD

Expression Connection is a tool that retrieves mRNA expression data for the locus of interest along with that for other loci that have similar expression levels. Search results also display the Process, Function and Component GO terms for the locus of interest and all the loci that have similar expression patterns. This helps identify relationships between a large number of genes (cluster) with similar expression profiles with respect to their Process, Function and Component GO terms.

Accessing the AmiGO browser

In SGD, the AmiGO browser is accessible from all GO Term pages, where the AmiGO icon link takes you directly to the corresponding AmiGO page for that GO term. AmiGO allows you to find genes from other organisms, as well as those from S. cerevisiae, that have been annotated to a specific GO term. On the AmiGO page for a GO term, you can view a clickable tree (DAG) view of the GO term and a list of all genes that have been annotated either directly to the term or to any of its child terms.
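Listing genes annotated "either directly or to any of its child terms" amounts to collecting annotations over a term and everything below it. The sketch below illustrates that idea with hypothetical term and gene names; it is not AmiGO's implementation.

```python
from typing import Dict, Set

def genes_for_term(
    term: str,
    children: Dict[str, Set[str]],
    direct: Dict[str, Set[str]],
) -> Set[str]:
    """Collect genes annotated to `term` or to any term below it."""
    genes: Set[str] = set()
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        genes |= direct.get(current, set())
        stack.extend(children.get(current, ()))
    return genes

# Toy data (not real GO terms or annotations).
toy_children = {
    "parent_term": {"child_term"},
    "child_term": {"grandchild_term"},
}
toy_direct = {
    "parent_term": {"GENE1"},
    "grandchild_term": {"GENE2"},
    "unrelated_term": {"GENE3"},
}
print(sorted(genes_for_term("parent_term", toy_children, toy_direct)))
```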
