SGD Help: GO Slim Mapper
The Gene Ontology (GO) project was established to provide a common
language to describe aspects of a gene product's biology. A gene
product's biology is represented by three ontologies: molecular
function, biological process and cellular
component. The use of a consistent vocabulary allows genes from
different species to be compared based on their GO annotations. To
provide the most detailed information available, gene products are
annotated to the most granular GO term(s) possible. For example, if a
gene product is localized to the perinuclear space, it will be
annotated to that specific term only and not the parent term
nucleus. In this example the term perinuclear space is
a child of nucleus. Parent-child relationships are easier to
view in the AmiGO Tree View.
However, for many purposes, such as reporting the results of GO
annotation of a genome or analyzing microarray expression
data or a cDNA collection, it is very useful to have a high-level view
of the three ontologies. For example, if you wanted to find all the
genes in an expression cluster that were localized to the nucleus, it
would be useful to be able to map the granular annotations, such as
perinuclear space, to general terms, like nucleus. GO slims were
created for this purpose. A GO slim is a high-level view of GO: a slice of
the broad, high-level terms such as DNA replication,
transcription, and transport. Several versions
of GO slims have been created for different genomes, and the GO slim
terms are updated periodically. To view or download other GO slims, go to
the GO slim FTP site. The GO slim tool at SGD uses the GO slim terms picked by the
SGD curators based on annotation statistics and biological
significance.
The GO Slim Mapper at SGD was created to allow you to map the granular
annotations of the query set of genes to one or more high level,
parent GO Slim terms. This is possible with GO because there are
parent:child relationships recorded between granular terms and more
general parent (i.e., GO slim) terms.
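The idea of following parent:child relationships up to a slim term can be sketched in a few lines of Python. This is only an illustration, not the code SGD runs: the function name `slim_terms_for` and the dictionary layout are assumptions made for this example, and the toy data contains only the perinuclear space and nucleus relationship mentioned earlier on this page.

```python
def slim_terms_for(term, parents, slim_terms):
    """Return every GO slim term reachable from `term` via parent links.

    The term itself counts when it is already a slim term.
    `parents` maps a term name to a list of its direct parent names.
    """
    found = set()
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a term can be reached along more than one path
        seen.add(current)
        if current in slim_terms:
            found.add(current)
        stack.extend(parents.get(current, ()))
    return found


# Toy data (assumption): only the relationship named in the text.
parents = {"perinuclear space": ["nucleus"]}
slim_terms = {"nucleus"}

print(slim_terms_for("perinuclear space", parents, slim_terms))  # {'nucleus'}
```

A term with no path to any slim term returns an empty set, which corresponds to the 'cannot be mapped to a GO Slim term' case described under the results tips.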
For more information on GO in general, visit the Gene Ontology website or the GO help page at SGD.
The query page allows you to enter the list of gene names and select
your GO slim terms.
- Step 1: Enter your gene(s)
You can either type the names of the genes in the input box or upload a
file that contains the gene names. Note that the program requires
more time to process a long list (greater than 100 genes) than a short list.
- Step 2: Choose your GO Slim Set:
GO Slim Sets of three types are available at SGD, seven sets in all.
- Macromolecular complex terms: Component: A set of
granular protein complex terms from the cellular component ontology,
useful for determining whether your protein of interest is a member of
a particular complex. This set is a list of all protein complex terms
and not truly a Slim set.
- Super GO-Slim: Process: A small set of very broad, high-level
GO Biological Process terms, useful for binning groups of genes
in general categories.
- Super GO-Slim: Function: A small set of very broad, high-level
GO Molecular Function terms, useful for binning groups of genes
in general categories.
- Super GO-Slim: Component: A small set of very broad, high-level
GO Cellular Component terms, useful for binning groups of genes
in general categories.
- Yeast GO-Slim: Component: A set of high-level GO terms that
best represent the major biological components that are found in
S. cerevisiae. These terms have
been selected by SGD curators based on annotation statistics and
biological significance.
- Yeast GO-Slim: Function: A set of high-level GO terms that
best represent the major biological functions that are found in
S. cerevisiae. These terms have
been selected by SGD curators based on annotation statistics and
biological significance.
- Yeast GO-Slim: Process: A set of high-level GO terms that
best represent the major biological processes that are found in
S. cerevisiae. These terms have
been selected by SGD curators based on annotation statistics and
biological significance.
This tool is designed to search only one
GO Slim Set at a time in order to minimize the search time.
When you choose a GO Slim Set, the terms from that set
will be listed in the box under Step 3.
- Step 3: Choose your GO Slim terms
Select at least one GO slim term from the list. You can also select
all the terms in the list. For information about a particular GO
Term and its definition, type the GO Term (for example, peptidase
activity, bud tip) in the Quick Search box at the top of the page.
- If you click the Search button after Step 3, the tool
will map annotations made to your input list of genes by compiling
data from both the Manually curated and High-throughput sets.
You can go to optional Step 4 to filter by Annotation Method.
GO Slim Mapper at SGD queries Manually curated and High-throughput
annotations only and does not query annotations obtained using
computational methods.
- Step 4: Select Annotation Method(s)
This is an optional step that allows you to select either the Manually
curated or the High-throughput annotation set rather than both when
using GO Slim Mapper to map annotations to your input set of genes.
The results page displays the number of genes from your input list that
were mapped, the reasons why some genes may not have been mapped and a
table that displays the mapping results.
Results Table
The first column in the table lists the GO slim terms that were chosen
in Step 3. The second column lists the frequency
with which each GO slim term is associated (directly, or
indirectly via a parental relationship with a granular term) with the
genes in your list. The third column lists the genes that were mapped to that
GO term. Each gene name is hyperlinked to its locus page, which shows
all GO annotations associated with that gene.
You can also download the results into a tab-delimited (i.e., Excel-readable)
file by clicking on the Download Results link.
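A tab-delimited file of this kind is straightforward to read in a script. The sketch below assumes, without confirmation from SGD, that the download has one header row followed by the same three columns as the on-screen table, with genes separated by commas; the real file may use different headers or extra columns. The sample rows reuse values from the worked example on this page.

```python
import csv
import io

# Hypothetical file contents (assumed layout: term, frequency, genes).
sample = (
    "GO-Slim term\tCluster frequency\tGenes annotated to the term\n"
    "DNA binding\t3 out of 4 genes, 75%\tCDC53, PHO2, PHO4\n"
    "transcription regulator activity\t2 out of 4 genes, 50%\tPHO2, PHO4\n"
)


def read_results(handle):
    """Map each GO slim term to the list of genes reported for it."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    genes_by_term = {}
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or short lines
        term, genes = row[0], row[2]
        genes_by_term[term] = [g.strip() for g in genes.split(",") if g.strip()]
    return genes_by_term


results = read_results(io.StringIO(sample))
print(results["DNA binding"])  # ['CDC53', 'PHO2', 'PHO4']
```

To read a real download, pass an open file instead of the `io.StringIO` object, for example `open(path, newline="")`, where `path` is wherever you saved the results.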
Note: Occasionally browser-specific problems occur while uploading a file or displaying the results.
Please contact us at yeast-curator@genome.stanford.edu if you notice such problems.
Tips to interpret the results
If some or all of the genes in your input list are not mapped to a GO
Slim term, consider these possible reasons.
- Genes in your input list may have been binned into a category called 'cannot be mapped to a GO Slim
term'. What does this mean?
There are some genes whose annotations cannot be mapped to a GO slim
term because the GO term used for the annotation does not map to an
existing GO slim term. For example, SSB1 is annotated to the cellular component term 'polysome',
but the term 'polysome' does not map up to an existing GO slim
term. Hence it gets binned into this category.
- Genes in your input list were filtered out by the Annotation
Method filter in Step 4. For example, suppose you map only
Manually curated annotations and your input list contains genes
with annotations of both types (Manually curated and High-throughput). Excluding the High-throughput set
in Step 4 removes any gene whose only annotations for the selected terms are High-throughput.
In this case you will see a message like the one below at the top of the results page.
The following gene(s) URA7, URA3 cannot be mapped because they
have no annotations for the specified GO Slim terms in the selected
annotation set Manually curated.
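The effect of the Step 4 filter can be sketched as follows. This is a simplified illustration, not SGD's implementation: the function name `split_by_method`, the tuple layout, and the gene and term names are all hypothetical.

```python
def split_by_method(annotations, input_genes, selected_methods):
    """Separate genes that keep an annotation from genes left with none.

    `annotations` is a list of (gene, go_term, method) tuples.
    Returns (kept, dropped): kept maps gene -> surviving terms,
    dropped lists the input genes with no surviving annotation.
    """
    kept = {}
    for gene, term, method in annotations:
        if gene in input_genes and method in selected_methods:
            kept.setdefault(gene, []).append(term)
    dropped = [g for g in input_genes if g not in kept]
    return kept, dropped


# Hypothetical genes and annotations, for illustration only.
annotations = [
    ("GENE1", "term A", "Manually curated"),
    ("GENE1", "term B", "High-throughput"),
    ("GENE2", "term A", "High-throughput"),
]
input_genes = ["GENE1", "GENE2"]

kept, dropped = split_by_method(annotations, input_genes, {"Manually curated"})
print(kept)     # {'GENE1': ['term A']}
print(dropped)  # ['GENE2']
```

Here GENE2 plays the role of URA7 and URA3 in the message above: it has annotations, but none in the selected set, so it is reported as unmapped.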
Consider a small group of 4 genes: CDC53, PHO2, PHO3 and PHO4.
The following are the granular molecular function annotations for
these genes in SGD.
| Gene Name | Molecular Function Annotation |
| CDC53 | ubiquitin-protein ligase activity |
| | DNA replication origin binding |
| | protein binding, bridging activity |
| PHO2 | transcription factor activity |
| PHO3 | acid phosphatase activity |
| PHO4 | transcription factor activity |
Searching for all the GO slim function terms will map these
annotations to the following:
| GO-Slim term | Cluster frequency | Genes annotated to the term |
| DNA binding | 3 out of 4 genes, 75% | CDC53, PHO2, PHO4 |
| transcription regulator activity | 2 out of 4 genes, 50% | PHO2, PHO4 |
| | 1 out of 4 genes, 25% | PHO3 |
| | 1 out of 4 genes, 25% | CDC53 |
| | 1 out of 4 genes, 25% | CDC53 |
From the two tables above the following conclusions can be drawn:
- The genes CDC53, PHO2 and PHO4 have DNA binding activity, although
their granular annotations are to the terms
DNA replication origin binding and transcription factor activity.
- The genes PHO2 and PHO4 function as transcription regulators,
although their granular annotations are to the term 'transcription
factor activity'.
- The terms DNA replication origin binding and transcription
factor activity are mapped to the parent term DNA binding as
shown below.
DNA binding->
sequence-specific DNA binding->
DNA replication origin binding->
transcription factor activity->
- This example also shows the existence of multiple parents for
the same term. 'transcription factor activity' has two parents, 'DNA
binding' and 'transcription regulator activity', as shown below.
DNA binding->
sequence-specific DNA binding->
DNA replication origin binding->
transcription factor activity->
molecular_function->
transcription regulator activity->
transcription factor activity->
- There are no genes in the input set that have been annotated
to the term 'molecular function unknown'.
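The cluster frequencies for the two slim terms named in these conclusions can be reproduced with a short Python sketch. The gene annotations come from the first table. The parent links are an assumption read off the listings above (in particular, that DNA replication origin binding sits under sequence-specific DNA binding), and the function names are made up for this example. Only 'DNA binding' and 'transcription regulator activity' are used as slim terms, so the other rows of the results table are not reproduced.

```python
def ancestors_and_self(term, parents):
    """Return `term` plus every term reachable through parent links."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def cluster_frequency(annotations, parents, slim_terms):
    """Map each slim term to the sorted genes annotated to it or a descendant."""
    table = {}
    for slim in slim_terms:
        table[slim] = sorted(
            gene
            for gene, terms in annotations.items()
            if any(slim in ancestors_and_self(t, parents) for t in terms)
        )
    return table


# Parent links (assumed nesting; a term may have more than one parent).
parents = {
    "DNA replication origin binding": ["sequence-specific DNA binding"],
    "sequence-specific DNA binding": ["DNA binding"],
    "transcription factor activity": [
        "DNA binding",
        "transcription regulator activity",
    ],
}
# Granular annotations from the first table.
annotations = {
    "CDC53": [
        "ubiquitin-protein ligase activity",
        "DNA replication origin binding",
        "protein binding, bridging activity",
    ],
    "PHO2": ["transcription factor activity"],
    "PHO3": ["acid phosphatase activity"],
    "PHO4": ["transcription factor activity"],
}
slim_terms = ["DNA binding", "transcription regulator activity"]

table = cluster_frequency(annotations, parents, slim_terms)
total = len(annotations)
for slim, genes in table.items():
    pct = 100 * len(genes) / total
    print(f"{slim}: {len(genes)} out of {total} genes, {pct:.0f}% ({', '.join(genes)})")
# DNA binding: 3 out of 4 genes, 75% (CDC53, PHO2, PHO4)
# transcription regulator activity: 2 out of 4 genes, 50% (PHO2, PHO4)
```

Because 'transcription factor activity' has two parents, PHO2 and PHO4 are counted under both slim terms, which is why the percentages in the results table can add up to more than 100%.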
You can see the relationships mentioned above and much more using
Some GO slim terms are children of other GO-slim terms. For example,
"meiosis" is a child of "cell cycle" but both are terms in the
GO-slim process ontology. The GO Slim Term Mapper uses a 'complete
mapping approach' and maps features to all available
GO slim terms, regardless of parentage. This means, for example, that
if a feature is annotated to "meiosis," then the GO Slim Term Mapper will
also report it as annotated to "cell cycle."
Data in the go_slim_mapping.tab file on SGD's
FTP site and the GO Slim Bar graphs on the Genome Snapshot page
are also generated using the 'complete mapping approach'.
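The 'complete mapping approach' can be contrasted with a narrower alternative in a small sketch. Both functions below are illustrative assumptions rather than SGD code, and `most_specific_only` is a hypothetical alternative shown only to make the difference visible; this page describes the complete approach alone.

```python
def ancestors(term, parents):
    """All terms reachable from `term` through parent links (term excluded)."""
    found, stack = set(), list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


def complete_mapping(term, parents, slim_terms):
    """Report every slim term among the term itself and all its ancestors."""
    return ({term} | ancestors(term, parents)) & slim_terms


def most_specific_only(term, parents, slim_terms):
    """Hypothetical alternative: drop slim hits that are ancestors of other hits."""
    hits = complete_mapping(term, parents, slim_terms)
    covered = set()
    for hit in hits:
        covered |= ancestors(hit, parents)
    return hits - covered


# The relationship named in the text: "meiosis" is a child of "cell cycle".
parents = {"meiosis": ["cell cycle"]}
slim_terms = {"meiosis", "cell cycle"}

print(sorted(complete_mapping("meiosis", parents, slim_terms)))    # ['cell cycle', 'meiosis']
print(sorted(most_specific_only("meiosis", parents, slim_terms)))  # ['meiosis']
```

Under complete mapping, a feature annotated to "meiosis" is counted in both bins, so counts for a parent slim term always include the features counted under its child slim terms.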
- GO Term Finder: The GO Term Finder searches for significant shared GO terms, or parents of those GO terms, used to describe the genes in your list to help you discover what the genes may have in common.
- GO Tutorial: interactive tutorial for using GO in SGD.
- GO Help Page: general help page for GO in SGD.
- GO Home Page: home page of the GO consortium.