SGD

SGD Help: Expression Connection



Description

Expression Connection presents gene expression data in a way that makes public microarray experiments easier to access. Researchers using Expression Connection at SGD can retrieve mRNA expression data for a single gene or a list of genes in a variety of ways.

Search Page

The top of the page displays a menu of the searches possible with Expression Connection. With the exception of Webminer, the searches directly query the data stored in the SGD database. Webminer was developed by Max Heiman at UCSF and queries a separate set of data files associated with this software. Therefore, the datasets available through Webminer and the datasets available through the other Expression Connection searches may be different.

Red squares in the search menu link to the search forms below. Red squares next to each search form link back to the search menu at the top of the page.
The green Stats icon links to a summary table showing how many genes in the dataset were increased, decreased, etc. in each experiment in the dataset. See also Stats.
The red Reg icon links to an index page containing links to predicted regulatory modules for certain datasets. The predictions were made by Segal et al., and the pages have been kindly provided to SGD by Eran Segal.

Results Pages

Results pages have two sections. The first section, 'Your Search Parameters', reiterates the query you submitted, including the type of search, any genes you entered, the parameters you chose, and links to the summary table for each dataset or gene. The second section, 'Results', contains the scale for assessing expression images, an option to change the layout of the results via a sort, and the summary table for each dataset or gene.

The results of any query are displayed in table form. Each row in the table contains links to all of the information available for a given gene. Each column provides a link to a specific kind of data or information. The column headings describe the kind of information that the link (a colored button or an icon) leads to. The purple sidebar on the left side of each table contains links to information pertinent to the gene or the entire list of genes in the table.

ORF, Gene Name

The ORF is the systematic name (or ORF name) of the gene and reflects the chromosomal position of the gene. The Gene Name is the assigned standard name (or standard locus name).

Change in dataset

Histogram (vs. Other Datasets)

The histogram helps users see the observed range of gene expression changes in every experiment in every dataset in the database. The data are in log2 space, meaning that positive numbers represent increases in expression and negative numbers represent decreases in expression. For instance, a log2 ratio of 1.0 indicates a 2-fold increase in the transcript level and -1.0 indicates a 2-fold decrease. A log2 ratio of zero indicates no change. The log2 ratios are rounded to the nearest 0.1 for this display. To see the actual experiments and datasets where these changes occur, click on individual bars in the bar graph.
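
The relationship between log2 ratios and fold changes, and the rounding to the nearest 0.1, can be sketched in a few lines of Python. This is only an illustration; the function names are hypothetical and are not part of Expression Connection.

```python
from collections import Counter

def describe(log2_ratio):
    """Describe a log2 ratio as a fold change in plain words."""
    if log2_ratio == 0:
        return "no change"
    direction = "increase" if log2_ratio > 0 else "decrease"
    # 2 ** |log2 ratio| is the fold change; ':g' drops trailing zeros.
    return f"{2 ** abs(log2_ratio):g}-fold {direction}"

def histogram_bins(log2_ratios):
    """Count values after rounding each one to the nearest 0.1."""
    return Counter(round(r, 1) for r in log2_ratios)

print(describe(1.0))    # 2-fold increase
print(describe(-1.0))   # 2-fold decrease
print(describe(0))      # no change

bins = histogram_bins([0.96, 1.04, -1.02, 0.04])
print(bins[1.0])        # 2 (both 0.96 and 1.04 round to 1.0)
```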

Thumbnail Display

Thumbnail Display is a miniature version of the 3-color graphical expression display. To see a large version with labels, click on the "Enlarge" link in the column header.

The page with the 'enlarged' graphical expression displays also shows the GO annotations associated with each gene. When more than one annotation exists, an asterisk is shown. You can see all of the GO terms for a gene by going to the locus page via the hyper-linked ORF name.

If you would like to analyze the list of genes further (Do they share a common function, act in a common process, or share a cellular component?), follow the links to the GO Term Mapper and GO Term Finder. See the "Help" button at the top of each page for more info, or the GO Resources page.

Graphical Expression Display

Expression data is often displayed with a 3-color graphic.
Each square of color represents the expression level of a gene in a single experiment in the dataset. A miniature version is given for each gene in the thumbnail display column on the results pages. A larger version can be found by clicking the "Enlarge" link.

The convention followed by the vast majority of microarray experiments is this:

Red means the mRNA level (expression) is increased, relative to the standard.
Black means the mRNA level (expression) is unchanged, relative to the standard.
Green means the mRNA level (expression) is decreased, relative to the standard.

The color scale shown on the results pages serves as a reference for judging relative expression levels within the 3-color representation of the data (graphical expression display).
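
One way to picture the red/black/green convention is as a mapping from a log2 ratio to an RGB color. The sketch below is illustrative only: the function name and the saturation limit of 3.0 are assumptions, not values used by SGD.

```python
def ratio_to_rgb(log2_ratio, saturation=3.0):
    """Map a log2 ratio to an (R, G, B) tuple.

    Red = increased, black = unchanged, green = decreased.
    'saturation' is an assumed log2 ratio at which the color is brightest.
    """
    # Clamp so that extreme values all get the brightest color.
    clamped = max(-saturation, min(saturation, log2_ratio))
    intensity = round(255 * abs(clamped) / saturation)
    if clamped > 0:
        return (intensity, 0, 0)   # shades of red
    if clamped < 0:
        return (0, intensity, 0)   # shades of green
    return (0, 0, 0)               # black

print(ratio_to_rgb(0))     # (0, 0, 0)
print(ratio_to_rgb(3.0))   # (255, 0, 0)
print(ratio_to_rgb(-5.0))  # (0, 255, 0)
```

Because values beyond the saturation limit all receive the same color, a mapping like this cannot distinguish between very large changes.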

Note that there are exceptions to this general rule. For example, in the dataset entitled "Gene regulation by HTZ1, SWR and SIR2", the authors performed each experiment in quadruplicate and reversed the fluors because of a dye quenching artifact observed with abundant mRNAs. As a result, the displayed colors alternate between red and green from replicate to replicate, but the trends are still the same, so similarly expressed genes can still be identified.

Similarly Expressed Genes

This link goes to a page much like the results display in the original Expression Connection. Up to 20 genes expressed in a pattern similar to that of the gene of interest are listed, along with the 3-color graphical expression display and the GO terms associated with each gene. Two genes are considered to be similarly expressed if their Pearson correlation is above 0.8. The value 0.8 is an arbitrary cut-off, but one that is considered rather stringent.
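
The selection described above (correlation above 0.8, at most 20 genes) can be sketched as follows. This is a minimal illustration under stated assumptions: the gene names and expression values are made up, the function names are hypothetical, and the uncentered form of the Pearson correlation is used because that is the form this page says Expression Connection calculates.

```python
import math

def uncentered_pearson(x, y):
    """Pearson correlation with both offsets fixed at zero."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return numerator / denominator if denominator else 0.0

def similar_genes(query, profiles, cutoff=0.8, limit=20):
    """Return up to 'limit' (gene, correlation) pairs above 'cutoff', best first."""
    target = profiles[query]
    hits = []
    for name, profile in profiles.items():
        if name == query:
            continue
        r = uncentered_pearson(target, profile)
        if r > cutoff:
            hits.append((name, r))
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:limit]

# Hypothetical log2 ratios across four experiments.
profiles = {
    "GENE_A": [1.0, 2.0, -1.0, 0.5],
    "GENE_B": [0.9, 2.1, -1.2, 0.4],    # closely follows GENE_A
    "GENE_C": [-1.0, -2.0, 1.0, -0.5],  # mirror image of GENE_A
    "GENE_D": [0.1, -0.2, 0.3, 2.0],    # unrelated pattern
}
print(similar_genes("GENE_A", profiles))  # only GENE_B passes the cutoff
```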

If you would like to analyze the list of genes further (Do they share a common function, act in a common process, or share a cellular component?), follow the links to the GO Term Mapper and GO Term Finder. See the "Help" button at the top of each page for more info, or the GO Resources page.

Expression Graph

The Expression Graph is an alternative to the 3-color graphical expression display for displaying data. The 3-color graphical expression display often does not offer enough differentiation between data points, especially at the extremes of the color spectrum. In addition, the time course or concentration course of the dataset is not obvious. On the Expression Graph, the numerical values for expression changes are plotted against a scaled x-axis (time, concentration, etc.), giving the user a better feel for the amplitude and course of expression changes in the dataset.

Pearson Correlation

The Pearson Correlation is a graphical representation of the Pearson correlation coefficients for each pair of genes in a gene list. A Pearson correlation coefficient is a statistical measure of the extent of correlation between two sets of values. In this setting, it reflects the extent of correlation between the expression patterns of two genes in a microarray dataset.

The color of each square indicates the extent of correlation between the expression patterns of two genes. A Pearson correlation of 1.0 (black in the graphic) indicates perfect correlation, a value of 0 indicates no correlation, and a value of -1 indicates perfect anti-correlation. Thus the closer to 1 the Pearson correlation is, the better correlated two profiles are and the darker red the square is. Each gene calculated against itself is perfectly correlated, hence the diagonal has all black squares. SGD uses 0.8 (an arbitrary cut-off) to indicate which gene pairs have expression patterns that are well correlated (dark red in the graphic).

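The Pearson coefficient S(X,Y) and the term \Phi_G are calculated by:
Equation 1: $$S(X,Y) = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{X_i - X_{offset}}{\Phi_X}\right)\left(\frac{Y_i - Y_{offset}}{\Phi_Y}\right)$$
Equation 2: $$\Phi_G = \sqrt{\sum_{i=1}^{N}\frac{(G_i - G_{offset})^2}{N}}$$
Here N is the number of observations and G stands for either X or Y. In equation 1, S(X,Y) is identical to the textbook Pearson correlation if G_offset in equation 2 is set to the mean of the observations. For the calculations in Expression Connection, G_offset is set to 0 (zero), which gives the uncentered Pearson correlation.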
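
As a rough illustration of how the offset changes the result, the sketch below implements the commonly used form of this calculation, in which each series is shifted by an offset and scaled by the root mean square of the shifted values. The function and variable names are hypothetical, and this is not the program provided by SMD.

```python
import math

def pearson(x, y, x_offset=0.0, y_offset=0.0):
    """Pearson correlation with explicit offsets.

    Offsets of 0 give the uncentered correlation; offsets equal to the
    means of x and y give the textbook (centered) correlation.
    """
    n = len(x)
    phi_x = math.sqrt(sum((v - x_offset) ** 2 for v in x) / n)
    phi_y = math.sqrt(sum((v - y_offset) ** 2 for v in y) / n)
    total = sum(((a - x_offset) / phi_x) * ((b - y_offset) / phi_y)
                for a, b in zip(x, y))
    return total / n

def mean(values):
    return sum(values) / len(values)

x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]

uncentered = pearson(x, y)                   # offsets of zero
centered = pearson(x, y, mean(x), mean(y))   # offsets set to the means
print(round(uncentered, 3), round(centered, 3))
```

In this made-up example the centered correlation is -1 (one series rises as the other falls), while the uncentered value is positive because all values lie on the same side of zero. With log2 ratios, zero means "no change", so an offset of zero treats the reference state as the baseline.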

The program that calculates the Pearson correlation was kindly provided by Gavin Sherlock, Stanford Microarray Database (SMD). See the SMD help documentation for further information.

GO Term Mapper

The GO Term Mapper takes a list of genes and maps the granular GO annotations assigned by SGD curators to more general terms, chosen by the user. This allows users to ask whether a group of genes share particular commonalities (function, process, cellular component). For more help, click the help button at the top of the GO Term Mapper page.

GO Term Finder

The GO Term Finder takes a list of genes and searches the GO annotations assigned by SGD curators, including the more general parent terms, for terms that are significantly shared among the genes. This allows users to find any commonalities (in function, process, and cellular component) that might exist among the genes. For more help, click the help button at the top of the GO Term Finder page.

Stats

The green Stats icon links to a summary table showing how many genes in the dataset were increased, decreased, etc. in each experiment in the dataset. SGD downloads expression data from the websites that accompany publication of the data. These statistics are calculated by SGD from the data and are not supplied by the authors.

The title of the dataset is displayed at the top of the summary statistics page. Each row of the table corresponds to an individual experiment, and the columns contain the statistics. All numbers are expressed as a percent of the total spots (distinct mRNAs) in the experiment. The percent Increased or Decreased refers to the percent of genes whose expression changes at least 2-fold relative to the control.
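
The percentages described above can be sketched as follows, assuming the values for one experiment are log2 ratios, so that a 2-fold change corresponds to a log2 ratio of at least 1.0 or at most -1.0. The function name and the "unchanged" category are assumptions made for illustration; the actual table may include other columns.

```python
import math

def summary_stats(log2_ratios, fold=2.0):
    """Percent of spots increased or decreased by at least 'fold'."""
    threshold = math.log2(fold)   # 2-fold corresponds to a log2 ratio of 1.0
    total = len(log2_ratios)
    increased = sum(1 for r in log2_ratios if r >= threshold)
    decreased = sum(1 for r in log2_ratios if r <= -threshold)
    unchanged = total - increased - decreased
    return {
        "increased": 100.0 * increased / total,
        "decreased": 100.0 * decreased / total,
        "unchanged": 100.0 * unchanged / total,
    }

# Hypothetical log2 ratios for the eight spots of one experiment.
experiment = [1.0, 1.5, -1.0, 0.2, -0.3, 0.0, 2.3, -0.9]
print(summary_stats(experiment))
# {'increased': 37.5, 'decreased': 12.5, 'unchanged': 50.0}
```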

Regulators

A gene can be a Regulator of other genes and can be Regulated by other genes. Eran Segal and colleagues have evaluated the expression patterns of genes in certain microarray datasets to identify potential regulatory modules, which consist of genes designated as regulators and regulatees, together with possible regulatory motifs in intergenic regions that may mediate the regulation. They have provided SGD with the web pages containing the results of their analyses. For more help, see their comprehensive website.

Webminer

Webminer is a tool that can search microarray data on an experiment-by-experiment basis. It was developed by Max Heiman while at UCSF. SGD began hosting Webminer in March 2003. For more information on Webminer, please see the Webminer Introduction and Tutorial and Documentation.
