SGD Help: Querying SAGE Data
The SAGE technique (Serial Analysis of Gene Expression) has been used to analyze the expression profile of thousands of genes across the yeast genome, i.e. the yeast "transcriptome" (Velculescu et al. (1997) Cell 88:243-251). A SAGE tag is a 14-nucleotide sequence that has been found within an mRNA. The relative abundance of a particular SAGE tag within a pool of tags gives some indication of the level of expression of the gene(s) containing that tag.
Please Note: In order to interpret the expression data, it is essential to be familiar with the SAGE technique. For instance, if there are two or more SAGE tags within a given ORF, only the data from the 3'-most tag reflect that ORF's expression. In addition, expression data obtained from SAGE tags that are not unique in the genome may reflect expression from more than one location. Please see Velculescu et al. (1995) "Serial analysis of gene expression," Science 270:484-487, for additional information about the SAGE technique.
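As a rough illustration of what "relative abundance within a pool of tags" means, the sketch below computes each tag's share of a pool as its count divided by the total number of tags observed. The function name and the tag sequences are made up for illustration; this is not how SGD stores or normalizes the published values.

```python
from collections import Counter


def relative_abundance(tags):
    """Return each tag's share of the pool (its count / total tags observed)."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical pool of sequenced 14-nucleotide tags (invented sequences).
pool = ["CATGAAAAAAAAAA"] * 3 + ["CATGCCCCCCCCCC"]
shares = relative_abundance(pool)
```

A tag seen three times in a pool of four gets a share of 0.75. Keep in mind the caveats above: a non-unique tag's share may combine expression from several genomic locations.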
Each SAGE tag is put into one of four "classes" based on its location relative to known ORFs and is assigned a color in graphic displays:
1 - within an ORF (orange);
2 - within 500 bp 3' of an ORF (violet);
3 - on the strand opposite an ORF (yellow);
4 - none of the above (bright pink).
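The class assignment can be sketched roughly as follows. Several details here are assumptions rather than SGD's documented procedure: a tag is reduced to a single coordinate, class 2 is taken to mean downstream of an ORF on the same strand, and the classes are tested in numeric order so that the first match wins. The `Orf` type, `classify_tag`, and the example ORF names are hypothetical.

```python
from dataclasses import dataclass

# Display colors as listed above.
CLASS_COLORS = {1: "orange", 2: "violet", 3: "yellow", 4: "bright pink"}


@dataclass(frozen=True)
class Orf:
    name: str
    start: int   # lower chromosomal coordinate (inclusive)
    end: int     # higher chromosomal coordinate (inclusive)
    strand: str  # "W" (Watson) or "C" (Crick)


def classify_tag(pos, strand, orfs, window=500):
    """Return the class (1-4) of a tag at coordinate pos on the given strand."""
    # Class 1: inside an ORF on the same strand.
    if any(o.strand == strand and o.start <= pos <= o.end for o in orfs):
        return 1
    # Class 2: within `window` bp 3' of an ORF (assumed: same strand).
    for o in orfs:
        if o.strand != strand:
            continue
        if o.strand == "W" and o.end < pos <= o.end + window:
            return 2
        if o.strand == "C" and o.start - window <= pos < o.start:
            return 2
    # Class 3: on the strand opposite an ORF.
    if any(o.strand != strand and o.start <= pos <= o.end for o in orfs):
        return 3
    # Class 4: none of the above.
    return 4


# Hypothetical ORFs for illustration.
orfs = [Orf("YFG1", 1000, 2000, "W"), Orf("YFG2", 3000, 4000, "C")]
```

Note that "3' of an ORF" depends on strand: for a Watson-strand ORF it means higher coordinates, for a Crick-strand ORF lower ones.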
SGD provides both a Simple and an Advanced Query to access the SAGE data. The Simple Query allows you to search the SAGE data by gene or ORF name or by chromosomal region, while the Advanced Query allows you to search by gene or ORF name, tag sequence, or relative expression values.
With the "Enter a gene or ORF name" option of the Simple Query, you can enter a gene or ORF name. After you have entered the name, hit the "Return" key or click on "query by gene." You can also enter part of a name together with the wildcard character (*). If a wildcard search matches more than one gene or ORF, a list of possible hits is presented, from which you can then select a single gene or ORF.
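To preview which names a wildcard pattern would match before submitting it, the behavior can be approximated locally. This is only a sketch: it assumes `*` stands for any run of characters and that matching ignores case, which may not correspond exactly to how the SGD server interprets the pattern. The function name and the name list are hypothetical.

```python
import re


def wildcard_matches(pattern, names):
    """Return the names that match pattern, where '*' matches any run of characters."""
    # Escape everything, then turn the escaped '*' back into a regex wildcard.
    regex = re.escape(pattern).replace(r"\*", ".*")
    compiled = re.compile(regex, re.IGNORECASE)
    return [name for name in names if compiled.fullmatch(name)]


# Hypothetical list of names to test a pattern against.
candidates = ["MYO1", "MYO2", "ACT1", "myo4"]
hits = wildcard_matches("MYO*", candidates)
```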
The output of this search option is a Chromosomal SAGE Map showing the SAGE tags near the chosen gene or ORF. In the display, the requested gene or ORF is highlighted in red text. Tags are indicated with colored triangles. Tags that are unique in the genome are boxed.
Note: A link at the top of the potential ORF page for a class 4 tag leads to the SAGE tag information for that tag.
The additional information about class 4 tags is three-fold:
With the chromosomal region option of the Simple Query, you can click on any section of one of the red bars representing the chromosomes to go to the Chromosomal SAGE Map described above (see Section I above: "Enter a gene or ORF name").
The Advanced SAGE Query differs from the Simple SAGE Query in two primary ways. First, all queries are entered in a relatively simple syntax language, which is described on the SAGE Advanced Query page. This allows for more query options. Second, all query results are returned in tabular form of the type shown below. The Chromosomal SAGE Map for a resulting ORF is linked from this table (by clicking on the tag coordinate listed under the COORD [Map Link] column). Information about the SAGE tag is found by clicking on the SAGE tag sequence in the table.
There are three general types of queries which can be made:
This query can be made by entering the phrase "GENE = X" or "ORF = X," where X is the gene or ORF name. A table is returned with all tag sequences associated with the entered gene or ORF name (see above for an example of the entry GENE=CDC15).
It is also possible to search for tags affiliated with a gene or ORF name that also satisfy another query restriction. For instance, to search for all CDC28-associated tags that have an expression value for G2M greater than 2, one can use the following query: (GENE=CDC28) AND (G2M>2). Several more examples are listed in the SAGE Query Examples Table below.
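When many combined queries are needed, the query strings can be assembled programmatically and then pasted into the Advanced Query form. The helpers below are hypothetical conveniences, not part of any SGD tool; they simply wrap each condition in parentheses and join them with AND or OR, mirroring the format of the example above.

```python
def all_of(*conditions):
    """Join conditions with AND, wrapping each one in parentheses."""
    return " AND ".join(f"({c.strip()})" for c in conditions)


def any_of(*conditions):
    """Join conditions with OR, wrapping each one in parentheses."""
    return " OR ".join(f"({c.strip()})" for c in conditions)


query = all_of("GENE=CDC28", "G2M>2")
```

Here `query` holds the same string as the CDC28 example in the text. Whether the query parser requires the spaces around AND and OR is not stated on this page; the examples table writes them both with and without spaces.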
The SAGE study compared gene expression under three different growth conditions: log phase growth (L), S phase arrest (S), and G2/M (G2M).
This search feature allows you to retrieve SAGE tags by their relative expression levels under the three growth conditions. For example, if you enter "S>L," you will retrieve all the SAGE tags that are expressed at a higher level in S phase arrest than in log phase growth. As with gene and ORF names, another query restriction can be added. For instance, one could locate all unique tags where S > L using the following query: (HITS=1) AND (S>L). Several more examples are listed in the SAGE Query Examples Table below. You also have the option to sort the results in descending or ascending order by the values for any of the growth conditions.
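The logic of the query (HITS=1) AND (S>L), followed by a sort on one growth condition, can be reproduced on a local copy of the results. The sketch below assumes each result row has been loaded into a small record; the `TagRow` type, its field names, and all values are invented for illustration and do not reflect the actual table layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagRow:
    tag: str
    hits: int    # number of locations of the tag in the genome
    L: float     # expression value, log phase
    S: float     # expression value, S phase arrest
    G2M: float   # expression value, G2/M


def unique_tags_higher_in_s(rows, descending=True):
    """Keep unique tags (hits == 1) with S > L, sorted by the S value."""
    selected = [r for r in rows if r.hits == 1 and r.S > r.L]
    return sorted(selected, key=lambda r: r.S, reverse=descending)


# Invented rows for illustration only.
rows = [
    TagRow("CATGAAAAAAAAAA", 1, 2, 9, 1),
    TagRow("CATGCCCCCCCCCC", 2, 1, 8, 0),   # dropped: tag is not unique
    TagRow("CATGGGGGGGGGGG", 1, 5, 3, 4),   # dropped: S is not greater than L
    TagRow("CATGTTTTTTTTTT", 1, 0, 12, 6),
]
result = unique_tags_higher_in_s(rows)
```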
One can search for a particular SAGE tag sequence or set of tag sequences using the syntax TAG = X, where X is either a full 14-nucleotide sequence or a shorter sequence in which one of the characters is a wildcard character (*). It is also possible to search for all SAGE tag sequences using the number of hits in the genome as the search criterion (e.g. HITS=1). One can also combine these two parameters (e.g. (TAG=CATGCAA*) AND (HITS=2)). In addition, it is possible to combine either a TAG or HITS query with a gene or ORF name or a relative expression value phrase. Several more examples are listed in the SAGE Query Examples Table below.
Note: All queries are case insensitive.
| Search criteria | Description of query |
| GENE = ACT1 | Retrieve tags located within the gene ACT1 |
| GENE = MYO* | Retrieve tags located within any of the MYO genes |
| ORF = YNL301C | Retrieve tags located within the ORF YNL301C |
| ORF = YNL*C | Retrieve tags located on the Crick strand of the left arm of Chromosome XIV |
| TAG = CATGATTT* | Retrieve tags that start with the sequence CATGATTT, followed by any nucleotides |
| HITS > 5 | Retrieve tags that have more than 5 sequence locations within the genome |
| (L>S)AND(S>G2M) | Retrieve tags that are expressed at a higher level in log phase than in S phase AND are expressed at a higher level in S phase than in G2/M |
| (L>100)OR(S>100) | Retrieve tags that are expressed at a level higher than 100 in either log phase OR S phase arrest |
| L BETWEEN 100 AND 200 | Retrieve tags whose expression value falls between 100 and 200 in log phase |
| GENE IN (CLN1, CLN2, CLN3) | Retrieve tags located in any of the genes CLN1, CLN2 or CLN3 |
| L>10 | Retrieve all tags where the value for L phase is greater than 10 |
| (L!=0)AND(S!=0)AND(G2M!=0) | Retrieve tags whose expression values in L phase, S phase, and G2M phase are not equal to zero |
| S != 0 AND L/S > 10 | Retrieve tags whose expression values in L phase are more than 10-fold higher than those in S phase AND whose S value does not equal zero |
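The last example in the table pairs the ratio test with S != 0 because a ratio is undefined when the S value is zero. The same guard is needed when filtering downloaded results locally, as in this sketch. The function name, the dictionary keys, and the values are hypothetical.

```python
def fold_higher_in_log(rows, fold=10):
    """Keep rows where S is nonzero and L / S exceeds `fold`."""
    # `and` short-circuits, so the division never runs when S is zero.
    return [r for r in rows if r["S"] != 0 and r["L"] / r["S"] > fold]


# Invented rows for illustration only.
rows_lr = [
    {"tag": "CATGAAAAAAAAAA", "L": 50, "S": 2},   # ratio 25: kept
    {"tag": "CATGCCCCCCCCCC", "L": 50, "S": 0},   # S is zero: skipped
    {"tag": "CATGGGGGGGGGGG", "L": 20, "S": 2},   # ratio exactly 10: not kept
]
kept = fold_higher_in_log(rows_lr)
```

Tags with S equal to zero are excluded entirely by this test, even if they are strongly expressed in log phase; a query such as (L>10)AND(S=0), built from the operators shown in the table, might be used to look at those separately, assuming the parser accepts that combination.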
Last update 2005-11-11 ELH