Genome-wide analysis of DNA-binding proteins and the sequences they
recognize.
Su-Wen Ho, Paul Cliften, Mark Johnston
Genetics Department, Washington University, 4566 Scott Ave, St. Louis,
MO 63110, USA
Functional sequences in proteins are routinely identified by comparing
proteins of related species. It is more challenging to identify
functional non-protein-coding sequences, because these sequences evolve
rapidly and are relatively simple. Identifying them requires comparing
the sequences of relatively closely related species, so we have
partially determined the DNA sequence of the genomes of five
Saccharomyces species. Comparing the sequences of the orthologous
promoters of these species reveals conserved sequences, some of which
are likely to be recognition sites for DNA-binding proteins. Based on
analysis of 1179 S. cerevisiae gene promoters for which we have
the orthologous sequences from all five species, we estimate that there
are fewer than 2000 conserved non-coding sequence motifs. Some of these
are similar and are likely recognized by the same transcription factor.
The function of these sequences can be tested using appropriate gene
reporters, and the collection of yeast gene deletion mutants can be used
to identify the proteins that bind to these regulatory elements. We have
tested these methods using the consensus Gal4 binding site and have
shown that we can identify Gal4 as its binding protein. In these ways we
expect to identify most of the DNA-binding proteins and determine their
DNA-binding sites in the yeast genome.
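
The idea of looking for sequences conserved across orthologous promoters can be illustrated with a minimal sketch. This is an illustration only, not the procedure used in the work described above: it assumes that a conserved motif can be approximated as a fixed-length k-mer that occurs exactly in every species' promoter, which ignores alignment, mismatches, and reverse complements. The function names, the species labels, and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_kmers(promoters: Dict[str, str], k: int) -> List[str]:
    """Return the k-mers that occur in every promoter, sorted alphabetically.

    Assumption: exact matching only; a real analysis would tolerate
    mismatches and would also consider the reverse-complement strand.
    """
    if not promoters:
        return []
    per_species = [kmers(seq, k) for seq in promoters.values()]
    return sorted(set.intersection(*per_species))


# Made-up toy sequences for illustration; these are NOT real promoter data.
toy_promoters = {
    "species_1": "AATTGGTACCGTCA",
    "species_2": "CCGGTACCGTTTAG",
    "species_3": "TGGTACCGTGGACA",
    "species_4": "GAGAGGTACCGTAT",
    "species_5": "GGTACCGTCCTTAA",
}

if __name__ == "__main__":
    for motif in conserved_kmers(toy_promoters, k=8):
        print(motif)
```

Requiring presence in all five sequences mirrors the restriction to promoters for which orthologous sequences from all five species are available; relaxing that requirement, or grouping similar k-mers, would be needed before any such count could be compared with the estimate given above.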