Yeast Genetics and Molecular Biology 1996
Madison, Wisconsin
August 1996


Name: Chervitz, Steve A
Mailing Address: Stanford University Medical Center, M309, Stanford, CA 94305-5120
Email Address: sac@genome.stanford.edu
Phone and Fax numbers: 415-498-7144, 415-723-7016

Data mining the yeast genome: Searching for structure

S.A. Chervitz, J.M. Cherry, and D. Botstein. Department of Genetics, Stanford University School of Medicine, Stanford, California 94305-5120

To enhance the usefulness of the recently completed genomic sequence of S. cerevisiae, we are exploring three-dimensional structural information for addition to the Saccharomyces Genome Database (SGD). The peptide sequences of all yeast ORFs are ranked according to the following criteria: (i) does the ORF have a known structure? (ii) does it have any homologs with known structure? (iii) does it have any domains with known structure? (iv) can a structure be predicted? Preliminary results for chromosome V reveal that 35 of 268 ORFs (13%) have significant similarity to protein sequences in the Protein Data Bank (PDB). Of these 35, three correspond to structures of yeast proteins. We compared these results with EMBL's "GeneQuiz" analysis and found that GeneQuiz reported 18 ORFs on chromosome V with homologs in the PDB, a subset of the 35 we identified. The information we obtain is displayed graphically on the homologous structure to indicate both the locations of predicted homologous regions and the degree of similarity. We also present results on automating the analysis of structural information for yeast ORFs and on creating interactive 3D models of the structures.
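
The four ranking criteria can be read as an ordered set of tiers, with an ORF assigned to the first criterion it satisfies. The following is a minimal Python sketch of that reading, not the procedure actually used for SGD. The class names, the boolean evidence fields, and the placeholder ORF labels are all hypothetical, and the sketch assumes the similarity searches have already been run and reduced to yes/no answers.

```python
from dataclasses import dataclass
from enum import IntEnum


class StructureTier(IntEnum):
    """Tiers mirroring criteria (i)-(iv); lower value = stronger evidence."""
    KNOWN_STRUCTURE = 1         # (i)   the ORF itself has a known structure
    HOMOLOG_WITH_STRUCTURE = 2  # (ii)  a homolog has a known structure
    DOMAIN_WITH_STRUCTURE = 3   # (iii) a domain has a known structure
    PREDICTABLE = 4             # (iv)  a structure can be predicted
    NONE = 5                    # none of the criteria are met


@dataclass
class OrfEvidence:
    """Hypothetical per-ORF summary of precomputed search results."""
    name: str
    has_own_structure: bool = False
    has_homolog_structure: bool = False
    has_domain_structure: bool = False
    structure_predictable: bool = False


def classify(orf: OrfEvidence) -> StructureTier:
    """Return the first (strongest) criterion the ORF satisfies."""
    if orf.has_own_structure:
        return StructureTier.KNOWN_STRUCTURE
    if orf.has_homolog_structure:
        return StructureTier.HOMOLOG_WITH_STRUCTURE
    if orf.has_domain_structure:
        return StructureTier.DOMAIN_WITH_STRUCTURE
    if orf.structure_predictable:
        return StructureTier.PREDICTABLE
    return StructureTier.NONE


def rank_orfs(orfs: list[OrfEvidence]) -> list[OrfEvidence]:
    """Sort ORFs by tier, breaking ties alphabetically by name."""
    return sorted(orfs, key=lambda o: (classify(o), o.name))


def percent_with_hits(hits: int, total: int) -> float:
    """Percentage of ORFs with a significant PDB match."""
    return 100.0 * hits / total


if __name__ == "__main__":
    # Placeholder ORFs for illustration only; not real chromosome V data.
    demo = [
        OrfEvidence("ORF_A", has_homolog_structure=True),
        OrfEvidence("ORF_B", has_own_structure=True),
        OrfEvidence("ORF_C"),
        OrfEvidence("ORF_D", structure_predictable=True),
    ]
    for orf in rank_orfs(demo):
        print(orf.name, classify(orf).name)
    # Counts taken from the abstract: 35 of 268 ORFs on chromosome V.
    print(f"{percent_with_hits(35, 268):.0f}%")
```

Treating the criteria as mutually exclusive tiers is a simplifying assumption; an ORF that meets several criteria is reported only under the strongest one.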