Reference: Zhang CT and Wang J (2000) Recognition of protein coding genes in the yeast genome at better than 95% accuracy based on the Z curve. Nucleic Acids Res 28(14):2804-14

Reference Help

Abstract


The Z curve is a three-dimensional space curve constituting the unique representation of a given DNA sequence in the sense that each can be uniquely reconstructed from the other. Based on the Z curve, a new protein coding gene-finding algorithm specific for the yeast genome at better than 95% accuracy has been proposed. Six cross-validation tests were performed to confirm the above accuracy. Using the new algorithm, the number of protein coding genes in the yeast genome is re-estimated. The estimate is based on the assumption that the unknown genes have similar statistical properties to the known genes. It is found that the number of protein coding genes in the 16 yeast chromosomes is 0.5 or YZ < 0.5, respectively. The YZ scores for all the ORFs annotated in the MIPS database have been calculated and are available on request by sending e-mail to the corresponding author.

Reference Type
Journal Article | Research Support, Non-U.S. Gov't
Authors
Zhang CT, Wang J
Primary Lit For
Additional Lit For
Review For