Mass spectral libraries are collections of mass spectra curated specifically to facilitate the identification of small molecules, metabolites, and short peptides. One of the most comprehensive peptide spectral libraries is curated by NIST and contains upward of half a million annotated spectra dominated by human and model organisms including budding yeast and mouse. While motivated primarily by the technological goal of increasing sensitivity and specificity in spectral identification, we have found that the NIST spectral library constitutes a surprisingly rich source of biological knowledge. In this Article, we show that data-mining of these published libraries while applying strict empirical thresholds yields many characteristics of protein biology. In particular, we demonstrate that the size and increasingly comprehensive nature of these libraries, generated from whole-proteome digests, enables inference from the presence but crucially also from the absence of spectra for individual peptides. We illustrate implicit biological trends that lead to significant absence of spectra accounted for by complex post-translational modifications and overlooked proteolytic sites. We conclude that many subtle biological signatures such as genetic variants, regulated proteolysis, and post-translational modifications are exposed through the systematic mining of spectral collections originally compiled as general-purpose, technology-oriented resources.
|Evidence ID||Analyze ID||Interactor||Interactor Systematic Name||Interactor||Interactor Systematic Name||Type||Assay||Annotation||Action||Modification||Phenotype||Source||Reference||Note|
|Evidence ID||Analyze ID||Gene||Gene Systematic Name||Gene Ontology Term||Gene Ontology Term ID||Qualifier||Aspect||Method||Evidence||Source||Assigned On||Reference||Annotation Extension|
|Evidence ID||Analyze ID||Gene||Gene Systematic Name||Phenotype||Experiment Type||Experiment Type Category||Mutant Information||Strain Background||Chemical||Details||Reference|
|Evidence ID||Analyze ID||Regulator||Regulator Systematic Name||Target||Target Systematic Name||Experiment||Conditions||Strain||Source||Reference|